stabey 89dffdd2e1 fix(apicompat): emit OpenAI-semantic input_tokens when converting Anthropic to Responses
Anthropic Messages reports input_tokens excluding cache_read/cache_creation, but
OpenAI Responses input_tokens is the total including cached tokens. The reverse
converter passed Anthropic's input_tokens straight through, so client-facing
prompt_tokens/input_tokens were short by the cached count and cache_creation
was dropped entirely.

Fix the non-stream path and the streaming state machine to add cache_read +
cache_creation back into input_tokens, and track CacheCreationInputTokens on
the streaming state. Six downstream paths benefit (Anthropic->Responses,
Anthropic->ChatCompletions, Gemini->ChatCompletions, each sync + stream).
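
The arithmetic behind the fix can be sketched as follows. This is an
illustration only, not the code in this commit: the struct types and the
toResponsesUsage helper are hypothetical names, and only the three Anthropic
counters mentioned above are modeled.

```go
package main

import "fmt"

// AnthropicUsage is a hypothetical, reduced view of the usage counters
// reported by Anthropic Messages. InputTokens excludes cached tokens.
type AnthropicUsage struct {
	InputTokens              int
	OutputTokens             int
	CacheReadInputTokens     int
	CacheCreationInputTokens int
}

// ResponsesUsage is a hypothetical, reduced view of OpenAI Responses usage.
// InputTokens here is the total, including cached tokens.
type ResponsesUsage struct {
	InputTokens  int
	OutputTokens int
}

// toResponsesUsage (assumed name) adds cache_read and cache_creation back
// into input_tokens so the client sees the OpenAI-semantic total.
func toResponsesUsage(u AnthropicUsage) ResponsesUsage {
	return ResponsesUsage{
		InputTokens:  u.InputTokens + u.CacheReadInputTokens + u.CacheCreationInputTokens,
		OutputTokens: u.OutputTokens,
	}
}

func main() {
	got := toResponsesUsage(AnthropicUsage{
		InputTokens:              100,
		OutputTokens:             50,
		CacheReadInputTokens:     900,
		CacheCreationInputTokens: 200,
	})
	fmt.Printf("%+v\n", got)
}
```

Passing InputTokens through unchanged, as the old converter did, would report
100 input tokens for this request instead of the 1200 actually processed.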

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 22:36:52 +08:00