DeepSeek thinking-mode tool-call conversations may require the assistant
reasoning_content from previous turns to be sent back in later requests. Without
preserving it, those conversations can fail or lose reasoning context.
Changes:
- Preserve assistant reasoning_content when converting Chat Completions messages
to Responses input by wrapping it as a thinking block.
- Add regression coverage for non-streaming DeepSeek responses.
- Add regression coverage for streaming DeepSeek deltas.
- Add regression coverage that request-side messages[].reasoning_content is
passed through with tool calls.
Tests:
go test -tags=unit ./internal/pkg/apicompat ./internal/service -run
'TestChatCompletionsToResponses_AssistantReasoningContentPreserved|
TestChatCompletionsToResponses_AssistantThinkingTagPreserved|
TestForwardAsRawChatCompletions_PreservesDeepSeekReasoningContent|
TestForwardAsRawChatCompletions_ForcesStreamUsageUpstreamAndPassesUsageDownstream
When a chat-completions message has no usable content parts (empty array,
empty text part, or filtered-out image part), marshalChatInputContent
marshalled a nil slice to JSON null. The upstream Responses API rejects a
null content field with HTTP 400. Fall back to an empty string instead.
Fixes#2515
- apply default mapped model only when scheduling fallback is actually used
- preserve reasoning in OpenAI-compatible output via reasoning_content and avoid invalid input function_call ids