errcheck flagged three unchecked strings.Builder.WriteString calls, and
gofmt rejected an over-aligned trailing comment in the route table.
Rewrite writeResponsesFailedSSE with json.Marshal on typed structs
instead of Builder+strconv.Quote. Same wire format, but:
- no unchecked Write returns to silence
- strict JSON escaping (strconv.Quote emits the Go escapes \a and \v,
  which are not valid JSON escapes; Marshal encodes all runes correctly)
- omitempty model field via struct tag instead of conditional Builder
- consistent with the json.Marshal style used elsewhere in handler/
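The escaping difference listed above can be reproduced in isolation. The following is a minimal sketch, not the handler code: the failedEvent struct and its fields are hypothetical and only illustrate the Quote-versus-Marshal behaviour and the omitempty tag.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// failedEvent is a hypothetical payload shape used only for illustration.
type failedEvent struct {
	Type    string `json:"type"`
	Model   string `json:"model,omitempty"` // dropped from the output when empty
	Message string `json:"message"`
}

// quoteIsValidJSON reports whether strconv.Quote(s) parses as JSON.
func quoteIsValidJSON(s string) bool {
	return json.Valid([]byte(strconv.Quote(s)))
}

// marshalEvent encodes the event with encoding/json, which writes control
// characters such as \a and \v as \u00XX escapes.
func marshalEvent(ev failedEvent) (string, error) {
	b, err := json.Marshal(ev)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	msg := "bell\a tab\v"

	// strconv.Quote keeps the Go-only escapes \a and \v, so this prints false.
	fmt.Println(quoteIsValidJSON(msg))

	out, err := marshalEvent(failedEvent{Type: "response.failed", Message: msg})
	if err != nil {
		panic(err)
	}
	fmt.Println(out, json.Valid([]byte(out)))
}
```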
Collapse trailing comment whitespace in stream_error_event_test.go to
satisfy gofmt.
All 30+ subtests in the package still pass.
Case B: when a slot wait flushes SSE ping comments first (Writer.Written
becomes true), the previous ensureForwardErrorResponse short-circuited
on `c.Writer.Written()` and returned false without notifying the client.
Subsequent upstream errors (http2 timeout, stream INTERNAL_ERROR, etc.)
produced silent EOF; Codex CLI reported "stream closed before
response.completed" just like the user-slot timeout case.
Remove the Written() early return; coerce streamStarted to true when
Writer has already been written to, and let handleStreamingAwareError
walk the existing logic — which now (thanks to the previous commits)
emits a protocol-compliant response.failed for /responses paths and the
legacy `event: error` for others.
Update tests that previously asserted "do not override written response":
the new contract is to *append* an SSE terminal frame so the client sees
a clean close instead of EOF. recoverResponsesPanic inherits this fix.
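The new control flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ensureForwardErrorResponse: fakeWriter stands in for the gin response writer, forwardError is a hypothetical name, and the frame payload is simplified to a single message field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fakeWriter stands in for the gin response writer; only the "has anything
// been written yet" state matters for this sketch.
type fakeWriter struct {
	body    string
	written bool
}

func (w *fakeWriter) Written() bool { return w.written }

func (w *fakeWriter) write(s string) {
	w.body += s
	w.written = true
}

// forwardError is a hypothetical stand-in for ensureForwardErrorResponse.
// It no longer returns early when the writer was already used; it treats the
// stream as started and appends a terminal SSE frame instead.
func forwardError(w *fakeWriter, streamStarted bool, msg string) error {
	if w.Written() {
		streamStarted = true // ping comments already flushed HTTP 200 + headers
	}
	payload, err := json.Marshal(map[string]string{"message": msg})
	if err != nil {
		return err
	}
	if streamStarted {
		w.write("event: error\ndata: " + string(payload) + "\n\n")
		return nil
	}
	// Nothing sent yet: a plain JSON error body is still possible.
	w.write(string(payload))
	return nil
}

func main() {
	w := &fakeWriter{}
	w.write(": ping\n\n") // keep-alive comment sent during the slot wait
	if err := forwardError(w, false, "upstream timeout"); err != nil {
		panic(err)
	}
	fmt.Print(w.body)
}
```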
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The first revision compared GetInboundEndpoint(c) against EndpointResponses
("/v1/responses"). NormalizeInboundEndpoint only recognizes paths that
contain the literal "/v1/responses" substring, but the project actually
registers six /responses routes — three of which (top-level
r.POST("/responses", ...) and codexDirect's "/backend-api/codex/responses")
have FullPath values without the "/v1" prefix and therefore fall through
to the default branch.
Codex CLI users targeting the bare /responses route at the production
deployment (observed 2026-05-24 ~11:05 UTC, user 16) never reached the
new writeResponsesFailedSSE path: the endpoint check was false, the
legacy `event: error` frame fired, and the strict SDK kept reporting
"stream closed before response.completed".
Replace the strict equality check with inboundIsResponses(c), which
uses suffix detection on FullPath (falling back to URL.Path when
FullPath is empty in test fixtures) and covers all six route variants:
  /v1/responses[/...]
  /responses[/...]
  /backend-api/codex/responses[/...]
Add a test table covering all routes plus negative cases.
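A path check along these lines would cover the listed variants. It is a sketch only: isResponsesPath is a hypothetical helper that takes the two path strings directly instead of a *gin.Context, and the exact matching rules of inboundIsResponses may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isResponsesPath is a hypothetical, context-free version of the check.
// fullPath is the registered route pattern; urlPath is the raw request path
// used as a fallback when fullPath is empty (as in some test fixtures).
func isResponsesPath(fullPath, urlPath string) bool {
	p := fullPath
	if p == "" {
		p = urlPath
	}
	p = strings.TrimRight(p, "/")
	if strings.HasSuffix(p, "/responses") {
		return true
	}
	// Sub-resources under a responses route, e.g. /v1/responses/<something>.
	return strings.Contains(p, "/responses/")
}

func main() {
	for _, p := range []string{
		"/v1/responses",
		"/responses",
		"/backend-api/codex/responses",
		"/v1/chat/completions",
		"/v1/messages",
	} {
		fmt.Println(p, isResponsesPath(p, ""))
	}
}
```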
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When /v1/responses streaming hits the user/account concurrency wait, the
wait loop sends SSE ping comments to keep the connection alive, which
flushes HTTP 200 + headers. If the wait then times out (or any other
post-flush error fires), handleStreamingAwareError previously emitted a
generic `event: error` frame. Codex CLI requires the stream to end with
a Responses terminal event (response.completed/failed/incomplete/cancelled),
so it reports "stream closed before response.completed" and the user-facing
rate-limit intent is lost.
This change detects a /v1/responses inbound request in both handleStreamingAwareError
implementations and emits a protocol-compliant response.failed event whose
field set mirrors apicompat.makeResponsesCompletedEvent
(id/object/model/status/output/error). The synthetic id reuses
ctxkey.RequestID so client errors can be grepped against server logs.
sequence_number is intentionally omitted to preserve monotonicity on streams
that already emitted real events.
Other inbound endpoints (/v1/chat/completions, /v1/messages) keep their
legacy formats untouched.
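For illustration, a terminal frame with that field set could be built as below. This is a sketch under assumptions: the envelope ("type" plus a nested "response" object), the error code string, and the helper name buildFailedFrame are not taken from the codebase, and the request ID is used verbatim as the synthetic id.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// respError and failedResponse mirror the id/object/model/status/output/error
// field set; the exact JSON layout is an assumption.
type respError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

type failedResponse struct {
	ID     string    `json:"id"`
	Object string    `json:"object"`
	Model  string    `json:"model,omitempty"`
	Status string    `json:"status"`
	Output []any     `json:"output"`
	Error  respError `json:"error"`
}

// failedEnvelope deliberately has no sequence_number field.
type failedEnvelope struct {
	Type     string         `json:"type"`
	Response failedResponse `json:"response"`
}

// buildFailedFrame (hypothetical name) renders one SSE frame.
func buildFailedFrame(requestID, model, code, message string) (string, error) {
	b, err := json.Marshal(failedEnvelope{
		Type: "response.failed",
		Response: failedResponse{
			ID:     requestID,
			Object: "response",
			Model:  model,
			Status: "failed",
			Output: []any{}, // non-nil so it encodes as [] rather than null
			Error:  respError{Code: code, Message: message},
		},
	})
	if err != nil {
		return "", err
	}
	return "event: response.failed\ndata: " + string(b) + "\n\n", nil
}

func main() {
	frame, err := buildFailedFrame("req-123", "", "rate_limit_exceeded", "slot wait timed out")
	if err != nil {
		panic(err)
	}
	fmt.Print(frame)
}
```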
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Export ApplyBedrockCCCompat() in GatewayService, called after channel
  model mapping to ensure the mapped model ID is used for Opus 4.7+ detection
- Add sanitizeBedrockCCFields(): remove service_tier/interface_geo/
  context_management, inject max_tokens/anthropic_version defaults
- Add sanitizeBedrockCCBetaTokens(): filter anthropic_beta to keep only
  Bedrock-supported tokens, reusing autoInjectBedrockBetaTokens and
  filterBedrockBetaTokens for consistent rules
- Remove unsupported beta tokens (interleaved-thinking, context-management)
  from the whitelist, based on the official AWS docs
- Simplify IsBedrockCCCompatEnabled() to check the boolean toggle directly,
  applying CC compat to all accounts regardless of platform
- Add unit tests for IsBedrockCCCompatEnabled (8 cases),
  sanitizeBedrockCCFields (8 cases), sanitizeBedrockCCBetaTokens (7 cases)
- Update bedrock beta policy tests for removed auto-injection
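The field sanitization described in the list above might look roughly like this on a decoded request body. It is a sketch, not the real sanitizeBedrockCCFields: the helper name sanitizeFields, the map-based body, and both default values are assumptions chosen for illustration.

```go
package main

import "fmt"

// Hypothetical defaults; the values used by the real implementation are not
// stated in the commit message.
const (
	defaultMaxTokens        int    = 4096
	defaultAnthropicVersion string = "bedrock-2023-05-31"
)

// sanitizeFields removes the fields named in the commit message and fills in
// defaults only when the caller did not provide a value.
func sanitizeFields(body map[string]any) {
	for _, key := range []string{"service_tier", "interface_geo", "context_management"} {
		delete(body, key)
	}
	if _, ok := body["max_tokens"]; !ok {
		body["max_tokens"] = defaultMaxTokens
	}
	if _, ok := body["anthropic_version"]; !ok {
		body["anthropic_version"] = defaultAnthropicVersion
	}
}

func main() {
	body := map[string]any{
		"model":        "example-model", // placeholder model name
		"service_tier": "auto",
	}
	sanitizeFields(body)
	fmt.Println(body)
}
```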