When a preflight ping fails or previous_response_not_found is returned,
sub2api drops previous_response_id and retries. But if the payload
contains function_call_output (tool results), the upstream API loses
the response chain context needed to match tool_result to tool_use,
causing 400: "No tool call found for function call output".
Add hasFunctionCallOutput checks to both recovery paths:
- Preflight ping failure recovery (forcePreferredConn path)
- recoverIngressPrevResponseNotFound function
Two bugs identified from article "Claude Code封号真相":
1. cch=00000 never replaced (fix):
signBillingHeaderCCH was gated by enableCCH (default false), so the
cch=00000 placeholder injected by buildBillingAttributionBlockJSON was
sent to Anthropic as-is — an obvious fake signal. The function already
guards itself via regex match, so the enableCCH gate is removed.
2. NormalizeSystemPromptEnv was dead code (fix):
Platform/Shell/OS Version/Working directory fields in user system prompts
leaked real machine info (e.g. "Darwin 25.3.0", "/Users/win/...") that
Anthropic could use to correlate requests across accounts. Now normalized
to canonical values before injecting into the messages pair.
The forwardAsRawChatCompletions path (used when APIKey accounts target
upstreams that don't support Responses API, e.g. DeepSeek) was missing
reasoning_effort and service_tier extraction, causing all reasoning
effort values to be silently dropped.
Extract both from the raw Chat Completions body before forwarding, and
propagate them through streamRawChatCompletions / bufferRawChatCompletions
to the OpenAIForwardResult.
Fix four issues flagged by copilot-pull-request-reviewer on PR #2143:
1. Probe URL missing /v1 prefix (openai_apikey_responses_probe.go)
Replaced bare TrimSuffix + "/responses" with buildOpenAIResponsesURL(),
which handles bare domain → /v1/responses correctly. Affected:
- ProbeOpenAIAPIKeyResponsesSupport (probe URL)
- TestAccount endpoint (apiURL for APIKey accounts)
2. Create endpoint not triggering probe (account_handler.go)
Capture created account from idempotent closure and call
scheduleOpenAIResponsesProbe after success, same pattern as
BatchCreate and Update.
3. Tests (openai_gateway_chat_completions_raw_test.go)
Added TestBuildOpenAIChatCompletionsURL (7 cases covering
bare domain, /v1 suffix, trailing slash, third-party domains,
whitespace) and TestBuildOpenAIResponsesURL_ProbeURL (6 cases
locking the probe URL construction for bare-domain inputs).
All unit tests pass; go build ./cmd/server/ clean.
Address self-review findings:
R7: Use a narrow per-trust-domain header allowlist for CC raw forwarding.
The previously reused openaiAllowedHeaders contains Codex client-only headers
(originator/session_id/x-codex-turn-state/x-codex-turn-metadata/conversation_id)
that must not leak to third-party OpenAI-compatible upstreams (DeepSeek/Kimi/
GLM/Qwen). Strict upstreams may 400 with 'unknown parameter'; lenient ones
silently pollute their request statistics. New openaiCCRawAllowedHeaders only
allows generic HTTP headers (accept-language, user-agent); content-type/
authorization/accept are set explicitly by callers.
R4: Drop the dead includeUsage parameter from streamRawChatCompletions.
The CC pass-through path doesn't need to inspect the client's stream_options
flag — the upstream handles it and we only extract usage when it appears in
chunks. Killing the unused parameter removes a misleading 'parameter read
but discarded' code smell.
Sediment refs:
- pensieve/short-term/maxims/dont-reuse-shared-headers-whitelist-across-different-upstream-trust-domains
- pensieve/short-term/knowledge/openai-gateway-shared-state-quirks
- pensieve/short-term/pipelines/run-when-self-reviewing-forwarder-implementation
OpenAI APIKey accounts with base_url pointing to third-party OpenAI-compatible
upstreams (DeepSeek, Kimi, GLM, Qwen, etc.) were failing because the gateway
unconditionally converted Chat Completions requests to Responses format and
forwarded to {base_url}/v1/responses, which only exists on OpenAI's official
endpoint.
Detection-based routing:
- Probe upstream capability on account create/update via a minimal POST to
/v1/responses; HTTP 404/405 means 'unsupported', any other response means
'supported'.
- Persist result as accounts.extra.openai_responses_supported (bool).
- ForwardAsChatCompletions branches at function entry: APIKey accounts with
explicit support=false go through new forwardAsRawChatCompletions which
passthrough-forwards CC body to /v1/chat/completions without protocol
conversion.
Default behavior for accounts without the marker preserves the legacy
'always Responses' path — existing OpenAI APIKey accounts that were working
before this change continue to work without modification (the 'reality is
evidence' principle: an account that has been running implies upstream
capability).
Probe is fired async after Create / Update / BatchCreate; failures only log,
never block the admin flow. BulkUpdate omitted (low signal of base_url
changes; can be added if needed).
Implementation:
- New pkg internal/pkg/openai_compat: marker key + ShouldUseResponsesAPI
- New service file openai_apikey_responses_probe.go: probe + persist
- New service file openai_gateway_chat_completions_raw.go: CC pass-through
- Account test endpoint short-circuits with explicit message for
probed-unsupported accounts (full CC test path is a TODO)
Zero schema changes, zero migrations, zero frontend changes, zero wire
modifications — all wired through existing AccountTestService injection.
Closes: DeepSeek-OpenAI account (id=128) production failure
Backend: Fix three race conditions in SetSnapshot that caused account
scheduling anomalies and broken sticky sessions:
- Use Lua CAS script for atomic version activation, preventing version
rollback when concurrent goroutines write snapshots simultaneously
- Add UnlockBucket to release rebuild lock immediately after completion
instead of waiting 30s TTL expiry
- Replace immediate DEL of old snapshots with 60s EXPIRE grace period,
preventing readers from hitting empty ZRANGE during version switches
Frontend: Remove serial queue throttle (1-2s delay per request) from
usage loading since backend now uses passive sampling. All usage
requests execute immediately in parallel.
The errcheck linter flagged an unchecked type assertion on
item["type"].(string). Use the two-value form with require.True
to satisfy the linter and fail clearly on unexpected types.
- Security: force token_uri to Google default, preventing SSRF via crafted service account JSON
- Dedup: extract shared getVertexServiceAccountAccessToken() to eliminate ~35 lines of duplication between ClaudeTokenProvider and GeminiTokenProvider
- Fix: apply model mapping + Vertex model ID normalization in forward_as_responses and forward_as_chat_completions paths
- Fix: exclude service_account from AI Studio endpoint selection (Vertex cannot serve generativelanguage.googleapis.com)
- Feature: add model restriction/mapping UI for service_account in EditAccountModal
- Dedup: extract VERTEX_LOCATION_OPTIONS to shared constants
- i18n: replace all hardcoded Chinese strings in Vertex UI with translation keys
(*net.OpError).Error() concatenates Source/Addr fields, so the previous
disconnectMsg surfaced internal source IP/port and upstream server address
to clients via SSE error frames and UpstreamFailoverError.ResponseBody
(reported by @Wei-Shaw on PR #2066).
- Add sanitizeStreamError that maps known errors (io.ErrUnexpectedEOF,
context.Canceled, syscall.ECONNRESET/EPIPE/ETIMEDOUT/...) to fixed
descriptions and falls back to a generic placeholder, with an explicit
*net.OpError branch that drops Source/Addr fields entirely.
- Use sanitized message in client-facing disconnectMsg; full ev.err is
still preserved in the existing operator log line for diagnosis.
- Tests cover net.OpError redaction, the failover ResponseBody path, and
every known sanitized error mapping.