13 Commits

Author SHA1 Message Date
Wesley Liddick
bbe847ed3e
Merge pull request #2805 from StarryKira/feat/configurable-pool-retry-status-codes
feat(account): configurable pool-mode same-account retry status codes
2026-05-27 22:09:55 +08:00
Wesley Liddick
61ce79533e
Merge pull request #2800 from wucm667/fix/scheduler-model-not-found-per-model-cooldown
fix(scheduler): 模型 404 仅冷却该账号-模型组合,不再封整个账号
2026-05-27 21:01:52 +08:00
StarryKira
21033dceb9 feat(account): configurable pool-mode same-account retry status codes
Pool mode currently retries the same account for a fixed set of
upstream HTTP statuses: 401, 403, 429. Some upstream pool deployments
also need same-account retry for transient provider/proxy statuses
such as 502, 503, 520, 529, but hard-coding more statuses changes
behavior for everyone.

Add a per-account credentials option `pool_mode_retry_status_codes`
that lets admins choose which upstream HTTP status codes trigger
same-account retry in pool mode:

- Unset (default): preserve the current 401/403/429 default
- Explicit list: override the defaults with the configured codes
- Codes normalized to the 100-599 range, deduplicated, sorted

The standalone `isPoolModeRetryableStatus` helper is kept as the
default-only fallback. All 15 gateway call sites switch to the new
`Account.IsPoolModeRetryableStatus` method so behavior is preserved
for accounts that do not configure the new field.

Frontend admin UI gains a "Retry Status Codes" comma-separated input
under the pool-mode section in both Create/Edit account modals
(en + zh i18n).

Fixes #2731

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 11:24:25 -07:00
wucm667
a31b507484 fix(scheduler): 模型404仅冷却账号模型组合 2026-05-26 20:29:48 +08:00
benjamin
9c56fe0b0b fix(openai): mark fast-policy entrypoints business-limited
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-26 17:21:45 +08:00
mt21625457
33ac8eb27d fix openai http2 response header timeout 2026-05-26 13:57:59 +08:00
shaw
1e406fed52 fix: optimize OpenAI account cooldown scheduling 2026-05-23 10:18:43 +08:00
wucm667
6381f9e37d fix(openai): 识别上游静默拒绝并触发 failover 2026-05-19 15:48:36 +08:00
wucm667
679c0865a0 fix(openai): handle versioned compatible base URLs 2026-05-13 11:25:15 +08:00
shaw
4cbf518f0a fix: preserve raw chat completions usage billing 2026-05-03 23:31:43 +08:00
alfadb
57099a6af6 fix(openai-gateway): extract reasoning_effort in raw Chat Completions path
The forwardAsRawChatCompletions path (used when APIKey accounts target
upstreams that don't support Responses API, e.g. DeepSeek) was missing
reasoning_effort and service_tier extraction, causing all reasoning
effort values to be silently dropped.

Extract both from the raw Chat Completions body before forwarding, and
propagate them through streamRawChatCompletions / bufferRawChatCompletions
to the OpenAIForwardResult.
2026-05-02 10:22:16 +08:00
alfadb-bot
4d145300c3 fixup! fix(openai-gateway): route APIKey accounts to /v1/chat/completions when upstream lacks Responses API
Address self-review findings:

R7: Use a narrow per-trust-domain header allowlist for CC raw forwarding.
The previously reused openaiAllowedHeaders contains Codex client-only headers
(originator/session_id/x-codex-turn-state/x-codex-turn-metadata/conversation_id)
that must not leak to third-party OpenAI-compatible upstreams (DeepSeek/Kimi/
GLM/Qwen). Strict upstreams may 400 with 'unknown parameter'; lenient ones
silently pollute their request statistics. New openaiCCRawAllowedHeaders only
allows generic HTTP headers (accept-language, user-agent); content-type/
authorization/accept are set explicitly by callers.

R4: Drop the dead includeUsage parameter from streamRawChatCompletions.
The CC pass-through path doesn't need to inspect the client's stream_options
flag — the upstream handles it and we only extract usage when it appears in
chunks. Killing the unused parameter removes a misleading 'parameter read
but discarded' code smell.

Sediment refs:
- pensieve/short-term/maxims/dont-reuse-shared-headers-whitelist-across-different-upstream-trust-domains
- pensieve/short-term/knowledge/openai-gateway-shared-state-quirks
- pensieve/short-term/pipelines/run-when-self-reviewing-forwarder-implementation
2026-04-30 20:16:44 +08:00
alfadb-bot
4e4cc80971 fix(openai-gateway): route APIKey accounts to /v1/chat/completions when upstream lacks Responses API
OpenAI APIKey accounts with base_url pointing to third-party OpenAI-compatible
upstreams (DeepSeek, Kimi, GLM, Qwen, etc.) were failing because the gateway
unconditionally converted Chat Completions requests to Responses format and
forwarded to {base_url}/v1/responses, which only exists on OpenAI's official
endpoint.

Detection-based routing:
- Probe upstream capability on account create/update via a minimal POST to
  /v1/responses; HTTP 404/405 means 'unsupported', any other response means
  'supported'.
- Persist result as accounts.extra.openai_responses_supported (bool).
- ForwardAsChatCompletions branches at function entry: APIKey accounts with
  explicit support=false go through new forwardAsRawChatCompletions which
  passthrough-forwards CC body to /v1/chat/completions without protocol
  conversion.

Default behavior for accounts without the marker preserves the legacy
'always Responses' path — existing OpenAI APIKey accounts that were working
before this change continue to work without modification (the 'reality is
evidence' principle: an account that has been running implies upstream
capability).

Probe is fired async after Create / Update / BatchCreate; failures only log,
never block the admin flow. BulkUpdate omitted (low signal of base_url
changes; can be added if needed).

Implementation:
- New pkg internal/pkg/openai_compat: marker key + ShouldUseResponsesAPI
- New service file openai_apikey_responses_probe.go: probe + persist
- New service file openai_gateway_chat_completions_raw.go: CC pass-through
- Account test endpoint short-circuits with explicit message for
  probed-unsupported accounts (full CC test path is a TODO)

Zero schema changes, zero migrations, zero frontend changes, zero wire
modifications — all wired through existing AccountTestService injection.

Closes: DeepSeek-OpenAI account (id=128) production failure
2026-04-30 19:25:45 +08:00