182 Commits

Author SHA1 Message Date
alfadb
ddf91e9a7f fix(gateway): 按最终 anthropic-beta header 对 body.context_management 做能力维度 sanitize
上游 Anthropic 在 body 含 `context_management` 但最终发出去的
`anthropic-beta` header 不含 `context-management-2025-06-27` 时会拒收:

    {
      "type": "invalid_request_error",
      "message": "context_management: Extra inputs are not permitted"
    }
    (HTTP 400, request_id 形如 req_011C...)

该 400 在 haiku 路径上触发,因为三个 beta header 构造器有意排除了
context-management beta:

  - HaikuBetaHeader          (messages,     OAuth / mimic CC)
  - APIKeyHaikuBetaHeader    (messages,     API-key)
  - CountTokensBetaHeader    (count_tokens, 所有认证类型)

但 body 中仍然带着 `context_management` 字段,原因有二:

  1. normalizeClaudeOAuthRequestBody 在 thinking_enabled / thinking_adaptive
     打开时为 `clear_thinking_20251015` 主动注入;
  2. 客户端 (Claude Code CLI >= 2.1.87) 原样发送, 网关透传时一并转发。

修复方案: 能力维度对称约束
==========================

对齐已有的 Bedrock 模式
(`backend/internal/service/bedrock_request.go` 中的
`sanitizeBedrockFieldsForBetaTokens`):
根据 **最终** 发出的 `anthropic-beta` header 决定是否保留
`body.context_management`, 而不是按 model 名或路由分类来决定。

新增纯函数:

    sanitizeAnthropicBodyForBetaTokens(body, betaHeader) (body, changed)

如果 `betaHeader` 不含 `context-management-2025-06-27`, 用 sjson 把 body
字段 strip 掉; 否则原样返回。

在所有 Anthropic / Anthropic-兼容 上游出口都接入:

| 路径                                       | sanitize 接入点                                       |
|--------------------------------------------|-------------------------------------------------------|
| /v1/messages OAuth mimic CC                | buildUpstreamRequest                                  |
| /v1/messages OAuth 真 CC 透传              | buildUpstreamRequest                                  |
| /v1/messages API-key                       | buildUpstreamRequest                                  |
| /v1/messages API-key passthrough           | buildUpstreamRequestAnthropicAPIKeyPassthrough        |
| /v1/messages Vertex / service-account      | buildUpstreamRequestAnthropicVertex                   |
| /v1/messages/count_tokens (全部 4 条路径)  | buildCountTokensRequest,                              |
|                                            | buildCountTokensRequestAnthropicAPIKeyPassthrough     |
| Antigravity Anthropic-兼容 上游            | AntigravityGatewayService.ForwardUpstream             |
| Bedrock                                    | (已由 sanitizeBedrockFieldsForBetaTokens 处理)        |

为什么要重排 (而不是加一行调用)
================================

sanitize 必须 **在** `signBillingHeaderCCH` 之前运行。CCH 对整个 body
取 xxHash64 摘要后写入 billing header 里 5 位十六进制的 `cch` 字段;
如果先签名再 strip, 上游对发出去的 body 重算 hash 会和 `cch` 不一致,
请求被判为 third-party。这就要求在 `http.NewRequest` 之前算出最终的
`anthropic-beta` header, 所以把原本内联在 builder 里的 beta 计算逻辑
抽成了两个纯函数:

  - computeFinalAnthropicBeta              (messages 路径: mimic 不透传
                                            客户端 beta)
  - computeFinalCountTokensAnthropicBeta   (count_tokens 路径: mimic 不
                                            跳过白名单透传)

两者逐位保留原行为:

  - mimic 路径在 messages 上跳过客户端 beta, 在 count_tokens 上合并
  - API-key 路径尊重 `InjectBetaForAPIKey` 开关
  - dropSet (`defaultDroppedBetasSet` + BetaPolicy filter) 应用在主路径,
    passthrough / Vertex 路径有意不应用 —— 这条原有的不对称行为本 PR
    不动。

一条语义测试 (`TestSanitizeMustBeBeforeCCHSigning_HashConsistency`) 把
顺序约束文档化并强制守住: 它证明 `sanitize -> signBillingHeaderCCH`
产生的 `cch` 与最终 body 一致, 而 `signBillingHeaderCCH -> sanitize`
产生的 `cch` 会被上游 hash 重算判失败。

为什么是能力维度 (而不是 haiku 模型名匹配)
==========================================

最朴素的"按 model 名 strip"方案
(`strings.Contains(modelID, "haiku") -> DeleteBytes "context_management"`)
有四个真实失败模式:

  1. 过度删除。CLI >= 2.1.87 的真 Claude Code 客户端在 haiku 上同时
     发送 body 字段 **和** `anthropic-beta: context-management-2025-06-27`。
     一律 strip 会让该用户的 `clear_thinking_20251015` 静默失效。
  2. 别名漂移。未来的 haiku 别名 (`claude-3-haiku-...`,
     `claude-haiku-...` 等) 改变匹配面; 任何新别名都会悄悄绕过 strip。
  3. count_tokens 漏覆盖。count_tokens 有自己的 builder 和不同的 beta
     header 集合; 在一个地方做 model 名检查会漏掉这条路径。
  4. API-key passthrough 早退。passthrough builder 在 model 名 strip
     之前就 return 了, strip 根本不执行。

能力维度沿着 header 端到端走, 上述 4 个 case 都由构造方式保证正确,
不依赖任何 modelID 匹配。

防御项
======

  - 当 `sjson.DeleteBytes` 在 `gjson` 刚验证过字段存在的 body 上失败时,
    `sanitizeAnthropicBodyForBetaTokens` 会记 warning 日志 —— 这种情况
    现实中仅在请求中途被破坏时发生, 日志把此前会静默发生的 body / header
    不一致暴露出来。
  - `header_util.go` 新增 `deleteHeaderAllForms`: 在白名单透传已经写入
    canonical 大小写的 `Anthropic-Beta` 之后再覆盖, 否则会同时留下两条。

测试
====

`backend/internal/service` 下新增 44 个测试:

  - 纯函数: anthropicBetaTokensContains x 5, sanitize keep/strip x 6,
    computeFinal{Anthropic,CountTokens}AnthropicBeta x 12
  - normalize 回归 x 5
  - buildUpstreamRequest 端到端 x 4
      (OAuth mimic haiku strip / mimic sonnet preserve /
       真 CC haiku 带客户端 beta preserve / API-key haiku strip)
  - buildCountTokensRequest 端到端 x 2
  - buildUpstreamRequestAnthropicAPIKeyPassthrough x 2 (strip / preserve)
  - buildCountTokensRequestAnthropicAPIKeyPassthrough x 2 (strip / preserve)
  - buildUpstreamRequestAnthropicVertex x 2 (strip / preserve, 含 outgoing
    `anthropic-beta` header 对称断言)
  - CCH 顺序语义测试 x 1

unit 套件全过 (本机 88s), `golangci-lint` 0 issues。

已知局限 (本 PR 范围外)
========================

  - Vertex 路径用透传过来的客户端 `anthropic-beta` header 作为 sanitize
    依据, 而不是 Vertex 侧的能力矩阵。最坏情况是过度 strip (= 当前 main
    的行为, 主路径本来什么都不 strip); 不是 regression。完整的 Vertex
    能力模型属于单独的 PR。
  - Vertex builder 仍然不应用 BetaPolicy filter / dropSet。这是该 builder
    早 return 的既有架构决策, 本 PR 不动。
  - count_tokens mimic 在 haiku 上仍然注入 `context-management-2025-06-27`
    (因为原 count_tokens mimic 逻辑并不像 messages mimic 那样排除它)。
    本 PR 逐位保留 main 的行为; 是否要让它与 messages mimic 的排除策略
    统一是另一个问题。
  - `sanitizeAnthropicBodyForBetaTokens` 目前只处理
    `context_management <-> context-management-2025-06-27` 这一对。如果
    Anthropic 后续推出更多 beta-gated body 字段, 可以在后续 PR 重构为
    `{body 路径 -> required beta token}` 注册表的形式。
2026-05-28 00:02:50 +08:00
Wesley Liddick
61ce79533e
Merge pull request #2800 from wucm667/fix/scheduler-model-not-found-per-model-cooldown
fix(scheduler): 模型 404 仅冷却该账号-模型组合,不再封整个账号
2026-05-27 21:01:52 +08:00
Pluviobyte
1e6d0b602a
fix(antigravity): capture message_start input_tokens in streaming passthrough
The antigravity upstream-passthrough path (account.Type == AccountTypeUpstream
forwarding to a Claude-format upstream) drains the SSE stream via
streamUpstreamResponse + extractSSEUsage. The extractor only reads top-level
event["usage"], which matches Anthropic's message_delta but misses
message_start where usage is nested under event.message.usage.

As a result, every streaming /v1/messages request through this path drops
the input-side fields (input_tokens, cache_read_input_tokens, cache_creation_*)
and writes a usage_logs row with input_tokens=0 + output_tokens>0. The user
in #2332 observed 2,728 such rows attributed to claude-opus-4-6 / haiku-4-5
streaming requests; their billing on output is correct but the input-side
accounting is missing. (Their "duplicate write from message_delta" hypothesis
isn't borne out by the code — RecordUsage is invoked once per request and
writeUsageLogBestEffort dedupes by request_id; what they're seeing is
single records produced by this buggy extractor.)

Branch on event.type so message_start reads from event.message.usage and
other events keep using event.usage, matching how parseSSEUsagePassthrough
already handles both shapes for the Anthropic OAuth / API-key / Bedrock paths.
Adds two extractSSEUsage table cases plus a TestExtractSSEUsage_StreamingSequence
that drives the message_start → message_delta sequence end-to-end; both fail
on main and pass with this change.

Fixes #2332

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-27 09:02:15 +00:00
wucm667
a31b507484 fix(scheduler): 模型404仅冷却账号模型组合 2026-05-26 20:29:48 +08:00
benjamin
47fe90eab4 fix(antigravity): mark whitelist denials business-limited
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-26 17:19:37 +08:00
name
2eb622f2f6 Remove ops retry replay storage 2026-05-19 19:37:41 +08:00
2ue
4840194b18 Fix lint issues in image billing change 2026-05-12 16:04:24 +08:00
2ue
bb4c1abe28 Fix image billing size normalization 2026-05-12 15:21:31 +08:00
erio
3ee6f085db refactor: extract internal500 penalty logic to dedicated file
Move constants, detection, and penalty functions from
antigravity_gateway_service.go to antigravity_internal500_penalty.go.
Fix gofmt alignment and replace hardcoded duration strings with
constant references.
2026-03-27 20:11:24 +08:00
erio
7cca69a136 fix: move internal500 counter reset to cover all success paths
Move the reset logic after urlFallbackLoop so it covers both direct
success and smart retry (429/503) success paths.
2026-03-27 20:11:24 +08:00
erio
093a5a260e feat(antigravity): progressive penalty for consecutive INTERNAL 500 errors
When an antigravity account returns 500 "Internal error encountered."
on all 3 retry attempts, increment a Redis counter and apply escalating
penalties:
- 1st round: temp unschedulable 10 minutes
- 2nd round: temp unschedulable 10 hours
- 3rd round: permanently mark as error

Counter resets on any successful response (< 400).
2026-03-27 20:11:24 +08:00
Ethan0x0000
db9021f9c1 feat(ops): propagate endpoint/request-type context in handlers; add UpstreamURL to upstream error events 2026-03-21 23:47:39 +08:00
Ethan0x0000
bac408044f fix(provider): preserve requested model in antigravity and sora
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-21 01:24:30 +08:00
erio
528ff5d28c fix(antigravity): fast-fail on proxy unavailable, temp-unschedule account
## Problem

When a proxy is unreachable, token refresh retries up to 4 times with
30s timeout each, causing requests to hang for ~2 minutes before
failing with a generic 502 error. The failed account is not marked,
so subsequent requests keep hitting it.

## Changes

### Proxy connection fast-fail
- Set TCP dial timeout to 5s and TLS handshake timeout to 5s on
  antigravity client, so proxy connectivity issues fail within 5s
  instead of 30s
- Reduce overall HTTP client timeout from 30s to 10s
- Export `IsConnectionError` for service-layer use
- Detect proxy connection errors in `RefreshToken` and return
  immediately with "proxy unavailable" error (no retries)

### Token refresh temp-unschedulable
- Add 8s context timeout for token refresh on request path
- Mark account as temp-unschedulable for 10min when refresh fails
  (both background `TokenRefreshService` and request-path
  `GetAccessToken`)
- Sync temp-unschedulable state to Redis cache for immediate
  scheduler effect
- Inject `TempUnschedCache` into `AntigravityTokenProvider`

### Account failover
- Return `UpstreamFailoverError` on `GetAccessToken` failure in
  `Forward`/`ForwardGemini` to trigger handler-level account switch
  instead of returning 502 directly

### Proxy probe alignment
- Apply same 5s dial/TLS timeout to shared `httpclient` pool
- Reduce proxy probe timeout from 30s to 10s
2026-03-19 23:48:37 +08:00
erio
a6f99cf534 refactor(antigravity): unify TestConnection with dispatch retry loop
TestConnection now reuses antigravityRetryLoop instead of a standalone
HTTP loop, gaining credits overages, smart retry, and 429/503 backoff
for free. AccountSwitchError is caught and surfaced as a friendly
message. Also populates RateLimitedModel in TempUnscheduled switch error.

Test fixes:
- Use RATE_LIMIT_EXCEEDED in 503 short-delay test to avoid 60x1s timeout
- Clamp waitDuration=0 instead of 999s to avoid 15s max-wait timeout
- Enhance mockSmartRetryUpstream with repeatLast and body caching
2026-03-17 01:47:08 +08:00
kunish
d795734352
fix(antigravity): add stream keepalive to prevent connection drops
Antigravity streaming handlers were missing the keepalive mechanism
that exists in the standard gateway, causing proxy/CDN idle timeouts
to break connections during long thinking phases (e.g. claude-opus-4-6).
This resulted in truncated responses with missing tool calls.

Add StreamKeepaliveInterval support to all three Antigravity streaming
paths: Claude SSE, Gemini SSE, and upstream passthrough.
2026-03-16 17:37:15 +08:00
erio
0d2061b268 fix: remove ClaudeMax references not yet in upstream/main
Remove SimulateClaudeMaxEnabled field and related logic from
admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
from antigravity_gateway_service.go. These symbols are not yet
available in upstream/main.
2026-03-16 05:01:42 +08:00
erio
8a260defc2 refactor: replace sync.Map credits state with AICredits rate limit key
Replace process-memory sync.Map + per-model runtime state with a single
"AICredits" key in model_rate_limits, making credits exhaustion fully
isomorphic with model-level rate limiting.

Scheduler: rate-limited accounts with overages enabled + credits available
are now scheduled instead of excluded.

Forwarding: when model is rate-limited + credits available, inject credits
proactively without waiting for a 429 round trip.

Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.

Frontend: show credits_active (yellow ) when model rate-limited but
credits available, credits_exhausted (red) when AICredits key active.

Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
clearCreditsExhausted, and update existing overages tests.
2026-03-16 04:58:58 +08:00
SilentFlower
17e4033340 feat: implement resolveCreditsOveragesModelKey function to stabilize model key resolution for credit overages 2026-03-16 04:58:12 +08:00
haruka
25cb5e7505 fix 第一次 400,第二次触发切账号信号 2026-03-12 11:30:53 +08:00
haruka
f44927b9f8 add test for fix #935 2026-03-12 11:04:14 +08:00
shaw
a3791104f9 feat: 支持后台设置是否启用整流开关 2026-03-07 21:55:38 +08:00
Elysia
65a106792a fix issue #791 2026-03-06 20:37:09 +08:00
yangjianbo
bb664d9bbf feat(sync): full code sync from release 2026-02-28 15:01:20 +08:00
erio
a6f9f9f968 feat: replace gemini-3-pro-image with gemini-3.1-flash-image
- Add migration 060 to update model_mapping for all antigravity accounts
- Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
- Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
- Update frontend usage window to show GImage for new model
- Update isImageGenerationModel to support new model
2026-02-27 09:52:50 +08:00
erio
4573868c08 fix(antigravity): bill with mapped model and use final model key for rate limiting
- Use mapped model (billingModel) instead of original request model for billing
- Use resolveFinalAntigravityModelKey for 429 rate limit model key,
  ensuring rate limit records match the actual upstream model
- Add regression tests for both fixes
2026-02-24 18:08:19 +08:00
erio
0dacdf480b fix: distinguish client disconnection from upstream retry failure
Before this change, when a client disconnected mid-request, the error
message was "Upstream request failed after retries", which is misleading
and pollutes error logs. Now we check context.Err() to return a more
accurate "Client disconnected" message for both Claude and Gemini
forward paths.
2026-02-24 16:45:08 +08:00
yangjianbo
41d0383fb7 merge(test): 合并 main 并解决前端筛选器冲突 2026-02-15 22:04:06 +08:00
程序猿MT
1cf51b14f7
Merge branch 'Wei-Shaw:main' into main 2026-02-15 20:49:14 +08:00
shaw
a817cafe3d feat: 区分 Anthropic 5m/1h 缓存创建 token 的差异化计费
Anthropic API 的 cache_creation 对象区分了 ephemeral_5m 和 ephemeral_1h
两种缓存创建 token,1h 单价远高于 5m(如 claude-3-5-haiku: 5m=$1/MTok,
1h=$6/MTok)。此前系统统一按 5m 单价计费,导致计费偏低。

后端:
- pricing_service: 加载 LiteLLM 的 cache_creation_input_token_cost_above_1hr
- billing_service: GetModelPricing 启用分类计费(安全守卫 1h>5m),
  CalculateCost 按 5m/1h 分别计费,无明细时回退到 5m 单价
- gateway_service: parseSSEUsage/handleNonStreamingResponse 用 gjson
  提取嵌套 cache_creation 对象的 ephemeral_5m/1h_input_tokens
- antigravity_gateway_service: extractSSEUsage/extractClaudeUsage 同步提取
- usage_log: 修复 GORM column tag 确保写入正确的数据库列
- 新增迁移 054: 删除 GORM 自动生成的重复列

前端:
- 使用记录 tooltip 展示 5m/1h 缓存创建明细(带彩色 badge 区分)
- 表格单元格缓存写入数值旁显示 1h 标识
2026-02-14 18:15:35 +08:00
yangjianbo
abf5de69fb Merge branch 'main' into test 2026-02-12 23:43:47 +08:00
程序猿MT
174d7c774d
Merge branch 'Wei-Shaw:main' into main 2026-02-12 23:12:41 +08:00
yangjianbo
584cfc3db2 chore(logging): 完成后端日志审计与结构化迁移
- 将高密度服务与处理器日志迁移到新日志系统(LegacyPrintf/结构化日志)
- 增加 stdlog bridge 与兼容测试,保留旧日志捕获能力
- 将 OpenAI 断流告警改为结构化 Warn 并改造对应测试为 sink 捕获
- 补齐后端相关文件 logger 引用并通过全量 go test
2026-02-12 19:01:09 +08:00
程序猿MT
8da5fac69e
Merge branch 'Wei-Shaw:main' into main 2026-02-11 18:39:52 +08:00
SilentFlower
19cca11e00 [UPDATE] 增强 Claude Thinking 模式支持与 Opus 4.6 动态预算适配
 feat(antigravity): 支持 thinking adaptive 类型并适配 Opus 4.6 动态预算
🧪 test(gateway): 增加 thinking 模式解析与签名块过滤的边界用例测试
2026-02-11 10:31:16 +08:00
Edric Li
2a1067c82b Merge remote-tracking branch 'upstream/main' 2026-02-10 21:52:33 +08:00
Edric Li
a54b81cf74 perf: 错误处理性能优化
- MatchRule 延迟/限制 body ToLower,先用 statusCode 短路,只在需要关键词匹配时转换且限制 8KB
- 预计算规则的小写关键词/平台和 error code set,消除运行时重复 ToLower 和线性扫描
- MODEL_CAPACITY_EXHAUSTED 全局去重,避免并发请求重复重试同一模型
- 503 重试 body 读取限制从 2MB 降至 8KB
- time.After 替换为 time.NewTimer,防止 context 取消时 timer 泄漏
2026-02-10 21:40:31 +08:00
Edric Li
2d4236f76e fix: 修复错误透传规则 skip_monitoring 未生效的问题
- ops_error_logger: status < 400 分支增加 OpsSkipPassthroughKey 检查
- ops_upstream_context: 新增 checkSkipMonitoringForUpstreamEvent,中间重试/故障转移事件也能触发跳过标记
- gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted 匹配规则后设置 OpsSkipPassthroughKey
- antigravity_gateway_service: writeMappedClaudeError 增加 applyErrorPassthroughRule 调用
2026-02-10 20:56:01 +08:00
yangjianbo
3b0910f664 Merge branch 'main' into test-sora 2026-02-10 18:01:17 +08:00
程序猿MT
1dd3158c7e
Merge branch 'Wei-Shaw:main' into main 2026-02-10 13:55:51 +08:00
song
1f647b120a feat(antigravity): 转发与测试支持daily/prod单URL切换 2026-02-10 13:51:29 +08:00
Edric Li
7d0a30fa8f merge: sync upstream main (antigravity single-account 503 retry)
合并上游新增的 Antigravity 单账号 503 退避重试机制,
解决与本地 MODEL_CAPACITY_EXHAUSTED 逻辑的冲突,两者共存。
2026-02-10 12:00:21 +08:00
shaw
5dd83d3cf2 fix: 移除特定system以适配新版cc客户端缓存失效的bug 2026-02-10 10:28:34 +08:00
Wesley Liddick
14e1aac9b5
Merge pull request #533 from GuangYiDing/feat/antigravity-single-account-503-retry
feat: Antigravity 单账号分组 503 退避重试机制
2026-02-10 09:59:48 +08:00
yangjianbo
58912d4ac5 perf(backend): 使用 gjson/sjson 优化热路径 JSON 处理
将 API 网关热路径中的 json.Unmarshal+json.Marshal 替换为 gjson 零拷贝查询和 sjson 精准写入:
- unwrapV1InternalResponse 性能提升 22x(4009ns→182ns),内存分配减少 28.5x
- unwrapGeminiResponse、extractGeminiUsage、estimateGeminiCountTokens、ParseGeminiRateLimitResetTime 改为接收 []byte 使用 gjson 提取
- ParseGatewayRequest 的 model/stream/metadata/thinking/max_tokens 改用 gjson 类型安全提取
- Handler 层(sora/openai)改用 gjson 提取字段、sjson 注入/修改字段,移除 map[string]any 中间变量
- Sora Client 响应解析改用 gjson ForEach 遍历,减少内存分配
- 新增约 100 个单元测试用例,所有改动函数覆盖率 >85%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 08:59:30 +08:00
Edric Li
6114f69cca feat: MODEL_CAPACITY_EXHAUSTED 使用固定1s间隔重试60次,不切换账号
MODEL_CAPACITY_EXHAUSTED (503) 表示模型容量不足,所有账号共享同一容量池,
切换账号无意义。改为固定1s间隔重试最多60次,重试耗尽后直接返回上游错误。

- 新增 antigravityModelCapacityRetryMaxAttempts=60 和 antigravityModelCapacityRetryWait=1s
- shouldTriggerAntigravitySmartRetry 新增 isModelCapacityExhausted 返回值
- handleSmartRetry 对 MODEL_CAPACITY_EXHAUSTED 使用独立重试策略
- handleModelRateLimit 对 MODEL_CAPACITY_EXHAUSTED 仅标记 Handled,不设限流
- 重试耗尽后不设置模型限流、不清除粘性会话、不切换账号
2026-02-10 02:03:06 +08:00
Edric Li
d6c2921f2b feat: same-account retry before failover for transient errors
For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.

- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
  all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute
2026-02-10 00:53:54 +08:00
Edric Li
61c73287dc feat: failover and temp-unschedule on empty stream response
- Empty stream responses now return UpstreamFailoverError instead of
  plain 502, triggering automatic account switching (up to 10 retries)
- Add tempUnscheduleEmptyResponse: accounts returning empty responses
  are temp-unscheduled for 30 minutes
- Apply to both Claude and Gemini non-streaming paths
- Align googleConfigErrorCooldown from 60m to 30m for consistency
2026-02-09 23:25:30 +08:00
Edric Li
89905ec43d feat: failover and temp-unschedule on Google "Invalid project resource name" 400
Google 后端间歇性返回 400 "Invalid project resource name" 错误,
此前该错误直接透传给客户端且不触发账号切换,导致请求失败。

- 在 Antigravity 和 Gemini 两个平台的所有转发路径中,
  精确匹配该错误消息后触发 failover 自动换号重试
- 命中后将账号临时封禁 1 小时,避免反复调度到同一故障账号
- 提取共享函数 isGoogleProjectConfigError / tempUnscheduleGoogleConfigError
  消除跨 Service 的代码重复
2026-02-09 22:48:32 +08:00
yangjianbo
16131c3d3f Merge branch 'main' of https://github.com/mt21625457/aicodex2api 2026-02-09 20:26:03 +08:00