sub2api

Author	SHA1	Message	Date
Wesley Liddick	61ce79533e	Merge pull request #2800 from wucm667/fix/scheduler-model-not-found-per-model-cooldown fix(scheduler): on a model 404, cool down only that account-model combination instead of blocking the whole account	2026-05-27 21:01:52 +08:00
Pluviobyte	1e6d0b602a	fix(antigravity): capture message_start input_tokens in streaming passthrough The antigravity upstream-passthrough path (account.Type == AccountTypeUpstream forwarding to a Claude-format upstream) drains the SSE stream via streamUpstreamResponse + extractSSEUsage. The extractor only reads top-level event["usage"], which matches Anthropic's message_delta but misses message_start where usage is nested under event.message.usage. As a result, every streaming /v1/messages request through this path drops the input-side fields (input_tokens, cache_read_input_tokens, cache_creation_*) and writes a usage_logs row with input_tokens=0 + output_tokens>0. The user in #2332 observed 2,728 such rows attributed to claude-opus-4-6 / haiku-4-5 streaming requests; their billing on output is correct but the input-side accounting is missing. (Their "duplicate write from message_delta" hypothesis isn't borne out by the code — RecordUsage is invoked once per request and writeUsageLogBestEffort dedupes by request_id; what they're seeing is single records produced by this buggy extractor.) Branch on event.type so message_start reads from event.message.usage and other events keep using event.usage, matching how parseSSEUsagePassthrough already handles both shapes for the Anthropic OAuth / API-key / Bedrock paths. Adds two extractSSEUsage table cases plus a TestExtractSSEUsage_StreamingSequence that drives the message_start → message_delta sequence end-to-end; both fail on main and pass with this change. Fixes #2332 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-27 09:02:15 +00:00
wucm667	a31b507484	fix(scheduler): on a model 404, cool down only the account-model combination	2026-05-26 20:29:48 +08:00
benjamin	47fe90eab4	fix(antigravity): mark whitelist denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:37 +08:00
name	2eb622f2f6	Remove ops retry replay storage	2026-05-19 19:37:41 +08:00
2ue	4840194b18	Fix lint issues in image billing change	2026-05-12 16:04:24 +08:00
2ue	bb4c1abe28	Fix image billing size normalization	2026-05-12 15:21:31 +08:00
erio	3ee6f085db	refactor: extract internal500 penalty logic to dedicated file Move constants, detection, and penalty functions from antigravity_gateway_service.go to antigravity_internal500_penalty.go. Fix gofmt alignment and replace hardcoded duration strings with constant references.	2026-03-27 20:11:24 +08:00
erio	7cca69a136	fix: move internal500 counter reset to cover all success paths Move the reset logic after urlFallbackLoop so it covers both direct success and smart retry (429/503) success paths.	2026-03-27 20:11:24 +08:00
erio	093a5a260e	feat(antigravity): progressive penalty for consecutive INTERNAL 500 errors When an antigravity account returns 500 "Internal error encountered." on all 3 retry attempts, increment a Redis counter and apply escalating penalties: - 1st round: temp unschedulable 10 minutes - 2nd round: temp unschedulable 10 hours - 3rd round: permanently mark as error Counter resets on any successful response (< 400).	2026-03-27 20:11:24 +08:00
Ethan0x0000	db9021f9c1	feat(ops): propagate endpoint/request-type context in handlers; add UpstreamURL to upstream error events	2026-03-21 23:47:39 +08:00
Ethan0x0000	bac408044f	fix(provider): preserve requested model in antigravity and sora Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-03-21 01:24:30 +08:00
erio	528ff5d28c	fix(antigravity): fast-fail on proxy unavailable, temp-unschedule account ## Problem When a proxy is unreachable, token refresh retries up to 4 times with 30s timeout each, causing requests to hang for ~2 minutes before failing with a generic 502 error. The failed account is not marked, so subsequent requests keep hitting it. ## Changes ### Proxy connection fast-fail - Set TCP dial timeout to 5s and TLS handshake timeout to 5s on antigravity client, so proxy connectivity issues fail within 5s instead of 30s - Reduce overall HTTP client timeout from 30s to 10s - Export `IsConnectionError` for service-layer use - Detect proxy connection errors in `RefreshToken` and return immediately with "proxy unavailable" error (no retries) ### Token refresh temp-unschedulable - Add 8s context timeout for token refresh on request path - Mark account as temp-unschedulable for 10min when refresh fails (both background `TokenRefreshService` and request-path `GetAccessToken`) - Sync temp-unschedulable state to Redis cache for immediate scheduler effect - Inject `TempUnschedCache` into `AntigravityTokenProvider` ### Account failover - Return `UpstreamFailoverError` on `GetAccessToken` failure in `Forward`/`ForwardGemini` to trigger handler-level account switch instead of returning 502 directly ### Proxy probe alignment - Apply same 5s dial/TLS timeout to shared `httpclient` pool - Reduce proxy probe timeout from 30s to 10s	2026-03-19 23:48:37 +08:00
erio	a6f99cf534	refactor(antigravity): unify TestConnection with dispatch retry loop TestConnection now reuses antigravityRetryLoop instead of a standalone HTTP loop, gaining credits overages, smart retry, and 429/503 backoff for free. AccountSwitchError is caught and surfaced as a friendly message. Also populates RateLimitedModel in TempUnscheduled switch error. Test fixes: - Use RATE_LIMIT_EXCEEDED in 503 short-delay test to avoid 60x1s timeout - Clamp waitDuration=0 instead of 999s to avoid 15s max-wait timeout - Enhance mockSmartRetryUpstream with repeatLast and body caching	2026-03-17 01:47:08 +08:00
kunish	d795734352	fix(antigravity): add stream keepalive to prevent connection drops Antigravity streaming handlers were missing the keepalive mechanism that exists in the standard gateway, causing proxy/CDN idle timeouts to break connections during long thinking phases (e.g. claude-opus-4-6). This resulted in truncated responses with missing tool calls. Add StreamKeepaliveInterval support to all three Antigravity streaming paths: Claude SSE, Gemini SSE, and upstream passthrough.	2026-03-16 17:37:15 +08:00
erio	0d2061b268	fix: remove ClaudeMax references not yet in upstream/main Remove SimulateClaudeMaxEnabled field and related logic from admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage, applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls from antigravity_gateway_service.go. These symbols are not yet available in upstream/main.	2026-03-16 05:01:42 +08:00
erio	8a260defc2	refactor: replace sync.Map credits state with AICredits rate limit key Replace process-memory sync.Map + per-model runtime state with a single "AICredits" key in model_rate_limits, making credits exhaustion fully isomorphic with model-level rate limiting. Scheduler: rate-limited accounts with overages enabled + credits available are now scheduled instead of excluded. Forwarding: when model is rate-limited + credits available, inject credits proactively without waiting for a 429 round trip. Storage: credits exhaustion stored as model_rate_limits["AICredits"] with 5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey. Frontend: show credits_active (yellow ⚡) when model rate-limited but credits available, credits_exhausted (red) when AICredits key active. Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes, clearCreditsExhausted, and update existing overages tests.	2026-03-16 04:58:58 +08:00
SilentFlower	17e4033340	feat: implement resolveCreditsOveragesModelKey function to stabilize model key resolution for credit overages	2026-03-16 04:58:12 +08:00
haruka	25cb5e7505	fix: first attempt returns 400, second attempt triggers the account-switch signal	2026-03-12 11:30:53 +08:00
haruka	f44927b9f8	add test for fix #935	2026-03-12 11:04:14 +08:00
shaw	a3791104f9	feat: support an admin setting that enables or disables the rectifier switch	2026-03-07 21:55:38 +08:00
Elysia	65a106792a	fix issue #791	2026-03-06 20:37:09 +08:00
yangjianbo	bb664d9bbf	feat(sync): full code sync from release	2026-02-28 15:01:20 +08:00
erio	a6f9f9f968	feat: replace gemini-3-pro-image with gemini-3.1-flash-image - Add migration 060 to update model_mapping for all antigravity accounts - Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings - Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings - Update frontend usage window to show GImage for new model - Update isImageGenerationModel to support new model	2026-02-27 09:52:50 +08:00
erio	4573868c08	fix(antigravity): bill with mapped model and use final model key for rate limiting - Use mapped model (billingModel) instead of original request model for billing - Use resolveFinalAntigravityModelKey for 429 rate limit model key, ensuring rate limit records match the actual upstream model - Add regression tests for both fixes	2026-02-24 18:08:19 +08:00
erio	0dacdf480b	fix: distinguish client disconnection from upstream retry failure Before this change, when a client disconnected mid-request, the error message was "Upstream request failed after retries", which is misleading and pollutes error logs. Now we check context.Err() to return a more accurate "Client disconnected" message for both Claude and Gemini forward paths.	2026-02-24 16:45:08 +08:00
yangjianbo	41d0383fb7	merge(test): merge main and resolve frontend filter conflicts	2026-02-15 22:04:06 +08:00
程序猿MT	1cf51b14f7	Merge branch 'Wei-Shaw:main' into main	2026-02-15 20:49:14 +08:00
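shaw	a817cafe3d	feat: bill Anthropic 5m and 1h cache-creation tokens at separate rates The Anthropic API's cache_creation object distinguishes two kinds of cache-creation tokens, ephemeral_5m and ephemeral_1h, and the 1h unit price is much higher than the 5m price (e.g. claude-3-5-haiku: 5m=$1/MTok, 1h=$6/MTok). Previously the system billed everything at the 5m price, which under-billed. Backend: - pricing_service: load LiteLLM's cache_creation_input_token_cost_above_1hr - billing_service: GetModelPricing enables tiered billing (safety guard 1h>5m); CalculateCost bills 5m and 1h separately and falls back to the 5m price when no breakdown is available - gateway_service: parseSSEUsage/handleNonStreamingResponse use gjson to extract ephemeral_5m/1h_input_tokens from the nested cache_creation object - antigravity_gateway_service: extractSSEUsage/extractClaudeUsage extract the same fields - usage_log: fix the GORM column tag so values are written to the correct database column - Add migration 054: drop the duplicate columns auto-generated by GORM Frontend: - The usage-record tooltip shows the 5m/1h cache-creation breakdown (distinguished by colored badges) - Table cells show a 1h marker next to the cache-write value	2026-02-14 18:15:35 +08:00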
yangjianbo	abf5de69fb	Merge branch 'main' into test	2026-02-12 23:43:47 +08:00
程序猿MT	174d7c774d	Merge branch 'Wei-Shaw:main' into main	2026-02-12 23:12:41 +08:00
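yangjianbo	584cfc3db2	chore(logging): complete the backend log audit and structured-logging migration - Migrate high-volume service and handler logs to the new logging system (LegacyPrintf/structured logs) - Add a stdlog bridge and compatibility tests, preserving the old log-capture capability - Change the OpenAI stream-interruption alert to a structured Warn and rework its tests to capture via a sink - Add the missing logger references in the related backend files and pass the full go test suite	2026-02-12 19:01:09 +08:00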
程序猿MT	8da5fac69e	Merge branch 'Wei-Shaw:main' into main	2026-02-11 18:39:52 +08:00
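SilentFlower	19cca11e00	[UPDATE] Improve Claude Thinking mode support and adapt to Opus 4.6 dynamic budgets ✨ feat(antigravity): support the thinking adaptive type and adapt to Opus 4.6 dynamic budgets 🧪 test(gateway): add edge-case tests for thinking-mode parsing and signature-block filtering	2026-02-11 10:31:16 +08:00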
Edric Li	2a1067c82b	Merge remote-tracking branch 'upstream/main'	2026-02-10 21:52:33 +08:00
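Edric Li	a54b81cf74	perf: optimize error-handling performance - MatchRule defers and bounds body ToLower: short-circuit on statusCode first, convert only when keyword matching is needed, and cap the conversion at 8KB - Precompute each rule's lowercase keywords/platforms and error code set, eliminating repeated runtime ToLower calls and linear scans - Deduplicate MODEL_CAPACITY_EXHAUSTED globally so concurrent requests do not redundantly retry the same model - Lower the 503 retry body read limit from 2MB to 8KB - Replace time.After with time.NewTimer to prevent timer leaks when the context is cancelled	2026-02-10 21:40:31 +08:00
Edric Li	2d4236f76e	fix: make skip_monitoring in error passthrough rules take effect - ops_error_logger: add an OpsSkipPassthroughKey check to the status < 400 branch - ops_upstream_context: add checkSkipMonitoringForUpstreamEvent so intermediate retry/failover events can also set the skip flag - gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted sets OpsSkipPassthroughKey after a rule matches - antigravity_gateway_service: writeMappedClaudeError now calls applyErrorPassthroughRule	2026-02-10 20:56:01 +08:00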
yangjianbo	3b0910f664	Merge branch 'main' into test-sora	2026-02-10 18:01:17 +08:00
程序猿MT	1dd3158c7e	Merge branch 'Wei-Shaw:main' into main	2026-02-10 13:55:51 +08:00
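song	1f647b120a	feat(antigravity): support switching to a single daily/prod URL for forwarding and testing	2026-02-10 13:51:29 +08:00
Edric Li	7d0a30fa8f	merge: sync upstream main (antigravity single-account 503 retry) Merge upstream's new Antigravity single-account 503 backoff retry mechanism and resolve its conflicts with the local MODEL_CAPACITY_EXHAUSTED logic; the two now coexist.	2026-02-10 12:00:21 +08:00
shaw	5dd83d3cf2	fix: remove a specific system block to work around the cache-invalidation bug in the new cc client	2026-02-10 10:28:34 +08:00
Wesley Liddick	14e1aac9b5	Merge pull request #533 from GuangYiDing/feat/antigravity-single-account-503-retry feat: 503 backoff retry mechanism for Antigravity single-account groups	2026-02-10 09:59:48 +08:00
yangjianbo	58912d4ac5	perf(backend): optimize hot-path JSON handling with gjson/sjson Replace json.Unmarshal+json.Marshal on the API gateway hot path with gjson zero-copy queries and sjson targeted writes: - unwrapV1InternalResponse is 22x faster (4009ns→182ns) with 28.5x fewer memory allocations - unwrapGeminiResponse, extractGeminiUsage, estimateGeminiCountTokens, and ParseGeminiRateLimitResetTime now take []byte and extract with gjson - ParseGatewayRequest extracts model/stream/metadata/thinking/max_tokens with type-safe gjson lookups - The handler layer (sora/openai) uses gjson to extract fields and sjson to inject/modify them, removing the map[string]any intermediates - Sora Client response parsing iterates with gjson ForEach, reducing memory allocations - Add about 100 unit test cases; coverage of all changed functions is >85% Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 08:59:30 +08:00
Edric Li	6114f69cca	feat: retry MODEL_CAPACITY_EXHAUSTED 60 times at a fixed 1s interval without switching accounts MODEL_CAPACITY_EXHAUSTED (503) means the model is out of capacity; all accounts share the same capacity pool, so switching accounts is pointless. Retry at a fixed 1s interval up to 60 times instead, and return the upstream error directly once retries are exhausted. - Add antigravityModelCapacityRetryMaxAttempts=60 and antigravityModelCapacityRetryWait=1s - shouldTriggerAntigravitySmartRetry gains an isModelCapacityExhausted return value - handleSmartRetry uses a separate retry strategy for MODEL_CAPACITY_EXHAUSTED - handleModelRateLimit only marks MODEL_CAPACITY_EXHAUSTED as Handled and sets no rate limit - After retries are exhausted: no model rate limit is set, the sticky session is not cleared, and the account is not switched	2026-02-10 02:03:06 +08:00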
Edric Li	d6c2921f2b	feat: same-account retry before failover for transient errors For retryable transient errors (Google 400 "invalid project resource name" and empty stream responses), retry on the same account up to 2 times (with 500ms delay) before switching to another account. - Add RetryableOnSameAccount field to UpstreamFailoverError - Add same-account retry loop in both Gemini and Claude/OpenAI handler paths - Move temp-unschedule from service layer to handler layer (only after all same-account retries exhausted) - Reduce temp-unschedule cooldown from 30 minutes to 1 minute	2026-02-10 00:53:54 +08:00
Edric Li	61c73287dc	feat: failover and temp-unschedule on empty stream response - Empty stream responses now return UpstreamFailoverError instead of plain 502, triggering automatic account switching (up to 10 retries) - Add tempUnscheduleEmptyResponse: accounts returning empty responses are temp-unscheduled for 30 minutes - Apply to both Claude and Gemini non-streaming paths - Align googleConfigErrorCooldown from 60m to 30m for consistency	2026-02-09 23:25:30 +08:00
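Edric Li	89905ec43d	feat: failover and temp-unschedule on Google "Invalid project resource name" 400 The Google backend intermittently returns a 400 "Invalid project resource name" error. Previously this error was passed straight through to the client without triggering an account switch, so the request failed. - On every forwarding path of both the Antigravity and Gemini platforms, an exact match on this error message now triggers failover, which retries on another account - On a match, the account is temporarily blocked for 1 hour to avoid repeatedly scheduling the same faulty account - Extract the shared functions isGoogleProjectConfigError / tempUnscheduleGoogleConfigError to eliminate code duplication across services	2026-02-09 22:48:32 +08:00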
yangjianbo	16131c3d3f	Merge branch 'main' of https://github.com/mt21625457/aicodex2api	2026-02-09 20:26:03 +08:00
erio	6892e84ad2	fix: skip rate limiting when custom error codes don't match upstream status Add ShouldHandleErrorCode guard at the entry of handleGeminiUpstreamError and AntigravityGatewayService.handleUpstreamError so that accounts with custom error codes (e.g. [599]) are not rate-limited when the upstream returns a non-matching status (e.g. 429).	2026-02-09 19:55:05 +08:00
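
Commit 1e6d0b602a above describes branching on event.type so that message_start reads usage from event.message.usage while other events keep reading event.usage. The following is a minimal Go sketch of that idea, not the repository's code: the usage struct, the intField helper, and the extractUsage function are hypothetical names chosen for illustration, and the rule of overwriting a field only when the new value is positive is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage is a hypothetical accumulator; the repository's real type may differ.
type usage struct {
	InputTokens          int
	OutputTokens         int
	CacheReadInputTokens int
}

// intField reads a JSON number from m as an int, returning 0 when the key
// is absent or not a number (encoding/json decodes numbers as float64).
func intField(m map[string]any, key string) int {
	f, _ := m[key].(float64)
	return int(f)
}

// extractUsage merges the usage fields of one decoded SSE event into u.
// message_start nests usage under "message"; other events carry it at the
// top level, so the lookup branches on the event type.
func extractUsage(data []byte, u *usage) {
	var event map[string]any
	if err := json.Unmarshal(data, &event); err != nil {
		return // ignore payloads that are not JSON objects
	}
	raw := event["usage"]
	if event["type"] == "message_start" {
		msg, _ := event["message"].(map[string]any)
		raw = msg["usage"] // indexing a nil map safely yields nil
	}
	m, ok := raw.(map[string]any)
	if !ok {
		return
	}
	// Assumption: only positive values overwrite what was already seen.
	if v := intField(m, "input_tokens"); v > 0 {
		u.InputTokens = v
	}
	if v := intField(m, "output_tokens"); v > 0 {
		u.OutputTokens = v
	}
	if v := intField(m, "cache_read_input_tokens"); v > 0 {
		u.CacheReadInputTokens = v
	}
}

func main() {
	var u usage
	extractUsage([]byte(`{"type":"message_start","message":{"usage":{"input_tokens":12,"cache_read_input_tokens":3}}}`), &u)
	extractUsage([]byte(`{"type":"message_delta","usage":{"output_tokens":7}}`), &u)
	fmt.Printf("%+v\n", u)
}
```

Feeding the message_start event first and the message_delta event second mirrors the streaming sequence the commit's test is said to drive; an extractor that only read the top-level usage key would leave the input-side fields at zero.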


181 Commits