185 Commits

Author SHA1 Message Date
win
9da079a5ee x
Some checks failed
Security Scan / backend-security (push) Failing after 3s
Security Scan / frontend-security (push) Failing after 5s
CI / test (push) Failing after 3s
CI / frontend (push) Failing after 3s
CI / golangci-lint (push) Failing after 3s
CI / windsurf-platform (macos-latest) (push) Has been cancelled
CI / windsurf-platform (windows-latest) (push) Has been cancelled
2026-04-27 19:01:41 +08:00
win
898a65314c chore: 删除 Antigravity 订制代码,回退至上游 v0.1.118
Some checks failed
CI / test (push) Failing after 3s
CI / frontend (push) Failing after 4s
CI / golangci-lint (push) Failing after 6s
CI / windsurf-platform (macos-latest) (push) Has been cancelled
CI / windsurf-platform (windows-latest) (push) Has been cancelled
Security Scan / backend-security (push) Failing after 3s
Security Scan / frontend-security (push) Failing after 3s
- 删除自定义文件:gateway_attribution, gateway_claude_runtime_headers,
  identity_service_antigravity, language_server_service, lsrpc_handler,
  antigravity_http handler/routes, 所有 antigravity 专项测试
- 将 antigravity pkg/service 文件回退至上游版本(移除 IsEnterprise、
  claude_code_tool_map、dynamic fingerprint 等定制逻辑)
- 修复 gateway_service.go:移除 NormalizeSystemPromptEnv、
  generateSessionIDForAccount、applyClaudeRuntimeOptionalHeaders 调用,
  使用上游的 session-id 同步逻辑
- 恢复 language_server_pb gen 文件(Windsurf local_ls.go 依赖)
- 保留全部 Windsurf 集成代码不变
2026-04-25 22:35:48 +08:00
win
9156585a23 chore: gofmt/goimports 后处理
合并上游后统一运行 gofmt/goimports,消除排序差异与空行不一致。
2026-04-24 11:52:53 +08:00
win
21325afb33 feat(windsurf): 补全ops日志记录与endpoint派生,对齐其他平台
Some checks failed
CI / test (push) Failing after 10s
CI / frontend (push) Failing after 8s
CI / golangci-lint (push) Failing after 5s
Security Scan / backend-security (push) Failing after 5s
Security Scan / frontend-security (push) Failing after 4s
- windsurf_gateway_service: 添加上游延迟/TTFT/错误上下文记录
- endpoint: DeriveUpstreamEndpoint 添加 PlatformWindsurf 分支
- ops_error_logger: guessPlatformFromPath 添加 /windsurf/ 识别
2026-04-23 20:46:27 +08:00
win
3403e8401c fix: revert antigravity Forward to v1internal REST path, remove broken lsrpc upstream call
lsrpc is local IPC (IDE ↔ language_server binary), not an upstream protocol.
cloudcode-pa.googleapis.com does not serve gRPC/lsrpc endpoints.
Restores antigravityRetryLoop + streamGenerateContent path which was working.
Removes antigravity_lsrpc.go (upstream caller) and lsrpc_test cmd.
Keeps lsrpc_handler.go (server side, receives IDE connections).
2026-04-19 20:03:34 +08:00
win
435ae221bc x
Some checks failed
CI / test (push) Failing after 1m32s
CI / golangci-lint (push) Failing after 31s
Security Scan / backend-security (push) Failing after 1m32s
Security Scan / frontend-security (push) Failing after 9s
2026-04-16 19:11:47 +08:00
win
12ae97b755 fix: Increase maxOutputTokens in Antigravity test request from 1 to 10
The test request was using maxOutputTokens: 1, which caused Google API to
generate only 1 token. When decoded, this single token produced "It" as the
response, making it look like an error.

Changed:
- Content: "." → "Test connection" (more meaningful prompt)
- MaxTokens: 1 → 10 (enough tokens to verify connection is working)

This fixes the issue where account test always showed "It" in the response,
which was actually just the truncated output from the single-token generation.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-11 18:49:53 +08:00
win
b01a44cd39 perf: Optimize Antigravity MODEL_CAPACITY_EXHAUSTED retry strategy
- Reduce max retry attempts from 60 to 10 (exponential backoff prevents pile-up)
- Replace fixed 1s delays with exponential backoff: 1s, 2s, 4s, 8s, 16s, 32s
- Add ±10% jitter to prevent thundering herd effect
- Cap max wait at 32 seconds to avoid excessive delays
- Improves response time when API is temporarily unavailable

Before: ~60s worst case (60 * 1s fixed delays)
After:  ~10s worst case (exponential backoff with cap)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-10 23:05:39 +08:00
win
6160636ca6 feat: Complete Go→Node.js TLS/HTTP emulation for Claude API requests
Implement comprehensive Claude Code client emulation to ensure all Go-originated
requests are indistinguishable from Node.js clients at the TLS and HTTP levels.

## Core Changes

### 1. TLS Fingerprint Enhancements
- **Enable HTTP/2**: Set ForceAttemptHTTP2=true in TLS transport to match Node.js 24.x
  behavior (HTTP/2 is preferred by modern Node.js)
- **ALPN Protocol Priority**: Changed from ["http/1.1"] to ["h2", "http/1.1"] to
  advertise HTTP/2 preference, matching actual Node.js client capability

### 2. Request Header Validation & Cleaning (Monkey Patch)
- Created new claudemask package for Node.js emulation validation
- ValidateNodeEmulation(): Verify all required Node.js headers present
- CleanRequest(): Fix any Go client indicators that slip through (Go User-Agent, etc)
- Applied in buildUpstreamRequest() as final validation before sending to Claude API
- Validates 8 required headers: User-Agent, X-Stainless-*, anthropic-version

### 3. Comprehensive Testing
- 8 unit tests covering validation and cleaning scenarios
- Tests verify: valid requests pass, missing headers detected, Go client headers fixed
- All tests passing ✓

## Why This Works

1. **TLS Level**: HTTP/2 negotiation via ALPN matches real Claude Code behavior
2. **HTTP Level**: All X-Stainless headers properly injected (language, runtime, OS)
3. **Fallback**: CleanRequest() catches any missed emulation as safety net
4. **Detection**: ValidateNodeEmulation() logs any inconsistencies for debugging

## Files Modified
- internal/pkg/tlsfingerprint/dialer.go: ALPN protocol priority
- internal/repository/http_upstream.go: Enable HTTP/2
- internal/service/gateway_service.go: Integrate validation/cleaning
- internal/pkg/claudemask/mask.go: New validation module (8 functions)
- internal/pkg/claudemask/mask_test.go: New test suite (8 tests)

## Result
Go requests now sent to Claude API are 100% consistent with Node.js clients:
- JA3/JA4 TLS fingerprints match
- HTTP/2 ALPN negotiation correct
- All identification headers present and consistent
- Fallback cleaning ensures no Go client leakage

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-10 19:24:16 +08:00
win
0e9f780815 fix: surface ls quota exhaustion in antigravity streams
Some checks failed
CI / test (push) Failing after 3s
CI / golangci-lint (push) Failing after 3s
Security Scan / backend-security (push) Failing after 4s
Security Scan / frontend-security (push) Failing after 3s
2026-03-31 01:26:48 +08:00
win
0cda0e0b96 feat: add dockerized antigravity ls worker mode
Some checks failed
CI / test (push) Failing after 8s
CI / golangci-lint (push) Failing after 5s
Security Scan / backend-security (push) Failing after 7s
Security Scan / frontend-security (push) Failing after 6s
2026-03-30 23:57:25 +08:00
erio
3ee6f085db refactor: extract internal500 penalty logic to dedicated file
Move constants, detection, and penalty functions from
antigravity_gateway_service.go to antigravity_internal500_penalty.go.
Fix gofmt alignment and replace hardcoded duration strings with
constant references.
2026-03-27 20:11:24 +08:00
erio
7cca69a136 fix: move internal500 counter reset to cover all success paths
Move the reset logic after urlFallbackLoop so it covers both direct
success and smart retry (429/503) success paths.
2026-03-27 20:11:24 +08:00
erio
093a5a260e feat(antigravity): progressive penalty for consecutive INTERNAL 500 errors
When an antigravity account returns 500 "Internal error encountered."
on all 3 retry attempts, increment a Redis counter and apply escalating
penalties:
- 1st round: temp unschedulable 10 minutes
- 2nd round: temp unschedulable 10 hours
- 3rd round: permanently mark as error

Counter resets on any successful response (< 400).
2026-03-27 20:11:24 +08:00
Ethan0x0000
db9021f9c1 feat(ops): propagate endpoint/request-type context in handlers; add UpstreamURL to upstream error events 2026-03-21 23:47:39 +08:00
Ethan0x0000
bac408044f fix(provider): preserve requested model in antigravity and sora
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-21 01:24:30 +08:00
erio
528ff5d28c fix(antigravity): fast-fail on proxy unavailable, temp-unschedule account
## Problem

When a proxy is unreachable, token refresh retries up to 4 times with
30s timeout each, causing requests to hang for ~2 minutes before
failing with a generic 502 error. The failed account is not marked,
so subsequent requests keep hitting it.

## Changes

### Proxy connection fast-fail
- Set TCP dial timeout to 5s and TLS handshake timeout to 5s on
  antigravity client, so proxy connectivity issues fail within 5s
  instead of 30s
- Reduce overall HTTP client timeout from 30s to 10s
- Export `IsConnectionError` for service-layer use
- Detect proxy connection errors in `RefreshToken` and return
  immediately with "proxy unavailable" error (no retries)

### Token refresh temp-unschedulable
- Add 8s context timeout for token refresh on request path
- Mark account as temp-unschedulable for 10min when refresh fails
  (both background `TokenRefreshService` and request-path
  `GetAccessToken`)
- Sync temp-unschedulable state to Redis cache for immediate
  scheduler effect
- Inject `TempUnschedCache` into `AntigravityTokenProvider`

### Account failover
- Return `UpstreamFailoverError` on `GetAccessToken` failure in
  `Forward`/`ForwardGemini` to trigger handler-level account switch
  instead of returning 502 directly

### Proxy probe alignment
- Apply same 5s dial/TLS timeout to shared `httpclient` pool
- Reduce proxy probe timeout from 30s to 10s
2026-03-19 23:48:37 +08:00
erio
a6f99cf534 refactor(antigravity): unify TestConnection with dispatch retry loop
TestConnection now reuses antigravityRetryLoop instead of a standalone
HTTP loop, gaining credits overages, smart retry, and 429/503 backoff
for free. AccountSwitchError is caught and surfaced as a friendly
message. Also populates RateLimitedModel in TempUnscheduled switch error.

Test fixes:
- Use RATE_LIMIT_EXCEEDED in 503 short-delay test to avoid 60x1s timeout
- Clamp waitDuration=0 instead of 999s to avoid 15s max-wait timeout
- Enhance mockSmartRetryUpstream with repeatLast and body caching
2026-03-17 01:47:08 +08:00
kunish
d795734352
fix(antigravity): add stream keepalive to prevent connection drops
Antigravity streaming handlers were missing the keepalive mechanism
that exists in the standard gateway, causing proxy/CDN idle timeouts
to break connections during long thinking phases (e.g. claude-opus-4-6).
This resulted in truncated responses with missing tool calls.

Add StreamKeepaliveInterval support to all three Antigravity streaming
paths: Claude SSE, Gemini SSE, and upstream passthrough.
2026-03-16 17:37:15 +08:00
erio
0d2061b268 fix: remove ClaudeMax references not yet in upstream/main
Remove SimulateClaudeMaxEnabled field and related logic from
admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
from antigravity_gateway_service.go. These symbols are not yet
available in upstream/main.
2026-03-16 05:01:42 +08:00
erio
8a260defc2 refactor: replace sync.Map credits state with AICredits rate limit key
Replace process-memory sync.Map + per-model runtime state with a single
"AICredits" key in model_rate_limits, making credits exhaustion fully
isomorphic with model-level rate limiting.

Scheduler: rate-limited accounts with overages enabled + credits available
are now scheduled instead of excluded.

Forwarding: when model is rate-limited + credits available, inject credits
proactively without waiting for a 429 round trip.

Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.

Frontend: show credits_active (yellow ) when model rate-limited but
credits available, credits_exhausted (red) when AICredits key active.

Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
clearCreditsExhausted, and update existing overages tests.
2026-03-16 04:58:58 +08:00
SilentFlower
17e4033340 feat: implement resolveCreditsOveragesModelKey function to stabilize model key resolution for credit overages 2026-03-16 04:58:12 +08:00
haruka
25cb5e7505 fix 第一次 400,第二次触发切账号信号 2026-03-12 11:30:53 +08:00
haruka
f44927b9f8 add test for fix #935 2026-03-12 11:04:14 +08:00
shaw
a3791104f9 feat: 支持后台设置是否启用整流开关 2026-03-07 21:55:38 +08:00
Elysia
65a106792a fix issue #791 2026-03-06 20:37:09 +08:00
yangjianbo
bb664d9bbf feat(sync): full code sync from release 2026-02-28 15:01:20 +08:00
erio
a6f9f9f968 feat: replace gemini-3-pro-image with gemini-3.1-flash-image
- Add migration 060 to update model_mapping for all antigravity accounts
- Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
- Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
- Update frontend usage window to show GImage for new model
- Update isImageGenerationModel to support new model
2026-02-27 09:52:50 +08:00
erio
4573868c08 fix(antigravity): bill with mapped model and use final model key for rate limiting
- Use mapped model (billingModel) instead of original request model for billing
- Use resolveFinalAntigravityModelKey for 429 rate limit model key,
  ensuring rate limit records match the actual upstream model
- Add regression tests for both fixes
2026-02-24 18:08:19 +08:00
erio
0dacdf480b fix: distinguish client disconnection from upstream retry failure
Before this change, when a client disconnected mid-request, the error
message was "Upstream request failed after retries", which is misleading
and pollutes error logs. Now we check context.Err() to return a more
accurate "Client disconnected" message for both Claude and Gemini
forward paths.
2026-02-24 16:45:08 +08:00
yangjianbo
41d0383fb7 merge(test): 合并 main 并解决前端筛选器冲突 2026-02-15 22:04:06 +08:00
程序猿MT
1cf51b14f7
Merge branch 'Wei-Shaw:main' into main 2026-02-15 20:49:14 +08:00
shaw
a817cafe3d feat: 区分 Anthropic 5m/1h 缓存创建 token 的差异化计费
Anthropic API 的 cache_creation 对象区分了 ephemeral_5m 和 ephemeral_1h
两种缓存创建 token,1h 单价远高于 5m(如 claude-3-5-haiku: 5m=$1/MTok,
1h=$6/MTok)。此前系统统一按 5m 单价计费,导致计费偏低。

后端:
- pricing_service: 加载 LiteLLM 的 cache_creation_input_token_cost_above_1hr
- billing_service: GetModelPricing 启用分类计费(安全守卫 1h>5m),
  CalculateCost 按 5m/1h 分别计费,无明细时回退到 5m 单价
- gateway_service: parseSSEUsage/handleNonStreamingResponse 用 gjson
  提取嵌套 cache_creation 对象的 ephemeral_5m/1h_input_tokens
- antigravity_gateway_service: extractSSEUsage/extractClaudeUsage 同步提取
- usage_log: 修复 GORM column tag 确保写入正确的数据库列
- 新增迁移 054: 删除 GORM 自动生成的重复列

前端:
- 使用记录 tooltip 展示 5m/1h 缓存创建明细(带彩色 badge 区分)
- 表格单元格缓存写入数值旁显示 1h 标识
2026-02-14 18:15:35 +08:00
yangjianbo
abf5de69fb Merge branch 'main' into test 2026-02-12 23:43:47 +08:00
程序猿MT
174d7c774d
Merge branch 'Wei-Shaw:main' into main 2026-02-12 23:12:41 +08:00
yangjianbo
584cfc3db2 chore(logging): 完成后端日志审计与结构化迁移
- 将高密度服务与处理器日志迁移到新日志系统(LegacyPrintf/结构化日志)
- 增加 stdlog bridge 与兼容测试,保留旧日志捕获能力
- 将 OpenAI 断流告警改为结构化 Warn 并改造对应测试为 sink 捕获
- 补齐后端相关文件 logger 引用并通过全量 go test
2026-02-12 19:01:09 +08:00
程序猿MT
8da5fac69e
Merge branch 'Wei-Shaw:main' into main 2026-02-11 18:39:52 +08:00
SilentFlower
19cca11e00 [UPDATE] 增强 Claude Thinking 模式支持与 Opus 4.6 动态预算适配
 feat(antigravity): 支持 thinking adaptive 类型并适配 Opus 4.6 动态预算
🧪 test(gateway): 增加 thinking 模式解析与签名块过滤的边界用例测试
2026-02-11 10:31:16 +08:00
Edric Li
2a1067c82b Merge remote-tracking branch 'upstream/main' 2026-02-10 21:52:33 +08:00
Edric Li
a54b81cf74 perf: 错误处理性能优化
- MatchRule 延迟/限制 body ToLower,先用 statusCode 短路,只在需要关键词匹配时转换且限制 8KB
- 预计算规则的小写关键词/平台和 error code set,消除运行时重复 ToLower 和线性扫描
- MODEL_CAPACITY_EXHAUSTED 全局去重,避免并发请求重复重试同一模型
- 503 重试 body 读取限制从 2MB 降至 8KB
- time.After 替换为 time.NewTimer,防止 context 取消时 timer 泄漏
2026-02-10 21:40:31 +08:00
Edric Li
2d4236f76e fix: 修复错误透传规则 skip_monitoring 未生效的问题
- ops_error_logger: status < 400 分支增加 OpsSkipPassthroughKey 检查
- ops_upstream_context: 新增 checkSkipMonitoringForUpstreamEvent,中间重试/故障转移事件也能触发跳过标记
- gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted 匹配规则后设置 OpsSkipPassthroughKey
- antigravity_gateway_service: writeMappedClaudeError 增加 applyErrorPassthroughRule 调用
2026-02-10 20:56:01 +08:00
yangjianbo
3b0910f664 Merge branch 'main' into test-sora 2026-02-10 18:01:17 +08:00
程序猿MT
1dd3158c7e
Merge branch 'Wei-Shaw:main' into main 2026-02-10 13:55:51 +08:00
song
1f647b120a feat(antigravity): 转发与测试支持daily/prod单URL切换 2026-02-10 13:51:29 +08:00
Edric Li
7d0a30fa8f merge: sync upstream main (antigravity single-account 503 retry)
合并上游新增的 Antigravity 单账号 503 退避重试机制,
解决与本地 MODEL_CAPACITY_EXHAUSTED 逻辑的冲突,两者共存。
2026-02-10 12:00:21 +08:00
shaw
5dd83d3cf2 fix: 移除特定system以适配新版cc客户端缓存失效的bug 2026-02-10 10:28:34 +08:00
Wesley Liddick
14e1aac9b5
Merge pull request #533 from GuangYiDing/feat/antigravity-single-account-503-retry
feat: Antigravity 单账号分组 503 退避重试机制
2026-02-10 09:59:48 +08:00
yangjianbo
58912d4ac5 perf(backend): 使用 gjson/sjson 优化热路径 JSON 处理
将 API 网关热路径中的 json.Unmarshal+json.Marshal 替换为 gjson 零拷贝查询和 sjson 精准写入:
- unwrapV1InternalResponse 性能提升 22x(4009ns→182ns),内存分配减少 28.5x
- unwrapGeminiResponse、extractGeminiUsage、estimateGeminiCountTokens、ParseGeminiRateLimitResetTime 改为接收 []byte 使用 gjson 提取
- ParseGatewayRequest 的 model/stream/metadata/thinking/max_tokens 改用 gjson 类型安全提取
- Handler 层(sora/openai)改用 gjson 提取字段、sjson 注入/修改字段,移除 map[string]any 中间变量
- Sora Client 响应解析改用 gjson ForEach 遍历,减少内存分配
- 新增约 100 个单元测试用例,所有改动函数覆盖率 >85%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 08:59:30 +08:00
Edric Li
6114f69cca feat: MODEL_CAPACITY_EXHAUSTED 使用固定1s间隔重试60次,不切换账号
MODEL_CAPACITY_EXHAUSTED (503) 表示模型容量不足,所有账号共享同一容量池,
切换账号无意义。改为固定1s间隔重试最多60次,重试耗尽后直接返回上游错误。

- 新增 antigravityModelCapacityRetryMaxAttempts=60 和 antigravityModelCapacityRetryWait=1s
- shouldTriggerAntigravitySmartRetry 新增 isModelCapacityExhausted 返回值
- handleSmartRetry 对 MODEL_CAPACITY_EXHAUSTED 使用独立重试策略
- handleModelRateLimit 对 MODEL_CAPACITY_EXHAUSTED 仅标记 Handled,不设限流
- 重试耗尽后不设置模型限流、不清除粘性会话、不切换账号
2026-02-10 02:03:06 +08:00
Edric Li
d6c2921f2b feat: same-account retry before failover for transient errors
For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.

- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
  all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute
2026-02-10 00:53:54 +08:00