2495 Commits

Author SHA1 Message Date
win
5123d92b44 feat(scheduling): add cross-tier fallback chain (subscription → API Key → Bedrock)
Adds an opt-in tier-based fallback scheduling path for Anthropic accounts:
- accountTierLevel(): derives tier from account type without DB migration
  (tier-0=OAuth/SetupToken, tier-1=APIKey, tier-2=Bedrock)
- enableTierFallbackChain(): new config flag
  gateway.scheduling.enable_tier_fallback_chain (default false)
- selectAccountWithTierFallback(): loads all Anthropic accounts, groups by
  tier, honors sticky sessions, applies all existing schedulability guards,
  then tries tiers 0→1→2 in order via tryAcquireByLegacyOrder
- Wired into SelectAccountForModelWithExclusions: Anthropic platform +
  tier fallback enabled → calls new path instead of mixed scheduling
- Fix pre-existing unit-test build break: NewGatewayService now requires
  *RPMTokenBucketService (added in Task #5); add missing nil param
- 7 tests: tier mapping, config toggle, subscription preference,
  APIKey fallback, exclusion handling, empty-pool error, Bedrock last resort
2026-04-29 03:23:39 +08:00
win
a2ab67f8c7 feat(scheduler): add P2C + quota-aware scheduling for OpenAI accounts
- Add GetQuotaRemainingFraction() to Account: returns [0,1] fraction of
  remaining quota; 1.0 when no limit is configured (unlimited accounts)
- Add Quota float64 weight field to GatewayOpenAIWSSchedulerScoreWeights
  and EnableP2CScheduling bool to GatewayOpenAIWSConfig (both default off)
- Extend selectByLoadBalance scoring with quota factor (gated by Quota>0)
- Add selectByPowerOfTwo(): O(1) P2C selection — samples 2 random candidates,
  tries the better-scored one first then the other, falls back to wait plan;
  activated when EnableP2CScheduling=true
- Add openAIWSP2CEnabled() helper on OpenAIGatewayService
- Add 6 tests covering quota fraction edge cases, P2C toggle, weight defaults,
  single-candidate P2C, two-candidate P2C selection, and quota score ordering
2026-04-29 03:13:30 +08:00
win
d1e2d39c26 feat(viewer): add real-time request stream WebSocket endpoint
Adds GET /api/v1/admin/ops/ws/requests — a fan-out WebSocket that pushes
per-request metadata (method, path, model, account_id, status, latency_ms)
to all connected admin clients the moment each gateway dispatch completes.

- service/request_event_bus.go: lock-free pub/sub with non-blocking drop
  when per-subscriber buffer (64 slots) is full; nil-safe Publish
- service/request_event_bus_test.go: 6 tests (basic, fanout, drop, nil, close)
- GatewayHandler: records reqStartTime at entry; defer emits RequestEvent on
  every return; sets status success/error/rate_limited in both Gemini and
  Anthropic dispatch paths
- OpsHandler: accepts *RequestEventBus; wires it to RequestStreamWSHandler
- ops_ws_requests_handler.go: subscribes to bus, pushes JSON per event,
  reuses existing upgrader/conn-limit/ping-pong infrastructure
- Route: ws.GET("/requests", ...) alongside existing /ws/qps
- wire_gen.go: requestEventBus shared between OpsHandler and GatewayHandler
2026-04-29 01:48:15 +08:00
win
d535688bfd feat(context): add proactive context compression for long conversations
- New context_compressor.go: pure functions operating on raw JSON body
  (gjson/sjson pattern). approxTokens uses chars/4 heuristic.
- compressMessages: removes oldest messages from front, treating
  consecutive assistant(tool_use)+user(tool_result) pairs as atomic units
  to prevent orphaned tool_result blocks.
- Hooked into Forward() after StripEmptyTextBlocks, gated on
  account.Credentials[enable_context_compression].
- Config: gateway.context_compression.max_tokens (default 190000).
- 8 unit tests covering: approx tokens, no-op when under budget,
  oldest-message trimming, tool pair preservation, atomic pair removal,
  body passthrough, body trimming.
2026-04-29 01:33:05 +08:00
win
95814974de feat(rpm): add token bucket smoothing for RPM rate limiting
- New RPMTokenBucketService: per-account continuous-refill token buckets
  (rate = rpm/60 tokens/sec, capacity = rpm). No new dependencies.
- GatewayService.AcquireRPMToken() delegates to the bucket service.
- Gateway handler inserts RPM token wait BEFORE wrapReleaseOnDone in both
  Gemini and Anthropic dispatch paths; timeout returns 429 and releases slot.
- Config: gateway.rpm_smoothing.enabled (default false) + max_wait_ms (default 5000).
- 7 unit tests covering: immediate acquire, zero RPM, timeout, wait+refill,
  context cancel, account isolation, bucket reset on RPM change.
2026-04-29 01:22:54 +08:00
win
5c8c15cdb1 feat(refresh,repo): add singleflight to dedupe concurrent token refresh and unschedulable writes
Two anti-thundering-herd improvements:

1. OAuthRefreshAPI.RefreshIfNeeded
   Wrap the existing distributed-lock + DB-reread + executor.Refresh
   pipeline in a per-process singleflight keyed by cacheKey+window.
   Without this, N concurrent goroutines on the same account each pay
   one Redis lock RTT and one DB reread; with it, only the leader pays
   and the rest share the result.

   The refreshWindow is part of the key so a long background-refresh
   window cannot starve a short foreground-refresh window.

2. accountRepository.SetTempUnschedulable
   Wrap the same path (UPDATE + scheduler outbox enqueue + scheduler
   cache sync) in a per-process singleflight keyed by id+until+reason.
   The SQL guard (existing < new) already makes the UPDATE idempotent,
   but N callers still cost N round-trips and N outbox inserts. With
   singleflight, an upstream 401 burst that hits the same account
   collapses to one execution.

Tests cover dedup behavior, key separation by account / refresh window,
and that the SQL exec count drops from N to <=2 (UPDATE + outbox).
2026-04-29 00:43:23 +08:00
win
110902ad4b feat(health): split liveness and readiness probes
Add HealthService with Liveness (no-op) and Readiness (DB+Redis ping
with per-component timeout) checks. Expose three endpoints:

- /healthz : new liveness endpoint, zero-dependency, always 200
- /ready   : new readiness endpoint, returns 503 with details on dep
             failure; suitable for K8s readinessProbe and load balancers
- /health  : preserved for backward compatibility, equivalent to
             /healthz

Switch primary docker-compose healthcheck to /ready so the container
is only marked healthy once DB+Redis are reachable. Standalone/dev/
local compose files keep /health to avoid disrupting existing setups.

Tests: unit tests cover liveness, readiness with both deps healthy,
each dep failing independently, and per-component timeout enforcement.
2026-04-28 23:39:50 +08:00
win
d6df41feaa chore(claude): bump CLI fingerprint to 2.1.88 and accept claude-code/ UA
- Centralize Claude CLI fingerprint constants (UA, x-stainless-*) in
  pkg/claude with BuildCLI/CodeUserAgent helpers
- Reuse constants in DefaultHeaders, identity_service defaults, and
  antigravity identity defaults to keep all callers in sync
- Extend ClaudeCodeValidator to accept both claude-cli/ and claude-code/
  UA prefixes (transport/helper requests use the latter)
- Update related tests to cover the new UA prefix and version
2026-04-28 22:35:24 +08:00
win
6620b56b5a fix: encode ls model credits topic values as base64
Some checks failed
CI / test (push) Failing after 3s
CI / golangci-lint (push) Failing after 2s
Security Scan / backend-security (push) Failing after 3s
Security Scan / frontend-security (push) Failing after 3s
2026-03-31 08:34:00 +08:00
win
0e9f780815 fix: surface ls quota exhaustion in antigravity streams
Some checks failed
CI / test (push) Failing after 3s
CI / golangci-lint (push) Failing after 3s
Security Scan / backend-security (push) Failing after 4s
Security Scan / frontend-security (push) Failing after 3s
2026-03-31 01:26:48 +08:00
win
860fc736bf Merge branch 'codex/internal-sync-20260330'
Some checks failed
CI / test (push) Failing after 16m34s
CI / golangci-lint (push) Failing after 1m33s
Security Scan / backend-security (push) Failing after 33s
Security Scan / frontend-security (push) Failing after 1m32s
2026-03-31 00:00:49 +08:00
win
677556ba05 Merge remote-tracking branch 'internal/main' 2026-03-31 00:00:42 +08:00
win
0cda0e0b96 feat: add dockerized antigravity ls worker mode
Some checks failed
CI / test (push) Failing after 8s
CI / golangci-lint (push) Failing after 5s
Security Scan / backend-security (push) Failing after 7s
Security Scan / frontend-security (push) Failing after 6s
2026-03-30 23:57:25 +08:00
shaw
318aa5e0d3 feat: add cache hit rate line to token usage trend chart
Add a purple dashed line showing cache hit rate percentage
(cache_read / (cache_read + cache_creation)) on a secondary
right Y-axis (0-100%). Applies to both user and admin dashboards.
2026-03-30 21:43:07 +08:00
win
8eb2bbcb20 feat: 从 main 分支迁移 Claude 指纹常量和实例级隔离配置
Some checks failed
CI / test (push) Failing after 5s
CI / golangci-lint (push) Failing after 13s
Security Scan / backend-security (push) Failing after 7s
Security Scan / frontend-security (push) Failing after 7s
将 main 分支的 Claude/Anthropic 相关逆向工作迁移到 codex 分支:
- claude/constants.go: 添加 4 个新 Beta 常量 + 版本升级至 2.1.84/0.74.0
- config.go: 添加 InstanceSalt 和 FingerprintDefaultsConfig 配置
- identity_service: 版本升级 + instanceSalt 支持 + ApplyDefaultFingerprintOverrides
- wire_gen.go: 初始化指纹覆盖 + 使用 NewIdentityServiceWithSalt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:04:52 +08:00
shaw
1dfd974432 chore: update readme
Some checks failed
CI / test (push) Failing after 4s
CI / golangci-lint (push) Failing after 6s
Security Scan / backend-security (push) Failing after 5s
Security Scan / frontend-security (push) Failing after 4s
2026-03-30 16:28:31 +08:00
shaw
cc396f59cf chore: update readme 2026-03-30 16:24:29 +08:00
github-actions[bot]
aa8b9cc508 chore: sync VERSION to 0.1.106 [skip ci] 2026-03-30 08:13:49 +00:00
Wesley Liddick
6a2cf09ee0
Merge pull request #1349 from touwaeriol/feat/antigravity-internal500-penalty
feat(antigravity): progressive penalty for consecutive INTERNAL 500 errors
2026-03-30 15:54:04 +08:00
Wesley Liddick
c6fd88116b
Merge pull request #1354 from wucm667/fix/billing-use-requested-model
fix(billing): 计费始终使用用户请求的原始模型,而非映射后的上游模型
2026-03-30 15:52:31 +08:00
Wesley Liddick
8f0dbdeaba
Merge pull request #1343 from yilinyo/fix/api-key-unique-conflict-after-soft-delete
fix(api-key):软删除apikey后key没有被释放后续无法再自定义相同的key
2026-03-30 15:47:28 +08:00
Wesley Liddick
007c09b84e
Merge pull request #1338 from LvyuanW/fix/safari-ops-log-select
fix(admin): fix Safari system log select height
2026-03-30 15:45:35 +08:00
Wesley Liddick
73f3c068ef
Merge pull request #1344 from 7836246/fix/i18n-sora-storage-missing-keys
fix(i18n): 修复 Sora 存储配置页面表格列头「存储桶」翻译缺失
2026-03-30 15:45:03 +08:00
Wesley Liddick
9a92fa4a60
Merge pull request #1370 from YanzheL/fix/1320-openai-messages-gpt54-xhigh
fix(gateway): normalize gpt-5.4-xhigh for /v1/messages
2026-03-30 15:44:34 +08:00
Wesley Liddick
576af710be
Merge pull request #1352 from StarryKira/feat/add-file-upload-oauth-scope
Feat/add file upload oauth scope
2026-03-30 15:41:18 +08:00
Wesley Liddick
b5642bd068
Merge pull request #1377 from DaydreamCoding/fix/lifecycle-stop-duplicate-close
fix(lifecycle): TokenRefreshService Stop() 防重复 close
2026-03-30 15:38:39 +08:00
Wesley Liddick
128f322252
Merge pull request #1376 from weak-fox/fix/privacy-without-refresh-token
修复缺少 refresh_token 时被临时停调度
2026-03-30 15:38:27 +08:00
Wesley Liddick
17d7e57a2e
Merge pull request #1375 from weak-fox/fix/batch-reset-temp-unsched
修复重置状态时未清理临时停调度
2026-03-30 15:37:58 +08:00
shaw
50288e6b01 fix: 修复模型定价文件更新url 2026-03-30 15:36:53 +08:00
shaw
ab3e44e4bd fix: 适配X-Claude-Code-Session-Id头 2026-03-30 11:43:07 +08:00
QTom
61607990c8 fix(lifecycle): TokenRefreshService Stop() 防重复 close
使用 sync.Once 包裹 close(stopCh),避免多次调用 Stop() 时
触发 panic: close of closed channel。
2026-03-30 10:33:06 +08:00
shaw
b65275235f feat: Anthropic oauth/setup-token账号支持自定义转发URL 2026-03-30 09:10:57 +08:00
weak-fox
e298a71834 fix: clear temp unsched when resetting account status 2026-03-30 00:22:02 +08:00
weak-fox
3f6fa1e3db fix: avoid temp unsched when refresh token is missing 2026-03-30 00:21:51 +08:00
YanzheL
f2c2abe628 fix(openai): keep xhigh normalization scoped to messages 2026-03-29 21:09:19 +08:00
YanzheL
ff5b467fbe fix(handler): normalize compat model for message routing 2026-03-29 20:53:14 +08:00
YanzheL
8c10941142 fix(openai): normalize gpt-5.4-xhigh compat mapping 2026-03-29 20:52:29 +08:00
wucm667
f5764d8dc6 fix(billing): 计费始终使用用户请求的原始模型,而非映射后的上游模型
当账号配置了模型映射(如 claude-sonnet-4-6 → glm-5.0)时,系统错误地
使用映射后的上游模型名计算费用。由于上游模型(如 glm-5.0)在定价系统中
没有价格配置,导致计费失败后被静默置为 0,用户不被扣费。

修改 forwardResultBillingModel 优先返回请求模型名,并移除 OpenAI 路径
中 BillingModel 字段对计费模型的覆盖逻辑。
2026-03-28 16:22:06 +08:00
Elysia
81ca4f12dd 修复误删的url 2026-03-28 00:55:55 +08:00
Elysia
941c469ab9 fix: use standard PKCE code verifier generation
Replace charset→base64url double-encoding with standard random
bytes→base64url approach to match official client behavior and avoid
risk control detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 00:47:31 +08:00
Elysia
8fcd819e6f feat: add user:file_upload OAuth scope
Align OAuth scopes with upstream Claude Code client which now includes
the user:file_upload scope for file upload support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 00:40:36 +08:00
erio
9abdaed20c style: gofmt antigravity_internal500_penalty.go 2026-03-27 20:18:07 +08:00
erio
eb94342f78 chore: adjust internal500 penalty durations to 30m / 2h 2026-03-27 20:11:24 +08:00
erio
d563eb2336 test: add unit tests for INTERNAL 500 progressive penalty
Cover isAntigravityInternalServerError body matching,
applyInternal500Penalty tier escalation, handleInternal500RetryExhausted
nil-safety and error handling, and resetInternal500Counter paths.
2026-03-27 20:11:24 +08:00
erio
3ee6f085db refactor: extract internal500 penalty logic to dedicated file
Move constants, detection, and penalty functions from
antigravity_gateway_service.go to antigravity_internal500_penalty.go.
Fix gofmt alignment and replace hardcoded duration strings with
constant references.
2026-03-27 20:11:24 +08:00
erio
7cca69a136 fix: move internal500 counter reset to cover all success paths
Move the reset logic after urlFallbackLoop so it covers both direct
success and smart retry (429/503) success paths.
2026-03-27 20:11:24 +08:00
erio
093a5a260e feat(antigravity): progressive penalty for consecutive INTERNAL 500 errors
When an antigravity account returns 500 "Internal error encountered."
on all 3 retry attempts, increment a Redis counter and apply escalating
penalties:
- 1st round: temp unschedulable 10 minutes
- 2nd round: temp unschedulable 10 hours
- 3rd round: permanently mark as error

Counter resets on any successful response (< 400).
2026-03-27 20:11:24 +08:00
小海
2c072c0ed6 fix(i18n): add missing bucket column translation key for Sora S3 storage settings
The `admin.settings.soraS3.columns.bucket` key was used in
DataManagementView.vue but missing from both en.ts and zh.ts locale
files, causing the raw translation key to be displayed as a column
header instead of the localized text.
2026-03-27 16:44:14 +08:00
YilinMacAir
1f39bf8a78 fix:修复由于数据库唯一键导致软删除apikey后key没有被释放后续无法再自定义相同的key 2026-03-27 16:37:10 +08:00
github-actions[bot]
fdd8499ffc chore: sync VERSION to 0.1.105 [skip ci] 2026-03-27 08:04:27 +00:00