sub2api

Author	SHA1	Message	Date
win	fdd2d08a4d	feat: merge feat/omniroute-ideas — P2C scheduler, quota scoring, tier fallback	2026-04-29 15:42:37 +08:00
win	0a3666ef24	x	2026-04-29 10:32:36 +08:00
win	5123d92b44	feat(scheduling): add cross-tier fallback chain (subscription → API Key → Bedrock) Adds an opt-in tier-based fallback scheduling path for Anthropic accounts: - accountTierLevel(): derives tier from account type without DB migration (tier-0=OAuth/SetupToken, tier-1=APIKey, tier-2=Bedrock) - enableTierFallbackChain(): new config flag gateway.scheduling.enable_tier_fallback_chain (default false) - selectAccountWithTierFallback(): loads all Anthropic accounts, groups by tier, honors sticky sessions, applies all existing schedulability guards, then tries tiers 0→1→2 in order via tryAcquireByLegacyOrder - Wired into SelectAccountForModelWithExclusions: Anthropic platform + tier fallback enabled → calls new path instead of mixed scheduling - Fix pre-existing unit-test build break: NewGatewayService now requires *RPMTokenBucketService (added in Task #5); add missing nil param - 7 tests: tier mapping, config toggle, subscription preference, APIKey fallback, exclusion handling, empty-pool error, Bedrock last resort	2026-04-29 03:23:39 +08:00
win	a2ab67f8c7	feat(scheduler): add P2C + quota-aware scheduling for OpenAI accounts - Add GetQuotaRemainingFraction() to Account: returns [0,1] fraction of remaining quota; 1.0 when no limit is configured (unlimited accounts) - Add Quota float64 weight field to GatewayOpenAIWSSchedulerScoreWeights and EnableP2CScheduling bool to GatewayOpenAIWSConfig (both default off) - Extend selectByLoadBalance scoring with quota factor (gated by Quota>0) - Add selectByPowerOfTwo(): O(1) P2C selection — samples 2 random candidates, tries the better-scored one first then the other, falls back to wait plan; activated when EnableP2CScheduling=true - Add openAIWSP2CEnabled() helper on OpenAIGatewayService - Add 6 tests covering quota fraction edge cases, P2C toggle, weight defaults, single-candidate P2C, two-candidate P2C selection, and quota score ordering	2026-04-29 03:13:30 +08:00
win	d1e2d39c26	feat(viewer): add real-time request stream WebSocket endpoint Adds GET /api/v1/admin/ops/ws/requests — a fan-out WebSocket that pushes per-request metadata (method, path, model, account_id, status, latency_ms) to all connected admin clients the moment each gateway dispatch completes. - service/request_event_bus.go: lock-free pub/sub with non-blocking drop when per-subscriber buffer (64 slots) is full; nil-safe Publish - service/request_event_bus_test.go: 6 tests (basic, fanout, drop, nil, close) - GatewayHandler: records reqStartTime at entry; defer emits RequestEvent on every return; sets status success/error/rate_limited in both Gemini and Anthropic dispatch paths - OpsHandler: accepts *RequestEventBus; wires it to RequestStreamWSHandler - ops_ws_requests_handler.go: subscribes to bus, pushes JSON per event, reuses existing upgrader/conn-limit/ping-pong infrastructure - Route: ws.GET("/requests", ...) alongside existing /ws/qps - wire_gen.go: requestEventBus shared between OpsHandler and GatewayHandler	2026-04-29 01:48:15 +08:00
win	d535688bfd	feat(context): add proactive context compression for long conversations - New context_compressor.go: pure functions operating on raw JSON body (gjson/sjson pattern). approxTokens uses chars/4 heuristic. - compressMessages: removes oldest messages from front, treating consecutive assistant(tool_use)+user(tool_result) pairs as atomic units to prevent orphaned tool_result blocks. - Hooked into Forward() after StripEmptyTextBlocks, gated on account.Credentials[enable_context_compression]. - Config: gateway.context_compression.max_tokens (default 190000). - 8 unit tests covering: approx tokens, no-op when under budget, oldest-message trimming, tool pair preservation, atomic pair removal, body passthrough, body trimming.	2026-04-29 01:33:05 +08:00
win	95814974de	feat(rpm): add token bucket smoothing for RPM rate limiting - New RPMTokenBucketService: per-account continuous-refill token buckets (rate = rpm/60 tokens/sec, capacity = rpm). No new dependencies. - GatewayService.AcquireRPMToken() delegates to the bucket service. - Gateway handler inserts RPM token wait BEFORE wrapReleaseOnDone in both Gemini and Anthropic dispatch paths; timeout returns 429 and releases slot. - Config: gateway.rpm_smoothing.enabled (default false) + max_wait_ms (default 5000). - 7 unit tests covering: immediate acquire, zero RPM, timeout, wait+refill, context cancel, account isolation, bucket reset on RPM change.	2026-04-29 01:22:54 +08:00
win	5c8c15cdb1	feat(refresh,repo): add singleflight to dedupe concurrent token refresh and unschedulable writes Two anti-thundering-herd improvements: 1. OAuthRefreshAPI.RefreshIfNeeded Wrap the existing distributed-lock + DB-reread + executor.Refresh pipeline in a per-process singleflight keyed by cacheKey+window. Without this, N concurrent goroutines on the same account each pay one Redis lock RTT and one DB reread; with it, only the leader pays and the rest share the result. The refreshWindow is part of the key so a long background-refresh window cannot starve a short foreground-refresh window. 2. accountRepository.SetTempUnschedulable Wrap the same path (UPDATE + scheduler outbox enqueue + scheduler cache sync) in a per-process singleflight keyed by id+until+reason. The SQL guard (existing < new) already makes the UPDATE idempotent, but N callers still cost N round-trips and N outbox inserts. With singleflight, an upstream 401 burst that hits the same account collapses to one execution. Tests cover dedup behavior, key separation by account / refresh window, and that the SQL exec count drops from N to <=2 (UPDATE + outbox).	2026-04-29 00:43:23 +08:00
win	110902ad4b	feat(health): split liveness and readiness probes Add HealthService with Liveness (no-op) and Readiness (DB+Redis ping with per-component timeout) checks. Expose three endpoints: - /healthz : new liveness endpoint, zero-dependency, always 200 - /ready : new readiness endpoint, returns 503 with details on dep failure; suitable for K8s readinessProbe and load balancers - /health : preserved for backward compatibility, equivalent to /healthz Switch primary docker-compose healthcheck to /ready so the container is only marked healthy once DB+Redis are reachable. Standalone/dev/ local compose files keep /health to avoid disrupting existing setups. Tests: unit tests cover liveness, readiness with both deps healthy, each dep failing independently, and per-component timeout enforcement.	2026-04-28 23:39:50 +08:00
win	d6df41feaa	chore(claude): bump CLI fingerprint to 2.1.88 and accept claude-code/ UA - Centralize Claude CLI fingerprint constants (UA, x-stainless-*) in pkg/claude with BuildCLI/CodeUserAgent helpers - Reuse constants in DefaultHeaders, identity_service defaults, and antigravity identity defaults to keep all callers in sync - Extend ClaudeCodeValidator to accept both claude-cli/ and claude-code/ UA prefixes (transport/helper requests use the latter) - Update related tests to cover the new UA prefix and version	2026-04-28 22:35:24 +08:00
win	2a9c5da91a	fix(antigravity): mixed tools (web_search + functions) now use agent route - When tools contain both web_search and function declarations, use requestType=agent instead of web_search (Google web_search route rejects functionDeclarations) - Set toolConfig.mode=AUTO when mixed tools detected (VALIDATED is incompatible with googleSearch + functionDeclarations) - Add hasOnlyWebSearchTools helper - Fix buildParts test calls missing 4th arg (stripSignatures)	2026-04-28 02:05:25 +08:00
win	9da079a5ee	x	2026-04-27 19:01:41 +08:00
win	898a65314c	chore: remove Antigravity customizations and revert to upstream v0.1.118 - Delete custom files: gateway_attribution, gateway_claude_runtime_headers, identity_service_antigravity, language_server_service, lsrpc_handler, antigravity_http handler/routes, and all Antigravity-specific tests - Revert the antigravity pkg/service files to their upstream versions (removing custom logic such as IsEnterprise, claude_code_tool_map, and dynamic fingerprint) - Fix gateway_service.go: remove the NormalizeSystemPromptEnv, generateSessionIDForAccount, and applyClaudeRuntimeOptionalHeaders calls and use upstream's session-id sync logic - Restore the language_server_pb gen files (Windsurf local_ls.go depends on them) - Keep all Windsurf integration code unchanged	2026-04-25 22:35:48 +08:00
win	2064c1a19f	chore: merge upstream Wei-Shaw/sub2api up to v0.1.118 - Keep Windsurf customizations - New from upstream: Affiliate invite-rebate feature, OpenAI compact support, full Claude Code mimicry - Resolve conflicts: handler/wire.go, wire_gen.go, constants.go, gateway_service.go, etc.	2026-04-25 22:08:18 +08:00
win	cbf696bc82	chore(wip): save windsurf changes before upstream v0.1.118 merge	2026-04-25 21:56:42 +08:00
shaw	3af9940b85	style: fix gofmt and ineffassign lint errors - gofmt: realign AffiliateDetail struct tags in affiliate_service.go - ineffassign: remove dead seenCompleted assignment before return in account_test_service.go	2026-04-25 20:37:42 +08:00
Wesley Liddick	22b1277572	Merge pull request #1948 from hungryboy1025/fix/openai-account-test-responses-stream fix(openai): tighten responses stream account tests	2026-04-25 20:31:07 +08:00
Wesley Liddick	aff98d5ae1	Merge pull request #1960 from gaoren002/fix/openai-stream-keepalive-downstream-idle fix(openai): keep responses stream alive during pre-output failover	2026-04-25 20:24:25 +08:00
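shaw	4e1bb2b445	feat(affiliate): add feature toggle and per-user custom invite settings - Add a master switch for invite rebates under "Feature Toggles" in system settings, off by default; when off, the menu is hidden, registration ignores aff, and new top-ups earn no rebate, but existing quota can still be transferred to balance - Allow admins to set a custom invite code for a specific user (overrides the random code; globally unique) - Allow admins to set a custom rebate rate for a specific user (overrides the global rate; adjustable individually or in bulk) - Embed a custom-user management table (search/edit/bulk/delete) in the invite rebate card in system settings; deletion uses the project-wide ConfirmDialog, clears the custom rate, and resets the invite code to a system random code - Add a "My rebate rate" card and dynamic usage instructions to the /affiliate user page so users can see how much they earn from sharing (computed by the same resolveRebateRatePercent, consistent with actual top-ups) - Add database migration 132 adding the aff_rebate_rate_percent and aff_code_custom columns - Add admin route group /api/v1/admin/affiliates/users/* with 5 endpoints - AffiliateService now depends only on *SettingService, dropping the redundant SettingRepository - Relax invite code format validation to [A-Z0-9_-]{4,32}, compatible with legacy 12-character system codes and new custom codes - Add unit and integration tests covering the new methods, conflict paths, and boundary values	2026-04-25 20:22:07 +08:00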
gaoren002	dac6e52091	fix(openai): keep responses stream alive during pre-output failover	2026-04-25 12:11:27 +00:00
hungryboy1025	8987e0ba67	fix(openai): tighten responses stream account tests	2026-04-25 16:56:50 +08:00
github-actions[bot]	9d1751ec57	chore: sync VERSION to 0.1.118 [skip ci]	2026-04-25 08:06:21 +00:00
AyeSt0	5b63a9b02d	fix(openai): fail over before responses stream output	2026-04-25 15:09:40 +08:00
Wesley Liddick	641e61073f	Merge pull request #1940 from 4fuu/fix/bump-codex-cli-version-to-0.125.0 fix(openai): bump codex CLI version from 0.104.0 to 0.125.0	2026-04-25 14:57:51 +08:00
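shaw	095f457c57	feat(openai): port /responses/compact account support flow (PR #1555) Manually port the OpenAI compact capability modeling from vansour/sub2api#1555 to the current main: account-level compact status with auto/force_on/force_off modes, compact-only model mapping, scheduler tier ordering (supported > unknown > known unsupported), active compact probing in the admin console, and the corresponding i18n/status badges. Regular /responses traffic behavior is unchanged; no database migration.	2026-04-25 14:52:58 +08:00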
4fuu	1e57e88e43	fix(openai): bump codex CLI version from 0.104.0 to 0.125.0 The hardcoded codex CLI version (0.104.0) causes upstream rejection when using gpt-5.5 with compact, as the server treats the request as an outdated client and returns 400/502. Update codexCLIVersion, codexCLIUserAgent, and openAICodexProbeVersion to 0.125.0 to match the current Codex CLI release. Fixes #1933, #1887, #1865 Related: #1609, #1298, #849	2026-04-25 05:26:33 +00:00
Wesley Liddick	b95ffce244	Merge pull request #1772 from KnowSky404/fix/openai-test-state-reconciliation [codex] reconcile OpenAI admin test rate-limit state	2026-04-25 10:02:21 +08:00
Wesley Liddick	1afd81b019	Merge pull request #1920 from Wuxie233/fix/responses-web-search-tool-types fix(apicompat): recognize web_search_20250305 / google_search in Responses→Anthropic tool conversion	2026-04-25 09:00:37 +08:00
shaw	732d6495ea	chore(gateway): fix lint issues from cc-mimicry-parity merge - staticcheck QF1001: apply De Morgan's law to the OAuth-mimic header passthrough guard (`!(a && b)` → `a != ... || !b`). - unused: drop `isClaudeCodeRequest`, which became dead after PR #1914 switched both `/v1/messages` and `/count_tokens` paths to unconditional `account.IsOAuth()` mimicry. The lowercase helper `isClaudeCodeClient` is kept (still referenced by `TestIsClaudeCodeClient`).	2026-04-25 08:58:57 +08:00
Wesley Liddick	6d20ab8082	Merge pull request #1914 from keh4l/feat/cc-mimicry-parity fix(claude): align Claude Code OAuth mimicry with real CLI traffic	2026-04-25 08:54:04 +08:00
shaw	aa8ee33b0a	refactor(affiliate): tighten DI and harden inviter code validation - Drop SetAffiliateService setters and ProvideAuthService / ProvidePaymentService / ProvideUserHandler wrappers in favor of direct Wire constructor injection. AffiliateService has no back-edge to Auth/Payment/User, so the indirection was never required. - Change RegisterWithVerification's variadic affiliateCode to a fixed parameter; adjust all call sites. - Validate aff_code length and charset in BindInviterByCode before any DB lookup, eliminating timing-side-channel and useless DB roundtrips on malformed input. - Make affiliate cache invalidation synchronous; surface Redis errors via the project logger instead of swallowing them in a detached goroutine. - Add an integration test guarding cross-layer tx propagation in AccrueQuota and a unit test pinning the aff_code format rules.	2026-04-25 08:44:18 +08:00
Wuxie233	5f630fbb19	fix(apicompat): recognize web_search_20250305 / google_search in Responses to Anthropic tool conversion	2026-04-25 01:09:51 +08:00
keh4l	bdbd2916f5	fix(gateway): skip client header passthrough on OAuth mimicry path Root cause of persistent third-party detection: sub2api's buildUpstreamRequest transparently forwards client headers via allowedHeaders whitelist (addHeaderRaw) before applying mimicry overrides. When third-party clients (opencode, etc.) send their own anthropic-beta / user-agent / x-stainless-* / x-claude-code-session-id values, these get appended to the request alongside our injected headers, creating an inconsistent header set that Anthropic detects. Parrot's build_upstream_headers constructs exactly 9 headers from scratch and never forwards anything from the client. This is why 'same opencode version, some users work some don't' — different opencode configs/versions send different header combinations. Fix: when tokenType=oauth and mimicClaudeCode=true, skip the client header passthrough loop entirely. The subsequent applyClaudeCodeMimicHeaders + ApplyFingerprint + beta merge pipeline constructs all necessary headers from our controlled values. Also: remove systemIncludesClaudeCodePrompt gate — OAuth accounts now unconditionally rewrite system (even if client already sent a Claude Code-style prompt), ensuring billing attribution block is always present.	2026-04-25 00:43:38 +08:00
keh4l	6dc89765fd	fix(gateway): always apply full mimicry for OAuth accounts regardless of client identity Before: isClaudeCodeRequest() checked whether the client looks like a real Claude Code CLI (UA, system prompt, X-App header, metadata format). If it looked like Claude Code, all mimicry was skipped — the assumption being that a real CLI needs no help. Problem: third-party tools like opencode partially impersonate Claude Code (sending claude-cli UA + claude-code beta + CC system prompt) but miss critical details (billing attribution block, tool-name obfuscation, cache breakpoints, full beta set). Some users' opencode instances pass the isClaudeCodeRequest check, causing sub2api to skip mimicry entirely, while Anthropic still detects the request as third-party. This explains why 'same opencode version, some users work, some don't' — it depends on which opencode features/config trigger the validator. Fix: OAuth accounts now unconditionally run the full mimicry pipeline, matching Parrot's behavior (Parrot never checks client identity). This is safe because our mimicry is strictly more complete than any third-party client's partial impersonation. Changed: - /v1/messages path: remove isClaudeCode gate - /v1/messages/count_tokens path: same	2026-04-25 00:26:37 +08:00
keh4l	f3233db01f	fix(gateway): apply D/E/F mimicry to native /v1/messages and count_tokens paths The previous commit only wired stripMessageCacheControl, addMessageCacheBreakpoints, and tool-name obfuscation into applyClaudeCodeOAuthMimicryToBody (used by /chat/completions and /responses). The native /v1/messages path and count_tokens path have their own independent mimicry code blocks and were missed. Now all three entry points share the same D/E/F pipeline: - /v1/messages (gateway_service.go forwardAnthropic) - /v1/messages/count_tokens (gateway_service.go countTokens) - OpenAI compat (applyClaudeCodeOAuthMimicryToBody)	2026-04-24 23:16:32 +08:00
keh4l	6e12578bc5	feat(gateway): port Parrot tool-name obfuscation + message cache breakpoints Implements the remaining three parity items with Parrot cc_mimicry: D) Tool-name obfuscation - Dynamic mapping when tools.length > 5 (matches Parrot threshold). Fake names follow {prefix}{name[:3]}{i:02d} (e.g. 'manage_bas00'). Go port of random.Random(hash(tuple(names))) uses fnv64a seed + math/rand; byte-exact reproduction is impossible (Python hash vs Go hash), but the two invariants that matter are preserved: * same input tool_names yield identical mapping (cache hit) * prefix pool is shuffled (names look distributed) - Static prefix map (sessions_ -> cc_sess_, session_ -> cc_ses_) applied as fallback, matching Parrot TOOL_NAME_REWRITES verbatim. - Server tools (web_search_20250305, computer_, etc.) are NOT renamed; only type=='function' and type=='custom' tools are. - tool_choice.name is rewritten in sync (only when type=='tool'). - Response side: bytes-level replace on every SSE chunk / JSON body at 6 injection points (standard stream/non-stream, passthrough stream/non-stream, chat_completions stream + non-stream, responses stream + non-stream). Reverse mapping applied longest-fake-name-first to prevent substring conflicts (parity with Parrot _restore_tool_names_in_chunk). - tool_choice is no longer unconditionally deleted in normalizeClaudeOAuthRequestBody — Parrot passes it through. E) tools[-1] cache_control breakpoint - Injected as {type:ephemeral, ttl:<DefaultCacheControlTTL>} when the last tool has no cache_control. Client-provided ttl is passed through unchanged (repo-wide policy). F) messages cache_control strategy - stripMessageCacheControl removes every client-provided messages[].content[].cache_control (multi-turn stability). - addMessageCacheBreakpoints then injects two stable breakpoints: (1) last message, and (2) second-to-last user turn when messages.length >= 4. 
- Combined with the system block breakpoint and tools[-1] breakpoint, this gives exactly the 4 breakpoints Anthropic allows per request. Non-trivial implementation details to be aware of when rebasing: Two new files, no upstream collision: gateway_tool_rewrite.go (D + E algorithms) gateway_messages_cache.go (F strip + breakpoints) * Two new feature calls bolted onto the tail of applyClaudeCodeOAuthMimicryToBody in gateway_service.go — rebase conflicts will be ~10 lines maximum. * Response-side injection points all wrap their existing write with reverseToolNamesIfPresent(c, ...), preserving original behavior when no mapping is stored (static prefix rollback still runs). * Non-stream chat/responses switched from c.JSON to json.Marshal + c.Data so bytes-level replace is possible. * Retry bodies (FilterThinkingBlocksForRetry, FilterSignatureSensitiveBlocksForRetry, RectifyThinkingBudget) only prune blocks — they preserve the already-obfuscated tool names, so no extra mapping re-application is needed. Manual QA: end-to-end scenario verified with 6 tools (above threshold) and tool_choice.type=='tool'. Obfuscation + restore roundtrip shown in test logs; then removed the temp test file. Tests (16 new): - buildDynamicToolMap stability + below-threshold guard - sanitizeToolName precedence (dynamic > static) - restoreToolNamesInBytes longest-first + static rollback - applyToolNameRewriteToBody skips server tools + syncs tool_choice - applyToolsLastCacheBreakpoint defaults to 5m + passes client ttl - stripMessageCacheControl + addMessageCacheBreakpoints in the 1/4/string-content cases + second-to-last user turn selection - buildToolNameRewriteFromBody ReverseOrdered is desc-by-fake-length - fake name shape follows Parrot {prefix}{head3}{i:02d}	2026-04-24 23:16:32 +08:00
keh4l	a25faecadd	feat(gateway): align body shape with real Claude Code CLI defaults Three field-level alignments in normalizeClaudeOAuthRequestBody to match real Claude Code CLI traffic byte-for-byte: 1. temperature: previously deleted unconditionally; now passes through client value, defaults to 1 when absent (real CLI always sends temperature, default 1). 2. max_tokens: defaults to 128000 when absent (real CLI default). 3. context_management: when thinking.type is enabled/adaptive and the client did not provide context_management, inject {"edits":[{"type":"clear_thinking_20251015","keep":"all"}]} to mirror real CLI behavior. tool_choice removal is unchanged (Claude Code OAuth credentials do not allow client-supplied tool_choice). Tests updated: - gateway_body_order_test.go: temperature/max_tokens are now expected in output; tool_choice still removed. - gateway_prompt_test.go: system array is now 2 blocks (billing + cc prompt), assertions adjusted. - gateway_anthropic_apikey_passthrough_test.go: same 2-block assertion.	2026-04-24 23:16:32 +08:00
keh4l	5862e2d8d9	feat(gateway): add billing attribution block with cc_version fingerprint Real Claude Code CLI always sends a 2-block system array: [0] {"type":"text", "text":"x-anthropic-billing-header: cc_version=X.Y.Z.{fp}; cc_entrypoint=cli; cch=00000;"} [1] {"type":"text", "text":"You are Claude Code...", "cache_control":{...}} Before this commit, sub2api's mimicry path only produced block [1]. The missing billing block is one of the primary third-party detection signals Anthropic uses for Claude-Code-scoped OAuth tokens. New file gateway_billing_block.go ports the fingerprint algorithm (byte-for-byte from Parrot cc_mimicry.py:compute_fingerprint): pick chars at positions [4,7,20] of the first user text, then `sha256(SALT + chars + cc_version)[:3]`. - claude/constants.go: CLICurrentVersion = "2.1.92" (must match UA) - gateway_billing_block.go: computeClaudeCodeFingerprint + buildBillingAttributionBlockJSON + extractFirstUserText - gateway_service.go: rewriteSystemForNonClaudeCode now emits both blocks in order; cch=00000 is filled in later by signBillingHeaderCCH in buildUpstreamRequest. Downstream compat note: syncBillingHeaderVersion's regex `cc_version=\d+\.\d+\.\d+` only matches the semver triple, leaving the `.{fp}` suffix intact when rewriting in buildUpstreamRequest.	2026-04-24 23:16:32 +08:00
keh4l	66d6454535	feat(claude): add ttl to cache_control with default 5m Real Claude CLI traffic sends cache_control as `{"type":"ephemeral","ttl":"1h"}`. Our previous payload only sent `{"type":"ephemeral"}`, which is a bytewise mismatch with the official CLI and one more third-party detection signal. Policy: client-provided ttl is always passed through unchanged. Proxy-generated cache_control blocks default to 5m (vs Parrot's 1h) to avoid burning the 1h cache budget on automatic breakpoints while still aligning with the `ttl` field being present. - claude/constants.go: DefaultCacheControlTTL = "5m" - apicompat/types.go: new AnthropicCacheControl type with TTL field; AnthropicTool gains optional CacheControl pointer so the mimicry path can attach a cache breakpoint to tools[-1] later. - service/gateway_service.go: anthropicCacheControlPayload gains TTL; marshalAnthropicSystemTextBlock and rewriteSystemForNonClaudeCode emit ttl=5m by default.	2026-04-24 23:16:32 +08:00
keh4l	165553cfb0	fix(gateway): use full beta list in buildUpstreamRequest mimicry path The previous commit added FullClaudeCodeMimicryBetas() but the two call sites in buildUpstreamRequest still hardcoded the old 3-token subset. Anthropic now checks the complete set of beta tokens to decide if a request qualifies as Claude Code. Wire them up: - /v1/messages mimic path: requiredBetas = FullClaudeCodeMimicryBetas() - /v1/messages/count_tokens mimic path: same + BetaTokenCounting Haiku models keep the 2-token exemption (BetaOAuth + InterleaveThinking).	2026-04-24 23:16:32 +08:00
keh4l	b5467d610a	fix(gateway): apply full Claude Code mimicry on /chat/completions and /responses Before: the OpenAI-compat forwarders only called injectClaudeCodePrompt, which prepends the Claude Code banner but leaves the rest of the body in its original non-Claude-Code shape. The codebase already admits this is insufficient (see the comment on rewriteSystemForNonClaudeCode in gateway_service.go: "仅前置追加 Claude Code 提示词无法通过检测", i.e. "prepending the Claude Code prompt alone does not pass detection"). Effect: OAuth accounts served through /v1/chat/completions or /v1/responses were detected as third-party apps and bled plan quota with: Third-party apps now draw from your extra usage, not your plan limits. Fix: - apicompat.AnthropicRequest: add Metadata json.RawMessage so metadata survives the OpenAI->Anthropic->Marshal round trip; without it the downstream rewrite has no user_id to work with. - service: extract applyClaudeCodeOAuthMimicryToBody, a ParsedRequest-free variant of the /v1/messages mimicry pipeline (rewriteSystemForNonClaudeCode + normalizeClaudeOAuthRequestBody + metadata.user_id injection) so the OpenAI-compat forwarders can reuse it. - service: add buildOAuthMetadataUserIDFromBody + hashBodyForSessionSeed for the same reason (no ParsedRequest at the call site). - ForwardAsChatCompletions / ForwardAsResponses: replace the 3-line prompt-prepend with the full mimicry pipeline. - applyClaudeCodeMimicHeaders: set x-client-request-id per-request (real Claude CLI always does); missing/duplicated values are one more third-party fingerprint signal. No change to the native /v1/messages path: it already called the full pipeline, we only lift those helpers into a reusable function. Tests: - go build ./... passes - go test ./internal/service/... ./internal/pkg/apicompat/... passes - lsp_diagnostics clean on all touched files - pre-existing failures in internal/config are unrelated (env-sensitive tests that also fail on upstream main)	2026-04-24 23:16:32 +08:00
keh4l	57ff97960d	chore(claude): bump mimicked CLI to 2.1.92 and extend anthropic-beta list Align Claude Code mimicry constants with the latest real CLI traffic (see Parrot's src/transform/cc_mimicry.py). Anthropic now uses the full set of anthropic-beta tokens to decide whether a request counts as "official Claude Code"; requests missing tokens that real CLI ships today are demoted to third-party usage: Third-party apps now draw from your extra usage, not your plan limits. Changes: - claude/constants.go: add new beta tokens (prompt-caching-scope, effort, redact-thinking, context-management, extended-cache-ttl) and expose FullClaudeCodeMimicryBetas() for the OAuth mimicry path. - claude/constants.go: bump default User-Agent to claude-cli/2.1.92. - identity_service.go: bump defaultFingerprint User-Agent accordingly. No behavioral change for clients that already send a newer UA (fingerprint merge still prefers the incoming value).	2026-04-24 23:16:32 +08:00
Wesley Liddick	5b5db88550	Merge pull request #1897 from VpSanta33/codex/invite-affiliate-rebate feat: add invite rebate feature with admin-configurable rebate rate	2026-04-24 22:36:53 +08:00
VpSanta33	f03de00cb9	feat: add affiliate invite rebate flow and admin rebate-rate setting	2026-04-24 22:22:26 +08:00
gaoren002	27ee141c1e	fix(openai): preserve mcp tool call ids	2026-04-24 13:24:21 +00:00
gaoren002	e65574dea9	fix(openai): normalize codex responses payloads	2026-04-24 12:03:19 +00:00
Wesley Liddick	1ce9dc03f9	Merge pull request #1895 from gaoren002/fix/codex-spark-limitations fix(openai): handle codex spark model limitations	2026-04-24 19:57:42 +08:00
song	959af1c8f6	fix(openai): preserve codex tool call ids	2026-04-24 19:31:49 +08:00
gaoren002	c4d496da18	fix(openai): handle codex spark model limitations	2026-04-24 07:42:31 +00:00
win	9156585a23	chore: gofmt/goimports post-processing. Run gofmt/goimports uniformly after merging upstream to eliminate import-ordering differences and inconsistent blank lines.	2026-04-24 11:52:53 +08:00
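
Note on 95814974de (RPM token bucket): the message describes per-account continuous-refill buckets with rate = rpm/60 tokens/sec and capacity = rpm. The following is a minimal sketch of that arithmetic only, not the code of RPMTokenBucketService. The type tokenBucket, its fields, and the functions newTokenBucket and tryAcquire are hypothetical names chosen for illustration, and the caller passes the clock in explicitly so the behavior is deterministic.

```go
package main

import (
	"math"
	"time"
)

// tokenBucket is a hypothetical continuous-refill bucket. The name and fields
// are illustrative and are not taken from RPMTokenBucketService.
type tokenBucket struct {
	capacity float64   // maximum tokens, equal to rpm
	rate     float64   // refill rate in tokens per second, rpm/60
	tokens   float64   // tokens currently available
	last     time.Time // time of the last refill (zero until first use)
}

// newTokenBucket starts full, so a burst of up to rpm requests passes at once.
func newTokenBucket(rpm int) *tokenBucket {
	c := float64(rpm)
	return &tokenBucket{capacity: c, rate: c / 60.0, tokens: c}
}

// refill adds the tokens earned since the last call, capped at capacity.
func (b *tokenBucket) refill(now time.Time) {
	elapsed := now.Sub(b.last).Seconds()
	if elapsed > 0 {
		b.tokens = math.Min(b.capacity, b.tokens+elapsed*b.rate)
		b.last = now
	}
}

// tryAcquire takes one token if one is available. Otherwise it reports how
// long the caller would have to wait until the next token has refilled.
func (b *tokenBucket) tryAcquire(now time.Time) (bool, time.Duration) {
	b.refill(now)
	if b.tokens >= 1 {
		b.tokens--
		return true, 0
	}
	wait := time.Duration((1 - b.tokens) / b.rate * float64(time.Second))
	return false, wait
}
```

A caller would compare the returned wait against a limit such as the max_wait_ms setting named in the commit and reject the request when the wait is too long. The sketch does not handle rpm = 0, which the commit lists as a separate test case.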

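Note on a2ab67f8c7 (P2C scheduling): the message says selectByPowerOfTwo samples 2 random candidates and tries the better-scored one first, then the other. A minimal sketch of that ordering step follows, assuming a higher score is better. The candidate type and the powerOfTwoOrder function are hypothetical, and the random source is injected as a function so the sketch needs no imports; in practice something like rand.Intn from math/rand could be passed in. The wait-plan fallback from the commit is not modeled.

```go
package main

// candidate is a hypothetical scored account; higher Score is assumed better.
type candidate struct {
	ID    int
	Score float64
}

// powerOfTwoOrder samples two distinct candidates with intn, which must
// return a value in [0, n), and returns them with the better score first.
// It returns nil for an empty pool and the sole element for a pool of one.
func powerOfTwoOrder(cands []candidate, intn func(n int) int) []candidate {
	switch len(cands) {
	case 0:
		return nil
	case 1:
		return []candidate{cands[0]}
	}
	i := intn(len(cands))
	j := intn(len(cands) - 1)
	if j >= i {
		j++ // skip over i so the two samples are always distinct
	}
	a, b := cands[i], cands[j]
	if b.Score > a.Score {
		a, b = b, a
	}
	return []candidate{a, b}
}
```

The selection cost does not grow with the pool size because only two scores are compared, which is the O(1) property the commit message refers to.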

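Note on 6e12578bc5 (tool-name obfuscation): the message gives the fake name shape {prefix}{name[:3]}{i:02d}, with 'manage_bas00' as an example, and says the reverse mapping is applied longest-fake-name-first to prevent substring conflicts. The sketch below illustrates both points under stated assumptions: fakeToolName and restoreToolNames are hypothetical names, the prefix "manage_" and tool name "bash" are guessed inputs that reproduce the quoted example, names shorter than 3 bytes are used whole, and plain string replacement stands in for the bytes-level replace described in the commit.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// fakeToolName builds an obfuscated name shaped like {prefix}{name[:3]}{i:02d}.
// Using the whole name when it is shorter than 3 bytes is an assumption.
func fakeToolName(prefix, name string, i int) string {
	head := name
	if len(head) > 3 {
		head = head[:3]
	}
	return fmt.Sprintf("%s%s%02d", prefix, head, i)
}

// restoreToolNames replaces fake names with the originals, longest fake name
// first, so a fake name that is a prefix of another cannot corrupt it.
func restoreToolNames(s string, fakeToReal map[string]string) string {
	fakes := make([]string, 0, len(fakeToReal))
	for f := range fakeToReal {
		fakes = append(fakes, f)
	}
	sort.Slice(fakes, func(a, b int) bool {
		if len(fakes[a]) != len(fakes[b]) {
			return len(fakes[a]) > len(fakes[b])
		}
		return fakes[a] < fakes[b] // stable order among equal lengths
	})
	for _, f := range fakes {
		s = strings.ReplaceAll(s, f, fakeToReal[f])
	}
	return s
}
```

The ordering matters once an index reaches three digits: "manage_bas10" is a prefix of "manage_bas100", so replacing the shorter name first would rewrite part of the longer one.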
2347 Commits
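
Note on d535688bfd (context compression): the message describes a chars/4 token heuristic and trimming of the oldest messages, with a consecutive assistant(tool_use) + user(tool_result) pair removed as one unit so that no tool_result is orphaned. The sketch below reuses the function names from the commit message but is written from scratch on a simplified, hypothetical message struct rather than the raw JSON body the commit operates on. Counting bytes with integer division, and allowing the result to become empty, are assumptions.

```go
package main

// message is a simplified, hypothetical stand-in for one conversation turn.
type message struct {
	Role       string
	Text       string
	ToolUse    bool // assistant turn that contains a tool_use block
	ToolResult bool // user turn that contains a tool_result block
}

// approxTokens estimates tokens as byte length divided by 4 (assumption:
// bytes rather than runes, rounded down).
func approxTokens(s string) int {
	return len(s) / 4
}

// compressMessages drops the oldest messages until the estimated total fits
// within maxTokens. A tool_use turn immediately followed by a tool_result
// turn is dropped as a pair.
func compressMessages(msgs []message, maxTokens int) []message {
	total := 0
	for _, m := range msgs {
		total += approxTokens(m.Text)
	}
	i := 0
	for total > maxTokens && i < len(msgs) {
		n := 1
		if msgs[i].ToolUse && i+1 < len(msgs) && msgs[i+1].ToolResult {
			n = 2 // remove the pair atomically
		}
		for k := 0; k < n; k++ {
			total -= approxTokens(msgs[i+k].Text)
		}
		i += n
	}
	return msgs[i:]
}
```

With four 40-byte messages and a budget of 25, trimming message by message would stop after the tool_use turn and leave its tool_result behind; treating the pair as one unit removes both and keeps only the final assistant turn.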