sub2api

Author	SHA1	Message	Date
win	3cffaa1e8e	x	2026-05-30 16:30:59 +08:00
Wesley Liddick	52292741cb	Merge pull request #2849 from Pluviobyte/fix/count-tokens-payload-filter fix(gateway): filter count_tokens generation fields	2026-05-29 16:11:01 +08:00
Pluviobyte	27600b1d2c	fix(gateway): filter count_tokens generation fields Anthropic count_tokens rejects generation-only fields such as temperature, top_p, top_k, stream, and stop sequences. Passing the original messages payload through unchanged can turn otherwise valid requests into upstream 400 errors. Sanitize only the count_tokens upstream payload after the gateway's existing request normalization, preserving fields that existing compatibility paths rely on while removing parameters the count_tokens endpoint does not accept. Fixes #2764 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-28 05:40:50 +00:00
alfadb	ddf91e9a7f	fix(gateway): sanitize body.context_management by capability, based on the final anthropic-beta header Upstream Anthropic rejects a request whose body contains `context_management` when the `anthropic-beta` header that is finally sent does not include `context-management-2025-06-27`: { "type": "invalid_request_error", "message": "context_management: Extra inputs are not permitted" } (HTTP 400, request_id of the form req_011C...) This 400 is triggered on the haiku path because three beta header builders intentionally exclude the context-management beta: - HaikuBetaHeader (messages, OAuth / mimic CC) - APIKeyHaikuBetaHeader (messages, API-key) - CountTokensBetaHeader (count_tokens, all auth types) The body nevertheless still carries the `context_management` field, for two reasons: 1. normalizeClaudeOAuthRequestBody proactively injects it for `clear_thinking_20251015` when thinking_enabled / thinking_adaptive is on; 2. the client (Claude Code CLI >= 2.1.87) sends it as-is, and the gateway forwards it along during passthrough. Fix: capability-based symmetric constraint. Aligned with the existing Bedrock pattern (`sanitizeBedrockFieldsForBetaTokens` in `backend/internal/service/bedrock_request.go`): whether to keep `body.context_management` is decided by the `anthropic-beta` header that is finally sent, not by model name or route classification. New pure function: sanitizeAnthropicBodyForBetaTokens(body, betaHeader) (body, changed). If `betaHeader` does not contain `context-management-2025-06-27`, the body field is stripped with sjson; otherwise the body is returned unchanged. It is wired into every Anthropic / Anthropic-compatible upstream exit (path -> sanitize hook): /v1/messages OAuth mimic CC -> buildUpstreamRequest; /v1/messages OAuth real CC passthrough -> buildUpstreamRequest; /v1/messages API-key -> buildUpstreamRequest; /v1/messages API-key passthrough -> buildUpstreamRequestAnthropicAPIKeyPassthrough; /v1/messages Vertex / service-account -> buildUpstreamRequestAnthropicVertex; /v1/messages/count_tokens (all 4 paths) -> buildCountTokensRequest, buildCountTokensRequestAnthropicAPIKeyPassthrough; Antigravity Anthropic-compatible upstream -> AntigravityGatewayService.ForwardUpstream; Bedrock -> (already handled by sanitizeBedrockFieldsForBetaTokens). Why reorder (rather than add a one-line call): sanitize must run before `signBillingHeaderCCH`. CCH takes an xxHash64 digest of the entire body and writes it into the 5-hex-digit `cch` field of the billing header; if signing happens first and stripping second, the hash that upstream recomputes over the body actually sent no longer matches `cch`, and the request is classified as third-party. This requires the final `anthropic-beta` header to be computed before `http.NewRequest`, so the beta computation previously inlined in the builders was extracted into two pure functions: - computeFinalAnthropicBeta (messages path: mimic does not pass through client betas) - computeFinalCountTokensAnthropicBeta (count_tokens path: mimic does not skip whitelist passthrough) Both preserve the original behavior bit for bit: - the mimic path skips client betas on messages and merges them on count_tokens - the API-key path respects the `InjectBetaForAPIKey` switch - dropSet (`defaultDroppedBetasSet` + BetaPolicy filter) is applied on the main path and intentionally not on the passthrough / Vertex paths; this pre-existing asymmetry is left untouched by this PR. A semantic test (`TestSanitizeMustBeBeforeCCHSigning_HashConsistency`) documents and enforces the ordering constraint: it shows that `sanitize -> signBillingHeaderCCH` produces a `cch` consistent with the final body, whereas `signBillingHeaderCCH -> sanitize` produces a `cch` that fails the upstream hash recomputation. Why capability-based (rather than matching the haiku model name): the naive "strip by model name" approach (`strings.Contains(modelID, "haiku") -> DeleteBytes "context_management"`) has four real failure modes: 1. Over-deletion. Real Claude Code clients with CLI >= 2.1.87 send both the body field and `anthropic-beta: context-management-2025-06-27` on haiku. Stripping unconditionally would silently disable `clear_thinking_20251015` for those users. 2. Alias drift. Future haiku aliases (`claude-3-haiku-...`, `claude-haiku-...`, etc.) change the match surface; any new alias would quietly bypass the strip. 3. Missed count_tokens coverage. count_tokens has its own builder and a different set of beta headers; a model-name check in one place would miss this path. 4. API-key passthrough early return. The passthrough builder returns before the model-name strip, so the strip never runs. The capability-based approach follows the header end to end, so all four cases are correct by construction and do not depend on any modelID matching. Defensive measures: - When `sjson.DeleteBytes` fails on a body in which `gjson` has just verified that the field exists, `sanitizeAnthropicBodyForBetaTokens` logs a warning; in practice this happens only when the request is corrupted mid-flight, and the log exposes a body / header mismatch that would previously have occurred silently. - `header_util.go` gains `deleteHeaderAllForms`: it is applied before overwriting, after whitelist passthrough has already written the canonical-case `Anthropic-Beta`, because otherwise two header entries would remain. Tests: 44 new tests under `backend/internal/service`: - pure functions: anthropicBetaTokensContains x 5, sanitize keep/strip x 6, computeFinal{Anthropic,CountTokens}AnthropicBeta x 12 - normalize regression x 5 - buildUpstreamRequest end-to-end x 4 (OAuth mimic haiku strip / mimic sonnet preserve / real CC haiku with client beta preserve / API-key haiku strip) - buildCountTokensRequest end-to-end x 2 - buildUpstreamRequestAnthropicAPIKeyPassthrough x 2 (strip / preserve) - buildCountTokensRequestAnthropicAPIKeyPassthrough x 2 (strip / preserve) - buildUpstreamRequestAnthropicVertex x 2 (strip / preserve, including a symmetry assertion on the outgoing `anthropic-beta` header) - CCH ordering semantic test x 1 The unit suite passes in full (88s locally), `golangci-lint` reports 0 issues. Known limitations (out of scope for this PR): - The Vertex path uses the passed-through client `anthropic-beta` header as the basis for sanitize, not a Vertex-side capability matrix. The worst case is over-stripping (= the behavior of current main, where the main path strips nothing to begin with); not a regression. A complete Vertex capability model belongs in a separate PR. - The Vertex builder still does not apply the BetaPolicy filter / dropSet. This is a pre-existing architectural decision tied to that builder's early return and is left untouched by this PR. - count_tokens mimic still injects `context-management-2025-06-27` on haiku (because the original count_tokens mimic logic does not exclude it the way messages mimic does). This PR preserves main's behavior bit for bit; whether to unify it with the messages mimic exclusion policy is a separate question. - `sanitizeAnthropicBodyForBetaTokens` currently handles only the `context_management <-> context-management-2025-06-27` pair. If Anthropic later introduces more beta-gated body fields, a follow-up PR can refactor this into a `{body path -> required beta token}` registry.	2026-05-28 00:02:50 +08:00
Wesley Liddick	bbe847ed3e	Merge pull request #2805 from StarryKira/feat/configurable-pool-retry-status-codes feat(account): configurable pool-mode same-account retry status codes	2026-05-27 22:09:55 +08:00
StarryKira	21033dceb9	feat(account): configurable pool-mode same-account retry status codes Pool mode currently retries the same account for a fixed set of upstream HTTP statuses: 401, 403, 429. Some upstream pool deployments also need same-account retry for transient provider/proxy statuses such as 502, 503, 520, 529, but hard-coding more statuses changes behavior for everyone. Add a per-account credentials option `pool_mode_retry_status_codes` that lets admins choose which upstream HTTP status codes trigger same-account retry in pool mode: - Unset (default): preserve the current 401/403/429 default - Explicit list: override the defaults with the configured codes - Codes normalized to the 100-599 range, deduplicated, sorted The standalone `isPoolModeRetryableStatus` helper is kept as the default-only fallback. All 15 gateway call sites switch to the new `Account.IsPoolModeRetryableStatus` method so behavior is preserved for accounts that do not configure the new field. Frontend admin UI gains a "Retry Status Codes" comma-separated input under the pool-mode section in both Create/Edit account modals (en + zh i18n). Fixes #2731 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 11:24:25 -07:00
wucm667	a31b507484	fix(scheduler): on model 404, cool down only the account-model combination	2026-05-26 20:29:48 +08:00
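DaydreamCoding	6b39b344d8	feat(quota): per-user × per-platform USD quotas Provides USD quota control over three windows (daily/weekly/monthly) for users on the four platforms anthropic/openai/gemini/antigravity. Quota semantics: unset = unlimited, 0 = disabled, >0 = USD cap. Two-layer model: - Configuration layer: the system default quota, plus default quotas for the seven auth sources email/linuxdo/oidc/wechat/github/google/dingtalk, stored in settings and read and written as whole nested JSON values (1 key for the system + 1 key per source), with whole-replacement semantics. - Runtime layer: the user_platform_quota table records each user's actual quota, decoupled from the configuration layer. Backend: new ent schema and 140_user_platform_quotas.sql migration, repository and service ports, billing pipeline integration, admin-side and user-side read/write endpoints. Frontend: quota editing on the admin settings page, user quota management modal, user dashboard display, Chinese and English copy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 10:49:20 +08:00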
erio	fe1c6c958b	feat(bedrock): add Claude Code compatibility for AWS Bedrock - Export ApplyBedrockCCCompat() in GatewayService, called after channel model mapping to ensure mapped model ID is used for Opus 4.7+ detection - Add sanitizeBedrockCCFields(): remove service_tier/interface_geo/context_management, inject max_tokens/anthropic_version defaults - Add sanitizeBedrockCCBetaTokens(): filter anthropic_beta to keep only Bedrock-supported tokens, reusing autoInjectBedrockBetaTokens and filterBedrockBetaTokens for consistent rules - Remove unsupported beta tokens (interleaved-thinking, context-management) from whitelist based on AWS official docs - Simplify IsBedrockCCCompatEnabled() to check boolean toggle directly, applying CC compat to all accounts regardless of platform - Add unit tests for IsBedrockCCCompatEnabled (8 cases), sanitizeBedrockCCFields (8 cases), sanitizeBedrockCCBetaTokens (7 cases) - Update bedrock beta policy tests for removed auto-injection	2026-05-21 11:46:24 +08:00
erio	4fd21994c5	feat(bedrock): add Claude Code compatibility transformations for Bedrock accounts Add channel-level Bedrock CC compatibility toggle (similar to web_search_emulation) that fixes 4 types of Bedrock 400 errors seen with Claude Code: 1. thinking.type "enabled" → "adaptive" for Opus 4.7+ (only supports adaptive) 2. Add default budget_tokens when missing for older models 3. Replace illegal characters in tool_use IDs to match Bedrock's ^[a-zA-Z0-9_-]+$ pattern 4. anthropic_version / invalid beta flag (already handled elsewhere) Transformations run in Forward() before any forwarding path, so both native Bedrock accounts and apikey passthrough accounts pointing to Bedrock relays benefit. Includes channel-level toggle UI and unit tests.	2026-05-20 21:47:38 +08:00
name	8211aa7066	fix: retry on "thinking block must contain thinking" upstream error Some clients reuse assistant history from other models when switching to claude with extended thinking enabled. If a prior thinking block lacks the thinking text field, upstream returns: messages.X.content.Y.thinking: each thinking block must contain thinking Add this pattern to isThinkingBlockSignatureError so the existing FilterThinkingBlocksForRetry retry path triggers and rewrites/drops the offending blocks.	2026-05-20 18:46:50 +08:00
name	2eb622f2f6	Remove ops retry replay storage	2026-05-19 19:37:41 +08:00
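DaydreamCoding	b19da9c7fe	feat(dingtalk): DingTalk OAuth login integration and internal_only user attribute sync ⚠️ App type constraint: the current implementation supports only "DingTalk Login - internal enterprise app" (the internal_app type on the DingTalk Open Platform). Third-party personal apps and third-party enterprise apps are not yet supported: the OAuth flow is the same, but corp validation and cross-enterprise behavior differ. The backend fails closed on non-internal_app types via the DingTalkAppKind check (hard constraint). DingTalk OAuth login main chain - 4-step OAuth chain: ExchangeCodeForUserToken / GetUnionIdByUserToken / GetUserIdByUnionId / GetStaffInfoByUserId; app token caching - pending session mechanism persists intermediate OAuth state; cookie-only token persistence - three branches: bind_login_required / email_completion / choose_account_action - corp_restriction_policy supports none + internal_only; a stale "whitelist" value is silently coerced to none, with slog.Warn, at both the load layer and the write layer - bypass_registration switch: internal enterprise mode is exempt from the global REGISTRATION_DISABLED - cross-cutting points such as isReservedEmail / signup_source / canUnbindProvider / OAuth pending flow support the dingtalk provider - migration 136: adds the 'dingtalk' provider value to the CHECK constraints of 4 tables internal_only mode syncs corporate email/name/department to user attributes - three independent switches SyncCorpEmail / SyncDisplayName / SyncDept + the corresponding SyncXxxAttrKey target attribute keys (defaults dingtalk_email / dingtalk_name / dingtalk_department); under a non-internal_only policy they are coerced to false at both the write layer and the load layer, with the admin handler and setting_service as a two-layer safety net - sync semantics: first registration writes users.username (nickname first → corporate name as fallback), then every login refreshes the 3 attributes; empty values are also written so that they overwrite old values - three-level email fallback: org_email > email > extension["企业邮箱"] (the "corporate email" DingTalk custom field, JSON) - the department path is built by recursing upward, skipping dept_id=1, choosing the first real sub-department, and stripping the root organization name - GetUnionIdByUserToken also returns the nick field from OIDC /contact/users/me; new GetDeptInfo calls OAPI /topapi/v2/department/get - AuthHandler is injected with UserAttributeService; the OAuth pending flow dispatches to AfterRegistration (syncUsername=true) in createPendingOAuthAccount and to AfterLogin in bindPendingOAuthLogin - migration 137 seeds the three user attribute definitions dingtalk_email/name/department Incidental fixes (two OAuth registration regressions exposed along the same integration path) - the new-user branch of LoginOrRegisterOAuthWithTokenPair overwrote the signupSource explicitly passed by the caller with inferLegacySignupSource, so grants for the dingtalk/linuxdo/oidc/wechat channels were read as the email channel; it now falls back to email-based inference only when the caller does not pass a value explicitly - mergeProviderDefaultGrantSettings treated the parse-fallback defaults (Concurrency=5 / Balance=0) as a "not configured" sentinel, so when an admin explicitly set 5 it was misjudged and fell back to the global default (repro: global default 1 + channel default concurrency 5 + grant_on_signup → new user actually gets concurrency=1); the sentinel is removed, and any admin value >=0 overrides globalDefaults Frontend - DingTalk Login / Callback / EmailCompletion / ChoiceAccount / Error views; router + auth API client - admin SettingsView: corp policy radio (none / internal_only) + bypass registration switch + i18n; under internal_only it shows the three sync switches + a target attr key dropdown (populated from user attribute definitions), and shows a hint about requesting the fieldEmail / qyapi_get_department_list DingTalk permissions - Profile: S1 proactive bind / S5 unbind DingTalk buttons + synthetic-email self-lockout protection Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 15:27:47 +08:00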
Wesley Liddick	a340002c6d	Merge pull request #2401 from 2ue/fix/normalize-image-billing-size Fix image billing size normalization and usage record display	2026-05-19 14:00:24 +08:00
lyen1688	164e2f610c	fix: add keepalive for Anthropic passthrough streams	2026-05-18 18:41:25 +08:00
2ue	bb4c1abe28	Fix image billing size normalization	2026-05-12 15:21:31 +08:00
shaw	9377c96746	fix: disable message cache_control rewriting by default	2026-05-11 21:26:41 +08:00
shaw	501b7f2772	fix: stabilize anthropic passthrough timeout error	2026-05-07 10:24:29 +08:00
2ue	6faa344916	feat: add OpenAI image generation controls	2026-05-05 03:26:54 +08:00
shaw	72d5ee4cd1	fix: drain OpenAI compat streams for usage	2026-05-03 17:11:27 +08:00
shaw	73b872998e	feat: add Anthropic cache TTL injection toggle	2026-04-30 13:38:22 +08:00
shaw	733627cf9d	fix: improve sticky session scheduling	2026-04-30 11:38:11 +08:00
Wesley Liddick	4d676dddd1	Merge pull request #2066 from alfadb/fix/anthropic-stream-eof-failover fix(gateway): Anthropic streaming EOF failover + SSE error frame standardization	2026-04-29 17:09:47 +08:00
alfadb	d78478e866	fix(gateway): sanitize stream errors to avoid leaking infrastructure topology (net.OpError).Error() concatenates Source/Addr fields, so the previous disconnectMsg surfaced internal source IP/port and upstream server address to clients via SSE error frames and UpstreamFailoverError.ResponseBody (reported by @Wei-Shaw on PR #2066). - Add sanitizeStreamError that maps known errors (io.ErrUnexpectedEOF, context.Canceled, syscall.ECONNRESET/EPIPE/ETIMEDOUT/...) to fixed descriptions and falls back to a generic placeholder, with an explicit net.OpError branch that drops Source/Addr fields entirely. - Use sanitized message in client-facing disconnectMsg; full ev.err is still preserved in the existing operator log line for diagnosis. - Tests cover net.OpError redaction, the failover ResponseBody path, and every known sanitized error mapping.	2026-04-29 15:44:54 +08:00
alfadb	4c474616b9	fix(gateway): emit Anthropic-standard SSE error events and failover body Two follow-ups to PR #2066's failover-wrap fix: 1. Failover ResponseBody (`UpstreamFailoverError.ResponseBody`) was encoded as `{"error": "<msg>"}` (string field). `ExtractUpstreamErrorMessage` probes for `error.message`, `detail`, or top-level `message` only — so `handleFailoverExhausted` and downstream passthrough rules saw an empty message, losing the EOF root cause in ops logs. Re-encode as the Anthropic standard shape `{"type":"error","error":{"type":"upstream_disconnected","message":"..."}}`. (Addresses the inline review comment from copilot-pull-request-reviewer on Wei-Shaw/sub2api#2066.) 2. The streaming `event: error` SSE frame for `response_too_large`, `stream_read_error`, and `stream_timeout` was non-standard (`{"error":"<reason>"}`). Anthropic SDKs (and Claude Code) expect `{"type":"error","error":{"type":"...","message":"..."}}` and parse `error.type`/`error.message` accordingly. Refactor `sendErrorEvent` to take both reason and message, and emit the standard frame so client SDKs surface a real diagnostic message instead of a generic stream error. This does not by itself prevent task interruption on long-stream EOF (SSE has no resume; client-side retry remains the only complete fix), but it gives both server-side ops logs and client-side error UIs a meaningful upstream message so users know the next step is to retry. Tests updated to assert the new body shape on both branches plus a new assertion that `ExtractUpstreamErrorMessage` returns a non-empty string. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 20:24:17 +08:00
alfadb	6327573534	fix(gateway): wrap Anthropic stream EOF as failover error before client output Anthropic streaming path (gateway_service.go) returned a plain error on upstream SSE read failure, so the handler-level UpstreamFailoverError check never fired and the client received a bare `stream_read_error` event, breaking long-running tasks even when no bytes had been written yet. The most common trigger is HTTP/2 GOAWAY from api.anthropic.com edge backends doing graceful rotation: Go's http.Transport surfaces this as `unexpected EOF` and never auto-retries. Mirror what the OpenAI and antigravity gateways already do: when the read error happens before any byte has reached the client (`!c.Writer.Written()`), return `*UpstreamFailoverError{StatusCode: 502, RetryableOnSameAccount: true}` so the handler can retry on the same or another account. After client output has begun, SSE has no resume protocol — keep the existing passthrough behavior. Tests cover both branches via streamReadCloser-based fixtures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 19:12:48 +08:00
Oliver	6d11f9ed77	Add Vertex service account support	2026-04-25 20:39:58 -04:00
shaw	496469ac4e	fix(gateway): skip body mimicry for real Claude Code clients to restore prompt caching PR #1914 unconditionally applied the full mimicry pipeline to all OAuth accounts, including real Claude Code CLI clients. This replaced the client's long system prompt (~10K+ tokens with stable cache_control breakpoints) with a short ~45 token [billing, CC prompt] pair, which falls below Anthropic's 1024-token minimum cacheable prefix threshold. The result: every request created a new cache but never hit an existing one. Fix: restore the Claude Code client detection gate so that real CC clients bypass body-level mimicry (system rewrite, message cache management, tool name obfuscation). Non-CC third-party clients (opencode, etc.) continue to receive full mimicry. Also harden the detection logic: - Make UA regex case-insensitive (align with claude_code_validator.go) - Validate metadata.user_id format via ParseMetadataUserID() instead of just checking non-empty, preventing third-party tools from spoofing a claude-cli/* UA with an arbitrary user_id string to bypass mimicry	2026-04-25 22:50:35 +08:00
hungryboy1025	8987e0ba67	fix(openai): tighten responses stream account tests	2026-04-25 16:56:50 +08:00
shaw	732d6495ea	chore(gateway): fix lint issues from cc-mimicry-parity merge - staticcheck QF1001: apply De Morgan's law to the OAuth-mimic header passthrough guard (`!(a && b)` → `a != ... \|\| !b`). - unused: drop `isClaudeCodeRequest`, which became dead after PR #1914 switched both `/v1/messages` and `/count_tokens` paths to unconditional `account.IsOAuth()` mimicry. The lowercase helper `isClaudeCodeClient` is kept (still referenced by `TestIsClaudeCodeClient`).	2026-04-25 08:58:57 +08:00
keh4l	bdbd2916f5	fix(gateway): skip client header passthrough on OAuth mimicry path Root cause of persistent third-party detection: sub2api's buildUpstreamRequest transparently forwards client headers via allowedHeaders whitelist (addHeaderRaw) before applying mimicry overrides. When third-party clients (opencode, etc.) send their own anthropic-beta / user-agent / x-stainless-* / x-claude-code-session-id values, these get appended to the request alongside our injected headers, creating an inconsistent header set that Anthropic detects. Parrot's build_upstream_headers constructs exactly 9 headers from scratch and never forwards anything from the client. This is why 'same opencode version, some users work some don't' — different opencode configs/versions send different header combinations. Fix: when tokenType=oauth and mimicClaudeCode=true, skip the client header passthrough loop entirely. The subsequent applyClaudeCodeMimicHeaders + ApplyFingerprint + beta merge pipeline constructs all necessary headers from our controlled values. Also: remove systemIncludesClaudeCodePrompt gate — OAuth accounts now unconditionally rewrite system (even if client already sent a Claude Code-style prompt), ensuring billing attribution block is always present.	2026-04-25 00:43:38 +08:00
keh4l	6dc89765fd	fix(gateway): always apply full mimicry for OAuth accounts regardless of client identity Before: isClaudeCodeRequest() checked whether the client looks like a real Claude Code CLI (UA, system prompt, X-App header, metadata format). If it looked like Claude Code, all mimicry was skipped — the assumption being that a real CLI needs no help. Problem: third-party tools like opencode partially impersonate Claude Code (sending claude-cli UA + claude-code beta + CC system prompt) but miss critical details (billing attribution block, tool-name obfuscation, cache breakpoints, full beta set). Some users' opencode instances pass the isClaudeCodeRequest check, causing sub2api to skip mimicry entirely, while Anthropic still detects the request as third-party. This explains why 'same opencode version, some users work, some don't' — it depends on which opencode features/config trigger the validator. Fix: OAuth accounts now unconditionally run the full mimicry pipeline, matching Parrot's behavior (Parrot never checks client identity). This is safe because our mimicry is strictly more complete than any third-party client's partial impersonation. Changed: - /v1/messages path: remove isClaudeCode gate - /v1/messages/count_tokens path: same	2026-04-25 00:26:37 +08:00
keh4l	f3233db01f	fix(gateway): apply D/E/F mimicry to native /v1/messages and count_tokens paths The previous commit only wired stripMessageCacheControl, addMessageCacheBreakpoints, and tool-name obfuscation into applyClaudeCodeOAuthMimicryToBody (used by /chat/completions and /responses). The native /v1/messages path and count_tokens path have their own independent mimicry code blocks and were missed. Now all three entry points share the same D/E/F pipeline: - /v1/messages (gateway_service.go forwardAnthropic) - /v1/messages/count_tokens (gateway_service.go countTokens) - OpenAI compat (applyClaudeCodeOAuthMimicryToBody)	2026-04-24 23:16:32 +08:00
keh4l	6e12578bc5	feat(gateway): port Parrot tool-name obfuscation + message cache breakpoints Implements the remaining three parity items with Parrot cc_mimicry: D) Tool-name obfuscation - Dynamic mapping when tools.length > 5 (matches Parrot threshold). Fake names follow {prefix}{name[:3]}{i:02d} (e.g. 'manage_bas00'). Go port of random.Random(hash(tuple(names))) uses fnv64a seed + math/rand; byte-exact reproduction is impossible (Python hash vs Go hash), but the two invariants that matter are preserved: * same input tool_names yield identical mapping (cache hit) * prefix pool is shuffled (names look distributed) - Static prefix map (sessions_ -> cc_sess_, session_ -> cc_ses_) applied as fallback, matching Parrot TOOL_NAME_REWRITES verbatim. - Server tools (web_search_20250305, computer_, etc.) are NOT renamed; only type=='function' and type=='custom' tools are. - tool_choice.name is rewritten in sync (only when type=='tool'). - Response side: bytes-level replace on every SSE chunk / JSON body at 6 injection points (standard stream/non-stream, passthrough stream/non-stream, chat_completions stream + non-stream, responses stream + non-stream). Reverse mapping applied longest-fake-name-first to prevent substring conflicts (parity with Parrot _restore_tool_names_in_chunk). - tool_choice is no longer unconditionally deleted in normalizeClaudeOAuthRequestBody — Parrot passes it through. E) tools[-1] cache_control breakpoint - Injected as {type:ephemeral, ttl:<DefaultCacheControlTTL>} when the last tool has no cache_control. Client-provided ttl is passed through unchanged (repo-wide policy). F) messages cache_control strategy - stripMessageCacheControl removes every client-provided messages[].content[].cache_control (multi-turn stability). - addMessageCacheBreakpoints then injects two stable breakpoints: (1) last message, and (2) second-to-last user turn when messages.length >= 4. 
- Combined with the system block breakpoint and tools[-1] breakpoint, this gives exactly the 4 breakpoints Anthropic allows per request. Non-trivial implementation details to be aware of when rebasing: Two new files, no upstream collision: gateway_tool_rewrite.go (D + E algorithms) gateway_messages_cache.go (F strip + breakpoints) * Two new feature calls bolted onto the tail of applyClaudeCodeOAuthMimicryToBody in gateway_service.go — rebase conflicts will be ~10 lines maximum. * Response-side injection points all wrap their existing write with reverseToolNamesIfPresent(c, ...), preserving original behavior when no mapping is stored (static prefix rollback still runs). * Non-stream chat/responses switched from c.JSON to json.Marshal + c.Data so bytes-level replace is possible. * Retry bodies (FilterThinkingBlocksForRetry, FilterSignatureSensitiveBlocksForRetry, RectifyThinkingBudget) only prune blocks — they preserve the already-obfuscated tool names, so no extra mapping re-application is needed. Manual QA: end-to-end scenario verified with 6 tools (above threshold) and tool_choice.type=='tool'. Obfuscation + restore roundtrip shown in test logs; then removed the temp test file. Tests (16 new): - buildDynamicToolMap stability + below-threshold guard - sanitizeToolName precedence (dynamic > static) - restoreToolNamesInBytes longest-first + static rollback - applyToolNameRewriteToBody skips server tools + syncs tool_choice - applyToolsLastCacheBreakpoint defaults to 5m + passes client ttl - stripMessageCacheControl + addMessageCacheBreakpoints in the 1/4/string-content cases + second-to-last user turn selection - buildToolNameRewriteFromBody ReverseOrdered is desc-by-fake-length - fake name shape follows Parrot {prefix}{head3}{i:02d}	2026-04-24 23:16:32 +08:00
keh4l	a25faecadd	feat(gateway): align body shape with real Claude Code CLI defaults Three field-level alignments in normalizeClaudeOAuthRequestBody to match real Claude Code CLI traffic byte-for-byte: 1. temperature: previously deleted unconditionally; now passes through client value, defaults to 1 when absent (real CLI always sends temperature, default 1). 2. max_tokens: defaults to 128000 when absent (real CLI default). 3. context_management: when thinking.type is enabled/adaptive and the client did not provide context_management, inject {"edits":[{"type":"clear_thinking_20251015","keep":"all"}]} to mirror real CLI behavior. tool_choice removal is unchanged (Claude Code OAuth credentials do not allow client-supplied tool_choice). Tests updated: - gateway_body_order_test.go: temperature/max_tokens are now expected in output; tool_choice still removed. - gateway_prompt_test.go: system array is now 2 blocks (billing + cc prompt), assertions adjusted. - gateway_anthropic_apikey_passthrough_test.go: same 2-block assertion.	2026-04-24 23:16:32 +08:00
keh4l	5862e2d8d9	feat(gateway): add billing attribution block with cc_version fingerprint Real Claude Code CLI always sends a 2-block system array: [0] {"type":"text", "text":"x-anthropic-billing-header: cc_version=X.Y.Z.{fp}; cc_entrypoint=cli; cch=00000;"} [1] {"type":"text", "text":"You are Claude Code...", "cache_control":{...}} Before this commit, sub2api's mimicry path only produced block [1]. The missing billing block is one of the primary third-party detection signals Anthropic uses for Claude-Code-scoped OAuth tokens. New file gateway_billing_block.go ports the fingerprint algorithm (byte-for-byte from Parrot cc_mimicry.py:compute_fingerprint): pick chars at positions [4,7,20] of the first user text, then `sha256(SALT + chars + cc_version)[:3]`. - claude/constants.go: CLICurrentVersion = "2.1.92" (must match UA) - gateway_billing_block.go: computeClaudeCodeFingerprint + buildBillingAttributionBlockJSON + extractFirstUserText - gateway_service.go: rewriteSystemForNonClaudeCode now emits both blocks in order; cch=00000 is filled in later by signBillingHeaderCCH in buildUpstreamRequest. Downstream compat note: syncBillingHeaderVersion's regex `cc_version=\d+\.\d+\.\d+` only matches the semver triple, leaving the `.{fp}` suffix intact when rewriting in buildUpstreamRequest.	2026-04-24 23:16:32 +08:00
keh4l	66d6454535	feat(claude): add ttl to cache_control with default 5m Real Claude CLI traffic sends cache_control as `{"type":"ephemeral","ttl":"1h"}`. Our previous payload only sent `{"type":"ephemeral"}`, which is a bytewise mismatch with the official CLI and one more third-party detection signal. Policy: client-provided ttl is always passed through unchanged. Proxy-generated cache_control blocks default to 5m (vs Parrot's 1h) to avoid burning the 1h cache budget on automatic breakpoints while still aligning with the `ttl` field being present. - claude/constants.go: DefaultCacheControlTTL = "5m" - apicompat/types.go: new AnthropicCacheControl type with TTL field; AnthropicTool gains optional CacheControl pointer so the mimicry path can attach a cache breakpoint to tools[-1] later. - service/gateway_service.go: anthropicCacheControlPayload gains TTL; marshalAnthropicSystemTextBlock and rewriteSystemForNonClaudeCode emit ttl=5m by default.	2026-04-24 23:16:32 +08:00
keh4l	165553cfb0	fix(gateway): use full beta list in buildUpstreamRequest mimicry path The previous commit added FullClaudeCodeMimicryBetas() but the two call sites in buildUpstreamRequest still hardcoded the old 3-token subset. Anthropic now checks the complete set of beta tokens to decide if a request qualifies as Claude Code. Wire them up: - /v1/messages mimic path: requiredBetas = FullClaudeCodeMimicryBetas() - /v1/messages/count_tokens mimic path: same + BetaTokenCounting Haiku models keep the 2-token exemption (BetaOAuth + InterleaveThinking).	2026-04-24 23:16:32 +08:00
keh4l	b5467d610a	fix(gateway): apply full Claude Code mimicry on /chat/completions and /responses Before: the OpenAI-compat forwarders only called injectClaudeCodePrompt, which prepends the Claude Code banner but leaves the rest of the body in its original non-Claude-Code shape. The codebase already admits this is insufficient (see the comment on rewriteSystemForNonClaudeCode in gateway_service.go: "仅前置追加 Claude Code 提示词无法通过检测"). Effect: OAuth accounts served through /v1/chat/completions or /v1/responses were detected as third-party apps and bled plan quota with: Third-party apps now draw from your extra usage, not your plan limits. Fix: - apicompat.AnthropicRequest: add Metadata json.RawMessage so metadata survives the OpenAI->Anthropic->Marshal round trip; without it the downstream rewrite has no user_id to work with. - service: extract applyClaudeCodeOAuthMimicryToBody, a ParsedRequest-free variant of the /v1/messages mimicry pipeline (rewriteSystemForNonClaudeCode + normalizeClaudeOAuthRequestBody + metadata.user_id injection) so the OpenAI-compat forwarders can reuse it. - service: add buildOAuthMetadataUserIDFromBody + hashBodyForSessionSeed for the same reason (no ParsedRequest at the call site). - ForwardAsChatCompletions / ForwardAsResponses: replace the 3-line prompt-prepend with the full mimicry pipeline. - applyClaudeCodeMimicHeaders: set x-client-request-id per-request (real Claude CLI always does); missing/duplicated values are one more third-party fingerprint signal. No change to the native /v1/messages path: it already called the full pipeline, we only lift those helpers into a reusable function. Tests: - go build ./... passes - go test ./internal/service/... ./internal/pkg/apicompat/... passes - lsp_diagnostics clean on all touched files - pre-existing failures in internal/config are unrelated (env-sensitive tests that also fail on upstream main)	2026-04-24 23:16:32 +08:00
erio	258fd145ff	fix(account): prevent quota-exceeded API key/Bedrock accounts from being scheduled Add quota exceeded check to IsSchedulable() and refactor shouldClearStickySession to delegate to IsSchedulable(), eliminating duplicated logic and fixing missed overload/rate-limit/expired checks. Frontend displays quota exceeded status independently via quota fields.	2026-04-19 18:45:04 +08:00
erio	44cdef7934	fix(usage): subscription billing honours group rate multiplier Subscription-mode billing was consuming quota at TotalCost (raw) instead of ActualCost (TotalCost * RateMultiplier), so per-group rate multipliers — including free subscriptions (multiplier = 0) — were silently ignored. Switch the three subscription cost writes in buildUsageBillingCommand, finalizePostUsageBilling, and the legacy postUsageBilling fallback to ActualCost, and add a table-driven test covering 2x / 0.5x / free multipliers plus a balance-mode regression check.	2026-04-17 22:06:32 +08:00
erio	10699eeb34	refactor: extract ReadUpstreamResponseBody to deduplicate upstream response read + too-large error handling Consolidates 9 call sites of resolveUpstreamResponseReadLimit + readUpstreamResponseBodyLimited + ErrUpstreamResponseBodyTooLarge error handling into a single ReadUpstreamResponseBody function with TooLargeWriter callback for API-format-specific error responses (Anthropic, OpenAI, countTokens).	2026-04-16 01:53:22 +08:00
erio	a9880ee7b9	fix: round-2 audit fixes — security, code quality, and UI improvements Security (HIGH): - Normalize all Redis cache keys to lowercase (verifyCode, passwordReset) - Fix verify code TTL renewal on failed attempts: use remaining TTL via ExpiresAt field instead of resetting to full 15-minute window - Add 3 missing fields to diffSettings audit log (promo_code, invitation_code, custom_endpoints) Code quality (MEDIUM): - Extract filterVerifiedEmails shared helper (balance_notify_service.go) - Add Pricing array non-empty validation for channel pricing rules - Add platform token semantics comment in gateway_service.go - Complete validatePlanPatch test coverage (+10 test cases) - Replace string types with QuotaThresholdType/QuotaResetMode across frontend - Remove duplicate getPlatformTextColor/getRateBadgeClass in ChannelsView - Return EMAIL_NOT_FOUND error on RemoveNotifyEmail miss UI improvements: - Reorder cost tooltip: user billing above separator, account billing below - Add NaN guard to accountBilled function - Move timezone selector inline into reset-mode row (no longer standalone)	2026-04-14 09:35:05 +08:00
erio	98c9d51791	fix: correct account stats pricing priority order Priority was wrong: - Before: custom rules → LiteLLM (when ApplyPricingToAccountStats) → nil - After: custom rules → totalCost (when ApplyPricingToAccountStats) → LiteLLM → nil When ApplyPricingToAccountStats is enabled, use the request's actual client billing cost (before multiplier) as account_stats_cost, instead of recalculating from LiteLLM per-token prices which produced incorrect values for per-request billing mode. LiteLLM model pricing is now the final fallback (priority 3), used only when neither custom rules nor ApplyPricingToAccountStats apply.	2026-04-14 09:28:11 +08:00
erio	1262654d97	feat: WebSearch tri-state, account stats pricing fix, quota cache fix, usage tooltip WebSearch tri-state switch: - Account-level web_search_emulation changed from bool to tri-state string: "default" (follow channel) / "enabled" / "disabled" - shouldEmulateWebSearch checks channel config when account is "default" - SQL migration converts old bool values - Frontend select replaces toggle in Edit/CreateAccountModal Account stats pricing: - resolveAccountStatsCost uses upstream model (post-mapping) for matching - Priority: custom rules → model pricing file (when toggle on) → default - Custom rules always configurable, independent of toggle - Account ID field changed to searchable selector filtered by platform - Description updated to reflect new behavior Quota notification cache fix: - CheckAccountQuotaAfterIncrement fetches real-time account from DB - Reconstructs pre-increment usage for accurate threshold crossing detection - New AccountQuotaReader interface (minimal: GetByID only) Usage tooltip: - Per-request/image billing shows per-request price instead of $0 token price - Token billing continues to show input/output price per million tokens	2026-04-14 09:26:08 +08:00
erio	11c4606874	fix(channel): use upstream model for account stats pricing and remove channel pricing fallback - resolveAccountStatsCost now uses the final upstream model (after account-level mapping) to match custom pricing rules, fixing the issue where requested model (e.g. claude-sonnet-4-5) didn't match rules configured for upstream model (e.g. claude-opus-4-6) - Remove tryChannelPricing fallback — only custom rules are applied, unmatched requests use default formula (total_cost × rate) - Remove unused billingService and serviceTier parameters - Update description: "启用后将支持自定义账号统计的模型价格" ("When enabled, custom model pricing for account statistics is supported")	2026-04-14 09:26:08 +08:00
erio	31550a2c6a	fix(notify): use real-time balance for crossing detection and simplify email logic - Fix cached balance causing threshold crossing to never trigger: read real-time balance from billingCacheService instead of stale API key auth snapshot - Remove email="" placeholder concept; all emails are user-managed - Only send notifications to verified && non-disabled emails - Frontend: pre-fill user's email in add input when list is empty - Remove FilterEnabledEmails/IsPrimaryDisabled helpers (no longer needed)	2026-04-14 09:26:07 +08:00
erio	c3812ce1e3	fix(notify): address review findings - accountCost formula, dedup, refactor - Fix accountCost calculation in finalizePostUsageBilling to match postUsageBilling (always multiply by AccountRateMultiplier) - Use strings.EqualFold for email dedup in collectBalanceNotifyRecipients - Extract CheckAccountQuotaAfterIncrement into smaller functions: buildQuotaDims + asyncSendQuotaAlert (< 30 lines each) - Add "not splittable" comments for HTML template functions - Extract QuotaNotifyToggle.vue sub-component to reduce QuotaLimitCard.vue from 404 to 339 lines	2026-04-14 09:23:16 +08:00
erio	b32d1a2c9f	feat(notify): add balance low & account quota notification system - User balance low notification: email alert when balance drops below configurable threshold (user email + verified extra emails) - Account quota notification: broadcast email to admin-configured recipients when daily/weekly/total quota usage exceeds alert threshold - Admin settings: global enable/disable, default threshold, quota notification email list (Email Settings tab) - User profile: enable/disable, custom threshold, add/remove extra notification emails with verification code flow - Account quota: per-dimension alert toggle and threshold in quota control card - Trigger logic: first-crossing only (old >= threshold && new < threshold for balance; old < threshold && new >= threshold for quota), naturally prevents duplicate notifications without Redis dedup	2026-04-14 09:23:02 +08:00
erio	7535e312e0	feat(channels): add custom account stats pricing rules Allow channels to configure independent model pricing for account statistics cost calculation, decoupled from user billing. Backend: - Migration 101: channels.apply_pricing_to_account_stats toggle, channel_account_stats_pricing_rules/model_pricing tables, usage_logs.account_stats_cost column - resolveAccountStatsCost: match rules by group/account, then channel pricing, fallback to original formula when unconfigured - Integrate into both GatewayService.recordUsageCore and OpenAIGatewayService.RecordUsage - Update 8 account stats SQL queries to use COALESCE(account_stats_cost, total_cost) * account_rate_multiplier - 23 unit tests for matching, pricing lookup, and cost calculation Frontend: - Channel edit dialog: toggle + custom rules UI with group/account multi-select and pricing entry cards - API types and i18n (zh/en)	2026-04-14 09:22:12 +08:00
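
The first-crossing trigger rule described in commit b32d1a2c9f (old >= threshold && new < threshold for balance; old < threshold && new >= threshold for quota) can be sketched roughly as below. This is only an illustration of the stated conditions, not the repository's code: the function names balanceCrossedBelow and quotaCrossedAbove and the float64 types are assumptions made for the example.

```go
package main

import "fmt"

// balanceCrossedBelow is a hypothetical helper (not from the repository).
// It reports whether this update moved the balance from at-or-above the
// threshold to below it: old >= threshold && new < threshold.
func balanceCrossedBelow(oldBalance, newBalance, threshold float64) bool {
	return oldBalance >= threshold && newBalance < threshold
}

// quotaCrossedAbove is a hypothetical helper (not from the repository).
// It reports whether this update moved quota usage from below the
// threshold to at-or-above it: old < threshold && new >= threshold.
func quotaCrossedAbove(oldUsage, newUsage, threshold float64) bool {
	return oldUsage < threshold && newUsage >= threshold
}

func main() {
	// Balance falls through the threshold of 10: notify once.
	fmt.Println(balanceCrossedBelow(12, 8, 10)) // true
	// Balance was already below the threshold: no repeat notification.
	fmt.Println(balanceCrossedBelow(8, 6, 10)) // false
	// Quota usage rises through the threshold of 80: notify once.
	fmt.Println(quotaCrossedAbove(70, 85, 80)) // true
	// Usage was already above the threshold: no repeat notification.
	fmt.Println(quotaCrossedAbove(85, 90, 80)) // false
}
```

Because each condition is true only on the update that crosses the threshold, later updates on the same side do not fire again, which is why the commit message says duplicates are prevented without Redis dedup. The check depends on an accurate pre-update value, which is what the later fixes in 31550a2c6a and 1262654d97 address.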

385 Commits