sub2api

Author	SHA1	Message	Date
Pluviobyte	27600b1d2c	fix(gateway): filter count_tokens generation fields Anthropic count_tokens rejects generation-only fields such as temperature, top_p, top_k, stream, and stop sequences. Passing the original messages payload through unchanged can turn otherwise valid requests into upstream 400 errors. Sanitize only the count_tokens upstream payload after the gateway's existing request normalization, preserving fields that existing compatibility paths rely on while removing parameters the count_tokens endpoint does not accept. Fixes #2764 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-28 05:40:50 +00:00
github-actions[bot]	89d96f4b25	chore: sync VERSION to 0.1.132 [skip ci]	2026-05-27 14:28:22 +00:00
Wesley Liddick	cc077862b3	Merge pull request #2797 from wucm667/feat/account-list-created-at-column feat(admin): add a created-at column to the account management list	2026-05-27 22:10:21 +08:00
Wesley Liddick	bbe847ed3e	Merge pull request #2805 from StarryKira/feat/configurable-pool-retry-status-codes feat(account): configurable pool-mode same-account retry status codes	2026-05-27 22:09:55 +08:00
Wesley Liddick	69657b2fa1	Merge pull request #2827 from ttt132/fix/api-key-responses-sse-fallback fix: fallback to SSE body for API key responses	2026-05-27 21:56:00 +08:00
Wesley Liddick	61ce79533e	Merge pull request #2800 from wucm667/fix/scheduler-model-not-found-per-model-cooldown fix(scheduler): on a model 404, cool down only that account-model pair instead of blocking the whole account	2026-05-27 21:01:52 +08:00
Wesley Liddick	c949d22725	Merge pull request #2821 from Pluviobyte/fix/long-context-cache-creation-multiplier fix(billing): apply long-context multiplier to cache_creation price (follow-up to #2816)	2026-05-27 21:01:27 +08:00
Wesley Liddick	8461e42a97	Merge pull request #2822 from lyen1688/feat/group-custom-models-list feat(group): support a custom /v1/models model list	2026-05-27 21:00:19 +08:00
Wesley Liddick	7579058e91	Merge pull request #2820 from Pluviobyte/fix/antigravity-passthrough-message-start-input-tokens fix(antigravity): capture message_start input_tokens in streaming passthrough	2026-05-27 20:59:51 +08:00
haichuan	32ea9cfe1f	fix: fallback to SSE body for API key responses	2026-05-27 20:24:52 +08:00
lyen1688	f597c1581b	feat(group): support a custom /v1/models model list	2026-05-27 18:00:45 +08:00
Pluviobyte	ed2aac25a6	fix(billing): apply long-context multiplier to cache_creation price Follow-up to #2816 (already merged): the same long-context pricing exemption that affected cache_read also applies to all three cache_creation price fields (standard, 5m ephemeral, 1h ephemeral). computeCacheCreationCost reads these prices directly from pricing and never sees the LongContextInputMultiplier that computeTokenBreakdown applies to inputPrice / outputPrice / cacheReadPrice. For GPT-5.4 / 5.5 above the 272k threshold, this causes the cache_write portion of long sessions to be billed at roughly half what it should be (default multiplier 2.0). Cache writes are conceptually input-side operations and should share the same long-context treatment as input / cache_read. This patch threads an explicit multiplier into computeCacheCreationCost so the function can be unit-tested in isolation and matches the existing pattern used for cache_read. computeTokenBreakdown captures the long context decision once and passes LongContextInputMultiplier when it applies, 1.0 otherwise. Adds three regression tests mirroring the #2816 cache_read tests: - positive: long-context triggered -> cache_creation scaled by 2.0x - negative: below threshold -> cache_creation stays at base price - breakdown: 5m + 1h ephemeral prices both scaled when applicable Refs #2816 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-27 09:59:58 +00:00
shaw	a391635191	Update OpenAI usage key configuration	2026-05-27 17:29:20 +08:00
Pluviobyte	1e6d0b602a	fix(antigravity): capture message_start input_tokens in streaming passthrough The antigravity upstream-passthrough path (account.Type == AccountTypeUpstream forwarding to a Claude-format upstream) drains the SSE stream via streamUpstreamResponse + extractSSEUsage. The extractor only reads top-level event["usage"], which matches Anthropic's message_delta but misses message_start where usage is nested under event.message.usage. As a result, every streaming /v1/messages request through this path drops the input-side fields (input_tokens, cache_read_input_tokens, cache_creation_*) and writes a usage_logs row with input_tokens=0 + output_tokens>0. The user in #2332 observed 2,728 such rows attributed to claude-opus-4-6 / haiku-4-5 streaming requests; their billing on output is correct but the input-side accounting is missing. (Their "duplicate write from message_delta" hypothesis isn't borne out by the code — RecordUsage is invoked once per request and writeUsageLogBestEffort dedupes by request_id; what they're seeing is single records produced by this buggy extractor.) Branch on event.type so message_start reads from event.message.usage and other events keep using event.usage, matching how parseSSEUsagePassthrough already handles both shapes for the Anthropic OAuth / API-key / Bedrock paths. Adds two extractSSEUsage table cases plus a TestExtractSSEUsage_StreamingSequence that drives the message_start → message_delta sequence end-to-end; both fail on main and pass with this change. Fixes #2332 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-27 09:02:15 +00:00
Wesley Liddick	b0142146af	Merge pull request #2816 from Pluviobyte/fix/long-context-cache-read-multiplier fix(billing): apply long-context multiplier to cache_read price (#2293)	2026-05-27 15:59:11 +08:00
Wesley Liddick	2387cf9934	Merge pull request #2799 from siyuan-123/fix/ws-rate-limit-failover Fix OpenAI WS not switching accounts automatically when rate-limited	2026-05-27 15:14:28 +08:00
SlientRainyDay	b9509e823a	fix(billing): apply long-context multiplier to cache_read price When session long-context pricing is triggered in computeTokenBreakdown (e.g. GPT-5.4 / GPT-5.5 above the 272k token threshold), the multiplier was only being applied to InputPricePerToken and OutputPricePerToken. The cache_read price was left at its base value, so CacheReadCost was silently undercharged whenever a long-context session also had cache hits — which is essentially every long Codex / Claude Code session. Concretely for gpt-5.4 with 300k cache_read tokens, the bug under-billed the request by exactly 1x the LongContextInputMultiplier on the cache portion (e.g. 0.075 instead of 0.150 in the regression test). Cache reads are conceptually input-side replays, so they should scale with LongContextInputMultiplier, matching the treatment of InputPricePerToken. Adds two regression tests: - positive: long-context triggered -> cache_read scaled by 2.0x - negative: below threshold -> cache_read stays at base price Fixes #2293 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-27 07:09:28 +00:00
wucm667	8f3b211997	ci: retrigger after transient GitHub Actions codeload outage	2026-05-27 09:34:58 +08:00
StarryKira	21033dceb9	feat(account): configurable pool-mode same-account retry status codes Pool mode currently retries the same account for a fixed set of upstream HTTP statuses: 401, 403, 429. Some upstream pool deployments also need same-account retry for transient provider/proxy statuses such as 502, 503, 520, 529, but hard-coding more statuses changes behavior for everyone. Add a per-account credentials option `pool_mode_retry_status_codes` that lets admins choose which upstream HTTP status codes trigger same-account retry in pool mode: - Unset (default): preserve the current 401/403/429 default - Explicit list: override the defaults with the configured codes - Codes normalized to the 100-599 range, deduplicated, sorted The standalone `isPoolModeRetryableStatus` helper is kept as the default-only fallback. All 15 gateway call sites switch to the new `Account.IsPoolModeRetryableStatus` method so behavior is preserved for accounts that do not configure the new field. Frontend admin UI gains a "Retry Status Codes" comma-separated input under the pool-mode section in both Create/Edit account modals (en + zh i18n). Fixes #2731 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 11:24:25 -07:00
shaw	f7ac5e5931	fix(openai): preserve chat responses usage billing	2026-05-26 21:33:28 +08:00
wucm667	a31b507484	fix(scheduler): on a model 404, cool down only the account-model pair	2026-05-26 20:29:48 +08:00
Wesley Liddick	4b9b63443f	Merge pull request #2790 from Arron196/from-arron-main Fix Ops SLA statistics for local limit errors	2026-05-26 20:21:11 +08:00
siyuan	08061717b8	fix: enable account failover for OpenAI WS rate limits	2026-05-26 20:07:00 +08:00
Wesley Liddick	4a5c5367cf	Merge pull request #2796 from DaydreamCoding/fix/account-reauth-keep-extra fix(account): re-authorization no longer clears Extra config	2026-05-26 20:06:48 +08:00
Wesley Liddick	b9f421d647	Merge pull request #2751 from wucm667/fix/bedrock-strip-context-management-when-beta-removed fix(bedrock): v0.1.130 regression: also strip the context_management field when the beta token is removed	2026-05-26 20:05:43 +08:00
wucm667	b6a38ddab7	feat(admin): add a created-at column to the account management list	2026-05-26 19:59:12 +08:00
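DaydreamCoding	11fe7de926	fix(account): re-authorization no longer clears Extra config When a Claude / OpenAI account is re-authorized through the generic PUT /accounts/:id, the backend UpdateAccount overwrites account.Extra in full (keeping only the 5 quota usage keys), so persisted settings such as base_rpm / window_cost_limit / window_cost_sticky_reserve / max_sessions / quota_* / privacy_mode are all lost. Add a dedicated endpoint, POST /accounts/:id/apply-oauth-credentials, that follows the existing /refresh path pattern: Credentials-only update + key-level merge of the Extra JSONB (UpdateAccountExtra) + ClearError + InvalidateToken. Scope: the three call sites for Claude OAuth / Claude Cookie auth / OpenAI OAuth. The existing Gemini / Antigravity paths never sent extra and are unchanged. Also fixed: the old re-authorization path did not call InvalidateToken, so the first request after re-authorization could still use the old cached token and fail immediately with a 401. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 19:46:08 +08:00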
benjamin	03ae510c68	fix(ops): exclude count-tokens from metrics errors Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:21:56 +08:00
benjamin	9c56fe0b0b	fix(openai): mark fast-policy entrypoints business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:21:45 +08:00
benjamin	5d7df678b1	fix(openai): mark local gateway denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:50 +08:00
benjamin	47fe90eab4	fix(antigravity): mark whitelist denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:37 +08:00
benjamin	c3e7476992	fix(gateway): mark local platform gates business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:23 +08:00
benjamin	c782c2d9c3	fix(ops): classify local policy denials outside SLA Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:09 +08:00
benjamin	00eb3abbe1	fix(auth): mark Google group denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:18:55 +08:00
benjamin	bd1e98ec29	fix(auth): mark API key group denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:18:41 +08:00
benjamin	5c4101ac53	feat(ops): add local business limit reasons Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:18:27 +08:00
github-actions[bot]	9ef144874a	chore: sync VERSION to 0.1.131 [skip ci]	2026-05-26 06:43:49 +00:00
Wesley Liddick	bebc082306	Merge pull request #2766 from DaydreamCoding/feat/user-platform-quota feat(quota): user × platform USD quota	2026-05-26 14:13:18 +08:00
Wesley Liddick	83248478e2	Merge pull request #2777 from lyen1688/feat/content-moderation-risk-threshold feat: support configurable content moderation risk thresholds	2026-05-26 14:12:54 +08:00
Wesley Liddick	a28e8e3d44	Merge pull request #2776 from mt21625457/fix-http2-bug [codex] Fix OpenAI/Codex HTTP/2 response header timeout	2026-05-26 14:12:11 +08:00
lyen1688	23f3d426c6	feat: support configurable content moderation risk thresholds	2026-05-26 13:58:02 +08:00
mt21625457	33ac8eb27d	fix openai http2 response header timeout	2026-05-26 13:57:59 +08:00
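DaydreamCoding	6b39b344d8	feat(quota): user × platform USD quota Provides USD quota control for users on the four platforms anthropic/openai/gemini/antigravity across three windows (daily/weekly/monthly). Quota semantics: unset = unlimited, 0 = disabled, >0 = USD cap. Two-layer model: - Config layer: a system default quota plus default quotas for the seven auth sources email/linuxdo/oidc/wechat/github/google/dingtalk, stored in settings and read and written as whole nested JSON values (1 key for the system + 1 key per source), with whole-value replacement semantics. - Runtime layer: the user_platform_quota table records each user's actual quota, decoupled from the config layer. Backend: new ent schema and 140_user_platform_quotas.sql migration, repository and service ports, billing pipeline integration, and admin and user read/write endpoints. Frontend: quota editing on the admin settings page, a user quota management modal, user Dashboard display, and Chinese and English copy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 10:49:20 +08:00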
shaw	2f70d965bf	chore: update sponsors	2026-05-25 19:06:55 +08:00
shaw	53acde1efd	style: fix lint errors in response.failed SSE writer Errcheck flagged three unchecked strings.Builder.WriteString calls and gofmt rejected over-aligned trailing comment in the route table. Rewrite writeResponsesFailedSSE with json.Marshal on typed structs instead of Builder+strconv.Quote. Same wire format, but: - no unchecked Write returns to silence - strict JSON escaping (strconv.Quote emits \a and \v which are not valid JSON; Marshal handles all runes correctly) - omitempty model field via struct tag instead of conditional Builder - consistent with the json.Marshal style used elsewhere in handler/ Collapse trailing comment whitespace in stream_error_event_test.go to satisfy gofmt. All 30+ subtests in the package still pass.	2026-05-25 18:16:46 +08:00
Wesley Liddick	a18738b29e	Merge pull request #2732 from wminjay/fix/responses-stream-failed-event fix(openai): emit response.failed when /v1/responses SSE aborted post-flush	2026-05-25 18:12:25 +08:00
Wesley Liddick	2fb9fb2f71	Merge pull request #2747 from siyuan-123/fix/ws-tool-output-continuation Fix recognition of tool-output continuation under the WS protocol	2026-05-25 17:58:00 +08:00
wucm667	a9c7a3a095	fix(bedrock): strip context_management when beta is removed	2026-05-25 14:15:39 +08:00
siyuan	fc66cd704a	fix: recognize codex tool outputs in ws continuation	2026-05-25 10:46:58 +08:00
Jamie Wong	b34cc71bee	fix(openai): also emit response.failed in ensureForwardErrorResponse after Writer.Written Case B: when a slot wait flushes SSE ping comments first (Writer.Written becomes true), the previous ensureForwardErrorResponse short-circuited on `c.Writer.Written()` and returned false without notifying the client. Subsequent upstream errors (http2 timeout, stream INTERNAL_ERROR, etc.) produced silent EOF; Codex CLI reported "stream closed before response.completed" just like the user-slot timeout case. Remove the Written() early return; coerce streamStarted to true when Writer has already been written to, and let handleStreamingAwareError walk the existing logic — which now (thanks to the previous commits) emits a protocol-compliant response.failed for /responses paths and the legacy `event: error` for others. Update tests that previously asserted "do not override written response": the new contract is to append an SSE terminal frame so the client sees a clean close instead of EOF. recoverResponsesPanic inherits this fix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-24 22:00:56 +08:00
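
Commit 1e6d0b602a above describes branching on the event type so that message_start reads usage from event.message.usage while other events keep reading the top-level usage. The following Go sketch illustrates that idea only; it is not the repository's extractSSEUsage. The names usage, sseEvent, usageFromEvent and mergeUsage are hypothetical, and the merge rule (a later non-zero value overwrites an earlier one) is an assumption made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage holds the two token counters used in this sketch (hypothetical type).
type usage struct {
	InputTokens  int `json:"input_tokens"`
	OutputTokens int `json:"output_tokens"`
}

// sseEvent models only the fields needed here: top-level usage
// (message_delta) and usage nested under message (message_start).
type sseEvent struct {
	Type    string `json:"type"`
	Usage   *usage `json:"usage"`
	Message *struct {
		Usage *usage `json:"usage"`
	} `json:"message"`
}

// usageFromEvent decodes one SSE data payload and returns the usage
// object from the location that matches the event type, or nil if absent.
func usageFromEvent(data []byte) (*usage, error) {
	var ev sseEvent
	if err := json.Unmarshal(data, &ev); err != nil {
		return nil, err
	}
	if ev.Type == "message_start" {
		if ev.Message == nil {
			return nil, nil
		}
		return ev.Message.Usage, nil
	}
	return ev.Usage, nil
}

// mergeUsage folds one event's usage into the running total.
// Assumption: a later non-zero counter replaces the earlier value.
func mergeUsage(total *usage, u *usage) {
	if u == nil {
		return
	}
	if u.InputTokens > 0 {
		total.InputTokens = u.InputTokens
	}
	if u.OutputTokens > 0 {
		total.OutputTokens = u.OutputTokens
	}
}

func main() {
	events := []string{
		`{"type":"message_start","message":{"usage":{"input_tokens":25,"output_tokens":1}}}`,
		`{"type":"message_delta","usage":{"output_tokens":15}}`,
	}
	var total usage
	for _, raw := range events {
		u, err := usageFromEvent([]byte(raw))
		if err != nil {
			fmt.Println("decode error:", err)
			return
		}
		mergeUsage(&total, u)
	}
	fmt.Printf("input=%d output=%d\n", total.InputTokens, total.OutputTokens)
}
```

An extractor that reads only the top-level usage field would see no input_tokens in this sequence, which matches the input_tokens=0 rows the commit message describes.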


3631 Commits
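
Commit 21033dceb9 in the list above states the rules for pool_mode_retry_status_codes: an unset value keeps the 401/403/429 default, an explicit list overrides it, and codes are normalized to the 100-599 range, deduplicated and sorted. A minimal Go sketch of those rules follows. The helper names normalizeRetryCodes and isRetryable are hypothetical and are not the project's Account.IsPoolModeRetryableStatus. The sketch also assumes that "normalized to the 100-599 range" means out-of-range codes are dropped rather than clamped.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultPoolRetryCodes mirrors the 401/403/429 default named in the commit message.
var defaultPoolRetryCodes = []int{401, 403, 429}

// normalizeRetryCodes drops codes outside 100-599 (assumption: dropped,
// not clamped), removes duplicates, and returns the rest in ascending order.
func normalizeRetryCodes(codes []int) []int {
	seen := make(map[int]struct{}, len(codes))
	out := make([]int, 0, len(codes))
	for _, c := range codes {
		if c < 100 || c > 599 {
			continue
		}
		if _, dup := seen[c]; dup {
			continue
		}
		seen[c] = struct{}{}
		out = append(out, c)
	}
	sort.Ints(out)
	return out
}

// isRetryable reports whether status triggers a same-account retry.
// A nil configured slice stands for "unset" and falls back to the defaults;
// any non-nil slice overrides them.
func isRetryable(configured []int, status int) bool {
	codes := defaultPoolRetryCodes
	if configured != nil {
		codes = normalizeRetryCodes(configured)
	}
	for _, c := range codes {
		if c == status {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(normalizeRetryCodes([]int{529, 502, 502, 42, 700, 503}))
	fmt.Println(isRetryable(nil, 429))
	fmt.Println(isRetryable([]int{502, 503}, 429))
}
```

Treating nil as "unset" keeps existing accounts on the default set, while an explicit list such as 502, 503 removes 429 from the retryable set for that account only.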