sub2api

Author	SHA1	Message	Date
Wesley Liddick	7579058e91	Merge pull request #2820 from Pluviobyte/fix/antigravity-passthrough-message-start-input-tokens fix(antigravity): capture message_start input_tokens in streaming passthrough	2026-05-27 20:59:51 +08:00
shaw	a391635191	Update OpenAI usage key configuration	2026-05-27 17:29:20 +08:00
Pluviobyte	1e6d0b602a	fix(antigravity): capture message_start input_tokens in streaming passthrough The antigravity upstream-passthrough path (account.Type == AccountTypeUpstream forwarding to a Claude-format upstream) drains the SSE stream via streamUpstreamResponse + extractSSEUsage. The extractor only reads top-level event["usage"], which matches Anthropic's message_delta but misses message_start where usage is nested under event.message.usage. As a result, every streaming /v1/messages request through this path drops the input-side fields (input_tokens, cache_read_input_tokens, cache_creation_*) and writes a usage_logs row with input_tokens=0 + output_tokens>0. The user in #2332 observed 2,728 such rows attributed to claude-opus-4-6 / haiku-4-5 streaming requests; their billing on output is correct but the input-side accounting is missing. (Their "duplicate write from message_delta" hypothesis isn't borne out by the code — RecordUsage is invoked once per request and writeUsageLogBestEffort dedupes by request_id; what they're seeing is single records produced by this buggy extractor.) Branch on event.type so message_start reads from event.message.usage and other events keep using event.usage, matching how parseSSEUsagePassthrough already handles both shapes for the Anthropic OAuth / API-key / Bedrock paths. Adds two extractSSEUsage table cases plus a TestExtractSSEUsage_StreamingSequence that drives the message_start → message_delta sequence end-to-end; both fail on main and pass with this change. Fixes #2332 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-27 09:02:15 +00:00
Wesley Liddick	b0142146af	Merge pull request #2816 from Pluviobyte/fix/long-context-cache-read-multiplier fix(billing): apply long-context multiplier to cache_read price (#2293)	2026-05-27 15:59:11 +08:00
Wesley Liddick	2387cf9934	Merge pull request #2799 from siyuan-123/fix/ws-rate-limit-failover Fix OpenAI WS not automatically switching accounts when rate-limited	2026-05-27 15:14:28 +08:00
SlientRainyDay	b9509e823a	fix(billing): apply long-context multiplier to cache_read price When session long-context pricing is triggered in computeTokenBreakdown (e.g. GPT-5.4 / GPT-5.5 above the 272k token threshold), the multiplier was only being applied to InputPricePerToken and OutputPricePerToken. The cache_read price was left at its base value, so CacheReadCost was silently undercharged whenever a long-context session also had cache hits — which is essentially every long Codex / Claude Code session. Concretely for gpt-5.4 with 300k cache_read tokens, the bug under-billed the request by exactly 1x the LongContextInputMultiplier on the cache portion (e.g. 0.075 instead of 0.150 in the regression test). Cache reads are conceptually input-side replays, so they should scale with LongContextInputMultiplier, matching the treatment of InputPricePerToken. Adds two regression tests: - positive: long-context triggered -> cache_read scaled by 2.0x - negative: below threshold -> cache_read stays at base price Fixes #2293 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-27 07:09:28 +00:00
shaw	f7ac5e5931	fix(openai): preserve chat responses usage billing	2026-05-26 21:33:28 +08:00
Wesley Liddick	4b9b63443f	Merge pull request #2790 from Arron196/from-arron-main Fix Ops SLA error statistics for local limits	2026-05-26 20:21:11 +08:00
siyuan	08061717b8	fix: enable account failover for OpenAI WS rate limits	2026-05-26 20:07:00 +08:00
Wesley Liddick	4a5c5367cf	Merge pull request #2796 from DaydreamCoding/fix/account-reauth-keep-extra fix(account): re-authorization no longer clears Extra configuration	2026-05-26 20:06:48 +08:00
Wesley Liddick	b9f421d647	Merge pull request #2751 from wucm667/fix/bedrock-strip-context-management-when-beta-removed fix(bedrock): v0.1.130 regression: strip the context_management field when the beta token is removed	2026-05-26 20:05:43 +08:00
DaydreamCoding	11fe7de926	fix(account): re-authorization no longer clears Extra configuration When a Claude / OpenAI account is re-authorized through the generic PUT /accounts/:id, the backend UpdateAccount overwrites account.Extra in full (keeping only the 5 quota usage keys), so persisted settings such as base_rpm / window_cost_limit / window_cost_sticky_reserve / max_sessions / quota_* / privacy_mode are all lost. Adds a dedicated endpoint, POST /accounts/:id/apply-oauth-credentials, which follows the existing /refresh path pattern: Credentials-only update + key-level merge of the Extra JSONB (UpdateAccountExtra) + ClearError + InvalidateToken. Scope: the three call sites for Claude OAuth / Claude Cookie auth / OpenAI OAuth. The existing Gemini / Antigravity paths never passed extra and are unchanged. Also fixes: the old re-authorization path did not call InvalidateToken, so the first request after re-authorization could still use the old cached token and fail immediately with 401. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 19:46:08 +08:00
benjamin	03ae510c68	fix(ops): exclude count-tokens from metrics errors Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:21:56 +08:00
benjamin	9c56fe0b0b	fix(openai): mark fast-policy entrypoints business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:21:45 +08:00
benjamin	5d7df678b1	fix(openai): mark local gateway denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:50 +08:00
benjamin	47fe90eab4	fix(antigravity): mark whitelist denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:37 +08:00
benjamin	c3e7476992	fix(gateway): mark local platform gates business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:23 +08:00
benjamin	c782c2d9c3	fix(ops): classify local policy denials outside SLA Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:19:09 +08:00
benjamin	00eb3abbe1	fix(auth): mark Google group denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:18:55 +08:00
benjamin	bd1e98ec29	fix(auth): mark API key group denials business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:18:41 +08:00
benjamin	5c4101ac53	feat(ops): add local business limit reasons Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:18:27 +08:00
github-actions[bot]	9ef144874a	chore: sync VERSION to 0.1.131 [skip ci]	2026-05-26 06:43:49 +00:00
Wesley Liddick	bebc082306	Merge pull request #2766 from DaydreamCoding/feat/user-platform-quota feat(quota): user × platform USD quota	2026-05-26 14:13:18 +08:00
Wesley Liddick	83248478e2	Merge pull request #2777 from lyen1688/feat/content-moderation-risk-threshold feat: support configurable risk threshold for content moderation	2026-05-26 14:12:54 +08:00
Wesley Liddick	a28e8e3d44	Merge pull request #2776 from mt21625457/fix-http2-bug [codex] Fix OpenAI/Codex HTTP/2 response header timeout	2026-05-26 14:12:11 +08:00
lyen1688	23f3d426c6	feat: support configurable risk threshold for content moderation	2026-05-26 13:58:02 +08:00
mt21625457	33ac8eb27d	fix openai http2 response header timeout	2026-05-26 13:57:59 +08:00
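DaydreamCoding	6b39b344d8	feat(quota): user × platform USD quota Provides USD quota control for users across the four platforms anthropic/openai/gemini/antigravity over three windows: daily, weekly, and monthly. Quota semantics: unset = unlimited, 0 = disabled, >0 = USD cap. Two-layer model: - Configuration layer: system default quotas, plus default quotas for the seven auth sources email/linuxdo/oidc/wechat/github/google/dingtalk, stored in settings and read and written as whole nested JSON values (1 key for the system + 1 key per source), with whole-value replacement semantics. - Runtime layer: the user_platform_quota table records each user's actual quota, decoupled from the configuration layer. Backend: new ent schema and 140_user_platform_quotas.sql migration, repository and service ports, billing pipeline integration, admin and user read/write endpoints. Frontend: quota editing on the admin settings page, user quota management modal, user Dashboard display, Chinese and English copy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 10:49:20 +08:00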
shaw	2f70d965bf	chore: update sponsors	2026-05-25 19:06:55 +08:00
shaw	53acde1efd	style: fix lint errors in response.failed SSE writer Errcheck flagged three unchecked strings.Builder.WriteString calls and gofmt rejected over-aligned trailing comment in the route table. Rewrite writeResponsesFailedSSE with json.Marshal on typed structs instead of Builder+strconv.Quote. Same wire format, but: - no unchecked Write returns to silence - strict JSON escaping (strconv.Quote emits \a and \v which are not valid JSON; Marshal handles all runes correctly) - omitempty model field via struct tag instead of conditional Builder - consistent with the json.Marshal style used elsewhere in handler/ Collapse trailing comment whitespace in stream_error_event_test.go to satisfy gofmt. All 30+ subtests in the package still pass.	2026-05-25 18:16:46 +08:00
Wesley Liddick	a18738b29e	Merge pull request #2732 from wminjay/fix/responses-stream-failed-event fix(openai): emit response.failed when /v1/responses SSE aborted post-flush	2026-05-25 18:12:25 +08:00
Wesley Liddick	2fb9fb2f71	Merge pull request #2747 from siyuan-123/fix/ws-tool-output-continuation Fix recognition of tool-output continuation under the WS protocol	2026-05-25 17:58:00 +08:00
wucm667	a9c7a3a095	fix(bedrock): strip context_management when beta is removed	2026-05-25 14:15:39 +08:00
siyuan	fc66cd704a	fix: recognize codex tool outputs in ws continuation	2026-05-25 10:46:58 +08:00
Jamie Wong	b34cc71bee	fix(openai): also emit response.failed in ensureForwardErrorResponse after Writer.Written Case B: when a slot wait flushes SSE ping comments first (Writer.Written becomes true), the previous ensureForwardErrorResponse short-circuited on `c.Writer.Written()` and returned false without notifying the client. Subsequent upstream errors (http2 timeout, stream INTERNAL_ERROR, etc.) produced silent EOF; Codex CLI reported "stream closed before response.completed" just like the user-slot timeout case. Remove the Written() early return; coerce streamStarted to true when Writer has already been written to, and let handleStreamingAwareError walk the existing logic — which now (thanks to the previous commits) emits a protocol-compliant response.failed for /responses paths and the legacy `event: error` for others. Update tests that previously asserted "do not override written response": the new contract is to append an SSE terminal frame so the client sees a clean close instead of EOF. recoverResponsesPanic inherits this fix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-24 22:00:56 +08:00
Jamie Wong	cff2f291be	fix(openai): also match bare /responses route in handleStreamingAwareError The first revision compared GetInboundEndpoint(c) against EndpointResponses ("/v1/responses"). NormalizeInboundEndpoint only recognizes paths that contain the literal "/v1/responses" substring, but the project actually registers six /responses routes — three of which (top-level r.POST("/responses", ...) and codexDirect's "/backend-api/codex/responses") have FullPath values without the "/v1" prefix and therefore fall through to the default branch. Codex CLI users targeting the bare /responses route at the production deployment (observed 2026-05-24 ~11:05 UTC, user 16) never reached the new writeResponsesFailedSSE path: the endpoint check was false, the legacy `event: error` frame fired, and the strict SDK kept reporting "stream closed before response.completed". Replace the strict equality check with inboundIsResponses(c), which uses suffix detection on FullPath (falling back to URL.Path when FullPath is empty in test fixtures) and covers all six route variants: /v1/responses[/...] /responses[/...] /backend-api/codex/responses[/...] Add test table covering all routes plus negative cases. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-24 19:32:08 +08:00
Jamie Wong	5e5c2062bf	fix(openai): emit response.failed for /v1/responses after stream started When /v1/responses streaming hits the user/account concurrency wait, the wait loop sends SSE ping comments to keep the connection alive, which flushes HTTP 200 + headers. If the wait then times out (or any other post-flush error fires), handleStreamingAwareError previously emitted a generic `event: error` frame. Codex CLI requires the stream to end with a Responses terminal event (response.completed/failed/incomplete/cancelled), so it reports "stream closed before response.completed" and the user-facing rate-limit intent is lost. This change detects inbound = /v1/responses in both handleStreamingAwareError implementations and emits a protocol-compliant response.failed event whose field set mirrors apicompat.makeResponsesCompletedEvent (id/object/model/status/output/error). The synthetic id reuses ctxkey.RequestID so client errors can be grepped against server logs. sequence_number is intentionally omitted to preserve monotonicity on streams that already emitted real events. Other inbound endpoints (/v1/chat/completions, /v1/messages) keep their legacy formats untouched. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-24 10:58:29 +08:00
github-actions[bot]	63b0631a58	chore: sync VERSION to 0.1.130 [skip ci]	2026-05-23 06:40:10 +00:00
shaw	0cfabaa82e	fix(i18n): escape at-sign in email whitelist placeholder Escape the at-sign in email whitelist placeholder to avoid Vue I18n linked message parsing errors on the settings page.	2026-05-23 14:18:20 +08:00
shaw	0430899748	feat(admin): add compact proxy IP resource link Add a low-visibility proxy IP resource link near proxy-related controls. - Show the link beside account proxy selectors - Show the link in the create proxy dialog tab row - Keep the entry inline to avoid interrupting form workflows	2026-05-23 14:18:19 +08:00
Wesley Liddick	3c5a444802	Merge pull request #2698 from deqiying/fix/log-real-client-ip fix: correct inaccurate client IP in rejection logs under reverse-proxy deployments	2026-05-23 11:08:47 +08:00
shaw	b6c0b40848	fix: update x/net vulnerability dependency	2026-05-23 10:55:44 +08:00
shaw	1e406fed52	fix: optimize OpenAI account cooldown scheduling	2026-05-23 10:18:43 +08:00
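deqiying	0af44ce4c2	fix: correct inaccurate client IP in rejection logs under reverse-proxy deployments Change request_client_ip in the OpenAI codex_cli_only rejection diagnostic log to reuse ip.GetClientIP, consistent with the real client IP resolution used by usage records and the access log. Keep request_remote_addr for troubleshooting the underlying Docker/reverse-proxy peer address, and add unit tests covering the case where the reverse-proxy headers and remote addr differ.	2026-05-22 23:28:21 +08:00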
Wesley Liddick	f59d9a5f8e	Merge pull request #2674 from wucm667/feat/moderation-per-model-toggle feat(risk-control): content moderation can be applied per model	2026-05-22 20:10:38 +08:00
Wesley Liddick	301032dc72	Merge pull request #2672 from wucm667/feat/email-whitelist-wildcard-suffix feat(registration): email whitelist supports wildcard suffix matching (*.edu.cn)	2026-05-22 17:33:29 +08:00
Wesley Liddick	a5efb84fa0	Merge pull request #2656 from wucm667/fix/apicompat-developer-role-to-system fix(apicompat): map developer role to system when converting Responses to Chat Completions	2026-05-22 17:32:47 +08:00
Wesley Liddick	9f91a8af17	Merge pull request #2662 from touwaeriol/feat/bedrock-cc-compat feat(bedrock): add Claude Code compatibility for AWS Bedrock	2026-05-22 17:32:11 +08:00
Wesley Liddick	a33a294970	Merge pull request #2658 from wucm667/feat/account-test-chat-completions-path feat(account): connection test supports the OpenAI-compatible Chat Completions path	2026-05-22 17:31:14 +08:00
Wesley Liddick	ac72b01d89	Merge pull request #2681 from wucm667/fix/moderation-dedupe-agent-loop fix(risk-control): deduplicate repeated moderation of the same user message in agent tool loops	2026-05-22 17:29:22 +08:00
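
Commit 1e6d0b602a describes branching on event.type so that message_start reads usage from event.message.usage while other events keep reading the top-level event.usage. The Go sketch below illustrates that idea only. The names `usage` and `mergeUsage`, the struct layout, and the "overwrite when non-zero" merge rule are assumptions made for illustration; they are not the project's extractSSEUsage.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage holds token counters for one streamed request.
// Hypothetical type: the project's real struct may differ.
type usage struct {
	InputTokens          int `json:"input_tokens"`
	OutputTokens         int `json:"output_tokens"`
	CacheReadInputTokens int `json:"cache_read_input_tokens"`
}

// mergeUsage decodes one SSE data payload and folds its usage into total.
// message_start nests usage under "message"; other events (for example
// message_delta) carry it at the top level.
func mergeUsage(total *usage, data []byte) error {
	var event struct {
		Type    string `json:"type"`
		Usage   *usage `json:"usage"`
		Message struct {
			Usage *usage `json:"usage"`
		} `json:"message"`
	}
	if err := json.Unmarshal(data, &event); err != nil {
		return err
	}

	src := event.Usage
	if event.Type == "message_start" {
		src = event.Message.Usage
	}
	if src == nil {
		return nil // event carries no usage (for example a ping)
	}

	// Assumed merge rule: a later event only overwrites a counter
	// when it reports a non-zero value for it.
	if src.InputTokens > 0 {
		total.InputTokens = src.InputTokens
	}
	if src.OutputTokens > 0 {
		total.OutputTokens = src.OutputTokens
	}
	if src.CacheReadInputTokens > 0 {
		total.CacheReadInputTokens = src.CacheReadInputTokens
	}
	return nil
}

func main() {
	events := []string{
		`{"type":"message_start","message":{"usage":{"input_tokens":25,"cache_read_input_tokens":100,"output_tokens":1}}}`,
		`{"type":"message_delta","usage":{"output_tokens":42}}`,
	}
	var total usage
	for _, e := range events {
		if err := mergeUsage(&total, []byte(e)); err != nil {
			fmt.Println("skipping malformed event:", err)
		}
	}
	fmt.Printf("%+v\n", total)
}
```

With an extractor that only looked at the top-level usage field, the first event above would contribute nothing, which matches the input_tokens=0 rows described in the commit message.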


3616 Commits
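
Commit b9509e823a states that cache reads should scale with LongContextInputMultiplier once long-context pricing is triggered. A rough Go sketch of that rule follows. The `pricing` struct, the `cacheReadCost` helper, and the choice to compare input plus cache_read tokens against the threshold are assumptions for illustration, not the project's computeTokenBreakdown. The base cache_read price is picked so that 300k tokens cost 0.075, the figure quoted in the commit message.

```go
package main

import "fmt"

// pricing is a hypothetical, trimmed-down price table for one model.
type pricing struct {
	CacheReadPricePerToken     float64
	LongContextThreshold       int     // tokens; 272k per the commit message
	LongContextInputMultiplier float64 // 2.0x in the regression test
}

// cacheReadCost prices cache_read tokens, applying the long-context input
// multiplier when the request crosses the threshold.
// Assumption: the threshold is compared against input + cache_read tokens.
func cacheReadCost(p pricing, inputTokens, cacheReadTokens int) float64 {
	price := p.CacheReadPricePerToken
	if inputTokens+cacheReadTokens > p.LongContextThreshold {
		// Cache reads are input-side replays, so they scale like input.
		price *= p.LongContextInputMultiplier
	}
	return float64(cacheReadTokens) * price
}

func main() {
	p := pricing{
		CacheReadPricePerToken:     0.075 / 300_000,
		LongContextThreshold:       272_000,
		LongContextInputMultiplier: 2.0,
	}
	fmt.Printf("above threshold: %.3f\n", cacheReadCost(p, 0, 300_000))
	fmt.Printf("below threshold: %.3f\n", cacheReadCost(p, 0, 100_000))
}
```

The two calls in main mirror the positive and negative regression cases listed in the commit: scaled above the threshold, base price below it.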