sub2api

Author	SHA1	Message	Date
Wesley Liddick	e6a3f1e12b	Merge pull request #2869 from Pluviobyte/fix/ws-first-token-terminal-event fix(ws): exclude terminal events from first-token detection	2026-05-29 10:32:16 +08:00
Pluviobyte	8a999f438d	fix(ws): exclude terminal events from first-token detection isOpenAIWSTokenEvent classified response.completed / response.done as token events. When upstream finishes a request without ever emitting a recognizable delta (e.g. cached completions or models that skip incremental output), firstTokenMs was then filled at the terminal event's timestamp, so the first-token latency metric effectively reported total request duration. Terminal events are already handled separately by isOpenAIWSTerminalEvent. Treating them as token events makes the two classifiers overlap, which violates the implicit invariant that the token-event and terminal-event sets are disjoint. The metric only affects ForwardResult.FirstTokenMs (logging and observability); billing and routing are unchanged. Add regression tests for both directions: * TestIsOpenAIWSTokenEvent_TerminalEventsExcluded covers each classification branch. * TestIsOpenAIWSTokenEvent_DisjointWithTerminal asserts the disjoint-set invariant for every known terminal event. Both new tests fail when the old `return eventType == "response.completed" || eventType == "response.done"` is restored. Fixes #2651 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-29 01:33:42 +00:00
shaw	ed1b57c597	fix(openai): gate routing by endpoint capability	2026-05-29 08:58:10 +08:00
Wesley Liddick	2387cf9934	Merge pull request #2799 from siyuan-123/fix/ws-rate-limit-failover Fix OpenAI WS not automatically switching accounts when rate-limited	2026-05-27 15:14:28 +08:00
siyuan	08061717b8	fix: enable account failover for OpenAI WS rate limits	2026-05-26 20:07:00 +08:00
benjamin	9c56fe0b0b	fix(openai): mark fast-policy entrypoints business-limited Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-26 17:21:45 +08:00
siyuan	fc66cd704a	fix: recognize codex tool outputs in ws continuation	2026-05-25 10:46:58 +08:00
shaw	1e406fed52	fix: optimize OpenAI account cooldown scheduling	2026-05-23 10:18:43 +08:00
Wesley Liddick	a340002c6d	Merge pull request #2401 from 2ue/fix/normalize-image-billing-size Fix image billing size normalization and usage record display	2026-05-19 14:00:24 +08:00
name	0393bd7c82	Fix OpenAI compat usage parsing	2026-05-16 03:03:43 +08:00
2ue	bb4c1abe28	Fix image billing size normalization	2026-05-12 15:21:31 +08:00
anzhen-tech	16a315574d	fix(openai): preserve replay tool output continuation	2026-05-07 14:59:42 +08:00
shaw	fff4a300c6	feat(risk-control): add content moderation audit	2026-05-07 09:14:47 +08:00
Wesley Liddick	94e494319a	Merge pull request #2197 from learnerLj/fix/ws-preflight-ping-fc-output-recovery fix: skip requests carrying function_call_output during preflight ping recovery	2026-05-05 17:21:21 +08:00
Jiahao Luo	e71b55ec69	fix: skip previous_response_id recovery when payload has function_call_output When a preflight ping fails or previous_response_not_found is returned, sub2api drops previous_response_id and retries. But if the payload contains function_call_output (tool results), the upstream API loses the response chain context needed to match tool_result to tool_use, causing 400: "No tool call found for function call output". Add hasFunctionCallOutput checks to both recovery paths: - Preflight ping failure recovery (forcePreferredConn path) - recoverIngressPrevResponseNotFound function	2026-05-05 15:13:46 +08:00
2ue	6faa344916	feat: add OpenAI image generation controls	2026-05-05 03:26:54 +08:00
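deqiying	23555be380	fix(openai): fix missing reasoning effort and User-Agent in WS passthrough usage records - Fill in per-turn reasoning_effort metadata for OpenAI Responses WebSocket v2 passthrough - Pass the first frame's pre-channel-mapping model so that reasoning effort can still be derived from the model suffix - Add an end-to-end usage log regression covering inbound User-Agent, explicit effort, and channel mapping scenarios	2026-05-03 19:33:09 +08:00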
shaw	094e1171ef	fix(openai): infer previous response for item references	2026-04-30 12:02:08 +08:00
KnowSky404	28dc34b6a3	fix(openai): avoid inferred WS continuation on explicit tool replay	2026-04-29 17:38:08 +08:00
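DaydreamCoding	30f55a1f72	feat(openai): full implementation of the OpenAI Fast/Flex Policy (HTTP + WebSocket + Admin) Mirroring the fast-mode filtering of Claude BetaPolicy, add a three-state pass / filter / block policy for the OpenAI upstream service_tier field (priority / flex, including normalization of the client alias "fast" → "priority"), covering every OpenAI entrypoint plus the admin configuration entrypoint. Backend core - Add the SettingKeyOpenAIFastPolicySettings, OpenAIFastPolicyRule, and OpenAIFastPolicySettings configuration models, with rule dimensions service_tier × action × scope × model whitelist × fallback action. - SettingService.Get/SetOpenAIFastPolicySettings; when the setting is missing, return the built-in default policy (priority is filtered for all models, the whitelist is empty, fallback=pass). Design rationale: service_tier=fast is a user-level switch orthogonal to the model field, so locking the default to specific model slugs would leave a bypass path of "use gpt-4 + fast to pass priority through to the upstream". JSON parse failures no longer fall back silently; slog.Warn records the dirty data so operators can locate it. - service_tier normalization (trim + ToLower + fast→priority + whitelist of priority/flex) and policy evaluation (evaluateOpenAIFastPolicy) serve as the single source of truth, shared by HTTP and WS. The pure function evaluateOpenAIFastPolicyWithSettings is extracted and paired with a ctx-bound settings snapshot (withOpenAIFastPolicyContext / openAIFastPolicySettingsFromContext), so a long-lived WS session prefetches the settings once at entry and reuses them for every frame, avoiding a settingService hit per frame. HTTP entrypoints (4) - Chat Completions, Anthropic compatibility (Messages, including the BetaFastMode→priority second-pass match), native Responses, and Passthrough Responses all call applyOpenAIFastPolicyToBody; filter deletes the top-level service_tier via sjson, and block returns a 403 forbidden_error JSON. - All 4 entrypoints use the upstream view of the model (the slug after GetMappedModel + normalizeOpenAIModelForUpstream + Codex OAuth normalize), so that chat/messages/native /responses/passthrough do not differ in whitelist matching because of differing model dimensions. - On the pass path the client alias "fast" is also normalized to "priority" and written back to the body; otherwise the native /responses and passthrough entrypoints would pass "fast" through verbatim and the upstream would answer 400 or reject it (normalizeResponsesBodyServiceTier on the chat-completions entrypoint already behaved this way). WebSocket entrypoints - Add applyOpenAIFastPolicyToWSResponseCreate: it strictly matches type="response.create" and handles only the top-level service_tier; filter deletes the field with sjson, and block returns a typed *OpenAIFastBlockedError. - The ingress path calls it inside parseClientPayload; on a block hit it first writes a Realtime-style error event and then returns OpenAIWSClientCloseError(StatusPolicyViolation=1008), relying on the synchronous flush of the underlying WebSocket Conn.Write to guarantee that the error precedes the close. - The passthrough path applies the policy to firstClientMessage before RunEntry, and wraps ReadFrame with openAIWSPolicyEnforcingFrameConn to enforce the policy on every client→upstream frame; later frames without a model field fall back to capturedSessionModel. The filter closure also watches the session.model field of session.update / session.created frames to refresh capturedSessionModel, closing the mid-session bypass path of "first frame model=gpt-4o (pass) → session.update switches to gpt-5.5 → response.create without model falls back to gpt-4o". - passthrough billing: requestServiceTier is extracted from firstClientMessage after the policy filter; on a filter hit OpenAIForwardResult.ServiceTier reports nil (default tier), consistent with the semantics of the HTTP entrypoints (reqBody comes from the post-filter map) and WS ingress (payload comes from the post-filter bytes). - Error event schema: {event_id: "evt_<32hex>", type: "error", error: {type: "forbidden_error", code: "policy_violation", message}}, compatible with the error event parsing of the OpenAI codex client. Admin / Frontend - dto.SystemSettings / UpdateSettingsRequest gain an openai_fast_policy_settings field (omitempty), wired into bulk GET/PUT. - The Gateway tab of the Settings page gains a Fast/Flex Policy form card in which every field of service_tier × action × scope × model whitelist × fallback action is configurable. - Frontend guard: the openaiFastPolicyLoaded flag allows write-back only when the GET actually returned the field, so a rollout or an error cannot overwrite the default rules with an empty value; the saveSettings write-back loop skips this field, which is handled by dedicated refresh logic; error_message is sent only when action=block, matching the backend omitempty behavior. Tests - HTTP path: openai_fast_policy_test.go covers the default configuration (whitelist=[], priority filtered for all models) / block with a custom error / scope distinction / filter deleting the field / block leaving the body unchanged / block short-circuiting the upstream / Anthropic BetaFastMode triggering the OpenAI fast policy, among other scenarios. - WebSocket path: openai_fast_policy_ws_test.go covers helper units (filter / fast→priority normalization / flex pass-through / block typed error / bytes unchanged when service_tier is absent / non-response.create frames untouched / frames with an empty type untouched / event_id+code field assertions / tolerance of a non-string service_tier) + a regression for fast alias normalization on the pass path + ingress end to end (upstream sees no service_tier after filter / after block the client receives the error event before close 1008 and the upstream receives 0 writes) + passthrough capturedSessionModel fallback cases (established by the first frame under a whitelist policy, fallback hit when model is missing, and the leak when no fallback exists documented) + a regression for the mid-session bypass in which passthrough session.update / session.created rotates capturedSessionModel + regressions for passthrough billing post-filter ServiceTier and the idempotent filter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 11:15:09 +08:00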
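shaw	095f457c57	feat(openai): port /responses/compact account support flow (PR #1555) Manually port the OpenAI compact capability modeling from vansour/sub2api#1555 onto the current main: account-level compact state with auto / force_on / force_off modes, compact-only model mapping, scheduler tiering (supported > unknown > known unsupported), active compact probing in the admin console, and the corresponding i18n and status badges. Ordinary /responses traffic behaves as before, and there is no database migration.	2026-04-25 14:52:58 +08:00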
Alex	3a07e92b60	fix(openai): do not normalize /completion API token based accounts	2026-04-07 11:40:41 +03:00
erio	e27b0adbc8	refactor: remove resolveOpenAIUpstreamModel, use normalizeCodexModel directly Eliminates unnecessary indirection layer. The wrapper function only called normalizeCodexModel with a special case for "gpt 5.3 codex spark" (space-separated variant) that is no longer needed. All call sites now use normalizeCodexModel directly.	2026-04-04 14:07:19 +08:00
InCerryGit	fa68cbad1b	Merge branch 'Wei-Shaw:main' into main	2026-03-24 19:21:30 +08:00
InCerry	995ef1348a	refactor: improve model resolution and normalization logic for OpenAI integration	2026-03-24 19:20:15 +08:00
Wang Lvyuan	fef9259aaa	fix(openai): recheck runtime state from db before final account selection	2026-03-23 03:50:03 +08:00
Ethan0x0000	2c667a159c	fix(provider): retain upstream model for gemini compat and ws Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-03-21 01:24:59 +08:00
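QTom	3741617ebd	fix(gateway): conditional MarkBroken in the WS connection pool to prevent cross-request stream mixing After exiting on a normal terminal event (response.completed, etc.) the connection is returned for reuse; only abnormal paths (read/write errors, error events, client disconnects) call MarkBroken to destroy it. Generate mode: - Introduce a cleanExit flag, set to true only on the isTerminalEvent break - The defer decides whether to call MarkBroken based on cleanExit - All abnormal paths already call MarkBroken early in their own branches Ingress mode: - Introduce a lastTurnClean flag, set to true when sendAndRelay completes normally - releaseSessionLease decides whether to call MarkBroken based on lastTurnClean - Error paths reset lastTurnClean = false - The drain after a client disconnect still conservatively calls MarkBroken (L2916)	2026-03-16 10:50:02 +08:00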
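QTom	ab4e8b2cf0	fix(gateway): prevent cross-user stream mixing in OpenAI Codex Root cause: when multiple users share one OAuth account, the conversation_id/session_id headers were not isolated per user, so the upstream chatgpt.com associated different users' requests with the same session. HTTP SSE fix: - Add isolateOpenAISessionID(apiKeyID, raw), which mixes the API Key ID into the session identifier (xxhash) so that users of different keys produce different upstream sessions - buildUpstreamRequest: the OAuth branch first Dels the session headers passed through from the client, then overwrites them with the isolated value - buildUpstreamRequestOpenAIPassthrough: the passthrough path is isolated the same way - ForwardAsAnthropic: the Anthropic Messages compatibility path gets the same fix - buildOpenAIWSHeaders: the OAuth session headers on the WS path are isolated the same way	2026-03-16 10:28:51 +08:00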
InCerry	2666422b99	fix: handle invalid encrypted content error and retry logic.	2026-03-14 11:42:42 +08:00
Wesley Liddick	391e79f8ee	Merge pull request #875 from mt21625457/fix/openai-fast-billing-clean fix(billing): fix OpenAI fast tier billing and complete its display	2026-03-09 10:32:18 +08:00
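yangjianbo	87f4ed591e	fix(billing): fix OpenAI fast tier billing and complete its display - Carry service_tier through OpenAI HTTP, WS, passthrough, and usage records - Correct the priority/flex billing logic and normalize fast to priority - Show the service tier and billing details in both the user and admin UIs - Complete the frontend and backend tests, and fix the full regression failure caused by duplicate persistence of WS rate-limit signals Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 09:51:26 +08:00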
神乐	45d57018eb	fix: sync OpenAI WS rate-limit state with scheduling	2026-03-07 23:59:39 +08:00
神乐	101ef0cf62	fix: automatically remove rate-limited accounts from scheduling and improve the notice text	2026-03-07 21:05:37 +08:00
admin	da89583ccc	fix(openai): detect official codex client by headers	2026-03-07 14:12:38 +08:00
神乐	838ada8864	fix(openai): restore ws usage window display	2026-03-06 20:49:47 +08:00
erio	02dea7b09b	refactor: unify post-usage billing logic and fix account quota calculation - Extract postUsageBilling() to consolidate billing logic across GatewayService.RecordUsage, RecordUsageWithLongContext, and OpenAIGatewayService.RecordUsage, eliminating ~120 lines of duplicated code - Fix account quota to use TotalCost × accountRateMultiplier (was using raw TotalCost, inconsistent with account cost stats) - Fix RecordUsageWithLongContext API Key quota only updating in balance mode (now updates regardless of billing type) - Fix WebSocket client disconnect detection on Windows by adding "an established connection was aborted" to known disconnect errors	2026-03-06 00:54:17 +08:00
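yangjianbo	1d0872e7ca	feat(openai-ws): merge WS v2 passthrough mode and frontend ws mode Add the OpenAI WebSocket v2 passthrough relay data plane and service adapter layer, supporting routing between ctx_pool and passthrough according to each account's ws mode. Adjust the frontend OpenAI ws mode options to off/ctx_pool/passthrough, and add i18n text and matching unit tests. Add the Caddyfile.dmit and docker-compose-aicodex.yml deployment configs for reverse proxying and service orchestration on host-machine deployments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 11:50:58 +08:00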
yangjianbo	bb664d9bbf	feat(sync): full code sync from release	2026-02-28 15:01:20 +08:00
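
Note on commit 8a999f438d: the disjoint-set invariant it describes can be sketched roughly as below. This is a minimal illustration, not the repository's code. The names isTokenEvent, isTerminalEvent, event, and firstTokenMs are hypothetical stand-ins, the terminal set contains only the two event types the commit message names, and recognizing deltas by a ".delta" suffix is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// terminalEvents holds only the two terminal types named in the commit
// message; the real list in the repository may be longer (assumption).
var terminalEvents = map[string]struct{}{
	"response.completed": {},
	"response.done":      {},
}

// isTerminalEvent reports whether eventType ends a response.
func isTerminalEvent(eventType string) bool {
	_, ok := terminalEvents[eventType]
	return ok
}

// isTokenEvent reports whether eventType carries incremental output.
// Terminal events are rejected first so the two sets stay disjoint.
// Assumption: delta events are recognized by a ".delta" suffix.
func isTokenEvent(eventType string) bool {
	if isTerminalEvent(eventType) {
		return false
	}
	return strings.HasSuffix(eventType, ".delta")
}

// event is a hypothetical (type, timestamp) pair for illustration.
type event struct {
	Type string
	AtMs int
}

// firstTokenMs returns the timestamp of the first token event,
// or -1 when the stream never produced one.
func firstTokenMs(events []event) int {
	for _, e := range events {
		if isTokenEvent(e.Type) {
			return e.AtMs
		}
	}
	return -1
}

func main() {
	// A stream that finishes without any delta must not report a
	// first-token time taken from the terminal event.
	noDelta := []event{{"response.created", 5}, {"response.completed", 900}}
	withDelta := []event{{"response.output_text.delta", 120}, {"response.completed", 900}}
	fmt.Println(firstTokenMs(noDelta), firstTokenMs(withDelta))
}
```

Checking the terminal set first keeps the invariant true by construction, whatever rule is later used to recognize deltas.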

39 Commits
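
Note on commit 30f55a1f72: the service_tier normalization listed there (trim, lower-case, map the "fast" alias to "priority", then allow only priority or flex) might look roughly like the sketch below. The names normalizeServiceTier, action, and defaultAction are hypothetical and are not the repository's identifiers. defaultAction models only the built-in default the message describes (priority filtered for all models, fallback pass), not the full rule set with scope and model whitelist.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeServiceTier applies the steps named in the commit message:
// trim, lower-case, map the client alias "fast" to "priority", then accept
// only "priority" or "flex". ok is false for any other value.
func normalizeServiceTier(raw string) (tier string, ok bool) {
	tier = strings.ToLower(strings.TrimSpace(raw))
	if tier == "fast" {
		tier = "priority"
	}
	switch tier {
	case "priority", "flex":
		return tier, true
	default:
		return "", false
	}
}

// action is a hypothetical three-state policy result.
type action string

const (
	actionPass   action = "pass"
	actionFilter action = "filter"
	actionBlock  action = "block" // unused by the default policy below
)

// defaultAction sketches the built-in default described in the commit:
// a normalized "priority" tier is filtered for every model, and anything
// else falls through to the fallback action, pass.
func defaultAction(rawTier string) action {
	if tier, ok := normalizeServiceTier(rawTier); ok && tier == "priority" {
		return actionFilter
	}
	return actionPass
}

func main() {
	for _, raw := range []string{" Fast ", "flex", "priority", "turbo", ""} {
		tier, ok := normalizeServiceTier(raw)
		fmt.Printf("%q -> tier=%q ok=%v action=%s\n", raw, tier, ok, defaultAction(raw))
	}
}
```

Normalizing before evaluating the policy means the alias "fast" and the canonical "priority" cannot be treated differently by a rule, which is the bypass the commit message is concerned with.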