Wesley Liddick
e6a3f1e12b
Merge pull request #2869 from Pluviobyte/fix/ws-first-token-terminal-event
...
fix(ws): exclude terminal events from first-token detection
2026-05-29 10:32:16 +08:00
Pluviobyte
8a999f438d
fix(ws): exclude terminal events from first-token detection
...
isOpenAIWSTokenEvent classified response.completed / response.done as
token events. When upstream finishes a request without ever emitting
a recognizable delta (e.g. cached completions or models that skip
incremental output), firstTokenMs was then filled at the terminal
event's timestamp, so the first-token latency metric effectively
reported total request duration.
Terminal events are already handled separately by
isOpenAIWSTerminalEvent. Treating them as token events makes the two
classifiers overlap, which violates the implicit invariant that the
token-event and terminal-event sets are disjoint.
The metric only affects ForwardResult.FirstTokenMs (logging and
observability) — billing and routing are unchanged.
Add regression tests for both directions:
* TestIsOpenAIWSTokenEvent_TerminalEventsExcluded covers each
classification branch.
* TestIsOpenAIWSTokenEvent_DisjointWithTerminal asserts the
disjoint-set invariant for every known terminal event.
Both new tests fail when the old `return eventType == "response.completed"
|| eventType == "response.done"` is restored.
Fixes #2651
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-29 01:33:42 +00:00
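The disjoint-set invariant from commit 8a999f438d can be sketched as follows. This is a minimal illustration, not the repository's code: `isTerminalEvent` and `isTokenEvent` are hypothetical stand-ins for `isOpenAIWSTerminalEvent` and `isOpenAIWSTokenEvent`, the terminal set contains only the two events named in the commit message, and treating any event type ending in `.delta` as a token event is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// wsEvent is a hypothetical record of an upstream event and its arrival
// time in milliseconds since the request started.
type wsEvent struct {
	Type string
	AtMs int
}

// isTerminalEvent stands in for isOpenAIWSTerminalEvent. Only the two
// terminal events named in the commit message are listed here.
func isTerminalEvent(eventType string) bool {
	switch eventType {
	case "response.completed", "response.done":
		return true
	}
	return false
}

// isTokenEvent stands in for isOpenAIWSTokenEvent. Terminal events are
// rejected first, so the two classifiers can never overlap.
// Assumption: incremental output events end in ".delta".
func isTokenEvent(eventType string) bool {
	if isTerminalEvent(eventType) {
		return false
	}
	return strings.HasSuffix(eventType, ".delta")
}

// firstTokenMs returns the timestamp of the first token event, or -1 when
// the stream finished without emitting one.
func firstTokenMs(events []wsEvent) int {
	for _, e := range events {
		if isTokenEvent(e.Type) {
			return e.AtMs
		}
	}
	return -1
}

func main() {
	cached := []wsEvent{{"response.created", 5}, {"response.completed", 900}}
	streamed := []wsEvent{
		{"response.created", 5},
		{"response.output_text.delta", 120},
		{"response.completed", 900},
	}
	fmt.Println(firstTokenMs(cached), firstTokenMs(streamed))
}
```

With the terminal check in place, a stream that never emits a delta leaves the first-token value unset instead of recording the total request duration.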
shaw
ed1b57c597
fix(openai): gate routing by endpoint capability
2026-05-29 08:58:10 +08:00
Wesley Liddick
2387cf9934
Merge pull request #2799 from siyuan-123/fix/ws-rate-limit-failover
...
Fix missing automatic account failover when OpenAI WS hits rate limits
2026-05-27 15:14:28 +08:00
siyuan
08061717b8
fix: enable account failover for OpenAI WS rate limits
2026-05-26 20:07:00 +08:00
benjamin
9c56fe0b0b
fix(openai): mark fast-policy entrypoints business-limited
...
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-26 17:21:45 +08:00
siyuan
fc66cd704a
fix: recognize codex tool outputs in ws continuation
2026-05-25 10:46:58 +08:00
shaw
1e406fed52
fix: optimize OpenAI account cooldown scheduling
2026-05-23 10:18:43 +08:00
Wesley Liddick
a340002c6d
Merge pull request #2401 from 2ue/fix/normalize-image-billing-size
...
Fix image billing size normalization and usage record display
2026-05-19 14:00:24 +08:00
name
0393bd7c82
Fix OpenAI compat usage parsing
2026-05-16 03:03:43 +08:00
2ue
bb4c1abe28
Fix image billing size normalization
2026-05-12 15:21:31 +08:00
anzhen-tech
16a315574d
fix(openai): preserve replay tool output continuation
2026-05-07 14:59:42 +08:00
shaw
fff4a300c6
feat(risk-control): add content moderation audit
2026-05-07 09:14:47 +08:00
Wesley Liddick
94e494319a
Merge pull request #2197 from learnerLj/fix/ws-preflight-ping-fc-output-recovery
...
fix: skip requests carrying function_call_output during preflight ping recovery
2026-05-05 17:21:21 +08:00
Jiahao Luo
e71b55ec69
fix: skip previous_response_id recovery when payload has function_call_output
...
When a preflight ping fails or previous_response_not_found is returned,
sub2api drops previous_response_id and retries. But if the payload
contains function_call_output (tool results), the upstream API loses
the response chain context needed to match tool_result to tool_use,
causing 400: "No tool call found for function call output".
Add hasFunctionCallOutput checks to both recovery paths:
- Preflight ping failure recovery (forcePreferredConn path)
- recoverIngressPrevResponseNotFound function
2026-05-05 15:13:46 +08:00
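The guard described in commit e71b55ec69 might look roughly like the sketch below. It assumes the payload follows the Responses API shape, where `input` is an array of items each carrying a `type` field; the body of `hasFunctionCallOutput` and the helper `canDropPreviousResponseID` are hypothetical and only illustrate the check.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hasFunctionCallOutput reports whether any item in the payload's "input"
// array has type "function_call_output". Hypothetical implementation; the
// repository's function may inspect the payload differently.
func hasFunctionCallOutput(payload []byte) bool {
	var req struct {
		Input json.RawMessage `json:"input"`
	}
	if err := json.Unmarshal(payload, &req); err != nil {
		return false
	}
	var items []struct {
		Type string `json:"type"`
	}
	// "input" may be missing or a plain string; neither can hold tool outputs.
	if err := json.Unmarshal(req.Input, &items); err != nil {
		return false
	}
	for _, it := range items {
		if it.Type == "function_call_output" {
			return true
		}
	}
	return false
}

// canDropPreviousResponseID is a hypothetical helper: recovery may drop
// previous_response_id and retry only when no tool output depends on the
// response chain.
func canDropPreviousResponseID(payload []byte) bool {
	return !hasFunctionCallOutput(payload)
}

func main() {
	withTool := []byte(`{"previous_response_id":"resp_1","input":[{"type":"function_call_output","call_id":"c1","output":"42"}]}`)
	plain := []byte(`{"previous_response_id":"resp_1","input":"hello"}`)
	fmt.Println(canDropPreviousResponseID(withTool), canDropPreviousResponseID(plain))
}
```

Both recovery paths named in the commit would consult the same check before retrying without `previous_response_id`.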
2ue
6faa344916
feat: add OpenAI image generation controls
2026-05-05 03:26:54 +08:00
deqiying
23555be380
fix(openai): fix missing reasoning effort and User-Agent in WS passthrough usage records
...
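- Fill in per-turn reasoning_effort metadata for OpenAI Responses WebSocket v2 passthrough
- Pass the first frame's pre-channel-mapping model, preserving the ability to derive reasoning effort from the model suffix
- Add an end-to-end usage log regression covering inbound User-Agent, explicit effort, and channel mapping scenarios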
2026-05-03 19:33:09 +08:00
shaw
094e1171ef
fix(openai): infer previous response for item references
2026-04-30 12:02:08 +08:00
KnowSky404
28dc34b6a3
fix(openai): avoid inferred WS continuation on explicit tool replay
2026-04-29 17:38:08 +08:00
DaydreamCoding
30f55a1f72
feat(openai): full OpenAI Fast/Flex Policy implementation (HTTP + WebSocket + Admin)
...
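Mirroring the fast-mode filtering of Claude BetaPolicy, add a three-state
pass / filter / block policy for the OpenAI upstream service_tier field
(priority / flex, including normalization of the client alias "fast" →
"priority"), covering every OpenAI entrypoint plus the admin configuration
entrypoint.
Backend core
- Add the SettingKeyOpenAIFastPolicySettings, OpenAIFastPolicyRule, and
OpenAIFastPolicySettings configuration models; each rule spans the
dimensions service_tier × action × scope × model whitelist × fallback action.
- SettingService.Get/SetOpenAIFastPolicySettings; when the setting is
missing, return the built-in default policy (priority is filtered for all
models, whitelist is empty, fallback=pass). Design rationale:
service_tier=fast is a user-level switch orthogonal to the model field, so
locking the default to specific model slugs would leave a bypass path of
"use gpt-4 + fast to pass priority through to upstream". JSON parse failures
no longer fall back silently; slog.Warn records the dirty data so operators
can locate it.
- service_tier normalization (trim + ToLower + fast→priority + whitelist of
priority/flex) and policy evaluation (evaluateOpenAIFastPolicy) are the
single source of truth, shared by HTTP and WS. The pure function
evaluateOpenAIFastPolicyWithSettings is extracted and paired with a
ctx-bound settings snapshot (withOpenAIFastPolicyContext /
openAIFastPolicySettingsFromContext): a long-lived WS session prefetches the
settings once at entry and every frame reuses them, so frames do not hit
settingService individually.
HTTP entrypoints (4)
- Chat Completions, Anthropic-compatible (Messages, including the secondary
hit BetaFastMode→priority), native Responses, and Passthrough Responses all
call applyOpenAIFastPolicyToBody; filter deletes the top-level service_tier
via sjson, and block returns a 403 forbidden_error JSON.
- All 4 entrypoints use the upstream view of the model (the slug after
GetMappedModel + normalizeOpenAIModelForUpstream + Codex OAuth normalize),
avoiding whitelist hit differences between chat/messages/native
/responses/passthrough caused by differing model dimensions.
- On the pass path, also normalize the client alias "fast" to "priority" and
write it back to the body; otherwise the native /responses and passthrough
entrypoints would forward "fast" verbatim to upstream and cause a
400/rejection (normalizeResponsesBodyServiceTier in the chat-completions
entrypoint already behaved this way).
WebSocket entrypoints
- Add applyOpenAIFastPolicyToWSResponseCreate: strictly match
type="response.create" and handle only the top-level service_tier; filter
deletes the field with sjson, and block returns a typed
*OpenAIFastBlockedError.
- The ingress path calls it inside parseClientPayload; on a block hit it
first writes a Realtime-style error event and then returns
OpenAIWSClientCloseError(StatusPolicyViolation=1008), relying on the
synchronous flush of the underlying WebSocket Conn.Write to guarantee that
the error precedes the close.
- The passthrough path applies the policy to firstClientMessage before
RunEntry, and wraps ReadFrame with openAIWSPolicyEnforcingFrameConn to apply
the policy to every client→upstream frame; later frames without a model
field fall back to capturedSessionModel. The filter closure also watches the
session.model field of session.update / session.created frames to refresh
capturedSessionModel, closing the mid-session bypass "first frame
model=gpt-4o (pass) → session.update changes to gpt-5.5 → response.create
without model falls back to gpt-4o".
- passthrough billing: requestServiceTier is extracted from
firstClientMessage after the policy filter; when filter hits,
OpenAIForwardResult.ServiceTier reports nil (default tier), consistent with
the semantics of the HTTP entrypoints (reqBody comes from the post-filter
map) and WS ingress (payload comes from the post-filter bytes).
- Error event schema: {event_id: "evt_<32hex>", type: "error",
error: {type: "forbidden_error", code: "policy_violation", message}},
compatible with the error event parsing of the OpenAI codex client.
Admin / Frontend
- dto.SystemSettings / UpdateSettingsRequest gain the
openai_fast_policy_settings field (omitempty), wired into bulk GET/PUT.
- The Gateway tab of the Settings page gains a Fast/Flex Policy form card:
full-field configuration of service_tier × action × scope × model whitelist
× fallback action.
- Frontend guard: the openaiFastPolicyLoaded flag allows write-back only when
GET actually returned the field, preventing a rollout/error from overwriting
the default rules with an empty value; the saveSettings write-back loop
skips the field, which is handled by dedicated refresh logic; error_message
is sent only when action=block, matching the backend omitempty behavior.
Tests
- HTTP path: openai_fast_policy_test.go covers the default config
(whitelist=[], priority filtered for all models) / block with custom error /
scope distinction / filter deletes the field / block leaves the body
unchanged / block short-circuits upstream / Anthropic BetaFastMode
triggering the OpenAI fast policy, and similar scenarios.
- WebSocket path: openai_fast_policy_ws_test.go covers
helper units (filter / fast→priority normalization / flex passthrough /
block typed error / bytes unchanged without service_tier /
non-response.create frames untouched / empty-type frames untouched /
event_id+code field assertions / tolerance of non-string service_tier) +
pass-path fast alias normalization regression +
ingress end-to-end (upstream contains no service_tier after filter / after
block the client receives the error event first, then close 1008, with 0
upstream writes) +
passthrough capturedSessionModel fallback cases (first frame establishes it
under a whitelist policy, a missing model hits the fallback, and the leak
when the fallback is absent is documented) +
mid-session bypass regression where session.update / session.created
rotates capturedSessionModel +
passthrough billing post-filter ServiceTier and idempotent filter regression.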
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:15:09 +08:00
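The normalization and three-state evaluation described in commit 30f55a1f72 can be sketched as below. The normalization steps (trim, lowercase, "fast" → "priority", allow-list of priority/flex) and the default policy come from the commit message. The `rule` type, the `evaluate` function, and the exact whitelist/fallback semantics are assumptions made for illustration; the real OpenAIFastPolicyRule also has a scope dimension that this sketch omits.

```go
package main

import (
	"fmt"
	"strings"
)

type action string

const (
	actionPass   action = "pass"
	actionFilter action = "filter"
	actionBlock  action = "block"
)

// rule is a simplified, hypothetical stand-in for OpenAIFastPolicyRule.
type rule struct {
	ServiceTier    string
	Action         action
	ModelWhitelist []string // assumption: empty means the rule applies to all models
	FallbackAction action   // assumption: used when the whitelist is non-empty and the model is not in it
}

// normalizeServiceTier trims, lowercases, maps the client alias "fast" to
// "priority", and accepts only "priority" and "flex". Anything else yields "".
func normalizeServiceTier(raw string) string {
	t := strings.ToLower(strings.TrimSpace(raw))
	if t == "fast" {
		t = "priority"
	}
	switch t {
	case "priority", "flex":
		return t
	}
	return ""
}

// evaluate returns the action for a request's service tier and upstream
// model. Tiers with no matching rule pass through unchanged.
func evaluate(rules []rule, rawTier, model string) action {
	tier := normalizeServiceTier(rawTier)
	if tier == "" {
		return actionPass
	}
	for _, r := range rules {
		if r.ServiceTier != tier {
			continue
		}
		if len(r.ModelWhitelist) == 0 {
			return r.Action
		}
		for _, m := range r.ModelWhitelist {
			if m == model {
				return r.Action
			}
		}
		return r.FallbackAction
	}
	return actionPass
}

func main() {
	// Built-in default from the commit message: priority is filtered for
	// all models, whitelist empty, fallback=pass.
	defaults := []rule{{ServiceTier: "priority", Action: actionFilter, FallbackAction: actionPass}}
	fmt.Println(evaluate(defaults, " FAST ", "gpt-4"), evaluate(defaults, "flex", "gpt-4"))
}
```

Keeping evaluation a pure function of the settings snapshot is what lets a WebSocket session fetch the settings once and reuse them for every frame.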
shaw
095f457c57
feat(openai): port /responses/compact account support flow (PR #1555)
...
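Manually port the OpenAI compact capability modeling from
vansour/sub2api#1555 to the current main: account-level compact status with
auto/force_on/force_off modes, compact-only model mapping, scheduler tier
layering (supported > unknown > known unsupported), proactive compact probing
in the admin console, and the corresponding i18n strings and status badges.
Ordinary /responses traffic is unchanged, and there is no database migration.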
2026-04-25 14:52:58 +08:00
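The scheduler tier layering from commit 095f457c57 (supported > unknown > known unsupported) can be illustrated with a stable sort. The `account` type, the `compactSupport` enum, and `orderForCompact` are hypothetical names; the real scheduler presumably combines this tier with other selection criteria.

```go
package main

import (
	"fmt"
	"sort"
)

// compactSupport is a hypothetical enum; lower values are preferred.
type compactSupport int

const (
	compactSupported   compactSupport = iota // probed or forced on
	compactUnknown                           // not yet probed
	compactUnsupported                       // known not to support compact
)

// account is a hypothetical, minimal view of a schedulable account.
type account struct {
	Name    string
	Compact compactSupport
}

// orderForCompact returns a copy of accounts ordered by compact tier.
// The sort is stable, so the existing order within a tier is preserved.
func orderForCompact(accounts []account) []account {
	out := append([]account(nil), accounts...)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Compact < out[j].Compact
	})
	return out
}

func main() {
	accounts := []account{
		{"a", compactUnsupported},
		{"b", compactUnknown},
		{"c", compactSupported},
		{"d", compactUnknown},
	}
	for _, acc := range orderForCompact(accounts) {
		fmt.Println(acc.Name)
	}
}
```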
Alex
3a07e92b60
fix(openai): do not normalize /completion API token based accounts
2026-04-07 11:40:41 +03:00
erio
e27b0adbc8
refactor: remove resolveOpenAIUpstreamModel, use normalizeCodexModel directly
...
Eliminates unnecessary indirection layer. The wrapper function only
called normalizeCodexModel with a special case for "gpt 5.3 codex spark"
(space-separated variant) that is no longer needed.
All call sites now use normalizeCodexModel directly.
2026-04-04 14:07:19 +08:00
InCerryGit
fa68cbad1b
Merge branch 'Wei-Shaw:main' into main
2026-03-24 19:21:30 +08:00
InCerry
995ef1348a
refactor: improve model resolution and normalization logic for OpenAI integration
2026-03-24 19:20:15 +08:00
Wang Lvyuan
fef9259aaa
fix(openai): recheck runtime state from db before final account selection
2026-03-23 03:50:03 +08:00
Ethan0x0000
2c667a159c
fix(provider): retain upstream model for gemini compat and ws
...
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-21 01:24:59 +08:00
QTom
3741617ebd
fix(gateway): conditionally MarkBroken WS pool connections to prevent cross-request stream mixing
...
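After a normal terminal event (response.completed, etc.), the connection is
returned to the pool for reuse; only abnormal paths (read/write errors, error
events, client disconnects) call MarkBroken to destroy it.
Generate mode:
- Introduce a cleanExit flag, set to true only on the isTerminalEvent break
- The defer decides whether to call MarkBroken based on cleanExit
- All abnormal paths already call MarkBroken early in their own branches
Ingress mode:
- Introduce a lastTurnClean flag, set to true when sendAndRelay completes normally
- releaseSessionLease decides whether to call MarkBroken based on lastTurnClean
- Error paths reset lastTurnClean = false
- The drain after a client disconnect still conservatively calls MarkBroken (L2916)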
2026-03-16 10:50:02 +08:00
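The Generate-mode pattern from commit 3741617ebd, a cleanExit flag checked in a defer, can be sketched as follows. The `conn` type and `relay` function are hypothetical and replay a fixed list of events instead of reading from a socket. For brevity the defer here covers every abnormal exit, whereas the commit notes that the real abnormal branches call MarkBroken themselves.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical pooled connection that replays canned events.
type conn struct {
	events []string
	broken bool
}

// MarkBroken flags the connection so the pool destroys it instead of
// handing it to another request.
func (c *conn) MarkBroken() { c.broken = true }

// relay consumes events until a terminal event arrives. Only that path sets
// cleanExit; every other exit leaves it false and the deferred function
// marks the connection broken.
func relay(c *conn) error {
	cleanExit := false
	defer func() {
		if !cleanExit {
			c.MarkBroken()
		}
	}()
	for _, ev := range c.events {
		switch ev {
		case "error":
			return errors.New("upstream error event")
		case "response.completed":
			cleanExit = true
			return nil
		}
	}
	return errors.New("stream ended without a terminal event")
}

func main() {
	ok := &conn{events: []string{"response.output_text.delta", "response.completed"}}
	bad := &conn{events: []string{"response.output_text.delta", "error"}}
	fmt.Println(relay(ok), ok.broken)
	fmt.Println(relay(bad), bad.broken)
}
```

A connection that stops mid-stream may still have unread frames from the previous request, which is why only a clean terminal exit makes it safe to reuse.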
QTom
ab4e8b2cf0
fix(gateway): prevent OpenAI Codex cross-user stream mixing
...
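Root cause: when multiple users share one OAuth account, the
conversation_id/session_id headers were not isolated per user, so upstream
chatgpt.com associated requests from different users with the same session.
HTTP SSE fix:
- Add isolateOpenAISessionID(apiKeyID, raw), which mixes the API Key ID into
the session identifier (xxhash) so that users with different keys produce
different upstream sessions
- buildUpstreamRequest: the OAuth branch first Dels the session headers
passed through from the client, then overwrites them with the isolated value
- buildUpstreamRequestOpenAIPassthrough: the passthrough path is isolated the same way
- ForwardAsAnthropic: the Anthropic Messages compatibility path gets the same fix
- buildOpenAIWSHeaders: OAuth session headers on the WS path are isolated as well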
2026-03-16 10:28:51 +08:00
InCerry
2666422b99
fix: handle invalid encrypted content error and retry logic.
2026-03-14 11:42:42 +08:00
Wesley Liddick
391e79f8ee
Merge pull request #875 from mt21625457/fix/openai-fast-billing-clean
...
fix(billing): fix OpenAI fast tier billing and complete its display
2026-03-09 10:32:18 +08:00
yangjianbo
87f4ed591e
fix(billing): fix OpenAI fast tier billing and complete its display
...
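- Propagate service_tier through OpenAI HTTP, WS, passthrough, and usage records
- Correct the priority/flex billing logic and normalize fast to priority
- Add service tier and billing detail display on the user and admin sides
- Add frontend and backend tests, and fix the full-regression failure caused by duplicate persistence of WS rate-limit signals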
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 09:51:26 +08:00
神乐
45d57018eb
fix: sync OpenAI WS rate-limit state with scheduling
2026-03-07 23:59:39 +08:00
神乐
101ef0cf62
fix: automatically remove rate-limited accounts from scheduling and improve prompt wording
2026-03-07 21:05:37 +08:00
admin
da89583ccc
fix(openai): detect official codex client by headers
2026-03-07 14:12:38 +08:00
神乐
838ada8864
fix(openai): restore ws usage window display
2026-03-06 20:49:47 +08:00
erio
02dea7b09b
refactor: unify post-usage billing logic and fix account quota calculation
...
- Extract postUsageBilling() to consolidate billing logic across
GatewayService.RecordUsage, RecordUsageWithLongContext, and
OpenAIGatewayService.RecordUsage, eliminating ~120 lines of
duplicated code
- Fix account quota to use TotalCost × accountRateMultiplier
(was using raw TotalCost, inconsistent with account cost stats)
- Fix RecordUsageWithLongContext API Key quota only updating in
balance mode (now updates regardless of billing type)
- Fix WebSocket client disconnect detection on Windows by adding
"an established connection was aborted" to known disconnect errors
2026-03-06 00:54:17 +08:00
yangjianbo
1d0872e7ca
feat(openai-ws): merge WS v2 passthrough mode and frontend ws mode
...
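Add the OpenAI WebSocket v2 passthrough relay data plane and service adapter
layer, supporting routing between ctx_pool and passthrough according to each
account's ws mode.
Update the frontend OpenAI ws mode options to off/ctx_pool/passthrough, and
add i18n strings and the corresponding unit tests.
Add the Caddyfile.dmit and docker-compose-aicodex.yml deployment configs for
reverse proxying and service orchestration in host-machine scenarios.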
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 11:50:58 +08:00
yangjianbo
bb664d9bbf
feat(sync): full code sync from release
2026-02-28 15:01:20 +08:00