Wesley Liddick
f68d351158
Merge pull request #2873 from wucm667/feat/account-quota-threshold-auto-pause
...
feat(account): support auto-pausing account scheduling by 5h/7d usage thresholds
2026-05-29 15:40:33 +08:00
wucm667
c9caadb378
fix(account): address second-round review on quota auto-pause
...
- TopK initial filter now drops quota-paused accounts: fold the quota check
into isAccountRequestCompatible so session-hash, TopK pool, and per-candidate
rechecks all skip paused accounts. Previously the candidate pool was built
without the quota check, so paused accounts could fill TopK and leave the
scheduler returning "no available accounts" even with healthy ones available.
- Add per-account explicit disable flags auto_pause_5h_disabled /
auto_pause_7d_disabled with toggles in EditAccountModal. Without these,
leaving the account threshold blank silently falls back to the global default,
so admins could not exempt a single account once a global default existed.
Disable is per-window: an account can opt out of 5h auto-pause while still
honoring 7d. Schedule snapshot whitelist includes the new fields, i18n EN/ZH
updated, threshold-hint text revised to explain "blank = global default".
- Move quota auto-pause settings off the request hot path: replace the per-repo
TTL+singleflight sync DB read with a per-SettingService stale-while-revalidate
in-memory snapshot. Get is non-blocking (atomic.Pointer load + async refresh
on staleness); writes via UpdateOpsAdvancedSettings push directly into the
cache through an injected sink; wire warms the cache at startup. Adds Warm
(sync) for tests/init and SetOpenAIQuotaAutoPauseSettings (sink target).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 14:32:45 +08:00
wucm667
8b7a822706
fix(account): address review on OpenAI quota auto-pause
...
- gate previous_response_id sticky path with quota auto-pause check at both
the snapshot and DB-recheck stages (previously bypassed, #1)
- skip pausing when the usage window already reset to avoid a stale stuck-pause;
carry codex_*_reset_at / reset_after_seconds / codex_usage_updated_at through
the scheduler snapshot whitelist (#2)
- remove the incomplete limit mode; percentage threshold only (#3)
- add global default 5h/7d threshold inputs to the Ops settings dialog with
validation and en/zh i18n (#4)
- downgrade account_auto_paused_by_quota log from Info to Debug; it fires
per-candidate on the scheduling hot path (#5)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 12:20:30 +08:00
wucm667
ead471d64b
feat(account): support auto-pausing account scheduling by 5h/7d usage thresholds
2026-05-29 10:47:47 +08:00
Wesley Liddick
16842c2f8b
Merge pull request #2836 from siyuan-123/fix/openai-ws-compat-usage
...
Fix OpenAI WS compatibility and usage statistics
2026-05-29 10:47:12 +08:00
Wesley Liddick
433f8dcd13
Merge pull request #2834 from DaydreamCoding/pr/openai-codex-cli-allow-claude-code
...
feat(openai): add a mechanism for codex_cli_only to allow the Claude Code Codex plugin
2026-05-29 10:30:33 +08:00
shaw
ed1b57c597
fix(openai): gate routing by endpoint capability
2026-05-29 08:58:10 +08:00
siyuan
d7bed40dda
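Fix OpenAI WS compatibility and usage statistics
...
- Align terminal-state usage parsing between WS and streaming; add the missing failed/done/incomplete/cancelled events
- Tolerate later WS response.create frames that omit model, keeping model mapping and permission checks consistent
- Complete passthrough header forwarding and image usage field mapping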
2026-05-28 01:27:11 +08:00
DaydreamCoding
56908d3c4c
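feat(openai): add a mechanism for codex_cli_only to allow the Claude Code Codex plugin
...
Use case: when the https://github.com/openai/codex-plugin-cc plugin is used
inside Claude Code, the plugin completes the initialize handshake through the
official codex app-server with clientInfo.name="Claude Code". The request
headers are set to originator=Claude Code and a User-Agent containing
"Claude Code/", which are not on the official client whitelist, so
codex_cli_only previously rejected the request with 403.
When the official client whitelist does not match, two independent allow
layers are evaluated (OR semantics):
- Per account: account.Extra.codex_cli_only_allowed_clients references named
presets (currently only claude_code); detector reason=allowed_client_matched
- Global switch: the OpenAI block of the gateway service section in
/admin/settings gains openai_allow_claude_code_codex_plugin (default false);
when enabled, it allows the plugin for all codex_cli_only accounts;
detector reason=global_allowed_client_matched
The signature still requires originator to equal Claude Code exactly and the
UA to contain "Claude Code/".
Upstream forwarding remains an unchanged passthrough.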
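The two-layer allow check described above might look roughly like this. It is a sketch under assumptions: the function names and parameters are hypothetical, and checking the per-account preset before the global switch is an assumed order. Only the signature rule, the preset name claude_code, and the two detector reasons come from the message.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesClaudeCodeSignature applies the signature rule from the commit
// message: originator must equal "Claude Code" exactly and the User-Agent
// must contain "Claude Code/".
func matchesClaudeCodeSignature(originator, userAgent string) bool {
	return originator == "Claude Code" && strings.Contains(userAgent, "Claude Code/")
}

// allowClaudeCodePlugin is a hypothetical helper. It returns whether the
// request is allowed and the detector reason. The two layers are OR-ed; the
// evaluation order (account first, then global) is an assumption.
func allowClaudeCodePlugin(originator, userAgent string, accountPresets []string, globalAllow bool) (bool, string) {
	if !matchesClaudeCodeSignature(originator, userAgent) {
		return false, ""
	}
	for _, preset := range accountPresets {
		if preset == "claude_code" {
			return true, "allowed_client_matched"
		}
	}
	if globalAllow {
		return true, "global_allowed_client_matched"
	}
	return false, ""
}

func main() {
	ok, reason := allowClaudeCodePlugin("Claude Code", "Claude Code/1.0", nil, true)
	fmt.Println(ok, reason)
}
```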
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:55:34 +08:00
Wesley Liddick
bbe847ed3e
Merge pull request #2805 from StarryKira/feat/configurable-pool-retry-status-codes
...
feat(account): configurable pool-mode same-account retry status codes
2026-05-27 22:09:55 +08:00
Wesley Liddick
69657b2fa1
Merge pull request #2827 from ttt132/fix/api-key-responses-sse-fallback
...
fix: fallback to SSE body for API key responses
2026-05-27 21:56:00 +08:00
Wesley Liddick
61ce79533e
Merge pull request #2800 from wucm667/fix/scheduler-model-not-found-per-model-cooldown
...
fix(scheduler): model 404 cools down only that account-model pair instead of blocking the whole account
2026-05-27 21:01:52 +08:00
haichuan
32ea9cfe1f
fix: fallback to SSE body for API key responses
2026-05-27 20:24:52 +08:00
StarryKira
21033dceb9
feat(account): configurable pool-mode same-account retry status codes
...
Pool mode currently retries the same account for a fixed set of
upstream HTTP statuses: 401, 403, 429. Some upstream pool deployments
also need same-account retry for transient provider/proxy statuses
such as 502, 503, 520, 529, but hard-coding more statuses changes
behavior for everyone.
Add a per-account credentials option `pool_mode_retry_status_codes`
that lets admins choose which upstream HTTP status codes trigger
same-account retry in pool mode:
- Unset (default): preserve the current 401/403/429 default
- Explicit list: override the defaults with the configured codes
- Codes normalized to the 100-599 range, deduplicated, sorted
The standalone `isPoolModeRetryableStatus` helper is kept as the
default-only fallback. All 15 gateway call sites switch to the new
`Account.IsPoolModeRetryableStatus` method so behavior is preserved
for accounts that do not configure the new field.
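A minimal sketch of the normalization and fallback rules listed above. The names here are hypothetical and this is not the code from the commit. It assumes that "normalized to the 100-599 range" means out-of-range codes are dropped, and that an explicit but empty list means "retry on nothing"; the message does not spell out either point.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultPoolModeRetryCodes mirrors the default set named in the message.
var defaultPoolModeRetryCodes = []int{401, 403, 429}

// normalizeRetryStatusCodes drops codes outside 100-599 (assumed behavior),
// removes duplicates, and returns the result in ascending order.
func normalizeRetryStatusCodes(codes []int) []int {
	seen := make(map[int]struct{}, len(codes))
	out := make([]int, 0, len(codes))
	for _, code := range codes {
		if code < 100 || code > 599 {
			continue
		}
		if _, dup := seen[code]; dup {
			continue
		}
		seen[code] = struct{}{}
		out = append(out, code)
	}
	sort.Ints(out)
	return out
}

// isRetryable uses the defaults when nothing is configured (nil slice) and
// otherwise lets the configured list fully override them.
func isRetryable(configured []int, status int) bool {
	codes := defaultPoolModeRetryCodes
	if configured != nil {
		codes = normalizeRetryStatusCodes(configured)
	}
	for _, code := range codes {
		if code == status {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(normalizeRetryStatusCodes([]int{503, 99, 502, 503, 600}))
	fmt.Println(isRetryable(nil, 429), isRetryable([]int{502, 503}, 429))
}
```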
Frontend admin UI gains a "Retry Status Codes" comma-separated input
under the pool-mode section in both Create/Edit account modals
(en + zh i18n).
Fixes #2731
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 11:24:25 -07:00
wucm667
a31b507484
fix(scheduler): model 404 cools down only the account-model pair
2026-05-26 20:29:48 +08:00
benjamin
5d7df678b1
fix(openai): mark local gateway denials business-limited
...
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-26 17:19:50 +08:00
Wesley Liddick
bebc082306
Merge pull request #2766 from DaydreamCoding/feat/user-platform-quota
...
feat(quota): user × platform USD quotas
2026-05-26 14:13:18 +08:00
mt21625457
33ac8eb27d
fix openai http2 response header timeout
2026-05-26 13:57:59 +08:00
DaydreamCoding
6b39b344d8
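feat(quota): user × platform USD quotas
...
Provide USD quota control for users on four platforms
(anthropic/openai/gemini/antigravity) across three windows: daily, weekly,
and monthly. Quota semantics: unset = unlimited, 0 = disabled, >0 = USD cap.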
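The three-way quota semantics can be illustrated with a nullable limit. This is only a sketch: quotaAllows is a hypothetical helper, and two details are assumptions the message does not state, namely that spend exactly at the cap is rejected and that a negative limit is treated like 0.

```go
package main

import "fmt"

// quotaAllows reports whether a user may spend more on a platform window.
//   - nil limit: unset, so unlimited
//   - limit 0:   disabled, nothing is allowed
//   - limit > 0: allowed while spend stays below the USD cap
func quotaAllows(limitUSD *float64, usedUSD float64) bool {
	switch {
	case limitUSD == nil:
		return true
	case *limitUSD <= 0: // treating negatives as disabled is an assumption
		return false
	default:
		return usedUSD < *limitUSD // "at the cap means blocked" is an assumption
	}
}

func main() {
	limit := 10.0
	zero := 0.0
	fmt.Println(quotaAllows(nil, 1000), quotaAllows(&zero, 0), quotaAllows(&limit, 9.5))
}
```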
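Two-layer model:
- Configuration layer: the system default quota plus default quotas for seven
auth sources (email/linuxdo/oidc/wechat/github/google/dingtalk), stored in
settings and read and written as whole nested JSON values (one key for the
system, one key per source) with whole-value replacement semantics.
- Runtime layer: the user_platform_quota table records each user's actual
quota, decoupled from the configuration layer.
Backend: add the ent schema and the 140_user_platform_quotas.sql migration,
repository and service ports, billing pipeline integration, and admin-side
and user-side read/write endpoints.
Frontend: quota editing on the admin settings page, a user quota management
modal, display on the user dashboard, and EN/ZH copy.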
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 10:49:20 +08:00
Wesley Liddick
3c5a444802
Merge pull request #2698 from deqiying/fix/log-real-client-ip
...
fix: correct inaccurate client IP in denial logs under reverse-proxy deployments
2026-05-23 11:08:47 +08:00
shaw
1e406fed52
fix: optimize OpenAI account cooldown scheduling
2026-05-23 10:18:43 +08:00
deqiying
0af44ce4c2
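fix: correct inaccurate client IP in denial logs under reverse-proxy deployments
...
Make request_client_ip in the OpenAI codex_cli_only denial diagnostic log
reuse ip.GetClientIP, consistent with the real-client-IP resolution used by
usage records and the access log.
Keep request_remote_addr for troubleshooting the underlying Docker/reverse-proxy
peer address, and add unit tests covering the case where the proxy headers and
the remote addr differ.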
2026-05-22 23:28:21 +08:00
Wesley Liddick
7ec61eb2f5
Merge pull request #2606 from wucm667/fix/openai-responses-respect-force-chat-completions
...
fix(openai): /v1/responses entry point respects the force_chat_completions setting
2026-05-20 15:13:43 +08:00
shaw
878ad3b569
feat(openai-gateway): auto-rewrite browser UA for Codex OAuth accounts to avoid Cloudflare
...
challenges
2026-05-20 14:33:51 +08:00
wucm667
cae93ae137
fix(openai): /v1/responses respect force chat completions
2026-05-20 14:17:26 +08:00
name
2eb622f2f6
Remove ops retry replay storage
2026-05-19 19:37:41 +08:00
Wesley Liddick
36e461e7c9
Merge pull request #2424 from wucm667/fix/openai-versioned-base-url
...
fix(openai): handle versioned compatible base URLs
2026-05-19 14:44:37 +08:00
Wesley Liddick
ae4c738887
Merge pull request #2457 from wucm667/fix/openai-fast-policy-default-pass
...
fix: pass through OpenAI service_tier by default
2026-05-19 14:34:37 +08:00
Wesley Liddick
a340002c6d
Merge pull request #2401 from 2ue/fix/normalize-image-billing-size
...
Fix image billing size normalization and usage record display
2026-05-19 14:00:24 +08:00
Wesley Liddick
14f54be03f
Merge pull request #2481 from weak-fox/lyp/fix-issue-2223-capacity-retry
...
fix: OpenAI model capacity errors not triggering automatic retry
2026-05-19 10:24:18 +08:00
Wesley Liddick
f9fec78b70
Merge pull request #2505 from is7Qin/fix/openai-compat-usage-parsing
...
Fix billing loophole where usage was recorded as 0 tokens after Claude was mapped to GPT
2026-05-19 09:53:50 +08:00
lyen1688
cc5328c491
Fix detection of OpenAI Responses SSE terminal events
2026-05-17 15:33:34 +08:00
name
0393bd7c82
Fix OpenAI compat usage parsing
2026-05-16 03:03:43 +08:00
weak-fox
9f07741c13
fix: retry model capacity transient errors
2026-05-15 10:43:29 +08:00
wucm667
e9637148dd
fix(openai): pass service_tier by default
2026-05-14 16:45:31 +08:00
wucm667
679c0865a0
fix(openai): handle versioned compatible base URLs
2026-05-13 11:25:15 +08:00
2ue
bb4c1abe28
Fix image billing size normalization
2026-05-12 15:21:31 +08:00
wucm667
6d69ae87c3
fix(openai): record zero-cost usage for unpriced models
2026-05-09 17:33:35 +08:00
Jlypx
26043a8f29
fix(openai): gate Codex image bridge injection
...
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-07 00:10:20 +08:00
lyen1688
0584305e5a
feat: improve OpenAI messages compatibility for Claude Code
2026-05-05 19:36:33 +08:00
2ue
6faa344916
feat: add OpenAI image generation controls
2026-05-05 03:26:54 +08:00
shaw
47fb38bca1
fix: record zero OpenAI usage logs
2026-05-03 17:43:56 +08:00
shaw
72d5ee4cd1
fix: drain OpenAI compat streams for usage
2026-05-03 17:11:27 +08:00
Wesley Liddick
55a7fa1e07
Merge pull request #2005 from gaoren002/pr/openai-strip-passthrough-fields
...
fix(openai): strip unsupported passthrough fields
2026-04-29 21:46:19 +08:00
Wesley Liddick
46f06b2498
Merge pull request #2050 from zvensmoluya/fix/openai-compact-payload-fields
...
fix(openai): preserve current Codex compact payload fields
2026-04-29 21:03:48 +08:00
DaydreamCoding
30f55a1f72
feat(openai): full implementation of the OpenAI Fast/Flex Policy (HTTP + WebSocket + Admin)
...
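Mirroring the fast-mode filtering in Claude BetaPolicy, add a three-state
pass / filter / block policy for the OpenAI upstream service_tier field
(priority / flex, including normalization of the client alias "fast" →
"priority"), covering all OpenAI entry points plus the admin configuration
entry point.
Backend core
- Add the SettingKeyOpenAIFastPolicySettings, OpenAIFastPolicyRule, and
OpenAIFastPolicySettings configuration models, with rule dimensions
service_tier × action × scope × model whitelist × fallback action.
- SettingService.Get/SetOpenAIFastPolicySettings; when the setting is missing,
return the built-in default policy (priority is filtered for all models,
whitelist is empty, fallback=pass). Rationale: service_tier=fast is a
user-level switch orthogonal to the model field, so pinning the default to
specific model slugs would leave a bypass path ("use gpt-4 + fast to pass
priority through to the upstream"). JSON parse failures no longer fall back
silently; slog.Warn records the bad data so operators can locate it.
- service_tier normalization (trim + ToLower + fast→priority + whitelist
priority/flex) and policy evaluation (evaluateOpenAIFastPolicy) are the
single source of truth, shared by HTTP and WS. Extract the pure function
evaluateOpenAIFastPolicyWithSettings and pair it with a ctx-bound settings
snapshot (withOpenAIFastPolicyContext / openAIFastPolicySettingsFromContext):
long-lived WS session entry points prefetch once and reuse the snapshot for
every frame, so frames do not each hit settingService.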
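The normalization chain named above (trim + ToLower + fast→priority + whitelist priority/flex) could be sketched like this. normalizeServiceTier is a hypothetical name, and returning an empty string with false for unrecognized tiers is an assumption about how the non-whitelisted case is reported.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeServiceTier trims and lowercases the raw value, maps the client
// alias "fast" to "priority", and accepts only the whitelisted tiers.
// The boolean is false when the value is not a tier the policy handles.
func normalizeServiceTier(raw string) (string, bool) {
	tier := strings.ToLower(strings.TrimSpace(raw))
	if tier == "fast" {
		tier = "priority"
	}
	switch tier {
	case "priority", "flex":
		return tier, true
	default:
		return "", false
	}
}

func main() {
	for _, raw := range []string{" Fast ", "flex", "auto"} {
		tier, ok := normalizeServiceTier(raw)
		fmt.Printf("%q -> %q %v\n", raw, tier, ok)
	}
}
```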
HTTP entry points (4)
- Chat Completions, Anthropic-compatible (Messages, including the second hit
from BetaFastMode→priority), native Responses, and Passthrough Responses all
go through applyOpenAIFastPolicyToBody; filter deletes the top-level
service_tier via sjson, and block returns a 403 forbidden_error JSON.
- All four entry points use the upstream-side model (the slug after
GetMappedModel + normalizeOpenAIModelForUpstream + Codex OAuth
normalization), so chat/messages/native /responses/passthrough do not differ
in whitelist matching because of differing model dimensions.
- On the pass path, also normalize the client alias "fast" to "priority" and
write it back to the body; otherwise the native /responses and passthrough
entry points forward "fast" verbatim to the upstream, causing a 400/rejection
(normalizeResponsesBodyServiceTier on the chat-completions entry point
already behaved this way).
WebSocket entry points
- Add applyOpenAIFastPolicyToWSResponseCreate: strictly matches
type="response.create" and handles only the top-level service_tier; filter
deletes the field via sjson, and block returns a typed *OpenAIFastBlockedError.
- The ingress path calls it inside parseClientPayload; on a block hit it first
writes a Realtime-style error event and then returns OpenAIWSClientCloseError
(StatusPolicyViolation=1008), relying on the synchronous flush of the
underlying WebSocket Conn.Write to guarantee that the error precedes the close.
- The passthrough path applies the policy to firstClientMessage before
RunEntry, and wraps ReadFrame with openAIWSPolicyEnforcingFrameConn to enforce
the policy on every client→upstream frame; later frames without a model field
fall back to capturedSessionModel. The filter closure also watches the
session.model field of session.update / session.created frames and refreshes
capturedSessionModel, closing the mid-session bypass path "first frame
model=gpt-4o (pass) → session.update switches to gpt-5.5 → a response.create
without model falls back to gpt-4o".
- passthrough billing: requestServiceTier is extracted from firstClientMessage
only after the policy filter runs; on a filter hit,
OpenAIForwardResult.ServiceTier reports nil (default tier), consistent with
the HTTP entry points (reqBody comes from the post-filter map) and WS ingress
(payload comes from the post-filter bytes).
- Error event schema: {event_id: "evt_<32hex>", type: "error",
error: {type: "forbidden_error", code: "policy_violation", message}},
compatible with error event parsing in the OpenAI codex client.
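For illustration, the error event schema above could be produced as follows. The struct and function names are hypothetical, and generating event_id from 16 random bytes is one assumed way to obtain the 32 hex characters; only the field names and fixed values come from the schema in the message.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// wsErrorBody is the nested "error" object of the event.
type wsErrorBody struct {
	Type    string `json:"type"`
	Code    string `json:"code"`
	Message string `json:"message"`
}

// wsErrorEvent matches the schema quoted in the commit message.
type wsErrorEvent struct {
	EventID string      `json:"event_id"`
	Type    string      `json:"type"`
	Error   wsErrorBody `json:"error"`
}

// newPolicyViolationEvent builds a block-policy error event with a fresh
// "evt_" + 32-hex-character identifier.
func newPolicyViolationEvent(message string) (wsErrorEvent, error) {
	buf := make([]byte, 16) // 16 bytes encode to 32 hex characters
	if _, err := rand.Read(buf); err != nil {
		return wsErrorEvent{}, err
	}
	return wsErrorEvent{
		EventID: "evt_" + hex.EncodeToString(buf),
		Type:    "error",
		Error: wsErrorBody{
			Type:    "forbidden_error",
			Code:    "policy_violation",
			Message: message,
		},
	}, nil
}

func main() {
	ev, err := newPolicyViolationEvent("service_tier is not allowed by policy")
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```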
Admin / Frontend
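- dto.SystemSettings / UpdateSettingsRequest gain the
openai_fast_policy_settings field (omitempty), wired into bulk GET/PUT.
- The Gateway tab of the Settings page gains a Fast/Flex Policy form card
covering every field: service_tier × action × scope × model whitelist ×
fallback action.
- Frontend guard: the openaiFastPolicyLoaded flag permits write-back only when
GET actually returned the field, so a rollout or error cannot overwrite the
default rules with an empty value; the saveSettings write-back loop skips this
field and leaves it to dedicated refresh logic; error_message is sent only
when action=block, matching the backend's omitempty behavior.
Tests
- HTTP path: openai_fast_policy_test.go covers the default config
(whitelist=[], priority filtered for all models) / block with a custom error /
scope distinction / filter deleting the field / block leaving the body
unchanged / block short-circuiting the upstream / Anthropic BetaFastMode
triggering the OpenAI fast policy, among other scenarios.
- WebSocket path: openai_fast_policy_ws_test.go covers
helper units (filter / fast→priority normalization / flex passthrough / block
typed error / bytes unchanged without service_tier / non-response.create
frames untouched / empty-type frames untouched / event_id+code field
assertions / tolerance of non-string service_tier) +
pass-path fast alias normalization regression +
ingress end-to-end (upstream sees no service_tier after filter / after block
the client receives the error event first and then close 1008, with 0
upstream writes) +
passthrough capturedSessionModel fallback cases (established by the first
frame under a whitelist policy, fallback hit when model is missing,
documenting the leak when no fallback exists) +
passthrough mid-session bypass regression for session.update /
session.created rotating capturedSessionModel +
passthrough billing post-filter ServiceTier and idempotent filter regressions.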
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:15:09 +08:00
Zven
3d4ca5e8d1
fix(openai): preserve current Codex compact payload fields
2026-04-28 10:55:29 +08:00
gaoren002
9fe02bba7e
fix(openai): strip unsupported passthrough fields
2026-04-27 00:39:06 +00:00
gaoren002
615557ec20
fix(openai): avoid implicit image sticky sessions
2026-04-26 17:09:41 +00:00
Wesley Liddick
22b1277572
Merge pull request #1948 from hungryboy1025/fix/openai-account-test-responses-stream
...
fix(openai): tighten responses stream account tests
2026-04-25 20:31:07 +08:00