262 Commits

Author SHA1 Message Date
Wesley Liddick
f68d351158
Merge pull request #2873 from wucm667/feat/account-quota-threshold-auto-pause
feat(account): support auto-pausing account scheduling by 5h/7d usage thresholds
2026-05-29 15:40:33 +08:00
wucm667
c9caadb378 fix(account): address second-round review on quota auto-pause
- TopK initial filter now drops quota-paused accounts: fold the quota check
  into isAccountRequestCompatible so session-hash, TopK pool, and per-candidate
  rechecks all skip paused accounts. Previously the candidate pool was built
  without the quota check, so paused accounts could fill TopK and leave the
  scheduler returning "no available accounts" even with healthy ones available.
- Add per-account explicit disable flags auto_pause_5h_disabled /
  auto_pause_7d_disabled with toggles in EditAccountModal. Without these,
  leaving the account threshold blank silently falls back to the global default,
  so admins could not exempt a single account once a global default existed.
  Disable is per-window: an account can opt out of 5h auto-pause while still
  honoring 7d. Schedule snapshot whitelist includes the new fields, i18n EN/ZH
  updated, threshold-hint text revised to explain "blank = global default".
- Move quota auto-pause settings off the request hot path: replace the per-repo
  TTL+singleflight sync DB read with a per-SettingService stale-while-revalidate
  in-memory snapshot. Get is non-blocking (atomic.Pointer load + async refresh
  on staleness); writes via UpdateOpsAdvancedSettings push directly into the
  cache through an injected sink; wire warms the cache at startup. Adds Warm
  (sync) for tests/init and SetOpenAIQuotaAutoPauseSettings (sink target).
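
A minimal sketch of the stale-while-revalidate snapshot described in the last bullet, assuming Go 1.19+ for `atomic.Pointer`. The names `QuotaSettings`, `SettingsCache`, `NewSettingsCache`, and `Set` are hypothetical and used only for illustration; the real SettingService types are not shown in this log.

```go
package main

import (
	"sync/atomic"
	"time"
)

// QuotaSettings is a hypothetical stand-in for the quota auto-pause settings.
type QuotaSettings struct {
	Threshold5h float64
	Threshold7d float64
}

// snapshot pairs a settings value with the time it was stored.
type snapshot struct {
	settings  QuotaSettings
	fetchedAt time.Time
}

// SettingsCache is an illustrative stale-while-revalidate cache (assumed shape).
type SettingsCache struct {
	current    atomic.Pointer[snapshot]
	refreshing atomic.Bool
	ttl        time.Duration
	load       func() (QuotaSettings, error)
}

func NewSettingsCache(ttl time.Duration, load func() (QuotaSettings, error)) *SettingsCache {
	return &SettingsCache{ttl: ttl, load: load}
}

// Set pushes a value straight into the cache, as a write-path sink would.
func (c *SettingsCache) Set(s QuotaSettings) {
	c.current.Store(&snapshot{settings: s, fetchedAt: time.Now()})
}

// Warm loads synchronously; intended for startup and tests.
func (c *SettingsCache) Warm() error {
	s, err := c.load()
	if err != nil {
		return err
	}
	c.Set(s)
	return nil
}

// Get never waits on the loader: it returns the current snapshot (or the zero
// value when the cache is cold) and starts a background refresh if stale.
func (c *SettingsCache) Get() QuotaSettings {
	snap := c.current.Load()
	if snap == nil || time.Since(snap.fetchedAt) > c.ttl {
		c.refreshAsync()
	}
	if snap == nil {
		return QuotaSettings{}
	}
	return snap.settings
}

// refreshAsync starts at most one background reload at a time.
func (c *SettingsCache) refreshAsync() {
	if !c.refreshing.CompareAndSwap(false, true) {
		return
	}
	go func() {
		defer c.refreshing.Store(false)
		if s, err := c.load(); err == nil {
			c.Set(s)
		}
	}()
}
```

In this sketch a failed background load keeps serving the previous snapshot, which is one reasonable choice for a scheduling hot path; the actual error handling may differ.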

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 14:32:45 +08:00
wucm667
8b7a822706 fix(account): address review on OpenAI quota auto-pause
- gate previous_response_id sticky path with quota auto-pause check at both
  the snapshot and DB-recheck stages (previously bypassed, #1)
- skip pausing when the usage window already reset to avoid a stale stuck-pause;
  carry codex_*_reset_at / reset_after_seconds / codex_usage_updated_at through
  the scheduler snapshot whitelist (#2)
- remove the incomplete limit mode; percentage threshold only (#3)
- add global default 5h/7d threshold inputs to the Ops settings dialog with
  validation and en/zh i18n (#4)
- downgrade account_auto_paused_by_quota log from Info to Debug; it fires
  per-candidate on the scheduling hot path (#5)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 12:20:30 +08:00
wucm667
ead471d64b feat(account): support auto-pausing account scheduling by 5h/7d usage thresholds 2026-05-29 10:47:47 +08:00
Wesley Liddick
16842c2f8b
Merge pull request #2836 from siyuan-123/fix/openai-ws-compat-usage
Fix OpenAI WS compatibility and usage statistics
2026-05-29 10:47:12 +08:00
Wesley Liddick
433f8dcd13
Merge pull request #2834 from DaydreamCoding/pr/openai-codex-cli-allow-claude-code
feat(openai): add a mechanism for codex_cli_only to allow the Claude Code Codex plugin
2026-05-29 10:30:33 +08:00
shaw
ed1b57c597 fix(openai): gate routing by endpoint capability 2026-05-29 08:58:10 +08:00
siyuan
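d7bed40dda Fix OpenAI WS compatibility and usage statistics
- Align usage parsing for WS and streaming terminal states; add the missing
  failed/done/incomplete/cancelled events
- Tolerate subsequent WS response.create frames that omit model, keeping model
  mapping and permission checks consistent
- Complete passthrough header forwarding and image usage field mapping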
2026-05-28 01:27:11 +08:00
DaydreamCoding
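56908d3c4c feat(openai): add a mechanism for codex_cli_only to allow the Claude Code Codex plugin
Use case: when the https://github.com/openai/codex-plugin-cc plugin is used
inside Claude Code, the plugin completes the initialize handshake through the
official codex app-server with clientInfo.name="Claude Code". The request
headers are set to originator=Claude Code with a User-Agent containing
"Claude Code/", which is not in the official client whitelist, so
codex_cli_only previously rejected the request with 403.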

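When the official client whitelist does not match, two independent allow
layers are evaluated (OR semantics):

- Per account: account.Extra.codex_cli_only_allowed_clients references named
  presets (currently only claude_code); detector reason=allowed_client_matched
- Global switch: the OpenAI block of the gateway service section in
  /admin/settings gains openai_allow_claude_code_codex_plugin (default false);
  when enabled it allows all codex_cli_only accounts uniformly;
  detector reason=global_allowed_client_matched

The signature still requires originator=Claude Code to match exactly and the
UA to contain "Claude Code/". Upstream forwarding remains passthrough, unchanged.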
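
The two allow layers can be sketched roughly as follows. This is an illustration under assumptions, not the detector's actual code: `clientSignature`, `matchesClaudeCode`, and `allowReason` are hypothetical names, and checking the per-account layer before the global switch is an assumed order (the commit only states OR semantics). The function is meant to run only after the official client whitelist has missed.

```go
package main

import "strings"

// clientSignature holds the two request attributes the commit message names.
type clientSignature struct {
	Originator string
	UserAgent  string
}

// matchesClaudeCode requires an exact originator match and a UA containing
// "Claude Code/", as stated in the commit message.
func matchesClaudeCode(sig clientSignature) bool {
	return sig.Originator == "Claude Code" &&
		strings.Contains(sig.UserAgent, "Claude Code/")
}

// allowReason evaluates the per-account presets and the global switch with OR
// semantics and returns the detector reason for the layer that matched.
func allowReason(sig clientSignature, accountPresets []string, globalAllow bool) (string, bool) {
	if !matchesClaudeCode(sig) {
		return "", false
	}
	for _, preset := range accountPresets {
		if preset == "claude_code" {
			return "allowed_client_matched", true
		}
	}
	if globalAllow {
		return "global_allowed_client_matched", true
	}
	return "", false
}
```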

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:55:34 +08:00
Wesley Liddick
bbe847ed3e
Merge pull request #2805 from StarryKira/feat/configurable-pool-retry-status-codes
feat(account): configurable pool-mode same-account retry status codes
2026-05-27 22:09:55 +08:00
Wesley Liddick
69657b2fa1
Merge pull request #2827 from ttt132/fix/api-key-responses-sse-fallback
fix: fallback to SSE body for API key responses
2026-05-27 21:56:00 +08:00
Wesley Liddick
61ce79533e
Merge pull request #2800 from wucm667/fix/scheduler-model-not-found-per-model-cooldown
fix(scheduler): a model 404 only cools down that account-model pair instead of blocking the whole account
2026-05-27 21:01:52 +08:00
haichuan
32ea9cfe1f fix: fallback to SSE body for API key responses 2026-05-27 20:24:52 +08:00
StarryKira
21033dceb9 feat(account): configurable pool-mode same-account retry status codes
Pool mode currently retries the same account for a fixed set of
upstream HTTP statuses: 401, 403, 429. Some upstream pool deployments
also need same-account retry for transient provider/proxy statuses
such as 502, 503, 520, 529, but hard-coding more statuses changes
behavior for everyone.

Add a per-account credentials option `pool_mode_retry_status_codes`
that lets admins choose which upstream HTTP status codes trigger
same-account retry in pool mode:

- Unset (default): preserve the current 401/403/429 default
- Explicit list: override the defaults with the configured codes
- Codes normalized to the 100-599 range, deduplicated, sorted

The standalone `isPoolModeRetryableStatus` helper is kept as the
default-only fallback. All 15 gateway call sites switch to the new
`Account.IsPoolModeRetryableStatus` method so behavior is preserved
for accounts that do not configure the new field.
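
A minimal sketch of the normalization and override rules listed above. It assumes that "normalized to the 100-599 range" means out-of-range codes are dropped rather than clamped, and that an unset field is represented by a nil slice; `normalizeRetryStatusCodes` and `isRetryable` are hypothetical names, not the repository's `Account.IsPoolModeRetryableStatus`.

```go
package main

import "sort"

// defaultPoolModeRetryStatuses mirrors the 401/403/429 default from the commit message.
var defaultPoolModeRetryStatuses = []int{401, 403, 429}

// normalizeRetryStatusCodes drops codes outside 100-599 (assumption: drop, not
// clamp), removes duplicates, and sorts ascending.
func normalizeRetryStatusCodes(codes []int) []int {
	seen := make(map[int]struct{}, len(codes))
	out := make([]int, 0, len(codes))
	for _, c := range codes {
		if c < 100 || c > 599 {
			continue
		}
		if _, dup := seen[c]; dup {
			continue
		}
		seen[c] = struct{}{}
		out = append(out, c)
	}
	sort.Ints(out)
	return out
}

// isRetryable uses the defaults when nothing is configured (nil) and otherwise
// lets the configured list override them entirely.
func isRetryable(configured []int, status int) bool {
	codes := defaultPoolModeRetryStatuses
	if configured != nil {
		codes = normalizeRetryStatusCodes(configured)
	}
	for _, c := range codes {
		if c == status {
			return true
		}
	}
	return false
}
```

Because an explicit list overrides rather than extends the defaults, an account configured with only 502 and 503 no longer retries on 401 in this sketch.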

Frontend admin UI gains a "Retry Status Codes" comma-separated input
under the pool-mode section in both Create/Edit account modals
(en + zh i18n).

Fixes #2731

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 11:24:25 -07:00
wucm667
a31b507484 fix(scheduler): model 404 only cools down the account-model pair 2026-05-26 20:29:48 +08:00
benjamin
5d7df678b1 fix(openai): mark local gateway denials business-limited
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-26 17:19:50 +08:00
Wesley Liddick
bebc082306
Merge pull request #2766 from DaydreamCoding/feat/user-platform-quota
feat(quota): user × platform USD quota
2026-05-26 14:13:18 +08:00
mt21625457
33ac8eb27d fix openai http2 response header timeout 2026-05-26 13:57:59 +08:00
DaydreamCoding
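6b39b344d8 feat(quota): user × platform USD quota
Provide USD quota control for users on four platforms
(anthropic/openai/gemini/antigravity) across three windows
(daily/weekly/monthly). Quota semantics: unset = unlimited, 0 = disabled,
>0 = USD cap.

Two-layer model:
- Configuration layer: the system default quota plus default quotas for the
  seven auth sources email/linuxdo/oidc/wechat/github/google/dingtalk, stored
  in settings and read/written as whole nested JSON values (one key for the
  system, one key per source) with whole-replacement semantics.
- Runtime layer: the user_platform_quota table records each user's actual
  quota, decoupled from the configuration layer.

Backend: add the ent schema and the 140_user_platform_quotas.sql migration,
repository and service ports, billing pipeline integration, and admin-side and
user-side read/write endpoints.
Frontend: quota editing on the admin settings page, a user quota management
modal, user Dashboard display, and Chinese/English copy.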
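
The three-way quota semantics (unset = unlimited, 0 = disabled, >0 = USD cap) might be expressed like this. It is a sketch under assumptions: `checkQuota` and `quotaDecision` are hypothetical names, a nil pointer stands for "unset", negative limits are treated like 0, and usage equal to the cap counts as exceeded; none of these details are confirmed by the commit.

```go
package main

// quotaDecision is the outcome of a quota check for one platform and window.
type quotaDecision int

const (
	quotaAllow    quotaDecision = iota // request may proceed
	quotaDisabled                      // platform access is turned off
	quotaExceeded                      // USD cap reached for this window
)

// checkQuota applies the semantics from the commit message:
// nil limit = unlimited, 0 = disabled, >0 = USD cap.
func checkQuota(limitUSD *float64, usedUSD float64) quotaDecision {
	if limitUSD == nil {
		return quotaAllow
	}
	if *limitUSD <= 0 { // assumption: negative values behave like 0
		return quotaDisabled
	}
	if usedUSD >= *limitUSD { // assumption: reaching the cap counts as exceeded
		return quotaExceeded
	}
	return quotaAllow
}
```

Using a pointer keeps "unset" distinct from an explicit 0, which is the distinction the quota semantics depend on.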

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 10:49:20 +08:00
Wesley Liddick
3c5a444802
Merge pull request #2698 from deqiying/fix/log-real-client-ip
fix: correct inaccurate client IP in rejection logs under reverse-proxy deployments
2026-05-23 11:08:47 +08:00
shaw
1e406fed52 fix: optimize OpenAI account cooldown scheduling 2026-05-23 10:18:43 +08:00
deqiying
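0af44ce4c2 fix: correct inaccurate client IP in rejection logs under reverse-proxy deployments
Change request_client_ip in the OpenAI codex_cli_only rejection diagnostic log
to reuse ip.GetClientIP, keeping it consistent with the real client IP
resolution used by usage records and the access log.

Keep request_remote_addr for troubleshooting the underlying Docker/reverse-proxy
peer address, and add unit tests covering the case where the reverse-proxy
headers and the remote addr differ.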
2026-05-22 23:28:21 +08:00
Wesley Liddick
7ec61eb2f5
Merge pull request #2606 from wucm667/fix/openai-responses-respect-force-chat-completions
fix(openai): make the /v1/responses entry point respect the force_chat_completions setting
2026-05-20 15:13:43 +08:00
shaw
878ad3b569 feat(openai-gateway): auto-rewrite browser UAs for Codex OAuth accounts to avoid Cloudflare
challenges
2026-05-20 14:33:51 +08:00
wucm667
cae93ae137 fix(openai): /v1/responses respect force chat completions 2026-05-20 14:17:26 +08:00
name
2eb622f2f6 Remove ops retry replay storage 2026-05-19 19:37:41 +08:00
Wesley Liddick
36e461e7c9
Merge pull request #2424 from wucm667/fix/openai-versioned-base-url
fix(openai): handle versioned compatible base URLs
2026-05-19 14:44:37 +08:00
Wesley Liddick
ae4c738887
Merge pull request #2457 from wucm667/fix/openai-fast-policy-default-pass
fix: pass through OpenAI service_tier by default
2026-05-19 14:34:37 +08:00
Wesley Liddick
a340002c6d
Merge pull request #2401 from 2ue/fix/normalize-image-billing-size
Fix image billing size normalization and usage record display
2026-05-19 14:00:24 +08:00
Wesley Liddick
14f54be03f
Merge pull request #2481 from weak-fox/lyp/fix-issue-2223-capacity-retry
fix: OpenAI model capacity errors did not enter automatic retry
2026-05-19 10:24:18 +08:00
Wesley Liddick
f9fec78b70
Merge pull request #2505 from is7Qin/fix/openai-compat-usage-parsing
Fix billing loophole where Claude requests mapped to GPT were recorded as 0 tokens
2026-05-19 09:53:50 +08:00
lyen1688
cc5328c491 Fix OpenAI Responses SSE terminal event detection 2026-05-17 15:33:34 +08:00
name
0393bd7c82 Fix OpenAI compat usage parsing 2026-05-16 03:03:43 +08:00
weak-fox
9f07741c13 fix: retry model capacity transient errors 2026-05-15 10:43:29 +08:00
wucm667
e9637148dd fix(openai): pass service_tier by default 2026-05-14 16:45:31 +08:00
wucm667
679c0865a0 fix(openai): handle versioned compatible base URLs 2026-05-13 11:25:15 +08:00
2ue
bb4c1abe28 Fix image billing size normalization 2026-05-12 15:21:31 +08:00
wucm667
6d69ae87c3 fix(openai): record zero-cost usage for unpriced models 2026-05-09 17:33:35 +08:00
Jlypx
26043a8f29 fix(openai): gate Codex image bridge injection
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-07 00:10:20 +08:00
lyen1688
0584305e5a feat: improve OpenAI messages compatibility for Claude Code 2026-05-05 19:36:33 +08:00
2ue
6faa344916 feat: add OpenAI image generation controls 2026-05-05 03:26:54 +08:00
shaw
47fb38bca1 fix: record zero OpenAI usage logs 2026-05-03 17:43:56 +08:00
shaw
72d5ee4cd1 fix: drain OpenAI compat streams for usage 2026-05-03 17:11:27 +08:00
Wesley Liddick
55a7fa1e07
Merge pull request #2005 from gaoren002/pr/openai-strip-passthrough-fields
fix(openai): strip unsupported passthrough fields
2026-04-29 21:46:19 +08:00
Wesley Liddick
46f06b2498
Merge pull request #2050 from zvensmoluya/fix/openai-compact-payload-fields
fix(openai): preserve current Codex compact payload fields
2026-04-29 21:03:48 +08:00
DaydreamCoding
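30f55a1f72 feat(openai): full implementation of OpenAI Fast/Flex Policy (HTTP + WebSocket + Admin)
Mirroring the fast-mode filtering in Claude BetaPolicy, add a three-state
pass / filter / block policy for the OpenAI upstream service_tier field
(priority / flex, including normalization of the client alias
"fast" → "priority"), covering all OpenAI entry points plus the admin
configuration entry point.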

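Backend core
- Add the SettingKeyOpenAIFastPolicySettings, OpenAIFastPolicyRule, and
  OpenAIFastPolicySettings configuration models, with rule dimensions
  service_tier × action × scope × model whitelist × fallback action.
- SettingService.Get/SetOpenAIFastPolicySettings; when the setting is missing,
  return the built-in default policy (filter priority for all models, empty
  whitelist, fallback=pass). Design rationale: service_tier=fast is a
  user-level switch orthogonal to the model field, so locking the default to
  specific model slugs would leave a bypass path of "use gpt-4 + fast to pass
  priority through to upstream". JSON parse failures no longer fall back
  silently; slog.Warn records the dirty data so operators can locate it.
- service_tier normalization (trim + ToLower + fast→priority + whitelist
  priority/flex) and policy evaluation (evaluateOpenAIFastPolicy) serve as the
  single source of truth, shared by HTTP and WS. Extract the pure function
  evaluateOpenAIFastPolicyWithSettings, paired with a ctx-bound settings
  snapshot (withOpenAIFastPolicyContext / openAIFastPolicySettingsFromContext);
  long-lived WS session entry points prefetch once and reuse it for all frames,
  avoiding a settingService hit per frame.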
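
The normalization step named in the last bullet (trim + ToLower + fast→priority + whitelist priority/flex) could look like the sketch below. `normalizeServiceTier` is a hypothetical name, and returning `("", false)` for anything outside the whitelist is an assumption about how unrecognized tiers are signalled.

```go
package main

import "strings"

// normalizeServiceTier trims and lowercases the raw value, maps the client
// alias "fast" to "priority", and accepts only "priority" or "flex".
// The boolean reports whether the value is a tier the policy applies to.
func normalizeServiceTier(raw string) (string, bool) {
	tier := strings.ToLower(strings.TrimSpace(raw))
	if tier == "fast" {
		tier = "priority"
	}
	switch tier {
	case "priority", "flex":
		return tier, true
	default:
		return "", false
	}
}
```

Keeping this as a single pure function is what lets the HTTP and WS paths share one definition of a recognized tier.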

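HTTP entry points (4)
- Chat Completions, Anthropic-compatible (Messages, including the secondary
  hit BetaFastMode→priority), native Responses, and Passthrough Responses all
  call applyOpenAIFastPolicyToBody; filter deletes the top-level service_tier
  via sjson, and block returns a 403 forbidden_error JSON.
- All four entry points use the upstream view of the model (the slug after
  GetMappedModel + normalizeOpenAIModelForUpstream + Codex OAuth
  normalization), so that chat/messages/native /responses/passthrough do not
  differ in whitelist matching because of different model dimensions.
- On the pass path, also normalize the client alias "fast" to "priority" and
  write it back to the body; otherwise the native /responses and passthrough
  entry points would forward "fast" verbatim to upstream and cause a
  400/rejection (normalizeResponsesBodyServiceTier on the chat-completions
  entry point already behaved this way).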

WebSocket entry points
- Add applyOpenAIFastPolicyToWSResponseCreate: strictly matches
  type="response.create" and handles only the top-level service_tier; filter
  deletes the field with sjson, and block returns a typed
  *OpenAIFastBlockedError.
- The ingress path calls it inside parseClientPayload; on a block hit it first
  writes a Realtime-style error event and then returns
  OpenAIWSClientCloseError(StatusPolicyViolation=1008), relying on the
  synchronous flush of the underlying WebSocket Conn.Write to guarantee the
  error precedes the close.
- The passthrough path applies the policy to firstClientMessage before
  RunEntry, and wraps ReadFrame in openAIWSPolicyEnforcingFrameConn to enforce
  the policy on every client→upstream frame; later frames without a model
  field fall back to capturedSessionModel. The filter closure also detects the
  session.model field of session.update / session.created frames and refreshes
  capturedSessionModel, closing the mid-session bypass "first frame
  model=gpt-4o (pass) → session.update switches to gpt-5.5 → a response.create
  without model falls back to gpt-4o".
- passthrough billing: requestServiceTier is extracted from firstClientMessage
  after the policy filter; on a filter hit OpenAIForwardResult.ServiceTier
  reports nil (default tier), consistent with the semantics of the HTTP entry
  points (reqBody comes from the post-filter map) and WS ingress (payload
  comes from the post-filter bytes).
- Error event schema: {event_id: "evt_<32hex>", type: "error",
  error: {type: "forbidden_error", code: "policy_violation", message}},
  compatible with error event parsing in the OpenAI codex client.
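
A hedged sketch of building an event that fits the error event schema above. The struct and function names (`wsErrorEvent`, `wsErrorBody`, `newPolicyViolationEvent`) are hypothetical, and deriving the 32 hex characters from 16 random bytes is an assumption; the commit only specifies the `evt_<32hex>` shape.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
)

// wsErrorBody is the nested "error" object of the event.
type wsErrorBody struct {
	Type    string `json:"type"`
	Code    string `json:"code"`
	Message string `json:"message"`
}

// wsErrorEvent follows the schema quoted in the commit message.
type wsErrorEvent struct {
	EventID string      `json:"event_id"`
	Type    string      `json:"type"`
	Error   wsErrorBody `json:"error"`
}

// newPolicyViolationEvent builds a block-path error event with a random
// "evt_" + 32-hex-character identifier (16 random bytes, an assumption).
func newPolicyViolationEvent(message string) (wsErrorEvent, error) {
	var raw [16]byte
	if _, err := rand.Read(raw[:]); err != nil {
		return wsErrorEvent{}, err
	}
	return wsErrorEvent{
		EventID: "evt_" + hex.EncodeToString(raw[:]),
		Type:    "error",
		Error: wsErrorBody{
			Type:    "forbidden_error",
			Code:    "policy_violation",
			Message: message,
		},
	}, nil
}

// JSON serializes the event for writing to the client before the close frame.
func (e wsErrorEvent) JSON() ([]byte, error) {
	return json.Marshal(e)
}
```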

Admin / Frontend
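- dto.SystemSettings / UpdateSettingsRequest gain the
  openai_fast_policy_settings field (omitempty), wired into bulk GET/PUT.
- The Gateway tab of the Settings page gains a Fast/Flex Policy form card:
  full-field configuration of service_tier × action × scope × model whitelist
  × fallback action.
- Frontend guard: the openaiFastPolicyLoaded flag permits write-back only when
  the GET actually returned the field, so a rollout/error cannot overwrite the
  default rules with empty ones; the saveSettings write-back loop skips this
  field, which is handled by dedicated refresh logic; error_message is sent
  only when action=block, matching the backend omitempty behavior.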

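Tests
- HTTP path: openai_fast_policy_test.go covers the default configuration
  (whitelist=[], priority filter for all models) / block with a custom error /
  scope distinction / filter deleting the field / block leaving the body
  unchanged / block short-circuiting upstream / Anthropic BetaFastMode
  triggering the OpenAI fast policy, among other scenarios.
- WebSocket path: openai_fast_policy_ws_test.go covers
    helper units (filter / fast→priority normalization / flex passthrough /
    block typed error / bytes unchanged without service_tier /
    non-response.create frames untouched / empty-type frames untouched /
    event_id+code field assertions / tolerance of non-string service_tier) +
    pass-path fast alias normalization regression +
    ingress end-to-end (upstream receives no service_tier after filter / after
    block the client receives the error event before close 1008 and upstream
    sees 0 writes) +
    passthrough capturedSessionModel fallback cases (under a whitelist policy:
    established by the first frame, a missing model hits the fallback, and the
    leak when no fallback exists is documented) +
    passthrough session.update / session.created rotating capturedSessionModel
    mid-session bypass regression +
    passthrough billing post-filter ServiceTier and idempotent filter regression.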

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:15:09 +08:00
Zven
3d4ca5e8d1 fix(openai): preserve current Codex compact payload fields 2026-04-28 10:55:29 +08:00
gaoren002
9fe02bba7e fix(openai): strip unsupported passthrough fields 2026-04-27 00:39:06 +00:00
gaoren002
615557ec20 fix(openai): avoid implicit image sticky sessions 2026-04-26 17:09:41 +00:00
Wesley Liddick
22b1277572
Merge pull request #1948 from hungryboy1025/fix/openai-account-test-responses-stream
fix(openai): tighten responses stream account tests
2026-04-25 20:31:07 +08:00