390 Commits

Author SHA1 Message Date
shaw
f7ac5e5931 fix(openai): preserve chat responses usage billing 2026-05-26 21:33:28 +08:00
DaydreamCoding
6b39b344d8 feat(quota): 用户 × 平台 USD 配额
为用户在 anthropic/openai/gemini/antigravity 四个平台上提供日/周/月
三个窗口的 USD 配额管控。配额语义:未设置=不限制,0=禁用,>0=美元上限。

两层模型:
- 配置层:系统默认配额,以及 email/linuxdo/oidc/wechat/github/google/
  dingtalk 七个鉴权来源的默认配额,存于 settings,以嵌套 JSON 整体读写
  (系统 1 个 key + 每个来源 1 个 key),整体替换语义。
- 运行时层:user_platform_quota 表按用户记录实际配额,与配置层解耦。

后端:新增 ent schema 与 140_user_platform_quotas.sql 迁移、repository
与 service 端口、计费链路集成、管理端与用户端读写接口。
前端:管理端设置页配额编辑、用户配额管理 Modal、用户 Dashboard 展示、
中英文案。

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 10:49:20 +08:00
wucm667
c4d7edba08 fix(apicompat): map developer role to system 2026-05-21 16:37:05 +08:00
wucm667
90b2b2a757 feat(usage): 用户 API Key 用量页支持按日明细 2026-05-20 15:48:38 +08:00
Wesley Liddick
7ec61eb2f5
Merge pull request #2606 from wucm667/fix/openai-responses-respect-force-chat-completions
fix(openai): /v1/responses 入口尊重 force_chat_completions 设置
2026-05-20 15:13:43 +08:00
shaw
878ad3b569 feat(openai-gateway): Codex OAuth 账号浏览器 UA 自动改写规避 Cloudflare
质询
2026-05-20 14:33:51 +08:00
wucm667
cae93ae137 fix(openai): /v1/responses respect force chat completions 2026-05-20 14:17:26 +08:00
shaw
3d22dd34d3 feat: add gemini-3.5-flash model support across backend and frontend 2026-05-20 09:28:46 +08:00
wucm667
276b5c7755 fix(apicompat): strip temperature/top_p for reasoning models in Responses conversion
gpt-5.x models served via the OpenAI Responses API reject requests that
include temperature or top_p with:
  {"detail":"Unsupported parameter: temperature"}

This caused ClaudeCode agent/subagent tool requests to fail with a 400
error when an OpenAI group had the Messages-format support enabled.

Root cause: AnthropicToResponses and ChatCompletionsToResponses were
unconditionally forwarding temperature and top_p from the incoming
request to the ResponsesRequest, even though all gpt-5.x reasoning
models reject these sampling parameters.

Fix:
- Add isReasoningModel(model string) bool helper that returns true for
  any model whose name starts with "gpt-5".
- Skip temperature and top_p when converting to ResponsesRequest for
  reasoning models. Non-reasoning models (e.g. gpt-4o) are unaffected.
- ResponsesRequest.Temperature and TopP are already *float64 with
  omitempty, so nil values are safely omitted from the JSON body.

Tests:
- TestAnthropicToResponses_TemperatureStrippedForReasoningModel
- TestAnthropicToResponses_TemperatureStrippedForAllGpt5Variants
- TestChatCompletionsToResponses_TemperatureStrippedForReasoningModel
- TestChatCompletionsToResponses_TemperaturePreservedForNonReasoningModel

Fixes #2487
2026-05-19 20:03:16 +08:00
Wesley Liddick
e65fb8b086
Merge pull request #2543 from L494264Tt/fix/deepseek-reasoning-content
fix: preserve DeepSeek reasoning_content in chat compatibility paths
2026-05-19 17:34:58 +08:00
L494264Tt
fe3283a1d5 fix: satisfy errcheck for reasoning content conversion 2026-05-19 17:17:39 +08:00
Wesley Liddick
6c8b6843fd
Merge pull request #2546 from nanobanana123/fix/anthropic-empty-thinking-sse
fix(apicompat): preserve empty streaming thinking blocks
2026-05-19 17:03:54 +08:00
L494264Tt
6082d02d22 Merge origin/main into fix/deepseek-reasoning-content 2026-05-19 17:00:57 +08:00
DaydreamCoding
664e9fdcd4 feat(usage): 用户用量按平台拆分 + UsersView 列设置可配置 + 用量列排序
后端
- BatchUserUsageStats / UserDashboardStats 新增 ByPlatform 字段
  复用 ops 路径 COALESCE(g.platform, a.platform) 语义,不冗余 DB 字段
- 抽出 usageLogEffectivePlatformExpr 常量供管理员与用户两路径共用
- GetBatchUsersUsage cacheKey 加 v=2 + 当日日期,修复跨午夜旧缓存兼容新字段

前端
- 新建 PlatformUsageBreakdown:管理员用量列 hover tooltip 展示各平台 today/total
- 新建 PlatformCostCell:单平台 today/total 紧凑单元格
- UsersView 列设置新增 Claude/OpenAI/Gemini/Antigravity 四个平台子列,默认隐藏可手动启用
- 普通用户 Dashboard 新增 Row 3 平台拆分卡片,受 isSimple 控制
- 平台之和 < 总值时显式展示"其他"行,避免数字对不齐
- last_active_at 从 FORCED_VISIBLE_COLUMNS 移除,允许用户隐藏并持久化
- 列设置加 schema 版本号 + 迁移机制,老用户升级时新增默认隐藏列自动应用
- UsersView 用量列(汇总 + 4 平台子列)加入前端单页排序:列头单按钮 + 弹出菜单
  切换"今日 / 近30天",三态循环 desc → asc → off;菜单底部备注"仅对本页数据排序"
- sortedUsers computed 在 server-side-sort 结果之上叠加本地排序,缺失值按 0 处理;
  usageSort 状态独立 localStorage 持久化,互不干扰后端 sort_by
- i18n 新增 admin.users.sortBy / sortCurrentPageOnly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 15:25:34 +08:00
Wesley Liddick
e365aae450
Merge pull request #2450 from wucm667/codex/issue-2431-responses-api-support
feat: 支持后台配置 OpenAI Responses API 路由
2026-05-19 14:47:10 +08:00
Wesley Liddick
23e95b77b7
Merge pull request #2528 from wucm667/fix/openai-responses-null-content
fix(openai): 修复 chat-completions 转 responses 时 content 为 null 导致上游 400
2026-05-19 14:43:55 +08:00
Wesley Liddick
03473d3ee8
Merge pull request #2554 from Arron196/feature/sync-upstream-models-pr
feat: 支持从上游同步账号可用模型列表
2026-05-19 14:42:47 +08:00
benjamin
b9ecf25207 fix: harden Antigravity model list requests
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-18 19:01:23 +08:00
nanobabanan
e9a25e7b92 fix(apicompat): preserve empty streaming thinking blocks
Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
2026-05-18 15:10:35 +08:00
L494264Tt
1d47fd6300 This preserves DeepSeek reasoning_content across chat compatibility paths.
DeepSeek thinking-mode tool-call conversations may require the assistant
  reasoning_content from previous turns to be sent back in later requests. Without
  preserving it, those conversations can fail or lose reasoning context.

  Changes:
  - Preserve assistant reasoning_content when converting Chat Completions messages
    to Responses input by wrapping it as a thinking block.
  - Add regression coverage for non-streaming DeepSeek responses.
  - Add regression coverage for streaming DeepSeek deltas.
  - Add regression coverage that request-side messages[].reasoning_content is
    passed through with tool calls.

  Tests:
  go test -tags=unit ./internal/pkg/apicompat ./internal/service -run
  'TestChatCompletionsToResponses_AssistantReasoningContentPreserved|
  TestChatCompletionsToResponses_AssistantThinkingTagPreserved|
  TestForwardAsRawChatCompletions_PreservesDeepSeekReasoningContent|
  TestForwardAsRawChatCompletions_ForcesStreamUsageUpstreamAndPassesUsageDownstream
2026-05-18 11:22:23 +08:00
wucm667
df82a3bc69 fix(openai): avoid null content when converting chat-completions to responses
When a chat-completions message has no usable content parts (empty array,
empty text part, or filtered-out image part), marshalChatInputContent
marshalled a nil slice to JSON null. The upstream Responses API rejects a
null content field with HTTP 400. Fall back to an empty string instead.

Fixes #2515
2026-05-17 11:20:05 +08:00
name
0393bd7c82 Fix OpenAI compat usage parsing 2026-05-16 03:03:43 +08:00
wucm667
862819042c feat(openai): 支持后台配置 Responses API 路由 2026-05-14 11:46:24 +08:00
shaw
a07a0dac63 feat: add configurable Antigravity user agent version 2026-05-11 22:25:20 +08:00
shaw
297b54d066 fix: 完善工具名改写测试和格式 2026-05-11 17:27:04 +08:00
shaw
57fd7998d3 fix(gateway): stop default redact thinking beta injection 2026-05-07 18:56:11 +08:00
lyen1688
0584305e5a feat: improve OpenAI messages compatibility for Claude Code 2026-05-05 19:36:33 +08:00
Wesley Liddick
dc09b367dc
Merge pull request #2143 from alfadb/fix/openai-apikey-cc-default-routing
修复:APIKey 账户上游不支持 OpenAI Responses API 时的 Chat Completions 路由回退
2026-05-03 22:58:26 +08:00
shaw
72d5ee4cd1 fix: drain OpenAI compat streams for usage 2026-05-03 17:11:27 +08:00
alfadb-bot
4e4cc80971 fix(openai-gateway): route APIKey accounts to /v1/chat/completions when upstream lacks Responses API
OpenAI APIKey accounts with base_url pointing to third-party OpenAI-compatible
upstreams (DeepSeek, Kimi, GLM, Qwen, etc.) were failing because the gateway
unconditionally converted Chat Completions requests to Responses format and
forwarded to {base_url}/v1/responses, which only exists on OpenAI's official
endpoint.

Detection-based routing:
- Probe upstream capability on account create/update via a minimal POST to
  /v1/responses; HTTP 404/405 means 'unsupported', any other response means
  'supported'.
- Persist result as accounts.extra.openai_responses_supported (bool).
- ForwardAsChatCompletions branches at function entry: APIKey accounts with
  explicit support=false go through new forwardAsRawChatCompletions which
  passthrough-forwards CC body to /v1/chat/completions without protocol
  conversion.

Default behavior for accounts without the marker preserves the legacy
'always Responses' path — existing OpenAI APIKey accounts that were working
before this change continue to work without modification (the 'reality is
evidence' principle: an account that has been running implies upstream
capability).

Probe is fired async after Create / Update / BatchCreate; failures only log,
never block the admin flow. BulkUpdate omitted (low signal of base_url
changes; can be added if needed).

Implementation:
- New pkg internal/pkg/openai_compat: marker key + ShouldUseResponsesAPI
- New service file openai_apikey_responses_probe.go: probe + persist
- New service file openai_gateway_chat_completions_raw.go: CC pass-through
- Account test endpoint short-circuits with explicit message for
  probed-unsupported accounts (full CC test path is a TODO)

Zero schema changes, zero migrations, zero frontend changes, zero wire
modifications — all wired through existing AccountTestService injection.

Closes: DeepSeek-OpenAI account (id=128) production failure
2026-04-30 19:25:45 +08:00
shaw
40feb86ba4 fix(httputil): add decompression bomb guard and fix errcheck lint 2026-04-29 22:11:45 +08:00
Wesley Liddick
f972a2faf2
Merge pull request #1990 from haha1903/feat/zstd-request-decompression
feat(httputil): decode zstd/gzip/deflate request bodies
2026-04-29 22:08:28 +08:00
ivanvolt
04b2866f65 fix: use Responses-compatible function tool_choice format 2026-04-28 16:26:09 +08:00
Cloud370
3022090365 fix(anthropic): drop empty Read.pages in responses-to-anthropic tool input 2026-04-26 20:21:38 +08:00
Hai Chang
798fd673e9 feat(httputil): decode compressed request bodies (zstd/gzip/deflate)
Codex CLI 0.125+ defaults to sending request bodies with
Content-Encoding: zstd. Without server-side decompression the gateway
returns 'Failed to parse request body' on /v1/responses (and any other
JSON endpoint) because gjson sees raw zstd bytes.

ReadRequestBodyWithPrealloc now inspects Content-Encoding and
transparently decodes zstd, gzip/x-gzip, and deflate bodies before
returning them, then strips the encoding headers and updates
ContentLength so downstream code can reuse the bytes safely.
Unsupported encodings produce a clear error.

Adds unit tests covering identity, zstd, gzip, deflate, unsupported
encoding, corrupt zstd payloads, nil bodies, and explicit identity.
2026-04-26 20:52:45 +10:00
deqiying
b17704d6ef fix(anthropic): 修正缓存 token 的 Anthropic 用量语义 2026-04-26 01:14:59 +08:00
Wesley Liddick
1afd81b019
Merge pull request #1920 from Wuxie233/fix/responses-web-search-tool-types
fix(apicompat): recognize web_search_20250305 / google_search in Responses→Anthropic tool conversion
2026-04-25 09:00:37 +08:00
Wuxie233
5f630fbb19 fix(apicompat): recognize web_search_20250305 / google_search in Responses to Anthropic tool conversion 2026-04-25 01:09:51 +08:00
keh4l
5862e2d8d9 feat(gateway): add billing attribution block with cc_version fingerprint
Real Claude Code CLI always sends a 2-block system array:

  [0] {"type":"text", "text":"x-anthropic-billing-header: cc_version=X.Y.Z.{fp}; cc_entrypoint=cli; cch=00000;"}
  [1] {"type":"text", "text":"You are Claude Code...", "cache_control":{...}}

Before this commit, sub2api's mimicry path only produced block [1].
The missing billing block is one of the primary third-party detection
signals Anthropic uses for Claude-Code-scoped OAuth tokens.

New file gateway_billing_block.go ports the fingerprint algorithm
(byte-for-byte from Parrot cc_mimicry.py:compute_fingerprint):
pick chars at positions [4,7,20] of the first user text, then
`sha256(SALT + chars + cc_version)[:3]`.

  - claude/constants.go: CLICurrentVersion = "2.1.92" (must match UA)
  - gateway_billing_block.go: computeClaudeCodeFingerprint +
    buildBillingAttributionBlockJSON + extractFirstUserText
  - gateway_service.go: rewriteSystemForNonClaudeCode now emits both
    blocks in order; cch=00000 is filled in later by
    signBillingHeaderCCH in buildUpstreamRequest.

Downstream compat note: syncBillingHeaderVersion's regex
`cc_version=\d+\.\d+\.\d+` only matches the semver triple,
leaving the `.{fp}` suffix intact when rewriting in buildUpstreamRequest.
2026-04-24 23:16:32 +08:00
keh4l
66d6454535 feat(claude): add ttl to cache_control with default 5m
Real Claude CLI traffic sends cache_control as
`{"type":"ephemeral","ttl":"1h"}`. Our previous payload only
sent `{"type":"ephemeral"}`, which is a bytewise mismatch with
the official CLI and one more third-party detection signal.

Policy: client-provided ttl is always passed through unchanged.
Proxy-generated cache_control blocks default to 5m (vs Parrot's 1h)
to avoid burning the 1h cache budget on automatic breakpoints while
still aligning with the `ttl` field being present.

  - claude/constants.go: DefaultCacheControlTTL = "5m"
  - apicompat/types.go: new AnthropicCacheControl type with TTL field;
    AnthropicTool gains optional CacheControl pointer so the mimicry
    path can attach a cache breakpoint to tools[-1] later.
  - service/gateway_service.go: anthropicCacheControlPayload gains TTL;
    marshalAnthropicSystemTextBlock and rewriteSystemForNonClaudeCode
    emit ttl=5m by default.
2026-04-24 23:16:32 +08:00
keh4l
b5467d610a fix(gateway): apply full Claude Code mimicry on /chat/completions and /responses
Before: the OpenAI-compat forwarders only called injectClaudeCodePrompt,
which prepends the Claude Code banner but leaves the rest of the body
in its original non-Claude-Code shape. The codebase already admits this
is insufficient (see the comment on rewriteSystemForNonClaudeCode in
gateway_service.go: "仅前置追加 Claude Code 提示词无法通过检测").

Effect: OAuth accounts served through /v1/chat/completions or /v1/responses
were detected as third-party apps and bled plan quota with:

    Third-party apps now draw from your extra usage, not your plan limits.

Fix:
  - apicompat.AnthropicRequest: add Metadata json.RawMessage so metadata
    survives the OpenAI->Anthropic->Marshal round trip; without it the
    downstream rewrite has no user_id to work with.
  - service: extract applyClaudeCodeOAuthMimicryToBody, a ParsedRequest-free
    variant of the /v1/messages mimicry pipeline
    (rewriteSystemForNonClaudeCode + normalizeClaudeOAuthRequestBody +
    metadata.user_id injection) so the OpenAI-compat forwarders can reuse it.
  - service: add buildOAuthMetadataUserIDFromBody + hashBodyForSessionSeed
    for the same reason (no ParsedRequest at the call site).
  - ForwardAsChatCompletions / ForwardAsResponses: replace the 3-line
    prompt-prepend with the full mimicry pipeline.
  - applyClaudeCodeMimicHeaders: set x-client-request-id per-request
    (real Claude CLI always does); missing/duplicated values are one more
    third-party fingerprint signal.

No change to the native /v1/messages path: it already called the full
pipeline, we only lift those helpers into a reusable function.

Tests:
  - go build ./... passes
  - go test ./internal/service/... ./internal/pkg/apicompat/... passes
  - lsp_diagnostics clean on all touched files
  - pre-existing failures in internal/config are unrelated (env-sensitive
    tests that also fail on upstream main)
2026-04-24 23:16:32 +08:00
keh4l
57ff97960d chore(claude): bump mimicked CLI to 2.1.92 and extend anthropic-beta list
Align Claude Code mimicry constants with the latest real CLI traffic
(see Parrot's src/transform/cc_mimicry.py). Anthropic now uses the full
set of anthropic-beta tokens to decide whether a request counts as
"official Claude Code"; requests missing tokens that real CLI ships
today are demoted to third-party usage:

  Third-party apps now draw from your extra usage, not your plan limits.

Changes:
  - claude/constants.go: add new beta tokens (prompt-caching-scope,
    effort, redact-thinking, context-management, extended-cache-ttl) and
    expose FullClaudeCodeMimicryBetas() for the OAuth mimicry path.
  - claude/constants.go: bump default User-Agent to claude-cli/2.1.92.
  - identity_service.go: bump defaultFingerprint User-Agent accordingly.

No behavioral change for clients that already send a newer UA (fingerprint
merge still prefers the incoming value).
2026-04-24 23:16:32 +08:00
shaw
a4e329c18b fix: openai默认模型新增gpt5.5 2026-04-24 09:08:31 +08:00
shaw
4d0483f5b8 feat: 补充gpt生图模型测试功能 2026-04-22 18:12:03 +08:00
lucas morgan
c548021921 feat(openai): 同步生图 API 支持并接入图片计费调度
- 同步 OpenAI 图片生成与编辑接口
- 接入图片请求解析、账号调度、转发与用量记录
- 接入图片计费与图片用量落库
- 限制 OAuth 生图仅支持无显式模型和尺寸的基础请求
2026-04-22 12:30:08 +08:00
erio
bbc4aed3d9 fix(openai): 移除已下线 Codex 模型并修复归一化兜底副作用
- backend: 删除 gpt-5 / 5.1 / 5.1-codex / 5.1-codex-max / 5.1-codex-mini / 5.2-codex / 5.4-nano 的内置映射与 DefaultModels 条目
- backend: normalizeCodexModel 默认兜底由 gpt-5.1 改为 gpt-5.4,gpt-5.3-codex-spark 独立保留映射
- backend: 修复 isOpenAIGPT54Model 与 shouldAutoInjectPromptCacheKeyForCompat 对 claude / gpt-4o 的误判(之前依赖 gpt-5.1 作为非 GPT 族的隐式 sentinel,改后需要显式前缀守卫)
- backend: 清理 billing_service 中已不可达的 fallback 价格与 switch 分支
- frontend: 从白名单、OpenCode 配置、预设映射中移除已下线模型
- 同步更新所有相关单测

Refs: #1758, parallels upstream #1759 but adds downstream guard fixes
2026-04-20 22:01:41 +08:00
shaw
a789c8c4c7 feat: 支持opus-4.7 2026-04-17 09:37:25 +08:00
erio
db27e8f000 feat(usage): add account cost to breakdown sub-table and admin usage log
- UserBreakdownItem: add AccountCost field + SQL aggregation
- UserBreakdownSubTable: add orange account cost column
- Admin usage table: add account_cost column (after cost, default visible)
- Column settings: add account_cost toggle option
2026-04-15 15:40:40 +08:00
erio
22680dc602 test(usage): add unit tests for account_cost and fix gofmt
- Fix mock for GetModelStatsWithFilters: add account_cost column
- Add assertion: GetStatsWithFilters always returns TotalAccountCost
- New test: GetModelStatsAccountCostColumn verifies scan of AccountCost
- New test: GetGroupStatsAccountCostColumn verifies scan of AccountCost
- New test: GetStatsWithFiltersAlwaysReturnsAccountCost (no AccountID filter)
- Integration test: add TotalAccountCost/TodayAccountCost assertions
- Fix gofmt alignment in usage_log_types.go
2026-04-15 15:02:21 +08:00
erio
6ade6d30a8 feat(usage): add account cost display to admin dashboard and usage pages
- Add account_cost column to dashboard aggregation tables (migration 107)
- DashboardStats: add TotalAccountCost/TodayAccountCost fields
- ModelStat/GroupStat: add AccountCost field with SQL aggregation
- GetStatsWithFilters: always return TotalAccountCost (remove accountID filter)
- Dashboard Token cards: show user(green)/cost(orange)/standard(gray)
- Usage stats card: show account cost and standard below main value
- Model/Group distribution tables: add orange cost column
2026-04-15 15:02:21 +08:00