Follow-up to #2816 (already merged): the long-context pricing omission
that was fixed there for cache_read also affects all three
cache_creation price fields (standard, 5m ephemeral, 1h ephemeral).
computeCacheCreationCost reads these prices directly from pricing and
never sees the LongContextInputMultiplier that computeTokenBreakdown
applies to inputPrice / outputPrice / cacheReadPrice.
For GPT-5.4 / 5.5 above the 272k threshold, this causes the cache_write
portion of long sessions to be billed at roughly half what it should
be (default multiplier 2.0). Cache writes are conceptually input-side
operations and should share the same long-context treatment as input /
cache_read.
This patch threads an explicit multiplier into computeCacheCreationCost
so the function can be unit-tested in isolation and matches the existing
pattern used for cache_read. computeTokenBreakdown makes the
long-context decision once and passes LongContextInputMultiplier when
it applies, or 1.0 otherwise.
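A minimal sketch of the approach described above, not the actual
implementation. The Pricing and CacheCreationTokens types, their field
names, the longContextMultiplier helper, and the choice of token count
compared against the threshold are all assumptions made for
illustration; only computeCacheCreationCost and
LongContextInputMultiplier are names taken from this change.

```go
package main

import "fmt"

// Pricing is a hypothetical, simplified stand-in for the real pricing data.
type Pricing struct {
	CacheCreationPricePerToken   float64 // standard cache write
	CacheCreation5mPricePerToken float64 // 5m ephemeral cache write
	CacheCreation1hPricePerToken float64 // 1h ephemeral cache write
	LongContextThreshold         int     // assumed field name, e.g. 272000
	LongContextInputMultiplier   float64 // e.g. 2.0
}

// CacheCreationTokens is a hypothetical grouping of the three token counts.
type CacheCreationTokens struct {
	Standard    int
	Ephemeral5m int
	Ephemeral1h int
}

// longContextMultiplier decides once whether long-context pricing applies
// and returns the multiplier to use, or 1.0 when it does not apply.
func longContextMultiplier(p Pricing, contextTokens int) float64 {
	if p.LongContextThreshold > 0 && p.LongContextInputMultiplier > 0 &&
		contextTokens > p.LongContextThreshold {
		return p.LongContextInputMultiplier
	}
	return 1.0
}

// computeCacheCreationCost takes the multiplier explicitly, so it can be
// tested in isolation without knowing how the decision was made.
func computeCacheCreationCost(p Pricing, toks CacheCreationTokens, multiplier float64) float64 {
	base := float64(toks.Standard)*p.CacheCreationPricePerToken +
		float64(toks.Ephemeral5m)*p.CacheCreation5mPricePerToken +
		float64(toks.Ephemeral1h)*p.CacheCreation1hPricePerToken
	return base * multiplier
}

// approxEqual compares floats with a small absolute tolerance.
func approxEqual(a, b float64) bool {
	d := a - b
	if d < 0 {
		d = -d
	}
	return d < 1e-9
}

func main() {
	// Illustrative prices only, not real model pricing.
	p := Pricing{
		CacheCreation5mPricePerToken: 1e-6,
		CacheCreation1hPricePerToken: 2e-6,
		LongContextThreshold:         272000,
		LongContextInputMultiplier:   2.0,
	}
	toks := CacheCreationTokens{Ephemeral5m: 100000, Ephemeral1h: 100000}

	below := computeCacheCreationCost(p, toks, longContextMultiplier(p, 100000))
	above := computeCacheCreationCost(p, toks, longContextMultiplier(p, 300000))
	fmt.Printf("below threshold: %.2f, above threshold: %.2f\n", below, above)
}
```

Passing the multiplier as a parameter keeps the threshold decision in
one place and lets the cost function be exercised with any multiplier.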
Adds three regression tests mirroring the #2816 cache_read tests:
- positive: long-context triggered -> cache_creation scaled by 2.0x
- negative: below threshold -> cache_creation stays at base price
- breakdown: 5m + 1h ephemeral prices both scaled when applicable
Refs #2816
Co-authored-by: Cursor <cursoragent@cursor.com>
When session long-context pricing is triggered in computeTokenBreakdown
(e.g. GPT-5.4 / GPT-5.5 above the 272k token threshold), the multiplier
was applied only to InputPricePerToken and OutputPricePerToken.
The cache_read price was left at its base value, so CacheReadCost was
silently undercharged whenever a long-context session also had cache
hits, which is essentially every long Codex / Claude Code session.
Concretely, for gpt-5.4 with 300k cache_read tokens, the cache portion
was billed at the base price instead of the base price times
LongContextInputMultiplier (0.075 instead of 0.150 in the regression
test).
Cache reads are conceptually input-side replays, so they should scale
with LongContextInputMultiplier, matching the treatment of
InputPricePerToken.
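The arithmetic behind the 0.075 vs 0.150 figures can be sketched as
follows. The per-token price of 0.25e-6 is back-derived from
0.075 / 300k and is an assumption, as is the cacheReadCost helper; the
real code computes this inside computeTokenBreakdown.

```go
package main

import "fmt"

// cacheReadCost is a hypothetical helper: tokens x base price x multiplier.
func cacheReadCost(tokens int, basePricePerToken, multiplier float64) float64 {
	return float64(tokens) * basePricePerToken * multiplier
}

// approxEqual compares floats with a small absolute tolerance.
func approxEqual(a, b float64) bool {
	d := a - b
	if d < 0 {
		d = -d
	}
	return d < 1e-9
}

func main() {
	const tokens = 300_000
	const basePrice = 0.25e-6 // assumed: implied by 0.075 / 300k tokens

	// Before the fix the multiplier was effectively 1.0 for cache reads.
	fmt.Printf("before fix: %.3f\n", cacheReadCost(tokens, basePrice, 1.0)) // 0.075
	// After the fix the default LongContextInputMultiplier of 2.0 applies.
	fmt.Printf("after fix:  %.3f\n", cacheReadCost(tokens, basePrice, 2.0)) // 0.150
}
```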
Adds two regression tests:
- positive: long-context triggered -> cache_read scaled by 2.0x
- negative: below threshold -> cache_read stays at base price
Fixes #2293
Co-authored-by: Cursor <cursoragent@cursor.com>
- applyRequestTierOverrides now uses filterValidIntervals consistently
with applyTokenOverrides (per_request/image modes were not filtering)
- CostInput accepts optional pre-resolved pricing via Resolved field,
eliminating duplicate Resolver.Resolve() calls in gateway billing paths
- calculateTokenCost reduced from 80 to 15 lines
- calculateCostInternal reduced from 91 to 15 lines
- Shared logic in computeTokenBreakdown + computeCacheCreationCost
- Unified rateMultiplier <= 0 protection in both paths
Instead of hardcoding BillingMode="image" when ImageCount>0,
let cost.BillingMode (set by CalculateCostUnified/CalculateImageCost)
take priority. This ensures channel token pricing shows "token" mode.
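One way to express that priority rule, as a sketch only. The
resolveBillingMode helper and the "token" fallback are assumptions;
the message itself states only that cost.BillingMode should win over
the ImageCount-based default.

```go
package main

import "fmt"

// resolveBillingMode is a hypothetical helper. It prefers the mode already
// set on the cost result and falls back to the image-count heuristic only
// when no mode was set.
func resolveBillingMode(costBillingMode string, imageCount int) string {
	if costBillingMode != "" {
		return costBillingMode
	}
	if imageCount > 0 {
		return "image"
	}
	return "token" // assumed default when nothing else applies
}

func main() {
	// Channel token pricing on an image request keeps "token" mode.
	fmt.Println(resolveBillingMode("token", 2))
	// With no explicit mode, the old image-count default still applies.
	fmt.Println(resolveBillingMode("", 2))
}
```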
- Parse candidatesTokensDetails from Gemini API to separate image/text output tokens
- Add image_output_tokens and image_output_cost to usage_log (migration 089)
- Support per-image-token pricing via output_cost_per_image_token from model pricing data
- Channel pricing ImageOutputPrice override works in token billing mode
- Auto-fill image_output_price in channel pricing form from model defaults
- Add "channel_mapped" billing model source as new default (migration 088)
- Bills by model name after channel mapping, before account mapping
- Fix sign error in channel cache error TTL (115s → 5s)
- Fix channel Update invalidating only the new groups, not the removed groups
- Fix frontend sending undefined instead of {} when clearing model_mapping
- Credits balance precheck via shared AccountUsageService cache before injection
- Skip credits injection for accounts with insufficient balance
- Don't mark credits exhausted for "exhausted your capacity on this model" 429s
Previously, 2K images used the same base price as 1K ($0.134).
Now 2K uses a 1.5x multiplier ($0.201), consistent with 4K using 2x ($0.268).
- Backend: add 2K size branch in getDefaultImagePrice
- Frontend: update 2K placeholder from 0.134 to 0.201
- Tests: update assertions for new 2K default price
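The size tiers above can be sketched like this. The signature and the
"1K" / "2K" / "4K" size strings are assumptions; only the function
name getDefaultImagePrice and the three prices come from this change.

```go
package main

import "fmt"

// getDefaultImagePrice is a simplified sketch with an assumed signature.
// 1K is the base price; 2K and 4K apply 1.5x and 2x multipliers.
func getDefaultImagePrice(size string) float64 {
	const base = 0.134
	switch size {
	case "2K":
		return base * 1.5
	case "4K":
		return base * 2
	default:
		return base
	}
}

// approxEqual compares floats with a small absolute tolerance.
func approxEqual(a, b float64) bool {
	d := a - b
	if d < 0 {
		d = -d
	}
	return d < 1e-9
}

func main() {
	for _, size := range []string{"1K", "2K", "4K"} {
		fmt.Printf("%s: $%.3f\n", size, getDefaultImagePrice(size))
	}
}
```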