- gate previous_response_id sticky path with quota auto-pause check at both
the snapshot and DB-recheck stages (previously bypassed, #1)
- skip pausing when the usage window already reset to avoid a stale stuck-pause;
carry codex_*_reset_at / reset_after_seconds / codex_usage_updated_at through
the scheduler snapshot whitelist (#2)
- remove the incomplete limit mode; percentage threshold only (#3)
- add global default 5h/7d threshold inputs to the Ops settings dialog with
validation and en/zh i18n (#4)
- downgrade account_auto_paused_by_quota log from Info to Debug; it fires
per-candidate on the scheduling hot path (#5)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pool mode currently retries the same account for a fixed set of
upstream HTTP statuses: 401, 403, 429. Some upstream pool deployments
also need same-account retry for transient provider/proxy statuses
such as 502, 503, 520, 529, but hard-coding more statuses changes
behavior for everyone.
Add a per-account credentials option `pool_mode_retry_status_codes`
that lets admins choose which upstream HTTP status codes trigger
same-account retry in pool mode:
- Unset (default): preserve the current 401/403/429 default
- Explicit list: override the defaults with the configured codes
- Codes normalized to the 100-599 range, deduplicated, sorted
The standalone `isPoolModeRetryableStatus` helper is kept as the
default-only fallback. All 15 gateway call sites switch to the new
`Account.IsPoolModeRetryableStatus` method so behavior is preserved
for accounts that do not configure the new field.
Frontend admin UI gains a "Retry Status Codes" comma-separated input
under the pool-mode section in both Create/Edit account modals
(en + zh i18n).
Fixes#2731
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a low-visibility proxy IP resource link near proxy-related controls.
- Show the link beside account proxy selectors
- Show the link in the create proxy dialog tab row
- Keep the entry inline to avoid interrupting form workflows
Cache Hit Rate was calculated as cache_read / (cache_read + cache_creation),
which always yields 100% for OpenAI models since cache_creation is never
reported by the OpenAI API. The denominator should include all prompt tokens
(input_tokens + cache_read_tokens + cache_creation_tokens) so the rate
reflects the actual percentage of input tokens served from cache.
Fixes#2291
Add channel-level Bedrock CC compatibility toggle (similar to web_search_emulation)
that fixes 4 types of Bedrock 400 errors seen with Claude Code:
1. thinking.type "enabled" → "adaptive" for Opus 4.7+ (only supports adaptive)
2. Add default budget_tokens when missing for older models
3. Replace illegal characters in tool_use IDs to match Bedrock's ^[a-zA-Z0-9_-]+$ pattern
4. anthropic_version / invalid beta flag (already handled elsewhere)
Transformations run in Forward() before any forwarding path, so both native Bedrock
accounts and apikey passthrough accounts pointing to Bedrock relays benefit.
Includes channel-level toggle UI and unit tests.