6 Commits

Author SHA1 Message Date
win
fdd2d08a4d feat: merge feat/omniroute-ideas — P2C scheduler, quota scoring, tier fallback 2026-04-29 15:42:37 +08:00
win
5c8c15cdb1 feat(refresh,repo): add singleflight to dedupe concurrent token refresh and unschedulable writes
Two anti-thundering-herd improvements:

1. OAuthRefreshAPI.RefreshIfNeeded
   Wrap the existing distributed-lock + DB-reread + executor.Refresh
   pipeline in a per-process singleflight keyed by cacheKey+window.
   Without this, N concurrent goroutines on the same account each pay
   one Redis lock RTT and one DB reread; with it, only the leader pays
   and the rest share the result.

   The refreshWindow is part of the key so a long background-refresh
   window cannot starve a short foreground-refresh window.

2. accountRepository.SetTempUnschedulable
   Wrap the same path (UPDATE + scheduler outbox enqueue + scheduler
   cache sync) in a per-process singleflight keyed by id+until+reason.
   The SQL guard (existing < new) already makes the UPDATE idempotent,
   but N callers still cost N round-trips and N outbox inserts. With
   singleflight, an upstream 401 burst that hits the same account
   collapses to one execution.

Tests cover dedup behavior, key separation by account / refresh window,
and that the SQL exec count drops from N to <=2 (UPDATE + outbox).
2026-04-29 00:43:23 +08:00
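
The dedup behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `flightGroup` is a stdlib-only stand-in for `golang.org/x/sync/singleflight`, and `refreshKey`, `burst`, `Waiters`, and the `oauth:account:42` cache key are hypothetical names introduced here. The key combines the cache key with the refresh window, as the message describes, so calls with different windows never share a flight.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call is one in-flight execution that followers wait on.
type call struct {
	done    chan struct{}
	val     string
	err     error
	waiters int
}

// flightGroup is a minimal stand-in for singleflight.Group (assumption:
// the real code uses golang.org/x/sync/singleflight instead).
type flightGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn at most once per key at a time; concurrent callers with the
// same key block and receive the leader's result with shared == true.
func (g *flightGroup) Do(key string, fn func() (string, error)) (val string, shared bool, err error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		<-c.done
		return c.val, true, c.err
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	close(c.done)
	return c.val, false, c.err
}

// Waiters reports how many followers are parked on key (used by the demo only).
func (g *flightGroup) Waiters(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

// refreshKey folds the refresh window into the key so that a long
// background window and a short foreground window never share a flight.
func refreshKey(cacheKey string, window time.Duration) string {
	return cacheKey + "|" + window.String()
}

// burst fires n concurrent callers at one key and returns how many times
// the (simulated) refresh actually ran.
func burst(g *flightGroup, key string, n int) int64 {
	var execs int64
	release := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _, _ = g.Do(key, func() (string, error) {
				atomic.AddInt64(&execs, 1)
				<-release // hold the flight open until every follower has joined
				return "new-access-token", nil
			})
		}()
	}
	for g.Waiters(key) < n-1 {
		time.Sleep(time.Millisecond)
	}
	close(release)
	wg.Wait()
	return atomic.LoadInt64(&execs)
}

func main() {
	g := &flightGroup{}
	key := refreshKey("oauth:account:42", 5*time.Minute)
	fmt.Println("executions for 50 callers:", burst(g, key, 50))
	fmt.Println("windows get distinct keys:",
		key != refreshKey("oauth:account:42", 30*time.Minute))
}
```

The same shape would apply to the `SetTempUnschedulable` path, with a key built from id, until, and reason instead of cache key and window.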
haruka
49e99e9d51 fix: resolve errcheck lint for sync.Map type assertion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:44:15 +08:00
haruka
ad2cd97618 fix: resolve refresh token race condition causing false invalid_grant errors
When multiple goroutines/workers concurrently refresh the same OAuth token,
the first succeeds but invalidates the old refresh_token (rotation). Subsequent
attempts using the stale token get invalid_grant, which was incorrectly treated
as non-retryable, permanently marking the account as ERROR.

Three complementary fixes:
1. Race-aware recovery: after invalid_grant, re-read DB to check if another
   worker already refreshed (refresh_token changed) — return success instead
   of error
2. In-process mutex (sync.Map of per-account locks): prevents concurrent
   refreshes within the same process, complementing the Redis distributed lock
3. Increase default lock TTL from 30s to 60s to reduce TTL-expiry races

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:23:38 +08:00
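
Fixes 1 and 2 from this commit might look roughly like the sketch below. It is illustrative only: `store`, `lockFor`, `refresh`, and `newUpstream` are hypothetical names, the in-memory map stands in for the accounts table, and the Redis distributed lock is left out. The checked type assertion on the `sync.Map` value is the kind of pattern the later errcheck fix refers to.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errInvalidGrant = errors.New("invalid_grant")

// store is a hypothetical in-memory stand-in for the accounts table.
type store struct {
	mu     sync.Mutex
	tokens map[int64]string // account ID -> current refresh_token
}

func (s *store) refreshToken(id int64) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.tokens[id]
}

func (s *store) setRefreshToken(id int64, tok string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.tokens[id] = tok
}

// accountLocks holds one *sync.Mutex per account ID.
var accountLocks sync.Map

// lockFor returns the per-account mutex, creating it on first use.
func lockFor(id int64) *sync.Mutex {
	v, _ := accountLocks.LoadOrStore(id, &sync.Mutex{})
	mu, ok := v.(*sync.Mutex)
	if !ok {
		// Checked assertion; unreachable because only *sync.Mutex is stored.
		panic("accountLocks: unexpected value type")
	}
	return mu
}

// refresh exchanges the refresh_token the caller read earlier. On
// invalid_grant it re-reads the store: if the stored token differs from the
// one that was used, another worker already rotated it, so this is a lost
// race and not a real failure.
func refresh(s *store, id int64, used string, exchange func(string) (string, error)) error {
	mu := lockFor(id)
	mu.Lock()
	defer mu.Unlock()

	next, err := exchange(used)
	if err == nil {
		s.setRefreshToken(id, next)
		return nil
	}
	if errors.Is(err, errInvalidGrant) && s.refreshToken(id) != used {
		return nil // someone else refreshed first; the account is healthy
	}
	return err
}

// newUpstream simulates a provider that rotates refresh tokens: only the
// current token is accepted, and each success invalidates it.
// Not safe for concurrent use; the demo calls it sequentially.
func newUpstream(valid string) func(string) (string, error) {
	gen := 1
	return func(tok string) (string, error) {
		if tok != valid {
			return "", errInvalidGrant
		}
		gen++
		valid = fmt.Sprintf("rt-%d", gen)
		return valid, nil
	}
}

func main() {
	s := &store{tokens: map[int64]string{7: "rt-1"}}
	up := newUpstream("rt-1")

	// Two workers both read "rt-1" before either refreshed.
	fmt.Println("worker A:", refresh(s, 7, "rt-1", up))
	fmt.Println("worker B (stale token):", refresh(s, 7, "rt-1", up))

	// A token the provider rejects while the DB is unchanged is a real error.
	fmt.Println("revoked:", refresh(s, 7, s.refreshToken(7), newUpstream("other")))
}
```

In this sketch worker B's `invalid_grant` is swallowed only because the stored token changed underneath it; when the stored token still equals the one that was rejected, the error propagates and the account can be marked accordingly.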
Wang Lvyuan
ad7c10727a fix(account): preserve runtime state during credentials-only updates 2026-03-23 03:49:28 +08:00
erio
1fc9dd7b68 feat: unified OAuth token refresh API with distributed locking
Introduce OAuthRefreshAPI as the single entry point for all OAuth token
refresh operations, eliminating the race condition where background
refresh and inline refresh could simultaneously use the same
refresh_token (fixes #1035).

Key changes:
- Add OAuthRefreshExecutor interface extending TokenRefresher with CacheKey
- Add OAuthRefreshAPI.RefreshIfNeeded with lock → DB re-read → double-check flow
- Add ProviderRefreshPolicy / BackgroundRefreshPolicy strategy types
- Simplify all 4 TokenProviders to delegate to OAuthRefreshAPI
- Rewrite TokenRefreshService.refreshWithRetry to use unified API path
- Add MergeCredentials and BuildClaudeAccountCredentials helpers
- Add 40 unit tests covering all new and modified code paths
2026-03-16 01:31:54 +08:00
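
The lock → DB re-read → double-check flow named in this commit can be sketched as below, under stated assumptions: `memLock` is an in-memory stand-in for the distributed lock, `repo` stands in for the database, and `refreshIfNeeded`, `needsRefresh`, `demo`, and `errLockHeld` are hypothetical names rather than the actual `OAuthRefreshAPI` surface.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var errLockHeld = errors.New("refresh lock held by another worker")

type account struct {
	AccessToken string
	ExpiresAt   time.Time
}

// memLock is an in-memory stand-in for the distributed lock.
type memLock struct {
	mu   sync.Mutex
	held map[string]bool
}

func (l *memLock) TryLock(key string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.held[key] {
		return false
	}
	l.held[key] = true
	return true
}

func (l *memLock) Unlock(key string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.held, key)
}

// repo is an in-memory stand-in for the accounts table.
type repo struct {
	mu   sync.Mutex
	rows map[string]account
}

func (r *repo) get(key string) account {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.rows[key]
}

func (r *repo) put(key string, a account) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.rows[key] = a
}

// needsRefresh reports whether the token expires within the window.
func needsRefresh(a account, now time.Time, window time.Duration) bool {
	return !a.ExpiresAt.After(now.Add(window))
}

// refreshIfNeeded follows lock -> re-read -> double-check -> refresh.
// The bool result reports whether this call performed the refresh itself.
func refreshIfNeeded(l *memLock, r *repo, key string, cached account,
	now time.Time, window time.Duration,
	exchange func() (account, error)) (account, bool, error) {
	if !needsRefresh(cached, now, window) {
		return cached, false, nil
	}
	if !l.TryLock(key) {
		return cached, false, errLockHeld
	}
	defer l.Unlock(key)

	fresh := r.get(key) // re-read: the caller's copy may be stale
	if !needsRefresh(fresh, now, window) {
		return fresh, false, nil // double-check: someone already refreshed
	}
	next, err := exchange()
	if err != nil {
		return fresh, false, err
	}
	r.put(key, next)
	return next, true, nil
}

// demo runs two callers that both start from the same stale copy.
func demo() (firstRefreshed, secondRefreshed bool, upstreamCalls int) {
	now := time.Date(2026, 3, 16, 0, 0, 0, 0, time.UTC)
	stale := account{AccessToken: "old", ExpiresAt: now.Add(time.Minute)}
	l := &memLock{held: map[string]bool{}}
	r := &repo{rows: map[string]account{"acct:1": stale}}
	exchange := func() (account, error) {
		upstreamCalls++
		return account{AccessToken: "new", ExpiresAt: now.Add(time.Hour)}, nil
	}
	_, firstRefreshed, _ = refreshIfNeeded(l, r, "acct:1", stale, now, 5*time.Minute, exchange)
	_, secondRefreshed, _ = refreshIfNeeded(l, r, "acct:1", stale, now, 5*time.Minute, exchange)
	return
}

func main() {
	first, second, calls := demo()
	fmt.Println("first caller refreshed:", first)
	fmt.Println("second caller refreshed:", second)
	fmt.Println("upstream calls:", calls)
}
```

The second caller still holds a stale in-memory copy, but the re-read under the lock shows the token is already fresh, so the refresh_token is used only once.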