- Reduce max retry attempts from 60 to 10 (exponential backoff prevents pile-up)
- Replace fixed 1s delays with exponential backoff: 1s, 2s, 4s, 8s, 16s, 32s
- Add ±10% jitter to prevent thundering herd effect
- Cap max wait at 32 seconds to avoid excessive delays
- Improves response time when API is temporarily unavailable
Before: ~60s worst case (60 * 1s fixed delays)
After: ~10s worst case (exponential backoff with cap)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>