Overload Retry
GitAuto handles HTTP 529 (overloaded) errors by retrying with exponential backoff. When the model's API returns an overload response, GitAuto waits progressively longer between attempts instead of failing immediately.
Why This Exists
HTTP 529 errors are transient - the model is temporarily busy serving other requests. Retrying after a short wait usually succeeds because the server has processed some of its queue. Without backoff, rapid retries make the problem worse by adding more load to an already overloaded system. A single 529 error should not cause an entire PR generation to fail when waiting 30 seconds would resolve it.
How It Works
When a 529 response is received, GitAuto waits before retrying. The wait time doubles with each retry: first 10 seconds, then 20, then 40, and so on. This exponential backoff gives the server time to recover. After a configurable number of retries (typically 3-5), if the model is still overloaded, GitAuto falls back to the next model in the fallback chain rather than continuing to retry indefinitely.
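The pattern looks roughly like the following sketch (Python, matching GitAuto's codebase). This is an illustration rather than GitAuto's actual code: the helper name `call_with_overload_retry` and the constants `BASE_DELAY` and `MAX_RETRIES` are hypothetical, with values taken from the numbers above.

```python
import time

BASE_DELAY = 10   # seconds before the first retry (gives the 10s/20s/40s curve above)
MAX_RETRIES = 5   # illustrative bound; the real limit is configurable


def call_with_overload_retry(make_request):
    """Call make_request(), retrying on HTTP 529 with doubling waits."""
    for attempt in range(MAX_RETRIES):
        response = make_request()
        if response.status_code != 529:
            return response
        # Exponential backoff: 10s, 20s, 40s, ...
        time.sleep(BASE_DELAY * (2 ** attempt))
    # Still overloaded after every retry: signal the caller so it can
    # fall back to the next model in the chain.
    raise RuntimeError("model still overloaded after retries")
```

A caller would wrap the actual API request, for example `call_with_overload_retry(lambda: requests.post(url, json=payload))`, and catch the final error to trigger model fallback.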
Other Transient Failures
The same retry mechanism covers transient upstream failures from any provider, not only model overload. HTTP 5xx responses (500, 502, 503, 504) from GitHub, Anthropic, OpenAI, Supabase, or any other API are retried with linear backoff before the call gives up. Connection-level drops (RemoteDisconnected, ChunkedEncodingError) and Google CANCELLED responses are treated the same way. Rate-limit responses (403/429 with a Retry-After hint) honor the server's suggested delay instead of the backoff curve. The retry is bounded so a genuinely down service still surfaces in Sentry instead of looping forever.
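A sketch of how these cases might be distinguished, again illustrative rather than GitAuto's actual code: the helper `call_with_transient_retry`, the status sets, and the constants `MAX_ATTEMPTS` and `LINEAR_STEP` are hypothetical, the Retry-After handling assumes the header carries a plain seconds value, and the Google CANCELLED case (a gRPC status, not an HTTP code) is omitted.

```python
import time

import requests

TRANSIENT_STATUSES = {500, 502, 503, 504}
RATE_LIMIT_STATUSES = {403, 429}
MAX_ATTEMPTS = 4    # bounded, so a genuinely down service still fails loudly
LINEAR_STEP = 15    # seconds added per attempt for plain 5xx errors


def call_with_transient_retry(make_request):
    """Retry transient upstream failures; honor Retry-After on rate limits."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        last_attempt = attempt == MAX_ATTEMPTS
        try:
            response = make_request()
        except (requests.exceptions.ChunkedEncodingError,
                requests.exceptions.ConnectionError):
            # Connection-level drops (RemoteDisconnected surfaces as a
            # ConnectionError in requests) are retried like 5xx responses.
            if last_attempt:
                raise  # let the failure propagate to error reporting
            time.sleep(LINEAR_STEP * attempt)
            continue

        status = response.status_code
        retry_after = response.headers.get("Retry-After")
        if status in RATE_LIMIT_STATUSES and retry_after and not last_attempt:
            # Rate limits: wait exactly as long as the server asks
            # (assumes the seconds form of Retry-After).
            time.sleep(int(retry_after))
        elif status in TRANSIENT_STATUSES and not last_attempt:
            # Plain 5xx from any provider: linear backoff (15s, 30s, 45s, ...).
            time.sleep(LINEAR_STEP * attempt)
        else:
            # Success, a non-retryable error, or retries exhausted.
            return response
```

Keeping the retry bound small is the key design choice here: a service that is truly down exhausts its attempts quickly and the failure is reported (for example to Sentry) rather than masked by an endless retry loop.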
Related Features
- Model Fallback - switches to a different model after retries are exhausted
Need Help?
Have questions or suggestions? We're here to help you get the most out of GitAuto.
Contact us with your feedback any time!