Model Fallback
GitAuto supports multiple AI providers (Anthropic and Google AI) with per-provider fallback chains. For Anthropic: Claude Opus 4.7 → 4.6 → 4.5 → Sonnet 4.6. For Google: Gemma 4 31B → Gemini 2.5 Flash. When a model fails due to overload, rate limiting, or errors, GitAuto automatically falls back to the next model in the same provider chain.
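The per-provider chains above can be sketched as a simple lookup. This is an illustrative sketch, not GitAuto's actual code: the `FALLBACK_CHAINS` structure, the model ID strings, and the `next_model` helper are all hypothetical names, with the chain order taken from the text above.

```python
# Hypothetical sketch of per-provider fallback chains.
# Model ID strings are illustrative renderings of the names above.
FALLBACK_CHAINS = {
    "anthropic": [
        "claude-opus-4.7",
        "claude-opus-4.6",
        "claude-opus-4.5",
        "claude-sonnet-4.6",
    ],
    "google": [
        "gemma-4-31b",
        "gemini-2.5-flash",
    ],
}

def next_model(provider, current):
    """Return the next model in the same provider's chain, or None
    when the chain is exhausted. Fallback never crosses providers."""
    chain = FALLBACK_CHAINS[provider]
    idx = chain.index(current)
    return chain[idx + 1] if idx + 1 < len(chain) else None
```

Keeping the chains keyed by provider makes the "same provider only" rule structural: there is simply no path from one provider's list into another's.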
Why This Exists
AI model APIs occasionally hit capacity limits, especially during peak usage. If a single unavailable model caused the entire PR generation to fail, users would be left waiting for the service to recover. Instead of failing the PR, GitAuto falls back to a slightly less capable model within the same provider and keeps the work going. Fallback stays within one provider to keep pricing consistent.
How It Works
When an API call returns HTTP 529 (overloaded), 429 (rate limited), or 500 (server error), GitAuto catches the error and retries with the next model in the chain. The conversation history is preserved across model switches; only the model parameter changes. If every model in the chain fails, the error is raised and PR generation stops with a clear message listing which models were attempted.
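The loop described above can be sketched as follows. All names here are hypothetical (`ApiStatusError`, `call_with_fallback`, `AllModelsFailedError`, and the `call_api` callable are stand-ins, not GitAuto's real identifiers); the behavior they illustrate is the one the text describes: retryable statuses advance to the next model, the message history is reused unchanged, and exhausting the chain raises an error that names every attempted model.

```python
# Statuses that trigger fallback: rate limited, server error, overloaded.
RETRYABLE_STATUSES = {429, 500, 529}

class ApiStatusError(Exception):
    """Stand-in for a provider API error carrying an HTTP status."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

class AllModelsFailedError(Exception):
    """Raised when every model in the chain has failed."""

def call_with_fallback(chain, messages, call_api):
    """Try each model in order, reusing the same conversation history;
    only the model parameter changes between attempts."""
    attempted = []
    for model in chain:
        attempted.append(model)
        try:
            return call_api(model=model, messages=messages)
        except ApiStatusError as err:
            if err.status not in RETRYABLE_STATUSES:
                raise  # non-capacity errors are not fallback-eligible
    raise AllModelsFailedError(
        "All models in the chain failed: " + ", ".join(attempted)
    )
```

Re-raising non-retryable errors immediately matters: a 400-class authentication or request error would fail identically on every model, so cycling the chain would only waste quota.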
Cost-Cap Downgrade
A separate fallback path triggers on cost rather than failure. When the running LLM cost on a PR exceeds an internal budget, GitAuto swaps the active model for a free-tier model and keeps working instead of stopping the PR. Conversation history is preserved across the swap, so an expensive task that would otherwise have been abandoned can still finish on a smaller model, without further charges to the customer.
Related Features
- Overload Retry - retries the same model with exponential backoff before falling back
- Infrastructure Failure Detection - detects when failures are infrastructure-related vs. code-related
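The Overload Retry feature listed above, retrying the same model with exponential backoff before falling back, can be sketched like this. The function name, attempt count, delay values, and `OverloadedError` exception are illustrative assumptions, not GitAuto's documented parameters.

```python
import random
import time

class OverloadedError(Exception):
    """Stand-in for a 529/429-style capacity error (hypothetical name)."""

def retry_with_backoff(call, max_attempts=4, base_delay=1.0):
    """Retry the same model with exponential backoff plus jitter
    before giving up and letting the fallback chain take over."""
    for attempt in range(max_attempts):
        try:
            return call()
        except OverloadedError:
            if attempt == max_attempts - 1:
                raise  # exhausted: caller moves to the next model
            # Delays grow as base_delay * 2^attempt, plus random jitter
            # so concurrent clients don't retry in lockstep.
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)
```

The jitter term is the important design detail: without it, many clients that were rejected at the same moment would all retry at the same moment, re-creating the overload they are backing off from.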
Need Help?
Have questions or suggestions? We're here to help you get the most out of GitAuto.
Contact us with your questions or feedback!