File Query Routing
When the agent needs to learn something about a file (test framework, naming conventions, project structure), loading the full file into the main model's context is wasteful. The agent only needs the answer, not the source code. GitAuto routes these queries through Claude Haiku so only the answer enters the expensive model's context.
How It Works
- The agent provides a file path and a question (e.g., "What test framework and fixtures are used?").
- GitAuto reads the file from the local clone and formats it with line numbers.
- The file content and question are sent to Claude Haiku ($1 per million input tokens), which returns a focused answer.
- Only the answer (typically a few hundred characters) enters the main model's conversation, not the full file (often 10,000+ characters).
Same Pattern as Web Fetch
This follows the same architecture as URL fetching, which routes web pages through Haiku before passing summaries to the main model. The difference is the data source: local files instead of URLs.
Three Modes of File Access
The agent picks the cheapest way to interact with a file:
- Full read— loads the entire file into context. For when the agent needs exact code to edit.
- Query— routes through Haiku, returns only the answer. For when the agent needs to learn something about the file.
- Forget— drops file content already in context. For when a full read happened but the content is no longer needed.
Cost Impact
A 15,000-character file queried through Haiku costs roughly $0.004 in Haiku input tokens plus a small amount for the answer. Loading the same file directly into Opus context costs more per turn, and the cost compounds across every subsequent API call in the conversation. Input tokens account for roughly 95% of our Claude costs, making this the highest-leverage optimization target.
Need Help?
Have questions or suggestions? We're here to help you get the most out of GitAuto.
Contact us with your questions or feedback!