Context Forgetting
When an AI agent runs a multi-step coding task, it reads reference files early to learn patterns: test naming conventions, project structure, framework idioms. These files enter the conversation context and are re-sent as input on every subsequent API call, long after the agent has finished with them. Input tokens account for roughly 95% of our Claude costs, so accumulated stale content is the highest-leverage target for optimization.
Why the Agent Decides
Most agent frameworks manage context from the outside: the orchestrator truncates, summarizes, or compacts the conversation. But coding agents read specific files for specific reasons. The orchestrator cannot know when the agent is "done" with a file. Only the agent knows.
How It Works
- The agent reads reference files (existing tests, config files, etc.) to learn patterns.
- Once the agent has extracted the patterns it needs, it calls forget_messages with the list of file paths to forget.
- GitAuto replaces the file contents in the conversation history with a short placeholder like ['src/ref.py' content removed because agent already extracted needed patterns].
- The placeholder reminds the agent the file existed. If it needs the file again later, it can re-read it.
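The replacement step can be sketched roughly as follows. This is a minimal illustration, not GitAuto's actual implementation: the message layout (a list of dicts with a hypothetical file_path key marking which messages hold file contents) and the helper placeholder_for are assumptions made for the example. Only the tool name forget_messages and the placeholder wording come from the description above.

```python
def placeholder_for(file_path):
    # Placeholder wording taken from the example above.
    return f"['{file_path}' content removed because agent already extracted needed patterns]"


def forget_messages(messages, file_paths):
    """Return (updated_messages, chars_saved) without mutating the input.

    Assumption: each message is a dict, and messages that carry file
    contents have a hypothetical "file_path" key naming the file.
    """
    targets = set(file_paths)
    updated = []
    chars_saved = 0
    for message in messages:
        path = message.get("file_path")
        placeholder = placeholder_for(path) if path in targets else None
        # Only replace when the placeholder is actually shorter than the content.
        if placeholder is not None and len(message["content"]) > len(placeholder):
            chars_saved += len(message["content"]) - len(placeholder)
            updated.append({**message, "content": placeholder})
        else:
            updated.append(message)
    return updated, chars_saved
```

Because the placeholder still names the file, the agent keeps a record that it read the file and can request it again by path if needed.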
Token Economics
If the agent forgets 15,000 characters of reference files on turn 5 of a 50-turn run:
- Cost of forgetting: near zero (one tool call, a few placeholder tokens)
- Savings per subsequent turn: 15,000 fewer input characters
- Total savings: 15,000 x 45 remaining turns = 675,000 fewer characters of input
The breakeven is immediate: the savings start on the very next turn. The only risk is forgetting too early, and that cost is bounded, because the agent can always re-read the file.
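The arithmetic above can be reproduced with a few lines of Python. The function name and parameters here are illustrative only; the optional placeholder_chars argument is an added assumption that lets you subtract the small placeholder left behind, which the estimate above treats as negligible.

```python
def characters_saved(forgotten_chars, forget_turn, total_turns, placeholder_chars=0):
    # Every turn after the forget call sends this many fewer input characters.
    remaining_turns = total_turns - forget_turn
    return (forgotten_chars - placeholder_chars) * remaining_turns


print(characters_saved(15_000, forget_turn=5, total_turns=50))  # 675000
```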
Monitoring
Each forget_messages call logs the characters saved so the team can monitor usage patterns and verify the agent is making good decisions about what to forget and when.
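A logging hook for this could look something like the sketch below, using Python's standard logging module. The logger name, the log_forget_call helper, and the message format are assumptions for illustration; the actual log fields GitAuto records are not specified here.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("forget_messages")


def log_forget_call(file_paths, chars_saved):
    # One line per tool call: which files were forgotten and how much was saved.
    logger.info("forget_messages: files=%s chars_saved=%d", file_paths, chars_saved)
```

Aggregating chars_saved across runs gives a rough view of how much input the agent is trimming and whether it tends to forget files it later re-reads.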