Context Forgetting

When an AI agent runs a multi-step coding task, it reads reference files early to learn patterns: test naming conventions, project structure, framework idioms. These files enter the conversation context and stay there for every subsequent API call. Input tokens account for roughly 95% of our Claude costs, so accumulated stale content is the highest-leverage target for optimization.

Why the Agent Decides

Most agent frameworks manage context from the outside: the orchestrator truncates, summarizes, or compacts the conversation. But coding agents read specific files for specific reasons. The orchestrator cannot know when the agent is "done" with a file. Only the agent knows.

How It Works

The agent reads reference files (existing tests, config files, etc.) to learn patterns.
Once the agent has extracted the patterns it needs, it calls forget_messages with the list of file paths to forget.
GitAuto replaces the file contents in the conversation history with a short placeholder like ['src/ref.py' content removed because agent already extracted needed patterns].
The placeholder reminds the agent the file existed. If it needs the file again later, it can re-read it.

Token Economics

If the agent forgets 15,000 characters of reference files on turn 5 of a 50-turn run:

Cost of forgetting: near zero (one tool call, a few placeholder tokens)
Savings per subsequent turn: 15,000 fewer input characters
Total savings: 15,000 x 45 remaining turns = 675,000 fewer characters of input

The breakeven is immediate. The only risk is forgetting too early, but the agent can always re-read the file.

Monitoring

Each forget_messages call logs the characters saved so the team can monitor usage patterns and verify the agent is making good decisions about what to forget and when.

Web Fetch Summarization File Query Routing

Need Help?

Have questions or suggestions? We're here to help you get the most out of GitAuto.