Web Fetch Summarization

When the agent fetches a web page (documentation, API references, etc.), the raw HTML converts to 10K-50K+ tokens of markdown. Most of it is navigation, sidebars, and boilerplate. GitAuto uses Claude Haiku 4.5 as a summarization layer to extract only the relevant information before passing it to the main reasoning model.

Why This Exists

The main reasoning model (Claude Opus) costs $5/$25 per million input/output tokens. Claude Haiku costs $1/$5. Passing a 30K-token web page directly to Opus costs ~$0.15 in input tokens alone. Running it through Haiku first and returning a 1K-token summary costs ~$0.03 for Haiku input + ~$0.005 for Haiku output + ~$0.005 for Opus input on the summary, or ~$0.04 in total. That's roughly a 73% cost reduction per web fetch.
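The arithmetic can be checked with a few lines of Python. This is only a sketch using the prices and token counts quoted above; the function names are illustrative and not part of GitAuto.

```python
# Prices in USD per million tokens, taken from the paragraph above.
OPUS_INPUT = 5.00
HAIKU_INPUT = 1.00
HAIKU_OUTPUT = 5.00


def cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for a given number of tokens."""
    return tokens * price_per_million / 1_000_000


def compare(page_tokens: int, summary_tokens: int) -> dict:
    """Compare sending the full page to Opus against summarizing with Haiku first."""
    direct = cost(page_tokens, OPUS_INPUT)
    summarized = (
        cost(page_tokens, HAIKU_INPUT)        # Haiku reads the full page
        + cost(summary_tokens, HAIKU_OUTPUT)  # Haiku writes the summary
        + cost(summary_tokens, OPUS_INPUT)    # Opus reads only the summary
    )
    return {
        "direct": direct,
        "summarized": summarized,
        "savings": 1 - summarized / direct,
    }


result = compare(page_tokens=30_000, summary_tokens=1_000)
print(
    f"direct=${result['direct']:.3f} "
    f"summarized=${result['summarized']:.3f} "
    f"savings={result['savings']:.0%}"
)
# prints: direct=$0.150 summarized=$0.040 savings=73%
```

The savings grow with page size, because the summary cost stays roughly fixed while the direct cost scales with the page.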

How It Works

  1. The agent calls web_fetch with a URL and a prompt describing what information to extract.
  2. GitAuto fetches the page, strips unnecessary HTML elements (nav, footer, ads, scripts), and converts the main content area to markdown.
  3. The markdown and the extraction prompt are sent to Claude Haiku, which returns a focused summary containing only the requested information.
  4. The summary (not the full markdown) is returned to the main model's conversation.
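The steps above can be sketched roughly as follows. This is a minimal illustration, not GitAuto's implementation: the network fetch is omitted (the HTML is passed in directly), the set of skipped tags is an assumption, plain text extraction stands in for the markdown conversion, and `summarize` is a hypothetical callable standing in for the Claude Haiku request.

```python
from html.parser import HTMLParser
from typing import Callable

# Assumed set of elements to drop; the real list may differ.
SKIPPED_TAGS = {"nav", "footer", "aside", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not inside any skipped element."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def strip_boilerplate(html: str) -> str:
    """Step 2: remove navigation, footers, and scripts, keeping the main text."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def summarize_page(html: str, prompt: str, summarize: Callable[[str, str], str]) -> str:
    """Steps 3 and 4: send cleaned content plus the prompt to the cheap model
    and return only its summary to the caller."""
    return summarize(strip_boilerplate(html), prompt)


page = (
    "<html><nav>Home | Docs</nav>"
    "<main><h1>Rate limits</h1><p>The API allows 100 requests per minute.</p></main>"
    "<footer>Copyright</footer><script>track()</script></html>"
)


def fake_summarizer(content: str, prompt: str) -> str:
    # Stand-in for the summarization model: keep only the last line.
    return content.splitlines()[-1]


summary = summarize_page(page, "What is the rate limit?", fake_summarizer)
print(summary)
```

Passing the summarizer in as a parameter keeps the sketch runnable offline and makes the point that only its return value, not the cleaned page, reaches the main model.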

Two Tools, Not One

Not every URL needs summarization. JSON API responses, raw text files, and configuration files should be returned as-is. GitAuto provides two tools:

  • web_fetch - Fetches HTML pages, converts to markdown, summarizes with Haiku. For documentation, articles, and web content.
  • curl - Fetches raw content with no processing. For JSON APIs, text files, and anything where exact content matters.
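The split between the two tools might look like this on the application side. This is a hypothetical dispatcher, not GitAuto's code: `fetch` and `summarize` are placeholder callables, and the `url` argument name for curl is an assumption.

```python
from typing import Callable


def handle_tool_call(
    name: str,
    args: dict,
    fetch: Callable[[str], str],
    summarize: Callable[[str, str], str],
) -> str:
    """Route a tool call: curl returns the body untouched, web_fetch summarizes it."""
    if name == "curl":
        # Exact content matters (JSON, config files), so no processing.
        return fetch(args["url"])
    if name == "web_fetch":
        # Documentation and articles: only the summary goes back to the main model.
        return summarize(fetch(args["url"]), args["prompt"])
    raise ValueError(f"unknown tool: {name}")


def fake_fetch(url: str) -> str:
    # Stand-in for an HTTP request.
    return '{"version": "1.2.3"}'


def fake_summarize(content: str, prompt: str) -> str:
    # Stand-in for the summarization model.
    return f"summary of {len(content)} characters"


raw = handle_tool_call("curl", {"url": "https://example.com/api"}, fake_fetch, fake_summarize)
short = handle_tool_call(
    "web_fetch",
    {"url": "https://example.com/docs", "prompt": "Which version is current?"},
    fake_fetch,
    fake_summarize,
)
```

The choice of tool stays with the agent; the handler only guarantees that curl output is never altered.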

Why the Model Cannot Solve This

The main model is smart enough to ignore irrelevant content on a web page. But by the time it sees the content, you have already paid for the input tokens. Asking the model to "focus on the relevant parts" does not reduce the cost - the full page is already in the context window. The filtering must happen before the tokens reach the expensive model, which is an application-layer decision, not a model capability.

© 2026 GitAuto, Inc. All Rights Reserved