CI Log Deduplication

When a CI run fails, the error log is included in every API call so the model always has the failure context. If the log contains duplicate errors (e.g., 39 test files all failing with the same TypeError), those duplicates multiply token costs across every iteration. GitAuto deduplicates identical errors before sending them to the model and saves oversized logs to disk for on-demand reading.

Why This Exists

Jest and similar test runners execute each test file independently. When a shared module has an error, every test file that imports it produces an identical failure with the same stack trace. A repo with 39 test files importing one broken module generates 39 copies of the same error. Our log cleaning pipeline already stripped ANSI codes, removed node_modules from stack traces, and extracted the Jest summary section, but it treated each copy as unique. The result: a 390K character log embedded in every API call, costing hundreds of dollars on a single PR.

How It Works

The deduplication pipeline has three stages, each in its own module:

  • Extract Jest summary section - pulls out the "Summary of all failing tests" block and header commands, discarding verbose per-test output
  • Strip node_modules from stack traces - removes internal framework lines that add characters without useful information
  • Deduplicate identical errors - groups failures by their error message and stack trace content. When multiple test files produce the same error, only one example is kept with a count (e.g., "39 tests failed with this same error")
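
The last two stages can be sketched roughly as below. This is a minimal illustration, not GitAuto's actual implementation: the function names are hypothetical, and it assumes each failure has already been split into a block whose first line is the test title and whose remaining lines are the error message and stack trace.

```python
from collections import OrderedDict


def strip_node_modules(block: str) -> str:
    """Drop stack-trace lines that point into node_modules."""
    return "\n".join(
        line for line in block.splitlines() if "node_modules" not in line
    )


def error_key(block: str) -> str:
    """Grouping key: everything after the title line (message + stack trace).

    Assumption: the first line of a block is the per-test title, which
    differs between test files and must not take part in the comparison.
    """
    lines = [line.strip() for line in block.splitlines()[1:]]
    return "\n".join(line for line in lines if line)


def deduplicate_failures(blocks: list[str]) -> str:
    """Keep one example per distinct error and annotate it with a count."""
    groups: "OrderedDict[str, list[str]]" = OrderedDict()
    for block in blocks:
        cleaned = strip_node_modules(block)
        groups.setdefault(error_key(cleaned), []).append(cleaned)

    parts = []
    for examples in groups.values():
        parts.append(examples[0])  # first occurrence serves as the example
        if len(examples) > 1:
            parts.append(f"({len(examples)} tests failed with this same error)")
    return "\n\n".join(parts)
```

Grouping on the message and stack trace, rather than on the whole block, is what lets 39 differently named test files collapse into one entry. Failures whose stack traces differ, even slightly, stay separate in this sketch.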

For logs that are still large after cleaning (over 50K characters), the full log is saved to a file in the cloned repository. A 5K character preview is included in the initial message with a pointer to the full file. The agent can then read or search the file on demand instead of carrying the entire log in every API call.
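
A minimal sketch of this fallback, under stated assumptions: the file name, the function name, and the wording of the pointer message are hypothetical, and the preview is taken from the start of the log, which the text above does not specify. Only the 50K and 5K thresholds come from the description.

```python
from pathlib import Path

MAX_INLINE_CHARS = 50_000  # logs above this size are saved to disk
PREVIEW_CHARS = 5_000      # size of the preview kept in the initial message


def prepare_log_for_prompt(
    log: str, repo_dir: str, file_name: str = "ci_error_log.txt"
) -> str:
    """Return the text to embed in the prompt for a cleaned CI log.

    Small logs are returned unchanged. Oversized logs are written to a
    file inside the cloned repository and replaced by a short preview
    plus a pointer to that file.
    """
    if len(log) <= MAX_INLINE_CHARS:
        return log

    path = Path(repo_dir) / file_name
    path.write_text(log, encoding="utf-8")

    preview = log[:PREVIEW_CHARS]  # assumption: preview is the head of the log
    return (
        f"{preview}\n\n"
        f"[Showing the first {PREVIEW_CHARS} of {len(log)} characters. "
        f"Full log saved to {file_name}; read or search that file for details.]"
    )
```

The embedded text stays near 5K characters no matter how large the log is, so the per-call cost is bounded while the full log remains available on request.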

Impact

In the case that triggered this feature, 39 identical Jest TypeErrors inflated a CI log to 390K characters (roughly 100K tokens). After deduplication, the same log is under 10K characters. Over 8 retry iterations, this saves millions of input tokens per PR.
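
The arithmetic behind this estimate can be sketched as follows. The ratio of 4 characters per token is an assumption (it roughly matches the 390K characters to 100K tokens figure above), and the number of API calls is a hypothetical input: the log is resent on every API call, and a single retry iteration may involve more than one call.

```python
def estimated_tokens_saved(
    chars_before: int,
    chars_after: int,
    api_calls: int,
    chars_per_token: float = 4.0,  # assumption: rough average, not a tokenizer
) -> int:
    """Estimate input tokens saved by sending the smaller log on every call."""
    saved_per_call = (chars_before - chars_after) / chars_per_token
    return round(saved_per_call * api_calls)


# Roughly 95K tokens saved on each call that carries the log.
per_call = estimated_tokens_saved(390_000, 10_000, api_calls=1)

# Hypothetical: 8 iterations with 3 API calls each.
per_pr = estimated_tokens_saved(390_000, 10_000, api_calls=8 * 3)
```

Because the saving applies to every call, the total scales with the number of calls in the PR rather than with the number of iterations alone.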

Related Features

  • CI Log Cleaning - the upstream cleaning pipeline that removes ANSI codes and extracts summaries
  • Token Trimming - removes oldest messages when the conversation exceeds the context window

© 2026 GitAuto, Inc. All Rights Reserved