
⚡ Copilot Token Optimization 2026-05-15 — Documentation Maintainer #3200

@github-actions

Target Workflow: Documentation Maintainer

Source: (redacted) Daily Copilot Token Usage Analyzer run (2026-05-15)
Estimated cost per run: N/A (Copilot billing)
Total tokens per run: ~1,177,011
Effective tokens per run: ~11,719,789
Cache hit rate (cache reads / total input): ~89.7% (1,049,528 cache read / 1,170,266 input)
Cache write tokens: 0 (no new prefix cache established this run)
LLM requests: 28
LLM turns (conversation): 1
Model: claude-sonnet-4.6
Last run status: ❌ Failure
Duration: 6.7 minutes
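
As a quick sanity check on the cache figures above, the hit rate can be recomputed from the two raw counts. This is a minimal sketch; the function name `cache_hit_rate` is illustrative and not part of the analyzer.

```shell
# Cache hit rate as a percentage: cache-read tokens divided by total input tokens.
# Arguments: <cache_read_tokens> <total_input_tokens>
cache_hit_rate() {
  awk -v hits="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * hits / total }'
}

# Figures reported for this run.
cache_hit_rate 1049528 1170266
```

The remaining 120,738 input tokens (1,170,266 minus 1,049,528) were not served from cache.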


Current Configuration

| Setting | Value |
|---|---|
| Tools loaded | 2 (edit, bash) |
| Network groups | none configured |
| Pre-agent steps | ✅ Yes (git-changes, doc-files) |
| Prompt body size | ~4,000 chars (~1,000 tokens) |
| Timeout | 15 minutes |
| Model | claude-sonnet-4.6 (frontier) |

The tool surface is already minimal. The large token count (~1.17M) comes from 28 sequential bash round-trips, each carrying a growing conversation history as the agent individually reads source files, runs git show <sha> per commit, and reads documentation files.
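
To see why sequential round-trips dominate, here is a rough model of history growth. It assumes, purely for illustration, that every request resends a fixed base prompt plus a history that grows by a constant `delta` tokens per prior round-trip. Real growth is uneven, and `base`, `delta`, and the function name are hypothetical.

```shell
# Total input tokens over n sequential requests, where request i (0-based)
# sends base + i * delta tokens. Closed form: n*base + delta*n*(n-1)/2.
# Arguments: <base_tokens> <delta_tokens_per_round_trip> <n_requests>
cumulative_input_tokens() {
  base="$1"; delta="$2"; n="$3"
  echo $(( n * base + delta * n * (n - 1) / 2 ))
}

# Illustrative values only: 1,000-token base, 500 tokens added per round-trip.
cumulative_input_tokens 1000 500 3
```

Under this model the history term is multiplied by n(n-1)/2, which is 378 for 28 requests and 105 for 15 requests, so removing round-trips shrinks the total faster than linearly.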


Recommendations

1. Downgrade model to claude-haiku-4-5 or gpt-4.1-mini

Estimated savings: ~60–80% cost reduction; ~20–30% throughput improvement

The system's own agentic assessment flagged this run as model_downgrade_available:

"This Repo Maintenance run may not need a frontier model. A smaller model (e.g. gpt-4.1-mini, claude-haiku-4-5) could handle the task at lower cost."

Documentation review (comparing markdown files against git diffs and editing prose) is well within the capability of a smaller model. Add the following to the frontmatter in .github/workflows/doc-maintainer.md:

engine:
  model: claude-haiku-4-5

or alternatively:

engine:
  model: gpt-4.1-mini

2. Pre-compute git diffs in the steps block (reduce 28 LLM requests)

Estimated savings: ~300,000–500,000 tokens/run (~25–40% reduction)

Currently the agent gets a bare commit list (git log --name-only) and then runs git show <sha> individually for each relevant commit. Each additional bash call resends the full conversation history with the next request, so with 28 total requests the accumulated history is the primary token driver.

Replace the git-changes step to pre-compute full diffs for source-touching commits in a single shell pass:

steps:
  - name: Gather recent git diffs
    id: git-changes
    run: |
      # Group the commands so the diff text itself is appended to GITHUB_OUTPUT,
      # not only the heredoc markers.
      {
        echo "RECENT_DIFFS<<EOF"
        # Get full diffs for src/, containers/, scripts/ changes in one pass
        git log --since="7 days ago" --format="%H %s" -- src/ containers/ scripts/ | \
          while read -r sha title; do
            echo "=== Commit $sha: $title ==="
            git show --stat --unified=3 "$sha" -- src/ containers/ scripts/ docs/ "*.md" 2>/dev/null | head -150
          done | head -500
        echo "EOF"
      } >> "$GITHUB_OUTPUT"
  - name: List documentation files
    id: doc-files
    run: |
      {
        echo "DOC_FILES<<EOF"
        find docs/ -name "*.md" 2>/dev/null | sort
        find . -maxdepth 1 -name "*.md" | sort
        echo "EOF"
      } >> "$GITHUB_OUTPUT"
  - name: Identify affected docs
    id: affected-docs
    run: |
      {
        echo "AFFECTED_DOCS<<EOF"
        # List changed source files (capped at 30) so the agent can map them to docs
        git log --since="7 days ago" --name-only --format="" -- src/ containers/ scripts/ | \
          grep -v '^$' | sort -u | head -30
        echo "EOF"
      } >> "$GITHUB_OUTPUT"

Then update the prompt to reference ${{ steps.git-changes.outputs.RECENT_DIFFS }} and instruct the agent to work from the pre-computed diffs rather than running git show individually. This alone could eliminate 10–15 bash round-trips.
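
The multiline outputs used by these steps depend on the `NAME<<DELIMITER` syntax, where the value ends at the first line equal to the delimiter. As a hedged sketch, the helper below (`emit_multiline_output` is a hypothetical name) writes stdin as a multiline output with a random delimiter, so captured text that happens to contain a line reading `EOF` cannot end the value early. It can also be run locally by pointing `GITHUB_OUTPUT` at a temporary file.

```shell
# Append stdin to the file named by $GITHUB_OUTPUT as a multiline output.
# Usage: some_command | emit_multiline_output NAME
emit_multiline_output() {
  name="$1"
  # Random delimiter, assumed not to occur as a full line in the captured text.
  delim="ghadelim_$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')"
  {
    printf '%s<<%s\n' "$name" "$delim"
    # awk re-emits each line and guarantees the last one ends with a newline,
    # so the closing delimiter always lands on its own line.
    awk '{ print }'
    printf '%s\n' "$delim"
  } >> "$GITHUB_OUTPUT"
}
```

A local dry run might look like `GITHUB_OUTPUT=/tmp/out.txt; git log --oneline -5 | emit_multiline_output RECENT_DIFFS`, followed by inspecting `/tmp/out.txt`.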

3. Fix zero cache-write tokens (improve future cache reuse)

Estimated savings: reduces effective token cost by up to 50% on repeat runs

This run had 0 cache_write_tokens, meaning no new cache prefix was established. This is significant: the 89.7% cache-read rate relied on prefixes cached by earlier runs, and if runs keep writing nothing, those entries will eventually expire and the hit rate will drop. The likely cause is that dynamic git data (RECENT_COMMITS) is injected early in the system prompt, before the stable static content, which prevents a stable cache prefix from being established.

Fix: Structure the prompt so static instructions appear first, dynamic data last:

# Documentation Maintainer

[All static instructions and guidelines — keep at top for caching]

---

## Context for This Run

### Recent Changes (Pre-computed)
${{ steps.git-changes.outputs.RECENT_DIFFS }}

### Documentation Files
${{ steps.doc-files.outputs.DOC_FILES }}

This ensures the large static block (~1,000 tokens) is always a stable prefix, allowing the model to cache it across daily runs.
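
One way to keep this ordering from regressing is a small local check on the workflow source. The sketch below reports any `${{ ... }}` expression that appears before the context heading. It assumes the heading text `## Context for This Run` from the template above, and the function name is hypothetical. It verifies ordering only and cannot confirm that the provider actually caches the prefix.

```shell
# Print "<line number>: <line>" for every line containing a ${{ ... }} expression
# that appears before the marker heading. No output means the prefix is static.
# Arguments: <prompt_file> [marker_line]
dynamic_lines_before_marker() {
  file="$1"
  marker="${2:-## Context for This Run}"
  awk -v marker="$marker" '
    $0 == marker { exit }                      # stop at the dynamic section
    index($0, "${{") { print FNR ": " $0 }     # dynamic expression in the prefix
  ' "$file"
}
```

For example, `dynamic_lines_before_marker .github/workflows/doc-maintainer.md` should print nothing once the prompt is restructured. Expressions in the frontmatter or steps block would also be reported, so the check may need to be limited to the prompt body.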

4. Scope-limit documentation review with pre-computed affected files

Estimated savings: ~100,000–200,000 tokens/run (~8–15% reduction)

Currently the agent reads all documentation files to decide which ones are affected. The affected-docs pre-step above narrows this by listing the changed source paths, and the prompt can instruct the agent to start with the docs that cover those paths rather than reviewing all docs:

## Prioritized Documentation Files

The following source paths changed in the last 7 days. Documentation that covers them is likely affected:

${{ steps.affected-docs.outputs.AFFECTED_DOCS }}


Review the docs that cover these paths first. Only review other docs if a clear connection to the changes is evident.
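
If the pre-step should output documentation files rather than source paths, one possible heuristic is to list the docs that mention the basename of any changed file. This is an assumption about how docs refer to code, not something the workflow does today, and `candidate_docs` is a hypothetical helper name.

```shell
# Read changed source paths on stdin, one per line. Print the markdown files
# under docs_dir that mention the basename of any changed file.
# Arguments: <docs_dir>
candidate_docs() {
  docs_dir="$1"
  while IFS= read -r path; do
    [ -n "$path" ] || continue            # skip blank lines from git log
    base="$(basename "$path")"
    # -l lists matching file names only; -F treats the basename as a fixed string
    find "$docs_dir" -name '*.md' -exec grep -lF -- "$base" {} + 2>/dev/null
  done | sort -u
}
```

In the step it could be fed from the existing command, for example `git log --since="7 days ago" --name-only --format="" -- src/ containers/ scripts/ | candidate_docs docs/`. Basename matching will miss docs that describe a feature without naming the file, so the fallback instruction in the prompt remains useful.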

5. Add explicit exit condition to reduce wasted requests when no changes are relevant

Estimated savings: up to 1,177,011 tokens on no-op runs (eliminates full run)

The workflow has a skip-if-match guard for open PRs, but no guard for "no relevant source code changed in 7 days." Add a pre-step that checks and skips:

  - name: Check for relevant changes
    id: has-changes
    run: |
      # tr strips the padding some wc implementations add around the count
      COUNT=$(git log --since="7 days ago" --oneline -- src/ containers/ scripts/ | wc -l | tr -d ' ')
      echo "changed_count=$COUNT" >> "$GITHUB_OUTPUT"
      echo "has_changes=$([ "$COUNT" -gt 0 ] && echo true || echo false)" >> "$GITHUB_OUTPUT"

Then in the prompt, use a condition or instruct the agent to exit immediately if ${{ steps.has-changes.outputs.has_changes }} is false.
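
The same flag can be derived without a numeric comparison, which avoids depending on how `wc -l` formats its output. A minimal sketch, with `has_changes_flag` as a hypothetical helper name:

```shell
# Read `git log --oneline` output on stdin. Print "true" if there is at least
# one non-empty line, otherwise "false".
has_changes_flag() {
  if grep -q . ; then
    echo true
  else
    echo false
  fi
}
```

Inside the step this would be used as `echo "has_changes=$(git log --since="7 days ago" --oneline -- src/ containers/ scripts/ | has_changes_flag)" >> "$GITHUB_OUTPUT"`.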


Expected Impact

| Metric | Current | Projected | Savings |
|---|---|---|---|
| Total tokens/run | ~1,177,011 | ~600,000–700,000 | ~40–50% |
| Effective tokens/run | ~11,719,789 | ~2,000,000–3,000,000 | ~75–80% |
| LLM requests/run | 28 | 10–15 | ~45% |
| Cost/run (Copilot) | | Significant reduction with model downgrade | ~60–80% |
| Duration | 6.7m | ~3–4m | ~40% |
| Run status | ❌ Failed | (investigate root cause separately) | |
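
The savings percentages can be re-derived from the current and projected values, which is also a convenient way to compare a later run against this baseline. A minimal sketch, with `pct_reduction` as an illustrative name:

```shell
# Percentage reduction from a current value to a projected (or measured) value.
# Arguments: <current> <projected>
pct_reduction() {
  awk -v cur="$1" -v proj="$2" 'BEGIN { printf "%.1f\n", 100 * (cur - proj) / cur }'
}

# Total tokens/run at both ends of the projected range.
pct_reduction 1177011 700000
pct_reduction 1177011 600000
```

The same arithmetic applied to the request count gives about 46% for 28 to 15 requests and about 64% for 28 to 10, so the ~45% figure corresponds to the conservative end of the projected range.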

Implementation Checklist

  • Add engine: { model: claude-haiku-4-5 } to frontmatter
  • Replace git-changes step with full-diff pre-computation
  • Add affected-docs step to scope document review
  • Add has-changes guard step for no-op run skipping
  • Restructure prompt: static instructions first, dynamic data last
  • Recompile: gh aw compile .github/workflows/doc-maintainer.md
  • Post-process: npx tsx scripts/ci/postprocess-smoke-workflows.ts (if applicable)
  • Verify CI passes on PR
  • Compare token usage on new run vs this baseline (~1.17M tokens/run)

Additional Notes

  • The last run failed (error_count=1). Investigate the failure cause independently — if it's a tool or code issue it may inflate future token counts too.
  • Cache write tokens were 0 this run. After restructuring the prompt (rec #3), verify cache_write_tokens > 0 on the next run.
  • The agentic_fraction = 1 confirms the agent spent the entire session in agentic (tool-calling) mode, so reducing round-trips (rec #2) is the highest-leverage structural change.

Generated by Daily Copilot Token Optimization Advisor · ● 7.3M ·
