[agentic-token-optimizer] CLI Consistency Checker — AIC Optimization (3 high-impact fixes) #38873

@github-actions

Target Workflow

CLI Consistency Checker (cli-consistency-checker.md) was selected as the highest-AIC workflow in the 7-day window at 1,265 AIC/run (next closest: 788 AIC). It has not been optimized previously, and its name does not contain "Token".


Analysis Period & Runs Audited

| Period | Runs | Source |
| --- | --- | --- |
| 2026-06-06 → 2026-06-12 | 1 run | all-runs.json + /usage/agent/token_usage.jsonl |

⚠️ Single-run caveat: conclusions are based on one execution (run §27420894190). Validate recommendations against 3+ future runs before merging.


Cost Profile

| Metric | Value |
| --- | --- |
| Total AIC (run) | 1,265 |
| Main-agent AIC (logged) | 303 |
| Inferred sub-agent AIC | ~962 (3 sub-agents, unlogged) |
| Raw input tokens | 3,480,617 |
| Raw output tokens | 83,119 |
| Cache read tokens | 3,206,082 |
| Cache hit rate | 92% |
| Model API requests | 68 (main agent) |
| Max context window | 103,317 tokens |
| Action minutes | 17 min |
| Conclusion | success |
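
The derived figures in this table can be reproduced from the logged values. The sketch below assumes the sub-agent AIC is simply the unlogged remainder and that the cache hit rate is defined as cache reads over raw input tokens; neither definition is stated in the run data.

```python
# Figures copied from the Cost Profile table.
total_aic = 1265
main_agent_aic = 303
raw_input_tokens = 3_480_617
cache_read_tokens = 3_206_082

# Assumption: sub-agent AIC is not logged, so it is inferred as the remainder.
sub_agent_aic = total_aic - main_agent_aic
sub_agent_share = sub_agent_aic / total_aic

# Assumption: cache hit rate = cache read tokens / raw input tokens.
cache_hit_rate = cache_read_tokens / raw_input_tokens

print(f"inferred sub-agent AIC: {sub_agent_aic} ({sub_agent_share:.0%} of total)")
print(f"cache hit rate: {cache_hit_rate:.0%}")
```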

Cost allocation insight: The 3 inline sub-agents (typo-grammar-extractor, flag-consistency-analyzer, docs-vs-help-comparer) account for an estimated 76% of total AIC (~962 AIC), yet only their orchestration is visible in the main-agent logs. Within the main agent's own run, three requests each produced more than 15,000 output tokens (two hit the 16,000-token ceiling), costing 77.6 AIC together (about 26% of main-agent cost):

| Request | Input tokens | Output tokens | AIC |
| --- | --- | --- | --- |
| Sub-agent output #1 | 30,995 | 15,362 | 26.7 |
| Sub-agent output #2 | 47,048 | 16,000 (max) | 25.9 |
| Sub-agent output #3 | 31,577 | 16,000 (max) | 25.0 |

Ranked Recommendations

1 — Constrain sub-agent output verbosity · est. –50 to –80 AIC/run

Evidence: Two of the three sub-agent interaction turns hit the 16,000-token output ceiling (the maximum per response), and the third produced 15,362 tokens. Each of these responses costs 25–27 AIC in the main-agent context alone; the sub-agents' own internal turns (not fully logged) likely cost considerably more.

Action: Add an explicit output cap to all three sub-agent prompts:

-Return a concise bulleted list.
+Return a concise bulleted list (max 25 findings, ≤ 3 lines each).
+Stop after 25 findings even if more exist; summarize remaining counts.

Apply this to typo-grammar-extractor, flag-consistency-analyzer, and docs-vs-help-comparer.

Why safe: The downstream consumer (main agent + issue body) already wraps long sections in <details> blocks. Truncating to 25 findings still surfaces the highest-value signal.
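
The cap itself is enforced by the prompt, but the same rule can be sketched as a post-processing guard for checking sub-agent output in future runs. This is an illustration only: `cap_findings` is a hypothetical helper, not part of the workflow, and it assumes each finding arrives as one multi-line string.

```python
def cap_findings(findings: list[str], max_findings: int = 25, max_lines: int = 3) -> list[str]:
    """Keep at most `max_findings` findings, each trimmed to `max_lines` lines."""
    kept = []
    for finding in findings[:max_findings]:
        # Trim each finding to the line budget from the prompt change.
        kept.append("\n".join(finding.splitlines()[:max_lines]))
    remaining = len(findings) - len(kept)
    if remaining > 0:
        # Summarize what was dropped instead of listing it.
        kept.append(f"(+{remaining} more findings not shown)")
    return kept


if __name__ == "__main__":
    sample = [f"finding {i}\ndetail\nfix\nextra" for i in range(40)]
    for line in cap_findings(sample)[-2:]:
        print(line)
```

A guard like this only measures compliance after the fact; the token savings come from the sub-agent never generating the extra findings in the first place.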


2 — Replace flag-consistency-analyzer file-per-file reads with all-help.txt · est. –30 to –60 AIC/run

Evidence: The sub-agent prompt currently reads:

Read all per-command help files in `/tmp/gh-aw/agent/help-output/`.

This triggers one file read per command (34+ commands). The pre-agent step already concatenates all help output into /tmp/gh-aw/agent/all-help.txt, so the sub-agent is doing redundant I/O: it loads ~34 separate files and grows its own context unnecessarily.

Action:

-Read all per-command help files in `/tmp/gh-aw/agent/help-output/`.
+Read `/tmp/gh-aw/agent/all-help.txt` (pre-concatenated by the setup step).

Why safe: all-help.txt is created by pre-agent-steps via:

cat "${help_files[@]}" > /tmp/gh-aw/agent/all-help.txt

All per-command files are already included verbatim.
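
Before relying on that claim, it may be worth confirming once that the concatenated file really contains every per-command file verbatim. A minimal sketch, assuming the help files are UTF-8 text and that the directory and file paths are the ones quoted in this issue; `missing_from_concat` is a hypothetical helper name.

```python
from pathlib import Path


def missing_from_concat(help_dir: Path, concat_file: Path) -> list[str]:
    """Return names of files in help_dir whose content is not found verbatim in concat_file."""
    combined = concat_file.read_text(encoding="utf-8")
    missing = []
    for path in sorted(help_dir.iterdir()):
        if path.is_file() and path.read_text(encoding="utf-8") not in combined:
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    help_dir = Path("/tmp/gh-aw/agent/help-output")
    concat_file = Path("/tmp/gh-aw/agent/all-help.txt")
    if help_dir.is_dir() and concat_file.is_file():
        print(missing_from_concat(help_dir, concat_file))
```

An empty list means the switch to all-help.txt loses nothing; any listed file would need investigating before the prompt change is merged.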


3 — Block redundant help re-collection in the main agent · est. –25 to –40 AIC/run

Evidence: Step 1 instructs the main agent to read all-help.txt as the canonical source, but the agent still emitted 44 bash calls (alongside 46 view calls), including at least one round of ./gh-aw [cmd] --help and source-file reads (cmd/gh-aw/main.go). This drives the main context from 15,765 tokens to 103,317 tokens (6.5× growth). The 20+ sequential bash+view pairs in the early turns are the primary context-inflation mechanism.

Action: Strengthen Step 1 and Step 2 with an explicit prohibition:

## Step 1: Load Pre-Collected Help Output

Read `/tmp/gh-aw/agent/all-help.txt` and use it as the primary input for analysis.
+
+**Do NOT run `./gh-aw [cmd] --help` or read individual files under**
+**`/tmp/gh-aw/agent/help-output/`. The pre-agent step has already done this.**
+**All CLI help is in `all-help.txt`.**

Also scope source-code verification (e.g., cmd/gh-aw/main.go reads) to targeted, single bash grep calls rather than full-file view reads.

Why safe: The pre-agent step exits with status 1 if no help file was collected, so all-help.txt is guaranteed to exist and be non-empty before the agent starts. This check alone does not prove that every command's help was captured.
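
To check in later runs whether the prohibition is being followed, the agent's bash commands could be scanned for the two redundant patterns named above. This sketch assumes the commands are already available as plain strings; how they are extracted from the run logs is not covered here, and `redundant_calls` is a hypothetical helper.

```python
import re

# Patterns for redundant help re-collection, taken from the prohibition text:
# any `--help` invocation, or any access to the per-command help directory.
REDUNDANT = re.compile(r"--help\b|/tmp/gh-aw/agent/help-output/")


def redundant_calls(commands: list[str]) -> list[str]:
    """Return the commands that re-collect help the pre-agent step already gathered."""
    return [cmd for cmd in commands if REDUNDANT.search(cmd)]


if __name__ == "__main__":
    # Illustrative commands only; not copied from the audited run.
    sample = [
        "./gh-aw compile --help",
        "cat /tmp/gh-aw/agent/all-help.txt",
        "ls /tmp/gh-aw/agent/help-output/",
        "grep -n 'Use:' cmd/gh-aw/main.go",
    ]
    print(redundant_calls(sample))
```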


4 — Remove list_agents poll after sub-agent launch · est. –3 AIC/run

Evidence: The run shows 1 list_agents call after the 3 task calls. The task tool confirms launch synchronously; polling list_agents to verify they started adds an unnecessary round-trip.

Action: Add to the main prompt:

+After launching sub-agents with the `task` tool, proceed directly to
+`read_agent` for results. Do not call `list_agents` to verify launch.

Summary Table

| # | Recommendation | Est. AIC savings/run | Confidence |
| --- | --- | --- | --- |
| 1 | Cap sub-agent output at 25 findings | 50–80 | Medium (single run) |
| 2 | flag-consistency-analyzer → use all-help.txt | 30–60 | Medium |
| 3 | Block main-agent help re-collection | 25–40 | Medium |
| 4 | Remove list_agents poll | 3 | High |
| | Total (conservative) | ~110–180 | |
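
The total is the sum of the per-recommendation ranges, rounded. The sketch below reproduces it under the assumption that the four savings are independent and additive, which may overstate the total if recommendations 2 and 3 remove some of the same reads.

```python
# (low, high) estimated AIC savings per run, copied from the summary table.
savings = {
    "Cap sub-agent output at 25 findings": (50, 80),
    "flag-consistency-analyzer uses all-help.txt": (30, 60),
    "Block main-agent help re-collection": (25, 40),
    "Remove list_agents poll": (3, 3),
}

# Assumption: the estimates do not overlap, so they can simply be summed.
low = sum(lo for lo, _ in savings.values())
high = sum(hi for _, hi in savings.values())

total_aic = 1265  # Total AIC for the audited run.
print(f"estimated savings: {low}-{high} AIC/run")
print(f"share of run cost: {low / total_aic:.1%} to {high / total_aic:.1%}")
```

The exact sums are 108 and 183 AIC, which the table rounds to ~110–180.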

Structural Optimization Notes

The workflow already uses 3 inline sub-agents launched in parallel — this pattern is correct and efficient. No additional sub-agents are recommended.

The main opportunity is prompt precision: the three sub-agents and main agent are doing redundant work (re-reading pre-collected files, growing context unnecessarily, generating oversized outputs) that conservative prompt changes can eliminate.


Run evidence — token_usage.jsonl excerpt (last 20 requests, main agent)
input=47535  cache=46616  aic=2.45   cumulative=174.3
input=84854  cache=82632  aic=3.78   cumulative=178.0
input=47048  cache=45212  aic=25.91  cumulative=203.9  ← sub-agent output (16K tokens)
input=85981  cache=84853  aic=3.89   cumulative=207.8
input=86915  cache=85980  aic=3.04   cumulative=210.9
input=87164  cache=86855  aic=2.84   cumulative=213.7
input=52414  cache=47534  aic=6.53   cumulative=220.2
input=31577  cache=31298  aic=25.02  cumulative=245.3  ← sub-agent output (16K tokens)
input=87393  cache=87163  aic=4.04   cumulative=249.3
input=57053  cache=52413  aic=8.62   cumulative=257.9
input=92143  cache=87335  aic=5.83   cumulative=263.7
input=93550  cache=92089  aic=3.33   cumulative=267.1
input=93880  cache=93549  aic=8.59   cumulative=275.7
input=97959  cache=93879  aic=4.19   cumulative=279.9
input=98346  cache=97958  aic=3.25   cumulative=283.1
input=98772  cache=98345  aic=3.26   cumulative=286.4
input=99194  cache=98771  aic=3.28   cumulative=289.7
input=99510  cache=99193  aic=5.55   cumulative=295.2
input=101207 cache=99509  aic=3.81   cumulative=299.0
input=103317 cache=101206 aic=4.21   cumulative=303.2

References: §27420894190

Generated by Agentic Workflow AIC Usage Optimizer · 696.2 AIC · ⊞ 24.5K ·

  • expires on Jun 19, 2026, 8:08 AM UTC-08:00
