Target Workflow
CLI Consistency Checker (cli-consistency-checker.md) — selected as the highest-AIC workflow in the 7-day window at 1,265 AIC/run (next closest: 788 AIC). Not previously optimized; no "Token" in name.
Analysis Period & Runs Audited
| Period | Runs | Source |
| --- | --- | --- |
| 2026-06-06 → 2026-06-12 | 1 run | all-runs.json + /usage/agent/token_usage.jsonl |
⚠️ Single-run caveat: conclusions are based on one execution (run §27420894190). Validate recommendations against 3+ future runs before merging.
Cost Profile
| Metric | Value |
| --- | --- |
| Total AIC (run) | 1,265 |
| Main-agent AIC (logged) | 303 |
| Inferred sub-agent AIC | ~962 (3 sub-agents, unlogged) |
| Raw input tokens | 3,480,617 |
| Raw output tokens | 83,119 |
| Cache read tokens | 3,206,082 |
| Cache hit rate | 92% |
| Model API requests | 68 (main agent) |
| Max context window | 103,317 tokens |
| Action minutes | 17 min |
| Conclusion | success |
Cost allocation insight: The 3 inline sub-agents (typo-grammar-extractor, flag-consistency-analyzer, docs-vs-help-comparer) account for an estimated ~76% of total AIC (~962 AIC), yet only their orchestration is visible in the main-agent logs. Within the main agent's own run, three requests produced very large outputs (two at the 16,000-token ceiling, one just under it) and together cost about 77.6 AIC, roughly 26% of main-agent cost:
| Request | Input toks | Output toks | AIC |
| --- | --- | --- | --- |
| Sub-agent output #1 | 30,995 | 15,362 | 26.7 |
| Sub-agent output #2 | 47,048 | 16,000 (max) | 25.9 |
| Sub-agent output #3 | 31,577 | 16,000 (max) | 25.0 |
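The derived figures in the cost profile can be re-checked from the raw values in the tables above. The short Python sketch below only redoes that arithmetic. The variable names are illustrative and not taken from the workflow or its logs, and the cache hit rate is assumed to be cache reads divided by raw input tokens, which the report does not state explicitly.

```python
# Raw values copied from the Cost Profile tables in this report.
total_aic = 1265            # Total AIC (run)
main_agent_aic = 303        # Main-agent AIC (logged)
raw_input_tokens = 3_480_617
cache_read_tokens = 3_206_082
large_output_aic = [26.7, 25.9, 25.0]  # the three large-output requests

# Sub-agent cost is inferred as whatever the main-agent log does not cover.
inferred_sub_agent_aic = total_aic - main_agent_aic
sub_agent_share = inferred_sub_agent_aic / total_aic

# Assumption: hit rate = cache read tokens / raw input tokens.
cache_hit_rate = cache_read_tokens / raw_input_tokens

# Share of logged main-agent cost spent on the three large responses.
large_output_total = sum(large_output_aic)
large_output_share = large_output_total / main_agent_aic

print(f"inferred sub-agent AIC: {inferred_sub_agent_aic} ({sub_agent_share:.1%})")
print(f"cache hit rate: {cache_hit_rate:.1%}")
print(f"large-output requests: {large_output_total:.1f} AIC ({large_output_share:.1%})")
```

Because the sub-agent figure is a subtraction rather than a measurement, any AIC that is neither main-agent nor sub-agent cost would be misattributed to the sub-agents by this calculation.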
Ranked Recommendations
1 — Constrain sub-agent output verbosity · est. –50 to –80 AIC/run
Evidence: Two of the three sub-agent interaction turns hit the 16,000-token output ceiling (max per response), and the third produced 15,362 tokens. Each of these responses costs ~25 AIC in the main-agent context alone; the sub-agents' own internal turns (not fully logged) likely dwarf this.
Action: Add an explicit output cap to all three sub-agent prompts:
-Return a concise bulleted list.
+Return a concise bulleted list (max 25 findings, ≤ 3 lines each).
+Stop after 25 findings even if more exist; summarize remaining counts.
Apply this to typo-grammar-extractor, flag-consistency-analyzer, and docs-vs-help-comparer.
Why safe: The downstream consumer (main agent + issue body) already wraps long sections in <details> blocks. Truncating to 25 findings still surfaces the highest-value signal.
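To make the intended cap concrete, here is a minimal Python sketch of the truncation behaviour the prompt change asks the sub-agents to follow. It is not part of the workflow: `cap_findings`, the `Finding` shape, and its `category` field are hypothetical names introduced only for illustration, and the sketch assumes findings arrive already ordered by importance.

```python
from collections import Counter
from typing import NamedTuple


class Finding(NamedTuple):
    """Hypothetical shape of one sub-agent finding."""
    category: str
    text: str


def cap_findings(findings: list[Finding], limit: int = 25) -> list[str]:
    """Keep the first `limit` findings and summarize the rest as counts."""
    kept = [f"- [{f.category}] {f.text}" for f in findings[:limit]]
    # Count what was dropped, grouped by category, so no signal is lost silently.
    remaining = Counter(f.category for f in findings[limit:])
    if remaining:
        summary = ", ".join(f"{n} {cat}" for cat, n in sorted(remaining.items()))
        kept.append(f"- Not listed: {summary}")
    return kept


# Example: 30 findings in, 25 bullets plus one summary line out.
sample = [Finding("typo", f"issue {i}") for i in range(30)]
for line in cap_findings(sample)[-2:]:
    print(line)
```

The sketch does not enforce the "≤ 3 lines each" limit; in the real workflow both limits are expressed in the prompt text rather than in code.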
2 — Replace flag-consistency-analyzer file-per-file reads with all-help.txt · est. –30 to –60 AIC/run
Evidence: The sub-agent prompt currently reads:
Read all per-command help files in `/tmp/gh-aw/agent/help-output/`.
This triggers one file read per command (34+ commands). The pre-agent-steps setup already concatenates all help output into /tmp/gh-aw/agent/all-help.txt, so the sub-agent is doing redundant I/O: it loads ~34 separate files and grows its own context unnecessarily.
Action:
-Read all per-command help files in `/tmp/gh-aw/agent/help-output/`.
+Read `/tmp/gh-aw/agent/all-help.txt` (pre-concatenated by the setup step).
Why safe: all-help.txt is created by pre-agent-steps via:
cat "${help_files[@]}" > /tmp/gh-aw/agent/all-help.txt
All per-command files are already included verbatim.
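Before switching the sub-agent over, the "included verbatim" claim can be checked directly. The following Python sketch is one way to do that, assuming the per-command files and the bundle are UTF-8 text; `missing_from_bundle` is a hypothetical helper, not something the workflow ships, and it assumes the `help_files` array covers every file in the help-output directory.

```python
from pathlib import Path


def missing_from_bundle(help_dir: Path, bundle: Path) -> list[str]:
    """Return names of per-command help files whose text is absent from the bundle."""
    bundle_text = bundle.read_text(encoding="utf-8")
    missing = []
    for path in sorted(help_dir.iterdir()):
        # A file counts as covered only if its full contents appear verbatim.
        if path.is_file() and path.read_text(encoding="utf-8") not in bundle_text:
            missing.append(path.name)
    return missing
```

Calling it with `Path("/tmp/gh-aw/agent/help-output")` and `Path("/tmp/gh-aw/agent/all-help.txt")` inside a run should return an empty list if the concatenation step behaves as described; any name in the result marks a command the sub-agent would lose by reading only the bundle.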
3 — Block redundant help re-collection in the main agent · est. –25 to –40 AIC/run
Evidence: Step 1 instructs the main agent to read all-help.txt as the canonical source, but the agent still emitted 44 bash calls (alongside 46 view calls), including at least one round of ./gh-aw [cmd] --help and source-file reads (cmd/gh-aw/main.go). This drives the main context from 15,765 tokens to 103,317 tokens (6.5× growth). The 20+ sequential bash+view pairs in the early turns are the primary context-inflation mechanism.
Action: Strengthen Step 1 and Step 2 with an explicit prohibition:
## Step 1: Load Pre-Collected Help Output
Read `/tmp/gh-aw/agent/all-help.txt` and use it as the primary input for analysis.
+
+**Do NOT run `./gh-aw [cmd] --help` or read individual files under**
+**`/tmp/gh-aw/agent/help-output/`. The pre-agent step has already done this.**
+**All CLI help is in `all-help.txt`.**
Also scope source-code verification (e.g., cmd/gh-aw/main.go reads) to targeted, single bash grep calls rather than full-file view reads.
Why safe: The pre-agent step validates that at least one help file was collected (exit 1 if none), so all-help.txt is guaranteed to exist and be non-empty before the agent starts.
4 — Remove list_agents poll after sub-agent launch · est. –3 AIC/run
Evidence: The run shows 1 list_agents call after the 3 task calls. The task tool confirms launch synchronously; polling list_agents to verify they started adds an unnecessary round-trip.
Action: Add to the main prompt:
+After launching sub-agents with the `task` tool, proceed directly to
+`read_agent` for results. Do not call `list_agents` to verify launch.
Summary Table
| # | Recommendation | Est. AIC savings/run | Confidence |
| --- | --- | --- | --- |
| 1 | Cap sub-agent output at 25 findings | 50–80 | Medium (single run) |
| 2 | flag-consistency-analyzer → use all-help.txt | 30–60 | Medium |
| 3 | Block main-agent help re-collection | 25–40 | Medium |
| 4 | Remove list_agents poll | 3 | High |
| Total (conservative) | | ~110–180 | |
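The total row is the straight sum of the four per-recommendation ranges, rounded. The Python sketch below reproduces that sum and expresses it as a share of the 1,265 AIC run. It assumes the savings are additive and independent, which a single run cannot confirm; the dictionary keys are shorthand labels, not identifiers from the workflow.

```python
# (low, high) estimated AIC savings per run, copied from the summary table.
savings = {
    "cap sub-agent output": (50, 80),
    "use all-help.txt": (30, 60),
    "block help re-collection": (25, 40),
    "remove list_agents poll": (3, 3),
}
total_aic_per_run = 1265  # Total AIC (run) from the Cost Profile

low = sum(lo for lo, _ in savings.values())
high = sum(hi for _, hi in savings.values())

print(f"combined estimate: {low}-{high} AIC/run")
print(f"share of run cost: {low / total_aic_per_run:.1%} to {high / total_aic_per_run:.1%}")
```

If two recommendations remove the same redundant reads, the combined saving would be smaller than this sum, so the lower bound is the safer planning figure.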
Structural Optimization Notes
The workflow already uses 3 inline sub-agents launched in parallel — this pattern is correct and efficient. No additional sub-agents are recommended.
The main opportunity is prompt precision: the three sub-agents and main agent are doing redundant work (re-reading pre-collected files, growing context unnecessarily, generating oversized outputs) that conservative prompt changes can eliminate.
Run evidence — token_usage.jsonl excerpt (last 20 requests, main agent)
input=47535 cache=46616 aic=2.45 cumulative=174.3
input=84854 cache=82632 aic=3.78 cumulative=178.0
input=47048 cache=45212 aic=25.91 cumulative=203.9 ← sub-agent output (16K tokens)
input=85981 cache=84853 aic=3.89 cumulative=207.8
input=86915 cache=85980 aic=3.04 cumulative=210.9
input=87164 cache=86855 aic=2.84 cumulative=213.7
input=52414 cache=47534 aic=6.53 cumulative=220.2
input=31577 cache=31298 aic=25.02 cumulative=245.3 ← sub-agent output (16K tokens)
input=87393 cache=87163 aic=4.04 cumulative=249.3
input=57053 cache=52413 aic=8.62 cumulative=257.9
input=92143 cache=87335 aic=5.83 cumulative=263.7
input=93550 cache=92089 aic=3.33 cumulative=267.1
input=93880 cache=93549 aic=8.59 cumulative=275.7
input=97959 cache=93879 aic=4.19 cumulative=279.9
input=98346 cache=97958 aic=3.25 cumulative=283.1
input=98772 cache=98345 aic=3.26 cumulative=286.4
input=99194 cache=98771 aic=3.28 cumulative=289.7
input=99510 cache=99193 aic=5.55 cumulative=295.2
input=101207 cache=99509 aic=3.81 cumulative=299.0
input=103317 cache=101206 aic=4.21 cumulative=303.2
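The two annotated rows stand out by eye, but the same check can be automated when validating against future runs. The Python sketch below parses lines in the text layout shown in this excerpt and flags requests whose AIC is far above the median. It is an illustration under assumptions: the real schema of token_usage.jsonl is not shown in this report, so the sketch reads the rendered `input=N cache=N aic=X` layout instead, and the 3× median threshold is an arbitrary choice.

```python
import re
from statistics import median

# Matches the rendered excerpt layout, not the underlying JSONL schema.
LINE = re.compile(r"input=(?P<input>\d+)\s+cache=(?P<cache>\d+)\s+aic=(?P<aic>[\d.]+)")


def parse_requests(text: str) -> list[dict]:
    """Parse lines shaped like `input=N cache=N aic=X` into dicts."""
    rows = []
    for line in text.splitlines():
        m = LINE.search(line)
        if m:
            rows.append({
                "input": int(m["input"]),
                "cache": int(m["cache"]),
                "aic": float(m["aic"]),
            })
    return rows


def expensive_requests(rows: list[dict], factor: float = 3.0) -> list[dict]:
    """Return requests costing more than `factor` times the median AIC."""
    threshold = factor * median(r["aic"] for r in rows)
    return [r for r in rows if r["aic"] > threshold]


# Five rows copied from the excerpt above.
excerpt = """\
input=84854 cache=82632 aic=3.78 cumulative=178.0
input=47048 cache=45212 aic=25.91 cumulative=203.9
input=85981 cache=84853 aic=3.89 cumulative=207.8
input=31577 cache=31298 aic=25.02 cumulative=245.3
input=87393 cache=87163 aic=4.04 cumulative=249.3
"""

for row in expensive_requests(parse_requests(excerpt)):
    print(row["input"], row["aic"])
```

A median-based threshold is used because a mean would be pulled upward by the very outliers the check is meant to find.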
References: §27420894190
Generated by Agentic Workflow AIC Usage Optimizer · 696.2 AIC · ⊞ 24.5K · ◷