Target Workflow: smoke-copilot-byok
Source report: #4427
Estimated cost per run: ~$8.78 (estimated from token counts × Anthropic pricing)
Total tokens per run: ~398K (avg over 5 runs)
Effective tokens per run: ~12.3M avg (30.8× cost multiplier)
Cache hit rate: ~0% (unique prompts every run — see Root Cause below)
LLM turns: 1
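The multiplier quoted above can be read as the ratio of effective tokens to raw tokens. The sketch below reproduces that arithmetic from the rounded figures in this report. It assumes "effective tokens" means cost-weighted tokens, and the function name is illustrative rather than part of any tool.

```python
def effective_multiplier(effective_tokens: float, total_tokens: float) -> float:
    """Ratio of cost-weighted (effective) tokens to raw tokens for one run."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return effective_tokens / total_tokens


# Rounded per-run averages quoted in this report.
TOTAL_TOKENS = 398_000
EFFECTIVE_TOKENS = 12_300_000

multiplier = effective_multiplier(EFFECTIVE_TOKENS, TOTAL_TOKENS)
print(f"{multiplier:.1f}x")
```

With these rounded inputs the ratio comes out at about 30.9, close to the reported 30.8×. The small gap is presumably due to rounding in the quoted averages.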
Current Configuration
| Setting | Value |
| --- | --- |
| Model | claude-opus-4.8 (explicitly overridden in env:) |
| Tools loaded | bash: ["*"], github: toolsets: [pull_requests] |
| Network groups | defaults, github |
| Pre-agent steps | ✅ Yes (pre-computes PR data, HTTP check, file I/O) |
| Prompt size | ~7.2 KB (173 lines) |
| Dynamic content in prompt | ✅ 4 template substitutions that change every run |
Root Cause Analysis
Problem 1 — Wrong Model for Task Complexity
The smoke test performs four trivial verifications:
- Run cat on a pre-created file to confirm it exists
- Call github-list_pull_requests and confirm data returns
- Check SMOKE_HTTP_CODE (always 200) is OK
- Confirm BYOK inference is working (trivially proven by the agent responding at all)
claude-opus-4.8 costs $15/M input and $75/M output, the most expensive tier. A task that requires only basic instruction-following and one cat command does not need that capability. The sibling Smoke Copilot workflow achieves the same validation pattern with the default claude-sonnet-4.6 and produces only a ~10× effective-token multiplier, versus 30.8× for BYOK.
Problem 2 — Cache-Busting File Path
The pre-step creates a unique file path per run:
TEST_FILE="$TEST_DIR/smoke-test-copilot-byok-${GITHUB_RUN_ID}.txt"
This path is injected into the prompt body via ${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}. Because GITHUB_RUN_ID changes every run, no two runs send the same prompt text, so prefix caching does not activate and every token is billed at full price.
The combination of an expensive model + zero caching explains the 30.8× multiplier (vs ~2× we'd expect for Haiku with good caching).
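To illustrate why a run-specific value in the middle of the prompt limits prefix caching, the sketch below measures how much of two runs' prompts is identical from the start. The template is a hypothetical, shortened stand-in for the real prompt, and the /tmp path is an assumption. The comparison is character-level only: real caches operate on tokens and may require a minimum prefix length, so even a short shared prefix can yield no cache hit.

```python
import os

# Hypothetical, shortened stand-in for the real prompt; only the placement of
# the run-specific file path matters for this illustration.
PROMPT_TEMPLATE = (
    "### 2. GitHub.com Connectivity\n"
    "Pre-step result: HTTP 200 from github.com.\n"
    "### 3. File Write/Read Test\n"
    "File path: /tmp/smoke-test-copilot-byok-{run_id}.txt\n"
    "Verify by running cat on the file path.\n"
    "### 4. Report\n"
    "Write a short comment summarizing the results.\n"
)


def shared_prefix_fraction(prompt_a: str, prompt_b: str) -> float:
    """Fraction of prompt_a that matches the start of prompt_b character for character."""
    common = os.path.commonprefix([prompt_a, prompt_b])
    return len(common) / len(prompt_a)


run_1 = PROMPT_TEMPLATE.format(run_id="1000000001")
run_2 = PROMPT_TEMPLATE.format(run_id="2000000002")
fraction = shared_prefix_fraction(run_1, run_2)
print(f"Shared prefix: {fraction:.0%} of the prompt")
```

Everything after the run ID, including instructions that never change, falls outside the shared prefix.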
Recommendations
1. Switch Model from claude-opus-4.8 to claude-haiku-3.5
Estimated savings: ~11.6M effective tokens/run (~94.7%)
Change in .github/workflows/smoke-copilot-byok.md:
env:
- COPILOT_MODEL: claude-opus-4.8
+ COPILOT_MODEL: claude-haiku-3.5
Pricing comparison:
| Model | Input | Output | Est. cost/run (no cache) |
| --- | --- | --- | --- |
| claude-opus-4.8 (current) | $15/M | $75/M | ~$8.78 |
| claude-haiku-3.5 (proposed) | $0.80/M | $4/M | ~$0.47 |
Effective tokens: ~12.3M/run → ~655K/run
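The projected figures can be cross-checked without knowing the input/output token split, because the quoted input and output prices differ by the same factor. The sketch below is a rough check. It assumes the token counts stay the same after the model switch, that neither configuration benefits from caching, and that effective tokens scale with price.

```python
# Per-million-token prices quoted in the pricing comparison (USD).
OPUS_INPUT, OPUS_OUTPUT = 15.00, 75.00
HAIKU_INPUT, HAIKU_OUTPUT = 0.80, 4.00

# Both ratios are the same, so the saving does not depend on the
# unknown input/output token split of a run.
input_ratio = OPUS_INPUT / HAIKU_INPUT
output_ratio = OPUS_OUTPUT / HAIKU_OUTPUT
assert abs(input_ratio - output_ratio) < 1e-9

CURRENT_COST_PER_RUN = 8.78            # USD, from this report
CURRENT_EFFECTIVE_TOKENS = 12_300_000  # from this report

projected_cost = CURRENT_COST_PER_RUN / input_ratio
projected_effective_tokens = CURRENT_EFFECTIVE_TOKENS / input_ratio

print(f"price ratio: {input_ratio:.2f}x")
print(f"projected cost/run: ${projected_cost:.2f}")
print(f"projected effective tokens/run: {projected_effective_tokens / 1000:.0f}K")
```

Under these assumptions the result is about $0.47 and about 656K effective tokens per run, in line with the ~$0.47 and ~655K quoted above.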
The agent needs zero advanced reasoning: it confirms pre-computed results and writes a short 5–10 line comment. Haiku is purpose-built for exactly this level of task.
Note: the sibling smoke-copilot.md does NOT override COPILOT_MODEL and runs on the repo-level default (claude-sonnet-4.6). The BYOK override to Opus appears unnecessary for smoke validation — it just needs any model that can follow instructions and call tools.
If validating an Opus-specific inference path is genuinely required, consider a separate targeted test that runs less frequently (weekly instead of every 12h + every PR).
2. Fix Cache-Busting File Path
Estimated savings: additional ~20–30% after recommendation #1
Replace the run-unique file name with a deterministic one in the pre-step:
-TEST_FILE="$TEST_DIR/smoke-test-copilot-byok-${GITHUB_RUN_ID}.txt"
+TEST_FILE="$TEST_DIR/smoke-test-copilot-byok.txt"
The file is still freshly written each run (no stale-data risk), but the path is now constant. This allows prefix caching to activate on the stable instruction portion of the prompt.
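The "constant path, fresh content" behavior can be sketched as follows. The real pre-step is a shell step, so this Python version is only an illustration, and the helper name and temporary directory are assumptions.

```python
import tempfile
from pathlib import Path

SMOKE_FILE_NAME = "smoke-test-copilot-byok.txt"  # constant, no run ID


def write_smoke_file(test_dir: Path, content: str) -> Path:
    """Write content to the fixed smoke-test path, replacing any earlier file."""
    test_file = test_dir / SMOKE_FILE_NAME
    # write_text opens the file in "w" mode, which truncates existing content.
    test_file.write_text(content, encoding="utf-8")
    return test_file


with tempfile.TemporaryDirectory() as tmp:
    first = write_smoke_file(Path(tmp), "run 1 content\n")
    second = write_smoke_file(Path(tmp), "run 2 content\n")
    same_path = first == second
    latest_content = second.read_text(encoding="utf-8")

print(same_path, repr(latest_content))
```

Both writes resolve to the same path, and the second write fully replaces the first, so a later run never reads content left over from an earlier one.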
3. Move Dynamic Data to End of Prompt
Estimated savings: additional ~20% — enables prefix caching of the stable ~80% prefix
The current prompt inlines dynamic substitutions in the middle of the document, preventing prefix caching. Restructure so all ${{ steps.smoke-data.outputs.* }} references appear only in a final section:
-### 2. GitHub.com Connectivity
-Pre-step result: HTTP ${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }} from github.com.
-✅ if HTTP 200 or 301, ❌ otherwise.
-
-### 3. File Write/Read Test
-Pre-step wrote and read back: "${{ steps.smoke-data.outputs.SMOKE_FILE_CONTENT }}"
-File path: ${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}
-Verify by running `cat` on the file path using bash to confirm it exists.
+### 2. GitHub.com Connectivity
+Check the HTTP code in **Pre-Fetched Data** below. ✅ if HTTP 200 or 301, ❌ otherwise.
+
+### 3. File Write/Read Test
+Run `cat` on the file path from **Pre-Fetched Data** below to confirm it exists.
+
+...
-## Pre-Fetched PR Data
-
-```
-${{ steps.smoke-data.outputs.SMOKE_PR_DATA }}
-```
+## Pre-Fetched Data
+<!-- Dynamic section — keep all template substitutions here, at the end, to enable prefix caching above -->
+
+- HTTP code: `${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}`
+- File path: `${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}`
+- File content: `${{ steps.smoke-data.outputs.SMOKE_FILE_CONTENT }}`
+- PR data:
+```
+${{ steps.smoke-data.outputs.SMOKE_PR_DATA }}
+```
With the stable instruction prefix fixed, prefix caching should activate on the majority of each prompt.
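One way to keep the prompt in this shape is a small check that flags any smoke-data substitution appearing before the dynamic section. The sketch below is a hypothetical lint, not an existing script in the repository. The regular expression and sample prompts are assumptions based on the snippets above.

```python
import re

# Matches GitHub Actions expressions that read outputs of the smoke-data step.
SUBSTITUTION = re.compile(r"\$\{\{\s*steps\.smoke-data\.outputs\.\w+\s*\}\}")
# The dynamic section heading, matched only as a full line.
DYNAMIC_HEADING = re.compile(r"^## Pre-Fetched Data\s*$", re.MULTILINE)


def substitutions_before_dynamic_section(prompt: str) -> list[str]:
    """Return substitutions that appear before the dynamic section heading."""
    match = DYNAMIC_HEADING.search(prompt)
    # If the heading is missing, treat the whole prompt as the stable prefix.
    stable_prefix = prompt if match is None else prompt[: match.start()]
    return SUBSTITUTION.findall(stable_prefix)


restructured = (
    "### 2. GitHub.com Connectivity\n"
    "Check the HTTP code in **Pre-Fetched Data** below.\n"
    "\n"
    "## Pre-Fetched Data\n"
    "- HTTP code: `${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}`\n"
)
original = (
    "### 2. GitHub.com Connectivity\n"
    "Pre-step result: HTTP ${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }} from github.com.\n"
    "\n"
    "## Pre-Fetched PR Data\n"
)

print(substitutions_before_dynamic_section(restructured))
print(substitutions_before_dynamic_section(original))
```

An empty result means the stable prefix contains no run-specific values. Any returned entry points at a substitution that should move into the final section.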
Expected Impact
| Metric | Current | After #1 (model only) | After #1+#2+#3 (full) |
| --- | --- | --- | --- |
| Effective tokens/run | ~12.3M | ~655K | ~400K |
| Est. cost/run | ~$8.78 | ~$0.47 | ~$0.30 |
| Effective token multiplier | 30.8× | ~1.6× | ~1.0× |
| Reduction | n/a | -94.7% | -96.7% |
Projected across 5 runs/12h period (as observed in report):
| Metric | Current | Projected |
| --- | --- | --- |
| Total effective tokens | 61.3M | ~2.0M |
| Savings | n/a | -96.7% |
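The reduction percentages and the projected total follow directly from the per-run figures. The sketch below reproduces that arithmetic from the rounded values in this report. The variable names are illustrative.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction going from before to after."""
    return (before - after) / before * 100


CURRENT = 12_300_000   # effective tokens/run today
MODEL_ONLY = 655_000   # after recommendation #1
FULL = 400_000         # after recommendations #1 through #3
RUNS_PER_PERIOD = 5    # runs observed per 12h period in the report

model_only_reduction = reduction_pct(CURRENT, MODEL_ONLY)
full_reduction = reduction_pct(CURRENT, FULL)
projected_total = FULL * RUNS_PER_PERIOD

print(f"model only: -{model_only_reduction:.1f}%")
print(f"full: -{full_reduction:.1f}%")
print(f"projected total: {projected_total / 1e6:.1f}M effective tokens")
```

Five runs at the rounded 12.3M average give 61.5M rather than the reported 61.3M. The report's total was presumably computed from unrounded per-run values.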
Implementation Checklist
- In .github/workflows/smoke-copilot-byok.md: change COPILOT_MODEL: claude-opus-4.8 → COPILOT_MODEL: claude-haiku-3.5
- In the pre-step run: block: change smoke-test-copilot-byok-${GITHUB_RUN_ID}.txt → smoke-test-copilot-byok.txt
- Move all ${{ steps.smoke-data.outputs.* }} substitutions to a single Pre-Fetched Data section at the very end of the prompt body
- Run gh aw compile .github/workflows/smoke-copilot-byok.md
- Run npx tsx scripts/ci/postprocess-smoke-workflows.ts

Generated by Daily Copilot Token Optimization Advisor · sonnet46 1.7M