⚡ Copilot Token Optimization 2026-06-06: Smoke Copilot BYOK #4429

## Target Workflow: `smoke-copilot-byok`

**Source report:** #4427
**Estimated cost per run:** ~$8.78 (estimated from token counts × Anthropic pricing)
**Total tokens per run:** ~398K (avg over 5 runs)
**Effective tokens per run:** ~12.3M avg (30.8× cost multiplier)
**Cache hit rate:** ~0% (unique prompts every run — see Root Cause below)
**LLM turns:** 1

## Current Configuration

| Setting | Value |
|---------|-------|
| Model | `claude-opus-4.8` (explicitly overridden in `env:`) |
| Tools loaded | `bash: ["*"]`, `github: toolsets: [pull_requests]` |
| Network groups | `defaults`, `github` |
| Pre-agent steps | ✅ Yes (pre-computes PR data, HTTP check, file I/O) |
| Prompt size | ~7.2 KB (173 lines) |
| Dynamic content in prompt | ✅ 4 template substitutions that change every run |

## Root Cause Analysis

### Problem 1 — Wrong Model for Task Complexity

The smoke test performs four trivial verifications:
1. Run `cat` on a pre-created file to confirm it exists
2. Call `github-list_pull_requests` and confirm data returns
3. Check `SMOKE_HTTP_CODE` (always 200) is OK
4. Confirm BYOK inference is working (trivially proven by the agent responding at all)

**`claude-opus-4.8` costs $15/M input and $75/M output** — the most expensive tier. For a task requiring only basic instruction-following and one `cat` command, this is extreme overkill. The sibling `Smoke Copilot` workflow achieves the same validation pattern with the default `claude-sonnet-4.6`, producing only a ~10× effective-token multiplier vs BYOK's 30.8×.

### Problem 2 — Cache-Busting File Path

The pre-step creates a **unique file path per run**:
```bash
TEST_FILE="$TEST_DIR/smoke-test-copilot-byok-${GITHUB_RUN_ID}.txt"
```

This path is injected into the prompt body via `${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}`. Since `GITHUB_RUN_ID` changes every run, no two runs send the same prompt, and any cached prefix stops matching at the first run-specific value. With the observed ~0% cache hit rate, effectively every token is billed at full price.

An expensive model combined with zero caching explains the 30.8× multiplier, versus the ~2× we would expect for Haiku with good caching.
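To make the cache-busting effect concrete, the sketch below renders a toy prompt twice with different run IDs and measures how much of the text the two renders share as a common prefix. The template text, the `/tmp/smoke` directory, and the run IDs are hypothetical stand-ins, not the real workflow prompt. Real prefix caching works on tokens and provider-specific boundaries, so character counts are only a rough proxy.

```python
import os

# Hypothetical, shortened stand-in for the workflow prompt (the real one is ~7.2 KB).
TEMPLATE = (
    "# Smoke test instructions\n"
    "1. Confirm the file exists.\n"
    "File path: {file_path}\n"
    "2. List pull requests and confirm data returns.\n"
    "3. Report results in a short comment.\n"
)


def render(run_id: str) -> str:
    """Render the prompt with the run ID embedded in the file path, as the pre-step does today."""
    # /tmp/smoke is an assumed TEST_DIR, used only for illustration.
    path = f"/tmp/smoke/smoke-test-copilot-byok-{run_id}.txt"
    return TEMPLATE.format(file_path=path)


def shared_prefix_fraction(a: str, b: str) -> float:
    """Fraction of prompt `a` that is a character-for-character prefix of `b`."""
    return len(os.path.commonprefix([a, b])) / len(a)


run_a = render("1111111111")
run_b = render("2222222222")
print(f"shared prefix: {shared_prefix_fraction(run_a, run_b):.0%}")
```

Everything after the first run-specific character, including the unchanged instructions that follow the path, falls outside the reusable prefix.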

## Recommendations

### 1. Switch Model from `claude-opus-4.8` to `claude-haiku-3.5`

**Estimated savings: ~11.6M effective tokens/run (~94.7%)**

Change in `.github/workflows/smoke-copilot-byok.md`:
```diff
 env:
-  COPILOT_MODEL: claude-opus-4.8
+  COPILOT_MODEL: claude-haiku-3.5
```

Pricing comparison:

| Model | Input | Output | Est. cost/run (no cache) |
|-------|-------|--------|--------------------------|
| `claude-opus-4.8` (current) | $15/M | $75/M | ~$8.78 |
| `claude-haiku-3.5` (proposed) | $0.80/M | $4/M | ~$0.47 |

Effective tokens: ~12.3M/run → **~655K/run**
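
The projected figures are consistent with a simple price-ratio scaling. Input and output prices drop by the same factor (0.80/15 = 4/75 ≈ 0.053), so if token counts stay unchanged after the switch (an assumption), cost scales by that factor regardless of the input/output mix. A quick check using only the numbers quoted above:

```python
# Prices in USD per million tokens, taken from the pricing table.
OPUS = {"input": 15.00, "output": 75.00}
HAIKU = {"input": 0.80, "output": 4.00}

input_ratio = HAIKU["input"] / OPUS["input"]
output_ratio = HAIKU["output"] / OPUS["output"]

# Assumption: token counts per run are unchanged by the model switch.
current_cost = 8.78          # USD per run (report estimate)
current_effective = 12.3e6   # effective tokens per run (report average, rounded)

projected_cost = current_cost * input_ratio
projected_effective = current_effective * input_ratio

print(f"price ratio: input {input_ratio:.4f}, output {output_ratio:.4f}")
print(f"projected cost/run: ${projected_cost:.2f}")
print(f"projected effective tokens/run: {projected_effective / 1e3:.0f}K")
```

Because the inputs are rounded, the effective-token result can differ from the ~655K figure in the last digit.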

The agent needs zero advanced reasoning: it confirms pre-computed results and writes a short 5–10 line comment. Haiku is purpose-built for exactly this level of task.

Note: the sibling `smoke-copilot.md` does NOT override `COPILOT_MODEL` and runs on the repo-level default (`claude-sonnet-4.6`). The BYOK override to Opus appears unnecessary for smoke validation — it just needs any model that can follow instructions and call tools.

If validating an Opus-specific inference path is genuinely required, consider a separate targeted test that runs less frequently (for example weekly, rather than every 12 hours and on every PR).

### 2. Fix Cache-Busting File Path

**Estimated savings: additional ~20–30% after recommendation #1**

Replace the run-unique file name with a deterministic one in the pre-step:
```diff
-TEST_FILE="$TEST_DIR/smoke-test-copilot-byok-${GITHUB_RUN_ID}.txt"
+TEST_FILE="$TEST_DIR/smoke-test-copilot-byok.txt"
```

The file is still freshly written each run (no stale-data risk), but the path is now constant. This allows prefix caching to activate on the stable instruction portion of the prompt.

### 3. Move Dynamic Data to End of Prompt

**Estimated savings: additional ~20% — enables prefix caching of the stable ~80% prefix**

The current prompt inlines dynamic substitutions in the middle of the document, preventing prefix caching. Restructure so all `${{ steps.smoke-data.outputs.* }}` references appear **only in a final section**:

```diff
-### 2. GitHub.com Connectivity
-Pre-step result: HTTP ${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }} from github.com.
-✅ if HTTP 200 or 301, ❌ otherwise.
-
-### 3. File Write/Read Test
-Pre-step wrote and read back: "${{ steps.smoke-data.outputs.SMOKE_FILE_CONTENT }}"
-File path: ${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}
-Verify by running `cat` on the file path using bash to confirm it exists.
+### 2. GitHub.com Connectivity
+Check the HTTP code in **Pre-Fetched Data** below. ✅ if HTTP 200 or 301, ❌ otherwise.
+
+### 3. File Write/Read Test
+Run `cat` on the file path from **Pre-Fetched Data** below to confirm it exists.
+
+...
 
-## Pre-Fetched PR Data
-
-```
-${{ steps.smoke-data.outputs.SMOKE_PR_DATA }}
-```
+## Pre-Fetched Data
+
+
+- HTTP code: `${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}`
+- File path: `${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}`
+- File content: `${{ steps.smoke-data.outputs.SMOKE_FILE_CONTENT }}`
+- PR data:
+```
+${{ steps.smoke-data.outputs.SMOKE_PR_DATA }}
+```
```

With the stable instruction prefix fixed, prefix caching should activate on the majority of each prompt.
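
One way to keep this layout from regressing is to check that two renders with different run data still share the entire instruction section. The following is a minimal sketch with hypothetical section text and made-up values; `render_prompt` and `stable_prefix_len` are illustrative helpers, not part of the workflow tooling.

```python
import os

# Hypothetical stable instruction section: it contains no template substitutions.
INSTRUCTIONS = (
    "### 2. GitHub.com Connectivity\n"
    "Check the HTTP code in Pre-Fetched Data below.\n"
    "### 3. File Write/Read Test\n"
    "Run `cat` on the file path from Pre-Fetched Data below.\n"
)


def render_prompt(http_code: str, file_path: str, file_content: str, pr_data: str) -> str:
    """Stable instructions first, every run-specific value in one trailing section."""
    dynamic = (
        "## Pre-Fetched Data\n"
        f"- HTTP code: {http_code}\n"
        f"- File path: {file_path}\n"
        f"- File content: {file_content}\n"
        f"- PR data:\n{pr_data}\n"
    )
    return INSTRUCTIONS + dynamic


def stable_prefix_len(a: str, b: str) -> int:
    """Length of the character-for-character prefix shared by two rendered prompts."""
    return len(os.path.commonprefix([a, b]))


# Two illustrative runs with made-up values.
run_a = render_prompt("200", "/tmp/smoke/smoke-test-copilot-byok.txt", "run A", "[PR 1]")
run_b = render_prompt("301", "/tmp/smoke/smoke-test-copilot-byok.txt", "run B", "[PR 2]")

shared = stable_prefix_len(run_a, run_b)
assert shared >= len(INSTRUCTIONS), "dynamic data leaked into the instruction section"
print(f"{shared} of {len(run_a)} characters are shared")
```

If a substitution is later added back into the instruction section, the shared prefix shrinks below `len(INSTRUCTIONS)` and the assertion fails.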

## Expected Impact

| Metric | Current | After #1 (model only) | After #1+#2+#3 (full) |
|--------|---------|----------------------|----------------------|
| Effective tokens/run | ~12.3M | ~655K | ~400K |
| Est. cost/run | ~$8.78 | ~$0.47 | ~$0.30 |
| Effective token multiplier | 30.8× | ~1.6× | ~1.0× |
| Reduction | — | **-94.7%** | **-96.7%** |

Projected across the 5 runs per 12-hour period observed in the report:

| Metric | Current | Projected |
|---|---|---|
| Total effective tokens | 61.3M | ~2.0M |
| Savings | — | **-96.7%** |
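
The percentages and multipliers in these tables can be reproduced from the per-run figures. The sketch below uses the rounded values quoted in this report, so the current-state multiplier and period total may differ slightly from the table entries, which presumably come from unrounded averages.

```python
total_tokens = 398_000  # raw tokens per run (average over 5 runs)
effective = {
    "current": 12_300_000,
    "after #1": 655_000,
    "after #1+#2+#3": 400_000,
}
runs_per_period = 5

baseline = effective["current"]
for stage, tokens in effective.items():
    multiplier = tokens / total_tokens       # effective tokens per raw token
    reduction = 1 - tokens / baseline        # savings relative to the current state
    period_total = tokens * runs_per_period  # effective tokens per 12-hour period
    print(f"{stage:>15}: {multiplier:5.1f}x multiplier, "
          f"{reduction:6.1%} reduction, {period_total / 1e6:5.1f}M per period")
```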

## Implementation Checklist

- [ ] In `.github/workflows/smoke-copilot-byok.md`: change `COPILOT_MODEL: claude-opus-4.8` → `COPILOT_MODEL: claude-haiku-3.5`
- [ ] In pre-step `run:` block: change `smoke-test-copilot-byok-${GITHUB_RUN_ID}.txt` → `smoke-test-copilot-byok.txt`
- [ ] Restructure prompt: move all `${{ steps.smoke-data.outputs.* }}` substitutions to a single **Pre-Fetched Data** section at the very end of the prompt body
- [ ] Recompile: `gh aw compile .github/workflows/smoke-copilot-byok.md`
- [ ] Post-process: `npx tsx scripts/ci/postprocess-smoke-workflows.ts`
- [ ] Verify CI passes (smoke tests should still pass — task complexity unchanged)
- [ ] Compare effective tokens on next run vs baseline (~12.3M avg); target <1M effective




> Generated by [Daily Copilot Token Optimization Advisor](https://github.com/github/gh-aw-firewall/actions/runs/27058534262) · sonnet46 1.7M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw-firewall+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw-firewall%2Fcopilot-token-optimizer%22&type=issues)
