⚡ Claude Token Optimization 2026-06-04 — security-guard #4306

## Target Workflow: `security-guard`

**Source report:** #4304
**Estimated cost per active run:** $0.45
**Total tokens per active run:** ~424K raw / ~3.4M effective
**Cache read rate:** N/A (api-proxy passthrough — no cache telemetry)
**Cache write rate:** N/A
**LLM turns:** 6.2 avg (instrumented runs); the `max-turns` cap is 6, but many runs exceed it by one final turn
**Model:** claude-sonnet-4-5

## Current Configuration

| Setting | Value |
|---------|-------|
| Tools loaded | `github` (pull_requests + repos toolsets) — ~10 tools |
| Tools actually used | `mcp__github__get_pull_request_diff`, `mcp__github__list_pull_request_files`, `mcp__github__get_pull_request` |
| Bash tools (actual) | `gh pr diff`, `gh pr view`, `gh api repos/...` — all avoidable |
| Network groups | `github` only ✅ |
| Pre-agent steps | Yes — fetches up to 100 KB of PR diff |
| Prompt size | ~4,200 chars (~1,050 tokens static) + injected diff (up to ~25K tokens) |

**Effective token multiplier: 8.1×.** Each turn re-sends the accumulated context, so the ~424K raw tokens in a run are billed as ~3.4M effective tokens across 6+ turns.
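
As a quick sanity check on that multiplier, the sketch below divides the effective total by the raw total. Both inputs are the rounded figures quoted in this report, so the result lands near 8.0 rather than exactly 8.1; the function name is illustrative only.

```python
# Rounded per-run figures quoted in this report.
RAW_TOKENS_PER_RUN = 424_000
EFFECTIVE_TOKENS_PER_RUN = 3_400_000


def effective_multiplier(raw: int, effective: int) -> float:
    """Ratio of effective (billed) tokens to raw tokens for one run."""
    return effective / raw


multiplier = effective_multiplier(RAW_TOKENS_PER_RUN, EFFECTIVE_TOKENS_PER_RUN)
print(f"{multiplier:.1f}x")
```

The small gap to the reported 8.1× is what you would expect from rounding 3.4M and 424K before dividing.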

## Key Problem: Agent Ignores the Pre-Fetched Diff

The `steps:` pre-agent block fetches the full PR diff and injects it as `PR_FILES` into the prompt. Despite this, the tool_usage log shows the agent making **7 redundant `gh pr diff` bash calls** (across 4 runs) and **13 `gh api repos/...` calls** (across 10 runs), plus **4 `gh pr view` calls**.

This adds 1–2 extra turns, each of which re-sends the entire accumulated context (~50–70K tokens). Across 12 instrumented runs at $0.45/run, this alone costs an estimated **$2.00–$2.50** in avoidable turns per week.
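
To put the avoidable spend in proportion, here is a minimal sketch of the arithmetic. It assumes the 12 instrumented runs all fall within one week, which the report implies but does not state outright.

```python
# Figures from this report; "12 runs per week" is an assumption.
RUNS_PER_WEEK = 12
COST_PER_RUN = 0.45
AVOIDABLE_LOW, AVOIDABLE_HIGH = 2.00, 2.50  # estimated avoidable $/week

weekly_cost = RUNS_PER_WEEK * COST_PER_RUN
share_low = AVOIDABLE_LOW / weekly_cost
share_high = AVOIDABLE_HIGH / weekly_cost

print(f"weekly cost: ${weekly_cost:.2f}")
print(f"avoidable share: {share_low:.0%} to {share_high:.0%}")
```

Under that assumption the weekly total comes to about $5.40, close to the $5.36 in the Expected Impact table, and the avoidable turns account for a bit over a third to nearly half of it.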

## Recommendations

### 1. Enforce fast-path: block ALL bash diff/API calls in the prompt

**Estimated savings:** ~100–150K effective tokens/run (~35–45% cost reduction)

The current prompt says "Do NOT call `gh pr diff`, `git diff`, or `gh api .../files`" but allows "direct file reads from the checked-out repository" and doesn't explicitly block `gh pr view`. The agent is treating these as loopholes.

**Change in `.github/workflows/security-guard.md`** — replace the current instruction block:

```markdown
3. **Use the pre-fetched diff below as your primary source of truth. Do NOT call `gh pr diff`, `git diff`, or `gh api .../files`.** If you see `[DIFF TRUNCATED ...]`, fetch full context once with `mcp__github__get_pull_request_diff`, then continue.
4. **Do not use local branch comparisons or commit history** (for example `git diff main...HEAD` or `git log main..`) unless you first confirm the base branch exists locally...
5. **Use direct file reads from the checked-out repository** only for files you need to inspect further...
```

Replace with a single hard rule:

```markdown
3. **Use ONLY the pre-fetched diff below.** Do NOT call `gh pr diff`, `gh pr view`, `gh api`, `git diff`, `git log`, or `git show`. Do NOT read files from the checkout. If `[DIFF TRUNCATED ...]` appears, call `mcp__github__get_pull_request_diff` once — then stop making tool calls and analyze inline.
```

This removes the extra bash-tool turns described above from each active run.
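
One way to confirm the hard rule is being followed is to scan the bash commands recorded for a run. The sketch below is hypothetical: it assumes the commands are available as plain strings (the real tool_usage log format is not shown in this report), and the sample commands are made up.

```python
# Command prefixes banned by the proposed hard rule, as token tuples.
FORBIDDEN_PREFIXES = [
    ("gh", "pr", "diff"),
    ("gh", "pr", "view"),
    ("gh", "api"),
    ("git", "diff"),
    ("git", "log"),
    ("git", "show"),
]


def is_forbidden(command: str) -> bool:
    """True if the command's leading tokens match a banned prefix."""
    tokens = tuple(command.split())
    return any(tokens[: len(prefix)] == prefix for prefix in FORBIDDEN_PREFIXES)


def find_forbidden_calls(commands: list[str]) -> list[str]:
    """Return the commands that violate the hard rule, in log order."""
    return [c for c in commands if is_forbidden(c)]


# Made-up sample; a real check would read commands from the run's log.
sample_log = [
    "gh pr diff 123",
    "git status",
    "gh api repos/example/repo/pulls/123/files",
    "gh pr list",
]
violations = find_forbidden_calls(sample_log)
print(violations)
```

Matching on whole tokens rather than raw string prefixes avoids flagging unrelated commands that merely share leading characters. File reads from the checkout are not covered by this sketch and would need a separate check.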

### 2. Pre-fetch PR metadata in `steps:` to eliminate `gh pr view` calls

**Estimated savings:** ~35–50K effective tokens/run (~10–15%)

The agent makes `gh pr view` calls to get PR title, description, and author. Pre-fetch this in the `steps:` block and inject it alongside the diff.

Add to the `pr-diff` step or as a new step:

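```yaml
# Pre-fetch PR metadata so the agent has no reason to call `gh pr view`.
- name: Fetch PR metadata
  id: pr-meta
  if: github.event.pull_request.number
  run: |
    # Render title, author, and branches as Markdown lines.
    # `body` is requested but not rendered below; add `.body` to the --jq
    # expression if the agent needs the PR description.
    PR_INFO=$(gh pr view "$PR_NUMBER" --repo "$GH_REPO" \
      --json title,author,body,baseRefName,headRefName \
      --jq '"**Title:** " + .title + "\n**Author:** " + .author.login + "\n**Base→Head:** " + .baseRefName + "→" + .headRefName')
    # Multi-line step outputs need the delimiter (heredoc-style) syntax.
    echo "PR_META<<GHAWMETA" >> "$GITHUB_OUTPUT"
    echo "$PR_INFO" >> "$GITHUB_OUTPUT"
    echo "GHAWMETA" >> "$GITHUB_OUTPUT"
  env:
    GH_TOKEN: ${{ github.token }}
    PR_NUMBER: ${{ github.event.pull_request.number }}
    GH_REPO: ${{ github.repository }}
```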

Then inject `${{ steps.pr-meta.outputs.PR_META }}` at the top of the "Changed Files" section in the prompt.
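
For reference, the sketch below mirrors in Python what the `--jq` expression in the step renders, which can help when previewing the injected block locally. The function name and the sample payload are hypothetical; only the field names (`title`, `author.login`, `baseRefName`, `headRefName`) come from the step itself.

```python
import json


def format_pr_meta(pr_json: str) -> str:
    """Render PR metadata as the three Markdown lines the step emits."""
    pr = json.loads(pr_json)
    return "\n".join(
        [
            f"**Title:** {pr['title']}",
            f"**Author:** {pr['author']['login']}",
            f"**Base→Head:** {pr['baseRefName']}→{pr['headRefName']}",
        ]
    )


# Made-up payload shaped like `gh pr view --json ...` output.
sample = json.dumps(
    {
        "title": "Tighten egress rules",
        "author": {"login": "octocat"},
        "body": "Example description",
        "baseRefName": "main",
        "headRefName": "feature/egress",
    }
)
pr_meta = format_pr_meta(sample)
print(pr_meta)
```

As in the workflow step, the PR body is present in the payload but deliberately left out of the rendered text.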

### 3. Remove verbose "Repository Context" section

**Estimated savings:** ~20–30K effective tokens/run (~5–8%) across 6 turns

The current prompt has a 400+ word "Repository Context" and "Architecture" description explaining how AWF containers work. The security agent only needs to know which patterns are dangerous — not the full architecture.

**Replace the entire "Repository Context" block** with a 3-line summary:

```markdown
## Repository Context

AWF is a network firewall for AI agents. Security-critical files: `src/host-iptables.ts`, `containers/agent/setup-iptables.sh`, `src/squid-config.ts`, `src/docker-manager.ts`, `containers/agent/entrypoint.sh`, `src/domain-patterns.ts`.
```

This saves ~450 tokens × 6 turns = ~2,700 tokens raw, ~21,900 effective tokens per run.
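
The arithmetic behind that estimate can be reproduced as follows. This is a sketch using the report's own figures; applying the 8.1× multiplier uniformly to prompt tokens is the report's simplification, not a measured value.

```python
# Figures from this report.
PROMPT_TOKENS_REMOVED = 450   # tokens trimmed from the static prompt
TURNS_PER_RUN = 6             # the trimmed text is re-sent on every turn
EFFECTIVE_MULTIPLIER = 8.1    # effective/raw ratio reported for this workflow

raw_saved = PROMPT_TOKENS_REMOVED * TURNS_PER_RUN
effective_saved = raw_saved * EFFECTIVE_MULTIPLIER

print(f"raw tokens saved per run: {raw_saved}")
print(f"effective tokens saved per run: {round(effective_saved)}")
```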

### 4. Reduce max-turns from 6 to 4

**Estimated savings:** ~80–120K effective tokens/run (~25%) for runs currently hitting the cap

With recommendation #1 eliminating bash tool calls, the agent should complete in 3–4 turns:
- Turn 1: Read pre-fetched diff + fast-path check
- Turn 2: Deep analysis (if needed)
- Turn 3: Write PR comment + call noop/add_labels

```yaml
engine:
  id: claude
  model: claude-sonnet-4-5
  max-turns: 4   # was: 6
```

## Cache Analysis (Anthropic-Specific)

Cache telemetry is not available for this workflow (`token_usage_summary` absent — api-proxy operates in passthrough mode without cache tracking).

However, the **8.1× effective/raw multiplier** suggests that Anthropic prompt caching is active, although this cannot be confirmed without telemetry: the growing context window would be cached and reused within each session. That is beneficial, because the first turn's large context (PR diff + system prompt) is cached for subsequent turns.

**The problem is not cache inefficiency — it's too many turns causing too many cache reads.** Reducing turns (via recommendations #1 and #4) will cut cache-read roundtrips proportionally.

| Turn | Est. Raw Tokens | Effective Tokens (8.1× factor) | Note |
|------|----------------:|-------------------------------:|------|
| 1 | ~55K | ~55K | First turn — full context |
| 2 | ~60K | ~120K | Growing context |
| 3 | ~65K | ~260K | Cache reads compound |
| 4 | ~70K | ~540K | Large cache-read |
| 5 | ~75K | ~1.1M | Dominates cost |
| 6 | ~80K | ~1.3M | Often just writing the comment |

Reducing from 6 to 3 turns cuts effective tokens by ~75%.
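
The per-turn estimates in the table can be summed to see how heavily the later turns weigh. The sketch below uses only the table's own values, in thousands of tokens. Taken literally, they imply a larger cut than ~75%, so the quoted figure is the more conservative one. A plausible reason, though the report does not say so, is that a 3-turn run would carry more context per turn than the first three rows suggest.

```python
# Effective tokens per turn from the table above, in thousands.
EFFECTIVE_BY_TURN_K = [55, 120, 260, 540, 1100, 1300]


def effective_total_k(turns: int) -> int:
    """Sum of the table's effective tokens over the first `turns` turns."""
    return sum(EFFECTIVE_BY_TURN_K[:turns])


six_turns = effective_total_k(6)
three_turns = effective_total_k(3)
reduction = 1 - three_turns / six_turns

print(f"6 turns: ~{six_turns}K, 3 turns: ~{three_turns}K")
print(f"reduction implied by the table: {reduction:.0%}")
```

The six-turn sum, about 3.4M, matches the effective total reported at the top of this issue.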

## Expected Impact

| Metric | Current | Projected | Savings |
|--------|---------|-----------|---------|
| Total tokens/active run (raw) | 424K | ~200K | −53% |
| Effective tokens/active run | 3.4M | ~900K | −74% |
| Cost/active run | $0.45 | ~$0.12 | −73% |
| LLM turns/active run | 6.2 | ~3 | −3 turns |
| Total weekly cost (Security Guard) | $5.36 | ~$1.45 | −$3.91/wk |
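
The Savings column can be rechecked from the Current and Projected columns. This is a minimal sketch using the table's rounded values; the dictionary keys are shorthand labels, not names from any tool.

```python
def pct_change(current: float, projected: float) -> float:
    """Percentage change from current to projected (negative = saving)."""
    return (projected - current) / current * 100


# (current, projected) pairs taken from the table above.
rows = {
    "raw_tokens": (424_000, 200_000),
    "effective_tokens": (3_400_000, 900_000),
    "cost_per_run": (0.45, 0.12),
}
savings = {name: round(pct_change(cur, proj)) for name, (cur, proj) in rows.items()}
weekly_saving = round(5.36 - 1.45, 2)

print(savings)
print(f"weekly saving: ${weekly_saving:.2f}")
```

With these inputs the rounded percentages and the weekly dollar saving agree with the table.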

## Implementation Checklist

- [ ] Edit `.github/workflows/security-guard.md`: replace multi-line "do NOT call" rules with single hard rule (Rec #1)
- [ ] Edit `.github/workflows/security-guard.md`: add `pr-meta` pre-agent step and inject into prompt (Rec #2)
- [ ] Edit `.github/workflows/security-guard.md`: condense "Repository Context" section to 3 lines (Rec #3)
- [ ] Edit `.github/workflows/security-guard.md`: set `max-turns: 4` (Rec #4)
- [ ] Recompile: `gh aw compile .github/workflows/security-guard.md`
- [ ] Post-process: `npx tsx scripts/ci/postprocess-smoke-workflows.ts`
- [ ] Verify CI passes on PR
- [ ] Compare token usage on new run vs baseline (target: ≤200K raw tokens/active run)




> Generated by [Daily Claude Token Optimization Advisor](https://github.com/github/gh-aw-firewall/actions/runs/26943846231) · sonnet46 1.7M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw-firewall+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw-firewall%2Fclaude-token-optimizer%22&type=issues)
