Executive Summary
AgentRx normalized the 30 most recent gh-aw workflow runs (4.3h wall, 25.7M tokens, $6,987 AIC) into a canonical IR trajectory and combined it with per-run MCP audit telemetry. The fleet had exactly one hard failure, and its root cause is concrete and generalizable. The Daily Safe Outputs Git Simulator run was marked failure not because the agent failed (it succeeded: all 4 git configs PASS, read-only noop) but because the downstream push_repo_memory job was rejected 4× by a repository ruleset (GH013: "Commits must have verified signatures") when pushing the first commit of the new orphan branch memory/git-simulator.
The push retry loop in push_repo_memory.cjs treats this permanent, non-retryable policy rejection as transient and burns 4 exponential-backoff attempts (~36s) before failing. The highest-impact fix is to detect rule-violation rejections and fail fast, plus seed new memory/* branches with a signed commit (or exempt memory/** from the verified-signatures ruleset) so repo-memory-pushing workflows succeed on first run.
AgentRx Evidence
Critical step: push_repo_memory job → step "Push repo-memory changes (default)" (deterministic post-agent step, IR step set #5 / run #4 in the fleet trajectory). The agent, detection, and safe_outputs jobs all concluded success; only push_repo_memory concluded failure.
Failure category: Reliability — non-retryable git push rejection misclassified as transient (retry/backoff misfire). Remote error: GH013: Repository rule violations found for refs/heads/memory/git-simulator ... Commits must have verified signatures ... push declined due to repository rule violations.
Frequency / impact: 1/1 = 100% failure rate for this workflow (AgentRx flagged it as the fleet's sole reliability hotspot, severity=high). The failed run still consumed 558,293 tokens / 152.6 AIC and 16 turns; the retry loop added ~36s of wasted CI before the inevitable failure. Identical remote rejected errors logged on attempts 1/4, 2/4, 3/4, 4/4.
Representative runs: [§27397597917](https://github.com/github/gh-aw/actions/runs/27397597917) (the failure).
Secondary findings (not the recommended fix)
Network friction: Documentation Noob Tester blocked 85/288 requests (30%), all to Google domains (accounts.google.com=20, android.clients.google.com=13, www.google.com, clients2.google.com, safebrowsing...), i.e. Chromium/Playwright background telemetry. Wasteful but non-fatal; a candidate for a follow-up (suppress browser telemetry via launch flags).
Telemetry quality: the failed run's audit produced 14 garbage subagent_model_requests (f.write, os.path.getsize, len, ...) inferred from agent-stdio.log — a parser artifact, not a real model mismatch. The audit also mislabeled the run as "failed before agent activation," which is incorrect.
Config gap (worked around): the run's noop notes that the config-simulator sub-agent type is not registered in this harness; the run fell back to general-purpose agents.
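The launch-flag follow-up mentioned under network friction could look roughly like the sketch below. The switches are standard Chromium command-line flags, but the helper name `withQuietArgs` is hypothetical, and this report does not verify that these flags remove every blocked Google request; Playwright may already apply some of them by default, so the blocked-request count would need to be re-measured after any change.

```javascript
// Chromium switches that reduce background network traffic.
// Assumption: not verified against the domains listed in this report.
const QUIET_CHROMIUM_ARGS = [
  "--disable-background-networking",
  "--disable-sync",
  "--disable-component-update",
  "--disable-default-apps",
  "--no-first-run",
];

// Hypothetical helper: merges the quiet switches into a launch-options
// object, keeping existing args and avoiding duplicates.
function withQuietArgs(launchOptions = {}) {
  const existing = launchOptions.args || [];
  const merged = [...new Set([...existing, ...QUIET_CHROMIUM_ARGS])];
  return { ...launchOptions, args: merged };
}
```

With Playwright, the result would be passed as the options object, for example `chromium.launch(withQuietArgs({ headless: true }))`.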
AgentRx Artifacts
Run dir: /tmp/gh-aw/agent/agentrx/runs/gh-aw-daily
IR summary (Stage 1/6 — completed, deterministic): The MCP run fleet was converted to a single canonical IR trajectory gh-aw-daily-agent-fleet with 32 steps (1 instruction turn + 30 per-run turns + 1 fleet-insight turn), ir_used_llm_fallback=false, ir_from_markdown=true. Each step encodes a run's engine, status, conclusion, duration, turns, token_usage, error_count, and api calls. Validation: 1 trajectory loaded, 1 valid.
Invariant / checker / judge stages (Stages 2–5): Could not run in this environment. The static, dynamic, check, and judge stages all require the GitHub Copilot CLI LLM endpoint (RuntimeError: 'copilot' CLI not found on PATH); Azure/TRAPI endpoints need env vars that are unset, and gh is unauthenticated. No check.json / judge.json were produced. Per the runbook's degraded-mode guidance, the recommendation below is grounded in the deterministic IR plus per-run MCP audit telemetry (which carries the exact remote error logs) rather than LLM-generated invariants.
Known limitations: AgentRx install required CPython (the default python is PyPy 7.3.23, for which the jiter dependency has no wheel and its Rust source build is firewall-blocked); the venv was rebuilt with CPython 3.12. Root-cause classification (judge) was done manually from audit logs in lieu of the LLM judge.
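Failure-pattern classification (manual, in lieu of failure-pattern-classifier):

| violation | evidence | fix_type | rationale |
| --- | --- | --- | --- |
| Push to orphan memory/git-simulator rejected by ruleset | remote: GH013 ... Commits must have verified signatures ... push declined (run 27397597917, push_repo_memory step 7) | precondition check before expensive retry | First commit on a new orphan branch is unsigned; a verified-signatures ruleset will reject it deterministically. |
| | Push failed (attempt 1/4...4/4), retrying; push_repo_memory.cjs:477-500 (MAX_RETRIES=3, exp backoff) | | |
| | documentation-noob-tester blocked=85 total=288; blocked Google domains | | |
| | subagent_model_requests: f.write/os.path.getsize/len..., REQUESTED_MODEL_NOT_OBSERVED | | stdio-log heuristic misreads Python snippets as model requests; use token_usage.jsonl. |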
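As a rough illustration of counting model requests from token_usage.jsonl instead of inferring them from agent-stdio.log, the sketch below tallies records per model. The schema of token_usage.jsonl is not shown in this report, so the `model` field and the function name `countModelRequests` are assumptions made for illustration only.

```javascript
// Tallies model requests from JSONL text, one JSON record per line.
// Assumption: each record carries a string `model` field (hypothetical schema).
function countModelRequests(jsonlText) {
  const counts = new Map();
  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue; // ignore blank lines
    let record;
    try {
      record = JSON.parse(trimmed);
    } catch {
      continue; // skip malformed lines rather than guessing at their meaning
    }
    if (record && typeof record.model === "string") {
      counts.set(record.model, (counts.get(record.model) || 0) + 1);
    }
  }
  return counts;
}
```

Because only well-formed records with a `model` field are counted, stray Python snippets such as f.write or os.path.getsize cannot show up as model names.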
Recommended Optimization
One change: In actions/setup/js/push_repo_memory.cjs, make the push retry loop (lines ~481–500) fail fast on non-retryable repository-rule rejections instead of retrying. Before the setTimeout backoff, inspect errMsg and, if it matches a permanent policy rejection (GH013, push declined due to repository rule violations, or must have verified signatures), break out of the loop immediately and call core.setFailed with the actionable seed-the-branch guidance that push_signed_commits.cjs:298-301 already emits.
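A minimal sketch of the fail-fast check described above, assuming the loop has access to the error message as a string. The three patterns come from the remote error text quoted in this report; `pushWithRetry`, `pushOnce`, `sleep`, and the option names are hypothetical stand-ins and do not reproduce the actual code in push_repo_memory.cjs.

```javascript
// Patterns taken from the remote rejection text quoted in this report.
const NON_RETRYABLE_PUSH_PATTERNS = [
  /GH013/,
  /push declined due to repository rule violations/i,
  /must have verified signatures/i,
];

// True when the message looks like a permanent policy rejection.
function isNonRetryablePushError(errMsg) {
  return NON_RETRYABLE_PUSH_PATTERNS.some((re) => re.test(String(errMsg)));
}

// Hypothetical retry wrapper: one initial attempt plus up to `maxRetries`
// retries with exponential backoff, stopping at once on policy rejections.
async function pushWithRetry(pushOnce, { maxRetries = 3, baseDelayMs = 1000, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let attempts = 0;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    attempts++;
    try {
      await pushOnce();
      return { ok: true, attempts };
    } catch (err) {
      const errMsg = err instanceof Error ? err.message : String(err);
      if (isNonRetryablePushError(errMsg)) {
        // Permanent rejection: retrying cannot change the outcome.
        return { ok: false, attempts, retryable: false, errMsg };
      }
      if (attempt === maxRetries) {
        return { ok: false, attempts, retryable: true, errMsg };
      }
      await wait(baseDelayMs * 2 ** attempt);
    }
  }
}
```

A caller could then invoke core.setFailed once with the seeding guidance whenever the result has `retryable === false`, rather than after the last backoff.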
Reliability remediation (to make the run actually pass): seed each new memory/* orphan branch with one signed commit before the first scheduled run (the code already prints the exact git switch --orphan ... commands), or add a ruleset bypass / exempt memory/** from the "Require verified signatures" rule for the gh-aw bot identity. The GraphQL createCommitOnBranch path produces server-signed commits, but the orphan first-push falls back to a plain git push (push_signed_commits.cjs:282-301), which the ruleset rejects.
Why highest impact: It targets the fleet's only hard failure and a class of failures (every workflow that pushes a brand-new memory/* branch under this ruleset hits the identical wall). The retry change is a few lines, removes wasted CI time, and converts a confusing "failed after 4 retries" into a single actionable error; the seeding/exemption converts a 100% failure into success.
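Where to implement:
actions/setup/js/push_repo_memory.cjs (retry loop ~481–500): non-retryable-error short-circuit.
actions/setup/js/push_signed_commits.cjs (orphan-branch path ~282–301): surface a typed/non-retryable error so the caller can branch on it.
Repo ruleset settings (memory/**): seed or exempt for reliability.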
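One possible shape for the typed, non-retryable error mentioned for push_signed_commits.cjs is sketched below. The class name `NonRetryablePushError`, its fields, and the helpers `classifyPushFailure` and `shouldRetry` are hypothetical; only the GH013 code and the "repository rule violations" text come from this report.

```javascript
// Hypothetical error type that lets a caller branch on retryability
// without re-parsing the error message.
class NonRetryablePushError extends Error {
  constructor(message, { code, ref } = {}) {
    super(message);
    this.name = "NonRetryablePushError";
    this.code = code; // e.g. "GH013"
    this.ref = ref; // branch that was rejected
    this.retryable = false;
  }
}

// Wraps a raw push failure message, upgrading ruleset rejections to the
// typed error and leaving everything else as a plain Error.
function classifyPushFailure(rawMessage, ref) {
  if (/GH013|repository rule violations/i.test(rawMessage)) {
    return new NonRetryablePushError(rawMessage, { code: "GH013", ref });
  }
  return new Error(rawMessage);
}

// Caller-side check: retry only errors that are not marked permanent.
function shouldRetry(err) {
  return !(err instanceof NonRetryablePushError);
}
```

Raising a distinct type at the orphan-branch push site keeps the string matching in one place, so the retry loop only needs an `instanceof` check.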
Validation Plan
Confirm fast-fail: re-run Daily Safe Outputs Git Simulator on a still-unseeded memory/* branch; expect 1 push attempt (not 4) and a clear core.setFailed naming GH013 + seed instructions. Check the push_repo_memory step log for a single remote rejected line.
Confirm reliability fix: after seeding memory/git-simulator with a signed commit (or exempting memory/**), re-run and expect push_repo_memory to conclude success and the overall run conclusion=success.
Expected metric changes: Daily Safe Outputs Git Simulator failure rate 100% → 0%; push_repo_memory wasted retry time ~36s → ~0; fleet total_errors 1 → 0; no change to agent token spend (the agent was never the problem).
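The fast-fail check in the plan above (a single remote rejected line in the step log) could be automated along these lines. This is a sketch that assumes the step log is available as plain text; the function names are hypothetical.

```javascript
// Counts log lines that record a rejected push, matching the
// "remote rejected" text referenced in this report.
function countRejectedPushes(logText) {
  return logText.split("\n").filter((line) => /remote rejected/i.test(line)).length;
}

// Fast-fail is confirmed when exactly one rejection was logged.
function confirmsFastFail(logText) {
  return countRejectedPushes(logText) === 1;
}
```

Run against the failing run's log, `countRejectedPushes` would be expected to return 4 before the change and 1 after it.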
References
§27397597917 — Daily Safe Outputs Git Simulator (the failure; GH013 ruleset rejection in push_repo_memory)
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
index.crates.io
See Network Configuration for more information.