Agentic Workflow Portfolio Yield Report
Date: 2026-05-26 | Workflows analyzed: 236 | Evidence quality: Low (telemetry backend unavailable at precompute time; Sentry live data used for validation)
Executive Summary
The gh-aw workflow portfolio of 236 workflows has a computed yield of −89.12, driven by extreme overlap drag (356.57). The portfolio operates at 97.2% agentic fraction with only 2.8% deterministic logic — near-zero guardrails across most workflows. Fragmentation is maximal (1.0) and governance drag is critically high (0.91).
The precompute reported 0% telemetry coverage, but live Sentry validation contradicts this: ~24 workflows have active spans in the past 7 days, confirming the Grafana telemetry backend was unavailable during precompute rather than workflows lacking instrumentation. Sentry identifies Smoke Copilot as the highest-failure workflow (64 failures, 16.3% failure rate) and Deployment Incident Monitor as a degraded operational workflow (6 failures, strict mode disabled).
The portfolio's central structural problem is not individual workflow quality but fragmentation: 189 of 236 workflows are merge candidates due to overlapping semantic scope, and only 3 meet the bar for unconditional keep.
Portfolio Health
| Metric | Value | Signal |
| --- | --- | --- |
| Workflow count | 236 | 🔴 Oversized |
| Portfolio yield | −89.12 | 🔴 Critical |
| Overlap drag | 356.57 | 🔴 Critical |
| Average agentic fraction | 97.2% | 🔴 Near-zero determinism |
| Maintenance drag | 0.89 | 🔴 High |
| Portfolio cost | 0.60 | 🟡 Elevated |
| Portfolio risk | 0.67 | 🟡 Elevated |
| Fragmentation | 1.0 | 🔴 Maximum |
| Governance drag | 0.91 | 🔴 Critical |
| Reuse signal | 0.89 | 🟡 Mixed (high but untested) |
| Trust concentration | 0.24 | 🔴 Low |
| Telemetry observed (Sentry, 7d) | ~24 workflows | 🟡 Partial |
Workflow Portfolio
Summary by recommendation category:
| Category | Count | Rationale |
| --- | --- | --- |
| ✅ Keep | 3 | High yield, low overlap, acceptable risk |
| ✏️ Revise | 9 | Recoverable with targeted hardening |
| 🔀 Merge | 189 | Overlap-driven consolidation candidates |
| 📡 Instrument | 29 | Missing telemetry or safe-outputs |
| 🗑️ Retire | 6 | Risk ≥ 0.80, all compliance checks failing |
Keep
| Workflow | Notes |
| --- | --- |
| aw-portfolio-yield.md | Core governance infrastructure; lowest agentic fraction (0.374) |
| example-permissions-warning.md | Canonical permissions reference; fix: enable strict mode |
| terminal-stylist.md | Unique utility with no overlap peers; fix: add safe-outputs |
Retire
| Workflow | Risk | Critical Issues |
| --- | --- | --- |
| ace-editor.md | 1.0 | No strict mode, no timeout, no safe-outputs |
| copilot-pr-nlp-analysis.md | 1.0 | Stale lockfile, no strict mode, no safe-outputs |
| daily-semgrep-scan.md | 0.80 | No strict mode, no timeout (security workflow) |
| daily-sentrux-report.md | 0.94 | No strict mode, no timeout, no safe-outputs |
| firewall.md | 0.98 | Stale lockfile, no strict mode, no safe-outputs |
| test-workflow.md | 1.0 | No strict mode, no timeout, no safe-outputs, no value |
Top Revise Candidates
| Workflow | Risk | Issues | Sentry |
| --- | --- | --- | --- |
| deployment-incident-monitor.md | 0.73 | No strict mode | 6 failures/7d 🔴 |
| constraint-solving-potd.md | 0.84 | No strict mode, no timeout | 2 failures/7d |
| daily-model-inventory.md | 0.68 | Stale lockfile, highest cost (0.675) | n/a |
| weekly-editors-health-check.md | 0.41 | High cost (0.672), no telemetry | n/a |
| dependabot-repair.md | 0.81 | No strict mode, no timeout | n/a |
Overlap Clusters
Three high-priority consolidation clusters identified (peak overlap of roughly 0.75 or higher):
Cluster 1 — OTel Instrumentation Advisors (overlap: 0.978) 🔴 Near-duplicate
daily-grafana-otel-instrumentation-advisor.md <─┐
daily-otel-instrumentation-advisor.md ───┘ Merge: Grafana as optional backend flag
Cluster 2 — Smoke Agent Scope Tests (overlap: 0.787)
smoke-agent-all-merged.md <─┐
smoke-agent-all-none.md ──┤ Merge: parameterize scope (all/public, merged/none)
smoke-agent-public-none.md ──┘
Cluster 3 — Engine Smoke Tests (overlap: 0.745)
smoke-antigravity.md <─┐
smoke-gemini.md ──┤ Merge: engine matrix workflow
smoke-pi.md ──┘
+ smoke-opencode.md, smoke-crush.md, smoke-claude.md, smoke-copilot-arm.md, smoke-codex.md
Additional high-overlap pairs (0.60–0.75):
| Pair | Overlap | Action |
| --- | --- | --- |
| smoke-create-cross-repo-pr + smoke-update-cross-repo-pr | 0.682 | Merge CRUD pair |
| research + scout | 0.655 | Differentiate or merge |
| daily-safe-output-optimizer + safe-output-health | 0.635 | Merge with mode flag |
| copilot-pr-prompt-analysis + prompt-clustering-analysis | 0.632 | Merge |
| grumpy-reviewer + pr-code-quality-reviewer | 0.622 | Retire grumpy into active PR reviewer |
| dependabot-campaign + dependabot-worker | 0.639 | Evaluate orchestrator/worker split |
Episode-Level Observations
- No episode_metrics were computed in precompute, so cross-workflow coordination patterns are not yet tracked.
- The smoke-engine cluster (antigravity, gemini, pi, opencode, crush, claude, copilot-arm, codex) spans 8 workflows testing different AI engines with near-identical structure, which makes it a natural episode candidate for a single parameterized smoke suite.
- The daily-* prefix group (30+ workflows) shows no episode-level coordination despite sharing a daily cadence. There is potential for a unified daily maintenance episode with shared health reporting.
- Sentry data shows Smoke CI (1970 spans) and Test Quality Sentinel (991 spans) running at similar cadences; these may form a natural CI quality episode.
Organizational Health Signals
| Signal | Score | Interpretation |
| --- | --- | --- |
| Fragmentation | 1.00 | Maximum: every workflow is a silo |
| Governance drag | 0.91 | Critically elevated by broad scope + missing telemetry |
| Reuse signal | 0.89 | High declared reuse, unconfirmed (telemetry backend gap) |
| Trust concentration | 0.24 | Low: execution trust spread thinly across many low-confidence workflows |
The portfolio exhibits a classic sprawl pattern: rapid workflow creation without consolidation, no episode-level coordination, and telemetry gaps masking actual value delivery.
Deterministic vs Agentic Findings
- 97.2% portfolio agentic fraction: near-zero deterministic validation across 236 workflows
- Only aw-portfolio-yield.md achieves a meaningfully low agentic fraction (0.374) with real pre/post-agent guardrails
- All 6 retirement candidates have agentic_fraction = 1.0 and zero deterministic steps
- Revise candidates average a risk of 0.628 versus 0.954 for retire candidates, so the revise group is recoverable
- The precompute telemetry_coverage = 0.0 is a Grafana backend availability gap; Sentry live data confirms real telemetry for ~24 workflows
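The retire-group average can be spot-checked against the Retire table in this report. The sketch below recomputes the mean from the six rounded risk scores printed there. It lands marginally under the reported 0.954, which would be consistent with the precompute averaging unrounded scores (an assumption; the unrounded values are not in this report). The revise-group figure cannot be reproduced the same way, because only five of the nine revise candidates are listed.

```python
from statistics import mean

# Risk scores exactly as printed in the Retire table (already rounded).
retire_risk = {
    "ace-editor.md": 1.0,
    "copilot-pr-nlp-analysis.md": 1.0,
    "daily-semgrep-scan.md": 0.80,
    "daily-sentrux-report.md": 0.94,
    "firewall.md": 0.98,
    "test-workflow.md": 1.0,
}

retire_mean = mean(retire_risk.values())
print(f"retire mean risk: {retire_mean:.3f}")  # prints "retire mean risk: 0.953"
```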
Highest-Value Actions
- 🔀 Merge daily-grafana-otel-instrumentation-advisor + daily-otel-instrumentation-advisor: 0.978 overlap, near-duplicate. Make Grafana an optional backend parameter.
- 🔀 Parameterize the smoke-agent-all-* cluster into one matrix workflow with scope flags: all 3 are confirmed active, 0.787 overlap.
- 🗑️ Retire firewall.md, ace-editor.md, test-workflow.md: risk ≥ 0.981, all compliance checks failing, zero yield.
- 📡 Instrument daily-compiler-quality.md and deployment-incident-monitor.md: both have confirmed Sentry failures but insufficient telemetry to diagnose.
- 🔧 Fix deployment-incident-monitor.md strict mode: 6 Sentry failures in 7 days with strict mode disabled. Highest operational risk in production workflows.
Retirement Candidates
| Workflow | Risk | Blockers to Retirement |
| --- | --- | --- |
| ace-editor.md | 1.0 | None; no consumers observed |
| copilot-pr-nlp-analysis.md | 1.0 | Verify not referenced by other workflows |
| daily-semgrep-scan.md | 0.80 | Confirm security scanning covered elsewhere |
| daily-sentrux-report.md | 0.94 | Archive last report output before retiring |
| firewall.md | 0.98 | Stale lockfile suggests already inactive |
| test-workflow.md | 1.0 | None |
Consolidation Opportunities
| Priority | Workflows | Overlap | Estimated Reduction |
| --- | --- | --- | --- |
| 🔴 Critical | grafana-otel + otel-advisor | 0.978 | 2 → 1 |
| 🔴 Critical | smoke-agent-all-* (3 workflows) | 0.787 | 3 → 1 |
| 🟡 High | smoke engine cluster (8 workflows) | 0.65–0.745 | 8 → 1 matrix |
| 🟡 High | smoke CRUD pair | 0.682 | 2 → 1 |
| 🟡 High | research + scout | 0.655 | 2 → 1 |
| 🟢 Medium | safe-output-optimizer + safe-output-health | 0.635 | 2 → 1 |
| 🟢 Medium | copilot-pr-prompt + prompt-clustering | 0.632 | 2 → 1 |
| 🟢 Medium | grumpy-reviewer → pr-code-quality-reviewer | 0.622 | 2 → 1 |
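Potential portfolio reduction: 236 → ~150 workflows if all Tier 1–2 consolidations are executed. The eight consolidations listed above account for 15 of those removals (1 + 2 + 7 + 1 + 1 + 1 + 1 + 1); the remainder would have to come from the wider pool of 189 merge candidates.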
Instrumentation Gaps
- Precompute telemetry_coverage = 0.0 is a backend gap, not a true gap: Sentry confirms ~24 active workflows. Future precomputes should fall back to Sentry when Grafana is unavailable.
- gh-aw.run.status is null for ~50% of Sentry spans: the status field is not populated consistently in send_otlp_span.cjs, so failure rates cannot be computed for roughly half of all spans.
- Zero Sentry error events: gh-aw workflows only emit span-level gh-aw.run.status:failure, never Sentry error events. This prevents alerting and issue auto-creation.
- 29 workflows have no OTEL spans. Priority targets:
  - agentic-token-audit.md: has safe-outputs, missing OTEL
  - daily-agentrx-trace-optimizer.md: ironic gap for a trace optimizer
  - daily-compiler-quality.md: 4 confirmed Sentry failures need context
  - copilot-cli-deep-research.md: research outcomes not tracked
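The Sentry fallback recommended in the first item could look roughly like the sketch below. Everything in it is hypothetical: `telemetry_coverage`, the `Fetcher` type, the stub fetchers, and the `wf-*.md` names are illustrative and are not taken from the gh-aw precompute code. The point it illustrates is that an unreachable backend should yield "unknown" (`None`) rather than a coverage of 0.0, which is the misreading this report had to correct.

```python
from typing import Callable, Optional

# Hypothetical contract: a fetcher returns the names of workflows that have
# spans, or raises ConnectionError when its backend cannot be reached.
Fetcher = Callable[[], set[str]]


def telemetry_coverage(
    workflows: list[str],
    primary: Fetcher,
    fallback: Fetcher,
) -> tuple[Optional[float], str]:
    """Return (coverage, source); coverage is None if no backend answered."""
    if not workflows:
        return None, "unavailable"
    for source, fetch in (("primary", primary), ("fallback", fallback)):
        try:
            observed = fetch()
        except ConnectionError:
            continue  # backend unavailable: try the next one
        covered = sum(1 for name in workflows if name in observed)
        return covered / len(workflows), source
    return None, "unavailable"


# Stub backends for illustration only.
def grafana_down() -> set[str]:
    raise ConnectionError("primary telemetry backend unavailable")


def sentry_spans() -> set[str]:
    return {"wf-a.md", "wf-b.md"}


coverage, source = telemetry_coverage(
    ["wf-a.md", "wf-b.md", "wf-c.md", "wf-d.md"],
    primary=grafana_down,
    fallback=sentry_spans,
)
print(coverage, source)  # prints "0.5 fallback"
```

Recording which source answered also lets the report state its evidence quality directly instead of inferring it after the fact.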
Deterministic Portfolio JSON
{
  "workflow_count": 236,
  "portfolio_yield": -89.1218,
  "portfolio_cost": 0.6005,
  "portfolio_risk": 0.6653,
  "portfolio_maintenance_drag": 0.8877,
  "portfolio_overlap_drag": 356.5721,
  "average_agentic_fraction": 0.9716,
  "average_deterministic_fraction": 0.0284,
  "observability_declared_coverage": 0.8771,
  "telemetry_coverage": 0.0,
  "evidence_quality": "low",
  "organizational_health": {
    "fragmentation": 1.0,
    "governance_drag": 0.9081,
    "reuse": 0.8941,
    "trust_concentration": 0.2425
  },
  "recommendations_seed": {
    "keep": 3,
    "revise": 9,
    "merge": 189,
    "instrument": 29,
    "retire": 6
  },
  "top_overlap_cluster": {
    "workflows": [
      "daily-grafana-otel-instrumentation-advisor.md",
      "daily-otel-instrumentation-advisor.md"
    ],
    "max_overlap": 0.9777
  },
  "sentry_validation": {
    "active_workflows_7d": "~24",
    "highest_failure_rate": "Smoke Copilot: 64 failures, 16.3%",
    "highest_operational_failures": "Deployment Incident Monitor: 6 failures/7d",
    "portfolio_yield_run_failures": 4,
    "run.status_null_rate": "~50% of spans"
  }
}
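The summary above can be sanity-checked for internal consistency. The sketch below copies a subset of its fields and verifies two things: that the recommendation counts add up to the workflow count, and that the agentic and deterministic fractions sum to 1. The helper name `check_summary` is hypothetical, and the first check assumes each workflow receives exactly one recommendation, which the matching totals suggest but the report does not state.

```python
import json
import math

# Subset of the precompute summary, values copied verbatim from the report.
summary = json.loads("""
{
  "workflow_count": 236,
  "average_agentic_fraction": 0.9716,
  "average_deterministic_fraction": 0.0284,
  "recommendations_seed": {
    "keep": 3, "revise": 9, "merge": 189, "instrument": 29, "retire": 6
  }
}
""")


def check_summary(s: dict) -> list[str]:
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    total = sum(s["recommendations_seed"].values())
    if total != s["workflow_count"]:
        problems.append(
            f"recommendations cover {total} of {s['workflow_count']} workflows"
        )
    fractions = (
        s["average_agentic_fraction"] + s["average_deterministic_fraction"]
    )
    if not math.isclose(fractions, 1.0, abs_tol=1e-4):
        problems.append(f"agentic + deterministic fractions sum to {fractions}")
    return problems


problems = check_summary(summary)
print(problems or "summary is internally consistent")
```

Both checks pass for the figures in this report: 3 + 9 + 189 + 29 + 6 = 236, and 0.9716 + 0.0284 = 1.0.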
Generated by 📊 Agentic Workflow Portfolio Yield · sonnet46 1.9M · ◷