diff --git a/.github/workflows/discussion-task-mining.campaign.lock.yml b/.github/workflows/discussion-task-mining.campaign.lock.yml index 73a3efc78c7..20bc83a29ca 100644 --- a/.github/workflows/discussion-task-mining.campaign.lock.yml +++ b/.github/workflows/discussion-task-mining.campaign.lock.yml @@ -817,6 +817,16 @@ jobs: 8. Only predefined project fields may be updated. 9. **Project Update Instructions take precedence for all project writes.** + ### Why These Principles Matter + + **Workers are immutable** - Allows reuse across campaigns without coupling. You coordinate existing workflows, don't modify them. + + **Reads and writes are separate** - Prevents race conditions and inconsistent state. Always read all data first, then make all writes. + + **Idempotent operation** - Campaign can be re-run safely if interrupted. The orchestrator picks up where it left off using the cursor. + + **Only predefined fields** - Prevents accidental project board corruption. The orchestrator only updates fields it's configured to manage. + --- ## Required Phases (Execute In Order) @@ -911,19 +921,27 @@ jobs: - Closed (issue/discussion) → `Done` - Merged (PR) → `Done` + **Why use explicit GitHub state?** - GitHub is the source of truth for work status. Inferring status from other signals (labels, comments) would be unreliable and could cause incorrect tracking. + 6) Calculate required date fields for each item (per Project Update Instructions): - `start_date`: format `created_at` as `YYYY-MM-DD` - `end_date`: - if closed/merged → format `closed_at`/`merged_at` as `YYYY-MM-DD` - if open → **today's date** formatted `YYYY-MM-DD` (required for roadmap view) + **Why use today for open items?** - GitHub Projects requires end_date for roadmap views. Using today's date shows the item is actively tracked and updates automatically each run until completion. + 7) Do NOT implement idempotency by comparing against the board. You may compare for reporting only. 
+ **Why no comparison for idempotency?** - The safe-output system handles deduplication. Comparing would add complexity and potential race conditions. Trust the infrastructure. + 8) Apply write budget: - If `MaxProjectUpdatesPerRun > 0`, select at most that many items this run using deterministic order (e.g., oldest `updated_at` first; tie-break by ID/number). - Defer remaining items to next run via cursor. + **Why use deterministic order?** - Ensures predictable behavior and prevents starvation. Oldest items are processed first, ensuring fair treatment of all work items. The cursor saves progress for next run. + ### Phase 3 — Write State (Execution) [WRITES ONLY] 9) For each selected item, send an `update-project` request. @@ -1112,6 +1130,12 @@ jobs: ### Adding New Issues + PROMPT_EOF + - name: Append prompt (part 2) + env: + GH_AW_PROMPT: /tmp/gh-aw/aw-prompts/prompt.txt + run: | + cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" When first adding an item to the project, you MUST write ALL required fields. ```yaml @@ -1151,12 +1175,6 @@ jobs: campaign_id: "discussion-task-mining" content_type: "issue" # or "pull_request" content_number: 123 - PROMPT_EOF - - name: Append prompt (part 2) - env: - GH_AW_PROMPT: /tmp/gh-aw/aw-prompts/prompt.txt - run: | - cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" fields: status: "Done" ``` diff --git a/.github/workflows/docs-quality-maintenance-project67.campaign.lock.yml b/.github/workflows/docs-quality-maintenance-project67.campaign.lock.yml index 61c94967c37..c95e2d8da67 100644 --- a/.github/workflows/docs-quality-maintenance-project67.campaign.lock.yml +++ b/.github/workflows/docs-quality-maintenance-project67.campaign.lock.yml @@ -827,6 +827,16 @@ jobs: 8. Only predefined project fields may be updated. 9. **Project Update Instructions take precedence for all project writes.** + ### Why These Principles Matter + + **Workers are immutable** - Allows reuse across campaigns without coupling. You coordinate existing workflows, don't modify them. 
+ + **Reads and writes are separate** - Prevents race conditions and inconsistent state. Always read all data first, then make all writes. + + **Idempotent operation** - Campaign can be re-run safely if interrupted. The orchestrator picks up where it left off using the cursor. + + **Only predefined fields** - Prevents accidental project board corruption. The orchestrator only updates fields it's configured to manage. + --- ## Required Phases (Execute In Order) @@ -921,19 +931,27 @@ jobs: - Closed (issue/discussion) → `Done` - Merged (PR) → `Done` + **Why use explicit GitHub state?** - GitHub is the source of truth for work status. Inferring status from other signals (labels, comments) would be unreliable and could cause incorrect tracking. + 6) Calculate required date fields for each item (per Project Update Instructions): - `start_date`: format `created_at` as `YYYY-MM-DD` - `end_date`: - if closed/merged → format `closed_at`/`merged_at` as `YYYY-MM-DD` - if open → **today's date** formatted `YYYY-MM-DD` (required for roadmap view) + **Why use today for open items?** - GitHub Projects requires end_date for roadmap views. Using today's date shows the item is actively tracked and updates automatically each run until completion. + 7) Do NOT implement idempotency by comparing against the board. You may compare for reporting only. + **Why no comparison for idempotency?** - The safe-output system handles deduplication. Comparing would add complexity and potential race conditions. Trust the infrastructure. + 8) Apply write budget: - If `MaxProjectUpdatesPerRun > 0`, select at most that many items this run using deterministic order (e.g., oldest `updated_at` first; tie-break by ID/number). - Defer remaining items to next run via cursor. + **Why use deterministic order?** - Ensures predictable behavior and prevents starvation. Oldest items are processed first, ensuring fair treatment of all work items. The cursor saves progress for next run. 
+ ### Phase 3 — Write State (Execution) [WRITES ONLY] 9) For each selected item, send an `update-project` request. @@ -1102,6 +1120,12 @@ jobs: - `size`: default `Medium` unless explicitly known - `start_date`: issue/PR `created_at` formatted `YYYY-MM-DD` - `end_date`: + PROMPT_EOF + - name: Append prompt (part 2) + env: + GH_AW_PROMPT: /tmp/gh-aw/aw-prompts/prompt.txt + run: | + cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" - if closed/merged → `closed_at` / `merged_at` formatted `YYYY-MM-DD` - if open → **today’s date** formatted `YYYY-MM-DD` (**required for roadmap view; do not leave blank**) @@ -1139,12 +1163,6 @@ jobs: size: "Medium" start_date: "2025-12-15" end_date: "2026-01-03" - PROMPT_EOF - - name: Append prompt (part 2) - env: - GH_AW_PROMPT: /tmp/gh-aw/aw-prompts/prompt.txt - run: | - cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" ``` --- diff --git a/.github/workflows/file-size-reduction-project71.campaign.lock.yml b/.github/workflows/file-size-reduction-project71.campaign.lock.yml index 8ed70ac6d37..66c650c0a18 100644 --- a/.github/workflows/file-size-reduction-project71.campaign.lock.yml +++ b/.github/workflows/file-size-reduction-project71.campaign.lock.yml @@ -826,6 +826,16 @@ jobs: 8. Only predefined project fields may be updated. 9. **Project Update Instructions take precedence for all project writes.** + ### Why These Principles Matter + + **Workers are immutable** - Allows reuse across campaigns without coupling. You coordinate existing workflows, don't modify them. + + **Reads and writes are separate** - Prevents race conditions and inconsistent state. Always read all data first, then make all writes. + + **Idempotent operation** - Campaign can be re-run safely if interrupted. The orchestrator picks up where it left off using the cursor. + + **Only predefined fields** - Prevents accidental project board corruption. The orchestrator only updates fields it's configured to manage. 
+ --- ## Required Phases (Execute In Order) @@ -920,19 +930,27 @@ jobs: - Closed (issue/discussion) → `Done` - Merged (PR) → `Done` + **Why use explicit GitHub state?** - GitHub is the source of truth for work status. Inferring status from other signals (labels, comments) would be unreliable and could cause incorrect tracking. + 6) Calculate required date fields for each item (per Project Update Instructions): - `start_date`: format `created_at` as `YYYY-MM-DD` - `end_date`: - if closed/merged → format `closed_at`/`merged_at` as `YYYY-MM-DD` - if open → **today's date** formatted `YYYY-MM-DD` (required for roadmap view) + **Why use today for open items?** - GitHub Projects requires end_date for roadmap views. Using today's date shows the item is actively tracked and updates automatically each run until completion. + 7) Do NOT implement idempotency by comparing against the board. You may compare for reporting only. + **Why no comparison for idempotency?** - The safe-output system handles deduplication. Comparing would add complexity and potential race conditions. Trust the infrastructure. + 8) Apply write budget: - If `MaxProjectUpdatesPerRun > 0`, select at most that many items this run using deterministic order (e.g., oldest `updated_at` first; tie-break by ID/number). - Defer remaining items to next run via cursor. + **Why use deterministic order?** - Ensures predictable behavior and prevents starvation. Oldest items are processed first, ensuring fair treatment of all work items. The cursor saves progress for next run. + ### Phase 3 — Write State (Execution) [WRITES ONLY] 9) For each selected item, send an `update-project` request. @@ -1119,6 +1137,12 @@ jobs: ### Adding New Issues + PROMPT_EOF + - name: Append prompt (part 2) + env: + GH_AW_PROMPT: /tmp/gh-aw/aw-prompts/prompt.txt + run: | + cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" When first adding an item to the project, you MUST write ALL required fields. 
```yaml @@ -1158,12 +1182,6 @@ jobs: campaign_id: "file-size-reduction-project71" content_type: "issue" # or "pull_request" content_number: 123 - PROMPT_EOF - - name: Append prompt (part 2) - env: - GH_AW_PROMPT: /tmp/gh-aw/aw-prompts/prompt.txt - run: | - cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" fields: status: "Done" ``` diff --git a/docs/src/content/docs/guides/campaigns/getting-started.md b/docs/src/content/docs/guides/campaigns/getting-started.md index bfe6e00b7e5..f0f377314a2 100644 --- a/docs/src/content/docs/guides/campaigns/getting-started.md +++ b/docs/src/content/docs/guides/campaigns/getting-started.md @@ -10,6 +10,17 @@ This guide is the shortest path from “we want a campaign” to a working dashb > Using [agentic workflows](/gh-aw/reference/glossary/#agentic-workflow) (AI-powered workflows that can make autonomous decisions) means giving AI [agents](/gh-aw/reference/glossary/#agent) (autonomous AI systems) the ability to make decisions and take actions in your repository. This requires careful attention to security considerations and human supervision. > Review all outputs carefully and use time-limited trials to evaluate effectiveness for your team. +## Campaign Best Practices + +Before creating your first campaign, keep these core principles in mind: + +- **Start small**: One clear goal per campaign (e.g., "Upgrade Node.js to v20") +- **Start passive**: Use passive mode first to observe behavior and build trust +- **Reuse workflows**: Search existing workflows before creating new ones +- **Minimal permissions**: Grant only necessary permissions (issues/draft PRs, not merges) +- **Standardized outputs**: Use consistent patterns for issues, PRs, and comments +- **Escalate when uncertain**: Create issues requesting human review for risky decisions + ## Quick start (5 steps) 1. Create a GitHub Project board (manual, one-time) and copy its URL. @@ -35,6 +46,38 @@ Copy the Project URL (e.g., `https://github.com/orgs/myorg/projects/42`). 
Create `.github/workflows/.campaign.md` with frontmatter like: +**For your first campaign** (passive mode - recommended): + +```yaml +id: framework-upgrade +version: "v1" +name: "Framework Upgrade" + +project-url: "https://github.com/orgs/ORG/projects/1" +tracker-label: "campaign:framework-upgrade" + +objective: "Upgrade all services to Framework vNext with zero downtime." +kpis: + - id: services_upgraded + name: "Services upgraded" + priority: primary + direction: "increase" + baseline: 0 + target: 50 + time-window-days: 30 + +workflows: + - framework-upgrade # Use an existing workflow + +# Governance (conservative defaults for first campaign) +governance: + max-new-items-per-run: 5 + max-project-updates-per-run: 5 + max-comments-per-run: 3 +``` + +**For experienced users** (active mode - advanced): + ```yaml id: framework-upgrade version: "v1" @@ -47,15 +90,31 @@ objective: "Upgrade all services to Framework vNext with zero downtime." kpis: - id: services_upgraded name: "Services upgraded" - primary: true + priority: primary direction: "increase" + baseline: 0 target: 50 + time-window-days: 30 workflows: - - framework-upgrade + - framework-scanner + - framework-upgrader + +# Enable active execution (ADVANCED - only after passive campaign experience) +execute-workflows: true + +# Governance (still start conservative even in active mode) +governance: + max-new-items-per-run: 10 + max-project-updates-per-run: 10 + max-comments-per-run: 5 ``` -You can add governance and repo-memory wiring later; start with a working loop. +**Key differences:** +- **Passive mode**: Discovers and tracks work created by existing workflows (safer, simpler) +- **Active mode**: Can execute workflows and create missing ones (powerful but complex) + +**Start passive** unless you have prior campaign experience. You can enable active execution later. 
## 3) Compile @@ -96,6 +155,50 @@ Items with campaign labels (`campaign:*`) are automatically protected from other This ensures your campaign items remain under the control of the campaign orchestrator and aren't interfered with by other workflows. +## Migrating from Passive to Active Mode + +Once you've successfully run a passive campaign for 1-2 weeks and understand how it works, you can enable active execution: + +**Prerequisites before enabling active mode:** +1. ✅ You've run at least 2-3 passive campaign runs successfully +2. ✅ You understand how the orchestrator coordinates work +3. ✅ You've reviewed the project board and it's tracking items correctly +4. ✅ You have clear governance rules and conservative limits set + +**Migration steps:** + +1. **Update your campaign spec** to add `execute-workflows: true`: + ```yaml + execute-workflows: true # Enable active execution + + governance: + max-new-items-per-run: 10 # Start conservative + max-project-updates-per-run: 10 + max-comments-per-run: 5 + ``` + +2. **Recompile** the campaign: `gh aw compile ` + +3. **Test with a manual run** before scheduling: + - Trigger the workflow manually from GitHub Actions + - Watch the run logs carefully + - Verify it behaves as expected + +4. **Monitor closely** for the first few runs: + - Check that workflows execute correctly + - Review any new workflows it creates + - Ensure governance limits are appropriate + +5. **Adjust governance** based on observed behavior: + - Increase limits if runs are too conservative + - Decrease limits if runs are too aggressive + - Add opt-out labels if needed + +**Rollback if needed:** +- Remove `execute-workflows: true` from spec +- Recompile: `gh aw compile ` +- Campaign reverts to passive mode + ## Optional: repo-memory for durable state Enable repo-memory for campaigns using this layout: `memory/campaigns//cursor.json` and `memory/campaigns//metrics/.json`. Campaign writes must include a cursor and at least one metrics snapshot. 
diff --git a/pkg/campaign/prompts/campaign_creation_instructions.md b/pkg/campaign/prompts/campaign_creation_instructions.md index 939df6647f9..f62e04ae45b 100644 --- a/pkg/campaign/prompts/campaign_creation_instructions.md +++ b/pkg/campaign/prompts/campaign_creation_instructions.md @@ -460,8 +460,212 @@ When creating a PR for the new campaign: --- +## Workflow Creation Guardrails + +### When to Create New Workflows + +**ONLY create new workflows when:** +1. No existing workflow does the required task +2. The campaign objective clearly requires a specific capability that's missing +3. Workflows from the agentics collection don't fit the need +4. You can articulate a clear, focused purpose for the new workflow + +**AVOID creating workflows when:** +- An existing workflow can be used (even if not perfect) +- The task can be handled by manual processes initially +- You're unsure about the exact requirements +- The workflow would duplicate functionality + +### Workflow Creation Safety Checklist + +Before suggesting a new workflow, verify: + +- [ ] **Clear purpose**: Can you describe in one sentence what the workflow does? +- [ ] **Defined trigger**: Is it clear when/how the workflow should run? +- [ ] **Bounded scope**: Does it have clear input/output boundaries? +- [ ] **Testable**: Can the workflow be tested independently? +- [ ] **Reusable**: Could other campaigns use this workflow? + +### Workflow Naming Guidelines + +**Good names** (specific, action-oriented): +- `security-vulnerability-scanner` - Scans for vulnerabilities +- `node-version-checker` - Checks Node.js versions +- `dependency-update-pr-creator` - Creates dependency update PRs + +**Poor names** (vague, too broad): +- `security-workflow` - What does it do? +- `checker` - Check what? +- `helper` - Help with what? 
+ +--- + +## Passive vs Active Campaigns + +### Start Passive (Default) + +**Passive campaigns** (recommended for beginners): +- Discover and track existing work +- Lower risk and complexity +- No workflow creation or execution +- Good for learning campaign patterns + +```yaml +# Passive campaign (default) +workflows: + - existing-workflow-1 + - existing-workflow-2 +# execute-workflows: false # Default, can omit +``` + +### Progress to Active (Advanced) + +**Active campaigns** (for experienced users): +- Execute workflows directly +- Can create missing workflows +- Higher complexity and risk +- Requires careful testing + +**Prerequisites before enabling `execute-workflows: true`:** +1. You've successfully run at least one passive campaign +2. You understand how the orchestrator coordinates work +3. You have clear success criteria and governance rules +4. You're prepared to monitor and adjust during execution + +```yaml +# Active campaign (advanced) +workflows: + - framework-scanner + - framework-upgrader +execute-workflows: true # Explicitly enable + +governance: + max-project-updates-per-run: 10 # Start conservative + max-comments-per-run: 5 +``` + +**Migration path**: Start passive → Monitor for 1-2 weeks → Add governance rules → Enable active execution + +--- + +## Decision Rationale Guidelines + +When making decisions in campaigns, always explain WHY: + +### Example: Workflow Selection + +**Poor**: "Use `security-scanner` workflow" + +**Good**: "Use `security-scanner` workflow because: +- It scans all Go files for vulnerabilities (matches campaign scope) +- It creates issues for findings (supports our reporting needs) +- It's already tested and stable (reduces risk)" + +### Example: Governance Settings + +**Poor**: "Set max-project-updates-per-run to 10" + +**Good**: "Set max-project-updates-per-run to 10 because: +- We have ~50 services to track (conservative pacing) +- First campaign - want to monitor impact closely +- Can increase after observing initial 
runs" + +### Example: KPI Selection + +**Poor**: "Track services upgraded" + +**Good**: "Track 'Services upgraded' as primary KPI because: +- Directly measures campaign objective +- Easily quantifiable (count of completed upgrades) +- Updated automatically from project board status" + +--- + +## Failure Handling and Recovery + +### Default Failure Behaviors + +**Compilation failures** - Campaign generator should: +1. Report the error clearly with context +2. Suggest specific fixes for common issues +3. Provide a link to documentation +4. **NOT** delete partially created files (for debugging) + +**Runtime failures** - Orchestrator should: +1. Continue with other work items (don't stop entire campaign) +2. Report failures in status update with context +3. Suggest recovery actions when possible +4. Maintain cursor/state for next run + +### First-Time User Support + +**For users creating their first campaign**: + +1. **Validate requirements upfront**: + - GitHub Project board exists and is accessible + - At least one workflow exists or will be created + - Governance settings are appropriate for first campaign + +2. **Provide conservative defaults**: + ```yaml + governance: + max-new-items-per-run: 5 # Start small + max-project-updates-per-run: 5 # Monitor impact + max-comments-per-run: 3 # Avoid noise + ``` + +3. **Include onboarding guidance in campaign body**: + ```markdown + ## First Campaign? Read This! + + This is your first campaign - here's what to expect: + + 1. **First run** - The orchestrator will initialize the project board and add the Epic issue + 2. **Monitor** - Check the project board after the first run to verify items appear correctly + 3. **Adjust** - Based on first run, you may want to adjust governance settings + 4. **Learn** - Each run provides status updates explaining what happened and why + ``` + +--- + ## Best Practices +### Core Campaign Principles + +Follow these fundamental principles when creating campaigns: + +1. 
**Start with one small, clear goal per campaign** + - Focus on a single, well-defined objective + - Avoid scope creep - multiple goals should be separate campaigns + - Example: "Upgrade Node.js to v20" not "Upgrade Node.js and refactor auth" + +2. **Use passive mode first to observe and build trust** + - Always start with passive mode (`execute-workflows: false` or omitted) + - Monitor 1-2 weeks to understand orchestration behavior + - Build confidence before enabling active execution + +3. **Reuse existing workflows before creating new ones** + - Thoroughly search `.github/workflows/*.md` for existing solutions + - Check the [agentics collection](https://github.com/githubnext/agentics) for reusable workflows + - Only create new workflows when existing ones don't meet requirements + +4. **Keep permissions minimal (issues / draft PRs, no merges)** + - Grant only the permissions needed for the campaign's scope + - Prefer read permissions over write when possible + - Use draft PRs instead of direct merges for code changes + - Example: `issues: read, pull-requests: write` for issue tracking with PR creation + +5. **Make outputs standardized and predictable** + - Use consistent safe-output configurations across workflows + - Document expected outputs in workflow descriptions + - Follow established patterns for issue/PR formatting + +6. 
**Escalate to humans when unsure** + - Don't make risky decisions autonomously + - Create issues or comments requesting human review + - Include context and reasoning in escalation messages + - Example: "This change affects authentication - requesting human review" + ### DO: - ✅ Generate unique campaign IDs in kebab-case - ✅ Scan existing workflows before suggesting new ones @@ -470,6 +674,10 @@ When creating a PR for the new campaign: - ✅ Include clear ownership and governance - ✅ Check for file conflicts before creating - ✅ Compile and validate before creating PR +- ✅ Start with passive mode for first campaign +- ✅ Provide clear rationale for all decisions +- ✅ Use conservative defaults for beginners +- ✅ Test new workflows before campaign use ### DON'T: - ❌ Create campaigns with duplicate IDs @@ -478,6 +686,10 @@ When creating a PR for the new campaign: - ❌ Skip risk assessment and governance - ❌ Create campaigns without project board URL (when required) - ❌ Skip compilation validation +- ❌ Enable execute-workflows for first campaign +- ❌ Create workflows without clear purpose +- ❌ Use high governance limits for beginners +- ❌ Make decisions without explaining rationale --- diff --git a/pkg/campaign/prompts/orchestrator_instructions.md b/pkg/campaign/prompts/orchestrator_instructions.md index 789d4393204..cc3121337d8 100644 --- a/pkg/campaign/prompts/orchestrator_instructions.md +++ b/pkg/campaign/prompts/orchestrator_instructions.md @@ -73,6 +73,16 @@ and synchronizing campaign state into a GitHub Project board. 8. Only predefined project fields may be updated. 9. **Project Update Instructions take precedence for all project writes.** +### Why These Principles Matter + +**Workers are immutable** - Allows reuse across campaigns without coupling. You coordinate existing workflows, don't modify them. + +**Reads and writes are separate** - Prevents race conditions and inconsistent state. Always read all data first, then make all writes. 
+ +**Idempotent operation** - Campaign can be re-run safely if interrupted. The orchestrator picks up where it left off using the cursor. + +**Only predefined fields** - Prevents accidental project board corruption. The orchestrator only updates fields it's configured to manage. + --- ## Required Phases (Execute In Order) @@ -167,19 +177,27 @@ and synchronizing campaign state into a GitHub Project board. - Closed (issue/discussion) → `Done` - Merged (PR) → `Done` +**Why use explicit GitHub state?** - GitHub is the source of truth for work status. Inferring status from other signals (labels, comments) would be unreliable and could cause incorrect tracking. + 6) Calculate required date fields for each item (per Project Update Instructions): - `start_date`: format `created_at` as `YYYY-MM-DD` - `end_date`: - if closed/merged → format `closed_at`/`merged_at` as `YYYY-MM-DD` - if open → **today's date** formatted `YYYY-MM-DD` (required for roadmap view) +**Why use today for open items?** - GitHub Projects requires end_date for roadmap views. Using today's date shows the item is actively tracked and updates automatically each run until completion. + 7) Do NOT implement idempotency by comparing against the board. You may compare for reporting only. +**Why no comparison for idempotency?** - The safe-output system handles deduplication. Comparing would add complexity and potential race conditions. Trust the infrastructure. + 8) Apply write budget: - If `MaxProjectUpdatesPerRun > 0`, select at most that many items this run using deterministic order (e.g., oldest `updated_at` first; tie-break by ID/number). - Defer remaining items to next run via cursor. +**Why use deterministic order?** - Ensures predictable behavior and prevents starvation. Oldest items are processed first, ensuring fair treatment of all work items. The cursor saves progress for next run. + ### Phase 3 — Write State (Execution) [WRITES ONLY] 9) For each selected item, send an `update-project` request. 
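The planning rules added in this file (date fields in step 6, the deterministic write budget in step 8) can be sketched roughly as follows. This is a minimal illustration, not the orchestrator's implementation: the `Item` dataclass, its field names, and the functions `date_fields` and `apply_write_budget` are hypothetical, and treating a non-positive budget as "no limit" is an assumption read from the `MaxProjectUpdatesPerRun > 0` condition.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class Item:
    """Hypothetical work item; field names mirror the prompt, not a real API."""
    number: int
    created_at: datetime
    updated_at: datetime
    closed_at: Optional[datetime] = None  # closed_at or merged_at, if any


def date_fields(item: Item, today: date) -> dict:
    """Step 6: start_date from created_at; end_date from close time, else today."""
    end = item.closed_at.date() if item.closed_at else today
    return {
        "start_date": item.created_at.date().isoformat(),  # YYYY-MM-DD
        "end_date": end.isoformat(),
    }


def apply_write_budget(
    items: list[Item], max_updates: int
) -> tuple[list[Item], list[Item]]:
    """Step 8: oldest updated_at first, tie-break by number; defer the rest."""
    ordered = sorted(items, key=lambda i: (i.updated_at, i.number))
    if max_updates <= 0:  # assumption: non-positive budget means no limit
        return ordered, []
    return ordered[:max_updates], ordered[max_updates:]
```

Sorting on the `(updated_at, number)` tuple gives the same selection on every run for the same input, which is what lets deferred items be picked up on a later run without comparing against the board.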
diff --git a/pkg/campaign/prompts/workflow_execution.md b/pkg/campaign/prompts/workflow_execution.md index 71d35755c53..0954f53a893 100644 --- a/pkg/campaign/prompts/workflow_execution.md +++ b/pkg/campaign/prompts/workflow_execution.md @@ -2,6 +2,8 @@ This campaign is configured to **actively execute workflows**. Your role is to run the workflows listed in sequence, collect their outputs, and use those outputs to drive the campaign forward. +**IMPORTANT: Active execution is an advanced feature. Exercise caution and follow all guidelines carefully.** + --- ## Workflows to Execute @@ -15,55 +17,161 @@ The following workflows should be executed in order: --- +## Workflow Creation Guardrails + +### Before Creating Any Workflow, Ask: + +1. **Does this workflow already exist?** - Check `.github/workflows/` thoroughly +2. **Can an existing workflow be used?** - Even if not perfect, existing is safer +3. **Is the requirement clear?** - Can you articulate exactly what it should do? +4. **Is it testable?** - Can you verify it works before using it in the campaign? +5. **Is it reusable?** - Could other campaigns benefit from this workflow? + +### Only Create New Workflows When: + +✅ **All these conditions are met:** +- No existing workflow does the required task +- The campaign objective explicitly requires this capability +- You have a clear, specific design for the workflow +- The workflow has a focused, single-purpose scope +- You can test it independently before campaign use + +❌ **Never create workflows when:** +- You're unsure about requirements +- An existing workflow "mostly" works +- The workflow would be complex or multi-purpose +- You haven't verified it doesn't already exist +- You can't clearly explain what it does in one sentence + +--- + ## Execution Process For each workflow: 1. **Check if workflow exists** - Look for `.github/workflows/.md` or `.github/workflows/.lock.yml` -2. 
**Create workflow if needed** - If the workflow doesn't exist: - - Use your understanding of the campaign objective to design an appropriate workflow - - Create the workflow file at `.github/workflows/.md` with: - - Appropriate trigger: `workflow_dispatch` (required for execution) - - Required tools and permissions - - Safe outputs for any GitHub operations (issues, PRs, comments) - - Clear prompt describing what the workflow should do +2. **Create workflow if needed** - Only if ALL guardrails above are satisfied: + + **Design requirements:** + - **Single purpose**: One clear task (e.g., "scan for outdated dependencies", not "scan and update") + - **Explicit trigger**: Must include `workflow_dispatch` for manual/programmatic execution + - **Minimal tools**: Only include tools actually needed (principle of least privilege) + - **Safe outputs only**: Use appropriate safe-output limits (max: 5 for first version) + - **Clear prompt**: Describe exactly what the workflow should do and return + + **Create the workflow file at `.github/workflows/.md`:** + ```yaml + --- + name: + description: + + on: + workflow_dispatch: # Required for execution + + tools: + github: + toolsets: [default] # Adjust based on needs + # Only add other tools if absolutely necessary + + safe-outputs: + create-issue: + max: 3 # Start conservative + add-comment: + max: 2 + --- + + # + + You are a focused workflow that . + + ## Task + + + + ## Output + + + ``` + - Compile it with `gh aw compile .md` - - **Test the newly created workflow** before using it: - - Trigger a test run: `mcp__github__run_workflow(workflow_id: "", ref: "main")` - - Wait for completion and verify it succeeds - - Review outputs to ensure they match expectations - - If the test fails, revise the workflow and test again - - Only proceed with campaign execution after successful test - -3. **Execute the workflow** - Use GitHub MCP tools (skip if just tested): + - **CRITICAL: Test before use** (see testing requirements below) + +3. 
**Test newly created workflows** (MANDATORY): + + **Why test?** - Untested workflows may fail during campaign execution, blocking progress. Test first to catch issues early. + + **Testing steps:** + - Trigger test run: `mcp__github__run_workflow(workflow_id: "", ref: "main")` + - Wait for completion: Poll until status is "completed" + - **Verify success**: Check that workflow succeeded and produced expected outputs + - **Review outputs**: Ensure results match expectations (check artifacts, issues created, etc.) + - **If test fails**: Revise the workflow, recompile, and test again + - **Only proceed** after successful test run + + **Test failure actions:** + - DO NOT use the workflow in the campaign if testing fails + - Analyze the failure logs to understand what went wrong + - Make necessary corrections to the workflow + - Recompile and retest + - If you can't fix it after 2 attempts, report in status update and skip this workflow + +4. **Execute the workflow** (skip if just tested successfully): - Trigger: `mcp__github__run_workflow(workflow_id: "", ref: "main")` - Wait for completion: Poll `mcp__github__get_workflow_run(run_id)` until status is "completed" - Collect outputs: Check `mcp__github__download_workflow_run_artifact()` for any artifacts + - **Handle failures gracefully**: If execution fails, note it in status update but continue campaign -4. **Use outputs for next steps** - Use information from workflow runs to: - - Inform subsequent workflow executions - - Update project board items - - Make decisions about campaign progress +5. 
**Use outputs for next steps** - Use information from workflow runs to: + - Inform subsequent workflow executions (e.g., scanner results → upgrader inputs) + - Update project board items with relevant information + - Make decisions about campaign progress and next actions --- ## Guidelines +**Execution order:** - Execute workflows **sequentially** (one at a time) - Wait for each workflow to complete before starting the next -- **Test newly created workflows** before using them in the campaign +- **Why sequential?** - Ensures dependencies between workflows are respected and reduces API load + +**Workflow creation:** +- **Always test newly created workflows** before using them in the campaign +- **Why test first?** - Prevents campaign disruption from broken workflows +- Start with minimal, focused workflows (easier to test and debug) +- **Why minimal?** - Reduces complexity and points of failure +- Keep designs simple and aligned with campaign objective +- **Why simple?** - Easier to understand, test, and maintain + +**Failure handling:** - If a workflow test fails, revise and retest before proceeding -- If a workflow fails during campaign execution, note the failure and continue with campaign coordination -- Keep workflow designs simple and focused on the campaign objective +- **Why retry?** - Initial failures often due to minor issues easily fixed +- If a workflow fails during campaign execution, note the failure and continue +- **Why continue?** - One workflow failure shouldn't block entire campaign progress +- Report all failures in the status update with context +- **Why report?** - Transparency helps humans intervene if needed + +**Workflow reusability:** - Workflows you create should be reusable for future campaign runs +- **Why reusable?** - Reduces need to create workflows repeatedly, builds library of capabilities +- Avoid campaign-specific logic in workflows (keep them generic) +- **Why generic?** - Enables reuse across different campaigns + +**Permissions 
and safety:** +- Keep workflow permissions minimal (only what's needed) +- **Why minimal?** - Reduces risk and follows principle of least privilege +- Prefer draft PRs over direct merges for code changes +- **Why drafts?** - Requires human review before merging changes +- Escalate to humans when uncertain about decisions +- **Why escalate?** - Human oversight prevents risky autonomous actions --- ## After Workflow Execution Once all workflows have been executed (or created and executed), proceed with the normal orchestrator phases: -- Phase 1: Discovery -- Phase 2: Planning -- Phase 3: Project Updates -- Phase 4: Status Reporting +- Phase 1: Discovery (read state from manifest and project board) +- Phase 2: Planning (determine what needs updating) +- Phase 3: Project Updates (write state to project board) +- Phase 4: Status Reporting (report progress, failures, and next steps)
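The "test before use" rule for newly created workflows (test, revise on failure, give up after 2 fix attempts and skip the workflow) can be sketched as a small retry loop. This is only an illustration under stated assumptions: `verify_new_workflow`, `run_test`, and `revise_and_recompile` are hypothetical names standing in for the trigger-and-poll and fix-and-recompile steps described in the prompt, and reading "2 attempts" as two fix attempts after the initial failed test is an interpretation.

```python
from typing import Callable


def verify_new_workflow(
    run_test: Callable[[], bool],
    revise_and_recompile: Callable[[], None],
    max_fix_attempts: int = 2,
) -> bool:
    """Return True once a test run succeeds; False if every attempt fails.

    run_test: hypothetical callable that triggers the workflow, waits for
        completion, and returns whether the run succeeded.
    revise_and_recompile: hypothetical callable that corrects the workflow
        and recompiles it.
    """
    if run_test():
        return True
    for _ in range(max_fix_attempts):
        revise_and_recompile()
        if run_test():
            return True
    # Caller should report the failure in the status update and skip
    # this workflow rather than using it in the campaign.
    return False
```

Keeping the loop bounded matches the prompt's intent that one broken workflow should not block the rest of the campaign.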