bug: call-workflow compiler propagates worker permissions to caller instead of checking them #40169

@dsyme

Summary

test-copilot-call-workflow has failed with startup_failure on every run since it was introduced. The immediate trigger is vulnerability-alerts: read appearing in the permissions: block of the generated call-test-copilot-call-worker job, which GitHub Actions rejects. The underlying bug is a design issue in buildCallWorkflowJobs: the compiler propagates the worker's permissions up into the caller's lockfile, when it should instead check that the caller's declared permissions are sufficient.

Current (wrong) behaviour

buildCallWorkflowJobs in compiler_safe_output_jobs.go calls extractCallWorkflowPermissions() to read all job-level permissions from the worker's .lock.yml and unions them into a single set. That set is then written as the permissions: block on the call-<worker> job in the caller's lockfile (comment at line 171: "includes a job-level permissions block that is the union of all the worker's job-level permissions, so GitHub allows the nested jobs to run").
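To make the union step concrete, here is a minimal sketch of the behaviour described above. It is not the actual compiler code: the Permissions type, the rank table and unionPermissions are hypothetical names used only for illustration, and the real extractCallWorkflowPermissions may differ in detail.

```go
package main

import (
	"fmt"
	"sort"
)

// Permissions maps a scope name (e.g. "contents") to a level
// ("none", "read" or "write"). Hypothetical type, for illustration only.
type Permissions map[string]string

// rank orders levels so the broader of two can be selected.
var rank = map[string]int{"none": 0, "read": 1, "write": 2}

// unionPermissions returns, for each scope, the broadest level
// requested by any of the worker's jobs.
func unionPermissions(jobs []Permissions) Permissions {
	out := Permissions{}
	for _, job := range jobs {
		for scope, level := range job {
			if cur, ok := out[scope]; !ok || rank[level] > rank[cur] {
				out[scope] = level
			}
		}
	}
	return out
}

func main() {
	// Two job-level permission blocks, as they might appear in a worker lockfile.
	workerJobs := []Permissions{
		{"contents": "read"},
		{"contents": "write", "issues": "read"},
	}
	union := unionPermissions(workerJobs)

	// Print in a stable order; this is the set that currently ends up
	// on the caller's call-<worker> job.
	scopes := make([]string, 0, len(union))
	for scope := range union {
		scopes = append(scopes, scope)
	}
	sort.Strings(scopes)
	for _, scope := range scopes {
		fmt.Printf("%s: %s\n", scope, union[scope])
	}
}
```

The point of the sketch is that nothing in this computation consults the caller: whatever the worker's jobs ask for is what the caller's lockfile receives.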

This is the wrong approach: the caller's compiled permissions block should be derived from the caller's own frontmatter/declared permissions, not reverse-engineered from the worker's implementation. The worker's permissions are the worker's business; the caller should declare its own scope and be told if that's insufficient.

How the vulnerability-alerts symptom appears

The worker's agent job has permissions: read-all. When extractJobPermissionsFromParsedWorkflow merges that shorthand with the explicit-scope maps from other jobs, Merge() in permissions_operations.go expands read-all by iterating over GetAllPermissionScopes(). That function includes PermissionVulnerabilityAlerts = "vulnerability-alerts" (declared as a scope GitHub is rolling out). The expansion materialises vulnerability-alerts: read as an explicit key, which is then written into the caller's lockfile. GitHub Actions rejects the workflow at parse time, and the run ends in startup_failure.
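The expansion path can be sketched roughly as follows. This is an illustration under assumptions, not the code in permissions_operations.go: allScopes is a small stand-in for whatever GetAllPermissionScopes() returns, and expandShorthand and mergeBroadest are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sort"
)

// allScopes stands in for the result of GetAllPermissionScopes().
// Illustrative subset only; the real list is longer.
var allScopes = []string{"contents", "issues", "pull-requests", "vulnerability-alerts"}

var rank = map[string]int{"none": 0, "read": 1, "write": 2}

// expandShorthand turns "read-all" or "write-all" into one explicit
// key per known scope. Any other input yields an empty map.
func expandShorthand(shorthand string) map[string]string {
	level := map[string]string{"read-all": "read", "write-all": "write"}[shorthand]
	out := map[string]string{}
	if level == "" {
		return out
	}
	for _, scope := range allScopes {
		out[scope] = level
	}
	return out
}

// mergeBroadest combines two explicit maps, keeping the broader level per scope.
func mergeBroadest(a, b map[string]string) map[string]string {
	out := map[string]string{}
	for scope, level := range a {
		out[scope] = level
	}
	for scope, level := range b {
		if cur, ok := out[scope]; !ok || rank[level] > rank[cur] {
			out[scope] = level
		}
	}
	return out
}

func main() {
	agentJob := expandShorthand("read-all")             // the worker's agent job
	otherJob := map[string]string{"issues": "write"}    // an explicit-scope job
	merged := mergeBroadest(agentJob, otherJob)

	scopes := make([]string, 0, len(merged))
	for scope := range merged {
		scopes = append(scopes, scope)
	}
	sort.Strings(scopes)
	for _, scope := range scopes {
		fmt.Printf("%s: %s\n", scope, merged[scope])
	}
}
```

Once the shorthand has been expanded, vulnerability-alerts is an ordinary key in the merged map, so a serialiser that writes every key emits it into the permissions: block.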

The actionlint warning for this scope is deliberately suppressed in lint_command.go with the comment "GitHub is rolling out an additional permissions scope before actionlint support". Real runs, however, confirm that GitHub does not accept it for GITHUB_TOKEN, so the suppression masks a genuine error.

Proposed fix

Change the call-<worker> job generation to use the caller's declared permissions (from its frontmatter or the permissions the compiler would otherwise assign) rather than deriving them from the worker. The extractCallWorkflowPermissions logic should become a validation/warning step: after determining what the caller's call-<worker> job will have, check that it covers what the worker requires, and emit a warning or error if not. The workflow being compiled should not have its permissions modified as a side-effect of reading the worker's lockfile.
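One possible shape for the validation step is sketched below. It assumes permissions are available as simple scope-to-level maps; missingPermissions and levelRank are hypothetical names, and whether the result becomes a warning or a hard error is left open, as in the proposal above.

```go
package main

import (
	"fmt"
	"sort"
)

var levelRank = map[string]int{"none": 0, "read": 1, "write": 2}

// missingPermissions reports every scope where the caller's declared
// level is lower than what the worker requires. It never modifies
// either map; the caller's permissions are only inspected.
func missingPermissions(caller, worker map[string]string) []string {
	var missing []string
	for scope, need := range worker {
		have := caller[scope] // "" when the caller does not declare the scope
		if levelRank[have] < levelRank[need] {
			if have == "" {
				have = "none"
			}
			missing = append(missing,
				fmt.Sprintf("%s: caller has %s, worker needs %s", scope, have, need))
		}
	}
	sort.Strings(missing) // stable order for diagnostics
	return missing
}

func main() {
	caller := map[string]string{"contents": "read"}
	worker := map[string]string{"contents": "read", "issues": "write"}

	for _, msg := range missingPermissions(caller, worker) {
		fmt.Println("warning: insufficient permission,", msg)
	}
}
```

With a check of this kind, the call-<worker> job keeps exactly the permissions the caller declared, and the worker's lockfile only influences diagnostics.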

This would fix the immediate vulnerability-alerts startup failure and would also make the design more correct: callers control their own permission surface; the compiler helps them validate it is sufficient rather than silently inflating it.
