docs(security): PR-N — pyspy capability surface + SecurityContext guide #200
Merged
Adds docs/migration/v0.2-to-v0.3.md covering the v0.3.0 security-posture migration per RFC-0013 §migration. Cooperative pyspy (zero capabilities, in-process faulthandler) is deleted at v0.3.0; operators who want Python profiling deploy parca-agent, which requires CAP_SYS_ADMIN (or root) + hostPID + BTF-enabled kernel.

The guide names the exact capability surface, kernel requirement, kernel syscall + errno failure shapes (not paraphrased agent log strings), minimum-grant SecurityContext snippet, and rollback path. Conservative on CAP_BPF/CAP_PERFMON: upstream parca-agent does not document the narrower split today.

Updates docs/migration/v0.1-to-v0.2.md pyspy row to forward-reference the new guide (it was claiming "no upstream replacement exists today", but RFC-0013 names parca-agent at v0.3.0). Updates docs/README.md to index the migration/ subdirectory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tri Lam <tri@maydow.com>
4 tasks
trilamsr added a commit that referenced this pull request on May 31, 2026
## Summary

Reconcile the four pivot-tracking docs (`docs/rfcs/0013-distro-first-pivot.md`, `CHANGELOG.md`, `MILESTONES.md`, `docs/migration/v0.1-to-v0.2.md`) with the wave-3 (PR-B1-shape sibling ports) and wave-4 (PR-B2-shape upstream-only ports + PR-F.1 + PR-J + PR-L + PR-N) landings. Pure doc sweep: no code or config touched.

## What changed

### `docs/rfcs/0013-distro-first-pivot.md`

§migration PR sequence rows updated with PR-number citations and landed markers:

- **PR-A2** (landed, #189, 2026-05-30)
- **PR-B2** (landed, #201): also enumerates sibling-receiver follow-ups under PR-B2 to dispel the slug collision with #188's PR-B2-labelled dcgm port: stdoutexporter (#202), pyspy (#203), kernelevents (#208), containerstdout (#209)
- **PR-F.1** (landed): fleshed-out delete list (`internal/{selftelemetry,telemetry}` + `components/receivers/dcgm/` + `pkg/dcgm/` + one orphan clockreceiver integration test)
- **PR-F.2** re-scoped: now deletes the whole `internal/{componentstatus,pipeline,pipelinebuilder,consumer,fanout,runtime/lifecycle}` bundle in one cut once the last three pipeline+consumer-importing receivers land (#204 k8sevents, #205 clockreceiver, #207 otlphttp). Per the import-graph state, `internal/componentstatus`'s only non-test consumer is `internal/pipeline`, so they delete together
- **PR-G** (landed, #182), **PR-H** (landed, #183)
- **PR-I.1a** (in flight, scaffold agent), **PR-I.1b** (pre-staged; gate satisfied by #201)
- **PR-J** (landed, #195): kept existing marker
- **PR-K.1** (in flight, separate agent landing)
- **PR-L** (landed, skeleton #179 + body #191): flagged as living document
- **PR-N** (landed, #200): shipped at v0.1.0 ahead of v0.3.0 as a doc-only update at `docs/migration/v0.2-to-v0.3.md`

### `CHANGELOG.md` [Unreleased]

- Restructured the pivot wave list as **four waves** (was three). Wave 3 enumerates PR-B1-shape sibling ports + support infra (#180-#194/#196). Wave 4 enumerates PR-B2-shape upstream-only ports + PR-J (#195) + PR-F.1 (#206) + PR-N (#200) + lint/TOCTOU hardening (#198/#210).
- Tightened the PR-F.2 deferred note to point at the three open ports (#204/#205/#207) as the gate.

### `MILESTONES.md`

- **M1** (pipeline runtime): status row now cites PR-A2 (#189), PR-F.1 (#206), PR-F.2 gate (#204/#205/#207), PR-E (#180), retains `internal/config/` (still load-bearing for `tracecore validate`).
- **M2** (self-telemetry): status row now cites PR-F.1 (#206); flags `internal/componentstatus` as travelling with `internal/pipeline` in PR-F.2.
- **M8** (DCGM receiver): status flipped to *landed-and-replaced*: cites PR-F.1 (#206) deletion + PR-J (#195) `docs/integrations/prometheus-scrape.md` recipe. Notes the inert chart toggle retention until PR-K.3.

### `docs/migration/v0.1-to-v0.2.md`

- §`internal/*` package deletion (PR-F) status flips from "not yet open" to "PR-F.1 landed (#206), PR-F.2 gated on three open ports".
- Open-items checklist expanded from 5 to 13 entries: tracks every PR letter the migration guide cares about (A2 / E / F.1 / F.2 / I.1a-c / J / K.1-3 / L / N) with PR numbers and links.

## Why now

Tracking docs accumulated drift across wave-3 + wave-4 because every sibling-port PR (and the support-infra PRs around them) updated the bottom of `CHANGELOG.md` but did not always touch the upstream sequencing section in RFC-0013. Per memory rule `[Keeping this document current]`: status drift is a review blocker. This PR is the consolidated catch-up; future port PRs include their RFC-row flip in-PR.

## What this PR does NOT change

- No code, no config, no YAML, no chart; only the four tracking docs.
- No new doc gates added; existing gates pass.
- No PRs other than the four named docs are modified.

## Test plan

- [x] `bash scripts/doc-check.sh` clean (33 test refs, 528 links resolve, comment-noise diff gate clean vs `origin/main`, all 13 gates green).
- [x] Pre-commit hook (`commitlint` 72-char subject limit + DCO + AI-trailer gates) passed.
- [x] Pre-push hook (`make ci-fast` equivalent: `golangci-lint`, `go vet`, `go mod verify`, `no-autoupdate-check`, `doc-check.sh`) passed on second attempt after `git fetch origin main` populated the worktree's `origin/main` ref. The first push failed because the worktree previously tracked the (gone) `pr-a2-ocb-main-swap` branch, so `doc-check.sh`'s comment-noise diff-scope gate exited 128 on the missing ref. Root cause fixed by the fetch; not a workaround.
- [ ] CI green on this branch.

```release-notes
NONE
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
This was referenced Jun 1, 2026
trilamsr added a commit that referenced this pull request on Jun 1, 2026
## Summary

PR-N (`docs/migration/v0.2-to-v0.3.md`) landed at #200 assuming PR-M (delete pyspy + ship `parca-agent` recipe) would cut at v0.3.0, which made the CAP_SYS_PTRACE → CAP_SYS_ADMIN/CAP_BPF migration a v0.3.0 break. #222 subsequently deferred PR-M to v0.4.0+ (triggers: OTel Profiles → Beta + feature-gate removed AND parca-agent gains OTLP export). The migration doc now contradicts reality: `components/receivers/pyspy/` still ships in v0.3.0's OCB binary, `tracecore-pyspy` stays on PyPI, and the chart's `receivers.pyspy.*` key is still honoured.

This PR reframes the timeline throughout: pyspy STAYS in v0.3.0 with unchanged zero-capability posture, the CAP migration is preserved as forward-looking operator preparation material for v0.4.0+, and #222 is named as the source of truth for re-evaluation triggers. The CAP migration content itself (the `parca-agent` requirements table, minimum-grant SecurityContext, failure-mode table, removal checklist) is retained because operators planning v0.4.0+ upgrades still need it; only the timeline framing changes.

## Test plan

- [x] `make check`: green (fmt, tidy-check, lint, vet, mod-verify).
- [x] `make doc-check`: green (501 markdown links resolve, banned-phrase lint clean across 106 files, comment-noise diff gate clean vs origin/main, no rot-prone reference drift).
- [ ] Reviewer confirms the reframed timeline matches #222's deferral memo (PR-M unblocks on OTel Profiles → Beta + parca-agent OTLP export, neither met at v0.3.0 cut).
- [ ] Reviewer confirms the CAP migration content (parca-agent requirements, minimum-grant SecurityContext, failure-mode table) is preserved verbatim; only framing prose changed.

## Noticed but out-of-scope

- `components/receivers/pyspy/README.md` still carries a 2026-05-22 banner saying the receiver is "Scheduled for deletion at v0.3.0 per RFC-0013 §7". Same staleness pattern as the migration doc, but kept out of this PR's scope per the reconcile-only contract. Worth a follow-up sweep that re-times the README + RUNBOOK banners against #222.
- RFC-0013 §Migration / rollout still carries the original PR-M / PR-N sequencing pre-deferral. References section now flags this with a "supersede with #222" note, but the RFC itself is untouched.

## Release notes

```release-notes
- docs(migration): reframe docs/migration/v0.2-to-v0.3.md to reflect PR-M deferral to v0.4.0+ (per #222): pyspy receiver and tracecore-pyspy PyPI helper continue to ship in v0.3.0 with unchanged zero-capability posture; the CAP_SYS_PTRACE → CAP_SYS_ADMIN/CAP_BPF migration content is retained as forward-looking operator preparation for the v0.4.0+ cutover.
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
6 tasks
## Summary

Adds `docs/migration/v0.2-to-v0.3.md` covering the v0.3.0 security-posture migration per RFC-0013 §migration. The single operator-visible break at v0.3.0 is the Python-profiling story: the cooperative `pyspy` receiver (zero capabilities added; in-process `faulthandler.dump_traceback` over UDS per RFC-0009) is deleted in PR-M, and the replacement `parca-agent` (eBPF) requires `CAP_SYS_ADMIN` (or root) + `hostPID: true` + a BTF-enabled kernel ≥5.3.

The guide names:

- The exact capability surface (`root` or `CAP_SYS_ADMIN` per parca-dev/parca-agent) and why `CAP_BPF`/`CAP_PERFMON` is not yet a documented narrower alternative (conservative grant remains `CAP_SYS_ADMIN`).
- Kernel syscall + errno failure shapes (`bpf(BPF_PROG_LOAD, …)` → `EPERM`, `perf_event_open(…)` → `EACCES`, `open("/sys/kernel/btf/vmlinux", …)` → `ENOENT`, etc.): the stable surface across parca-agent versions, not paraphrased agent log strings.
- A minimum-grant `SecurityContext` snippet (DaemonSet-shaped, `add: [SYS_ADMIN]` not `privileged: true`) with explicit disclaimers about `readOnlyRootFilesystem` (deferred to upstream manifest verification) and PSS interactions.
- The rollback path (`tracecore-pyspy` PyPI helper remains installable one minor past v0.3.0).

## Why this PR exists despite cooperative pyspy needing zero capabilities

tracecore's `pyspy` receiver does not use `CAP_SYS_PTRACE`. Per RFC-0009 §Safety properties and the chart's `conftest` policy (`add: []` asserted in `chart.yml`), the cooperative design walks Python frames in-process via `faulthandler.dump_traceback` and ships them over UDS: no `ptrace`, no `process_vm_readv`. The security-posture change at v0.3.0 is the delta from the cooperative path (zero capabilities, tracecore pod) to the eBPF path (`CAP_SYS_ADMIN`, separate `parca-agent` pod). PR-N documents that delta.

## Why now (not deferred with PR-M)

parca-agent research confirms the OTel Profiles signal is still Alpha (Mar 2026 release) and parca-agent has no OTLP profiles exporter yet. That means operators upgrading to v0.3.0 will run parca-agent alongside tracecore for at least one more minor, so they need the security-posture delta documented before PR-M cuts the receiver. PR-N landing ahead of PR-M gives operators an evaluation window.

## Drift fix

`docs/migration/v0.1-to-v0.2.md`'s `pyspy` row claimed "Deferred until OTel Profiles GA. No upstream replacement exists today; the toggle survives until contrib ships `pprofreceiver`." This contradicted RFC-0013, which has named `parca-agent` (separate DaemonSet) as the replacement since the pivot landed. Updated the row to forward-reference the new guide and removed the stale "no upstream replacement" claim. Root cause: the v0.1-to-v0.2.md skeleton (PR #179, then fleshed in PR #191) predated the RFC-0013 §2 adoption-matrix line that explicitly maps `components/receivers/pyspy/` → `parca-agent` at v0.3.0. Fixed at the row, not paved over.

## Test plan

- `make doc-check`: 510 markdown links resolve to on-disk files; banned-phrase lint clean across 109 markdown files; new file's RFC + pyspy README/RUNBOOK cross-links verified.
- `make check`: `golangci-lint run ./...` 0 issues; `go vet ./...` clean; `go mod verify` clean.
- … `apps/v1.DaemonSet` (apiVersion + spec.template structure).
- … (`bpf(2)`, `perf_event_open(2)`).