TraceCoreAI · trilamsr · Jun 1, 2026 · Jun 1, 2026
diff --git a/docs/migration/v0.2-to-v0.3.md b/docs/migration/v0.2-to-v0.3.md
@@ -1,22 +1,28 @@
 # Migration: v0.2.x → v0.3.0
 
-This guide tells operators how to move from a `v0.2.x` deployment to `v0.3.0`. The single operator-visible break this release is the Python-profiling story: tracecore's in-tree `pyspy` receiver and its `tracecore-pyspy` PyPI helper are deleted, and the upstream-recipe replacement (`parca-agent`) changes the SecurityContext budget. Everything else is unchanged from `v0.2.x` — see [`v0.1-to-v0.2.md`](v0.1-to-v0.2.md) for the prior cut's surface.
+This guide tells operators how to move from a `v0.2.x` deployment to `v0.3.0`. **The Python-profiling story is unchanged in v0.3.0**: the cooperative `pyspy` receiver and its `tracecore-pyspy` PyPI helper ship as in `v0.2.x`, with the same zero-capability posture. PR-M (delete pyspy + ship `parca-agent` recipe) has been **deferred to v0.4.0+** per [#222](https://github.com/TraceCoreAI/tracecore/issues/222). The rest of the operator-visible surface is likewise unchanged from `v0.2.x`; see [`v0.1-to-v0.2.md`](v0.1-to-v0.2.md) for the prior cut's surface.
+
+This guide remains in the v0.2→v0.3 lane because the security-posture work (PR-N) landed at v0.3.0 as **operator preparation material** for the eventual v0.4.0+ cutover. The migration from a zero-capability posture to `CAP_SYS_ADMIN` (or, where supported, `CAP_BPF` + `CAP_PERFMON`) is a forward-looking reference: it tells operators what the eBPF-profiler future will require so they can budget cluster policy, kernel versions, and PSS exceptions ahead of time.
 
 ## TL;DR
 
-The cooperative `pyspy` receiver (Phase 2 design: in-process Python helper + UDS + `faulthandler.dump_traceback`, zero capabilities added per [RFC-0009 §Safety properties](../rfcs/0009-pyspy-receiver-scope.md)) is deleted in PR-M. Operators who want Python stack sampling deploy `parca-agent` as a separate DaemonSet via the upstream chart. `parca-agent` is an eBPF profiler and **requires `CAP_SYS_ADMIN` (or root)** on its pod — a strictly larger capability budget than the zero-capability cooperative pyspy needed. The tracecore DaemonSet's capability set is unchanged (still `drop: [ALL]`, `add: []`); the new capability lives on the `parca-agent` pod, deployed and governed separately by the operator.
+**v0.3.0 actual behaviour.** The cooperative `pyspy` receiver (Phase 2 design: in-process Python helper + UDS + `faulthandler.dump_traceback`, zero capabilities added per [RFC-0009 §Safety properties](../rfcs/0009-pyspy-receiver-scope.md)) **ships as-is in v0.3.0**. No deletion, no PyPI yank, no chart-values key removal, no SecurityContext change on the tracecore DaemonSet: upgrading from v0.2.x to v0.3.0 requires *zero* operator action on the profiling surface.
+
+**v0.4.0+ planned behaviour (PR-M, deferred).** When PR-M lands, the cooperative pyspy receiver and `tracecore-pyspy` PyPI helper are deleted, and operators who want Python stack sampling deploy `parca-agent` as a separate DaemonSet via the upstream chart. `parca-agent` is an eBPF profiler and **requires `CAP_SYS_ADMIN` (or root)** on its pod — a strictly larger capability budget than the zero-capability cooperative pyspy needs. The tracecore DaemonSet's capability set will remain unchanged (`drop: [ALL]`, `add: []`); the new capability lives on the `parca-agent` pod, deployed and governed separately by the operator.
+
+**Re-evaluation triggers** (per [#222](https://github.com/TraceCoreAI/tracecore/issues/222)): PR-M unblocks when (1) OTel Profiles reaches Beta and the `service.profilesSupport` feature-gate is removed, **and** (2) parca-agent gains OTLP export (or PR-M is re-scoped to an "otelcol-ebpf-profiler sibling distro" pattern). Neither condition is met at the v0.3.0 cut.
 
-This guide names the new capability surface, the kernel requirement, the failure modes operators will see if the SecurityContext is too restrictive, and a minimum-grant SecurityContext snippet. The cooperative-pyspy design rationale (why tracecore avoided `CAP_SYS_PTRACE` for two minor releases) is in [RFC-0009 §Alternatives](../rfcs/0009-pyspy-receiver-scope.md#alternatives-considered); the deletion rationale is in [RFC-0013 §Adoption matrix](../rfcs/0013-distro-first-pivot.md#2-adoption-matrix).
+The remainder of this guide names the eventual capability surface, the kernel requirement, the failure modes operators will see if the SecurityContext is too restrictive, and a minimum-grant SecurityContext snippet — all material that operators planning v0.4.0+ upgrades should pre-evaluate today. The cooperative-pyspy design rationale (why tracecore avoided `CAP_SYS_PTRACE` for two minor releases) is in [RFC-0009 §Alternatives](../rfcs/0009-pyspy-receiver-scope.md#alternatives-considered); the deferral rationale is in [#222](https://github.com/TraceCoreAI/tracecore/issues/222).
 
-## Why the security posture changes
+## Why the security posture will change (at v0.4.0+)
 
-Cooperative pyspy (v0.1.x, v0.2.x) walked Python frames **inside** the workload process via `faulthandler.dump_traceback`, then shipped the rendered output over a per-process Unix domain socket. No memory of another process was read; no `ptrace`, no `process_vm_readv`, no signal. The tradeoff was operator-side: every workload had to `pip install tracecore-pyspy` and call `attach()` once at startup, and the helper only worked against cooperating CPython interpreters.
+Cooperative pyspy (v0.1.x, v0.2.x, **and v0.3.0**) walks Python frames **inside** the workload process via `faulthandler.dump_traceback`, then ships the rendered output over a per-process Unix domain socket. No memory of another process is read; no `ptrace`, no `process_vm_readv`, no signal. The tradeoff is operator-side: every workload has to `pip install tracecore-pyspy` and call `attach()` once at startup, and the helper only works against cooperating CPython interpreters.
 
-`parca-agent` (v0.3.0+) walks frames **out-of-process** via eBPF programs attached to the kernel's perf-events subsystem. The eBPF approach removes the workload-side cooperation requirement (any binary the kernel can sample is in scope, including non-Python runtimes), and removes the per-language helper-distribution problem. The cost is privilege: loading eBPF programs requires `CAP_SYS_ADMIN` (or root), and reading symbolized stacks from kernel + user space requires that the agent see the global PID namespace (`hostPID: true`) and the on-disk binaries of every workload it samples.
+`parca-agent` (v0.4.0+ when PR-M lands) walks frames **out-of-process** via eBPF programs attached to the kernel's perf-events subsystem. The eBPF approach removes the workload-side cooperation requirement (any binary the kernel can sample is in scope, including non-Python runtimes), and removes the per-language helper-distribution problem. The cost is privilege: loading eBPF programs requires `CAP_SYS_ADMIN` (or root), and reading symbolized stacks from kernel + user space requires that the agent see the global PID namespace (`hostPID: true`) and the on-disk binaries of every workload it samples.
 
-The change is a tradeoff, not a regression: tracecore preserves the cooperative path through end-of-life at v0.3.0 specifically so operators with restricted-tier Pod Security Standards have one release to evaluate whether the eBPF capability cost is acceptable for their cluster.
+The change is a tradeoff, not a regression: tracecore preserves the cooperative path past v0.3.0 to give operators with restricted-tier Pod Security Standards multiple releases to evaluate whether the eBPF capability cost is acceptable for their cluster.
 
-## What `parca-agent` requires
+## What `parca-agent` will require (forward-looking, v0.4.0+)
 
 | Requirement | Value | Source |
 |---|---|---|
@@ -28,7 +34,7 @@ The change is a tradeoff, not a regression: tracecore preserves the cooperative
 
 **On `CAP_BPF` / `CAP_PERFMON`.** Linux kernel 5.8 split `CAP_SYS_ADMIN`'s BPF surface into the narrower `CAP_BPF` (load BPF programs and maps) + `CAP_PERFMON` (open perf events) capabilities. In principle a profiler that uses only BPF + perf-events can run with `add: [BPF, PERFMON]` instead of `add: [SYS_ADMIN]`. **Upstream `parca-agent` does not document support for this narrower set today** (per its security docs, the requirement is `root` or `CAP_SYS_ADMIN`); operators interested in the narrower split should track [parca-dev/parca-agent#3115](https://github.com/parca-dev/parca-agent/issues) (CAP_BPF/CAP_PERFMON tracking) and validate against their kernel before relying on it. The conservative grant remains `CAP_SYS_ADMIN`.
 
-## What tracecore's pod still requires
+## What tracecore's pod still requires (v0.3.0)
 
 **Unchanged from v0.2.x.** The tracecore DaemonSet's container SecurityContext is still:
 
@@ -41,11 +47,11 @@ containerSecurityContext:
     add: []
 ```
 
-The chart's conftest policy (`install/kubernetes/tracecore/policy/`) still rejects any capability addition — there is no v0.3.0 operator path that puts `CAP_SYS_ADMIN` on the tracecore pod itself. All new capability surface lives on the **separate** `parca-agent` DaemonSet.
+The chart's conftest policy (`install/kubernetes/tracecore/policy/`) still rejects any capability addition — there is no operator path (v0.3.0 *or* v0.4.0+) that puts `CAP_SYS_ADMIN` on the tracecore pod itself. When PR-M lands, all new capability surface will live on the **separate** `parca-agent` DaemonSet.
 
-## Minimum-grant `parca-agent` SecurityContext
+## Minimum-grant `parca-agent` SecurityContext (forward-looking)
 
-A starting point for operators who want to deploy `parca-agent` alongside tracecore. Place the agent in its own namespace; do not co-locate it in the tracecore pod.
+A starting point for operators who want to plan ahead for the v0.4.0+ `parca-agent` deployment alongside tracecore. Place the agent in its own namespace; do not co-locate it in the tracecore pod. **Do not deploy this in v0.3.0** — `parca-agent` is not part of v0.3.0's recipe set; the cooperative pyspy receiver is still the supported path.
 
 ```yaml
 apiVersion: apps/v1
@@ -98,9 +104,9 @@ This is a **starting point**, not the upstream-recommended manifest. Pull the ca
 1. **Pod Security Standards.** `hostPID: true` and `add: [SYS_ADMIN]` both violate **baseline** PSS (and therefore restricted). Clusters with namespace labels `pod-security.kubernetes.io/enforce: baseline` (or restricted) must place `parca-agent` in an exempted namespace.
 2. **OPA / Kyverno cluster policies.** Custom admission policies that ban capability additions, `hostPID`, or host-path mounts must add a `parca-agent`-namespace exception.
 
-## Failure modes when capabilities are missing
+## Failure modes when capabilities are missing (forward-looking, v0.4.0+)
 
-These are the kernel-level failure shapes operators will see in `kubectl logs ds/parca-agent` when the SecurityContext is too restrictive. The agent's exact log strings vary by parca-agent version; the **errno / syscall** column is the stable surface — grep for the syscall name + errno code rather than the prose string.
+These are the kernel-level failure shapes operators will see in `kubectl logs ds/parca-agent` once PR-M has landed, if the `parca-agent` SecurityContext is too restrictive. The agent's exact log strings vary by parca-agent version; the **errno / syscall** column is the stable surface, so grep for the syscall name + errno code rather than the prose string.
 
 | Failure shape | Underlying syscall + errno | Root cause | Remediation |
 |---|---|---|---|
@@ -112,64 +118,59 @@ These are the kernel-level failure shapes operators will see in `kubectl logs ds
 
 When triaging a real failure, capture the agent's full log (`kubectl logs --previous` for crash loops) and check it against [parca-dev/parca-agent/issues](https://github.com/parca-dev/parca-agent/issues) — operator failures outside the patterns above are upstream concerns, not tracecore concerns.
 
-## Helper / receiver removal checklist
+## Helper / receiver removal checklist (forward-looking, v0.4.0+)
 
-The following artefacts are gone at v0.3.0. Any operator config or CI workflow that references them fails fast (chart-render rejects unknown receiver keys; `pip install` fails on the deleted PyPI package).
+**Nothing to remove in v0.3.0.** This checklist is the *eventual* artefact removal once PR-M lands at v0.4.0+. Operators should not act on it at the v0.2.x → v0.3.0 upgrade; it is here so config / CI / alerting owners can stage the eventual cleanup ahead of time.
 
-| Artefact | Action required |
+| Artefact | Action required (at v0.4.0+, not v0.3.0) |
 |---|---|
-| Chart values key `receivers.pyspy.*` | Remove the block. Chart-render in v0.3.0 emits a `NOTES.txt` deprecation warning for one minor; v0.4.0 removes the key entirely. |
-| `pip install tracecore-pyspy` in workload images | Remove from `Dockerfile` / `requirements.txt`. The PyPI package is yanked at v0.3.0; rebuilds will fail with `No matching distribution`. |
+| Chart values key `receivers.pyspy.*` | Remove the block. The chart will emit a `NOTES.txt` deprecation warning for one minor release before the values key is removed. |
+| `pip install tracecore-pyspy` in workload images | Remove from `Dockerfile` / `requirements.txt`. The PyPI package will be yanked when PR-M lands; rebuilds will then fail with `No matching distribution`. |
 | Workload-side `from tracecore_pyspy import attach; attach()` calls | Delete the import and call. No-op replacement — `parca-agent` requires zero workload code changes. |
 | Per-pod `/var/run/tracecore/pyspy/` `emptyDir` volume | Remove from your Pod spec. Was only needed for the UDS rendezvous. |
 | Alerts on `tracecore_receiver_errors_total{component="pyspy",kind=…}` | Delete. No corresponding metric in `parca-agent`; pivot to `parca_agent_*` self-metrics if you alert on profiler health. |
-| Pre-merge CI hooks for `tools/pyspy-lint` | Delete. The symbol-table lint guarded the cooperative receiver's "no out-of-process memory reads" property; it has no purpose once the receiver is gone. |
+| Pre-merge CI hooks for `tools/pyspy-lint` | Delete. The symbol-table lint guards the cooperative receiver's "no out-of-process memory reads" property; it has no purpose once the receiver is gone. |
 
 ## Verification
 
-1. **Before upgrading**, confirm parca-agent is deployable on at least one canary node:
-
-   ```bash
-   # Verify kernel BTF on a canary node
-   kubectl debug node/<canary-node> -it --image=busybox -- ls -la /host/sys/kernel/btf/vmlinux
-   # Expect: file exists. If missing, kernel upgrade required before v0.3.0 cutover.
-   ```
-
-2. **After upgrading**, verify the tracecore pod's SecurityContext is unchanged:
+1. **After upgrading to v0.3.0**, verify the tracecore pod's SecurityContext is unchanged:
 
    ```bash
    kubectl -n tracecore-system get ds tracecore -o yaml \
      | yq '.spec.template.spec.containers[0].securityContext'
    # Expect: capabilities.drop == [ALL], capabilities.add == [] or null.
    ```
 
-3. **Verify parca-agent boot** (in its own namespace):
+2. **Cooperative pyspy still works in v0.3.0.** No re-deploy of the helper is required; existing `tracecore-pyspy` `attach()` calls and `receivers.pyspy.*` chart-values keys continue to function. The receiver remains registered in v0.3.0's OCB binary.
+
+3. **Forward-looking: BTF check for the eventual parca-agent migration.** Operators planning the v0.4.0+ upgrade can confirm kernel BTF availability on canary nodes ahead of time:
 
    ```bash
-   kubectl -n parca logs ds/parca-agent --tail=50 \
-     | grep -E 'started|listening|attached'
-   # Expect: "started" line. EPERM / BTF errors per the table above indicate misconfiguration.
+   # Verify kernel BTF on a canary node (v0.4.0+ prerequisite)
+   kubectl debug node/<canary-node> -it --image=busybox -- ls -la /host/sys/kernel/btf/vmlinux
+   # Expect: file exists. If missing, kernel upgrade required before the v0.4.0+ cutover.
    ```
 
 ## Rollback
 
-The cooperative pyspy receiver is **not** registered in v0.3.0's OCB binary (per `builder-config.yaml`). Recipe-toggle rollback is not available. If parca-agent doesn't meet your security or compatibility budget, pin the chart and image at the last v0.2.x tag (`v0.2.0-…`; substitute the latest `v0.2.x` tag from `git tag -l 'v0.2.*'`) and keep running the cooperative receiver:
+There is no profiling-surface rollback required for the v0.2.x → v0.3.0 upgrade — the cooperative pyspy receiver is still registered in v0.3.0's OCB binary, the `tracecore-pyspy` PyPI package is still installable, and the chart's `receivers.pyspy.*` values key is still honoured. If a v0.3.0 upgrade introduces an unrelated regression, the standard chart-pin rollback applies:
 
 ```bash
 helm upgrade tracecore install/kubernetes/tracecore \
   --version <chart-package version matching v0.2.x> \
   --set image.tag=<v0.2.x binary tag>
 ```
 
-The cooperative receiver's PyPI helper (`tracecore-pyspy`) remains installable from PyPI's archive for one minor release after v0.3.0 cuts; pin `tracecore-pyspy==0.1.0` in your workload `requirements.txt`. PyPI yank happens at v0.4.0.
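+(Substitute the latest `v0.2.x` tag from `git tag -l 'v0.2.*'`. Because the chart is referenced by local path, check out that tag before running the command; Helm applies `--version` only to charts resolved from a repository.)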
 
 ## References
 
-- [RFC-0013 §Adoption matrix](../rfcs/0013-distro-first-pivot.md#2-adoption-matrix) — why pyspy is deleted in favour of parca-agent
-- [RFC-0013 §Migration / rollout](../rfcs/0013-distro-first-pivot.md#migration--rollout) — PR-M and PR-N sequencing
-- [RFC-0009 §Safety properties](../rfcs/0009-pyspy-receiver-scope.md#proposal) — historical record of the cooperative receiver's zero-capability design
-- [`components/receivers/pyspy/README.md`](../../components/receivers/pyspy/README.md) — cooperative receiver's user-facing docs (carries the v0.3.0 deletion banner)
-- [`components/receivers/pyspy/RUNBOOK.md`](../../components/receivers/pyspy/RUNBOOK.md) — per-kind operator triage for the cooperative receiver (preserved for operators still on v0.2.x)
+- [#222: PR-M deferral memo](https://github.com/TraceCoreAI/tracecore/issues/222) — current PR-M status + re-evaluation triggers (OTel Profiles → Beta, parca-agent OTLP export)
+- [RFC-0013 §Adoption matrix](../rfcs/0013-distro-first-pivot.md#2-adoption-matrix) — why pyspy is on the eventual deletion path in favour of parca-agent (note: timing in the RFC predates the #222 deferral)
+- [RFC-0013 §Migration / rollout](../rfcs/0013-distro-first-pivot.md#migration--rollout) — original PR-M and PR-N sequencing (superseded by #222 for the current timeline)
+- [RFC-0009 §Safety properties](../rfcs/0009-pyspy-receiver-scope.md#proposal) — design record of the cooperative receiver's zero-capability posture (still in force at v0.3.0)
+- [`components/receivers/pyspy/README.md`](../../components/receivers/pyspy/README.md) — cooperative receiver's user-facing docs (the receiver ships in v0.3.0)
+- [`components/receivers/pyspy/RUNBOOK.md`](../../components/receivers/pyspy/RUNBOOK.md) — per-kind operator triage for the cooperative receiver
 - [Parca Agent / Requirements](https://github.com/parca-dev/parca-agent#requirements)
 - [Parca Agent / Security](https://www.parca.dev/docs/parca-agent-security)
 - [Linux Yama LSM (`ptrace_scope`)](https://docs.kernel.org/admin-guide/LSM/Yama.html) — relevant for operators evaluating in-cluster debugging policy alongside eBPF profiling