From 83adce2298d7610d4a0d0dc9617516105c5a9d8d Mon Sep 17 00:00:00 2001
From: Tri Lam
Date: Sat, 30 May 2026 23:46:30 -0700
Subject: [PATCH] =?UTF-8?q?docs(security):=20PR-N=20=E2=80=94=20pyspy=20ca?=
 =?UTF-8?q?pability=20surface=20+=20SecurityContext=20guide?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds docs/migration/v0.2-to-v0.3.md covering the v0.3.0
security-posture migration per RFC-0013 §migration. Cooperative pyspy
(zero capabilities, in-process faulthandler) is deleted at v0.3.0;
operators who want Python profiling deploy parca-agent, which requires
CAP_SYS_ADMIN (or root) + hostPID + BTF-enabled kernel.

The guide names the exact capability surface, kernel requirement,
kernel syscall + errno failure shapes (not paraphrased agent log
strings), minimum-grant SecurityContext snippet, and rollback path.
Conservative on CAP_BPF/CAP_PERFMON — upstream parca-agent does not
document the narrower split today.

Updates docs/migration/v0.1-to-v0.2.md pyspy row to forward-reference
the new guide (was claiming "no upstream replacement exists today" —
RFC-0013 names parca-agent at v0.3.0).

Updates docs/README.md to index the migration/ subdirectory.

Co-Authored-By: Claude Opus 4.7 (1M context)
Signed-off-by: Tri Lam
---
 docs/README.md                 |   1 +
 docs/migration/v0.1-to-v0.2.md |   2 +-
 docs/migration/v0.2-to-v0.3.md | 176 +++++++++++++++++++++++++++++++++
 3 files changed, 178 insertions(+), 1 deletion(-)
 create mode 100644 docs/migration/v0.2-to-v0.3.md

diff --git a/docs/README.md b/docs/README.md
index 78b8b4b5..7d8f5330 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -36,6 +36,7 @@ Legend: 👤 operator · 🛠️ contributor · 🏛️ maintainer · 🌐 exter
 | [examples/](examples/) | 👤 | Reference operator artifacts (Prometheus alerts, Grafana dashboard, with-telemetry config). |
 | [followups/](followups/) | 🏛️ | Per-milestone follow-up shards + cross-cutting `_needs-prod-data` / `_needs-gpu` buckets.
See [followups/README.md](followups/README.md) for filing convention. |
 | [integrations/](integrations/) | 👤 | Validated recipes for shipping tracecore output to specific backends. See per-recipe rows below. |
+| [migration/](migration/) | 👤 | Per-minor-release upgrade guides covering every operator-visible break. One file per release boundary. |
 | [notes/](notes/) | 🛠️ 🏛️ | Working notes on process, CI, PR workflow, reviews, conftest, autonomous-run logs. See [notes/README.md](notes/README.md). |
 
 ## Integrations
diff --git a/docs/migration/v0.1-to-v0.2.md b/docs/migration/v0.1-to-v0.2.md
index 7ba0e171..d7e6496c 100644
--- a/docs/migration/v0.1-to-v0.2.md
+++ b/docs/migration/v0.1-to-v0.2.md
@@ -69,7 +69,7 @@ The OCB-assembled binary registers only the components listed in [`builder-confi
 | `receivers.k8sevents` | receiver | `k8sobjectsreceiver` + OTTL `k8s.event.hint` transform | Leave `k8sevents.enabled: false` (default). The PR-J recipe ships the OTTL transform that preserves the 11-entry `k8s.event.hint` enum (RFC-0013 §3 contract); until then, pin v0.1.x if you alert on `k8s.event.hint`. |
 | `receivers.kernelevents` | receiver | `journaldreceiver` + `filelogreceiver` (kmsg) + OTTL Xid transform | Leave `kernelevents.enabled: false` (default). The PR-J recipe ships the OTTL transform that keeps `kernelevents.xid` populated; until then, pin v0.1.x if you alert on Xid codes. |
 | `receivers.nccl_fr` | receiver | In-repo Go submodule via OCB `gomod:` (PR-I) + `replaces: ./module` | No operator action; the receiver ships in `module/receiver/ncclfrreceiver` and OCB pulls it like any upstream module. |
-| `receivers.pyspy` | receiver | Deferred until OTel Profiles GA | Leave `pyspy.enabled: false` (default). No upstream replacement exists today; the toggle survives until contrib ships `pprofreceiver`. |
+| `receivers.pyspy` | receiver | `parca-agent` (separate DaemonSet, eBPF) at v0.3.0 | Leave `pyspy.enabled: false` (default) through v0.2.x.
At v0.3.0 the receiver + PyPI helper are deleted; the security-posture migration (zero-capability cooperative receiver → `parca-agent`'s `CAP_SYS_ADMIN` on a separate pod) is documented in [`v0.2-to-v0.3.md`](v0.2-to-v0.3.md). |
 | `exporters.stdoutexporter` | exporter | `debugexporter` (OCB-bundled, chart default) | Replace `exporters.stdoutexporter` with `exporters.debug` in pipelines. The debug exporter writes to pod stdout, same observation channel. |
 | `exporters.otlphttp` (in-tree clone) | exporter | `otlphttpexporter` (OCB-bundled) | Same chart key (`otlphttp`), same field shape — `endpoint`, `compression`, `headers`, `tls.*`, `timeout`, `retry_on_failure`, `sending_queue` pass through to the upstream exporter without translation. |
 
diff --git a/docs/migration/v0.2-to-v0.3.md b/docs/migration/v0.2-to-v0.3.md
new file mode 100644
index 00000000..93cba6cb
--- /dev/null
+++ b/docs/migration/v0.2-to-v0.3.md
@@ -0,0 +1,176 @@
+# Migration: v0.2.x → v0.3.0
+
+This guide tells operators how to move from a `v0.2.x` deployment to `v0.3.0`. The single operator-visible break this release is the Python-profiling story: tracecore's in-tree `pyspy` receiver and its `tracecore-pyspy` PyPI helper are deleted, and the upstream-recipe replacement (`parca-agent`) changes the SecurityContext budget. Everything else is unchanged from `v0.2.x` — see [`v0.1-to-v0.2.md`](v0.1-to-v0.2.md) for the prior cut's surface.
+
+## TL;DR
+
+The cooperative `pyspy` receiver (Phase 2 design: in-process Python helper + UDS + `faulthandler.dump_traceback`, zero capabilities added per [RFC-0009 §Safety properties](../rfcs/0009-pyspy-receiver-scope.md)) is deleted in PR-M. Operators who want Python stack sampling deploy `parca-agent` as a separate DaemonSet via the upstream chart. `parca-agent` is an eBPF profiler and **requires `CAP_SYS_ADMIN` (or root)** on its pod — a strictly larger capability budget than the zero-capability cooperative pyspy needed.
The tracecore DaemonSet's capability set is unchanged (still `drop: [ALL]`, `add: []`); the new capability lives on the `parca-agent` pod, deployed and governed separately by the operator.
+
+This guide names the new capability surface, the kernel requirement, the failure modes operators will see if the SecurityContext is too restrictive, and a minimum-grant SecurityContext snippet. The cooperative-pyspy design rationale (why tracecore avoided `CAP_SYS_PTRACE` for two minor releases) is in [RFC-0009 §Alternatives](../rfcs/0009-pyspy-receiver-scope.md#alternatives-considered); the deletion rationale is in [RFC-0013 §Adoption matrix](../rfcs/0013-distro-first-pivot.md#2-adoption-matrix).
+
+## Why the security posture changes
+
+Cooperative pyspy (v0.1.x, v0.2.x) walked Python frames **inside** the workload process via `faulthandler.dump_traceback`, then shipped the rendered output over a per-process Unix domain socket. No memory of another process was read; no `ptrace`, no `process_vm_readv`, no signal. The tradeoff was operator-side: every workload had to `pip install tracecore-pyspy` and call `attach()` once at startup, and the helper only worked against cooperating CPython interpreters.
+
+`parca-agent` (v0.3.0+) walks frames **out-of-process** via eBPF programs attached to the kernel's perf-events subsystem. The eBPF approach removes the workload-side cooperation requirement (any binary the kernel can sample is in scope, including non-Python runtimes), and removes the per-language helper-distribution problem. The cost is privilege: loading eBPF programs requires `CAP_SYS_ADMIN` (or root), and reading symbolized stacks from kernel + user space requires that the agent see the global PID namespace (`hostPID: true`) and the on-disk binaries of every workload it samples.
+
+The change is a tradeoff, not a regression: tracecore preserves the cooperative path through end-of-life at v0.3.0 specifically so operators with restricted-tier Pod Security Standards have one release to evaluate whether the eBPF capability cost is acceptable for their cluster.
+
+## What `parca-agent` requires
+
+| Requirement | Value | Source |
+|---|---|---|
+| Linux kernel | ≥ 5.3 with BTF (`CONFIG_DEBUG_INFO_BTF=y`) | [Parca Agent docs / Requirements](https://github.com/parca-dev/parca-agent#requirements) |
+| User | `root` **OR** `CAP_SYS_ADMIN` (no narrower split documented upstream) | [Parca Agent docs / Security](https://www.parca.dev/docs/parca-agent-security) |
+| Pod-level | `hostPID: true` (cross-namespace process visibility for symbolization) | upstream DaemonSet manifest |
+| Volumes | `/sys` (BPF FS, perf-events), `/proc` (process discovery), `/run` (BPF map persistence), host filesystem for symbol resolution | upstream DaemonSet manifest |
+| Container | Privileged **OR** `add: [SYS_ADMIN]` on top of `drop: [ALL]` | upstream documentation |
+
+**On `CAP_BPF` / `CAP_PERFMON`.** Linux kernel 5.8 split `CAP_SYS_ADMIN`'s BPF surface into the narrower `CAP_BPF` (load BPF programs and maps) + `CAP_PERFMON` (open perf events) capabilities. In principle a profiler that uses only BPF + perf-events can run with `add: [BPF, PERFMON]` instead of `add: [SYS_ADMIN]`. **Upstream `parca-agent` does not document support for this narrower set today** (per its security docs, the requirement is `root` or `CAP_SYS_ADMIN`); operators interested in the narrower split should track [parca-dev/parca-agent#3115](https://github.com/parca-dev/parca-agent/issues) (CAP_BPF/CAP_PERFMON tracking) and validate against their kernel before relying on it. The conservative grant remains `CAP_SYS_ADMIN`.
+
+## What tracecore's pod still requires
+
+**Unchanged from v0.2.x.** The tracecore DaemonSet's container SecurityContext is still:
+
+```yaml
+containerSecurityContext:
+  allowPrivilegeEscalation: false
+  readOnlyRootFilesystem: true
+  capabilities:
+    drop: [ALL]
+    add: []
+```
+
+The chart's conftest policy (`install/kubernetes/tracecore/policy/`) still rejects any capability addition — there is no v0.3.0 operator path that puts `CAP_SYS_ADMIN` on the tracecore pod itself. All new capability surface lives on the **separate** `parca-agent` DaemonSet.
+
+## Minimum-grant `parca-agent` SecurityContext
+
+A starting point for operators who want to deploy `parca-agent` alongside tracecore. Place the agent in its own namespace; do not co-locate it in the tracecore pod.
+
+```yaml
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name: parca-agent
+  namespace: parca
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: parca-agent
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: parca-agent
+    spec:
+      # Cross-namespace PID visibility required for symbolization.
+      hostPID: true
+      serviceAccountName: parca-agent
+      containers:
+        - name: parca-agent
+          image: ghcr.io/parca-dev/parca-agent:v0.46.0
+          securityContext:
+            # Minimum-grant: drop all, add the one capability the
+            # eBPF + perf-events path requires. Avoid `privileged:
+            # true` — the explicit capability add is narrower and
+            # passes restricted-tier audits with a documented
+            # exception, while `privileged` grants the union of
+            # all capabilities + device access.
+            allowPrivilegeEscalation: false
+            # `readOnlyRootFilesystem` is desirable but not asserted
+            # against the upstream agent here; verify against
+            # parca-dev/parca-agent/deploy/ for the current
+            # writable-path set before enabling.
+            capabilities:
+              drop: [ALL]
+              add: [SYS_ADMIN]
+          volumeMounts:
+            - { name: sys, mountPath: /sys, readOnly: false }
+            - { name: proc, mountPath: /host/proc, readOnly: true }
+            - { name: run, mountPath: /run }
+      volumes:
+        - { name: sys, hostPath: { path: /sys, type: Directory } }
+        - { name: proc, hostPath: { path: /proc, type: Directory } }
+        - { name: run, hostPath: { path: /run, type: Directory } }
+```
+
+This is a **starting point**, not the upstream-recommended manifest. Pull the canonical deployment from [parca-dev/parca-agent/deploy/](https://github.com/parca-dev/parca-agent/tree/main/deploy) and adapt to your cluster's PSS tier. Two cluster-policy interactions to verify before rollout:
+
+1. **Pod Security Standards.** `hostPID: true` and `add: [SYS_ADMIN]` both violate **baseline** PSS (and therefore restricted). Clusters with namespace labels `pod-security.kubernetes.io/enforce: baseline` (or restricted) must place `parca-agent` in an exempted namespace.
+2. **OPA / Kyverno cluster policies.** Custom admission policies that ban capability additions, `hostPID`, or host-path mounts must add a `parca-agent`-namespace exception.
+
+## Failure modes when capabilities are missing
+
+These are the kernel-level failure shapes operators will see in `kubectl logs ds/parca-agent` when the SecurityContext is too restrictive. The agent's exact log strings vary by parca-agent version; the **errno / syscall** column is the stable surface — grep for the syscall name + errno code rather than the prose string.
+
+| Failure shape | Underlying syscall + errno | Root cause | Remediation |
+|---|---|---|---|
+| BPF program load fails at startup | `bpf(BPF_PROG_LOAD, …)` → `EPERM` | Container missing `CAP_SYS_ADMIN`. The kernel rejects BPF program load from an unprivileged process. | Add `capabilities.add: [SYS_ADMIN]` to the container `securityContext`, or set `securityContext.privileged: true`.
|
+| Perf event open fails | `perf_event_open(…)` → `EACCES` or `EPERM` | Container has `CAP_SYS_ADMIN` but the kernel's `kernel.perf_event_paranoid` sysctl is `>= 2`, blocking unprivileged perf measurements. (`CAP_SYS_ADMIN` bypasses this on most kernels; some hardened distros require explicit `CAP_PERFMON` even with admin.) | Either lower `kernel.perf_event_paranoid` to `1` on the node (sysctl, requires node-level access), or upgrade kernel to ≥5.8 and add `CAP_PERFMON` to the container. |
+| BTF discovery fails at startup | `open("/sys/kernel/btf/vmlinux", …)` → `ENOENT` | Kernel is missing `CONFIG_DEBUG_INFO_BTF=y`. Most distro kernels ≥5.3 ship BTF; minimal / Alpine / older RHEL kernels may not. | Upgrade to a BTF-enabled kernel (`ls /sys/kernel/btf/vmlinux` on the node confirms), or pin nodes with a known-good kernel (Ubuntu 22.04+, RHEL 9+, Amazon Linux 2023, GKE / EKS / AKS managed images). |
+| BPF FS unavailable | `mount("bpf", "/sys/fs/bpf", "bpf", …)` → `EPERM` or `EACCES` | Container missing `CAP_SYS_ADMIN`, OR `/sys` host-path mount is `readOnly: true`, OR the node has no BPF FS available. | Ensure `CAP_SYS_ADMIN` is granted, the `/sys` mount is `readOnly: false`, and `mount \| grep bpf` on the node returns a `bpf` line. |
+| Workload PIDs not discoverable | `readdir("/proc")` returns only the agent's own PID namespace | Pod is missing `hostPID: true`. The agent's `/proc` view doesn't include workload PIDs across namespaces. | Set `hostPID: true` on the pod spec. |
+
+When triaging a real failure, capture the agent's full log (`kubectl logs --previous` for crash loops) and check it against [parca-dev/parca-agent/issues](https://github.com/parca-dev/parca-agent/issues) — operator failures outside the patterns above are upstream concerns, not tracecore concerns.
+
+## Helper / receiver removal checklist
+
+The following artefacts are gone at v0.3.0.
Any operator config or CI workflow that references them fails fast (chart-render rejects unknown receiver keys; `pip install` fails on the deleted PyPI package).
+
+| Artefact | Action required |
+|---|---|
+| Chart values key `receivers.pyspy.*` | Remove the block. Chart-render in v0.3.0 emits a `NOTES.txt` deprecation warning for one minor; v0.4.0 removes the key entirely. |
+| `pip install tracecore-pyspy` in workload images | Remove from `Dockerfile` / `requirements.txt`. The PyPI package is yanked at v0.3.0; rebuilds will fail with `No matching distribution`. |
+| Workload-side `from tracecore_pyspy import attach; attach()` calls | Delete the import and call. No-op replacement — `parca-agent` requires zero workload code changes. |
+| Per-pod `/var/run/tracecore/pyspy/` `emptyDir` volume | Remove from your Pod spec. Was only needed for the UDS rendezvous. |
+| Alerts on `tracecore_receiver_errors_total{component="pyspy",kind=…}` | Delete. No corresponding metric in `parca-agent`; pivot to `parca_agent_*` self-metrics if you alert on profiler health. |
+| Pre-merge CI hooks for `tools/pyspy-lint` | Delete. The symbol-table lint guarded the cooperative receiver's "no out-of-process memory reads" property; it has no purpose once the receiver is gone. |
+
+## Verification
+
+1. **Before upgrading**, confirm parca-agent is deployable on at least one canary node:
+
+   ```bash
+   # Verify kernel BTF on a canary node (substitute the node's name for <node-name>)
+   kubectl debug node/<node-name> -it --image=busybox -- ls -la /host/sys/kernel/btf/vmlinux
+   # Expect: file exists. If missing, kernel upgrade required before v0.3.0 cutover.
+   ```
+
+2. **After upgrading**, verify the tracecore pod's SecurityContext is unchanged:
+
+   ```bash
+   kubectl -n tracecore-system get ds tracecore -o yaml \
+     | yq '.spec.template.spec.containers[0].securityContext'
+   # Expect: capabilities.drop == [ALL], capabilities.add == [] or null.
+   ```
+
+3. 
**Verify parca-agent boot** (in its own namespace):
+
+   ```bash
+   kubectl -n parca logs ds/parca-agent --tail=50 \
+     | grep -E 'started|listening|attached'
+   # Expect: "started" line. EPERM / BTF errors per the table above indicate misconfiguration.
+   ```
+
+## Rollback
+
+The cooperative pyspy receiver is **not** registered in v0.3.0's OCB binary (per `builder-config.yaml`). Recipe-toggle rollback is not available. If parca-agent doesn't meet your security or compatibility budget, pin the chart and image at the last v0.2.x tag (`v0.2.0-…`; substitute the latest `v0.2.x` tag from `git tag -l 'v0.2.*'`) and keep running the cooperative receiver:
+
+```bash
+# <v0.2.x-chart-version> and <v0.2.x-image-tag> are placeholders for the tag chosen above.
+helm upgrade tracecore install/kubernetes/tracecore \
+  --version <v0.2.x-chart-version> \
+  --set image.tag=<v0.2.x-image-tag>
+```
+
+The cooperative receiver's PyPI helper (`tracecore-pyspy`) remains installable from PyPI's archive for one minor release after v0.3.0 cuts; pin `tracecore-pyspy==0.1.0` in your workload `requirements.txt`. PyPI yank happens at v0.4.0.
+
+## References
+
+- [RFC-0013 §Adoption matrix](../rfcs/0013-distro-first-pivot.md#2-adoption-matrix) — why pyspy is deleted in favour of parca-agent
+- [RFC-0013 §Migration / rollout](../rfcs/0013-distro-first-pivot.md#migration--rollout) — PR-M and PR-N sequencing
+- [RFC-0009 §Safety properties](../rfcs/0009-pyspy-receiver-scope.md#proposal) — historical record of the cooperative receiver's zero-capability design
+- [`components/receivers/pyspy/README.md`](../../components/receivers/pyspy/README.md) — cooperative receiver's user-facing docs (carries the v0.3.0 deletion banner)
+- [`components/receivers/pyspy/RUNBOOK.md`](../../components/receivers/pyspy/RUNBOOK.md) — per-kind operator triage for the cooperative receiver (preserved for operators still on v0.2.x)
+- [Parca Agent / Requirements](https://github.com/parca-dev/parca-agent#requirements)
+- [Parca Agent / Security](https://www.parca.dev/docs/parca-agent-security)
+- [Linux Yama LSM 
(`ptrace_scope`)](https://docs.kernel.org/admin-guide/LSM/Yama.html) — relevant for operators evaluating in-cluster debugging policy alongside eBPF profiling
+- [Kubernetes Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) — baseline / restricted / privileged tier definitions