Symptom
When the WP Codebox PHPUnit runner crashes before tests start executing (bootstrap failure, Playground init error, missing PHP extension, fatal during plugin load, etc.), it exits with code 1 and prints only this banner:
============================================
UNCLASSIFIED WP CODEBOX FAILURE (exit=1)
============================================
BUILD FAILED: WP Codebox exited with code 1 (unclassified)
============================================
The actual child-process error is completely swallowed. From CI you cannot tell whether the failure was:
- A PHP fatal in the plugin being tested
- A WordPress Playground bootstrap failure
- A missing PHP extension
- A dependency-resolver bug
- A real test failure
- The Playground runtime panicking
- A network/CDN issue pulling Playground assets
The homeboy wrapper around wp-codebox reports raw_output.stderr_tail and raw_output.stdout_tail, but in the pre-test crash case both are essentially empty — stderr_tail contains only the harmless info line:
Resolved dependency 'agents-api' via final validation dependency path: ... (HEAD <sha> (detached))
That line is normal — it's wp-codebox's own dependency resolver logging success. Nothing about the actual crash gets written to stderr or stdout before exit.
Canonical repro
PR Extra-Chill/data-machine#2154 adds two filter hooks to TaxonomyHandler.php and one PHPUnit test file (tests/Unit/Core/WordPress/TaxonomyHandlerFiltersTest.php). The diff is small and both files pass php -l. The main branch of the same repo passes WP Codebox tests (1321/1324, ~73s). The PR run dies after 11 seconds with the unclassified banner.
Run: https://github.com/Extra-Chill/data-machine/actions/runs/26198772928/job/77083947823
Timeline from the captured log:
00:57:42 Running PHPUnit tests via WP Codebox...
00:57:42 Plugin: data-machine
00:57:42 Backend: wp-codebox (WordPress Playground runtime)
00:57:53 UNCLASSIFIED WP CODEBOX FAILURE (exit=1)
00:57:53 BUILD FAILED: WP Codebox exited with code 1 (unclassified)
11 seconds, zero useful output. The actual crash happened somewhere in that gap and we have no way to know what or where.
Compare to a healthy run on the same repo's main branch (https://github.com/Extra-Chill/data-machine/actions/runs/26183503804): Installing... PHPUnit 9.6.34 ... 1324 tests ... OK over ~73s with full PHPUnit dot output streamed to stdout.
Why this matters
This is the core observability gap: wp-codebox's primary value proposition is reproducible CI testing inside a sandboxed Playground, and right now any failure in the sandbox is opaque. You can't fix what you can't see. Every pre-test crash becomes a binary-search expedition (revert, rerun, revert, rerun) instead of a "read the PHP fatal and fix it" five-second job.
Proposed fix
When the child runner process exits non-zero before producing structured PHPUnit results, capture its full stderr and stdout (or at least the last N KB of each) and print them in the failure banner before the "UNCLASSIFIED" line. Something like:
============================================
WP CODEBOX BOOTSTRAP FAILURE (exit=1)
============================================
--- Last 200 lines of child stderr ---
<actual PHP fatal / Playground panic / bootstrap error>
--- Last 50 lines of child stdout ---
<whatever the child wrote before dying>
============================================
BUILD FAILED: WP Codebox exited with code 1
============================================
The "unclassified" framing should be the last resort when the child genuinely produces zero output — not the first response when child output exists but the runner failed to capture it.
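A minimal TypeScript sketch of that banner logic, assuming the runner already holds the child's stderr and stdout as strings. The function names (tailLines, renderFailureBanner) and the 200/50 line limits are illustrative only; none of them come from the wp-codebox source.

```typescript
// Hypothetical helpers: names and limits are assumptions, not wp-codebox code.

const RULE = "============================================";

/** Return the last `maxLines` lines of `text`, ignoring one trailing newline. */
function tailLines(text: string, maxLines: number): string[] {
  if (maxLines <= 0) return [];
  const lines = text.replace(/\r?\n$/, "").split(/\r?\n/);
  if (lines.length === 1 && lines[0] === "") return []; // empty input
  return lines.slice(-maxLines);
}

/** Build the failure banner; fall back to "unclassified" only when the child wrote nothing. */
function renderFailureBanner(exitCode: number, stderr: string, stdout: string): string {
  const errTail = tailLines(stderr, 200);
  const outTail = tailLines(stdout, 50);

  if (errTail.length === 0 && outTail.length === 0) {
    // Last resort: the child genuinely produced zero output.
    return [
      RULE,
      `UNCLASSIFIED WP CODEBOX FAILURE (exit=${exitCode})`,
      RULE,
      `BUILD FAILED: WP Codebox exited with code ${exitCode} (unclassified)`,
      RULE,
    ].join("\n");
  }

  return [
    RULE,
    `WP CODEBOX BOOTSTRAP FAILURE (exit=${exitCode})`,
    RULE,
    `--- Last ${errTail.length} lines of child stderr ---`,
    ...errTail,
    `--- Last ${outTail.length} lines of child stdout ---`,
    ...outTail,
    RULE,
    `BUILD FAILED: WP Codebox exited with code ${exitCode}`,
    RULE,
  ].join("\n");
}
```

Labelling each section with the number of lines actually printed, rather than the configured maximum, keeps the header accurate when the child wrote only a few lines.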
Where to look
The runner that wraps the Playground process and emits the banner is presumably in packages/cli/ (or wherever the CLI entry-point lives). Whatever spawns the Playground subprocess needs to:
- Pipe child stderr to a buffer (not /dev/null).
- Pipe child stdout to a buffer (not /dev/null).
- On non-zero exit, if no [test-results] line was seen, dump both buffers to the wrapper's stderr before the unclassified banner.
If the child process is itself crashing inside the Playground WASM runtime (PHP fatal in WordPress core / plugin code), the Playground host should be flushing the WASM stdout buffer before reporting the crash. That's a different layer but same shape of fix.
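The three steps in the list above could look roughly like this in a Node-based runner. This is a sketch under stated assumptions: the wp-codebox CLI may not be written in TypeScript, the names runChild and reportFailure are hypothetical, and the [test-results] marker is assumed to appear somewhere in the child's output. It uses Node's spawnSync for brevity, which buffers everything until the child exits; a real runner would more likely use the asynchronous spawn and tee the streams so PHPUnit's live output is preserved.

```typescript
import { spawnSync } from "node:child_process";

// Assumption: the child prints a line containing this marker once PHPUnit results exist.
const RESULTS_MARKER = "[test-results]";

interface ChildOutcome {
  exitCode: number;
  sawResults: boolean;
  stdout: string;
  stderr: string;
}

/** Run the child with stdout and stderr piped into buffers instead of being discarded. */
function runChild(command: string, args: string[]): ChildOutcome {
  const result = spawnSync(command, args, {
    stdio: ["ignore", "pipe", "pipe"],
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024, // generous cap so a noisy child is not cut off early
  });
  const stdout = result.stdout ?? "";
  let stderr = result.stderr ?? "";
  if (result.error) {
    // Spawn-level failures (e.g. command not found) are surfaced as well.
    stderr += `${result.error.message}\n`;
  }
  return {
    exitCode: result.status ?? 1, // null status (killed by signal) is treated as failure
    sawResults: (stdout + stderr).includes(RESULTS_MARKER),
    stdout,
    stderr,
  };
}

/** On a non-zero exit without structured results, dump both buffers via `write`. */
function reportFailure(outcome: ChildOutcome, write: (text: string) => void): void {
  if (outcome.exitCode === 0 || outcome.sawResults) return;
  if (outcome.stderr) write(`--- child stderr ---\n${outcome.stderr}`);
  if (outcome.stdout) write(`--- child stdout ---\n${outcome.stdout}`);
}
```

In the wrapper, reportFailure(outcome, (text) => process.stderr.write(text)) would run before the banner is printed, so the child's own error precedes any "unclassified" message.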
Acceptance
- Cause an intentional bootstrap fatal in a plugin under test (e.g. throw new \Error('canary') in the plugin's main file or a setUp() method).
- Run via homeboy test / wp-codebox test.
- The captured CI log must contain the literal error message ("canary") and ideally the stack trace.
- Only if NO output was produced by the child should the "UNCLASSIFIED" banner appear.
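These criteria can be turned into an automated check on the captured CI log. The sketch below is one possible shape, assuming the log is available as a single string; checkCanaryLog is a hypothetical helper, not part of homeboy or wp-codebox.

```typescript
/**
 * Return a list of acceptance problems for a captured log (empty means pass).
 * Hypothetical helper for illustration only.
 */
function checkCanaryLog(log: string, canary: string): string[] {
  const problems: string[] = [];
  const hasCanary = log.includes(canary);
  if (!hasCanary) {
    problems.push(`log does not contain the literal error message "${canary}"`);
  }
  if (hasCanary && log.includes("UNCLASSIFIED")) {
    // The unclassified banner is reserved for runs where the child wrote nothing.
    problems.push("UNCLASSIFIED banner printed even though child output was captured");
  }
  return problems;
}
```

A canary workflow could fail the job whenever the returned list is non-empty.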
Related
- The wp-codebox-test-canary workflow on data-machine (commit db200773 on the wp-codebox-test-canary branch) was passing as recently as 2026-05-20 13:42 UTC, which confirms that wp-codebox itself works for happy-path runs. The bug here is exclusively about failure-path observability.
- This issue does NOT block normal usage of wp-codebox where tests pass. It blocks diagnosis when they don't.
- Discovered while reviewing the Extra Chill events location-taxonomy cleanup PRs; the data-machine#2154 PR is otherwise clean code (PR diff lints, no domain leakage, generic filter hooks with WP-style docblocks) but cannot be confirmed green because the runner won't say what's wrong.
Out of scope
- Whatever specific thing crashed in PR #2154's bootstrap. That's downstream of this issue — once stderr is visible, fixing it is mechanical.
- Playground host runtime bugs. If WASM PHP crashes hard (segfault), capturing stdout/stderr from the host process is the minimum useful behavior; debugging Playground itself is its own project.