fix: emit real unified diff for remote workspace pending changes (#429) #430

Merged: chubes4 merged 2 commits into main from fix-unified-diff-generator on May 18, 2026

chubes4 commented May 18, 2026

Summary

RemoteWorkspaceBackend::build_full_file_diff() did not produce a unified diff. It emitted the entire old content as - lines and the entire new content as + lines, regardless of how small the actual change was. Agents calling workspace_git_diff to verify a surgical workspace_edit saw their whole file marked for removal and re-addition, and consistently misinterpreted this as "whole-file line-ending weather", concluding the edit had nuked the file even though the actual write to GitHub was correct.

Real example: the world-of-wordpress agent hit this pattern three cycles in a row on the Observatory page in run 26050485830. A one-line <h2> heading change showed up as @@ -1,250 +1,278 @@ followed by every line removed then re-added, and the agent reverted its (correct) edit to avoid committing a noisy rewrite. The agent's restraint was reasonable given what it saw, but what it saw was a phantom diff.

What this change does

Replaces the fake whole-file diff with a real unified-diff generator:

  1. Common-prefix/suffix trimming on the line arrays, so the O(ND) core only runs on the middle window that actually differs. Typical "surgical edit in large file" cases finish in O(N), well short of the O(N^2) worst case.
  2. Myers' O(ND) diff algorithm with V-array trace, walked backwards to reconstruct an ordered edit script.
  3. Hunk grouping with three lines of context on each side, exactly the format git diff --no-color produces.
  4. Proper @@ -start,count +start,count @@ markers, including the 0,0 edge cases for pure addition and pure deletion.
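
Steps 1 and 2 above can be sketched as follows. This is a minimal Python illustration, not the PHP added in this PR: the function names (trim_common, myers, backtrack, diff_lines) and the (tag, line) edit-script shape are assumptions made for the example. It follows the usual formulation of Myers' algorithm: record a snapshot of the V map before each round, then walk the snapshots backwards to recover the edits.

```python
def trim_common(old, new):
    """Return (prefix, suffix): counts of identical leading and trailing lines."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    # Stop before the suffix would overlap the prefix on the shorter side.
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix


def myers(a, b):
    """Shortest edit script from a to b as (tag, line) pairs; tag is ' ', '-' or '+'."""
    n, m = len(a), len(b)
    v = {1: 0}   # v[k] = furthest x reached on diagonal k, where k = x - y
    trace = []   # snapshot of v taken before each round d
    for d in range(n + m + 1):
        trace.append(dict(v))
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]          # step down: take a line from b (insertion)
            else:
                x = v[k - 1] + 1      # step right: drop a line from a (deletion)
            y = x - k
            while x < n and y < m and a[x] == b[y]:
                x, y = x + 1, y + 1   # follow the diagonal of matching lines
            v[k] = x
            if x >= n and y >= m:
                return backtrack(a, b, trace)
    raise AssertionError("unreachable: d = n + m always suffices")


def backtrack(a, b, trace):
    """Walk the snapshots backwards from (len(a), len(b)) to rebuild the script."""
    x, y = len(a), len(b)
    ops = []
    for d in range(len(trace) - 1, -1, -1):
        v = trace[d]
        k = x - y
        if k == -d or (k != d and v[k - 1] < v[k + 1]):
            prev_k = k + 1
        else:
            prev_k = k - 1
        prev_x = v[prev_k]
        prev_y = prev_x - prev_k
        while x > prev_x and y > prev_y:   # undo the diagonal run
            x, y = x - 1, y - 1
            ops.append((" ", a[x]))
        if d > 0:
            if x == prev_x:
                ops.append(("+", b[prev_y]))   # the step was an insertion
            else:
                ops.append(("-", a[prev_x]))   # the step was a deletion
        x, y = prev_x, prev_y
    ops.reverse()
    return ops


def diff_lines(old, new):
    """Trim shared head and tail, then diff only the middle window."""
    prefix, suffix = trim_common(old, new)
    middle = myers(old[prefix:len(old) - suffix], new[prefix:len(new) - suffix])
    head = [(" ", line) for line in old[:prefix]]
    tail = [(" ", line) for line in old[len(old) - suffix:]]
    return head + middle + tail
```

For a 250-line file with one line replaced by two, trim_common leaves a middle window of one old line and two new lines, so the Myers loop only ever compares a handful of lines.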

No new dependencies. ~280 lines of well-commented PHP added to a single private helper cluster.

Before vs after

Before (Observatory-like 250-line file with one change at line 125):

@@ -1,250 +1,251 @@
-line 1
-line 2
-line 3
...
-line 250
+line 1
+line 2
+line 3
...
+line 250

501 diff lines for a 1-line change. Agent panics.
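
The arithmetic behind that count can be reproduced with a short Python sketch. It only mimics the behavior described in the summary (every old line as a removal, every new line as an addition); the function name whole_file_replace is hypothetical and this is not the original PHP.

```python
def whole_file_replace(old, new):
    """Mimic the old behavior: one hunk that removes and re-adds everything."""
    header = f"@@ -1,{len(old)} +1,{len(new)} @@"
    body = ["-" + line for line in old] + ["+" + line for line in new]
    return header, body


# A 250-line file where line 125 is replaced by two lines.
old = [f"line {i}" for i in range(1, 251)]
new = old[:124] + ["replacement A", "replacement B"] + old[125:]

header, body = whole_file_replace(old, new)
print(header)      # @@ -1,250 +1,251 @@
print(len(body))   # 501 (250 removals + 251 additions)
```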

After:

@@ -122,7 +122,8 @@
 line 122
 line 123
 line 124
-line 125
+replacement A
+replacement B
 line 126
 line 127
 line 128

One 9-line hunk (7 old lines, 8 new), real context, agent sees the actual change.
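
A hunk like the one above can be derived from an edit script by keeping three context lines around each change and merging changes whose context windows touch. The Python sketch below is one way to do that under stated assumptions: the edit script is a list of (tag, line) pairs with tag ' ', '-' or '+', and group_hunks with its return shape is a hypothetical name chosen for this example, not the PR's PHP helper.

```python
def group_hunks(ops, context=3):
    """Split an edit script into hunks.

    Returns a list of (old_start, old_count, new_start, new_count, lines),
    where the start values are the 1-based number of the hunk's first line
    on each side and lines are the tag-prefixed hunk body.
    """
    changed = [i for i, (tag, _) in enumerate(ops) if tag != " "]
    if not changed:
        return []          # identical content: no hunks at all

    # Merge every change with its context window into index spans.
    last = len(ops) - 1
    spans = []
    start, end = max(changed[0] - context, 0), min(changed[0] + context, last)
    for i in changed[1:]:
        if i - context <= end + 1:      # windows overlap or touch: extend
            end = min(i + context, last)
        else:                           # gap of unchanged lines: new hunk
            spans.append((start, end))
            start, end = i - context, min(i + context, last)
    spans.append((start, end))

    # Record, for each op, the old/new line number it starts at.
    numbering = []
    old_no = new_no = 1
    for tag, _ in ops:
        numbering.append((old_no, new_no))
        if tag != "+":
            old_no += 1
        if tag != "-":
            new_no += 1

    hunks = []
    for start, end in spans:
        body = ops[start:end + 1]
        old_count = sum(1 for tag, _ in body if tag != "+")
        new_count = sum(1 for tag, _ in body if tag != "-")
        old_start, new_start = numbering[start]
        hunks.append((old_start, old_count, new_start, new_count,
                      [tag + text for tag, text in body]))
    return hunks


# The 250-line example: line 125 replaced by two lines.
ops = ([(" ", f"line {i}") for i in range(1, 125)]
       + [("-", "line 125"), ("+", "replacement A"), ("+", "replacement B")]
       + [(" ", f"line {i}") for i in range(126, 251)])
hunks = group_hunks(ops)
print(hunks[0][:4])   # (122, 7, 122, 8)
```

The counts 7 and 8 are the old-side and new-side line totals that end up in the @@ marker; the hunk body itself has 9 lines.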

Verification

Tested against seven representative cases:

| Case | Result |
| --- | --- |
| two-line surgical (return 'old' → return 'new') | Single context line + one -/+ pair |
| identical content | Header only, no hunks |
| pure addition | @@ -0,0 +1,N @@ matching git |
| pure deletion | @@ -1,N +0,0 @@ matching git |
| middle change in 10-line file | Correct hunk with 3-line context |
| two separate hunks | Correctly split into two @@ blocks |
| observatory-like 250 lines | 7-line hunk instead of 528-line phantom diff |
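
The 0,0 markers in the pure-addition and pure-deletion rows follow from the unified-format convention that an empty range is reported at the line before it. A small Python sketch of the marker formatting, with hypothetical names (format_range, hunk_header) and one simplifying assumption: it always prints the count, whereas git and GNU diff abbreviate a count of 1 to just the start line.

```python
def format_range(first_line, count):
    """Format one side of a hunk marker.

    first_line is the 1-based number of the first line in the range, or of
    the position where the range would sit when count is 0.
    """
    if count == 0:
        # Empty range: point at the line before, which is 0 for an empty file.
        return f"{first_line - 1},0"
    return f"{first_line},{count}"


def hunk_header(old_first, old_count, new_first, new_count):
    old_side = format_range(old_first, old_count)
    new_side = format_range(new_first, new_count)
    return f"@@ -{old_side} +{new_side} @@"


print(hunk_header(1, 0, 1, 3))       # pure addition:  @@ -0,0 +1,3 @@
print(hunk_header(1, 3, 1, 0))       # pure deletion:  @@ -1,3 +0,0 @@
print(hunk_header(122, 7, 122, 8))   # mid-file edit:  @@ -122,7 +122,8 @@
```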

tests/smoke-remote-workspace-backend.php passes 15/15 assertions:

  • Existing diff assertion still matches (real unified diff still contains -return 'old'; and +return 'new';)
  • New assertion: real hunk header @@ -1,2 +1,2 @@ is emitted
  • New assertion: unchanged <?php line appears as space-prefixed context, not as a -/+ pair

homeboy lint --changed-since origin/main → 0 findings.
homeboy test --changed-since origin/main → passed.

Why this is upstream-first

The world creator's daily memory describes this as "line-ending weather" — a real perception bug that has already shaped three days of agent decisions. Per the site's "fix upstream first, never paper over" rule, the right move is to fix DMC so every consumer of workspace_git_diff sees real diffs, rather than papering it over in the world-creator bundle prompt with "ignore the diff, trust the edit return value."

AI assistance

  • AI assistance: Yes
  • Tool(s): Claude Code (Sonnet 4.5)
  • Used for: Diagnosed the agent "line-ending weather" pattern from the world-creator transcript, traced it to build_full_file_diff, implemented Myers' algorithm in PHP, and verified the output against representative cases including the Observatory-sized file. Chris confirmed the upstream-fix-first approach over agent-side workarounds.

Closes #429.

RemoteWorkspaceBackend::build_full_file_diff() was not a unified diff.
It dumped the entire old content as '-' lines and the entire new content
as '+' lines regardless of how small the actual change was. Agents that
called workspace_git_diff to verify a surgical workspace_edit saw their
whole file marked for removal and re-addition, and consistently
misinterpreted this as 'whole-file line-ending weather' — concluding the
edit had nuked the file even though the actual write to GitHub was
correct.

Real example: the world-of-wordpress agent hit this pattern three cycles
in a row on the Observatory page in run 26050485830. A one-line <h2>
heading change showed up as '@@ -1,250 +1,278 @@' followed by every line
of the file removed then re-added, and the agent reverted its (correct)
edit to avoid committing a noisy rewrite.

This change replaces the fake whole-file diff with a real unified-diff
generator:

- Common-prefix/suffix trimming on the line arrays, so the O(ND) core
  only runs on the actually-different middle window. Typical 'surgical
  edit in large file' cases finish in O(N) instead of O(N^2).
- Myers' O(ND) diff algorithm with V-array trace, walked backwards to
  reconstruct an ordered edit script.
- Hunk grouping with three lines of context on each side, exactly the
  format git diff --no-color produces.
- Proper @@ -start,count +start,count @@ markers, including the 0,0
  edge cases for pure addition and pure deletion.

Verified on seven representative cases:
- two-line surgical (replaces phantom whole-file replace with single
  context line + one -/+ pair)
- identical content (header only, no hunks)
- pure addition (@@ -0,0 +1,N @@ matching git)
- pure deletion (@@ -1,N +0,0 @@ matching git)
- middle change in 10-line file (correct hunk with 3-line context)
- two separate hunks (correctly split into two @@ blocks)
- observatory-like 250-line file with one change at line 125 (7-line
  hunk instead of 528-line phantom diff)

Existing smoke test passes unchanged (the -return 'old';/+return 'new';
assertions still match real unified diff output). Two new assertions
lock in the new behavior: a real hunk header is emitted, and unchanged
lines appear as space-prefixed context rather than as -/+ pairs.

Closes #429.

AI assistance: Yes - Claude Code (Sonnet 4.5) diagnosed the agent
'line-ending weather' pattern from the world-creator transcript, traced
it to build_full_file_diff, implemented Myers' algorithm in PHP, and
verified the output against representative cases including the
Observatory-sized file. Chris confirmed the upstream-fix-first approach
over agent-side workarounds.
homeboy-ci Bot commented May 18, 2026

Homeboy Results — data-machine-code

Lint

lint — passed

ℹ️ Full options: homeboy docs commands/lint
ℹ️ Save lint baseline: homeboy lint data-machine-code --baseline
Deep dive: homeboy lint data-machine-code --changed-since 3bc3032

Test

test — passed

ℹ️ Auto-fix lint issues: homeboy refactor data-machine-code --from lint --write
ℹ️ Collect coverage: homeboy test data-machine-code --coverage
ℹ️ Pass args to test runner: homeboy test -- [args]
ℹ️ Full options: homeboy docs commands/test
Deep dive: homeboy test data-machine-code --changed-since 3bc3032

Audit

audit — passed

  • duplication — 3 finding(s)
  • intra-method-duplication — 2 finding(s)
  • dead_code — 1 finding(s)
  • test_coverage — 1 finding(s)
  • Total: 7 finding(s)

Deep dive: homeboy audit data-machine-code --changed-since 3bc3032

Tooling versions
  • Homeboy CLI: homeboy 0.182.0+bf5469d
  • Extension: wordpress from https://github.com/Extra-Chill/homeboy-extensions
  • Extension revision: 65942142
  • Action: unknown@unknown

Pre-existing phpstan finding surfaced by CI's --changed-since scoping
when the diff generator overhaul touched this same file. $count > 0 at
line 357 guarantees strpos cannot return false, but phpstan can't see
that across the early-return. Make it explicit so static analysis is
clean.

No behavior change — the false branch falls back to the original
content (which is unreachable anyway given the prior check).
@chubes4 chubes4 merged commit 9ef9d59 into main May 18, 2026
5 checks passed
@chubes4 chubes4 deleted the fix-unified-diff-generator branch May 18, 2026 19:14

Development

Successfully merging this pull request may close these issues.

RemoteWorkspaceBackend::build_full_file_diff() emits whole-file replace as diff, not unified diff
