
fix(sync): filesystem-based change detection (catch git pull & non-git edits)#414

Merged: colbymchenry merged 2 commits into main from fix/filesystem-based-sync on May 25, 2026

@colbymchenry
Owner

Problem

Incremental sync (the file watcher and git hooks both call codegraph sync)
detected changes via git status --porcelain, which only reports uncommitted
working-tree changes. After a git pull / checkout / merge / rebase the
working tree is clean, so sync reconciled nothing and the index silently went
stale until a full codegraph index -f. Non-git projects also leaned on a slow
full-rescan path. Surfaced from #393.

Reproduced: commit a change (clean tree) → codegraph sync → the new symbol was
missed; only index -f picked it up.

Fix

  • src/extraction/index.ts — change detection is now filesystem-based and
    git-independent.
    Enumerate current source files, skip unchanged ones with a
    cheap (size, mtime) stat pre-filter (both columns already stored per file),
    then confirm the rest with a content hash. Removals are checked against the
    filesystem (git ls-files still lists a file deleted-but-unstaged, so set
    membership alone misses deletions). This catches committed pull/checkout/merge/
    rebase changes, plain edits in non-git projects, adds, and deletes — uniformly.
  • src/mcp/index.ts — catch-up sync on connect. CodeGraph.open() doesn't
    sync, so a non-blocking catchUpSync() now runs when a client connects,
    reconciling anything changed while the server was down (e.g. a terminal
    git pull). It logs "Caught up N file(s) changed since last run" and runs
    even when the watcher is unavailable (WSL2 /mnt), where it matters most.
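
The detection order described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in src/extraction/index.ts: the names detectChanges, FileProbe, IndexedFile and ChangeSet are hypothetical, and filesystem access is injected through a small probe object so the logic can be run without touching real files.

```typescript
// Hypothetical shapes: the PR only says size and mtime are already stored per
// file, so every name below is an assumption made for illustration.
interface FileStat { size: number; mtimeMs: number }
interface IndexedFile extends FileStat { hash: string }

// Filesystem access is injected. In Node, stat could wrap
// fs.statSync(path, { throwIfNoEntry: false }) and hash could digest the
// file contents with node:crypto.
interface FileProbe {
  stat(path: string): FileStat | undefined; // undefined when the file is gone
  hash(path: string): string;
}

interface ChangeSet { added: string[]; modified: string[]; removed: string[] }

function detectChanges(
  enumerated: string[],
  indexed: Map<string, IndexedFile>,
  probe: FileProbe,
): ChangeSet {
  const changes: ChangeSet = { added: [], modified: [], removed: [] };

  for (const path of enumerated) {
    const stat = probe.stat(path);
    if (!stat) continue; // listed but missing on disk; the removal pass reports it
    const prev = indexed.get(path);
    if (!prev) {
      changes.added.push(path);
      continue;
    }
    // Cheap pre-filter: identical size and mtime means the file is assumed unchanged.
    if (stat.size === prev.size && stat.mtimeMs === prev.mtimeMs) continue;
    // Stat differs: confirm with a content hash before treating it as modified.
    if (probe.hash(path) !== prev.hash) changes.modified.push(path);
  }

  // Removals are decided by the filesystem, not by membership in `enumerated`.
  for (const path of indexed.keys()) {
    if (!probe.stat(path)) changes.removed.push(path);
  }
  return changes;
}
```

Checking removals with a separate pass over the indexed paths is what keeps a deleted-but-still-listed file from being missed: the enumeration can contain a stale entry, but the stat call cannot.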

The one theoretical blind spot is a content change that preserves both size and
mtime exactly — the same limitation every mtime-based incremental tool accepts;
codegraph index -f is the escape hatch (git bumps mtime on checkout, so pulls
are caught).
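
To make the blind spot concrete, here is a small sketch of the pre-filter comparison alone. The function name skippedByPreFilter and the sample values are hypothetical; the point is only that a same-length edit whose mtime is restored looks identical to an untouched file, while any write that moves mtime does not.

```typescript
// Hypothetical record of what the index stored for a file on the last run.
interface FileStat { size: number; mtimeMs: number }

// True when the pre-filter would skip the file without hashing it.
function skippedByPreFilter(stored: FileStat, current: FileStat): boolean {
  return stored.size === current.size && stored.mtimeMs === current.mtimeMs;
}

const stored: FileStat = { size: 6, mtimeMs: 1700000000000 };

// Ordinary edit or git checkout: mtime moves forward, so the file gets hashed.
const afterCheckout: FileStat = { size: 6, mtimeMs: 1700000060000 };

// Same-length edit ("port=1" -> "port=2") by a tool that restores the old mtime.
const afterStealthEdit: FileStat = { size: 6, mtimeMs: 1700000000000 };

console.log(skippedByPreFilter(stored, afterCheckout));    // false
console.log(skippedByPreFilter(stored, afterStealthEdit)); // true
```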

Validation

10-check suite (pull-like committed add, delete, non-git plain edit, no-op
incrementality with stable counts + intact edges, and catch-up-on-connect), plus
the full unit suite:

  • macOS (FSEvents) — all cases + vitest 815 passed
  • Linux (Docker node:22, inotify) — all 10 checks
  • Windows 11 (ARM64 VM, ReadDirectoryChangesW) — all 10 checks

Notes / follow-ups

  • getChangedFiles() (the codegraph status display helper) still uses git
    status, so status can under-report pending changes after a pull.
    Display-only; sync() itself is now correct. Left for a separate change.
  • package.json is intentionally not bumped here — version bump + tag belong
    to the release commit/workflow.

🤖 Generated with Claude Code

colbymchenry and others added 2 commits May 25, 2026 17:32
Incremental sync detected changes with `git status --porcelain`, which only sees uncommitted working-tree changes — so committed changes from git pull/checkout/merge/rebase (clean tree afterward) were never reconciled, and non-git projects leaned on a slow full rescan. Change detection is now filesystem-based and git-independent: a (size, mtime) stat pre-filter skips unchanged files, then a content hash confirms the rest; removals are checked against the filesystem (git ls-files still lists deleted-but-unstaged files). Also adds a non-blocking catch-up sync on MCP connect so changes made while the server was down (e.g. a terminal git pull) are reconciled on connect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@colbymchenry colbymchenry merged commit 4a94696 into main May 25, 2026
@colbymchenry colbymchenry deleted the fix/filesystem-based-sync branch May 25, 2026 22:36
colbymchenry added a commit that referenced this pull request May 26, 2026
…457)

Originated from issue #438 ("Will newly created files be missing from
query results if sync is not manually run?"). Real users are second-
guessing whether their agent's freshly-created files are getting
indexed. They shouldn't have to test for themselves to find out.

## site/src/content/docs/guides/indexing.md

Expanded the existing 2-sentence "Stay fresh automatically" section
into the full three-layer explanation:

  1. File watcher with debounced auto-sync (default 2000ms, tunable
     via CODEGRAPH_WATCH_DEBOUNCE_MS, clamp [100ms, 60s]).
  2. Per-file staleness banner (#403) — covers the debounce window.
     Quoted the actual banner format + the verified Claude Code
     follow-up Read behaviour.
  3. Connect-time catch-up (#414) — covers gaps when the MCP server
     wasn't running.
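
As an illustration of the debounce setting in item 1, the default and clamp could be resolved along these lines. This is a sketch only: the function and constant names are hypothetical, and falling back to the default on a missing or non-numeric value is an assumption, not documented behaviour.

```typescript
// Values taken from the text above: default 2000ms, clamp [100ms, 60s].
const DEFAULT_DEBOUNCE_MS = 2000;
const MIN_DEBOUNCE_MS = 100;
const MAX_DEBOUNCE_MS = 60000;

// `raw` is the unparsed value of CODEGRAPH_WATCH_DEBOUNCE_MS, if set.
function resolveDebounceMs(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  // Assumption: unset or non-numeric input falls back to the default.
  if (!Number.isFinite(parsed)) return DEFAULT_DEBOUNCE_MS;
  // Clamp into the documented range.
  return Math.min(MAX_DEBOUNCE_MS, Math.max(MIN_DEBOUNCE_MS, parsed));
}
```

In a Node process this would be called as resolveDebounceMs(process.env.CODEGRAPH_WATCH_DEBOUNCE_MS).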

Plus: how to verify state via codegraph_status (### Pending sync:),
when manual codegraph sync DOES make sense (watcher disabled / CI
scripting), and a link out to the v0.9.5 release notes.

## README.md

Added a <details><summary> collapsible right under the Key Features
table — primed by the existing 'Always Fresh' row in that table.
Condensed to ~10 lines covering the same three layers + a code-block
flow diagram + the verify command, with a deep link to the full guide.
GitHub renders <details> blocks natively, so the section is collapsed
by default and doesn't make the README scroll-length grow visibly.

Heading kept as 'Stay fresh automatically' (single-word slug) so the
README's deep-link anchor is predictable; the longer tagline lives on
its own line below.

940/942 tests still pass; no code changes.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KannaKuron added a commit to KannaKuron/codegraph4bevy that referenced this pull request May 26, 2026
…, watcher fixes

Key upstream changes merged:
- feat(extraction): Objective-C language support (colbymchenry#165)
- feat(mcp): share one serve --mcp per project across MCP clients (colbymchenry#411)
- feat(mcp): per-file staleness banner + tunable watcher debounce (colbymchenry#403)
- feat(mcp): detect borrowed git worktree index (colbymchenry#312)
- feat(index): default-ignore dependency/build/cache dirs (colbymchenry#407)
- feat(resolution): mixed iOS / React Native / Expo bridging (colbymchenry#430)
- fix(watcher): exclude ignored dirs before watching (colbymchenry#276)
- fix(sync): filesystem-based change detection (colbymchenry#414)

Conflicts resolved: CLAUDE.md, .cursor/rules, extraction tests (kept split),
MCP files (tools/server-instructions/transport/engine), package.json (kept jieba)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>