fix: Prevent FK constraint failure from nodes with empty names #62
Merged
Fixes #42. Tree-sitter can produce nodes with empty names (e.g. from complex C/C++ declarators in header files). These nodes were silently skipped at DB insert time, but their containment edges were still inserted, causing a FOREIGN KEY constraint violation that crashed indexing.

Two-layer fix:
- createNode() now returns null for empty names, preventing the node and its edges from ever being created (Option A)
- storeExtractionResult() filters edges and unresolved refs to only reference nodes that passed validation, as a safety net (Option B)
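The two layers can be sketched roughly as follows. This is a minimal TypeScript illustration, not the project's actual code: the `GraphNode` and `Edge` shapes, the simplified `createNode` signature, and the `filterEdgesToValidNodes` helper are assumptions made for the example. Only the names `createNode()` and `storeExtractionResult()` come from the PR.

```typescript
// Hypothetical shapes; the real codegraph types are likely richer.
interface GraphNode { id: string; name: string; kind: string; }
interface Edge { source: string; target: string; kind: string; }

// Layer 1 (Option A): refuse to build a node with an empty name,
// so no containment edge can ever point at it.
function createNode(id: string, name: string, kind: string): GraphNode | null {
  if (name === "") return null;
  return { id, name, kind };
}

// Layer 2 (Option B): a safety net of the kind storeExtractionResult()
// is described as applying. Keep only edges whose endpoints both
// belong to nodes that passed validation.
function filterEdgesToValidNodes(nodes: GraphNode[], edges: Edge[]): Edge[] {
  const validIds = new Set(nodes.map((n) => n.id));
  return edges.filter((e) => validIds.has(e.source) && validIds.has(e.target));
}
```

Callers of `createNode` would need to check for `null` before emitting edges. The second layer covers any path that forgets to.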
andreinknv
added a commit
to andreinknv/codegraph
that referenced
this pull request
May 9, 2026
Closes friction tracker colbymchenry#59.

codegraph_entry_points was including test-bed HTTP routes from extraction-resolution-accuracy.test.ts and framework-language-gate.test.ts (GET /docfake, GET /api/users, etc.) and CLI commander fixtures from mcp-entry-points.test.ts itself (cmd admin, cmd search, cmd build-similarity-edges, cmd admin-sub, cmd foo). Real production routes/commands were crowded out by fixtures.

handleEntryPoints now applies the same isTestPath filter that codegraph_digest uses, dropping test paths from the `route` node sweep before the HTTP / CLI bucket split. Self-introduced in d3c80d2 when the cli-commander framework resolver shipped: my own test fixture leaked into the agent-facing report.

Reviewer info finding filed as colbymchenry#62: bucketEntryPointCandidates (public-exports path) has the same gap; deferred since no-in-tree-callers is a strong secondary filter.

Regression test extends the existing commander fixture to write a __tests__/fixture.test.ts under the fixture project and asserts its `test-only-fixture-cmd` does NOT appear in the response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
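A rough sketch of filtering test paths out of the `route` sweep before the HTTP / CLI split. Everything here is illustrative: the `RouteNode` shape, the `isTestPath` heuristic, the `bucketRoutes` helper, and the use of a `cmd ` name prefix to tell CLI commands from HTTP routes are assumptions, not the project's implementation.

```typescript
// Hypothetical node shape for the example.
interface RouteNode { name: string; filePath: string; }

// Assumed heuristic; the project's real isTestPath may use different rules.
function isTestPath(filePath: string): boolean {
  return /(^|\/)__tests__\//.test(filePath) ||
    /\.(test|spec)\.[cm]?[jt]sx?$/.test(filePath);
}

// Drop test-file routes first, then split what remains into buckets.
function bucketRoutes(routes: RouteNode[]): { http: RouteNode[]; cli: RouteNode[] } {
  const production = routes.filter((r) => !isTestPath(r.filePath));
  const http: RouteNode[] = [];
  const cli: RouteNode[] = [];
  for (const r of production) {
    // Assumption: CLI command nodes are named "cmd <name>".
    (r.name.startsWith("cmd ") ? cli : http).push(r);
  }
  return { http, cli };
}
```

Filtering before the split means neither bucket needs its own test-path check.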
andreinknv
added a commit
to andreinknv/codegraph
that referenced
this pull request
May 9, 2026
…LI-files buckets (colbymchenry#62)

After colbymchenry#59 filtered test-bed HTTP routes / CLI command nodes out of entry_points, the Public exports bucket remained vulnerable. Exported functions/classes in test files (e.g. `export function fakeFromTest()` in `__tests__/test-helper.ts`) could still surface in the agent-facing report; digest.ts already did the equivalent filter at line 148, entry-points.ts did not.

Add `if (isTestPath(n.filePath)) continue;` as the first guard in `collectExportedDefinitions`, before the `!n.isExported` check, so test helpers are unconditionally excluded from both CLI-files and public-exports buckets. Update the file-level docstring and the function JSDoc to document the filter.

Add a `colbymchenry#62` test that places `export function fakeFromTest()` in `__tests__/test-helper.ts` of the fixture project and asserts it does not appear in any bucket.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
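The guard ordering described in this commit might look like the sketch below. The two `continue` guards and the function name follow the commit message. The `DefNode` shape, the function's signature and return type, and the `isTestPath` heuristic are assumptions for illustration.

```typescript
// Hypothetical node shape for the example.
interface DefNode { name: string; filePath: string; isExported: boolean; }

// Assumed heuristic; the project's real isTestPath may use different rules.
function isTestPath(filePath: string): boolean {
  return /(^|\/)__tests__\//.test(filePath) ||
    /\.(test|spec)\.[cm]?[jt]sx?$/.test(filePath);
}

function collectExportedDefinitions(nodes: DefNode[]): DefNode[] {
  const out: DefNode[] = [];
  for (const n of nodes) {
    if (isTestPath(n.filePath)) continue; // first guard: test helpers never qualify
    if (!n.isExported) continue;          // then the export check
    out.push(n);
  }
  return out;
}
```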
3 tasks
colbymchenry
added a commit
that referenced
this pull request
May 26, 2026
…violations (#455) (#463)

PR #62 plugged this FK violation at the extraction-layer insertEdges site (empty-named nodes whose containment edges had no target), but the same violation kept reappearing on v0.9.5 during the daemon's *watch sync* once an agent's daemon had been running long enough.

The resolution-layer insertEdges (and the callback-synthesizer pass) wasn't guarded the same way: a per-resolver name cache or a framework resolver's WeakMap-keyed lookup map could hand back a Node whose row had been removed by a recent file rewrite, and the FK check then aborted the entire resolution batch, leaving the daemon log filling with `Watch sync failed { error: 'FOREIGN KEY constraint failed' }`.

The resolution layer now mirrors the #62 defense: one cache-aware getNodesByIds per pass drops any edge whose source or target is no longer in the nodes table, so the rest of the resolved batch still lands.

Regression test seeds the resolver's nameCache with a stale Node and calls resolveAndPersist directly; verified to throw FOREIGN KEY constraint failed without the fix and pass with it. Full suite: 984/984 pass.

Closes #455.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
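The stale-edge defense can be illustrated with an in-memory stand-in for the nodes table. This is a sketch under stated assumptions: `NodeStore`, `NodeRow`, `dropStaleEdges`, and the `getNodesByIds` signature shown here are hypothetical. The commit message names `getNodesByIds` but does not show its API.

```typescript
interface Edge { source: string; target: string; kind: string; }
interface NodeRow { id: string; name: string; }

// In-memory stand-in for the nodes table (assumption for this example).
class NodeStore {
  private rows = new Map<string, NodeRow>();
  upsert(id: string, name: string): void { this.rows.set(id, { id, name }); }
  remove(id: string): void { this.rows.delete(id); }
  // Hypothetical signature: returns only the ids that still have a row.
  getNodesByIds(ids: Iterable<string>): Map<string, NodeRow> {
    const found = new Map<string, NodeRow>();
    for (const id of ids) {
      const row = this.rows.get(id);
      if (row) found.set(id, row);
    }
    return found;
  }
}

// One lookup per pass; drop any edge with a missing endpoint so the
// remaining edges can be inserted without tripping the FK check.
function dropStaleEdges(store: NodeStore, edges: Edge[]): Edge[] {
  const ids = new Set<string>();
  for (const e of edges) { ids.add(e.source); ids.add(e.target); }
  const live = store.getNodesByIds(ids);
  return edges.filter((e) => live.has(e.source) && live.has(e.target));
}
```

Dropping individual edges rather than rejecting the batch is what lets the rest of the resolved batch land.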
jorgerobles
pushed a commit
to jorgerobles/codegraph
that referenced
this pull request
Jun 1, 2026
…t-empty-names fix: Prevent FK constraint failure from nodes with empty names
jorgerobles
pushed a commit
to jorgerobles/codegraph
that referenced
this pull request
Jun 1, 2026
…violations (colbymchenry#455) (colbymchenry#463)
Summary

Fixes #42: `FOREIGN KEY constraint failed` when indexing C/C++ header files.

Root cause: Tree-sitter can produce function nodes with empty names from complex C/C++ declarators (e.g. macro-generated declarations in `.h` files). These nodes were silently skipped by `insertNode()` validation, but their containment edges were still inserted into the database, causing an FK violation since the target node didn't exist.

Two-layer fix:
- `createNode()` now returns `null` for empty names, preventing the node and its containment edges from ever being created. All 7 callers that use the return value handle the `null` case.
- `storeExtractionResult()` now filters edges and unresolved references to only include those referencing nodes that passed validation, preventing any future FK violations from similar edge cases.

Test plan
`.h` files)