
fix: Prevent FK constraint failure from nodes with empty names #62

Merged
colbymchenry merged 1 commit into main from fix/fk-constraint-empty-names
Mar 18, 2026

@colbymchenry

Summary

Fixes #42: FOREIGN KEY constraint failed when indexing C/C++ header files.

Root cause: Tree-sitter can produce function nodes with empty names from complex C/C++ declarators (e.g. macro-generated declarations in .h files). These nodes were silently skipped by insertNode() validation, but their containment edges were still inserted into the database — causing a FK violation since the target node didn't exist.

Two-layer fix:

  • Extraction layer: createNode() now returns null for empty names, preventing the node and its containment edges from ever being created. All 7 callers that use the return value handle the null case.
  • Storage layer (safety net): storeExtractionResult() now filters edges and unresolved references to only include those referencing nodes that passed validation, preventing any future FK violations from similar edge cases.
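The two layers can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `GraphNode` and `Edge` shapes, the `createNode` signature, and the `filterEdges` helper are all assumptions made for the example, and the sketch assumes both endpoints of every edge belong to the same extraction batch.

```typescript
// Hypothetical shapes; the real codegraph types are not shown in this PR.
interface GraphNode { id: string; name: string; kind: string; }
interface Edge { source: string; target: string; kind: string; }

// Extraction layer: refuse to build a node whose name is empty,
// so callers never get an id to attach a containment edge to.
function createNode(id: string, name: string, kind: string): GraphNode | null {
  if (name.trim() === "") return null;
  return { id, name, kind };
}

// Storage layer (safety net): keep only edges whose endpoints both
// refer to nodes that pass the same name validation.
function filterEdges(nodes: GraphNode[], edges: Edge[]): Edge[] {
  const validIds = new Set(
    nodes.filter((n) => n.name.trim() !== "").map((n) => n.id)
  );
  return edges.filter((e) => validIds.has(e.source) && validIds.has(e.target));
}

// Usage: a header file containing one declaration with an empty name.
const fileNode = createNode("f1", "list.h", "file");
const unnamed = createNode("n1", "", "function"); // null, so no edge is built
const storedNodes = fileNode ? [fileNode] : [];
const storedEdges = filterEdges(storedNodes, [
  { source: "f1", target: "n1", kind: "contains" }, // dangling, dropped
]);
```

Either layer alone would avoid the constraint failure in this scenario; keeping both means a future extractor bug that still emits a dangling edge is caught before it reaches the database.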

Test plan

Fixes #42 — tree-sitter can produce nodes with empty names (e.g. from
complex C/C++ declarators in header files). These nodes were silently
skipped at DB insert time, but their containment edges were still
inserted, causing a FOREIGN KEY constraint violation that crashed
indexing.

Two-layer fix:
- createNode() now returns null for empty names, preventing the node
  and its edges from ever being created (Option A)
- storeExtractionResult() filters edges and unresolved refs to only
  reference nodes that passed validation, as a safety net (Option B)
@colbymchenry merged commit bdbe59b into main on Mar 18, 2026
andreinknv added a commit to andreinknv/codegraph that referenced this pull request May 9, 2026
Closes friction tracker colbymchenry#59. codegraph_entry_points was including
test-bed HTTP routes from extraction-resolution-accuracy.test.ts and
framework-language-gate.test.ts (GET /docfake, GET /api/users, etc.)
and CLI commander fixtures from mcp-entry-points.test.ts itself
(cmd admin, cmd search, cmd build-similarity-edges, cmd admin-sub,
cmd foo). Real production routes/commands were crowded out by
fixtures.

handleEntryPoints now applies the same isTestPath filter that
codegraph_digest uses, dropping test paths from the `route` node
sweep before the HTTP / CLI bucket split. Self-introduced in
d3c80d2 when the cli-commander framework resolver shipped — my own
test fixture leaked into the agent-facing report.

Reviewer info finding filed as colbymchenry#62: bucketEntryPointCandidates
(public-exports path) has the same gap; deferred since no-in-tree-
callers is a strong secondary filter.

Regression test extends the existing commander fixture to write a
__tests__/fixture.test.ts under the fixture project and asserts
its `test-only-fixture-cmd` does NOT appear in the response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit to andreinknv/codegraph that referenced this pull request May 9, 2026
…LI-files buckets (colbymchenry#62)

After colbymchenry#59 filtered test-bed HTTP routes / CLI command nodes out of
entry_points, the Public exports bucket remained vulnerable. Exported
functions/classes in test files (e.g. `export function fakeFromTest()`
in `__tests__/test-helper.ts`) could still surface in the agent-facing
report — digest.ts already did the equivalent filter at line 148;
entry-points.ts did not.

Add `if (isTestPath(n.filePath)) continue;` as the first guard in
`collectExportedDefinitions`, before the `!n.isExported` check, so
test helpers are unconditionally excluded from both CLI-files and
public-exports buckets. Update the file-level docstring and the
function JSDoc to document the filter. Add a `colbymchenry#62` test that places
`export function fakeFromTest()` in `__tests__/test-helper.ts` of the
fixture project and asserts it does not appear in any bucket.
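
The guard ordering described above might look like the sketch below. The `ExportCandidate` shape and the body of `isTestPath` are assumptions for illustration only; the real helper's matching rules and the real signature of `collectExportedDefinitions` are not shown in this commit message.

```typescript
// Hypothetical node shape; only filePath and isExported are named in the commit.
interface ExportCandidate { filePath: string; name: string; isExported: boolean; }

// Hypothetical stand-in for the project's isTestPath helper.
function isTestPath(filePath: string): boolean {
  return (
    /(^|\/)__tests__\//.test(filePath) ||
    /\.(test|spec)\.[cm]?[jt]sx?$/.test(filePath)
  );
}

function collectExportedDefinitions(nodes: ExportCandidate[]): ExportCandidate[] {
  const out: ExportCandidate[] = [];
  for (const n of nodes) {
    if (isTestPath(n.filePath)) continue; // first guard: test files never qualify
    if (!n.isExported) continue;          // then the existing export check
    out.push(n);
  }
  return out;
}

// Usage: the test helper is excluded even though it is exported.
const exported = collectExportedDefinitions([
  { filePath: "__tests__/test-helper.ts", name: "fakeFromTest", isExported: true },
  { filePath: "src/api.ts", name: "startServer", isExported: true },
  { filePath: "src/internal.ts", name: "helper", isExported: false },
]);
```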

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
colbymchenry added a commit that referenced this pull request May 26, 2026
…violations (#455) (#463)

PR #62 plugged this FK violation at the extraction-layer insertEdges site
(empty-named nodes whose containment edges had no target), but the same
violation kept reappearing on v0.9.5 during the daemon's *watch sync* once an
agent's daemon had been running long enough. The resolution-layer insertEdges
(and the callback-synthesizer pass) wasn't guarded the same way: a per-resolver
name cache or a framework resolver's WeakMap-keyed lookup map could hand back
a Node whose row had been removed by a recent file rewrite, and the FK check
then aborted the entire resolution batch, leaving the daemon log filling with
`Watch sync failed { error: 'FOREIGN KEY constraint failed' }`.

The resolution layer now mirrors the #62 defense — one cache-aware
getNodesByIds per pass drops any edge whose source or target is no longer in
the nodes table, so the rest of the resolved batch still lands.
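
As a rough sketch of that defense: collect every endpoint id in the batch, ask the store once which of them still exist, and drop the edges that point at missing rows. The `Edge` shape, the `lookupExistingIds` callback (standing in for whatever `getNodesByIds` returns), and `dropDanglingEdges` are hypothetical names for this example.

```typescript
// Hypothetical edge shape for illustration.
interface Edge { source: string; target: string; kind: string; }

// Hypothetical lookup: given ids, returns those that still have a row
// in the nodes table. Stands in for a batched getNodesByIds call.
type LookupExistingIds = (ids: string[]) => Set<string>;

function dropDanglingEdges(edges: Edge[], lookupExistingIds: LookupExistingIds): Edge[] {
  const ids = new Set<string>();
  for (const e of edges) {
    ids.add(e.source);
    ids.add(e.target);
  }
  // One batched lookup per pass instead of one query per edge.
  const existing = lookupExistingIds(Array.from(ids));
  return edges.filter((e) => existing.has(e.source) && existing.has(e.target));
}

// Usage: node "gone" was removed by a file rewrite but a stale cache
// entry still produced an edge pointing at it.
const nodesTable = new Set(["a", "b"]);
const resolved = dropDanglingEdges(
  [
    { source: "a", target: "b", kind: "calls" },
    { source: "a", target: "gone", kind: "calls" },
  ],
  (ids) => new Set(ids.filter((id) => nodesTable.has(id)))
);
```

Filtering before the insert means one stale cache entry costs a single edge rather than the whole resolved batch.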

Regression test seeds the resolver's nameCache with a stale Node and calls
resolveAndPersist directly; verified to throw FOREIGN KEY constraint failed
without the fix and pass with it. Full suite: 984/984 pass.

Closes #455.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jorgerobles pushed a commit to jorgerobles/codegraph that referenced this pull request Jun 1, 2026
…t-empty-names

fix: Prevent FK constraint failure from nodes with empty names
jorgerobles pushed a commit to jorgerobles/codegraph that referenced this pull request Jun 1, 2026
…violations (colbymchenry#455) (colbymchenry#463)



Development

Successfully merging this pull request may close these issues.

CodeGraph v0.6.2: FOREIGN KEY constraint failed
