feat: Mem0 feature parity — smart evolution, memory hierarchy, search enhancements, graph memory #4

Merged
clopca merged 22 commits into main from feature/mem0-feature-parity
Feb 7, 2026

Owner

@clopca clopca commented Feb 7, 2026

Summary

Implements 18 tasks across 4 tracks from the mem0-feature-parity plan, bringing open-mem to feature parity with Mem0's advanced memory capabilities. All features are opt-in via config flags (defaulting to false).

  • Smart Memory Evolution (Track A): LLM-based conflict resolution that detects when new observations supersede, update, or duplicate existing ones — preventing stale/contradictory memories
  • Memory Hierarchy (Track B): Cross-project user-level memory database at ~/.config/open-mem/ with merged search results and separate token budgets for context injection
  • Search Enhancements (Track C): Advanced filters (importance, date range, concepts, files) and LLM-based + heuristic reranking pipeline
  • Graph Memory (Track D): Entity-relationship triple extraction, SQLite-backed graph storage, BFS traversal with cycle detection, and graph-augmented search
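
The advanced filters listed under Track C can be pictured as a predicate applied to candidate observations. The sketch below is illustrative only: the PR names the filter kinds (importance, date range, concepts, files), while the `Obs` and `Filters` shapes, their field names, and `applyFilters` are assumptions made for this example.

```typescript
// Hypothetical shapes: field names are assumptions, not the PR's actual types.
interface Obs {
	id: string;
	importance: number;
	createdAt: string; // ISO 8601 timestamp
	concepts: string[];
	files: string[];
}

interface Filters {
	minImportance?: number;
	maxImportance?: number;
	after?: string; // ISO 8601, inclusive
	before?: string; // ISO 8601, inclusive
	concepts?: string[]; // matches when any listed concept is present
	files?: string[]; // matches when any listed file is present
}

// An absent or empty filter list matches everything.
function overlaps(wanted: string[] | undefined, actual: string[]): boolean {
	if (!wanted || wanted.length === 0) return true;
	return wanted.some((w) => actual.includes(w));
}

function applyFilters(items: Obs[], f: Filters): Obs[] {
	return items.filter((o) => {
		if (f.minImportance !== undefined && o.importance < f.minImportance) return false;
		if (f.maxImportance !== undefined && o.importance > f.maxImportance) return false;
		const t = Date.parse(o.createdAt);
		if (f.after !== undefined && t < Date.parse(f.after)) return false;
		if (f.before !== undefined && t > Date.parse(f.before)) return false;
		return overlaps(f.concepts, o.concepts) && overlaps(f.files, o.files);
	});
}
```

Every filter is optional, so an empty `Filters` object leaves the candidate list unchanged.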

Changes

New Modules

| File | Purpose |
|---|---|
| src/ai/conflict-evaluator.ts | LLM-based memory conflict detection |
| src/ai/entity-extractor.ts | Entity & relationship extraction from observations |
| src/db/user-memory.ts | Cross-project user-level memory database |
| src/db/entities.ts | Entity repository with graph traversal |
| src/search/reranker.ts | LLM + heuristic reranking pipeline |
| src/search/graph.ts | Graph-augmented search |

Modified Modules

| File | Change |
|---|---|
| src/db/schema.ts | Added migrations v8 (conflict resolution) and v9 (entity graph) |
| src/db/observations.ts | Extended with supersede/archive operations |
| src/queue/processor.ts | Integrated conflict resolution + entity extraction |
| src/search/orchestrator.ts | Added reranking, advanced filters, graph search |
| src/search/hybrid.ts | Filter support in hybrid search |
| src/tools/save.ts, search.ts, recall.ts | User-memory integration |
| src/context/builder.ts | Hierarchical memory injection |
| src/hooks/context-inject.ts, compaction.ts | User-memory awareness |
| src/config.ts | 4 new feature flags |
| src/types.ts | Extended type definitions |
| src/index.ts, src/mcp.ts, src/daemon.ts | Initialization of new subsystems |

New Test Suites (14 files, +201 tests)

  • tests/ai/conflict-evaluator.test.ts — Conflict evaluation scenarios
  • tests/ai/entity-extractor.test.ts — Entity/relationship extraction
  • tests/db/user-memory.test.ts — User-level DB operations
  • tests/db/entities.test.ts — Entity repository CRUD + traversal
  • tests/db/migration-v8.test.ts — Schema migration v8
  • tests/db/migration-v9.test.ts — Schema migration v9
  • tests/search/reranking.test.ts — Reranker pipeline
  • tests/search/graph-search.test.ts — Graph-augmented search
  • tests/tools/advanced-filters.test.ts — Filter operators
  • tests/tools/user-memory-tools.test.ts — User memory tool integration
  • tests/context/hierarchy.test.ts — Hierarchical context injection
  • tests/queue/conflict-resolution.test.ts — End-to-end conflict resolution

Configuration

All new features are opt-in and default to false:

enableConflictResolution: false  // Track A
enableUserMemory: false          // Track B
enableReranking: false           // Track C
enableGraphMemory: false         // Track D
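
As a rough illustration of the opt-in defaults, the four flags could be modelled as below. The flag names come from the list above; `FeatureFlags`, `DEFAULT_FLAGS`, and `resolveFlags` are hypothetical names for this sketch and may not match `src/config.ts`.

```typescript
// Flag names are taken from the PR description; the rest is illustrative.
interface FeatureFlags {
	enableConflictResolution: boolean; // Track A
	enableUserMemory: boolean; // Track B
	enableReranking: boolean; // Track C
	enableGraphMemory: boolean; // Track D
}

const DEFAULT_FLAGS: FeatureFlags = {
	enableConflictResolution: false,
	enableUserMemory: false,
	enableReranking: false,
	enableGraphMemory: false,
};

// Hypothetical helper: layer user overrides on top of the opt-in defaults.
function resolveFlags(overrides: Partial<FeatureFlags> = {}): FeatureFlags {
	return { ...DEFAULT_FLAGS, ...overrides };
}

const flags = resolveFlags({ enableUserMemory: true });
console.log(flags.enableUserMemory, flags.enableGraphMemory); // true false
```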

Verification

  • 790 tests pass (up from 589, +201 new)
  • Type check clean (bun x tsc --noEmit)
  • Build succeeds (index.js 130KB, mcp.js 70KB, daemon.js 57KB)
  • +7,661 / -90 lines across 40 files

Summary by CodeRabbit

  • New Features

    • AI-powered conflict evaluation, entity extraction, knowledge‑graph augmented search, result reranking, and optional cross‑project user memory (save/search/recall).
    • Advanced search filters: importance, date range, concepts, and files.
  • Behavioral Changes

    • Superseded observations are hidden from search/index results.
    • mem-save supports a "user" scope; mem-recall and search annotate and return user‑scoped results.
  • Tests

    • Extensive test coverage added for AI features, entity graph, migrations, filters, user‑memory, and conflict resolution.
  • Documentation

    • Added/expanded public API doc comments.

clopca and others added 17 commits February 7, 2026 16:16
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…ts, files)
…servation updates
…tion
…peline
…l tools
…g pipeline
…rompt
Add entities, entity_relations, entity_observations tables with FTS5 search, sync triggers, CHECK constraints, and unique indexes. Add entityExtractionEnabled config option (default: false, env: OPEN_MEM_ENTITY_EXTRACTION). Update existing migration test assertions for new migration count.
…pipeline

coderabbitai Bot commented Feb 7, 2026

Warning

Rate limit exceeded

@clopca has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 18 minutes and 8 seconds before requesting another review.

Walkthrough

Adds AI-driven conflict evaluation and entity extraction, user-level cross-project memory with FTS, entity graph support, graph-augmented search and reranking, observation supersede tracking (migrations v8–v9), and wires these into the queue, search orchestrator, servers, tools, and tests.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| AI: evaluators, extractor & prompts | src/ai/conflict-evaluator.ts, src/ai/entity-extractor.ts, src/ai/prompts.ts, src/ai/parser.ts, src/ai/provider.ts, src/ai/summarizer.ts, src/ai/compressor.ts | New ConflictEvaluator and EntityExtractor classes; XML-based prompt builders and parsers for reranking/conflict/entity extraction; provider/model config types; retry/backoff and provider rate‑limit handling; public parser types and exports. |
| Reranker | src/search/reranker.ts | New Reranker abstraction with LLMReranker (LLM-driven ordering + retries) and HeuristicReranker fallback; factory createReranker and helper utilities. |
| Search: orchestrator, hybrid, graph & filters | src/search/orchestrator.ts, src/search/hybrid.ts, src/search/graph.ts, src/search/filters.ts | SearchOrchestrator extended to merge graph and user-memory results, apply advanced filters (importance/date/concepts/files), optional reranking; hybrid search updated to propagate filter options; new graphAugmentedSearch implementation. |
| Database: schema, entities, observations, user-memory | src/db/schema.ts, src/db/entities.ts, src/db/observations.ts, src/db/user-memory.ts | Migrations v8–v9 (add superseded columns and entity-graph + FTS); new EntityRepository with traversal/FTS; ObservationRepository updated to support supersede, exclude superseded rows, embedding helpers; new UserMemory DB + UserObservationRepository with FTS and CRUD. |
| Queue & processing pipeline | src/queue/processor.ts, src/mcp.ts, src/daemon.ts, src/index.ts | QueueProcessor gains conflict-resolution and entity-extraction branches (gray-zone LLM evaluation, supersede handling); new ctor deps (ConflictEvaluator, EntityExtractor, EntityRepository); MCP/daemon/index wiring to initialize/inject these components and user-memory. |
| Servers & tooling | src/servers/mcp-server.ts, src/tools/save.ts, src/tools/recall.ts, src/tools/search.ts, src/tools/timeline.ts, src/tools/update.ts | MCP server now async with pending-op tracking and optional SearchOrchestrator; tools accept optional user-memory repo (mem-save scope "user", mem-recall fallback); search tool accepts advanced filters and annotates result source. |
| Context, hooks & builders | src/context/builder.ts, src/context/progressive.ts, src/context/relevance.ts, src/hooks/compaction.ts, src/hooks/context-inject.ts | Added buildUserContextSection and buildUserCompactContext; progressive context exposes fullObservations and totalTokens; relevance context adds now: Date; hooks accept optional UserObservationRepository and append user-memory sections under token budgets. |
| Embeddings & utilities | src/search/embeddings.ts, src/ai/provider.ts, src/utils/retention.ts | New embedding helpers (generateEmbedding, cosineSimilarity, prepareObservationText), createEmbeddingModel helper, and enforceRetention now accepts pendingMessages param. |
| Types & config | src/types.ts, src/config.ts | Expanded OpenMemConfig with flags for conflictResolution, userMemory, reranking, entityExtraction; new SearchQuery filter fields; Observation gains supersededBy/supersededAt; SearchResult.source added. |
| Tests | tests/ai/*, tests/db/*, tests/queue/*, tests/search/*, tests/tools/*, tests/context/* | Extensive new/updated test suites for conflict evaluator, entity extractor, entity repo, user-memory DB, migrations v8/v9, graph search, reranking, advanced filters, queue conflict scenarios, tools integration, and context rendering. |
| Misc docs/comments & small APIs | Various files (daemon, db, servers, tools) | Added JSDoc comments, small API signature adjustments (e.g., TimelineTool, enforceRetention, DaemonWorker interfaces), and lifecycle cleanup wiring. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant QueueProcessor
    participant Embedding as EmbeddingModel
    participant ObservationRepo
    participant ConflictEval as ConflictEvaluator
    participant DB as Database

    User->>QueueProcessor: enqueue observation
    QueueProcessor->>Embedding: generate embedding
    QueueProcessor->>ObservationRepo: find similar candidates
    alt similarity > bandHigh
        QueueProcessor->>DB: mark as duplicate / skip creation
    else similarity in gray zone
        QueueProcessor->>ConflictEval: evaluate(newObs, candidates)
        alt outcome == "update"
            ConflictEval-->>QueueProcessor: { outcome: update, supersedesId }
            QueueProcessor->>DB: create new observation
            QueueProcessor->>ObservationRepo: supersede(oldId, newId)
        else outcome == "duplicate"
            ConflictEval-->>QueueProcessor: { outcome: duplicate }
            QueueProcessor->>DB: skip creation
        else outcome == "new_fact"
            ConflictEval-->>QueueProcessor: { outcome: new_fact }
            QueueProcessor->>DB: create new observation
        end
    else
        QueueProcessor->>DB: create new observation
    end
sequenceDiagram
    participant User
    participant QueueProcessor
    participant EntityExtractor
    participant LLM as LanguageModel
    participant EntityRepo
    participant DB as Database

    User->>QueueProcessor: enqueue observation
    QueueProcessor->>EntityExtractor: extract(observation)
    EntityExtractor->>LLM: send extraction prompt
    LLM-->>EntityExtractor: XML extraction response
    EntityExtractor->>EntityRepo: upsert entities & create relations
    EntityRepo->>DB: persist entities/relations/links
    EntityExtractor-->>QueueProcessor: ParsedEntityExtraction
    QueueProcessor->>DB: continue processing (link observation, etc.)
sequenceDiagram
    participant Client
    participant SearchOrch as SearchOrchestrator
    participant Hybrid as HybridSearch
    participant EntityRepo
    participant UserMemory
    participant Reranker
    participant Results

    Client->>SearchOrch: search(query, options)
    SearchOrch->>Hybrid: run FTS + vector
    Hybrid-->>SearchOrch: baseResults
    alt entityRepo available
        SearchOrch->>EntityRepo: traverse related entities
        EntityRepo-->>SearchOrch: related observation IDs
        SearchOrch->>SearchOrch: merge graph results (source=project)
    end
    alt userMemory enabled
        SearchOrch->>UserMemory: search user observations
        UserMemory-->>SearchOrch: userResults (source=user)
        SearchOrch->>SearchOrch: merge & dedupe results
    end
    alt reranker available
        SearchOrch->>Reranker: rerank(query, mergedResults, limit)
        Reranker-->>Results: rerankedResults
    else
        SearchOrch-->>Results: mergedResults
    end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

"I'm a rabbit with a prompt to share,
I hop through graphs and prune with care,
Conflicts settled, entities found,
Cross-project memories all around,
Hooray — search sings sweet and fair! 🐇✨"

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 54.64% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: Mem0 feature parity — smart evolution, memory hierarchy, search enhancements, graph memory' clearly identifies the primary change: implementing Mem0 feature parity across four major tracks (smart memory evolution, memory hierarchy, search enhancements, and graph memory). |

greptile-apps Bot commented Feb 7, 2026

Greptile Overview

Greptile Summary

Implements comprehensive Mem0 feature parity across 4 tracks with 7,661 lines added. All features are opt-in via config flags (defaulting to false).

Track A — Smart Memory Evolution: Introduced LLM-based conflict resolution that evaluates new observations against similar existing ones using a two-threshold similarity system (conflictSimilarityBandLow: 0.7, conflictSimilarityBandHigh: 0.92). The gray zone between thresholds triggers AI evaluation to classify observations as duplicate, update, or new fact. Supersession workflow correctly creates new observation before marking old one as superseded, with error handling that logs failures without blocking.
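
The two-threshold routing described for Track A can be sketched as follows. The threshold values are the ones quoted above; the `Band` type and `classifySimilarity` are hypothetical names, and whether each boundary is inclusive or exclusive is an assumption, since the PR does not say.

```typescript
type Band = "duplicate" | "gray_zone" | "distinct";

// Values quoted in the summary (conflictSimilarityBandLow / ...BandHigh).
const BAND_LOW = 0.7;
const BAND_HIGH = 0.92;

// Assumption: the high bound is exclusive and the low bound inclusive.
function classifySimilarity(similarity: number, low = BAND_LOW, high = BAND_HIGH): Band {
	if (similarity > high) return "duplicate"; // skip creation, no LLM call
	if (similarity >= low) return "gray_zone"; // hand off to the conflict evaluator
	return "distinct"; // store as a new observation
}

console.log(classifySimilarity(0.95), classifySimilarity(0.8), classifySimilarity(0.3));
// duplicate gray_zone distinct
```

Only the gray zone costs an LLM call, which keeps the evaluator off the hot path for clearly new or clearly duplicate observations.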

Track B — Memory Hierarchy: User-level cross-project memory database at ~/.config/open-mem/ with automatic directory creation (addresses previous PR feedback). Separate token budget (userMemoryMaxContextTokens: 1000) ensures project and user contexts don't compete. Search results merge both databases with content-based deduplication (title+narrative comparison), though similarity-based deduplication across databases was noted as potential enhancement in PR feedback.

Track C — Search Enhancements: Dual-mode reranking with LLM-based primary and heuristic fallback. Advanced filters for importance range, date range, concepts, and files. Graceful degradation ensures search succeeds even when AI services fail.
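
The primary-plus-fallback pattern described for Track C might look roughly like this. It is a sketch under assumptions: `Result`, `RerankFn`, `heuristicRerank`, and `rerankWithFallback` are names invented here, and sorting by the existing score merely stands in for whatever heuristic `src/search/reranker.ts` actually uses.

```typescript
interface Result {
	id: string;
	score: number;
}

type RerankFn = (query: string, results: Result[], limit: number) => Promise<Result[]>;

// Stand-in heuristic: order by the existing score, highest first.
const heuristicRerank: RerankFn = async (_query, results, limit) =>
	[...results].sort((a, b) => b.score - a.score).slice(0, limit);

// Try the primary (for example LLM-backed) reranker; degrade to the
// fallback if it throws, so search still returns results.
async function rerankWithFallback(
	primary: RerankFn,
	fallback: RerankFn,
	query: string,
	results: Result[],
	limit: number,
): Promise<Result[]> {
	try {
		return await primary(query, results, limit);
	} catch {
		return fallback(query, results, limit);
	}
}
```

The `await` inside the `try` matters: returning the promise without awaiting it would let a rejection escape the `catch` and skip the fallback.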

Track D — Graph Memory: Entity-relationship extraction creates SQLite-backed knowledge graph. BFS traversal with depth cap (max 2) and visited node limit (100) prevents memory explosion in dense graphs (addresses PR feedback). Graph-augmented search discovers related observations through entity relationships, expanding search coverage beyond direct keyword matches.

Architecture: All new subsystems integrate cleanly into existing plugin lifecycle with proper initialization, feature flag checks, and error handling. The 201 new tests provide solid coverage of core scenarios (790 total tests passing). Entity extraction prompt relaxed from "clearly mentioned" to include "strongly implied" relationships per PR feedback to improve knowledge graph coverage.

Confidence Score: 4/5

  • Safe to merge with minor considerations around error handling and cross-database deduplication
  • Score reflects comprehensive testing (201 new tests, 790 total passing), thoughtful error handling throughout, and opt-in design that minimizes risk to existing functionality. Previous PR feedback addressed (directory creation, BFS limits, entity extraction relaxation, supersede error handling). Minor considerations: supersede operations log but don't block on failure (acceptable by design), and cross-database similarity deduplication noted as future enhancement. The large scope (7,661 lines) is well-structured and incrementally testable.
  • Pay close attention to src/queue/processor.ts (conflict resolution integration) and src/search/orchestrator.ts (cross-database merging logic)

Important Files Changed

| Filename | Overview |
|---|---|
| src/queue/processor.ts | Integrates conflict resolution and entity extraction into observation processing pipeline; supersede operation addressed per PR feedback but potential silent failure remains |
| src/ai/conflict-evaluator.ts | LLM-based conflict detection with retry logic and proper error handling; follows established patterns from compressor module |
| src/ai/entity-extractor.ts | Entity and relationship extraction module with retry logic; mirrors conflict-evaluator structure with good error handling |
| src/db/user-memory.ts | Cross-project user memory database with automatic directory creation; path resolution improved per PR feedback with proper error handling |
| src/db/entities.ts | Knowledge graph repository with BFS traversal and FTS5 search; depth capping and MAX_VISITED limit addresses PR feedback on memory safety |
| src/search/orchestrator.ts | Multi-strategy search coordinator with graph augmentation and user memory merging; uses content-based deduplication but not similarity-based as noted in PR feedback |
| src/search/reranker.ts | Dual-mode reranking (LLM + heuristic fallback) with graceful degradation; well-tested with proper error handling and retry logic |
| src/config.ts | Added 4 new feature flags with environment variable support; all new features default to false (opt-in) as specified |
| src/index.ts | Initializes new subsystems (conflict evaluator, entity extractor, user memory, reranker) with proper feature flag checks and error handling |
| src/db/schema.ts | Adds migrations v8 (conflict resolution fields) and v9 (entity graph tables); follows existing migration patterns with proper indexing |

Sequence Diagram

sequenceDiagram
    participant User
    participant Hook as Tool Capture Hook
    participant Queue as Queue Processor
    participant AI as AI Services
    participant DB as Databases
    participant Search as Search Orchestrator

    Note over User,Search: Observation Processing Pipeline
    
    User->>Hook: Tool Output Event
    Hook->>Queue: enqueue(toolOutput)
    Queue->>DB: store pending message
    
    Note over Queue,AI: Batch Processing (30s intervals)
    
    Queue->>Queue: processBatch()
    Queue->>AI: compress(toolOutput)
    AI-->>Queue: parsed observation
    
    alt Conflict Resolution Enabled
        Queue->>AI: generate embedding
        AI-->>Queue: embedding vector
        Queue->>DB: findSimilar(embedding)
        DB-->>Queue: similar observations
        
        alt Gray Zone Detected
            Queue->>AI: ConflictEvaluator.evaluate()
            AI-->>Queue: duplicate/update/new_fact
            
            alt Outcome: update
                Queue->>DB: create observation
                DB-->>Queue: created
                Queue->>DB: supersede(oldId, newId)
            end
            
            alt Outcome: duplicate
                Queue->>Queue: skipObservation = true
            end
        end
    end
    
    alt Not Skipped
        Queue->>DB: create observation
        DB-->>Queue: created observation
        
        alt Entity Extraction Enabled
            Queue->>AI: EntityExtractor.extract()
            AI-->>Queue: entities & relations
            Queue->>DB: upsertEntity(), createRelation()
        end
    end
    
    Note over User,Search: Search & Retrieval
    
    User->>Search: search(query)
    
    alt Hybrid Search Strategy
        Search->>DB: FTS5 search
        DB-->>Search: text results
        Search->>AI: generate query embedding
        AI-->>Search: embedding
        Search->>DB: vector search
        DB-->>Search: semantic results
        Search->>Search: merge results
    end
    
    alt Graph Augmentation Enabled
        Search->>DB: findByName(entities)
        DB-->>Search: entity nodes
        Search->>DB: traverseRelations(depth=1)
        DB-->>Search: related entity IDs
        Search->>DB: getObservationsForEntity()
        DB-->>Search: related observations
        Search->>Search: merge graph results
    end
    
    alt User Memory Enabled
        Search->>DB: UserMemoryDB.search()
        DB-->>Search: user observations
        Search->>Search: deduplicate & merge
    end
    
    alt Reranking Enabled
        Search->>AI: LLMReranker.rerank()
        AI-->>Search: relevance-ordered indices
    end
    
    Search-->>User: final ranked results

@greptile-apps greptile-apps Bot left a comment

6 files reviewed, 5 comments

Comment thread src/queue/processor.ts
Comment on lines +284 to +293
if (conflictSupersedesId) {
	try {
		this.observationRepo.supersede(conflictSupersedesId, created.id);
		console.log(
			`[open-mem] Superseded observation ${conflictSupersedesId} with ${created.id}`,
		);
	} catch {
		// Supersede failure must not block observation creation
	}
}

Supersede operation happens after observation is created. If supersede() fails silently (line 290), the new observation remains but the old one isn't marked as superseded, leaving duplicate active observations.

Suggested change
 if (conflictSupersedesId) {
 	try {
 		this.observationRepo.supersede(conflictSupersedesId, created.id);
 		console.log(
 			`[open-mem] Superseded observation ${conflictSupersedesId} with ${created.id}`,
 		);
-	} catch {
-		// Supersede failure must not block observation creation
+	} catch (error) {
+		// Log and consider rollback or marking the new observation
+		console.error(`[open-mem] Failed to supersede ${conflictSupersedesId}:`, error);
 	}
 }
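
One way to act on this comment, beyond logging, is to report the failure back to the caller so it can retry or flag the new observation. The following is a minimal sketch, not the project's code: `SupersedeRepo`, `trySupersede`, and the injectable `log` parameter are hypothetical.

```typescript
// Hypothetical minimal repository interface for this sketch.
interface SupersedeRepo {
	supersede(oldId: string, newId: string): void;
}

// Returns true on success. On failure the error is logged and false is
// returned, so observation creation is not blocked but the caller can react.
function trySupersede(
	repo: SupersedeRepo,
	oldId: string,
	newId: string,
	log: (message: string, error: unknown) => void = console.error,
): boolean {
	try {
		repo.supersede(oldId, newId);
		return true;
	} catch (error) {
		log(`[open-mem] Failed to supersede ${oldId} with ${newId}:`, error);
		return false;
	}
}
```

A caller could collect the IDs for which this returns `false` and retry them on the next batch, which would narrow the window in which both observations stay active.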

Comment thread src/db/entities.ts
Comment on lines +183 to +184
const maxDepth = Math.min(depth, 2); // Cap at 2 to prevent explosion
const visited = new Set<string>();

BFS traversal caps depth at 2, but this cap happens during traversal. With a dense graph, the queue could still grow large before hitting the cap. Consider adding a max visited nodes limit (e.g., 100) to prevent memory issues in pathological cases.
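
A breadth-first traversal with both limits could be sketched as below. The depth cap of 2 comes from the reviewed code and the limit of 100 from this comment; the `Graph` type, the `traverse` signature, and the choice to count the start node toward the limit are assumptions for illustration.

```typescript
type Graph = Map<string, string[]>; // entity id -> neighbouring entity ids

const MAX_DEPTH = 2; // depth cap from the reviewed code
const MAX_VISITED = 100; // node limit suggested in this comment

function traverse(graph: Graph, start: string, depth: number, maxVisited = MAX_VISITED): string[] {
	const maxDepth = Math.min(depth, MAX_DEPTH);
	const visited = new Set<string>([start]);
	let frontier = [start];
	// Expand one level at a time so the depth cap is exact.
	for (let level = 0; level < maxDepth && frontier.length > 0; level++) {
		const next: string[] = [];
		for (const id of frontier) {
			for (const neighbour of graph.get(id) ?? []) {
				if (visited.has(neighbour)) continue; // already seen: handles cycles
				if (visited.size >= maxVisited) return [...visited]; // hard stop
				visited.add(neighbour);
				next.push(neighbour);
			}
		}
		frontier = next;
	}
	return [...visited];
}
```

Checking the limit before each insertion bounds both the visited set and the next frontier, so a single hub node with thousands of neighbours cannot inflate memory.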

Comment thread src/search/orchestrator.ts Outdated
Comment on lines +282 to +284
const seenIds = new Set(projectResults.map((r) => r.observation.id));
const dedupedUserResults = userResults.filter((r) => !seenIds.has(r.observation.id));
return [...projectResults, ...dedupedUserResults].slice(0, limit);

User memory results are merged after project results without considering similarity-based deduplication. Two observations with identical content but different IDs (one from project, one from user DB) could both appear in results. Consider computing similarity for cross-database deduplication.
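
A cheaper step short of embedding similarity is to deduplicate on normalised content. The sketch below keys each result on title plus narrative; the `SearchResult` shape is trimmed to what the example needs, and `contentKey` and `mergeResults` are hypothetical names. It catches exact and near-exact copies only, not paraphrases.

```typescript
interface SearchResult {
	observation: { id: string; title: string; narrative: string };
	source: "project" | "user";
}

// Normalise case and whitespace so trivially different copies share a key.
function contentKey(r: SearchResult): string {
	const norm = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
	return `${norm(r.observation.title)}\u0000${norm(r.observation.narrative)}`;
}

// Project results come first, so they win over identical user-level copies.
function mergeResults(project: SearchResult[], user: SearchResult[], limit: number): SearchResult[] {
	const seenIds = new Set<string>();
	const seenContent = new Set<string>();
	const merged: SearchResult[] = [];
	for (const r of [...project, ...user]) {
		const key = contentKey(r);
		if (seenIds.has(r.observation.id) || seenContent.has(key)) continue;
		seenIds.add(r.observation.id);
		seenContent.add(key);
		merged.push(r);
	}
	return merged.slice(0, limit);
}
```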

Comment thread src/db/user-memory.ts
Comment on lines +307 to +313
function resolveUserDbPath(dbPath: string): string {
	if (dbPath.startsWith("~/")) {
		const home = process.env.HOME || process.env.USERPROFILE || "";
		return `${home}${dbPath.slice(1)}`;
	}
	return dbPath;
}

User DB path resolution uses HOME or USERPROFILE but doesn't validate the path exists or is writable. If the directory doesn't exist (e.g., ~/.config/open-mem/), database creation will fail.

Add directory creation to the path resolver:

Suggested change
 function resolveUserDbPath(dbPath: string): string {
 	if (dbPath.startsWith("~/")) {
 		const home = process.env.HOME || process.env.USERPROFILE || "";
-		return `${home}${dbPath.slice(1)}`;
+		const resolved = `${home}${dbPath.slice(1)}`;
+		const dir = resolved.substring(0, resolved.lastIndexOf("/"));
+		require("node:fs").mkdirSync(dir, { recursive: true });
+		return resolved;
 	}
 	return dbPath;
 }
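
The suggestion above creates the directory only for `~/` paths and splits on `/`, which may not hold for Windows-style paths. A variant built on `node:os` and `node:path` is sketched below as one possible alternative, not the merged fix. The optional `home` parameter and the `user.db` file name are assumptions added to keep the example testable.

```typescript
import { existsSync, mkdirSync, mkdtempSync } from "node:fs";
import { homedir, tmpdir } from "node:os";
import { dirname, join, resolve } from "node:path";

// Expand a leading "~/" and ensure the parent directory exists for every
// path, not only home-relative ones. `home` is injectable for testing.
function resolveUserDbPath(dbPath: string, home: string = homedir()): string {
	const expanded = dbPath.startsWith("~/") ? join(home, dbPath.slice(2)) : resolve(dbPath);
	mkdirSync(dirname(expanded), { recursive: true });
	return expanded;
}

// Usage against a throwaway directory instead of the real home.
const demoHome = mkdtempSync(join(tmpdir(), "open-mem-demo-"));
const dbFile = resolveUserDbPath("~/.config/open-mem/user.db", demoHome);
console.log(dbFile, existsSync(dirname(dbFile)));
```

`mkdirSync` with `recursive: true` does not throw when the directory already exists, so the call is safe to repeat on every startup.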

Comment thread src/ai/prompts.ts Outdated
Comment on lines +226 to +227
Only extract entities that are clearly mentioned. Do not infer.
Respond with EXACTLY this XML format:

Instruction says Only extract entities that are clearly mentioned. Do not infer. This conservative approach may miss implicit relationships present in code. For example, "React hooks" doesn't explicitly state "React uses hooks" but the relationship is semantically clear. Consider relaxing this constraint for better knowledge graph coverage.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/db/observations.ts (1)

265-301: ⚠️ Potential issue | 🟠 Major

searchByConcept and searchByFile do not filter out superseded observations — inconsistent with search().

The main search() method (line 204), getIndex() (line 176), getWithEmbeddings() (line 323), and findSimilar() (line 354) all exclude superseded records with AND o.superseded_by IS NULL, but these two methods don't. Superseded observations will appear in concept and file-based searches.

🐛 Proposed fix
 	searchByConcept(concept: string, limit = 10, projectPath?: string): Observation[] {
 		const hasProjectPath = !!projectPath;
 		const sql = `SELECT o.*
 				 FROM observations o
 				 JOIN observations_fts fts ON o._rowid = fts.rowid
 				 ${hasProjectPath ? "JOIN sessions s ON o.session_id = s.id" : ""}
-				 WHERE observations_fts MATCH ?
+				 WHERE observations_fts MATCH ? AND o.superseded_by IS NULL
 				 ${hasProjectPath ? "AND s.project_path = ?" : ""}
 				 ORDER BY rank
 				 LIMIT ?`;
 	searchByFile(filePath: string, limit = 10, projectPath?: string): Observation[] {
 		const hasProjectPath = !!projectPath;
 		const sql = `SELECT o.*
 				 FROM observations o
 				 JOIN observations_fts fts ON o._rowid = fts.rowid
 				 ${hasProjectPath ? "JOIN sessions s ON o.session_id = s.id" : ""}
-				 WHERE observations_fts MATCH ?
+				 WHERE observations_fts MATCH ? AND o.superseded_by IS NULL
 				 ${hasProjectPath ? "AND s.project_path = ?" : ""}
 				 ORDER BY rank
 				 LIMIT ?`;
tests/context/injection.test.ts (1)

36-46: ⚠️ Potential issue | 🟡 Minor

makeIndexEntry is missing required discoveryTokens and importance fields.

The ObservationIndex interface requires all 8 fields, but this helper only provides 6 defaults. The issue goes undetected because test files are excluded from TypeScript compilation (see tsconfig.json), but the function violates the interface contract. Add defaults for both fields or update the function to accept them via overrides.

Additionally, makeConfig and makeMockRepos helpers are duplicated across two describe blocks (lines ~251–260 and ~395–412). Extract them to module scope to avoid maintenance burden.

src/index.ts (1)

266-276: ⚠️ Potential issue | 🟡 Minor

process.on("beforeExit", cleanup) won't fire if process.exit() is called elsewhere.

This is a well-known Node.js gotcha — beforeExit only fires when the event loop drains naturally, not on explicit process.exit() calls. The MCP server's rl.on("close") handler calls process.exit(0), which would bypass this cleanup. If the MCP server and plugin entry point run in the same process, the user-memory DB and main DB may not be closed cleanly.

Consider additionally registering on exit (synchronous only) or restructuring the MCP server to not call process.exit().
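
One way to cover both shutdown paths is sketched below. It is a minimal illustration, not the project's code: `registerCleanup` and the `ExitEmitter` interface are hypothetical names, and the sketch assumes the cleanup work (closing both databases) can run synchronously, since listeners on "exit" cannot wait for asynchronous work.

```typescript
// Hypothetical minimal shape of the process object, so the sketch can be
// exercised with a fake emitter instead of the real `process`.
interface ExitEmitter {
	on(event: "beforeExit" | "exit", listener: () => void): unknown;
}

/**
 * Registers `cleanup` on both "beforeExit" and "exit" and guarantees that it
 * runs at most once, whichever event fires first.
 * `cleanup` must be synchronous: "exit" listeners cannot await anything.
 */
function registerCleanup(proc: ExitEmitter, cleanup: () => void): () => void {
	let done = false;
	const runOnce = (): void => {
		if (done) return;
		done = true;
		cleanup();
	};
	proc.on("beforeExit", runOnce); // event loop drained naturally
	proc.on("exit", runOnce); // also reached via explicit process.exit()
	return runOnce;
}
```

In the plugin entry point this would be called as `registerCleanup(process, cleanup)`. The guard matters because a natural shutdown emits "beforeExit" and then "exit", so an unguarded handler would try to close the databases twice.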

🤖 Fix all issues with AI agents
In `@src/ai/parser.ts`:
- Around line 205-212: parseConflictEvaluationResponse can return a
ConflictEvaluation with outcome === "update" but no supersedesId when the
<supersedes> tag is missing; modify parseConflictEvaluationResponse so that when
outcome === "update" and the parsed supersedes value is missing/empty it returns
null instead of { outcome: "update", reason }, otherwise set result.supersedesId
= supersedes and return the result; reference symbols:
parseConflictEvaluationResponse, ConflictEvaluation, outcome, supersedes,
supersedesId.

In `@src/config.ts`:
- Around line 133-136: The config parse currently assigns Number.parseFloat(...)
directly to env.conflictSimilarityBandLow and env.conflictSimilarityBandHigh
which can become NaN for non-numeric env values; update the handling around the
parse calls (where OPEN_MEM_CONFLICT_BAND_LOW/HIGH are read) to parse into a
local variable, validate with Number.isNaN, and only assign to
env.conflictSimilarityBandLow and env.conflictSimilarityBandHigh if the parsed
value is a valid number (otherwise either leave the existing default or throw a
clear config error); reference the parse operation and the env fields
(OPEN_MEM_CONFLICT_BAND_LOW, OPEN_MEM_CONFLICT_BAND_HIGH,
env.conflictSimilarityBandLow, env.conflictSimilarityBandHigh) when making the
change.

In `@src/db/user-memory.ts`:
- Around line 307-312: The function resolveUserDbPath currently falls back to an
empty string when neither HOME nor USERPROFILE is set, causing "~/" to resolve
to a root path; update resolveUserDbPath to detect when home is missing and
throw a clear error (or return a controlled failure) instead of returning a path
prefixed with "", so when dbPath startsWith("~/") get the home from
process.env.HOME || process.env.USERPROFILE, and if that value is falsy throw a
descriptive Error (e.g., "Cannot resolve user home directory; HOME/USERPROFILE
not set") to prevent writing to an unintended location.
- Around line 208-235: The search method currently passes raw user input into
the FTS5 MATCH clause and can throw on malformed queries; wrap the DB call in a
try/catch inside the search(query) function (the block that calls
this.db.all<UserObservationSearchRow>(...), processes rows with mapRow, and
returns rank) and on any error return an empty array (optionally log the error)
so syntax errors in FTS5 queries are swallowed and handled consistently with
EntityRepository.findByName.

In `@src/search/hybrid.ts`:
- Around line 59-80: safelyRunFts is calling observations.search without
forwarding the required project scoping field; update the call inside
safelyRunFts (the observations.search invocation) to include projectPath:
options.projectPath so the SearchQuery passed to ObservationRepository.search is
project-scoped, preserving the existing try/catch and all other option mappings
(type, limit, importanceMin, importanceMax, createdAfter, createdBefore,
concepts, files).

In `@src/search/orchestrator.ts`:
- Around line 302-322: The functions passesAdvancedFilters (in
src/search/orchestrator.ts) and passesFilters (in src/search/hybrid.ts)
duplicate identical filter logic; extract the shared logic into a single utility
(e.g., create a new passesFilters in src/search/filters.ts or similar) that
accepts (obs: Observation, filters: ObservationFilterOptions) and move the
type/interface for the filter options there, then update both callers to import
and use the shared passesFilters (remove the local passesAdvancedFilters) so
both modules reference the same implementation and types.

In `@src/search/reranker.ts`:
- Around line 49-90: The rerank method currently truncates results into
candidates = results.slice(0, this.maxCandidates) which loses items beyond
maxCandidates; update rerank so after calling applyReranking(candidates,
indices, limit) you append the overflow items =
results.slice(this.maxCandidates) (or the tail needed to satisfy limit) to the
reranked list; ensure you only pass the candidate slice into
parseRerankingResponse/applyReranking but then merge the remaining original
results in their original order and respect the requested limit when returning
the final array.

In `@src/servers/mcp-server.ts`:
- Around line 150-153: The close handler for rl (rl.on("close", ...)) awaits
Promise.all(this.pendingOps) but if that promise rejects the async callback
becomes an unhandled rejection and process.exit(0) may never run; wrap the await
in a try/catch/finally (or append .catch and a .finally) so rejections from
Promise.all(this.pendingOps) are caught and logged (include context like
"pending ops on close"), and ensure process.exit(0) is always called in the
finally block to guarantee shutdown; refer to rl.on("close", ...) and
Promise.all(this.pendingOps) when making the change.
- Around line 508-519: The fallback branch is missing project scoping: when
this.searchOrchestrator is falsy the code calls this.observations.search({
query, type, limit }) but omits projectPath so results span all projects; update
the fallback call to include projectPath: this.projectPath (i.e.
this.observations.search({ query, type, limit, projectPath: this.projectPath }))
so both branches (searchOrchestrator.search and observations.search)
consistently scope by project; confirm SearchQuery handling remains compatible
with optional projectPath.
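
For the `src/search/reranker.ts` item in the list above, the overflow-preserving merge can be sketched as follows. This is a simplified illustration under assumptions: `rerankWithOverflow` is a hypothetical helper, and the `rerank` callback stands in for the real LLM call plus `applyReranking`, whose exact signatures are not shown in this review.

```typescript
/**
 * Reranks only the first `maxCandidates` results, then appends the remaining
 * results in their original order and applies the requested limit.
 * `rerank` is a stand-in for the real reranking step (an assumption).
 */
function rerankWithOverflow<T>(
	results: T[],
	maxCandidates: number,
	limit: number,
	rerank: (candidates: T[]) => T[],
): T[] {
	const candidates = results.slice(0, maxCandidates);
	const overflow = results.slice(maxCandidates); // kept, not discarded
	return rerank(candidates).concat(overflow).slice(0, limit);
}
```

With five results, `maxCandidates = 3`, and a reranker that reverses its input, the first three items come back reversed and the last two keep their original positions.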
🧹 Nitpick comments (23)
src/db/schema.ts (1)

271-347: No ON DELETE CASCADE on entity graph FKs — orphan rows possible on observation deletion.

entity_relations.observation_id and entity_observations.observation_id reference observations(id), but without ON DELETE CASCADE. Deleting an observation (via delete() or deleteOlderThan()) will leave dangling rows in these tables. The same applies to entity deletions and the junction/relation tables.

This is consistent with the rest of the schema (no cascades anywhere), but the entity graph tables are more heavily cross-referenced. Consider adding cleanup logic in the retention/delete paths, or adding cascades in a follow-up migration.

src/db/observations.ts (2)

536-542: supersede() doesn't verify that either observation ID exists.

If called with invalid IDs, the UPDATE silently succeeds (affects 0 rows). Consider returning a boolean or checking the target observation exists, to surface bugs in the conflict resolution pipeline early.


16-22: LIKE-based filtering on JSON-encoded arrays may produce false positives.

The concepts column stores data like ["error-handling","testing"]. A LIKE '%test%' filter would match "testing" as well as any other concept that merely contains "test" as a substring. The same applies to files. This is acceptable for now but worth documenting as a known limitation.

Also applies to: 236-252
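
The false positive can be reproduced without a database. The sketch below is illustrative only: `likeContains` approximates SQLite's default case-insensitive `LIKE '%…%'` behaviour for ASCII input, and `hasConcept` is a hypothetical exact-match alternative that parses the stored JSON array.

```typescript
/** Approximates `column LIKE '%needle%'` (case-insensitive for ASCII). */
function likeContains(columnValue: string, needle: string): boolean {
	return columnValue.toLowerCase().includes(needle.toLowerCase());
}

/** Exact membership test on the JSON-encoded array (hypothetical helper). */
function hasConcept(conceptsJson: string, concept: string): boolean {
	const parsed: unknown = JSON.parse(conceptsJson);
	return Array.isArray(parsed) && parsed.includes(concept);
}

const stored = '["error-handling","testing"]';
const substringHit = likeContains(stored, "test"); // matches although no concept is named "test"
const exactHit = hasConcept(stored, "test"); // no exact match
```

Exact matching of this kind would have to run after the SQL query as a post-filter, so it trades some extra rows fetched for precision.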

src/context/builder.ts (1)

228-258: Duplicated budget-selection logic between buildUserContextSection and buildUserCompactContext.

Lines 234–242 and 269–277 are identical. Consider extracting a small helper:

♻️ Proposed refactor
+function selectByBudget(entries: ObservationIndex[], maxTokens: number): ObservationIndex[] {
+	let budget = maxTokens;
+	const included: ObservationIndex[] = [];
+	for (const entry of entries) {
+		const tokens = entry.tokenCount || estimateTokens(entry.title);
+		if (budget - tokens < 0) break;
+		included.push(entry);
+		budget -= tokens;
+	}
+	return included;
+}
+
 export function buildUserContextSection(
 	userIndex: ObservationIndex[],
 	maxTokens: number,
 ): string {
 	if (userIndex.length === 0) return "";
-
-	let budget = maxTokens;
-	const included: ObservationIndex[] = [];
-
-	for (const entry of userIndex) {
-		const tokens = entry.tokenCount || estimateTokens(entry.title);
-		if (budget - tokens < 0) break;
-		included.push(entry);
-		budget -= tokens;
-	}
-
+	const included = selectByBudget(userIndex, maxTokens);
 	if (included.length === 0) return "";
 	// ... rest unchanged

Also applies to: 263-288

src/ai/prompts.ts (1)

184-190: Duplicate/orphaned "Reranking Prompt" section header.

Lines 188–189 contain a "Reranking Prompt" section comment with no associated code, then the actual buildRerankingPrompt function appears at line 245 under a second identical header. Remove the stale one.

🧹 Proposed cleanup
-// -----------------------------------------------------------------------------
-// Reranking Prompt
-// -----------------------------------------------------------------------------
-
 // -----------------------------------------------------------------------------
 // Entity Extraction Prompt
 // -----------------------------------------------------------------------------
src/ai/entity-extractor.ts (1)

94-112: isRetryable and sleep are duplicated verbatim in conflict-evaluator.ts.

Both this file and conflict-evaluator.ts contain identical isRetryable and sleep implementations (and share the same constructor/retry pattern). Consider extracting these into a shared utility (e.g., src/ai/retry.ts) to reduce duplication and ensure consistent retry behaviour if the logic evolves.
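
A shared module along these lines could look like the sketch below. The retry criteria are an assumption (HTTP 429 and 5xx status codes on the thrown error); the actual `isRetryable` bodies are not shown in this review, so treat the predicate as a placeholder. `withRetry` is a hypothetical wrapper added for illustration.

```typescript
/** Assumed criteria: rate limits (429) and server errors (5xx) are retryable. */
function isRetryable(error: unknown): boolean {
	if (typeof error !== "object" || error === null) return false;
	const status = (error as { status?: unknown }).status;
	return typeof status === "number" && (status === 429 || status >= 500);
}

function sleep(ms: number): Promise<void> {
	return new Promise((resolve) => setTimeout(resolve, ms));
}

/** Runs `fn`, retrying retryable failures with exponential backoff. */
async function withRetry<T>(
	fn: () => Promise<T>,
	maxAttempts = 3,
	baseDelayMs = 200,
): Promise<T> {
	for (let attempt = 1; ; attempt++) {
		try {
			return await fn();
		} catch (error) {
			// Give up on the last attempt or on non-retryable errors.
			if (attempt >= maxAttempts || !isRetryable(error)) throw error;
			await sleep(baseDelayMs * 2 ** (attempt - 1));
		}
	}
}
```

Each AI component would then import from the one module, so a change to the retry policy lands in a single place.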

tests/ai/entity-extractor.test.ts (1)

15-20: withMockGenerate helper duplicated across test files.

This helper is identical in tests/ai/conflict-evaluator.test.ts. Consider extracting it to a shared test utility if more AI component tests follow the same pattern.

src/ai/conflict-evaluator.ts (1)

17-22: ConflictEvaluatorConfig is identical to EntityExtractorConfig.

Both config interfaces have the exact same fields (provider, apiKey, model, rateLimitingEnabled). Consider defining a shared AiComponentConfig (or similar) base type to reduce drift if new fields are added to one but not the other.

tests/context/injection.test.ts (1)

384-412: makeConfig and makeMockRepos are duplicated within this file.

These helpers (lines 385–412) are identical to those defined earlier (lines 159–186). Consider hoisting them to the top-level scope of the test file to avoid duplication.

src/tools/search.ts (1)

17-22: Consider validating importance_min ≤ importance_max and ISO 8601 date format.

importance_min and importance_max can currently be set such that min > max, which would silently return no results. Similarly, after and before accept arbitrary strings with no date format validation, which could produce confusing errors downstream.

💡 Proposed validation refinements
-const searchArgsSchema = z.object({
-	query: z.string().describe("Search query (supports keywords, phrases, file paths)"),
-	type: z
-		.enum(["decision", "bugfix", "feature", "refactor", "discovery", "change"])
-		.optional()
-		.describe("Filter by observation type"),
-	limit: z.number().min(1).max(50).default(10).describe("Maximum number of results"),
-	importance_min: z.number().min(1).max(5).optional().describe("Minimum importance (1-5)"),
-	importance_max: z.number().min(1).max(5).optional().describe("Maximum importance (1-5)"),
-	after: z.string().optional().describe("Only observations after this date (ISO 8601)"),
-	before: z.string().optional().describe("Only observations before this date (ISO 8601)"),
-	concepts: z.array(z.string()).optional().describe("Filter by concepts"),
-	files: z.array(z.string()).optional().describe("Filter by file paths"),
-});
+const searchArgsSchema = z.object({
+	query: z.string().describe("Search query (supports keywords, phrases, file paths)"),
+	type: z
+		.enum(["decision", "bugfix", "feature", "refactor", "discovery", "change"])
+		.optional()
+		.describe("Filter by observation type"),
+	limit: z.number().min(1).max(50).default(10).describe("Maximum number of results"),
+	importance_min: z.number().min(1).max(5).optional().describe("Minimum importance (1-5)"),
+	importance_max: z.number().min(1).max(5).optional().describe("Maximum importance (1-5)"),
+	after: z.string().datetime({ offset: true }).optional().describe("Only observations after this date (ISO 8601)"),
+	before: z.string().datetime({ offset: true }).optional().describe("Only observations before this date (ISO 8601)"),
+	concepts: z.array(z.string()).optional().describe("Filter by concepts"),
+	files: z.array(z.string()).optional().describe("Filter by file paths"),
+}).refine(
+	(d) => d.importance_min == null || d.importance_max == null || d.importance_min <= d.importance_max,
+	{ message: "importance_min must be ≤ importance_max" },
+);
tests/search/reranking.test.ts (2)

165-176: Dead code: dummyModel is unused.

Lines 167–169 define a dummyModel type alias that is never referenced. This can be removed for clarity.

🧹 Proposed cleanup
 	function makeLLMReranker() {
-		// Use a dummy language model — we override _generate anyway
-		const dummyModel = {} as Parameters<typeof LLMReranker.prototype.rerank>[0] extends string
-			? never
-			: never;
 		return new LLMReranker({} as any, {
 			rerankingMaxCandidates: 20,
 			provider: "anthropic",
 			model: "test-model",
 			rateLimitingEnabled: false,
 		});
 	}

384-411: Integration test asserts result count but not the expected reranked order.

The mock reverses results and the comment on Line 408 says "the order should be flipped," but the assertion on Lines 409–410 only checks ids.length. This makes the test pass even if the reranker wasn't actually applied. Consider asserting the expected ordering.

💡 Proposed stronger assertion
 		// The mock reverses, so the order should be flipped from FTS5 default
 		const ids = results.map((r) => r.observation.title);
-		expect(ids.length).toBe(2);
+		expect(ids).toEqual(["Authentication JWT tokens", "Database connection pooling"]);
src/search/graph.ts (2)

21-32: Nested entity lookups may become a hot path for verbose queries.

For a query with W words, this performs up to (W + W-1) FTS lookups (single words + bigrams), then for each matched entity runs BFS traversal + observation fetches. Short common words (2–3 chars) passing the >= 2 filter can cause noisy FTS matches, amplifying the loop.

Consider deduplicating observation IDs earlier, capping entityNames count, or filtering out common stop-words to keep this bounded.
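
A bounded variant of the candidate extraction might look like this. It is a sketch under assumptions: the stop-word list and the default cap of 8 are illustrative values, not taken from the codebase, and bigrams are built from the words that survive filtering, so they can join words that were not adjacent in the original query.

```typescript
// Illustrative stop-word list; a real list would likely be longer.
const STOP_WORDS = new Set(["a", "an", "and", "for", "in", "is", "it", "of", "on", "or", "the", "to"]);

function extractEntityCandidates(query: string, maxCandidates = 8): string[] {
	const words = query
		.split(/\s+/)
		.filter((w) => w.length >= 2 && !STOP_WORDS.has(w.toLowerCase()));

	// A Set removes repeated words and repeated bigrams.
	const candidates = new Set<string>(words);
	for (let i = 0; i < words.length - 1; i++) {
		candidates.add(`${words[i]} ${words[i + 1]}`);
	}

	// Single words are inserted first, so the cap drops bigrams before words.
	return Array.from(candidates).slice(0, maxCandidates);
}
```

The cap bounds the number of FTS lookups per query regardless of query length, which in turn bounds the number of BFS traversals.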


57-69: Simplify single-word push loop.

🧹 Minor simplification
 function extractEntityCandidates(query: string): string[] {
 	const words = query.split(/\s+/).filter((w) => w.length >= 2);
 	const candidates: string[] = [];
-
-	for (const word of words) {
-		candidates.push(word);
-	}
+	candidates.push(...words);
 
 	for (let i = 0; i < words.length - 1; i++) {
 		candidates.push(`${words[i]} ${words[i + 1]}`);
 	}
 
 	return candidates;
 }
tests/tools/user-memory-tools.test.ts (1)

47-55: Reuse cleanupTestDb for user DB cleanup.

The manual cleanup loop on lines 51-55 duplicates the logic already in cleanupTestDb. You can simplify:

♻️ Proposed simplification
 afterEach(() => {
 	db.close();
 	cleanupTestDb(dbPath);
 	userMemoryDb.close();
-	for (const suffix of ["", "-wal", "-shm"]) {
-		try {
-			unlinkSync(userDbPath + suffix);
-		} catch {}
-	}
+	cleanupTestDb(userDbPath);
 });
tests/queue/conflict-resolution.test.ts (2)

55-69: createSequentialEmbeddingModel appears unused.

This helper is defined but never called in the test file. Consider removing it to reduce dead code, or add it when tests that need sequential embeddings are implemented.


80-108: Mocking via internal _generate property is brittle.

The mock helpers (mockEvaluatorResponse, mockEvaluatorFailure, mockEvaluatorInvalidResponse, mockCompressorReturning) all override an internal _generate method via type assertion. If the internal implementation of ConflictEvaluator or ObservationCompressor renames or restructures this method, all these tests will silently break (pass vacuously or fail with confusing errors). Consider exposing a test-friendly injection point or using a proper mock/spy framework.

src/mcp.ts (1)

57-79: Embeddings are gated on compressionEnabled intentionally, but consider decoupling if users need vector search independently.

The embeddingModel is indeed used for vector search in SearchOrchestrator, but also for deduplication and conflict resolution in QueueProcessor. The coupling to compressionEnabled appears intentional—when compression is enabled, observations are embedded for dedup/conflict resolution purposes, and those same embeddings enable vector search.

However, your concern about flexibility is valid: users who want vector search but not compression (or vice versa) would need separate config flags. Currently, disabling compression disables embeddings entirely, which disables both vector search and conflict resolution.

The system does gracefully degrade—search falls back to FTS5 when embeddingModel is null—but adding an independent embeddingsEnabled flag (or vectorSearchEnabled) would give users finer control over this feature cluster without forcing the coupling.

src/queue/processor.ts (2)

52-64: Consider a configuration/dependency object to reduce constructor parameter count.

The constructor now accepts 11 positional parameters. This makes call sites fragile — it's easy to swap nullable arguments. A single deps/options object would improve readability and prevent positional errors.

♻️ Example: group collaborators into a typed options object
+interface QueueProcessorDeps {
+  compressor: ObservationCompressor;
+  summarizer: SessionSummarizer;
+  pendingRepo: PendingMessageRepository;
+  observationRepo: ObservationRepository;
+  sessionRepo: SessionRepository;
+  summaryRepo: SummaryRepository;
+  embeddingModel?: EmbeddingModel | null;
+  conflictEvaluator?: ConflictEvaluator | null;
+  entityExtractor?: EntityExtractor | null;
+  entityRepo?: EntityRepository | null;
+}
+
 export class QueueProcessor {
   constructor(
     private config: QueueProcessorConfig,
-    private compressor: ObservationCompressor,
-    private summarizer: SessionSummarizer,
-    ...
+    private deps: QueueProcessorDeps,
   ) {}

125-247: Indentation mismatch with surrounding code inside the same try block.

Lines 125–247 (the new dedup/conflict-resolution block) are indented one level less than the original code at lines 118–123 and 268–282, even though they all live inside the same try block starting at line 117. This doesn't break functionality but reduces readability.

src/search/reranker.ts (1)

192-219: isRetryable and sleep are duplicated across multiple AI modules.

These same helpers exist in src/ai/conflict-evaluator.ts and src/ai/entity-extractor.ts. Consider extracting them into a shared utility (e.g., src/ai/retry-utils.ts) to reduce duplication.

src/db/entities.ts (1)

114-123: Silent catch swallows all DB errors in createRelation.

The bare catch on line 121 returns null for any error, not just constraint violations. A transient DB error (e.g., disk full, locked DB) would be silently ignored and treated as "relation already exists."

Consider narrowing the catch to only expected constraint errors, or at minimum logging the error:

Proposed fix
 		try {
 			this.db.run(
 				`INSERT OR IGNORE INTO entity_relations
 				 (id, source_entity_id, target_entity_id, relationship, observation_id, created_at)
 				 VALUES (?, ?, ?, ?, ?, ?)`,
 				[id, sourceEntityId, targetEntityId, relationship, observationId, now],
 			);
-		} catch {
+		} catch (err) {
+			console.warn(`[open-mem] Failed to create entity relation: ${err}`);
 			return null;
 		}
src/index.ts (1)

158-158: entityRepo is unconditionally passed to SearchOrchestrator, enabling graph-augmented search even when entity extraction is disabled.

EntityRepository is always created (line 158) and always passed to SearchOrchestrator (line 292). The orchestrator will then call graphAugmentedSearch on every query (orchestrator.ts line 73). When entity extraction is disabled, the entity tables will be empty, so this results in unnecessary DB queries on every search.

Consider guarding it:

Proposed fix
 	const searchOrchestrator = new SearchOrchestrator(
 		observationRepo,
 		embeddingModel,
 		db.hasVectorExtension,
 		reranker,
 		userObservationRepo,
-		entityRepo,
+		config.entityExtractionEnabled ? entityRepo : null,
 	);

Also applies to: 286-293

Comment thread src/ai/parser.ts
Comment thread src/config.ts Outdated
Comment thread src/db/user-memory.ts
Comment thread src/db/user-memory.ts
Comment thread src/search/hybrid.ts
Comment thread src/search/orchestrator.ts Outdated
Comment thread src/search/reranker.ts
Comment thread src/servers/mcp-server.ts Outdated
Comment thread src/servers/mcp-server.ts
…duplication, and docstrings

- Improved error handling: supersede try/catch, FTS try/catch, close handler, NaN guards
- Fixed project scoping: projectPath in FTS search and fallback search
- Added BFS max visited nodes limit to prevent runaway graph traversal
- Parser returns null for update without supersedes target
- Reranker preserves overflow items instead of discarding them
- Extracted shared filters utility (src/search/filters.ts)
- Content-based cross-DB deduplication in user-memory
- Relaxed entity extraction prompt for better recall
- Added JSDoc docstrings across all new files

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/db/observations.ts (1)

266-302: ⚠️ Potential issue | 🟠 Major

searchByConcept and searchByFile don't exclude superseded observations.

The main search(), getIndex(), getWithEmbeddings(), and findSimilar() methods all filter with superseded_by IS NULL, but these two specialized search methods do not. This means superseded (stale/contradictory) observations can still surface through concept or file searches.

Proposed fix
 	searchByConcept(concept: string, limit = 10, projectPath?: string): Observation[] {
 		const hasProjectPath = !!projectPath;
 		const sql = `SELECT o.*
 				 FROM observations o
 				 JOIN observations_fts fts ON o._rowid = fts.rowid
 				 ${hasProjectPath ? "JOIN sessions s ON o.session_id = s.id" : ""}
 				 WHERE observations_fts MATCH ?
+				 AND o.superseded_by IS NULL
 				 ${hasProjectPath ? "AND s.project_path = ?" : ""}
 				 ORDER BY rank
 				 LIMIT ?`;

Apply the same fix to searchByFile.

🤖 Fix all issues with AI agents
In `@src/db/schema.ts`:
- Around line 294-320: The foreign keys for observation_id in the
entity_relations and entity_observations tables currently reference
observations(id) without cascade behavior, which causes delete failures; update
the CREATE TABLE statements for entity_relations (foreign key clause referencing
observation_id) and entity_observations (foreign key clause referencing
observation_id) to include ON DELETE CASCADE on the reference to
observations(id) so that when an observation is deleted its related rows in
entity_relations and entity_observations are automatically removed.
🧹 Nitpick comments (12)
src/context/builder.ts (1)

240-245: || treats tokenCount === 0 as falsy — prefer ?? for the fallback.

On lines 241 and 276, entry.tokenCount || estimateTokens(entry.title) will fall through to the estimate if tokenCount is 0. While a zero token count is unlikely in practice, using ?? (nullish coalescing) is more semantically correct — it only falls back for null/undefined.

Proposed fix
-		const tokens = entry.tokenCount || estimateTokens(entry.title);
+		const tokens = entry.tokenCount ?? estimateTokens(entry.title);

Apply the same change on line 276.
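
The difference only shows up when `tokenCount` is exactly 0. A minimal sketch, in which `estimateTokens` is a stand-in with an assumed four-characters-per-token heuristic rather than the project's implementation:

```typescript
// Stand-in estimator (assumption): roughly four characters per token.
function estimateTokens(text: string): number {
	return Math.ceil(text.length / 4);
}

interface Entry {
	title: string;
	tokenCount?: number | null;
}

// `||` falls back for every falsy value, including 0.
const withOr = (e: Entry): number => e.tokenCount || estimateTokens(e.title);

// `??` falls back only for null and undefined.
const withNullish = (e: Entry): number => e.tokenCount ?? estimateTokens(e.title);

const zeroCount: Entry = { title: "12345678", tokenCount: 0 };
const missingCount: Entry = { title: "12345678" };
```

Here `withOr(zeroCount)` falls back to the estimate, while `withNullish(zeroCount)` keeps the stored 0. Both agree for `missingCount`. If 0 can also mean "never counted" in the database, the `||` form may be the safer choice, because a 0-token entry never consumes budget.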

src/db/observations.ts (1)

537-544: supersede doesn't verify that either observation ID exists.

If called with a non-existent observationId, the UPDATE silently affects zero rows. Consider returning a boolean or checking changes to let callers know if the operation succeeded — especially since a no-op supersede could mask bugs in conflict resolution logic.
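
A boolean-returning variant could be sketched as follows. The `Db` interface is hypothetical: it assumes the database wrapper's `run()` reports the number of changed rows, which this review does not confirm, and the SQL uses only the `superseded_by` column named above.

```typescript
// Hypothetical minimal DB surface: run() reports how many rows were changed.
interface RunResult {
	changes: number;
}
interface Db {
	run(sql: string, params: unknown[]): RunResult;
}

/**
 * Marks `observationId` as superseded by `supersededById`.
 * Returns false when no row matched, so callers can surface the no-op.
 */
function supersede(db: Db, observationId: string, supersededById: string): boolean {
	const result = db.run(
		"UPDATE observations SET superseded_by = ? WHERE id = ?",
		[supersededById, observationId],
	);
	return result.changes > 0;
}
```

The conflict resolution path could then log a warning when `supersede` returns false instead of assuming the old observation was retired.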

src/db/user-memory.ts (1)

321-334: Use a top-level import instead of inline require("node:fs").

The file uses ESM import syntax throughout but switches to require on line 331. This works in Bun but is inconsistent and non-standard in an ESM module. Past review feedback on the HOME guard has been addressed correctly.

Proposed fix

Add to the top-level imports:

import { mkdirSync } from "node:fs";

Then update the function:

 function resolveUserDbPath(dbPath: string): string {
 	if (dbPath.startsWith("~/")) {
 		const home = process.env.HOME || process.env.USERPROFILE || "";
 		if (!home) {
 			throw new Error(
 				"Cannot resolve user DB path: HOME environment variable is not set",
 			);
 		}
 		const resolved = `${home}${dbPath.slice(1)}`;
 		const dir = resolved.substring(0, resolved.lastIndexOf("/"));
-		require("node:fs").mkdirSync(dir, { recursive: true });
+		mkdirSync(dir, { recursive: true });
 		return resolved;
 	}
 	return dbPath;
 }
src/search/hybrid.ts (1)

103-143: passesFilters does not enforce project scoping in native vector search.

runNativeVectorSearch retrieves candidates via getVecEmbeddingMatches which is not project-scoped, and passesFilters (from filters.ts) does not check projectPath. This means native vector results could include observations from other projects — unlike the FTS path (project-scoped) and the JS fallback path (which uses getWithEmbeddings(projectPath, ...)).

This is pre-existing behavior, but now that options is threaded through, it would be straightforward to add a project-path check to passesFilters or filter inline here for consistency.

src/ai/prompts.ts (2)

186-192: Duplicate/misplaced section header — "Reranking Prompt" appears twice.

Lines 187–188 contain a "Reranking Prompt" section header, but this block is between the conflict evaluation and entity extraction sections. The actual reranking prompt and its correct header are at Lines 244–246. This first occurrence appears to be a leftover.

Proposed fix
-// -----------------------------------------------------------------------------
-// Reranking Prompt
-// -----------------------------------------------------------------------------
-
 // -----------------------------------------------------------------------------
 // Entity Extraction Prompt
 // -----------------------------------------------------------------------------

209-242: Potential XML injection from user-controlled observation data.

Fields like obs.title, obs.narrative, obs.facts, and file paths are interpolated directly into the XML prompt. If any contain XML-like strings (e.g., </facts> in a fact), it could confuse the LLM's output parsing. This is low-severity since the LLM response (not the prompt) is what gets parsed, but escaping special characters (<, >, &) would make the prompts more robust.
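
A minimal escaping helper is sketched below. `escapeXml` is a hypothetical name, and the sketch covers only the three characters named above. It would be applied to each interpolated field before the prompt string is assembled.

```typescript
/** Escapes the characters that could be mistaken for prompt markup. */
function escapeXml(value: string): string {
	return value
		.replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
		.replace(/</g, "&lt;")
		.replace(/>/g, "&gt;");
}
```

For example, a fact containing `</facts>` would reach the model as `&lt;/facts&gt;` and could no longer close the surrounding tag.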

src/ai/entity-extractor.ts (1)

105-123: isRetryable and sleep are duplicated between entity-extractor.ts and conflict-evaluator.ts.

Both files contain identical isRetryable() and sleep() implementations. Consider extracting these into a shared utility (e.g., src/ai/retry.ts) to keep them in sync.

src/search/orchestrator.ts (1)

265-283: Advanced filters are not forwarded to user memory search.

The searchUserMemory method only passes query and limit to userObservationRepo.search(). Filters like importanceMin, createdAfter, concepts, etc. from options are silently ignored for user memory results. If this is intentional (simpler user-memory interface), consider documenting it. If not, the UserObservationRepository.search API would need extending.

src/queue/processor.ts (2)

125-247: Conflict resolution logic is sound but deeply nested.

The gray-zone approach with bandLow/bandHigh thresholds and LLM evaluation is well-designed. Fast-path skip for exact duplicates above bandHigh, LLM evaluation for the gray zone, and graceful fallback on evaluator failure are all correct. The non-null assertion on this.conflictEvaluator! (Line 187) is safe given the conflictEnabled guard.

However, this block reaches ~6 levels of nesting (for → try → if → if → if → try). Consider extracting the conflict resolution logic into a private method (e.g., evaluateConflicts(...)) that returns { skip: boolean; supersedesId: string | null } to reduce cognitive complexity.
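
The extraction could take roughly the following shape. This is a sketch under stated assumptions: the boundary comparisons (`>=`), the `Evaluation` outcome names other than "update", and the fallback of keeping the new observation when the evaluator fails or returns nothing are all illustrative guesses, not the actual behaviour of `processor.ts`.

```typescript
interface SimilarMatch {
	id: string;
	similarity: number;
}
interface ConflictDecision {
	skip: boolean;
	supersedesId: string | null;
}
// Hypothetical evaluator result; only "update" plus supersedesId comes from the review.
interface Evaluation {
	outcome: "new" | "update" | "duplicate";
	supersedesId?: string;
}
type Band = "distinct" | "gray" | "duplicate";

function classifyBand(similarity: number, bandLow: number, bandHigh: number): Band {
	if (similarity >= bandHigh) return "duplicate"; // fast path, no LLM call
	if (similarity >= bandLow) return "gray"; // needs LLM evaluation
	return "distinct";
}

async function evaluateConflicts(
	matches: SimilarMatch[],
	bandLow: number,
	bandHigh: number,
	evaluate: (grayZone: SimilarMatch[]) => Promise<Evaluation | null>,
): Promise<ConflictDecision> {
	const keep: ConflictDecision = { skip: false, supersedesId: null };
	const bands = matches.map((m) => classifyBand(m.similarity, bandLow, bandHigh));
	if (bands.includes("duplicate")) return { skip: true, supersedesId: null };

	const grayZone = matches.filter((_, i) => bands[i] === "gray");
	if (grayZone.length === 0) return keep;

	try {
		const result = await evaluate(grayZone);
		if (result === null) return keep;
		if (result.outcome === "duplicate") return { skip: true, supersedesId: null };
		if (result.outcome === "update" && result.supersedesId) {
			return { skip: false, supersedesId: result.supersedesId };
		}
		return keep;
	} catch {
		return keep; // evaluator failure: store the observation rather than lose it
	}
}
```

The loop body in the processor would then reduce to one call and two flat checks on the returned decision.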


284-294: Inconsistent indentation in the supersede try/catch block.

The } catch (error) { at Line 290 appears to be at a different indentation level than its corresponding try { at Line 285. While not a functional issue (JS ignores whitespace), it harms readability in an already deeply nested context.

src/ai/conflict-evaluator.ts (1)

96-113: isRetryable and sleep are duplicated across conflict-evaluator.ts and reranker.ts.

Both files define identical isRetryable and sleep helper functions. Consider extracting them into a shared utility (e.g., src/ai/retry-utils.ts) to reduce duplication.

src/ai/parser.ts (1)

228-237: EntityType is independently defined in both parser.ts and entities.ts.

The EntityType union is duplicated here (line 229) and in src/db/entities.ts (line 13). If one is updated without the other, they'll silently drift. Consider having one canonical definition and importing it in the other module.

For example, entities.ts could import from parser.ts (or both from a shared types file):

-export type EntityType =
-	| "technology"
-	| "library"
-	| "pattern"
-	| "concept"
-	| "file"
-	| "person"
-	| "project"
-	| "other";
+export type { EntityType } from "../ai/parser";

Comment thread src/db/schema.ts
@clopca
Owner Author

clopca commented Feb 7, 2026

@greptile review

clopca and others added 3 commits February 7, 2026 18:35
Add per-folder promise-based lock to serialize concurrent writes and clean up temp files on failure.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/tools/update.ts (1)

6-16: ⚠️ Potential issue | 🟡 Minor

Title description says "max 80 chars" but no .max(80) constraint is enforced.

Line 8 describes "Updated title (max 80 chars)" but the schema only specifies z.string().optional(). If this is a real constraint, add .max(80). If it's just guidance for the LLM agent, it's fine as-is, but the mismatch could confuse future developers.

🔧 Proposed fix (if the constraint is real)
-	title: z.string().optional().describe("Updated title (max 80 chars)"),
+	title: z.string().max(80).optional().describe("Updated title (max 80 chars)"),
src/context/progressive.ts (1)

38-70: ⚠️ Potential issue | 🟡 Minor

fullObservations bypass the token budget and totalTokens underreports.

While fullObservations are count-limited (via config.contextFullObservationCount, default 3) before being passed to buildProgressiveContext, they are not deducted from budget. This means:

  1. totalTokens only reflects summaries + filtered index entries, omitting the token cost of fullObservations.
  2. If the token budget is tight, some of the pre-selected full observations may not appear in the returned observationIndex (which is budget-filtered), creating a mismatch.

If this separation is intentional, document that the caller must account for fullObservations tokens separately. Otherwise, their token costs should be deducted from budget and included in totalTokens.
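
If the second option is chosen, the accounting could look roughly like the sketch below. The TokenCounted shape and the applyBudget helper are hypothetical (only fullObservations, observationIndex, budget and totalTokens are names from the code under review), and summaries are left out for brevity.

```typescript
// Hypothetical shapes; each item is assumed to carry a precomputed token count.
interface TokenCounted {
	id: string;
	tokens: number;
}

interface BudgetedContext {
	fullObservations: TokenCounted[];
	observationIndex: TokenCounted[];
	totalTokens: number;
}

function applyBudget(
	fullObservations: TokenCounted[],
	indexEntries: TokenCounted[],
	budget: number,
): BudgetedContext {
	let remaining = budget;
	let totalTokens = 0;

	// Charge full observations first so they count against the budget.
	const keptFull: TokenCounted[] = [];
	for (const obs of fullObservations) {
		if (obs.tokens > remaining) continue;
		keptFull.push(obs);
		remaining -= obs.tokens;
		totalTokens += obs.tokens;
	}

	// Index entries only get whatever budget is left.
	const keptIndex: TokenCounted[] = [];
	for (const entry of indexEntries) {
		if (entry.tokens > remaining) continue;
		keptIndex.push(entry);
		remaining -= entry.tokens;
		totalTokens += entry.tokens;
	}

	return { fullObservations: keptFull, observationIndex: keptIndex, totalTokens };
}
```

With this ordering totalTokens reflects everything that is actually injected, and a tight budget drops index entries before it drops full observations.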

🤖 Fix all issues with AI agents
In `@src/db/observations.ts`:
- Around line 243-259: The current LIKE `%...%` on o.concepts / o.files_read /
o.files_modified (built in the query generation using query.concepts,
query.files and escapeLike) can produce substring false-positives (e.g. "test"
matches "testing"); change the matching to require JSON-array element boundaries
by searching for the quoted element (escape the value with escapeLike and wrap
with '"…"' in the pattern) or, preferably, use native JSON operators if
supported (e.g. JSON_CONTAINS / json_each) instead of simple LIKE. Concretely,
update the clauses built in the block that maps query.concepts and query.files
so the pushed params are like `%\"${escapedValue}\"%` (or replace the whole LIKE
logic with JSON_CONTAINS(o.concepts, json_quote(value)) / equivalent) while
still using escapeLike for safe LIKE escaping; keep references to escapeLike,
query.concepts, query.files, o.concepts, o.files_read and o.files_modified when
making the change.

In `@src/db/schema.ts`:
- Around line 261-271: The migration "add-conflict-resolution-columns" adds
observations.superseded_by without a foreign-key; decide whether to prevent
dangling references and, if so, modify the migration to add a FK constraint on
superseded_by referencing observations(id) with ON DELETE SET NULL (or
alternatively add a cleanup trigger) so that when the superseding observation is
deleted the superseded_by value is cleared; update the migration in the version
8 block (name: "add-conflict-resolution-columns") to implement the chosen
behavior and ensure any index/DDL ordering respects the FK change.
🧹 Nitpick comments (10)
src/ai/provider.ts (2)

12-12: | string renders the string-literal union ineffective for type checking.

The union "anthropic" | "bedrock" | "openai" | "google" | string collapses to string in TypeScript, so the compiler won't flag typos like "opneai". If the intent is to allow arbitrary providers while still offering autocomplete, use the string & {} idiom:

♻️ Suggested change
-export type ProviderType = "anthropic" | "bedrock" | "openai" | "google" | string;
+export type ProviderType = "anthropic" | "bedrock" | "openai" | "google" | (string & {});
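
Note that the idiom only restores editor autocomplete; any string still type-checks, so a typo is not rejected at compile time. A small runtime guard can cover that gap. The sketch below is illustrative only: KNOWN_PROVIDERS and isKnownProvider are hypothetical names, not part of provider.ts.

```typescript
const KNOWN_PROVIDERS = ["anthropic", "bedrock", "openai", "google"] as const;
type KnownProvider = (typeof KNOWN_PROVIDERS)[number];

// `string & {}` keeps the literals visible to autocomplete
// while still accepting any other string.
type ProviderType = KnownProvider | (string & {});

// Runtime guard: the type alone will not reject a typo such as "opneai".
function isKnownProvider(value: string): value is KnownProvider {
	return (KNOWN_PROVIDERS as readonly string[]).includes(value);
}

// An arbitrary provider name still type-checks.
const customProvider: ProviderType = "my-local-provider";
```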

85-111: createEmbeddingModel ignores config.model and hardcodes embedding model names.

The caller passes a ModelConfig that includes a model field, but this function never uses it — it always selects a hardcoded embedding model per provider (e.g., "text-embedding-004" for Google, "text-embedding-3-small" for OpenAI). This is surprising because createModel does respect config.model.

If this is intentional (embedding models are provider-fixed), consider either:

  1. Adding an embeddingModel field to ModelConfig (or a separate config type) so callers can override it, or
  2. Documenting clearly on the function that config.model is unused and the embedding model is provider-determined.

At minimum, the current signature is misleading — callers might expect config.model to control which embedding model is created.
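
Option 1 could be sketched as follows. The embeddingModel field and the resolveEmbeddingModelName helper are hypothetical, the ModelConfig shape is reduced to the fields relevant here, and only the two default model names quoted above are filled in.

```typescript
// Hypothetical, reduced ModelConfig; `embeddingModel` is not in the current type.
interface ModelConfig {
	provider: string;
	model: string;
	embeddingModel?: string;
}

// Defaults for the two providers named in this comment; others are omitted.
const DEFAULT_EMBEDDING_MODELS = new Map<string, string>([
	["google", "text-embedding-004"],
	["openai", "text-embedding-3-small"],
]);

function resolveEmbeddingModelName(config: ModelConfig): string | null {
	// An explicit override wins. config.model is deliberately ignored
	// because it names the chat model, not the embedding model.
	return config.embeddingModel ?? DEFAULT_EMBEDDING_MODELS.get(config.provider) ?? null;
}
```

Returning null for providers without a default makes the "no embedding support" case explicit to the caller.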

src/daemon/worker.ts (1)

43-56: Minor: errors swallowed silently in the polling loop.

The catch block on line 51 discards all errors. While the comment explains the intent (polling errors must not crash the loop), consider logging at debug level so operational issues are diagnosable. Same applies to the .catch(() => {}) in handleMessage (line 89).
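
A minimal sketch of the idea, assuming a hypothetical pollOnce helper and an injected debug logger (neither exists in worker.ts as far as this review shows):

```typescript
type DebugLogger = (message: string, error: unknown) => void;

// Hypothetical helper: run one poll step and report failures
// instead of discarding them. Returns whether the step succeeded.
async function pollOnce(step: () => Promise<void>, debug: DebugLogger): Promise<boolean> {
	try {
		await step();
		return true;
	} catch (error) {
		// Still non-fatal: the loop keeps running, but the failure is visible.
		debug("[open-mem] poll step failed", error);
		return false;
	}
}
```

The same logger could replace the empty .catch(() => {}) in handleMessage.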

src/servers/http-server.ts (1)

75-86: redactConfig may over-redact non-secret fields like apiVersion.

The heuristic lowerKey.includes("api") will redact any string field whose key contains "api", including fields like apiVersion or apiEndpoint that aren't secrets. This is a conservative (safe) default, but it could confuse dashboard users who see ***REDACTED*** for non-sensitive config. Consider matching more precisely (e.g., lowerKey.includes("apikey") || lowerKey.includes("secret") || lowerKey.includes("token")).
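
A sketch of the narrower matching, under stated assumptions: the config is treated as a flat object, only string values are redacted (as in the current heuristic), and the "api_key" and "password" entries plus the helper names are additions for illustration.

```typescript
// Substrings that mark a key as secret; "api_key" and "password" are assumptions.
const SECRET_KEY_PARTS = ["apikey", "api_key", "secret", "token", "password"];

function isSecretKey(key: string): boolean {
	const lowerKey = key.toLowerCase();
	return SECRET_KEY_PARTS.some((part) => lowerKey.includes(part));
}

// Flat, non-recursive redaction; the real redactConfig may handle nesting.
function redactFlatConfig(config: Record<string, unknown>): Record<string, unknown> {
	const out: Record<string, unknown> = {};
	for (const [key, value] of Object.entries(config)) {
		const redact = typeof value === "string" && isSecretKey(key);
		out[key] = redact ? "***REDACTED***" : value;
	}
	return out;
}
```

Matching on "token" can still catch non-secret string fields whose key happens to contain it, so the list is a trade-off rather than a complete answer.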

src/queue/processor.ts (3)

286-296: Inconsistent indentation in the supersede try/catch block.

The } catch at Line 292 is indented at a different level than the try at Line 287, making the block structure hard to follow. Not a runtime issue, but it hurts readability in an already deeply nested method.

🔧 Proposed fix
 					if (conflictSupersedesId) {
 						try {
 							this.observationRepo.supersede(conflictSupersedesId, created.id);
 							console.log(
 								`[open-mem] Superseded observation ${conflictSupersedesId} with ${created.id}`,
 							);
-					} catch (error) {
-						// Supersede failure must not block observation creation
-						console.error(`[open-mem] Failed to supersede ${conflictSupersedesId}:`, error);
-					}
+						} catch (error) {
+							// Supersede failure must not block observation creation
+							console.error(`[open-mem] Failed to supersede ${conflictSupersedesId}:`, error);
+						}
 					}

127-330: Consider extracting the dedup/conflict and entity extraction logic into private methods.

processBatch is now ~220 lines with up to 7 levels of nesting in the conflict resolution path. Extracting these into dedicated private methods (e.g., resolveConflicts(observation, dedupEmbedding) and extractEntities(created)) would significantly improve readability and testability without changing behavior.


54-66: Constructor has grown to 11 parameters.

Consider grouping the AI-related dependencies into a single object (e.g., aiDeps: { conflictEvaluator, entityExtractor, entityRepo }) to keep the constructor manageable as new features are added.

src/db/observations.ts (1)

555-562: supersede doesn't verify that observationId or newObservationId exist.

If called with invalid IDs, the UPDATE silently affects zero rows. Consider returning a boolean or checking changes count so the caller knows whether the supersession actually took place, rather than silently succeeding.

🔧 Proposed fix
-	supersede(observationId: string, newObservationId: string): void {
+	supersede(observationId: string, newObservationId: string): boolean {
 		const now = new Date().toISOString();
 		this.db.run(
 			"UPDATE observations SET superseded_by = ?, superseded_at = ? WHERE id = ?",
 			[newObservationId, now, observationId],
 		);
+		const row = this.db.get<{ id: string }>(
+			"SELECT id FROM observations WHERE id = ? AND superseded_by = ?",
+			[observationId, newObservationId],
+		);
+		return !!row;
 	}
src/index.ts (2)

143-176: Conditional AI services follow a consistent pattern — minor note on entityRepo.

ConflictEvaluator and EntityExtractor are correctly gated behind their respective config flags + API key availability, matching the embeddingModel pattern above.

entityRepo (line 162) is instantiated unconditionally even when entityExtractionEnabled is false. Since it's a lightweight DB wrapper this is harmless, but for consistency with the other conditional components you could guard it or lazily create it. Not blocking.


349-357: Consider whether new types should be part of the public API.

The re-export surface hasn't been expanded despite adding significant new subsystems (entities, user memory, conflict evaluation). If consumers need to interact with these features programmatically (e.g., configuring entityExtractionEnabled or userMemoryEnabled), the corresponding config types or at least the feature flag names may need to be exported.

Comment thread src/db/observations.ts
Comment on lines +243 to +259
if (query.concepts && query.concepts.length > 0) {
	const conceptClauses = query.concepts.map(() => "o.concepts LIKE ? ESCAPE '\\'");
	sql += ` AND (${conceptClauses.join(" OR ")})`;
	for (const c of query.concepts) {
		params.push(`%${escapeLike(c)}%`);
	}
}
if (query.files && query.files.length > 0) {
	const fileClauses = query.files.map(
		() => "(o.files_read LIKE ? ESCAPE '\\' OR o.files_modified LIKE ? ESCAPE '\\')",
	);
	sql += ` AND (${fileClauses.join(" OR ")})`;
	for (const f of query.files) {
		const escaped = `%${escapeLike(f)}%`;
		params.push(escaped, escaped);
	}
}

⚠️ Potential issue | 🟡 Minor

Concept/file LIKE filters may produce false-positive matches.

The LIKE %concept% pattern matches substrings within the JSON array (e.g., searching for "test" matches "testing"). This is a known trade-off for simplicity vs. JSON parsing, but worth noting for users who might expect exact-match semantics on the concepts and files filter arrays.
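
If LIKE is kept, one way to get element-level matching is to include the JSON quotes in the pattern. The sketch below assumes the columns store JSON.stringify output of a string array and that escapeLike escapes \, % and _ for use with ESCAPE '\'; the escapeLike body shown here is a stand-in, and jsonElementPattern is a hypothetical helper.

```typescript
// Stand-in for the existing helper: escape \, % and _ for LIKE ... ESCAPE '\'.
function escapeLike(value: string): string {
	return value.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Build a LIKE pattern that matches a whole JSON array element.
function jsonElementPattern(value: string): string {
	// JSON.stringify adds the surrounding quotes and JSON-escapes the value,
	// so the pattern lines up with how the element appears in the stored text.
	return `%${escapeLike(JSON.stringify(value))}%`;
}

// Usage inside the existing loop would be along the lines of:
//   params.push(jsonElementPattern(c));
```

For the files filter this turns substring matching into exact-path matching, which may or may not be the intended semantics there.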


Comment thread src/db/schema.ts
…eanup trigger

Replace LIKE substring matching with json_each() for exact concept element matching and per-element file path matching. Add trg_clear_superseded_by trigger to v8 migration to prevent dangling superseded_by references on observation deletion.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@clopca clopca merged commit aa86fe2 into main Feb 7, 2026
1 check passed