feat: split libscope into focused sub-packages (@libscope/parsers, @libscope/core, @libscope/lite, etc.) #488

@RobertLD

Summary

libscope is currently a single package that bundles every concern — CLI, MCP server, REST API, web dashboard, all third-party connectors, and the search engine — into one install. This makes it impractical to use libscope as a library dependency: consumers who only want semantic search pull in the MCP SDK, Commander.js, Notion/Slack client code, and more. This issue proposes splitting the package into focused sub-packages organised as a monorepo with npm workspaces.

Problem / Motivation

Installing libscope as a library today means taking a dependency on @modelcontextprotocol/sdk, commander, node-cron, connector HTTP clients, and a full web dashboard — none of which are relevant to a consumer who just wants to index and search documents. The src/core/index.ts export re-exports API servers, connectors, and CLI utilities alongside the search/index APIs, making the public surface unclear.

The ./lite export already proves the pattern is viable: LibScopeLite is a clean, embeddable interface with no process lifecycle concerns. The rest of the codebase should follow the same principle.

Proposed Solution

Convert the repo to a monorepo using npm workspaces, extracting 8 packages, each with a single clear responsibility:

| Package | Contents |
| --- | --- |
| @libscope/parsers | Format converters only: markdown, PDF, DOCX, EPUB, PPTX, CSV, JSON, YAML, HTML → text. No DB, no search. |
| @libscope/core | Search engine, indexing, RAG, DB, embedding providers. Depends on @libscope/parsers. |
| @libscope/lite | Single-class embeddable wrapper (LibScopeLite). Formalises the existing ./lite export. |
| @libscope/connectors | Notion, Slack, Confluence, OneNote, Obsidian sync adapters. |
| @libscope/registry | Git-backed pack registry: sync, publish, resolve, checksum. |
| @libscope/mcp | MCP server for Claude/Claude Desktop integration. |
| @libscope/server | REST API + web dashboard. |
| libscope | CLI meta-package that depends on all of the above. The one justified kitchen-sink entry point. |

Dependency hierarchy:

@libscope/parsers   (no libscope deps)
@libscope/core      (parsers)
@libscope/lite      @libscope/registry
@libscope/connectors   @libscope/mcp   @libscope/server
libscope (CLI)
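
To make the hierarchy checkable, here is a minimal sketch that encodes it as a graph and derives a build order, failing on any cycle. Only two edges come from this issue: @libscope/core depends on @libscope/parsers, and libscope depends on everything. The other edges, and the `buildOrder` helper itself, are assumptions for illustration and should be adjusted once the real package.json files exist.

```typescript
// Package name -> packages it depends on.
type Graph = Record<string, string[]>;

const deps: Graph = {
  "@libscope/parsers": [],
  "@libscope/core": ["@libscope/parsers"], // stated in the issue
  "@libscope/lite": ["@libscope/core"], // assumed
  "@libscope/registry": [], // assumed: stdlib only
  "@libscope/connectors": ["@libscope/core"], // assumed
  "@libscope/mcp": ["@libscope/core"], // assumed
  "@libscope/server": ["@libscope/core"], // assumed
  libscope: [
    "@libscope/parsers",
    "@libscope/core",
    "@libscope/lite",
    "@libscope/registry",
    "@libscope/connectors",
    "@libscope/mcp",
    "@libscope/server",
  ], // stated: the CLI depends on all of the above
};

// Depth-first topological sort: dependencies come before their dependents.
// Throws if the graph contains a cycle.
function buildOrder(graph: Graph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (pkg: string, path: string[]): void => {
    if (state.get(pkg) === "done") return;
    if (state.get(pkg) === "visiting") {
      throw new Error(`Circular dependency: ${[...path, pkg].join(" -> ")}`);
    }
    state.set(pkg, "visiting");
    for (const dep of graph[pkg] ?? []) visit(dep, [...path, pkg]);
    state.set(pkg, "done");
    order.push(pkg);
  };

  for (const pkg of Object.keys(graph)) visit(pkg, []);
  return order;
}

console.log(buildOrder(deps).join("\n"));
```

A check like this could run in CI so that an accidental back-edge (for example, @libscope/core importing @libscope/connectors again) fails fast.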

A monorepo keeps coordinated cross-package changes in a single PR, shares CI config and lint setup, and avoids the overhead of 8 separate repositories. Packages can be extracted incrementally.

Acceptance Criteria

  • Repo is structured as an npm workspaces monorepo with a root package.json and per-package package.json files
  • @libscope/parsers builds and tests pass independently, with zero imports from any other @libscope/* package
  • @libscope/core builds and tests pass independently; its published install does not include CLI, MCP SDK, connector clients, or web server code
  • @libscope/lite is published as a standalone package and the existing libscope/lite import continues to resolve (backwards-compatible re-export or alias)
  • @libscope/connectors builds independently; src/core/scheduler.ts no longer imports connector modules directly (coupling inverted)
  • @libscope/registry builds independently with no circular deps on @libscope/core
  • @libscope/mcp and @libscope/server build independently
  • libscope CLI continues to work end-to-end (all existing commands pass integration tests)
  • src/core/index.ts no longer re-exports API servers, connectors, or CLI utilities — public surface is search/index/document/topic/tag/RAG only
  • CI passes for all packages on Node 20 and 22
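
The "zero imports from any other @libscope/* package" criterion can be enforced mechanically. The sketch below is one possible check, not an existing libscope tool: `findForbiddenImports` is a hypothetical helper that scans source text for import specifiers and reports any `@libscope/*` package outside an allow-list. It is regex-based, so it will also match specifiers inside comments or strings; a real check might use the TypeScript compiler API or an ESLint rule instead.

```typescript
// Returns every @libscope/* import specifier in `source` that is not allowed.
// Recognises `import ... from "x"`, `export ... from "x"`, `import "x"`,
// `import("x")` and `require("x")`.
function findForbiddenImports(
  source: string,
  allowed: readonly string[] = [],
): string[] {
  const importRe =
    /(?:from\s*|import\s*\(\s*|require\s*\(\s*|import\s+)["']([^"']+)["']/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;

  while ((match = importRe.exec(source)) !== null) {
    const spec = match[1];
    if (!spec.startsWith("@libscope/")) continue;
    // Allow the package itself and any of its subpaths.
    const isAllowed = allowed.some(
      (pkg) => spec === pkg || spec.startsWith(pkg + "/"),
    );
    if (!isAllowed) found.push(spec);
  }
  return found;
}

// Example: a file inside @libscope/core may import parsers, nothing else.
const sample = [
  'import { parse } from "@libscope/parsers";',
  'import fs from "node:fs";',
  'const mcp = require("@libscope/mcp");',
].join("\n");

console.log(findForbiddenImports(sample, ["@libscope/parsers"]));
```

For @libscope/parsers the allow-list would be empty, so any `@libscope/*` specifier is reported.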

Out of Scope

  • Internal implementation details of each package (tracked in sub-issues linked below)
  • Renaming or changing the public API of any existing module
  • ESM migration — tracked separately; packages stay CommonJS to match current build target
  • Publishing all packages to npm simultaneously — packages will be extracted incrementally

Technical Notes

Already clean (low-effort extractions):

  • src/core/parsers/ — zero upward imports; can be extracted immediately
  • src/lite/ — already has a clean export boundary; just needs its own package.json
  • src/connectors/ — no circular deps with core
  • src/registry/ — depends almost entirely on Node.js stdlib (fs, path, crypto, zlib)

Known coupling to resolve:

  • src/core/scheduler.ts imports connectors directly — needs inverting before @libscope/connectors can be independent
  • src/core/index.ts re-exports src/api/server.ts, src/web/server.ts, and connector utilities — these need removing before @libscope/core can be published cleanly
  • src/LibScope.ts is the main orchestrator and will stay in the libscope CLI package
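
One way to invert the scheduler coupling is for core to define a small adapter contract and have the CLI package register concrete connectors at startup. The sketch below is illustrative only: `SyncAdapter`, `Scheduler.register`, and `runAll` are hypothetical names, not the current API of src/core/scheduler.ts, and the cron/timing logic is left out.

```typescript
// Hypothetical contract owned by @libscope/core. Connector packages
// implement it; core never imports a connector module.
interface SyncAdapter {
  readonly name: string;
  // Resolves to the number of documents synced (illustrative return value).
  sync(): Promise<number>;
}

class Scheduler {
  private readonly adapters = new Map<string, SyncAdapter>();

  // Called by the composition root (e.g. the libscope CLI package).
  register(adapter: SyncAdapter): void {
    if (this.adapters.has(adapter.name)) {
      throw new Error(`Adapter already registered: ${adapter.name}`);
    }
    this.adapters.set(adapter.name, adapter);
  }

  // Runs every registered adapter in registration order.
  async runAll(): Promise<Record<string, number>> {
    const results: Record<string, number> = {};
    for (const adapter of Array.from(this.adapters.values())) {
      results[adapter.name] = await adapter.sync();
    }
    return results;
  }
}

// Composition root: stand-in adapters where real connectors would be wired.
const scheduler = new Scheduler();
scheduler.register({ name: "notion", sync: async () => 3 });
scheduler.register({ name: "slack", sync: async () => 5 });

scheduler.runAll().then((results) => console.log(results));
```

With this shape, the dependency points from @libscope/connectors (or the CLI) to core's interface, so @libscope/core can be published without any connector code.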

Shared test fixtures (tests/fixtures/) are used across all modules today — they'll need to move into each package or a @libscope/testing internal package.

Labels: enhancement, refactor, tech-debt