fix: stream SQL database dumps #206

Open

Treasure520520 wants to merge 2 commits into outerbase:main from Treasure520520:bounty/59-streaming-dump

Conversation

@Treasure520520

/claim #59

Summary

This is a focused streaming slice for the large database dump issue.

Instead of building the entire SQL dump in a single in-memory string/blob, /export/dump now returns a ReadableStream and writes the dump incrementally (see the sketch after this list):

  • emits the existing SQLite header first
  • loads table schemas one table at a time
  • reads table rows in bounded LIMIT ? OFFSET ? pages
  • streams each batch of INSERT statements as it is produced
  • quotes table identifiers and escapes SQL values, including NULL, booleans, bigint, and binary values
  • preserves the existing download headers and response shape for callers
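
A rough TypeScript sketch of that shape is below. It is illustrative only: `runQuery`, `PAGE_SIZE`, and the header text are stand-ins rather than this PR's actual helpers, and the producer loop sits inside `start()` here, which is the shape the review below comments on.

```typescript
// Illustrative sketch of the incremental dump described above.
const PAGE_SIZE = 500

// Hypothetical query helper: returns rows for a SQL statement.
declare function runQuery<T = Record<string, unknown>>(
    sql: string,
    params?: unknown[]
): Promise<T[]>

// Escape a single value for an INSERT statement (NULL, booleans, bigint, binary).
function escapeSqlValue(value: unknown): string {
    if (value === null || value === undefined) return 'NULL'
    if (typeof value === 'number' || typeof value === 'bigint') return value.toString()
    if (typeof value === 'boolean') return value ? '1' : '0'
    if (value instanceof Uint8Array)
        return `X'${[...value].map((b) => b.toString(16).padStart(2, '0')).join('')}'`
    return `'${String(value).replace(/'/g, "''")}'`
}

function createDumpStream(): ReadableStream<Uint8Array> {
    const encoder = new TextEncoder()
    return new ReadableStream<Uint8Array>({
        async start(controller) {
            // 1. Emit the header first.
            controller.enqueue(
                encoder.encode('-- SQLite dump\nPRAGMA foreign_keys=OFF;\n')
            )
            // 2. Load table schemas one table at a time.
            const tables = await runQuery<{ name: string; sql: string }>(
                "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
            )
            for (const table of tables) {
                controller.enqueue(encoder.encode(`${table.sql};\n`))
                // 3. Read rows in bounded LIMIT/OFFSET pages and stream each batch.
                for (let offset = 0; ; offset += PAGE_SIZE) {
                    const rows = await runQuery(
                        `SELECT * FROM "${table.name}" LIMIT ? OFFSET ?`,
                        [PAGE_SIZE, offset]
                    )
                    if (rows.length === 0) break
                    const batch = rows
                        .map(
                            (row) =>
                                `INSERT INTO "${table.name}" VALUES (${Object.values(row)
                                    .map(escapeSqlValue)
                                    .join(', ')});`
                        )
                        .join('\n')
                    controller.enqueue(encoder.encode(batch + '\n'))
                }
            }
            controller.close()
        },
    })
}
```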

This does not attempt to solve the full async/R2 continuation path in one PR. It removes the immediate O(database dump size) string allocation while keeping the existing endpoint behavior intact, so it can compose with later R2/alarm work.

Validation

  • ./node_modules/.bin/prettier --check src/export/dump.ts src/export/dump.test.ts
  • ./node_modules/.bin/vitest run src/export/dump.test.ts (6 tests)
  • git diff --check

The repo-wide ./node_modules/.bin/tsc --noEmit is still blocked by pre-existing type errors outside this PR (src/do.ts, src/operation.ts, existing tests). The focused dump tests and formatting checks pass.

@Treasure520520
Author

Adding a short demo GIF per the Algora bounty review guidelines:

StarbaseDB #59 streaming dump demo

It summarizes the focused change in this PR: stream the SQL dump response, read rows in bounded pages, and avoid a single full-database dump string allocation.

@digzrow-coder left a comment

I think this still has the memory failure mode the issue is trying to remove. The export producer runs inside the stream's start() callback and loops through every table/page, calling controller.enqueue(...) without checking/awaiting backpressure. In Web Streams, start() is kicked off when the stream is created; it is not tied to downstream reads. If the client is slow, or if the runtime cannot flush the response as fast as SQLite pages are read, this can queue arbitrarily many encoded chunks in the isolate. A 10GB dump can therefore still grow toward dump-size memory, just in the stream queue instead of in one string/blob.

A safer shape is to make the stream pull-driven: do one schema/page of work from pull(controller) (or use a TransformStream/writer path that awaits backpressure) and only fetch the next SQLite page after the previous queued chunk has been consumed. That keeps memory bounded by a small number of pages instead of the full dump size.
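
As an illustrative sketch of that pull-driven shape (the names and the `fetchPage` signature are hypothetical, not this PR's code): each `pull()` emits at most one chunk, so with the default queuing strategy roughly one chunk sits in the queue at a time.

```typescript
// Hypothetical pull-driven dump: one schema or one page per pull().
function createPullDrivenDump(
    tables: { name: string; sql: string }[],
    fetchPage: (table: string, page: number) => Promise<string | null>
): ReadableStream<Uint8Array> {
    const encoder = new TextEncoder()
    let tableIndex = 0
    let page = -1 // -1 means "emit this table's schema next"
    return new ReadableStream<Uint8Array>({
        start(controller) {
            // Only the small header is emitted eagerly.
            controller.enqueue(encoder.encode('-- SQLite dump\n'))
        },
        async pull(controller) {
            if (tableIndex >= tables.length) {
                controller.close()
                return
            }
            const table = tables[tableIndex]
            if (page < 0) {
                controller.enqueue(encoder.encode(`${table.sql};\n`))
                page = 0
                return
            }
            const chunk = await fetchPage(table.name, page)
            if (chunk === null) {
                // Table exhausted: the next pull() emits the next table's schema.
                tableIndex += 1
                page = -1
                return
            }
            page += 1
            controller.enqueue(encoder.encode(chunk))
        },
    })
}
```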

@Treasure520520
Author

Thanks for the careful review. I pushed 5fb75a9 to address this directly.

The dump stream is now pull-driven instead of doing the table/page loop inside start(): each pull() emits at most one header/schema/page/table-separator chunk, and the next SQLite page is only fetched after the previously queued chunk has been consumed by the stream reader. That keeps the producer bounded by the active page/chunk instead of enqueueing the full dump when the client is slow.

I also added a regression test that reads only through the schema chunk and verifies the next OFFSET 500 page has not been requested yet.
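
A hypothetical vitest shape for that regression test (the import path, factory name, and mock query helper are stand-ins for the PR's actual test):

```typescript
import { describe, expect, it, vi } from 'vitest'

// Hypothetical import: the real factory name and signature in src/export/dump.ts may differ.
import { createDumpStream } from './dump'

describe('streaming dump backpressure', () => {
    it('does not fetch the OFFSET 500 page before the schema chunk is consumed', async () => {
        const offsetsRequested: number[] = []
        const runQuery = vi.fn(async (sql: string, params: unknown[] = []) => {
            if (sql.includes('sqlite_master')) {
                return [{ name: 'users', sql: 'CREATE TABLE users (id INTEGER)' }]
            }
            // Assumes the paged row query is parameterized as LIMIT ?, OFFSET ?.
            offsetsRequested.push(Number(params[1]))
            return [{ id: 1 }]
        })

        const reader = createDumpStream(runQuery).getReader()
        await reader.read() // header chunk
        await reader.read() // schema chunk for "users"

        // With a pull-driven stream, the second page is only requested after the
        // first page chunk has been read, so OFFSET 500 must not appear yet.
        expect(offsetsRequested).not.toContain(500)
    })
})
```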

Validation re-run:

  • npx --yes vitest run src/export/dump.test.ts
  • npx --yes prettier --check src/export/dump.ts src/export/dump.test.ts
  • git diff --check
