Parser Performance Comparison

Date: 2026-03-24
Host: AMD Ryzen 9 5950X 16-Core
All parsers built with -O3


Compared Parsers

| Parser | Language | Type | Notes |
|---|---|---|---|
| ParserSQL (this project) | C++17 | Hand-written recursive descent | Arena alloc, zero-copy, proxy-optimized |
| libpg_query v17 | C | PostgreSQL's Bison parser, extracted | Two modes: raw parse (AST only) and full (AST + JSON serialization) |
| sqlparser-rs v0.53 | Rust | Hand-written recursive descent | General-purpose SQL parser, builds typed AST |

Results

All parsers on the same queries

| Query | ParserSQL | pg_query raw | pg_query +JSON | sqlparser-rs |
|---|---|---|---|---|
| SELECT col FROM t WHERE id = 1 | 175 ns | 718 ns | 2,018 ns | 4,687 ns |
| SELECT ... JOIN ... WHERE | 440 ns | 1,745 ns | 4,804 ns | 10,684 ns |
| SELECT ... GROUP BY ... HAVING ... ORDER BY ... LIMIT | 975 ns | 3,479 ns | 9,082 ns | 23,411 ns |
| INSERT INTO t (cols) VALUES (...) | 212 ns | 849 ns | 1,933 ns | 3,784 ns |
| UPDATE t SET ... WHERE | 214 ns | | 2,075 ns | 4,102 ns |
| DELETE FROM t WHERE | 149 ns | | 1,512 ns | 3,049 ns |
| BEGIN | 29 ns | 259 ns | 441 ns | 412 ns |

Speedup ratios

| Query | vs pg_query (raw parse) | vs pg_query (+JSON) | vs sqlparser-rs |
|---|---|---|---|
| SELECT simple | 4.1x | 11.5x | 27x |
| SELECT JOIN | 4.0x | 10.9x | 24x |
| SELECT complex | 3.6x | 9.3x | 24x |
| INSERT | 4.0x | 9.1x | 18x |
| BEGIN | 8.9x | 15.2x | 14x |
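
Each ratio above is the other parser's time divided by ParserSQL's time for the same query. A minimal sketch of that arithmetic follows; the helper names `speedup` and `round1` are illustrative and are not taken from scripts/run_comparison.sh.

```cpp
#include <cassert>
#include <cmath>

// Speedup of ParserSQL over another parser: how many times longer the
// other parser takes on the same query. Both arguments are nanoseconds.
double speedup(double other_ns, double parsersql_ns) {
    assert(parsersql_ns > 0.0);
    return other_ns / parsersql_ns;
}

// Round to one decimal place, the precision used for the pg_query columns.
double round1(double x) {
    return std::round(x * 10.0) / 10.0;
}
```

For the simple SELECT, `round1(speedup(718.0, 175.0))` gives 4.1 and `round1(speedup(2018.0, 175.0))` gives 11.5, matching the first row.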

Analysis

vs libpg_query (fair comparison: raw parse only)

ParserSQL is about 4x faster than PostgreSQL's own parser on the SELECT and INSERT queries (3.6x to 4.1x) and 8.9x faster on BEGIN, comparing parse-only time with no JSON serialization. The speedup comes from:

  1. Arena allocation — our parser allocates AST nodes from a bump allocator (3.5ns reset). PostgreSQL uses its MemoryContext system which is more general but has higher per-allocation overhead.
  2. Zero-copy strings — our StringRef points into the original input. PostgreSQL copies identifier strings into palloc'd memory.
  3. Simpler AST — our 32-byte AstNode vs PostgreSQL's richly-typed node structs (>100 bytes each).

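Note: libpg_query's raw parse still includes PostgreSQL memory context setup/teardown per call. Our parser reuses the arena across calls (just a pointer rewind). That fixed per-call cost is a plausible reason the gap is widest on BEGIN, where there is almost nothing to parse.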
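
The three points above (bump allocation, strings that point into the input, a 32-byte node) can be sketched as follows. This is an illustration under assumptions, not ParserSQL's actual code: the `Arena` interface, the `AstNode` field layout, and `make_node` are hypothetical, and the 32-byte size holds only on platforms with 8-byte pointers. The `text`/`text_len` pair plays the role of a StringRef.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical bump allocator: one fixed buffer and one offset.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buf_(new std::byte[capacity]), cap_(capacity), used_(0) {}

    // Returns storage at an aligned offset, or nullptr when the buffer is full.
    // Offsets are aligned relative to the buffer start, so `align` should not
    // exceed the alignment that operator new[] guarantees.
    void* allocate(std::size_t size, std::size_t align) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > cap_ || size > cap_ - start) return nullptr;
        used_ = start + size;
        return buf_.get() + start;
    }

    // Releasing every node at once is a single store: rewind the offset.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<std::byte[]> buf_;
    std::size_t cap_;
    std::size_t used_;
};

// Hypothetical node layout. `text` points into the original SQL buffer
// (zero-copy), so the input must outlive every node that refers to it.
struct AstNode {
    std::uint16_t type;
    std::uint16_t flags;
    std::uint32_t text_len;
    const char* text;
    AstNode* first_child;
    AstNode* next_sibling;
};
static_assert(sizeof(void*) != 8 || sizeof(AstNode) == 32,
              "expected 32 bytes with 8-byte pointers");

// Constructs a leaf node in the arena without copying the token text.
AstNode* make_node(Arena& arena, std::uint16_t type,
                   const char* text, std::uint32_t len) {
    void* mem = arena.allocate(sizeof(AstNode), alignof(AstNode));
    if (mem == nullptr) return nullptr;
    return new (mem) AstNode{type, 0, len, text, nullptr, nullptr};
}
```

After `arena.reset()`, the next `make_node` call reuses the same storage as the first node of the previous parse. That reuse is what lets one arena serve many parse calls without any per-node free.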

vs libpg_query (full: parse + JSON serialize)

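The 9-15x ratio includes JSON serialization overhead in libpg_query. In these measurements serialization costs more than the parse itself for every query except BEGIN (for the simple SELECT, 2,018 ns with JSON against 718 ns raw). This is how pg_query_parse() is typically used, but it's not a fair parse-only comparison.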

vs sqlparser-rs

ParserSQL is 14-27x faster. This is the fairest comparison — both are standalone syntax parsers. The speed gap comes from:

  1. Arena vs heap — sqlparser-rs uses Box<Expr>, Vec<SelectItem>, etc. Each allocation goes through Rust's allocator.
  2. Zero-copy vs owned strings — sqlparser-rs creates String values. We use StringRef pointing into input.
  3. 32-byte nodes vs enum variants — Rust's richly-typed AST enums carry more data per node.
  4. Hash table keyword lookup — O(1) FNV-1a hash vs tokenizer construction in sqlparser-rs.
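
To illustrate the hash-based keyword lookup in point 4, here is a minimal sketch of a 32-bit FNV-1a hash feeding a small open-addressing table. The `Keyword` enum, the `KeywordTable` class, the table size, and the case-folding choice are all assumptions made for this example; ParserSQL's real table may be laid out differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

enum class Keyword : std::uint8_t { None, Select, From, Where, Insert, Begin };

// ASCII-only upper-casing so that "select" and "SELECT" hash alike.
constexpr char ascii_upper(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

// 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
constexpr std::uint32_t fnv1a(std::string_view s) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    for (char c : s) {
        h ^= static_cast<std::uint8_t>(ascii_upper(c));
        h *= 16777619u;             // FNV prime
    }
    return h;
}

// Hypothetical fixed-size table with linear probing. It holds far fewer
// keywords than slots, so every probe sequence reaches an empty slot.
class KeywordTable {
public:
    KeywordTable() {
        insert("SELECT", Keyword::Select);
        insert("FROM", Keyword::From);
        insert("WHERE", Keyword::Where);
        insert("INSERT", Keyword::Insert);
        insert("BEGIN", Keyword::Begin);
    }

    // Expected constant time: one hash plus a short probe.
    Keyword lookup(std::string_view word) const {
        std::size_t i = fnv1a(word) & (kSlots - 1);
        while (slots_[i].kw != Keyword::None) {
            if (equal_ignore_case(slots_[i].text, word)) return slots_[i].kw;
            i = (i + 1) & (kSlots - 1);
        }
        return Keyword::None;  // not a keyword: treat as an identifier
    }

private:
    struct Slot {
        std::string_view text;
        Keyword kw = Keyword::None;
    };
    static constexpr std::size_t kSlots = 16;  // power of two, for masking
    Slot slots_[kSlots] = {};

    void insert(std::string_view text, Keyword kw) {
        std::size_t i = fnv1a(text) & (kSlots - 1);
        while (slots_[i].kw != Keyword::None) i = (i + 1) & (kSlots - 1);
        slots_[i] = Slot{text, kw};
    }

    static bool equal_ignore_case(std::string_view a, std::string_view b) {
        if (a.size() != b.size()) return false;
        for (std::size_t i = 0; i < a.size(); ++i) {
            if (ascii_upper(a[i]) != ascii_upper(b[i])) return false;
        }
        return true;
    }
};
```

Hashing the bytes in place means a lookup needs no temporary upper-cased copy of the word, which fits the zero-copy approach described above.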

Methodology

  • All benchmarks use Google Benchmark (C++) or criterion (Rust)
  • Same queries used across all parsers
  • PostgreSQL-compatible SQL to keep the query set common
  • Each benchmark runs millions of iterations; reported times are per-operation median
  • Machine: AMD Ryzen 9 5950X, Linux 6.17, GCC 13.3 -O3
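
For readers without Google Benchmark installed, the "per-operation median" idea can be approximated with the standard library alone. The sketch below is not the project's harness and is not equivalent to Google Benchmark, which also handles warm-up and iteration-count selection; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Median of a non-empty sample set (mean of the two middle values when even).
double median(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    std::size_t mid = samples.size() / 2;
    return samples.size() % 2 != 0
               ? samples[mid]
               : (samples[mid - 1] + samples[mid]) / 2.0;
}

// Times `iters` back-to-back calls of `op` and returns nanoseconds per call.
template <typename Op>
double ns_per_op(Op&& op, std::size_t iters) {
    assert(iters > 0);
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i) op();
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iters);
}

// Repeats the timed run and reports the median across repetitions, which
// is less sensitive to a single noisy run than the mean.
template <typename Op>
double median_ns_per_op(Op&& op, std::size_t iters, std::size_t repetitions) {
    std::vector<double> samples;
    samples.reserve(repetitions);
    for (std::size_t r = 0; r < repetitions; ++r) {
        samples.push_back(ns_per_op(op, iters));
    }
    return median(std::move(samples));
}
```

The operation being timed should have an observable side effect (for example, writing its result to a `volatile` variable), or the compiler may optimize the loop away at -O3.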

Generated by scripts/run_comparison.sh