This report is auto-generated by scripts/benchmark_distributed.sh.
# Start the 2-shard MySQL demo
./scripts/start_sharding_demo.sh
# Start the single-backend baseline
./scripts/setup_single_backend.sh
# Build and run the full benchmark suite
make bench-distributed
./scripts/benchmark_distributed.shOr run the benchmark tool directly:
# 2-shard distributed
./bench_distributed \
--backend "mysql://root:test@127.0.0.1:13306/testdb?name=shard1" \
--backend "mysql://root:test@127.0.0.1:13307/testdb?name=shard2" \
--shard "users:id:shard1,shard2" \
--shard "orders:id:shard1,shard2"
# Single-backend baseline
./bench_distributed \
--backend "mysql://root:test@127.0.0.1:13308/testdb?name=single" \
--shard "users:id:single" \
--shard "orders:id:single"Each query goes through 5 stages, each independently timed:
- Parse -- Tokenize and build AST
- Plan -- Convert AST to logical plan tree
- Optimize -- Apply rewrite rules (predicate pushdown, constant folding, etc.)
- Distribute -- Rewrite plan for multi-shard execution (RemoteScan, MergeSort, etc.)
- Execute -- Run operators, fetch data from backends, merge results
| # | Name | SQL | Description |
|---|---|---|---|
| 1 | full_scan | SELECT * FROM users |
Scan all rows from both shards |
| 2 | filter_pushdown | SELECT name, age, salary FROM users WHERE dept = 'Engineering' |
Filter pushed to both shards |
| 3 | distributed_agg | SELECT dept, COUNT(*) FROM users GROUP BY dept |
Count by department, merged |
| 4 | sort_limit | SELECT name, salary FROM users ORDER BY salary DESC LIMIT 3 |
Top-3 via merge-sort |
| 5 | cross_shard_join | SELECT u.name, o.total, o.status FROM users u JOIN orders o ON u.id = o.user_id |
Cross-shard join |
| 6 | expression_only | SELECT 1 + 2, UPPER('distributed'), ... |
Pure expression, no backend |
| 7 | subquery | SELECT name, age FROM users WHERE age > (SELECT AVG(age) FROM users) |
Correlated subquery |
Vitess is Google's database clustering system for horizontal scaling of MySQL.
| Feature | Our Engine | Vitess |
|---|---|---|
| Proxy layer | Single binary | vtgate + vttablet per shard |
| Query parsing | Custom zero-alloc C++ parser | sqlparser (Go) |
| Planning | Single-pass plan builder | vtgate planner (Gen4) |
| Optimization | Rule-based (4 rules) | Cost-based (Gen4) |
| Shard routing | ShardMap + hash-based | Vindexes (pluggable) |
| Cross-shard joins | Hash join + merge sort | Scatter-gather |
| Aggregation | MergeAggregate | Ordered aggregate on vtgate |
For a direct Vitess comparison, set up their local example and run equivalent queries through their MySQL protocol endpoint (port 15306).
| Query | 2-shard (p50) | 1-shard (p50) | Overhead |
|---|---|---|---|
| full_scan | 371 us | 190 us | 1.95x |
| filter_pushdown | 402 us | 204 us | 1.97x |
| distributed_agg | 410 us | 213 us | 1.92x |
| sort_limit | 402 us | 193 us | 2.08x |
| cross_shard_join | 728 us | 374 us | 1.95x |
| expression_only | 1.4 us | 1.2 us | 1.17x |
| subquery | 4.33 ms | 2.13 ms | 2.03x |
| Query | Parse | Plan | Optimize | Distribute | Execute |
|---|---|---|---|---|---|
| full_scan | 1.1 us | 419 ns | 175 ns | 656 ns | 368 us |
| filter_pushdown | 1.5 us | 617 ns | 358 ns | 1.3 us | 403 us |
| distributed_agg | 1.6 us | 523 ns | 469 ns | 1.3 us | 416 us |
| sort_limit | 1.5 us | 575 ns | 320 ns | 905 ns | 407 us |
| cross_shard_join | 1.9 us | 662 ns | 403 ns | 1.2 us | 747 us |
| expression_only | 516 ns | 92 ns | 433 ns | 28 ns | 312 ns |
| subquery | 2.6 us | 590 ns | 800 ns | 922 ns | 4.40 ms |
- Parse + Plan + Optimize + Distribute combined account for less than 1% of total latency for any query that touches a backend. The entire planning pipeline completes in under 5 us even for the most complex query (subquery).
- Execute dominates at 99%+ of wall time for all backend queries. This is expected because network I/O to MySQL backends is the bottleneck.
- 2-shard overhead is ~2x for most queries, which is the expected cost of making two network round-trips instead of one. The engine fetches from both shards in parallel where possible.
- Cross-shard join is ~2x the single-shard join because both tables must be fetched from two backends (4 total round-trips for users + orders).
- Expression-only queries (no backend) complete in ~1.4 us, showing the raw overhead of the parse-plan-optimize-execute pipeline with zero I/O.
- Subquery is the most expensive at 4.3ms due to multiple backend round-trips for the inner AVG query plus the outer filtered scan.