Set aggregation hash seed by ctsk · Pull Request #16165 · apache/datafusion

ctsk · 2025-05-23T12:11:27Z

This PR hard-codes the seed for the hash aggregation. The main benefit over the previous runtime-determined seed is that partial aggregation and final aggregation now share the same hash function.
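
To make the "same hash function" point concrete, here is a minimal sketch using only the Rust standard library. It is not the DataFusion code: `FixedState`, `hash_with`, `agreeing_keys`, `fixed_seed_agreement` and `runtime_seeded_agreement` are hypothetical names, and std's `DefaultHasher` stands in for ahash. It only illustrates that two hasher builders created independently agree on every key when the seed is fixed, and essentially never agree when each one picks its own seed at runtime.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hypothetical stand-in for a fixed-seed hasher builder: every instance
/// starts from the same state (std's DefaultHasher, not ahash).
type FixedState = BuildHasherDefault<DefaultHasher>;

/// Hash a single group key with the given builder.
fn hash_with<S: BuildHasher>(state: &S, key: u64) -> u64 {
    let mut hasher = state.build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Count the keys for which two builders produce identical hash values.
fn agreeing_keys<A: BuildHasher, B: BuildHasher>(a: &A, b: &B, keys: &[u64]) -> usize {
    keys.iter()
        .filter(|&&k| hash_with(a, k) == hash_with(b, k))
        .count()
}

/// Two operators that share a fixed seed (models the behavior after this PR).
fn fixed_seed_agreement(keys: &[u64]) -> usize {
    agreeing_keys(&FixedState::default(), &FixedState::default(), keys)
}

/// Two operators that each pick their own seed at runtime
/// (models the previous behavior).
fn runtime_seeded_agreement(keys: &[u64]) -> usize {
    agreeing_keys(&RandomState::new(), &RandomState::new(), keys)
}
```

Calling `fixed_seed_agreement` on any key slice returns the slice length, because both builders start from the same state. `runtime_seeded_agreement` should return a count at or near zero, since each `RandomState::new()` is initialized with its own keys.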

I haven't measured it, but in theory this should make the final aggregation step more efficient: the partial aggregation will emit the group values in an order that is clustered in the final aggregation hash table, giving a beneficial memory access pattern when building the final aggregation.

I expect it to slightly speed up large-cardinality aggregations that don't trigger skipping of the partial aggregation step.
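
The clustering argument can be illustrated with a toy model. This is a sketch under loudly stated assumptions, not a measurement of DataFusion: it assumes the partial stage emits groups in bucket order, that the bucket index is the hash masked by a power-of-two capacity, and that both tables have the same capacity. None of these is confirmed in this thread. `seeded_hash`, `bucket` and `total_jump` are hypothetical helpers, and std's `DefaultHasher` replaces ahash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy seeded hash: mixes a seed into std's DefaultHasher.
/// (Assumption for illustration only; DataFusion uses ahash.)
fn seeded_hash(seed: u64, key: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    key.hash(&mut hasher);
    hasher.finish()
}

/// Bucket index in a table with `capacity` slots; `capacity` must be a
/// power of two so that masking selects a valid slot.
fn bucket(seed: u64, key: u64, capacity: u64) -> u64 {
    seeded_hash(seed, key) & (capacity - 1)
}

/// Total distance between consecutively touched buckets in the final table
/// when keys arrive in the partial table's bucket order.
fn total_jump(partial_seed: u64, final_seed: u64, keys: &[u64], capacity: u64) -> u64 {
    // Model the partial stage emitting its groups in bucket order.
    let mut emitted = keys.to_vec();
    emitted.sort_by_key(|&k| bucket(partial_seed, k, capacity));

    // Sum the distance between successive bucket positions in the final table.
    emitted
        .windows(2)
        .map(|w| {
            let a = bucket(final_seed, w[0], capacity);
            let b = bucket(final_seed, w[1], capacity);
            a.abs_diff(b)
        })
        .sum()
}
```

In this model, `total_jump(1, 1, &keys, 1 << 14)` stays below the table capacity because the final table is walked front to back, while `total_jump(1, 2, &keys, 1 << 14)` hops around the table at random and comes out far larger. Whether real hash tables and real emit order behave like this is exactly what the benchmarks in this thread are meant to check.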

alamb · 2025-05-24T11:39:13Z

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing fix/aggregation-seed (ad1d5a3) to ce835da diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb · 2025-05-24T11:39:32Z

datafusion/physical-plan/src/aggregates/mod.rs

 mod topk_stream;

+/// Hard-coded seed for aggregations to ensure hash values differ from `RepartitionExec`, avoiding collisions.
+const AGGREGATION_HASH_SEED: ahash::RandomState =


alamb

Thanks @ctsk -- I started some benchmark runs just to be sure, but I think this looks good to me

alamb · 2025-05-24T12:31:30Z

🤖: Benchmark completed

Details

Comparing HEAD and fix_aggregation-seed
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ fix_aggregation-seed ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  1916.88ms │            1804.74ms │ +1.06x faster │
│ QQuery 1     │   692.65ms │             736.98ms │  1.06x slower │
│ QQuery 2     │  1413.15ms │            1478.90ms │     no change │
│ QQuery 3     │   689.13ms │             671.61ms │     no change │
│ QQuery 4     │  1446.11ms │            1442.70ms │     no change │
│ QQuery 5     │ 14897.86ms │           15181.61ms │     no change │
│ QQuery 6     │  2087.94ms │            2036.04ms │     no change │
│ QQuery 7     │  2162.26ms │            2113.35ms │     no change │
│ QQuery 8     │   835.26ms │             848.29ms │     no change │
└──────────────┴────────────┴──────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 26141.23ms │
│ Total Time (fix_aggregation-seed)   │ 26314.22ms │
│ Average Time (HEAD)                 │  2904.58ms │
│ Average Time (fix_aggregation-seed) │  2923.80ms │
│ Queries Faster                      │          1 │
│ Queries Slower                      │          1 │
│ Queries with No Change              │          7 │
└─────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ fix_aggregation-seed ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │    15.99ms │              15.09ms │ +1.06x faster │
│ QQuery 1     │    32.41ms │              32.70ms │     no change │
│ QQuery 2     │    81.45ms │              80.66ms │     no change │
│ QQuery 3     │    95.69ms │             100.75ms │  1.05x slower │
│ QQuery 4     │   599.29ms │             577.82ms │     no change │
│ QQuery 5     │   872.63ms │             851.96ms │     no change │
│ QQuery 6     │    23.86ms │              22.72ms │     no change │
│ QQuery 7     │    36.25ms │              37.82ms │     no change │
│ QQuery 8     │   884.23ms │             907.67ms │     no change │
│ QQuery 9     │  1173.09ms │            1172.75ms │     no change │
│ QQuery 10    │   260.94ms │             263.86ms │     no change │
│ QQuery 11    │   293.28ms │             300.00ms │     no change │
│ QQuery 12    │   912.88ms │             920.09ms │     no change │
│ QQuery 13    │  1192.40ms │            1343.41ms │  1.13x slower │
│ QQuery 14    │   850.53ms │             833.81ms │     no change │
│ QQuery 15    │   820.13ms │             826.47ms │     no change │
│ QQuery 16    │  1724.06ms │            1723.78ms │     no change │
│ QQuery 17    │  1577.63ms │            1592.82ms │     no change │
│ QQuery 18    │  3017.47ms │            3072.19ms │     no change │
│ QQuery 19    │    84.78ms │              83.57ms │     no change │
│ QQuery 20    │  1165.48ms │            1110.95ms │     no change │
│ QQuery 21    │  1347.92ms │            1308.39ms │     no change │
│ QQuery 22    │  2259.92ms │            2159.81ms │     no change │
│ QQuery 23    │  8231.06ms │            8105.64ms │     no change │
│ QQuery 24    │   473.87ms │             467.20ms │     no change │
│ QQuery 25    │   399.63ms │             391.08ms │     no change │
│ QQuery 26    │   536.59ms │             529.31ms │     no change │
│ QQuery 27    │  1615.02ms │            1571.63ms │     no change │
│ QQuery 28    │ 12519.83ms │           13446.00ms │  1.07x slower │
│ QQuery 29    │   533.90ms │             528.88ms │     no change │
│ QQuery 30    │   813.19ms │             810.21ms │     no change │
│ QQuery 31    │   837.81ms │             853.59ms │     no change │
│ QQuery 32    │  2630.29ms │            2688.65ms │     no change │
│ QQuery 33    │  3366.13ms │            3343.27ms │     no change │
│ QQuery 34    │  3539.24ms │            3381.57ms │     no change │
│ QQuery 35    │  1311.14ms │            1327.88ms │     no change │
│ QQuery 36    │   127.42ms │             118.82ms │ +1.07x faster │
│ QQuery 37    │    57.54ms │              55.77ms │     no change │
│ QQuery 38    │   124.70ms │             118.21ms │ +1.05x faster │
│ QQuery 39    │   199.95ms │             191.97ms │     no change │
│ QQuery 40    │    45.81ms │              50.80ms │  1.11x slower │
│ QQuery 41    │    43.59ms │              47.31ms │  1.09x slower │
│ QQuery 42    │    38.67ms │              38.97ms │     no change │
└──────────────┴────────────┴──────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 56767.70ms │
│ Total Time (fix_aggregation-seed)   │ 57405.81ms │
│ Average Time (HEAD)                 │  1320.18ms │
│ Average Time (fix_aggregation-seed) │  1335.02ms │
│ Queries Faster                      │          3 │
│ Queries Slower                      │          5 │
│ Queries with No Change              │         35 │
└─────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ fix_aggregation-seed ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1     │ 124.77ms │             119.14ms │    no change │
│ QQuery 2     │  23.91ms │              23.13ms │    no change │
│ QQuery 3     │  35.80ms │              35.41ms │    no change │
│ QQuery 4     │  21.46ms │              20.88ms │    no change │
│ QQuery 5     │  55.85ms │              56.54ms │    no change │
│ QQuery 6     │  12.05ms │              12.31ms │    no change │
│ QQuery 7     │ 103.38ms │             102.51ms │    no change │
│ QQuery 8     │  28.16ms │              27.30ms │    no change │
│ QQuery 9     │  63.93ms │              63.04ms │    no change │
│ QQuery 10    │  57.87ms │              59.88ms │    no change │
│ QQuery 11    │  13.01ms │              13.03ms │    no change │
│ QQuery 12    │  44.52ms │              45.03ms │    no change │
│ QQuery 13    │  30.37ms │              30.10ms │    no change │
│ QQuery 14    │  10.15ms │              10.36ms │    no change │
│ QQuery 15    │  24.70ms │              25.57ms │    no change │
│ QQuery 16    │  22.89ms │              24.06ms │ 1.05x slower │
│ QQuery 17    │ 100.03ms │             105.04ms │ 1.05x slower │
│ QQuery 18    │ 240.72ms │             243.75ms │    no change │
│ QQuery 19    │  26.64ms │              30.05ms │ 1.13x slower │
│ QQuery 20    │  39.88ms │              40.52ms │    no change │
│ QQuery 21    │ 173.12ms │             171.61ms │    no change │
│ QQuery 22    │  16.87ms │              17.34ms │    no change │
└──────────────┴──────────┴──────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 1270.08ms │
│ Total Time (fix_aggregation-seed)   │ 1276.58ms │
│ Average Time (HEAD)                 │   57.73ms │
│ Average Time (fix_aggregation-seed) │   58.03ms │
│ Queries Faster                      │         0 │
│ Queries Slower                      │         3 │
│ Queries with No Change              │        19 │
└─────────────────────────────────────┴───────────┘
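
For reading the Change column: the rows above appear consistent with a rule that reports "no change" when the relative difference to HEAD is within about 5%, and otherwise prints the ratio of the slower time to the faster one. This is inferred from the tables, not taken from the benchmark script, so the threshold and the `classify` function below are assumptions. For example, clickbench_partitioned Query 6 (23.86ms to 22.72ms, about 4.8% faster) is reported as no change.

```rust
/// Classify a timing pair the way the Change column appears to.
/// The 5% threshold is inferred from the tables in this thread,
/// not read from the benchmark script.
fn classify(baseline_ms: f64, candidate_ms: f64) -> String {
    // Relative change of the candidate branch with respect to HEAD.
    let relative = (candidate_ms - baseline_ms) / baseline_ms;
    if relative.abs() <= 0.05 {
        "no change".to_string()
    } else if candidate_ms < baseline_ms {
        format!("+{:.2}x faster", baseline_ms / candidate_ms)
    } else {
        format!("{:.2}x slower", candidate_ms / baseline_ms)
    }
}
```

With these assumptions, `classify(1192.40, 1343.41)` yields `1.13x slower`, matching clickbench_partitioned Query 13 in the first run. A 5% band on queries that take tens of milliseconds is narrow, so a few rows flipping between runs is plausible noise.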

alamb · 2025-05-25T11:32:06Z

I am surprised this shows any performance difference. I will rerun and see if I can reproduce

alamb · 2025-05-25T11:32:39Z

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing fix/aggregation-seed (ad1d5a3) to ce835da diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb · 2025-05-25T11:48:17Z

🤖: Benchmark completed

Details

Comparing HEAD and fix_aggregation-seed
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ fix_aggregation-seed ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  1936.63ms │            1892.89ms │     no change │
│ QQuery 1     │   697.95ms │             740.02ms │  1.06x slower │
│ QQuery 2     │  1414.55ms │            1490.50ms │  1.05x slower │
│ QQuery 3     │   729.21ms │             692.71ms │ +1.05x faster │
│ QQuery 4     │  1463.08ms │            1446.79ms │     no change │
│ QQuery 5     │ 15325.51ms │           15177.39ms │     no change │
│ QQuery 6     │  2077.11ms │            2013.56ms │     no change │
│ QQuery 7     │  2082.41ms │            2126.12ms │     no change │
│ QQuery 8     │   822.11ms │             856.19ms │     no change │
└──────────────┴────────────┴──────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 26548.55ms │
│ Total Time (fix_aggregation-seed)   │ 26436.17ms │
│ Average Time (HEAD)                 │  2949.84ms │
│ Average Time (fix_aggregation-seed) │  2937.35ms │
│ Queries Faster                      │          1 │
│ Queries Slower                      │          2 │
│ Queries with No Change              │          6 │
└─────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ fix_aggregation-seed ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │    14.79ms │              15.57ms │  1.05x slower │
│ QQuery 1     │    31.81ms │              32.52ms │     no change │
│ QQuery 2     │    81.06ms │              80.39ms │     no change │
│ QQuery 3     │    94.35ms │              99.08ms │  1.05x slower │
│ QQuery 4     │   591.30ms │             582.37ms │     no change │
│ QQuery 5     │   844.26ms │             840.60ms │     no change │
│ QQuery 6     │    24.29ms │              22.31ms │ +1.09x faster │
│ QQuery 7     │    37.24ms │              37.17ms │     no change │
│ QQuery 8     │   887.46ms │             901.41ms │     no change │
│ QQuery 9     │  1184.65ms │            1193.93ms │     no change │
│ QQuery 10    │   262.17ms │             263.72ms │     no change │
│ QQuery 11    │   295.86ms │             299.27ms │     no change │
│ QQuery 12    │   918.92ms │             902.62ms │     no change │
│ QQuery 13    │  1354.13ms │            1360.94ms │     no change │
│ QQuery 14    │   835.94ms │             840.35ms │     no change │
│ QQuery 15    │   815.61ms │             812.20ms │     no change │
│ QQuery 16    │  1692.38ms │            1762.16ms │     no change │
│ QQuery 17    │  1569.81ms │            1594.18ms │     no change │
│ QQuery 18    │  3036.44ms │            3054.22ms │     no change │
│ QQuery 19    │    83.10ms │              84.23ms │     no change │
│ QQuery 20    │  1160.85ms │            1098.39ms │ +1.06x faster │
│ QQuery 21    │  1343.82ms │            1285.97ms │     no change │
│ QQuery 22    │  2264.70ms │            2178.21ms │     no change │
│ QQuery 23    │  8100.26ms │            8111.28ms │     no change │
│ QQuery 24    │   483.52ms │             464.06ms │     no change │
│ QQuery 25    │   393.40ms │             390.52ms │     no change │
│ QQuery 26    │   550.19ms │             538.08ms │     no change │
│ QQuery 27    │  1626.07ms │            1548.73ms │     no change │
│ QQuery 28    │ 12635.88ms │           13455.63ms │  1.06x slower │
│ QQuery 29    │   533.59ms │             519.37ms │     no change │
│ QQuery 30    │   808.53ms │             808.72ms │     no change │
│ QQuery 31    │   836.22ms │             877.21ms │     no change │
│ QQuery 32    │  2573.89ms │            2631.39ms │     no change │
│ QQuery 33    │  3360.56ms │            3309.51ms │     no change │
│ QQuery 34    │  3383.49ms │            3338.50ms │     no change │
│ QQuery 35    │  1273.77ms │            1312.75ms │     no change │
│ QQuery 36    │   130.12ms │             120.81ms │ +1.08x faster │
│ QQuery 37    │    56.46ms │              54.89ms │     no change │
│ QQuery 38    │   124.57ms │             120.28ms │     no change │
│ QQuery 39    │   199.76ms │             197.42ms │     no change │
│ QQuery 40    │    48.99ms │              48.13ms │     no change │
│ QQuery 41    │    44.59ms │              43.03ms │     no change │
│ QQuery 42    │    37.59ms │              38.75ms │     no change │
└──────────────┴────────────┴──────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 56626.39ms │
│ Total Time (fix_aggregation-seed)   │ 57270.85ms │
│ Average Time (HEAD)                 │  1316.89ms │
│ Average Time (fix_aggregation-seed) │  1331.88ms │
│ Queries Faster                      │          3 │
│ Queries Slower                      │          3 │
│ Queries with No Change              │         37 │
└─────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ fix_aggregation-seed ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1     │ 119.38ms │             117.20ms │    no change │
│ QQuery 2     │  23.29ms │              23.42ms │    no change │
│ QQuery 3     │  35.10ms │              35.93ms │    no change │
│ QQuery 4     │  21.09ms │              21.44ms │    no change │
│ QQuery 5     │  57.25ms │              55.57ms │    no change │
│ QQuery 6     │  12.20ms │              12.26ms │    no change │
│ QQuery 7     │ 103.83ms │             103.22ms │    no change │
│ QQuery 8     │  27.92ms │              26.70ms │    no change │
│ QQuery 9     │  63.18ms │              63.96ms │    no change │
│ QQuery 10    │  57.90ms │              58.98ms │    no change │
│ QQuery 11    │  13.12ms │              12.80ms │    no change │
│ QQuery 12    │  44.91ms │              44.84ms │    no change │
│ QQuery 13    │  30.43ms │              29.93ms │    no change │
│ QQuery 14    │  10.09ms │               9.99ms │    no change │
│ QQuery 15    │  24.48ms │              25.56ms │    no change │
│ QQuery 16    │  21.95ms │              23.39ms │ 1.07x slower │
│ QQuery 17    │ 103.08ms │             101.51ms │    no change │
│ QQuery 18    │ 237.40ms │             232.15ms │    no change │
│ QQuery 19    │  26.59ms │              28.64ms │ 1.08x slower │
│ QQuery 20    │  39.32ms │              39.71ms │    no change │
│ QQuery 21    │ 167.99ms │             170.40ms │    no change │
│ QQuery 22    │  17.36ms │              17.42ms │    no change │
└──────────────┴──────────┴──────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 1257.85ms │
│ Total Time (fix_aggregation-seed)   │ 1255.01ms │
│ Average Time (HEAD)                 │   57.17ms │
│ Average Time (fix_aggregation-seed) │   57.05ms │
│ Queries Faster                      │         0 │
│ Queries Slower                      │         2 │
│ Queries with No Change              │        20 │
└─────────────────────────────────────┴───────────┘

alamb · 2025-05-25T11:48:20Z

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing fix/aggregation-seed (ad1d5a3) to ce835da diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb · 2025-05-25T12:04:02Z

🤖: Benchmark completed

Details

Comparing HEAD and fix_aggregation-seed
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ fix_aggregation-seed ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │  1853.66ms │            1928.52ms │    no change │
│ QQuery 1     │   713.31ms │             728.42ms │    no change │
│ QQuery 2     │  1397.75ms │            1469.10ms │ 1.05x slower │
│ QQuery 3     │   695.23ms │             697.26ms │    no change │
│ QQuery 4     │  1427.94ms │            1444.74ms │    no change │
│ QQuery 5     │ 15192.21ms │           15005.92ms │    no change │
│ QQuery 6     │  2081.48ms │            2068.25ms │    no change │
│ QQuery 7     │  2085.84ms │            2301.60ms │ 1.10x slower │
│ QQuery 8     │   840.83ms │             865.60ms │    no change │
└──────────────┴────────────┴──────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 26288.25ms │
│ Total Time (fix_aggregation-seed)   │ 26509.42ms │
│ Average Time (HEAD)                 │  2920.92ms │
│ Average Time (fix_aggregation-seed) │  2945.49ms │
│ Queries Faster                      │          0 │
│ Queries Slower                      │          2 │
│ Queries with No Change              │          7 │
└─────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ fix_aggregation-seed ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │    15.30ms │              16.03ms │     no change │
│ QQuery 1     │    32.24ms │              33.04ms │     no change │
│ QQuery 2     │    80.41ms │              80.61ms │     no change │
│ QQuery 3     │    97.36ms │              98.26ms │     no change │
│ QQuery 4     │   584.09ms │             587.97ms │     no change │
│ QQuery 5     │   865.56ms │             822.00ms │ +1.05x faster │
│ QQuery 6     │    22.89ms │              24.12ms │  1.05x slower │
│ QQuery 7     │    36.05ms │              37.70ms │     no change │
│ QQuery 8     │   885.28ms │             879.73ms │     no change │
│ QQuery 9     │  1192.20ms │            1183.33ms │     no change │
│ QQuery 10    │   263.28ms │             269.25ms │     no change │
│ QQuery 11    │   286.62ms │             302.19ms │  1.05x slower │
│ QQuery 12    │   889.77ms │             904.32ms │     no change │
│ QQuery 13    │  1245.97ms │            1299.77ms │     no change │
│ QQuery 14    │   838.32ms │             837.41ms │     no change │
│ QQuery 15    │   822.77ms │             809.93ms │     no change │
│ QQuery 16    │  1704.37ms │            1706.80ms │     no change │
│ QQuery 17    │  1602.12ms │            1578.70ms │     no change │
│ QQuery 18    │  3122.90ms │            3051.86ms │     no change │
│ QQuery 19    │    85.31ms │              81.83ms │     no change │
│ QQuery 20    │  1175.11ms │            1103.54ms │ +1.06x faster │
│ QQuery 21    │  1369.48ms │            1306.05ms │     no change │
│ QQuery 22    │  2265.02ms │            2176.78ms │     no change │
│ QQuery 23    │  8100.27ms │            8139.10ms │     no change │
│ QQuery 24    │   480.38ms │             476.20ms │     no change │
│ QQuery 25    │   391.10ms │             386.16ms │     no change │
│ QQuery 26    │   545.15ms │             531.54ms │     no change │
│ QQuery 27    │  1636.09ms │            1548.87ms │ +1.06x faster │
│ QQuery 28    │ 12478.99ms │           13408.59ms │  1.07x slower │
│ QQuery 29    │   522.65ms │             526.28ms │     no change │
│ QQuery 30    │   806.12ms │             803.87ms │     no change │
│ QQuery 31    │   843.76ms │             845.04ms │     no change │
│ QQuery 32    │  2613.09ms │            2652.62ms │     no change │
│ QQuery 33    │  3363.03ms │            3316.43ms │     no change │
│ QQuery 34    │  3394.61ms │            3327.28ms │     no change │
│ QQuery 35    │  1280.92ms │            1305.63ms │     no change │
│ QQuery 36    │   126.63ms │             126.15ms │     no change │
│ QQuery 37    │    57.45ms │              55.26ms │     no change │
│ QQuery 38    │   124.87ms │             117.66ms │ +1.06x faster │
│ QQuery 39    │   198.34ms │             197.16ms │     no change │
│ QQuery 40    │    47.71ms │              47.63ms │     no change │
│ QQuery 41    │    43.53ms │              45.11ms │     no change │
│ QQuery 42    │    37.20ms │              37.70ms │     no change │
└──────────────┴────────────┴──────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 56574.31ms │
│ Total Time (fix_aggregation-seed)   │ 57085.51ms │
│ Average Time (HEAD)                 │  1315.68ms │
│ Average Time (fix_aggregation-seed) │  1327.57ms │
│ Queries Faster                      │          4 │
│ Queries Slower                      │          3 │
│ Queries with No Change              │         36 │
└─────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ fix_aggregation-seed ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 118.43ms │             117.30ms │     no change │
│ QQuery 2     │  23.42ms │              23.35ms │     no change │
│ QQuery 3     │  34.90ms │              34.50ms │     no change │
│ QQuery 4     │  20.45ms │              20.02ms │     no change │
│ QQuery 5     │  56.21ms │              55.10ms │     no change │
│ QQuery 6     │  12.05ms │              12.33ms │     no change │
│ QQuery 7     │ 104.99ms │             100.97ms │     no change │
│ QQuery 8     │  28.54ms │              27.07ms │ +1.05x faster │
│ QQuery 9     │  63.02ms │              63.56ms │     no change │
│ QQuery 10    │  57.63ms │              57.76ms │     no change │
│ QQuery 11    │  13.21ms │              12.95ms │     no change │
│ QQuery 12    │  45.07ms │              45.49ms │     no change │
│ QQuery 13    │  29.99ms │              29.52ms │     no change │
│ QQuery 14    │  10.26ms │               9.86ms │     no change │
│ QQuery 15    │  24.38ms │              25.63ms │  1.05x slower │
│ QQuery 16    │  23.41ms │              22.13ms │ +1.06x faster │
│ QQuery 17    │ 103.66ms │             101.13ms │     no change │
│ QQuery 18    │ 242.11ms │             233.87ms │     no change │
│ QQuery 19    │  28.14ms │              27.77ms │     no change │
│ QQuery 20    │  39.14ms │              38.09ms │     no change │
│ QQuery 21    │ 168.94ms │             170.74ms │     no change │
│ QQuery 22    │  16.92ms │              17.31ms │     no change │
└──────────────┴──────────┴──────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                   │ 1264.89ms │
│ Total Time (fix_aggregation-seed)   │ 1246.45ms │
│ Average Time (HEAD)                 │   57.49ms │
│ Average Time (fix_aggregation-seed) │   56.66ms │
│ Queries Faster                      │         2 │
│ Queries Slower                      │         1 │
│ Queries with No Change              │        19 │
└─────────────────────────────────────┴───────────┘

alamb · 2025-05-28T17:58:56Z

Second performance run looks as good / better so let's merge this in!

alamb · 2025-05-28T17:59:03Z

Thanks again @ctsk

waynexia · 2025-11-28T09:20:50Z

Got an interesting link https://morestina.net/1843/the-stable-hashmap-trap from this reddit discussion. Cross-referencing it in this thread.

(and it looks like we're not in that trap)

Set aggregation hash seed

ad1d5a3

github-actions bot added the physical-plan label (Changes to the physical-plan crate) on May 23, 2025

alamb reviewed May 24, 2025

alamb approved these changes May 24, 2025

alamb merged commit 7002a00 into apache:main May 28, 2025
27 checks passed

alamb merged 1 commit into apache:main from ctsk:fix/aggregation-seed
