Implement hash partitioned aggregation #320
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##           master   apache/arrow-datafusion#320    +/-  ##
==========================================
- Coverage   75.72%   75.71%   -0.01%
==========================================
  Files         143      143
  Lines       23832    23881     +49
==========================================
+ Hits        18046    18081     +35
- Misses       5786     5800     +14
```

Continue to review full report at Codecov.
```rust
    input_schema,
)?))
// TODO: dictionary type not yet supported in Hash Repartition
let contains_dict = groups
```
Will create an issue for this
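The guard hinted at in the quoted diff might look like the following minimal sketch. The `DataType` enum here is a trimmed stand-in for arrow's (the real `Dictionary` variant carries boxed key and value types), and `contains_dict` is a hypothetical completion of the truncated line, not the PR's exact code:

```rust
// Trimmed stand-in for arrow's DataType enum; only the variants
// needed for this illustration.
#[derive(Debug)]
enum DataType {
    Int64,
    Utf8,
    Dictionary,
}

// Sketch of the guard from the quoted diff: skip the hash repartition
// optimization when any group-by expression yields a dictionary-encoded
// column, since hashing those is not yet supported.
fn contains_dict(group_types: &[DataType]) -> bool {
    group_types.iter().any(|t| matches!(t, DataType::Dictionary))
}

fn main() {
    let plain = [DataType::Int64, DataType::Utf8];
    let with_dict = [DataType::Int64, DataType::Dictionary];
    assert!(!contains_dict(&plain));
    assert!(contains_dict(&with_dict));
    println!("plain: {}, with_dict: {}", contains_dict(&plain), contains_dict(&with_dict));
}
```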
alamb
left a comment
This looks really cool @Dandandan. I suggest you add a test, if possible, that shows the plan with the repartition exec operation in order to prevent someone accidentally turning off this optimization during a refactor
Co-authored-by: Andrew Lamb <[email protected]>
… into agg_partition
Good idea! Added a test for this
alamb
left a comment
I think it is looking great. Thanks @Dandandan
| match &self.mode { | ||
| AggregateMode::Partial => Distribution::UnspecifiedDistribution, | ||
| AggregateMode::FinalPartitioned => Distribution::HashPartitioned( | ||
| self.group_expr.iter().map(|x| x.0.clone()).collect(), |
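Filling out the quoted match, the required-input-distribution hook might look like this sketch. The enums are trimmed stand-ins for DataFusion's (plain strings in place of physical expression trees), and the `Final => SinglePartition` arm is an assumption about the pre-existing mode, not shown in the diff:

```rust
// Trimmed stand-ins for DataFusion's enums; group expressions are
// plain strings here instead of physical expression trees.
#[derive(Debug, PartialEq)]
enum Distribution {
    UnspecifiedDistribution,
    SinglePartition,
    HashPartitioned(Vec<String>),
}

enum AggregateMode {
    Partial,
    Final,
    FinalPartitioned,
}

// A partial aggregate accepts any input distribution; a plain final
// aggregate needs all input merged into a single partition; the new
// FinalPartitioned mode instead asks the planner for input that is
// hash-partitioned on the group-by expressions.
fn required_input_distribution(mode: &AggregateMode, group_expr: &[String]) -> Distribution {
    match mode {
        AggregateMode::Partial => Distribution::UnspecifiedDistribution,
        AggregateMode::Final => Distribution::SinglePartition,
        AggregateMode::FinalPartitioned => {
            Distribution::HashPartitioned(group_expr.to_vec())
        }
    }
}

fn main() {
    let groups = vec!["l_orderkey".to_string()];
    let dist = required_input_distribution(&AggregateMode::FinalPartitioned, &groups);
    assert_eq!(dist, Distribution::HashPartitioned(groups));
    println!("{:?}", dist);
}
```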
jorgecarleitao
left a comment
I went through this and it looks great. Thanks a lot @Dandandan, great work. 💯 from my side.
We may want to give some time in case someone else would like to go through this before merging.
❤️
@andygrove maybe? :)

I will make time this weekend to review this and take it for a spin!

Awesome, thanks @andygrove! I also have some nice follow-up work this weekend for more performance improvements for hash aggregates :D
andygrove
left a comment
LGTM. I tested this out locally and confirmed that performance is much better for unpartitioned data, and about the same for partitioned data.
@Dandandan Looks like there is a conflict that needs fixing

I filed apache/datafusion-ballista#23 for implementing this optimization in Ballista

Somehow the coverage run seems to fail, but it doesn't show what is failing

The check run at https://github.com/apache/arrow-datafusion/pull/320/checks?check_run_id=2591763548 shows this buried in the logs (not at the end, annoyingly):

Thanks @alamb, will merge it now when it's green
Thanks all 🎉 |
Which issue does this PR close?
Closes #27
Rationale for this change
A more scalable hash aggregate that works well for group by expressions that have high cardinality in the output, and that scales better with the number of CPU cores.
The algorithm changes the steps from:
`partial hash aggregate -> merge (to 1 partition) -> full hash aggregate`

to

`partial hash aggregate -> repartition on group by expressions -> hash aggregate (on partitions)`

This is the same as what Spark is doing.
This mostly has an effect on group by queries with higher cardinality, but no substantial effect on lower cardinality (as the partial result is already small).
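The repartition step routes each partial-aggregate row to an output partition determined by hashing its group-by key, so all rows for one key meet in the same partition and each partition can run its final aggregate independently. A minimal sketch of that routing, using std's default hasher (DataFusion's actual hash function and row representation differ; `partition_for_key` is a hypothetical name):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Route a row to one of `num_partitions` outputs by hashing its
// group-by key; every row with the same key gets the same partition.
fn partition_for_key<K: Hash>(key: &K, num_partitions: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() as usize) % num_partitions
}

fn main() {
    let keys = ["a", "b", "a", "c", "b"];
    let parts: Vec<usize> = keys.iter().map(|k| partition_for_key(k, 4)).collect();
    // Duplicate keys always land together, so no merge into a single
    // partition is needed before the final aggregate.
    assert_eq!(parts[0], parts[2]); // both "a"
    assert_eq!(parts[1], parts[4]); // both "b"
    println!("{:?}", parts);
}
```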
For Ballista this would also be required I think @andygrove - currently every partition is merged into one, which can be problematic (and slow).
For example, TPC-H query 3 benefits quite a bit:
Master:
PR:
The db-benchmark group by queries improve by quite a bit:
Master:
PR:
What changes are included in this PR?
Are there any user-facing changes?