Implement PoC block allocation for count accumulator #15642
Dandandan wants to merge 4 commits into apache:main
@Rachelint Here is a PoC for accumulators (I also have a PoC for block-allocated GroupValues). I also made some separate changes to emit the output from the state in batches, but left them out to keep this change small.
use ahash::RandomState;
use datafusion_common::stats::Precision;
use datafusion_expr::expr::WindowFunction;
use datafusion_expr::groups_accumulator::BLOCK_SIZE;
BLOCK_SIZE could be based on the batch size as well.
// Count is always non null (null inputs just don't contribute to the overall values)
let nulls = None;
let array = PrimitiveArray::<Int64Type>::new(counts.into(), nulls);
// TODO: support emitting batches
evaluate and state could be changed to return Result<Vec<ArrayRef>> and Result<Vec<Vec<ArrayRef>>> respectively, although that would be quite a large breaking change.
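To make the idea of emitting one output per block concrete, here is a minimal sketch using only the standard library. It is not the PR's code: `emit_in_blocks` is a hypothetical name, and a plain `Vec<i64>` stands in for an Arrow `ArrayRef`, so the returned `Vec<Vec<i64>>` plays the role of the `Vec<ArrayRef>` suggested above.

```rust
/// Hypothetical helper (not from the PR): drains block-allocated counts and
/// returns one output chunk per block instead of a single flat array.
/// `Vec<i64>` is a stand-in for an Arrow `ArrayRef` in this sketch.
pub fn emit_in_blocks(blocks: &mut Vec<Vec<i64>>, num_groups: usize) -> Vec<Vec<i64>> {
    // Number of group slots that still have to be emitted.
    let mut remaining = num_groups;
    let mut out = Vec::with_capacity(blocks.len());

    // Take ownership of the blocks, leaving the state empty, so that no
    // counts are copied into one large contiguous buffer.
    for mut block in std::mem::take(blocks) {
        if remaining == 0 {
            break;
        }
        // The last block may be only partially filled; drop unused slots.
        let len = remaining.min(block.len());
        block.truncate(len);
        remaining -= len;
        out.push(block);
    }
    out
}
```

Each returned chunk could then be wrapped into its own array and forwarded downstream as a separate batch, which is the part that would require the breaking signature change.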
It seems similar to what was done in #11943? The problem I found after the attempt in PoC #11943 is that we need to introduce a [...] It still works well when we enable [...]
Thanks a lot. I checked the old sketch again and found that it is easy to avoid regressions in the cases where the optimization is disabled. Maybe it is actually too early to consider the cost of [...] I plan to:
The plan sounds good! I think this PoC mainly shows that accumulators and group state can be changed individually (without changing other parts). If there is a part of your code that shows an improvement in time or memory usage, we should take it and create some follow-up tickets!
Let's close it for now. I'll see whether I can contribute part of this later.
Which issue does this PR close?
PoC to show a simple method for allocating in blocks
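As a rough illustration of what allocating in blocks can look like for a count accumulator, here is a small standard-library sketch. It is not the code from this PR: `BlockedCounts` and its methods are hypothetical names, and the block size is passed to the constructor as an assumption (it could be a constant such as `BLOCK_SIZE` or be derived from the batch size).

```rust
/// Hypothetical block-allocated count state (illustrative, not the PR's code).
/// Counts live in fixed-size blocks, so growing the state allocates a new
/// block instead of reallocating and copying one large vector.
pub struct BlockedCounts {
    block_size: usize,
    blocks: Vec<Vec<i64>>,
    num_groups: usize,
}

impl BlockedCounts {
    pub fn new(block_size: usize) -> Self {
        assert!(block_size > 0, "block size must be non-zero");
        Self { block_size, blocks: Vec::new(), num_groups: 0 }
    }

    /// Ensure slots exist for group indices `0..total_groups`.
    pub fn ensure_groups(&mut self, total_groups: usize) {
        // Ceiling division: number of blocks needed to hold all groups.
        let needed = (total_groups + self.block_size - 1) / self.block_size;
        while self.blocks.len() < needed {
            self.blocks.push(vec![0; self.block_size]);
        }
        self.num_groups = self.num_groups.max(total_groups);
    }

    /// Add `delta` to the count of `group_index`, growing the state if needed.
    pub fn add(&mut self, group_index: usize, delta: i64) {
        self.ensure_groups(group_index + 1);
        // A group index maps to (block number, offset within the block).
        let block = group_index / self.block_size;
        let offset = group_index % self.block_size;
        self.blocks[block][offset] += delta;
    }

    /// Current count for `group_index`, or `None` if the group is unknown.
    pub fn get(&self, group_index: usize) -> Option<i64> {
        if group_index >= self.num_groups {
            return None;
        }
        Some(self.blocks[group_index / self.block_size][group_index % self.block_size])
    }

    pub fn num_blocks(&self) -> usize {
        self.blocks.len()
    }
}
```

With a block size of 4, for example, adding to group 5 allocates two blocks and updates offset 1 of the second one; existing blocks are never moved when more groups arrive.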
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?