implement rank and dense_rank function and refactor built-in window function evaluation #631
}

impl PartitionEvaluator for NthValueEvaluator {
    fn evaluate_partition(&self, partition: Range<usize>) -> Result<ArrayRef> {
This interface makes sense (passing in the range of rows), though it may make more sense to explicitly pass in `values: Vec<ArrayRef>` rather than assume that whatever implements the evaluator was constructed in a way that lets it find the values.
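To make the suggestion concrete, here is a minimal sketch of an evaluator that receives its input columns as an argument. It is not the PR's code: `Column` is a hypothetical stand-in for arrow's `ArrayRef`, `NthValue` is an illustrative type, and the `String` error replaces the crate's own `Result` so that the example has no dependencies.

```rust
use std::ops::Range;

/// Hypothetical stand-in for arrow's `ArrayRef`; a plain vector keeps the sketch dependency-free.
type Column = Vec<i64>;

/// Assumed variant of the trait: the caller passes the value columns explicitly,
/// so the evaluator does not have to capture them at construction time.
trait PartitionEvaluator {
    fn evaluate_partition(
        &self,
        values: &[Column],
        partition: Range<usize>,
    ) -> Result<Vec<Option<i64>>, String>;
}

/// Illustrative evaluator: the n-th value (1-based) of the partition, repeated for every row.
struct NthValue {
    n: usize,
}

impl PartitionEvaluator for NthValue {
    fn evaluate_partition(
        &self,
        values: &[Column],
        partition: Range<usize>,
    ) -> Result<Vec<Option<i64>>, String> {
        let column = values
            .first()
            .ok_or_else(|| "nth_value expects one input column".to_string())?;
        let len = partition.len();
        // Rows outside the partition yield None rather than an error (a simplification).
        let picked = if self.n >= 1 && self.n <= len {
            column.get(partition.start + self.n - 1).copied()
        } else {
            None
        };
        Ok(vec![picked; len])
    }
}
```

With this shape the evaluator is stateless with respect to the data, at the cost of a wider signature for functions that never look at the values.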
}

/// evaluate the partition evaluator against the partition
fn evaluate_partition(&self, _partition: Range<usize>) -> Result<ArrayRef>;
Another potential way to model this with a single evaluate function might be:
fn evaluate_partition(&self, _partition: Range<usize>) -> Result<ArrayRef>;
fn evaluate_partition(&self, _partition: Range<usize>, _ranks_in_partition: Option<&[Range<usize>]>) -> Result<ArrayRef>;
Rather than having two separate functions with different signatures.
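A rough sketch of how the single-function shape could look for two evaluators, one that ignores the rank boundaries and one that requires them. The output type `Vec<u64>`, the `String` error, and the `RowNumber` and `Rank` structs are assumptions made for illustration, not the types used in the PR.

```rust
use std::ops::Range;

/// Assumed unified trait: rank boundaries are optional and only supplied when available.
trait PartitionEvaluator {
    fn evaluate_partition(
        &self,
        partition: Range<usize>,
        ranks_in_partition: Option<&[Range<usize>]>,
    ) -> Result<Vec<u64>, String>;
}

/// row_number only needs the partition length.
struct RowNumber;

impl PartitionEvaluator for RowNumber {
    fn evaluate_partition(
        &self,
        partition: Range<usize>,
        _ranks_in_partition: Option<&[Range<usize>]>,
    ) -> Result<Vec<u64>, String> {
        Ok((1..=partition.len() as u64).collect())
    }
}

/// rank needs the peer-group boundaries, given here relative to the partition start.
struct Rank;

impl PartitionEvaluator for Rank {
    fn evaluate_partition(
        &self,
        _partition: Range<usize>,
        ranks_in_partition: Option<&[Range<usize>]>,
    ) -> Result<Vec<u64>, String> {
        let ranks = ranks_in_partition
            .ok_or_else(|| "rank requires rank boundaries".to_string())?;
        Ok(ranks
            .iter()
            .flat_map(|range| {
                // Every row of a peer group gets the 1-based position of the group's first row.
                std::iter::repeat(range.start as u64 + 1).take(range.len())
            })
            .collect())
    }
}
```

The trade-off is visible in `Rank`: a missing argument becomes a runtime error, whereas two separate functions would catch it at compile time.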
Actually, I was trying to avoid generating the sort partition points, because the majority of the functions do not need them: nth_value does not need them, and row_number does not need the values at all, only the partition length.
If it were made fully consistent, this interface would not need to exist at all; it could reuse the code for aggregate window functions.
Having thought about this for a while, I think we should merge this as is.
Once arrow 4.4 is released, the partition points will have been migrated to an iterator. At that point I can unify both functions and let laziness do its work (i.e. pass in the iterator in all cases and let the consumer decide whether to use it).
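The laziness argument can be sketched without arrow. Everything below is hypothetical: `lazy_partition_points` is a hand-written stand-in for whatever iterator arrow ends up exposing, and `row_number` and `rank` are plain functions rather than the PR's evaluators. The point is only that a consumer that never polls the iterator pays nothing for the boundaries.

```rust
use std::ops::Range;

/// Hypothetical lazy producer: yields one range per run of equal values in a sorted slice.
/// Nothing is computed until the iterator is polled.
fn lazy_partition_points<'a>(sorted: &'a [i64]) -> impl Iterator<Item = Range<usize>> + 'a {
    let mut start = 0;
    std::iter::from_fn(move || {
        if start >= sorted.len() {
            return None;
        }
        let mut end = start + 1;
        while end < sorted.len() && sorted[end] == sorted[start] {
            end += 1;
        }
        let range = start..end;
        start = end;
        Some(range)
    })
}

/// row_number accepts the iterator but never polls it, so no boundaries are computed.
fn row_number(len: usize, _peers: impl Iterator<Item = Range<usize>>) -> Vec<u64> {
    (1..=len as u64).collect()
}

/// rank consumes the iterator; each peer group gets the 1-based index of its first row.
fn rank(peers: impl Iterator<Item = Range<usize>>) -> Vec<u64> {
    peers
        .flat_map(|range| std::iter::repeat(range.start as u64 + 1).take(range.len()))
        .collect()
}
```

Both functions share one signature shape, which is what would allow the two trait methods to be merged.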
UInt64Array::from_iter_values(ranks_in_partition.iter().enumerate().flat_map(
    |(index, range)| {
        let len = range.end - range.start;
        iter::repeat((index as u64) + 1).take(len)
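As a worked illustration of what this expression computes, the same `flat_map` can be run over plain ranges. This is a sketch, not the PR's code: the free function `dense_rank` and the `Vec<u64>` output are stand-ins for the evaluator and for `UInt64Array`.

```rust
use std::iter;
use std::ops::Range;

/// Dense rank over peer-group ranges: the i-th group (0-based) gets rank i + 1,
/// repeated once per row in the group, so ranks never skip a number.
fn dense_rank(ranks_in_partition: &[Range<usize>]) -> Vec<u64> {
    ranks_in_partition
        .iter()
        .enumerate()
        .flat_map(|(index, range)| {
            let len = range.end - range.start;
            iter::repeat((index as u64) + 1).take(len)
        })
        .collect()
}
```

For peer groups `0..2`, `2..3` and `3..6` this yields `[1, 1, 2, 3, 3, 3]`. A plain `rank` over the same groups would instead number each group by the position of its first row and so leave gaps after ties.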
18ce48f to 6d4cf41
Which issue does this PR close?
Closes #555
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?