Implement window functions with order_by clause #520
Codecov Report
@@ Coverage Diff @@
## master #520 +/- ##
==========================================
- Coverage 76.09% 75.99% -0.10%
==========================================
Files 156 156
Lines 27047 27036 -11
==========================================
- Hits 20581 20547 -34
- Misses 6466 6489 +23
4,
)
.await?;
// result in one batch, although e.g. having 2 batches do not change

if num_rows == 0 {
    return Ok(new_empty_array(value.data_type()));
}
let index: usize = match self.kind {

value.len()
)));
}
if num_rows == 0 {
Could this function ever be passed a 0-row input? This check isn't a problem; I am just wondering if my mental model is correct.
This will be changed in a later pull request.
But you are right, this would not be passed a 0-length input. This check is just being pedantic.
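To make the point about the guard concrete, here is a minimal sketch in plain Rust with no arrow types. `NthValueKind` and `evaluate` are hypothetical stand-ins, not the code in this PR, and the sketch assumes the function returns one output value per input row. Under those assumptions the early return is cheap insurance: without it, `num_rows - 1` in the `Last` arm would underflow on an empty input.

```rust
/// Hypothetical stand-in for the `self.kind` matched on in the diff.
enum NthValueKind {
    First,
    Last,
    /// 1-based position within the input.
    Nth(usize),
}

/// Returns one output value per input row, each equal to the selected value.
/// Plain slices are used here instead of arrow arrays (an assumption of this sketch).
fn evaluate(kind: NthValueKind, values: &[i32]) -> Result<Vec<i32>, String> {
    let num_rows = values.len();
    // Guard first: for an empty input, `num_rows - 1` below would underflow.
    if num_rows == 0 {
        return Ok(Vec::new());
    }
    let index = match kind {
        NthValueKind::First => 0,
        NthValueKind::Last => num_rows - 1,
        NthValueKind::Nth(n) => {
            if n == 0 || n > num_rows {
                return Err(format!(
                    "nth_value position {} is out of range for {} rows",
                    n, num_rows
                ));
            }
            n - 1
        }
    };
    // Repeat the selected value once per input row.
    Ok(vec![values[index]; num_rows])
}
```

With the eight-element test input used in this PR, `evaluate(NthValueKind::First, &data)` would yield eight copies of the first element, which matches the `Vec<i32>` shape of the refactored test helper.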
- fn test_i32_result(expr: Arc<NthValue>, expected: i32) -> Result<()> {
+ fn test_i32_result(expr: NthValue, expected: Vec<i32>) -> Result<()> {
      let arr: ArrayRef = Arc::new(Int32Array::from(vec![1, -2, 3, -4, 5, -6, 7, 8]));
      let values = vec![arr];
This test change shows the nice refactoring
/// return states with the same description as `state_fields`
fn create_accumulator(&self) -> Result<Box<dyn WindowAccumulator>>;

/// expressions that are passed to the WindowAccumulator.
The WindowExpr trait is looking 👌
/// A window expression that is a built-in window function.
///
/// Note that unlike aggregation based window functions, built-in window functions normally ignore
/// window frame spec, with th expression of first_value, last_value, and nth_value.
Suggested change:
- /// window frame spec, with th expression of first_value, last_value, and nth_value.
+ /// window frame spec, with the exception of first_value, last_value, and nth_value.
/// peer based evaluation based on the fact that batch is pre-sorted given the sort columns
/// and then per partition point we'll evaluate the peer group (e.g. SUM or MAX gives the same
/// results for peers) and concatenate the results.
fn peer_based_evaluate(&self, batch: &RecordBatch) -> Result<ArrayRef> {
I don't understand the naming of peer here (rather than range_based_evaluate for example, to match with WindowFrameUnits::Range)
I will possibly change this naming when implementing #361, but for the moment, range and groups both evaluate with peers, while rows evaluates based on the rows on each scan.
Since this is a private function, I guess I can leave the naming for later changes.
Yes I think it is fine for now
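For readers unfamiliar with the term, the peer idea from the doc comment above can be sketched in plain Rust. This is only an illustration, not the PR's `peer_based_evaluate`: `peer_ranges` and `peer_based_sum` are hypothetical names, the input is assumed to be a single partition already sorted by one integer order-by key, and the frame is assumed to run from the start of the partition through the current row's peers.

```rust
use std::ops::Range;

/// Split a pre-sorted key column into peer groups: maximal runs of equal keys.
fn peer_ranges(sorted_keys: &[i32]) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut start = 0;
    for i in 1..=sorted_keys.len() {
        // Close the current group at the end of input or when the key changes.
        if i == sorted_keys.len() || sorted_keys[i] != sorted_keys[start] {
            ranges.push(start..i);
            start = i;
        }
    }
    ranges
}

/// Running SUM in which every row of a peer group receives the same result:
/// the total of all values up to and including the end of that group.
fn peer_based_sum(sorted_keys: &[i32], values: &[i64]) -> Vec<i64> {
    assert_eq!(sorted_keys.len(), values.len());
    let mut out = Vec::with_capacity(values.len());
    let mut running = 0i64;
    for range in peer_ranges(sorted_keys) {
        // Evaluate once per peer group, then repeat the result for each peer.
        running += values[range.clone()].iter().sum::<i64>();
        out.extend(std::iter::repeat(running).take(range.len()));
    }
    out
}
```

The aggregate is evaluated once per group and the per-group results are concatenated, which is why rows with equal sort keys end up with identical values. A rows-based frame would instead produce a different running total for every row.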
let len = value_range.end - value_range.start;
let values = values
    .iter()
    .map(|v| v.slice(value_range.start, len))
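The hunk above converts a `start..end` range into an offset and a length before slicing, since arrow's `slice` takes `(offset, length)` rather than `(start, end)`. A minimal sketch of the same conversion on plain slices follows; `slice` and `slice_columns` are hypothetical helpers written for this illustration, not functions from the PR.

```rust
use std::ops::Range;

/// Hypothetical offset/length style `slice` over a plain column.
fn slice(col: &[i32], offset: usize, len: usize) -> &[i32] {
    &col[offset..offset + len]
}

/// Slice every column to the same row range.
fn slice_columns<'a>(columns: &[&'a [i32]], value_range: &Range<usize>) -> Vec<&'a [i32]> {
    // Convert the half-open range into a length once, then reuse it per column.
    let len = value_range.end - value_range.start;
    columns
        .iter()
        .map(|&col| slice(col, value_range.start, len))
        .collect()
}
```

Passing `value_range.end` as the second argument instead of `len` is an easy mistake with this style of API, which is presumably why the length is computed up front.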
-- See the License for the specific language governing permissions and
-- limitations under the License.

SELECT
Thank you for taking the time to review. The changes to the arrow references are now reverted.
@alamb this pull request is ready now
Which issue does this PR close?
Closes #360
For now, this pull request relies on arrow 4.3.0 to merge.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?