feat: Support SQL filter clause for aggregate expressions, add SQL dialect support #5868
yjshen merged 15 commits into apache:main
I plan to review this PR tomorrow. Thank you @yjshen

cc @andygrove @Dandandan and @jdye64

Thank you @alamb for the detailed review! I have made the following updates to the PR based on your feedback:
This PR is now ready for further review. Thank you again, @alamb!

alamb left a comment:
Looks great to me @yjshen -- thank you. I will upstream the parsing for dialect names now.
The only other thing I really want to do prior to merging this PR is to verify it doesn't change performance. I don't expect that it will, but I want to double check to be sure.
    .zip(filters)
    .try_for_each(|((accum, expr), filter)| {
        // 1.2
        let batch = match filter {

cc @tustvold @mustafasrepo @crepererum and @Dandandan who I think are all interested in grouping performance
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

I benched

I ran this branch against main using https://github.com/alamb/datafusion-benchmarking and I see no performance difference 👍

#6616 -- PR to use upstreamed version of parse_sql_dialect

Which issue does this PR close?
Closes #5873.
Closes #5608.
Closes #2214.
Rationale for this change
This pull request introduces support for the FILTER (WHERE) clause in aggregate expressions. This feature enables users to filter the rows that are considered for aggregation, similar to how it is done in popular SQL databases such as PostgreSQL, SQLite, Spark, and Hive.
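To make the semantics concrete, here is a minimal sketch in plain Rust of what a filtered aggregate computes. It is not DataFusion's implementation or API: the `Row` type, the `filtered_sum` helper, and the sample data are all hypothetical and exist only for illustration. The idea is that each aggregate carries an optional predicate, and only rows passing that predicate reach the accumulator, which is roughly what `SUM(value) FILTER (WHERE "group" = 'a')` expresses in SQL.

```rust
// Hypothetical row type for illustration; not a DataFusion type.
#[derive(Debug, Clone, Copy)]
struct Row {
    group: &'static str,
    value: i64,
}

/// Sums `value` over the rows that pass `filter`.
/// `None` models an aggregate written without a FILTER clause.
fn filtered_sum(rows: &[Row], filter: Option<&dyn Fn(&Row) -> bool>) -> i64 {
    rows.iter()
        // Keep every row when there is no filter, otherwise apply the predicate.
        .filter(|row| filter.map_or(true, |pred| pred(*row)))
        .map(|row| row.value)
        .sum()
}

fn main() {
    let rows = [
        Row { group: "a", value: 1 },
        Row { group: "b", value: 10 },
        Row { group: "a", value: 5 },
    ];
    let only_a = |row: &Row| row.group == "a";

    // Roughly: SELECT SUM(value), SUM(value) FILTER (WHERE "group" = 'a') FROM rows
    let total = filtered_sum(&rows, None);
    let total_a = filtered_sum(&rows, Some(&only_a));
    println!("{total} {total_a}"); // prints "16 6"
}
```

Both aggregates scan the same input, but each applies its own filter independently, so a single query can mix filtered and unfiltered aggregates.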
What changes are included in this PR?
The physical_plan/aggregate module is where the majority of the work for this PR was completed.
Are these changes tested?
New tests were added in group_by.rs to cover various scenarios that use the FILTER (WHERE) clause.
Are there any user-facing changes?
Yes.