Add optimizer rule to replace inlist with or chain for small expression list #806
Dandandan wants to merge 3 commits into apache:master
Seems it is not faster (yet). Will do some more research.
Cool, this is interesting to me, since in our query engine we started from the opposite approach: we initially rewrote all IN expressions into comparisons joined with OR, and are now introducing a specialized kernel for IN. I would be very interested in finding out where the threshold for this optimization is. The benefit is probably bigger for dictionary-encoded data and numbers, since the string comparison itself will involve some branches.
Possibly also related: #813, as a different performance approach.
I might tune this later so that it applies only to empty or single-item lists, for which it really should be an improvement, and do some more profiling.
Marking PRs that haven't had activity in over a month as 'stale-pr' to help me filter the list. Please remove the label or let me know if "stale" is not the correct designation.
Closing a seemingly stale PR -- please reopen if that was a mistake.
Which issue does this PR close?
Closes #799
Rationale for this change
Speed up inlist expressions that contain only one or two expressions by converting them to a normal boolean expression (a chain of equality comparisons joined with or).
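To make the rewrite concrete, here is a minimal sketch of the idea, not the code from this PR. The `Expr` enum, the `MAX_LIST_LEN` threshold, and `rewrite_in_list` are all hypothetical names invented for illustration and do not correspond to DataFusion's actual types. The sketch assumes that `x IN (a, b)` is equivalent to `x = a OR x = b`, and `x NOT IN (a, b)` to `x <> a AND x <> b`. Empty lists are deliberately left untouched.

```rust
// Toy expression type for illustration only; this is NOT DataFusion's `Expr`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Eq(Box<Expr>, Box<Expr>),
    NotEq(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    InList {
        expr: Box<Expr>,
        list: Vec<Expr>,
        negated: bool,
    },
}

/// Hypothetical threshold: lists longer than this stay as `InList`.
const MAX_LIST_LEN: usize = 2;

/// Rewrite a small, non-empty `InList` into a chain of comparisons.
/// Any other expression is returned unchanged.
fn rewrite_in_list(e: Expr) -> Expr {
    match e {
        Expr::InList { expr, list, negated }
            if !list.is_empty() && list.len() <= MAX_LIST_LEN =>
        {
            list.into_iter()
                // One comparison per list item: `=` for IN, `<>` for NOT IN.
                .map(|item| {
                    if negated {
                        Expr::NotEq(expr.clone(), Box::new(item))
                    } else {
                        Expr::Eq(expr.clone(), Box::new(item))
                    }
                })
                // Join the comparisons: OR for IN, AND for NOT IN.
                .reduce(|acc, cmp| {
                    if negated {
                        Expr::And(Box::new(acc), Box::new(cmp))
                    } else {
                        Expr::Or(Box::new(acc), Box::new(cmp))
                    }
                })
                .expect("guard ensures the list is non-empty")
        }
        other => other,
    }
}
```

With this sketch, an `InList` over `x` with items `1` and `2` becomes `Or(Eq(x, 1), Eq(x, 2))`, while a three-item list is returned as-is. Where the threshold should actually sit is the open question raised in the discussion, so the value `2` here only mirrors the "one or two expressions" wording above.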
What changes are included in this PR?
Are there any user-facing changes?