ARROW-11710: [Rust][DataFusion] Implement ExpressionRewriter #9545
alamb wants to merge 2 commits into apache:master
FYI @houqp and @Dandandan -- as discussed in #9309 (comment), here is a proposal for how to rewrite expressions directly without quite so much copying.
It is part of my larger plan to make rewriting LogicalPlans easier too.
GitHub kind of mangled the diff in this file, but the core change is that all the code for recursing into Expr trees that is not relevant to constant folding now lives in Expr::rewrite and no longer in this file.
This seems like an improvement to me: NOT NOT #b is the same as #b :) I suspect something was not quite right with the recursion previously.
Yeah, my original implementation did the tree traversal wrong for not expr :P It was doing a preorder traversal, which requires a convergent loop to produce #b in this case. Nice catch.
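To make the traversal-order point concrete, here is a rough sketch using a toy `Expr` enum (hypothetical, not DataFusion's real type) and two hypothetical helpers, `rewrite_preorder` and `rewrite_postorder`. The input `NOT (NOT b = true)` is an assumed example, not necessarily the exact test case from this PR. With the local rule applied before recursing, one pass stops at `NOT NOT b`; with children rewritten first, one pass reaches `b`.

```rust
// Toy expression type for illustration only; not DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(bool),
    Not(Box<Expr>),
    Eq(Box<Expr>, Box<Expr>),
}

/// One local simplification step applied to a single node.
fn simplify_node(expr: Expr) -> Expr {
    match expr {
        // NOT NOT x  =>  x
        Expr::Not(inner) => match *inner {
            Expr::Not(x) => *x,
            other => Expr::Not(Box::new(other)),
        },
        // x = true  =>  x
        Expr::Eq(left, right) => match *right {
            Expr::Literal(true) => *left,
            other => Expr::Eq(left, Box::new(other)),
        },
        other => other,
    }
}

/// Pre-order: simplify the node first, then recurse into what is left.
fn rewrite_preorder(expr: Expr) -> Expr {
    match simplify_node(expr) {
        Expr::Not(inner) => Expr::Not(Box::new(rewrite_preorder(*inner))),
        Expr::Eq(l, r) => Expr::Eq(
            Box::new(rewrite_preorder(*l)),
            Box::new(rewrite_preorder(*r)),
        ),
        leaf => leaf,
    }
}

/// Post-order: rewrite the children first, then simplify the rebuilt node.
fn rewrite_postorder(expr: Expr) -> Expr {
    let with_new_children = match expr {
        Expr::Not(inner) => Expr::Not(Box::new(rewrite_postorder(*inner))),
        Expr::Eq(l, r) => Expr::Eq(
            Box::new(rewrite_postorder(*l)),
            Box::new(rewrite_postorder(*r)),
        ),
        leaf => leaf,
    };
    simplify_node(with_new_children)
}

/// Builds the assumed example input: NOT (NOT b = true)
fn example() -> Expr {
    let not_b = Expr::Not(Box::new(Expr::Column("b".to_string())));
    let eq = Expr::Eq(Box::new(not_b), Box::new(Expr::Literal(true)));
    Expr::Not(Box::new(eq))
}
```

In this sketch the pre-order version only reaches `b` if it is run a second time, which is the "convergent loop" mentioned above.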
houqp left a comment
Overall it looks great! Good boilerplate code clean up :)
It looks like these manual rewrites are redundant, because they should already have been invoked during tree traversal before mutate was called.
That is a great call -- I will try and remove them.
Same here, I think this rewrite is not needed.
The code gets quite a bit cleaner with this improvement @houqp - thank you for the suggestion
Minor, but I think we can use the pre_visit method to skip traversal for these expressions.
I thought about it and I could not convince myself that this change would gain much -- we still have to match on the Expr type, so it would just move the list of variants into another function (in a separate match), which seems to obscure the logic a bit for me.
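For readers following the trade-off, here is a simplified sketch of what skipping via a `pre_visit` hook could look like. The `Expr` enum, the `ExprRewriter` trait shape (returning plain `bool` and `Expr` rather than `Result`), the free function `rewrite`, and the `NotFolder` type are all hypothetical stand-ins, not the code in this PR. Note that the list of skipped variants ends up in a second match (`matches!`), separate from the one in `mutate`.

```rust
// Toy expression type for illustration only; not DataFusion's `Expr`.
#[derive(Debug, PartialEq)]
enum Expr {
    Column(String),
    Literal(bool),
    Not(Box<Expr>),
    Alias(Box<Expr>, String),
}

// Simplified, hypothetical rewriter trait.
trait ExprRewriter {
    /// Return `false` to leave `expr` and its whole subtree untouched.
    fn pre_visit(&mut self, _expr: &Expr) -> bool {
        true
    }
    /// Called on a node after its children have been rewritten.
    fn mutate(&mut self, expr: Expr) -> Expr;
}

/// Post-order rewrite that consults `pre_visit` before descending.
fn rewrite<R: ExprRewriter>(expr: Expr, rewriter: &mut R) -> Expr {
    if !rewriter.pre_visit(&expr) {
        return expr;
    }
    let expr = match expr {
        Expr::Not(inner) => Expr::Not(Box::new(rewrite(*inner, rewriter))),
        Expr::Alias(inner, name) => {
            Expr::Alias(Box::new(rewrite(*inner, rewriter)), name)
        }
        leaf => leaf,
    };
    rewriter.mutate(expr)
}

/// Folds `NOT <literal>` and counts how often `mutate` runs.
struct NotFolder {
    mutate_calls: usize,
}

impl ExprRewriter for NotFolder {
    fn pre_visit(&mut self, expr: &Expr) -> bool {
        // The variants that need no rewriting are listed here,
        // in a match separate from the one in `mutate`.
        !matches!(expr, Expr::Column(_) | Expr::Literal(_))
    }

    fn mutate(&mut self, expr: Expr) -> Expr {
        self.mutate_calls += 1;
        match expr {
            Expr::Not(inner) => match *inner {
                Expr::Literal(v) => Expr::Literal(!v),
                other => Expr::Not(Box::new(other)),
            },
            other => other,
        }
    }
}
```

In this sketch the hook saves one `mutate` call per leaf, at the cost of splitting the variant handling across two functions.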
    let right = optimize_expr(right, schemas)?;
    match op {
    Operator::Eq => match (&left, &right) {
    impl<'a> ExprRewriter for ConstantRewriter<'a> {
With some of @houqp 's comments, this rewrite pass is looking beautiful in my opinion -- it really looks like a rewrite rather than a reconstruction
Codecov Report
@@            Coverage Diff             @@
##           master    #9545      +/-   ##
==========================================
+ Coverage   82.25%   82.39%   +0.13%
==========================================
  Files         244      244
  Lines       55685    56216     +531
==========================================
+ Hits        45806    46317     +511
- Misses       9879     9899      +20
Rationale:
This is part of a larger effort, described in ARROW-11689, to make improvements to the DataFusion query optimizer easier to write and to make the optimizer more efficient.
The idea is that splitting the expr traversal code out from the code that does the actual rewriting will, among other benefits, pave the way for a PlanRewriter that doesn't have to clone its input and can instead take its input by value and consume it.
Changes:
This PR introduces an ExpressionRewriter, the mutable counterpart to ExpressionVisitor, and demonstrates its usefulness by using it in the constant folding algorithm.
Note that this also removes a number of copies from the constant folding algorithm.
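As a rough illustration of the by-value idea, the sketch below uses a toy `Expr` enum, a simplified `ExprRewriter` trait, a `String` error type, and a `ConstantFolder` type, all of which are hypothetical and much smaller than the real DataFusion code. The toy `Expr` deliberately does not implement `Clone`, so the sketch only compiles because each node is moved through the rewrite rather than copied.

```rust
// Toy expression type for illustration only; deliberately not `Clone`,
// so any accidental copy would be a compile error.
#[derive(Debug, PartialEq)]
enum Expr {
    Column(String),
    Literal(bool),
    Not(Box<Expr>),
    Eq(Box<Expr>, Box<Expr>),
}

// Simplified error handling for the sketch.
type Result<T> = std::result::Result<T, String>;

// Hypothetical rewriter trait: receives each node by value.
trait ExprRewriter {
    fn mutate(&mut self, expr: Expr) -> Result<Expr>;
}

impl Expr {
    /// Consumes `self`, rewrites the children first, then hands the
    /// rebuilt node to `mutate`. The traversal lives here, once.
    fn rewrite<R: ExprRewriter>(self, rewriter: &mut R) -> Result<Expr> {
        let expr = match self {
            Expr::Not(inner) => Expr::Not(Box::new((*inner).rewrite(rewriter)?)),
            Expr::Eq(l, r) => Expr::Eq(
                Box::new((*l).rewrite(rewriter)?),
                Box::new((*r).rewrite(rewriter)?),
            ),
            leaf => leaf,
        };
        rewriter.mutate(expr)
    }
}

/// Contains only the folding rules, with no recursion of its own.
struct ConstantFolder;

impl ExprRewriter for ConstantFolder {
    fn mutate(&mut self, expr: Expr) -> Result<Expr> {
        Ok(match expr {
            // NOT <literal>  =>  negated literal
            Expr::Not(inner) => match *inner {
                Expr::Literal(v) => Expr::Literal(!v),
                other => Expr::Not(Box::new(other)),
            },
            // x = true  or  true = x  =>  x
            Expr::Eq(left, right) => match (*left, *right) {
                (x, Expr::Literal(true)) | (Expr::Literal(true), x) => x,
                (l, r) => Expr::Eq(Box::new(l), Box::new(r)),
            },
            other => other,
        })
    }
}
```

For example, rewriting `c = NOT false` with `ConstantFolder` first folds `NOT false` to `true` and then `c = true` to `c`, moving the column's `String` through unchanged.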