Conversation
dbcd08c to
f26b319
Compare
| "implementations have drifted and this is no longer safe even if `new()` still works, ", | ||
| "for example if `new()` now does something different than just calling `compute_properties(...).unwrap()`", | ||
| "\n", | ||
| "This is clearly a bug, please report it!" |
There was a problem hiding this comment.
I think we should return DatafusionError::Internal here rather than panic'ing as it is a better UX (if you want fail fast in debug builds, perhaps you could add a debug_assert)
I also recommend converting the explanation into comments and leaving the panic message like "Internal inconsistency in SortExec"
The rationale is that if a user sees this message it is not going to mean anything to them and they can't fix it, and this text will obscure the conclusion (this is a bug they can not do anything to fix). A developer will come to the source location and can read the comment.
There was a problem hiding this comment.
@alamb this already had the possibility to panic in it because it called SortExec::new():
datafusion/datafusion/physical-plan/src/sorts/sort.rs
Lines 865 to 872 in eb2b8c0
datafusion/datafusion/physical-plan/src/sorts/sort.rs
Lines 1115 to 1119 in eb2b8c0
Returning a Result would be a major breaking change to the ExecutionPlan::with_new_children API which I don't think we should do in this PR.
I do think ExecutionPlan::with_new_children returning a Result would be a good thing. In general I think trait methods should err on the side of returning a result in case some implementation needs to. If none of them do I'd expect compilation to make it pretty much a non issue for performance. But maybe let's do that as it's own PR if we really want to.
| /// Note to implementers: unlike [`ExecutionPlan::with_new_children`] this method does not accept new children as an argument, | ||
| /// thus it is expected that any cached plan properties will remain valid after the reset. | ||
| /// | ||
| /// [`DynamicFilterPhysicalExpr`]: datafusion_physical_expr::expressions::DynamicFilterPhysicalExpr |
There was a problem hiding this comment.
I wonder if we should also add a note to DynamicFilterPhysicalExpr saying any ExecutionPlan that uses them should also implement reset_state
| let (cache, sort_prefix) = Self::compute_properties( | ||
| &new_sort.input, | ||
| new_sort.expr.clone(), | ||
| new_sort.preserve_partitioning, | ||
| ) |
There was a problem hiding this comment.
Why don't we put the logic into reset_state if the changes are due to the reset_state?
There was a problem hiding this comment.
The logic for this change is:
- We need to do similar work as
with_new_children(i.e. cloneSortExec) but each method has slightly different requirements (with_new_childrenneeds to resetcachewhilereset_stateneeds to resetfilter. - To solve this I created the new
SortExec::clonedwhich does neither of those two things and moved the resetting ofcacheintowith_new_childrenand the resetting offilterintoreset_state.
In other words, it doesn't make sense to put Self::compute_properties(...) in reset_state.
There was a problem hiding this comment.
this function returns a Result though, right?
There was a problem hiding this comment.
Ah good point yep I'll do that!
There was a problem hiding this comment.
Sorry I missed your original point
xudong963
left a comment
There was a problem hiding this comment.
I recommend putting the PR into upgrading doc.
Good suggestion, added! |
|
Are we concerned that |
I suggest we file a ticket to look into this in the future for anyone who might be interested in that usecase, and not worry about it in this PR |
|
done! #17060 should we wait for anything else before merging this? |
Nope, I think we are good -- let's merge it! |
|
It seems like we should probably backport this to branch-49 for the 49.0.1 release |
* Add ExecutionPlan::reset_state Co-authored-by: Robert Ream <[email protected]> * Update datafusion/sqllogictest/test_files/cte.slt * Add reference * fmt * add to upgrade guide * add explain plan, implement in more plans * fmt * only explain --------- Co-authored-by: Robert Ream <[email protected]>
* Add ExecutionPlan::reset_state * Update datafusion/sqllogictest/test_files/cte.slt * Add reference * fmt * add to upgrade guide * add explain plan, implement in more plans * fmt * only explain --------- Co-authored-by: Robert Ream <[email protected]>
* Add ExecutionPlan::reset_state Co-authored-by: Robert Ream <[email protected]> * Update datafusion/sqllogictest/test_files/cte.slt * Add reference * fmt * add to upgrade guide * add explain plan, implement in more plans * fmt * only explain --------- Co-authored-by: Robert Ream <[email protected]>
* Enable physical filter pushdown for hash joins (apache#16954) (cherry picked from commit b10f453) * Add ExecutionPlan::reset_state (apache#17028) * Add ExecutionPlan::reset_state Co-authored-by: Robert Ream <[email protected]> * Update datafusion/sqllogictest/test_files/cte.slt * Add reference * fmt * add to upgrade guide * add explain plan, implement in more plans * fmt * only explain --------- Co-authored-by: Robert Ream <[email protected]> * Add dynamic filter (bounds) pushdown to HashJoinExec (apache#16445) (cherry picked from commit ff77b70) * Push dynamic pushdown through CooperativeExec and ProjectionExec (apache#17238) (cherry picked from commit 4bc0696) * Fix dynamic filter pushdown in HashJoinExec (apache#17201) (cherry picked from commit 1d4d74b) * Fix HashJoinExec sideways information passing for partitioned queries (apache#17197) (cherry picked from commit 64bc58d) * disallow pushdown of volatile functions (apache#16861) * dissallow pushdown of volatile PhysicalExprs * fix * add FilteredVec helper to handle filter / remap pattern (#34) * checkpoint: Address PR feedback in https://github.com/apach... * add FilteredVec to consolidate handling of filter / remap pattern * lint * Add slt test for pushing volatile predicates down (#35) --------- Co-authored-by: Andrew Lamb <[email protected]> (cherry picked from commit 94e8548) * fix bounds accumulator reset in HashJoinExec dynamic filter pushdown (apache#17371) --------- Co-authored-by: Adrian Garcia Badaracco <[email protected]> Co-authored-by: Robert Ream <[email protected]> Co-authored-by: Jack Kleeman <[email protected]> Co-authored-by: Andrew Lamb <[email protected]>
Fixes #16998. Closes #17016.