Improve push down limit (logical optimizer rule)#15744
Improve push down limit (logical optimizer rule)#15744xudong963 wants to merge 4 commits intoapache:mainfrom
Conversation
The failing tests are related to topk (in the user_defined_plan.rs). Because the PR removes the limit during the first round, so I have a question, what's the difference between the |
d8bdbec to
a8b64b8
Compare
IIUC, the topk in https://github.com/apache/datafusion/blob/main/datafusion/core/tests/user_defined/user_defined_plan.rs is only used for test. |
Yes, now DataFusion don't have a |
| } | ||
| } else { | ||
| sort.fetch = new_fetch; | ||
| if skip == 0 && original_sort_fetch.is_none() { |
There was a problem hiding this comment.
It would be great to add a comment to explain why && original_sort_fetch.is_none()
There was a problem hiding this comment.
Your comment has triggered my deeper thinking, now I think we don't need the condition check
a8b64b8 to
80722ff
Compare
80722ff to
d831173
Compare
| use async_trait::async_trait; | ||
| use futures::{Stream, StreamExt}; | ||
|
|
||
| /// Execute the specified sql and return the resulting record batches |
There was a problem hiding this comment.
Given that we'll move the code about "how to write the user defined plan" to doc, so I moved the useless tests in the PR
There was a problem hiding this comment.
Can we please remove the tests in some other PR so it is clear what behavior the code is changing, if any? I found it hard to find the actual code / behavior change in this PR with several different behaviors in there
d831173 to
30c78ff
Compare
| // ------ The implementation of the TopK code follows ----- | ||
|
|
||
| #[derive(Debug)] | ||
| #[derive(Debug, Default)] |
There was a problem hiding this comment.
If we are removing all the tests that refer to this structure, I think we should remove the rest of the code too rather than making it as "allow unused"
There was a problem hiding this comment.
Make sense, let's wait for the PR : #15832,
let we finish it up as soon as possible , I think I was missing somethings and cant able to understand to them properly , it would be great help if you collaborate upon it @xudong963 . you can add your suggestions upon it adding to the PR
| use async_trait::async_trait; | ||
| use futures::{Stream, StreamExt}; | ||
|
|
||
| /// Execute the specified sql and return the resulting record batches |
There was a problem hiding this comment.
Can we please remove the tests in some other PR so it is clear what behavior the code is changing, if any? I found it hard to find the actual code / behavior change in this PR with several different behaviors in there
|
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. |
Which issue does this PR close?
Rationale for this change
If skip is zero, we can directly remove the limit, the current behavior is to remove the limit at the second round optimization.
What changes are included in this PR?
Are these changes tested?
Yes
Are there any user-facing changes?