Refactor: move RowGroupPredicateBuilder into its own module, rename to PruningPredicateBuilder #365
Codecov Report
@@ Coverage Diff @@
## master #365 +/- ##
==========================================
+ Coverage 75.86% 75.89% +0.02%
==========================================
Files 143 144 +1
Lines 23758 23771 +13
==========================================
+ Hits 18025 18040 +15
+ Misses 5733 5731 -2
FYI @yordan-pavlov and @returnString
jorgecarleitao
left a comment
I have not read it in detail but I assume it is just a code movement and rename and therefore 👍
    let logical_predicate_expr =
        build_predicate_expression(expr, &schema, &mut stat_column_req)?;
    // println!(
    //     "PruningPredicateBuilder::try_new, logical_predicate_expr: {:?}",
Could those commented-out println calls be removed?
(They were there in the original code, but I think it would be better to remove them.)
        self.scalar_expr
    }

    // fn column_name(&self) -> &String {

        }
    }

    // fn column_expr(&self) -> &Expr {
    fn is_stat_column_missing(&self, statistics_type: StatisticsType) -> bool {
        self.stat_column_req
            .iter()
            .filter(|(c, t, _f)| c == &self.column_name && t == &statistics_type)
This code could use .any instead of filter + count.
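To illustrate the suggestion, here is a minimal sketch of both forms side by side. It assumes the truncated snippet above ends in `.count() == 0`, which the excerpt does not show. `ColumnStats` is a hypothetical stand-in for the struct in the PR, and the third tuple element is a placeholder `String` rather than the real field type.

```rust
/// Simplified stand-in for the statistics kinds referenced in the diff.
#[derive(Debug, Clone, Copy, PartialEq)]
enum StatisticsType {
    Min,
    Max,
}

/// Hypothetical container for the two fields the method reads.
/// The third tuple element is a placeholder, not the PR's actual type.
struct ColumnStats {
    column_name: String,
    stat_column_req: Vec<(String, StatisticsType, String)>,
}

impl ColumnStats {
    /// Assumed original shape: walk the whole list, count matches, compare to zero.
    fn is_missing_with_count(&self, statistics_type: StatisticsType) -> bool {
        self.stat_column_req
            .iter()
            .filter(|(c, t, _f)| c == &self.column_name && t == &statistics_type)
            .count()
            == 0
    }

    /// Suggested shape: `any` stops at the first match and states the intent directly.
    fn is_missing_with_any(&self, statistics_type: StatisticsType) -> bool {
        !self
            .stat_column_req
            .iter()
            .any(|(c, t, _f)| c == &self.column_name && t == &statistics_type)
    }
}
```

Both methods return the same result for every input; the `any` version simply short-circuits instead of scanning the full list.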
I also think this is looking good. I think we might clean things up a bit while touching the code, so I added some suggestions.
Cleaned per @Dandandan's suggestions
Dandandan
left a comment
I like it, great use of the physical optimizer
Which issue does this PR close?
Part of #363
Rationale for this change
As explained in #363, the high-level goal is to make the parquet row group pruning logic generic to any type of min/max statistics (not just parquet metadata).
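As a rough illustration of what "generic to any type of min/max statistics" could mean, the sketch below decides which containers (row groups, files, partitions) may hold matching rows using only a min/max lookup. It is not the API from this PR or from #363: the `MinMaxStatistics` trait, the `Predicate` enum, `keep_containers`, and `InMemoryStats` are all hypothetical names chosen for the example, and only `i64` columns are handled.

```rust
/// Hypothetical abstraction: any source that can report per-container
/// min/max values for a column, whether or not it is parquet metadata.
trait MinMaxStatistics {
    fn num_containers(&self) -> usize;
    fn min(&self, column: &str, container: usize) -> Option<i64>;
    fn max(&self, column: &str, container: usize) -> Option<i64>;
}

/// Two simple predicates, enough to show the rewrite onto statistics.
enum Predicate {
    /// column > value
    Gt(String, i64),
    /// column < value
    Lt(String, i64),
}

/// Returns one flag per container: `true` means the container might hold
/// matching rows and must be read, `false` means it can be skipped.
fn keep_containers<S: MinMaxStatistics>(pred: &Predicate, stats: &S) -> Vec<bool> {
    (0..stats.num_containers())
        .map(|i| {
            let verdict = match pred {
                // `col > v` can only match if the container's max exceeds v.
                Predicate::Gt(col, v) => stats.max(col, i).map(|max| max > *v),
                // `col < v` can only match if the container's min is below v.
                Predicate::Lt(col, v) => stats.min(col, i).map(|min| min < *v),
            };
            // Missing statistics prove nothing, so the container is kept.
            verdict.unwrap_or(true)
        })
        .collect()
}

/// Example statistics source that is not backed by parquet at all.
struct InMemoryStats {
    column: String,
    /// One optional (min, max) pair per container.
    ranges: Vec<Option<(i64, i64)>>,
}

impl MinMaxStatistics for InMemoryStats {
    fn num_containers(&self) -> usize {
        self.ranges.len()
    }

    fn min(&self, column: &str, container: usize) -> Option<i64> {
        if column != self.column.as_str() {
            return None;
        }
        self.ranges.get(container).copied().flatten().map(|(min, _)| min)
    }

    fn max(&self, column: &str, container: usize) -> Option<i64> {
        if column != self.column.as_str() {
            return None;
        }
        self.ranges.get(container).copied().flatten().map(|(_, max)| max)
    }
}
```

With this shape, the pruning function never needs to know where the statistics came from, which is the property the rationale describes.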
What changes are included in this PR?
No changes in functionality are intended
The PR contains two commits to help reviewers
Are there any user-facing changes?
The pub struct RowGroupPredicateBuilder was renamed to PruningPredicateBuilder.