Evaluate expressions after type coercion#3444
Conversation
|
Thanks @Dandandan. This looks like a great improvement. |
Codecov Report
@@ Coverage Diff @@
## master #3444 +/- ##
==========================================
+ Coverage 85.68% 85.69% +0.01%
==========================================
Files 298 298
Lines 54645 54667 +22
==========================================
+ Hits 46820 46846 +26
+ Misses 7825 7821 -4
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
| "+--------------+-------------------------+-------------------------+-------------------------+", | ||
| "| 1.5 | 2.5 | 3.5 | 2.5 |", | ||
| "+--------------+-------------------------+-------------------------+-------------------------+", | ||
| "+--------------+---------------------------+---------------------------+---------------------------+", |
There was a problem hiding this comment.
I just have the comments about the header of the expr.
The input sql is AGG(C1) + 1, 1 is the int64 data type, but the header is convert to float after casted
There was a problem hiding this comment.
Do we have the method to make the header consistent, and it can be changed with the changes of the optimizer plan.
cc @andygrove @alamb
There was a problem hiding this comment.
This is a concern I also have for a longer time and had a PR open once.
One approach would be to add an alias for every unnamed expression based on the original query SQL or expression.
This would avoid having the column names changed by the optimizers.
There was a problem hiding this comment.
I really like the idea of adding an alias once (maybe as the initial optimizer pass?)
I am not sure how valuable adding the types in the column names is in general, to be honest. I wouldn't mind if rather than Int(1) this was simply rendered 1
There was a problem hiding this comment.
I really like the idea of adding an alias once (maybe as the initial optimizer pass?)
I am not sure how valuable adding the types in the column names is in general, to be honest. I wouldn't mind if rather than
Int(1)this was simply rendered1
Do you have plan or a draft pr for that? @Dandandan
There was a problem hiding this comment.
Perhaps @Dandandan was referring to #280 / #279
There was a problem hiding this comment.
Yes indeed, we can give those a second life 🎉
I had some concerns with the PR, but I believe it is still a big improvement over the current state of things.
alamb
left a comment
There was a problem hiding this comment.
I think it looks like a good improvement to me. Perhaps we can file a follow on ticket for the column renaming?
| "+--------------+-------------------------+-------------------------+-------------------------+", | ||
| "| 1.5 | 2.5 | 3.5 | 2.5 |", | ||
| "+--------------+-------------------------+-------------------------+-------------------------+", | ||
| "+--------------+---------------------------+---------------------------+---------------------------+", |
There was a problem hiding this comment.
I really like the idea of adding an alias once (maybe as the initial optimizer pass?)
I am not sure how valuable adding the types in the column names is in general, to be honest. I wouldn't mind if rather than Int(1) this was simply rendered 1
| Sort: #revenue DESC NULLS FIRST\ | ||
| \n Projection: #customer.c_custkey, #customer.c_name, #SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS revenue, #customer.c_acctbal, #nation.n_name, #customer.c_address, #customer.c_phone, #customer.c_comment\ | ||
| \n Aggregate: groupBy=[[#customer.c_custkey, #customer.c_name, #customer.c_acctbal, #customer.c_phone, #nation.n_name, #customer.c_address, #customer.c_comment]], aggr=[[SUM(#lineitem.l_extendedprice * CAST(Int64(1) AS Float64) - #lineitem.l_discount)]]\ | ||
| \n Aggregate: groupBy=[[#customer.c_custkey, #customer.c_name, #customer.c_acctbal, #customer.c_phone, #nation.n_name, #customer.c_address, #customer.c_comment]], aggr=[[SUM(#lineitem.l_extendedprice * Float64(1) - #lineitem.l_discount)]]\ |
|
This PR appears to have some conflicts now |
|
@Dandandan This now needs a rebase |
|
Benchmark runs are scheduled for baseline = 97b3a4b and contender = f48a997. f48a997 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #3431
Rationale for this change
See issue
What changes are included in this PR?
Are there any user-facing changes?