Change ScalarValue::Struct to ArrayRef #7893
wait on #7862
let should_fail_on_seralize: Vec<ScalarValue> = vec![
    // Should fail due to empty values
    ScalarValue::Struct(
        Some(vec![]),
Moved to round_trip_scalar_values, since this value can now be serialized:
ScalarValue::try_from(&DataType::Struct(Fields::from(vec![
Field::new("a", DataType::Int32, true),
Field::new("a", DataType::Boolean, false),
])))
.unwrap(),
I agree that there is no need to test serializing an empty array as it isn't a valid input anyways
@alamb Ready for review!
"| |",
"| {a: , b: } |",
"| {a: , b: {ba: , bb: }} |",
"| {a: 1, b: } |",
I think there is no way to construct a StructArray like the ones on the left-hand side of the diff.
explain select struct(1, 2.3, 'abc');
----
logical_plan
Projection: Struct({c0:Int64(1),c1:Float64(2.3),c2:Utf8("abc")}) AS struct(Int64(1),Float64(2.3),Utf8("abc"))
    .into_iter()
    .map(|(name, scalar)| (Field::new(name, scalar.data_type(), false), scalar))
    .unzip();
// Wrapper for ScalarValue::Struct that checks the length of the arrays, without nulls
TODO: Remove these two wrappers; they are no longer needed after changing to Scalar<T>.
Yes, we haven't changed to Scalar yet.
let struct_type = DataType::Struct(Fields::from(fields));
let mut column_wise_ordering_values = vec![];
let num_columns = fields.len();
for i in 0..num_columns {
I think there might be a better design for StructArray here (the previous design is based on the old ScalarValue::Struct), but I avoided changing the logic or data structure in this PR.
May benefit #8558?
}

/// Return a `null` literal representing a struct type like: `{ a: bool }`
// / Return a `null` literal representing a struct type like: `{ a: bool }`
nit: the doc comment marker was split into `// /`. Suggested change:
- // / Return a `null` literal representing a struct type like: `{ a: bool }`
+ /// Return a `null` literal representing a struct type like: `{ a: bool }`
    Ok(ordering_columns_per_row)
} else {
    exec_err!(
TODO: use internal_err! here instead of exec_err!
Rebase
@jayzhan211 -- is this PR ready for a review?
Yes. It keeps getting conflicts, but I think you can take a first pass unless the conflicts are critical.
Signed-off-by: jayzhan211 <[email protected]>
Rebase
Sorry -- starting to look now
alamb left a comment:
This is looking really good @jayzhan211 -- thank you both for the PR as well as for sticking with it for so long
I had a few comments about how to improve the implementation by using arrow kernels, but I also think we could merge this as is and then implement those improvements as a follow on PR if you prefer.
Again, thank you for your patience.
@@ -323,20 +335,32 @@ impl Accumulator for OrderSensitiveArrayAggAccumulator {
impl OrderSensitiveArrayAggAccumulator {
    fn evaluate_orderings(&self) -> Result<ScalarValue> {
Maybe we can file a follow on ticket to track this idea
    );
}

let mut valid = BooleanBufferBuilder::new(arrays.len());
What do you think about using https://docs.rs/arrow/latest/arrow/compute/kernels/concat/index.html here? I think you should be able to simply concat the arrays together without having to have special handling (and if concat doesn't support StructArray we can potentially file a ticket upstream in arrow-rs)
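To make the suggestion above concrete, here is a minimal std-only sketch. It is a toy model and does not use the arrow-rs API: `ToyStructArray` and `concat_toy` are hypothetical names invented for illustration, and the two fixed fields `a` and `b` are an assumption. It only shows why row-wise concatenation carries the validity bits along with the child columns, so the caller would not need a separate validity builder. In the PR itself the equivalent would be the arrow concat kernel linked above, provided it supports StructArray.

```rust
/// Hypothetical stand-in for a nullable struct column (NOT arrow-rs `StructArray`):
/// one validity flag per row plus one child vector per field.
#[derive(Debug, Clone, PartialEq)]
struct ToyStructArray {
    /// `false` means the whole struct row is null.
    validity: Vec<bool>,
    /// Child column for the assumed field `a`.
    a: Vec<Option<i32>>,
    /// Child column for the assumed field `b`.
    b: Vec<Option<bool>>,
}

impl ToyStructArray {
    /// Number of rows in the column.
    fn len(&self) -> usize {
        self.validity.len()
    }
}

/// Concatenate chunks row-wise. Validity and child values are appended
/// together, so null rows need no special handling by the caller.
fn concat_toy(chunks: &[ToyStructArray]) -> ToyStructArray {
    let mut out = ToyStructArray {
        validity: Vec::new(),
        a: Vec::new(),
        b: Vec::new(),
    };
    for chunk in chunks {
        out.validity.extend_from_slice(&chunk.validity);
        out.a.extend_from_slice(&chunk.a);
        out.b.extend_from_slice(&chunk.b);
    }
    out
}
```

Each single-row chunk here plays the role of one struct scalar stored as an array; concatenating them yields one column whose validity is simply the chunks' validity in order.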
Thanks @jayzhan211 -- this is looking great. There are a few more outstanding suggestions, but I think we could do them as follow on PRs -- shall I merge this one?
Sure!
Thanks again @jayzhan211
Which issue does this PR close?
Closes #7835
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?