ARROW-4975: [C++] Support concatenation of UnionArrays #11843
mbrobbel wants to merge 6 commits into apache:master
lidavidm left a comment
Tricky indeed. I think just concatenating the dense union children (instead of trying to slice out only those elements which are referenced by an offset) is fine.
    {child_one_sliced, child_two_sliced}));
    ASSERT_OK(expected_sliced->ValidateFull());
    AssertArraysEqual(*expected_sliced, *concat_sliced_arrays);
    }
Can we also test concatenation of an array (1) that is not sliced but whose children are sliced or have an offset, and (2) that is sliced and whose children additionally have an offset?
bkietz left a comment
This is looking great, thanks!
This makes it more readable. Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
bkietz left a comment
LGTM
The CI failures seemed like unrelated flakes, so I'm restarting those jobs to see if we can get to all-green.
merging
Benchmark runs are scheduled for baseline = e903a21 and contender = a93c493. a93c493 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
This PR adds support for concatenation of union arrays.
For sparse union arrays this is trivial: the type buffers and child arrays are concatenated like the other concatenate implementations.
For dense union arrays the following approach is used:
Does this make sense, or should we slice child arrays (when required) and reflect this in the concatenated offsets buffer?
This PR also removes a check in DenseUnionArray::Make that rejected empty offsets buffers. This made it impossible to construct empty dense union arrays. I discussed removing this check with @bkietz.