ARROW-5586: [R] convert Array of LIST type to R lists#4575
ARROW-5586: [R] convert Array of LIST type to R lists#4575romainfrancois wants to merge 5 commits intoapache:masterfrom
Conversation
r/src/array__to_vector.cpp
Outdated
There was a problem hiding this comment.
This is a relatively costly thing to do. I'm not sure how we can escape it. Do you keep a reference to the generated Slice/Array, or does it get consumed and transformed in something else. See #4366 (comment) for reference.
There was a problem hiding this comment.
::Allocate() makes an R list, and the various ::Ingest.*() fill that list.
I'll look if/how I can do this without actually calling Slice(), might be easy enough to just scan through values_array I guess.
There was a problem hiding this comment.
The challenging part is dealing with the bitmap and null_count.
There was a problem hiding this comment.
This can be a huge timesink, I think it's ok for now, but be wary that this will be a problem on large Array.
There was a problem hiding this comment.
I see. I don't think it's too much trouble to skip using Slice and use the bitmap of the values_array etc ... I'll have a look.
There was a problem hiding this comment.
🤔 yeah it's about the Array__as_vector(slice) which needs slice as a proper array already. The Converter api is designed to ingest all of an array, so I guess it would need some extra methods to only ingest a slice of an array without. I'll open another issue for that.
There was a problem hiding this comment.
r/src/arrowExports.cpp
Outdated
There was a problem hiding this comment.
I find this file very error prone to read. I'd recommend going selective import route.
arrowExports.cpp # with ifdef import of impl/mock depending on ARROW_R_WITH_ARROW
arrowExports_impl.cpp
arrowExports_mock.cpp
Don't do this in this PR, open a followup ticket.
There was a problem hiding this comment.
Not that I would expect anyone to actually read that file (hence it is supposed to be hidden in pull requests, ...) but I've open that issue:
r/src/array__to_vector.cpp
Outdated
There was a problem hiding this comment.
Is it worth adding support for FixedSizeList?
There was a problem hiding this comment.
this is new to me, that probably is another issue/pr.
There was a problem hiding this comment.
yep, but the code will be similar, almost equal, FixedSizeList also export value_offset.
r/src/array.cpp
Outdated
There was a problem hiding this comment.
These are "low-level" functions, they don't perform any bound checking, you can segfault with the wrong i. Do they need to be exported?
There was a problem hiding this comment.
🤷♂ what's the rule about what is export worthy ?
Codecov Report
@@ Coverage Diff @@
## master #4575 +/- ##
==========================================
- Coverage 83.42% 74.98% -8.44%
==========================================
Files 209 56 -153
Lines 16062 3358 -12704
Branches 1253 0 -1253
==========================================
- Hits 13399 2518 -10881
+ Misses 2384 840 -1544
+ Partials 279 0 -279
Continue to review full report at Codecov.
|
8229213 to
300f897
Compare
We don't have the reverse operation yet (convert from an R data structure) to a list array, so those aren't easy to make, but e.g. we can get some with the json reader:
Created on 2019-06-14 by the reprex package (v0.3.0.9000)