ARROW-10149: [Rust] Improved support for externally owned memory regions #8316
jorgecarleitao wants to merge 3 commits into apache:master from jorgecarleitao:external_bytes
rust/arrow/src/buffer.rs
This was not being used, and thus I dropped it.
rust/arrow/src/bytes.rs
Perhaps using a closure that mutates a captured variable?
Apparently you need to use either FnMut or a Cell:
https://stackoverflow.com/questions/38677736/passing-a-closure-that-modifies-its-environment-to-a-function-in-rust
Done. I had to introduce a Mutex, because a FnMut is now mutable, while the data itself is not. I am not very happy with this, but it makes sense as we cannot assume that the C data interface is thread-safe, right?
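To make the Mutex approach mentioned above concrete, here is a minimal sketch of a shared `FnMut` callback. The names `SharedFnMut`, `call_shared` and `run_counter` are hypothetical and are not taken from the PR; the sketch only illustrates why a lock is needed once the closure mutates its captures.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical alias: a shared callback that may mutate its captured state.
type SharedFnMut = Arc<Mutex<dyn FnMut() + Send>>;

/// Calls the shared callback once.
fn call_shared(callback: &SharedFnMut) {
    // `Arc` only hands out shared references; locking the `Mutex` provides
    // the exclusive access that calling an `FnMut` requires.
    let mut guard = callback.lock().expect("callback mutex poisoned");
    let f = &mut *guard;
    f();
}

/// Runs a counting callback `times` times and returns the last count it reported.
pub fn run_counter(times: usize) -> usize {
    let observed = Arc::new(AtomicUsize::new(0));
    let sink = Arc::clone(&observed);
    let mut calls = 0usize; // owned by the closure after `move`
    let callback: SharedFnMut = Arc::new(Mutex::new(move || {
        calls += 1; // mutating a capture makes this closure `FnMut`, not `Fn`
        sink.store(calls, Ordering::SeqCst);
    }));
    for _ in 0..times {
        call_shared(&callback);
    }
    observed.load(Ordering::SeqCst)
}

fn main() {
    println!("calls observed: {}", run_counter(3));
}
```

The cost of this design is a lock acquisition on every call, which is part of why an immutable `Fn` signature may be preferable when the callback only forwards to foreign code.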
I reverted this. I do not think that the Fn should be mutable, as it just performs an FFI call, whose mutability Rust does not need to know about. I am still trying to test it, but I think that something like Arc<dyn Fn(&mut Bytes)> is a better signature.
    let b = Cell::new(false);
    let dealloc = Arc::new(|bytes: &mut Bytes| {
        *b.get_mut() = true;
        assert_eq!(bytes.as_slice(), &b"hello"[1..4]);
    });

does not compile because it requires moving b inside the closure: if we move b inside the closure (using move), the closure is no longer Fn, but FnMut. If the closure is FnMut, we can no longer call it through an Arc, as an Arc only gives immutable access. To call it through shared access, we need to wrap it in a Mutex.
The difference between the code we are writing here and the example on SO is that we have an immutable function, as the underlying resource that this function acts upon is outside Rust. I.e. from Rust's perspective, the function is Fn.
One way to test this would be to make the closure write something to a file, and verify that it was written. I.e. test that the function mutated something outside of Rust.
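The file-based test suggested above could look roughly like the following sketch. The alias `Release`, the function `release_via_file` and the marker file name are all hypothetical; the point is only that a closure whose side effect lives outside Rust can stay `Fn` and be stored in a plain `Arc`.

```rust
use std::fs;
use std::sync::Arc;

/// Hypothetical alias: an immutable callback that receives the region's bytes.
type Release = Arc<dyn Fn(&[u8])>;

/// Calls a release callback that writes to a file, then returns what was written.
pub fn release_via_file() -> Vec<u8> {
    // Assumed location: a per-process marker file in the system temp directory.
    let path = std::env::temp_dir().join(format!("release_marker_{}.bin", std::process::id()));
    let target = path.clone();

    // The closure mutates none of its captures, so it is `Fn`:
    // the only state it changes is the file system.
    let release: Release = Arc::new(move |bytes: &[u8]| {
        fs::write(&target, bytes).expect("could not write marker file");
    });

    release(&b"hello"[1..4]);

    let written = fs::read(&path).expect("marker file was not written");
    let _ = fs::remove_file(&path); // best-effort cleanup
    written
}

fn main() {
    println!("bytes seen by the callback: {:?}", release_via_file());
}
```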
After trying out various things, I got the following to work:
    use std::cell::Cell;
    use std::sync::Arc;

    pub type VoidFn = Arc<dyn Fn()>;

    fn main() {
        let integer = Arc::new(Cell::new(5));
        let inner = integer.clone();
        let closure = Arc::new(move || {
            inner.set(inner.get() + 1);
        });
        execute_closure(closure);
        println!("After closure: {}", integer.get());
    }

    fn execute_closure(func: VoidFn) {
        func();
    }

I may be missing something though.
By the way, by definition a destructor will mutate some state (visible or not), so it seems FnMut may be fine too.
That is awesome! Thanks a lot for the help and insight, @pitrou.
The destructor mutates state. Shouldn't that state be only the state of self? IMO that is the reason Rust's Drop requires a mutable reference to self, fn drop(&mut self).
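As an illustration of the `fn drop(&mut self)` point, the sketch below shows a destructor that mutates only its own state: it takes a stored callback out of `self` and calls it once. `Region` and `released_after_drop` are hypothetical names, and storing the callback as `Option<Box<dyn FnOnce()>>` is an assumption made for this example, not the design used in the PR.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical type: a region that notifies someone when it is dropped.
struct Region {
    release: Option<Box<dyn FnOnce()>>,
}

impl Drop for Region {
    fn drop(&mut self) {
        // `&mut self` allows moving the callback out of the struct,
        // so it can be invoked at most once.
        if let Some(release) = self.release.take() {
            release();
        }
    }
}

/// Returns whether the callback had run once the region was dropped.
pub fn released_after_drop() -> bool {
    let flag = Rc::new(Cell::new(false));
    let inner = Rc::clone(&flag);
    let region = Region {
        release: Some(Box::new(move || inner.set(true))),
    };
    assert!(!flag.get()); // nothing has been released yet
    drop(region);
    flag.get()
}

fn main() {
    println!("released: {}", released_after_drop());
}
```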
I have no idea, I am out of my depth here (not a Rust developer :-)).
The existing implementation was not useful to support FFI as it did not specify how to release memory.
Closing in favor of #8401
Background
Currently, a memory region (arrow::buffer::BufferData) always knows its capacity, which it uses to drop itself once it is no longer needed. It also knows whether it needs to be dropped or not via BufferData::owner: bool.
However, this is insufficient for the purposes of supporting the C Data Interface, which requires informing the owner that the region is no longer needed, typically via a function call (release), for reference counting by the owner of the region.
This PR
This PR generalizes BufferData (and renames it to Bytes, which is a more natural name for this structure, à la bytes::Bytes) to support foreign deallocators. Specifically, it accepts two deallocation modes:
- Native(usize): the current implementation
- Foreign(Fn): an implementation that calls a function (which can be used to call an FFI)
FYI @pitrou, @nevi-me @paddyhoran @sunchao
Related to #8052, which IMO is blocked by this functionality.
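The two deallocation modes described above can be sketched roughly as follows. This is not the code from the PR: the field names, the constructors `native` and `foreign`, the byte alignment of 1 and the helper `demo_foreign_release` are assumptions chosen to keep the example small. Only the `Native(usize)` / `Foreign(Fn)` split and the `Arc<dyn Fn(&mut Bytes)>` signature come from the discussion.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// How a region is released (variant names follow the PR description).
pub enum Deallocation {
    /// Allocated by Rust; the payload is the capacity in bytes.
    Native(usize),
    /// Owned elsewhere; the callback tells the owner the region is unused.
    Foreign(Arc<dyn Fn(&mut Bytes)>),
}

pub struct Bytes {
    ptr: NonNull<u8>,
    len: usize,
    deallocation: Deallocation,
}

impl Bytes {
    /// Allocates `capacity` zeroed bytes owned by Rust (alignment 1 is assumed).
    pub fn native(capacity: usize) -> Self {
        assert!(capacity > 0, "zero-sized regions are not handled in this sketch");
        let layout = Layout::from_size_align(capacity, 1).expect("invalid layout");
        let ptr = NonNull::new(unsafe { alloc_zeroed(layout) }).expect("allocation failed");
        Bytes { ptr, len: capacity, deallocation: Deallocation::Native(capacity) }
    }

    /// Wraps memory owned by someone else.
    ///
    /// Safety: `ptr` must stay valid for reads of `len` bytes until `release` runs.
    pub unsafe fn foreign(ptr: NonNull<u8>, len: usize, release: Arc<dyn Fn(&mut Bytes)>) -> Self {
        Bytes { ptr, len, deallocation: Deallocation::Foreign(release) }
    }

    pub fn as_slice(&self) -> &[u8] {
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }
}

impl Drop for Bytes {
    fn drop(&mut self) {
        // Clone the Arc out first so `self` can be passed mutably to the callback.
        let release = match &self.deallocation {
            Deallocation::Native(capacity) => {
                let layout = Layout::from_size_align(*capacity, 1).expect("invalid layout");
                unsafe { dealloc(self.ptr.as_ptr(), layout) };
                None
            }
            Deallocation::Foreign(release) => Some(Arc::clone(release)),
        };
        if let Some(release) = release {
            release(self);
        }
    }
}

/// Returns how many times the foreign release callback ran for one region.
pub fn demo_foreign_release() -> usize {
    static DATA: [u8; 5] = *b"hello"; // stands in for memory owned by foreign code
    let calls = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&calls);
    let release: Arc<dyn Fn(&mut Bytes)> = Arc::new(move |_bytes: &mut Bytes| {
        counter.fetch_add(1, Ordering::SeqCst);
    });
    let ptr = NonNull::new(DATA.as_ptr() as *mut u8).expect("static data is never null");
    let bytes = unsafe { Bytes::foreign(ptr, DATA.len(), release) };
    assert_eq!(bytes.as_slice(), &b"hello"[..]);
    drop(bytes);
    calls.load(Ordering::SeqCst)
}

fn main() {
    let native = Bytes::native(4);
    println!("native region: {:?}", native.as_slice());
    drop(native);
    println!("foreign release calls: {}", demo_foreign_release());
}
```

In a real FFI setting the callback would presumably forward to the owner's release function; here an atomic counter stands in for that call so the behavior can be observed from Rust.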