Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/03 06:18:49 UTC
[GitHub] [arrow-rs] hu6360567 opened a new issue #994: Confused import/export API when interacting with C++
hu6360567 opened a new issue #994:
URL: https://github.com/apache/arrow-rs/issues/994
**Which part is this question about**
FFI apis and raw pointer management
**Describe your question**
I'm trying to import/export Arrow arrays between C++ and Rust, but it crashes when the C++ side is compiled in release mode.
ref: https://github.com/apache/arrow/issues/11846
What confuses me is that when importing/exporting an array/schema on the Rust side, the input pointer is `const`.
However, the C++ API `arrow::ImportRecordBatch` moves the payload out of the input array/schema pointers.
Is there any code that shows best practice for import/export between Rust and C++?
My implementation of exporting an Arrow array allocated in Rust:
Rust part:
```rust
fn export<T>(array: T, content: *mut *const FFI_ArrowArray, schema: *mut *const FFI_ArrowSchema) where T: Array {
    match array.to_raw() {
        Ok((c, s)) => unsafe {
            // Is it right to copy the pointer addresses directly? When are `c` and `s` actually dropped?
            // Should I use `copy_to` or `replace` from `std::ptr`?
            *content = c;
            *schema = s;
        }
        Err(e) => {
            eprintln!("{}", e);
        }
    }
}

#[no_mangle]
pub extern "C" fn export_array(content: *mut *const FFI_ArrowArray, schema: *mut *const FFI_ArrowSchema) {
    let sa = StringArray::from(vec!["a", "b", "c"]);
    export(sa, content, schema);
}
```
C++ part
```C++
void func() {
    const ArrowArray *content = nullptr;
    const ArrowSchema *schema = nullptr;
    export_array(&content, &schema);
    // I have to const_cast the pointers
    auto array = *arrow::ImportRecordBatch(const_cast<ArrowArray *>(content), const_cast<ArrowSchema *>(schema));
    std::cout << array->ToString() << std::endl;
}
```
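For context on why the C++ side wants non-const pointers: the Arrow C Data Interface specification describes moving a struct as copying it bitwise and then marking the source as released by setting its `release` callback to NULL, without calling it. The sketch below illustrates that rule with a simplified, hypothetical `DemoArray` struct (it is not arrow-rs's `FFI_ArrowArray`); `export_demo`, `move_demo`, and `release_demo` are names made up for illustration.

```rust
use std::ffi::c_void;

/// Simplified, hypothetical stand-in for a C Data Interface struct.
#[repr(C)]
pub struct DemoArray {
    pub length: i64,
    /// `None` (NULL) means "already released or moved-from".
    pub release: Option<unsafe extern "C" fn(*mut DemoArray)>,
    pub private_data: *mut c_void,
}

/// Producer-side release callback: frees the payload and marks the struct released.
unsafe extern "C" fn release_demo(array: *mut DemoArray) {
    unsafe {
        let array = &mut *array;
        if !array.private_data.is_null() {
            drop(Box::from_raw(array.private_data as *mut Vec<u8>));
        }
        array.private_data = std::ptr::null_mut();
        array.release = None;
    }
}

/// Producer: fill a struct that the caller allocated.
pub fn export_demo(out: &mut DemoArray, payload: Vec<u8>) {
    out.length = payload.len() as i64;
    out.private_data = Box::into_raw(Box::new(payload)) as *mut c_void;
    out.release = Some(release_demo);
}

/// Consumer: "move" the struct by copying its members, then mark the
/// source as released WITHOUT calling its release callback.
pub fn move_demo(src: &mut DemoArray) -> DemoArray {
    let moved = DemoArray {
        length: src.length,
        release: src.release,
        private_data: src.private_data,
    };
    src.release = None;
    moved
}
```

Because the move has to write to the source struct, a consumer that moves needs a mutable pointer to it, which is presumably why the `const_cast` is required above.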
**Additional context**
ref: https://github.com/apache/arrow/issues/11846
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] hu6360567 edited a comment on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 edited a comment on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets for import/export over FFI.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L149
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
For the safety notes on `ArrowArray`, see:
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/ffi.rs#L638-L643
However, for the workflow of exporting to FFI, I couldn't find where the pointers created by `into_raw` are released.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L136
C++ `arrow::ImportArray` moves the payload out of the pointer, but cannot free the pointer itself, since it was allocated by an `Arc` in Rust.
Does this lead to a memory leak?
My proposal:
```rust
pub fn export_array<T>(array: &T, content: *mut FFI_ArrowArray, schema: *mut FFI_ArrowSchema) -> Result<()> where T: Array {
    if content.is_null() { Err(ArrowError::MemoryError("content is null".to_string()))? }
    if schema.is_null() { Err(ArrowError::MemoryError("schema is null".to_string()))? }
    let (content_ptr, schema_ptr) = array.to_raw().unwrap();
    // swap content/content_ptr, schema/schema_ptr
    unsafe {
        content.swap(content_ptr as *mut FFI_ArrowArray);
        schema.swap(schema_ptr as *mut FFI_ArrowSchema);
    }
    // release content_ptr/schema_ptr
    unsafe {
        let _ = ArrowArray::try_from_raw(content_ptr, schema_ptr);
    }
    Ok(())
}

pub fn import_array(content: *mut FFI_ArrowArray, schema: *mut FFI_ArrowSchema) -> Result<ArrayRef> {
    let empty_array = unsafe { ArrowArray::empty() };
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(empty_array);
    unsafe {
        content.swap(content_ptr as *mut FFI_ArrowArray);
        schema.swap(schema_ptr as *mut FFI_ArrowSchema);
    }
    unsafe { make_array_from_raw(content_ptr, schema_ptr) }
}
```
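To make the leak question concrete, here is a minimal standard-library sketch of how a pointer produced by `Arc::into_raw` behaves, assuming (as noted above) that the raw pointers come from an `Arc`. The helper names `leak` and `reclaim` are hypothetical and are not part of arrow-rs.

```rust
use std::sync::Arc;

/// Leak an `Arc` as a raw pointer, as one might before handing it to foreign code.
/// The strong count is NOT decremented, so the allocation stays alive.
pub fn leak(value: Arc<String>) -> *const String {
    Arc::into_raw(value)
}

/// Reclaim ownership of a pointer previously produced by `leak`.
///
/// # Safety
/// `ptr` must come from `Arc::into_raw` and must be reclaimed at most once.
pub unsafe fn reclaim(ptr: *const String) -> Arc<String> {
    unsafe { Arc::from_raw(ptr) }
}
```

Until something calls `Arc::from_raw` on the pointer, the reference it represents is never dropped, so foreign code that only moves the payload out cannot free the `Arc` allocation itself. That is the motivation for the `try_from_raw` call in the proposal above.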
[GitHub] [arrow-rs] hu6360567 edited a comment on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 edited a comment on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets for import/export over FFI.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L149
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
For the safety notes on `ArrowArray`, see:
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/ffi.rs#L638-L643
However, for the workflow of exporting to FFI, I couldn't find where the pointers created by `into_raw` are released.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L136
C++ `arrow::ImportArray` moves the payload out of the pointer, but cannot free the pointer itself, since it was allocated by an `Arc` in Rust.
Does this lead to a memory leak?
My proposal:
```rust
pub type Result<T> = std::result::Result<T, error::Error>;

pub(crate) fn export_array(array: ArrowArray, content: *mut FFI_ArrowArray, schema: *mut FFI_ArrowSchema) -> Result<()> {
    if content.is_null() { Err(ArrowError::MemoryError("content is null".to_string()))? }
    if schema.is_null() { Err(ArrowError::MemoryError("schema is null".to_string()))? }
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(array);
    // swap content/content_ptr, schema/schema_ptr, like C++ std::unique_ptr
    unsafe {
        content.swap(content_ptr as *mut FFI_ArrowArray);
        schema.swap(schema_ptr as *mut FFI_ArrowSchema);
    }
    // release content_ptr/schema_ptr
    unsafe {
        let _ = ArrowArray::try_from_raw(content_ptr, schema_ptr);
    }
    Ok(())
}

pub(crate) fn import_array(content: *const FFI_ArrowArray, schema: *const FFI_ArrowSchema) -> Result<ArrowArray> {
    let empty_array = unsafe { ArrowArray::empty() };
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(empty_array);
    unsafe {
        content.copy_to(content_ptr as *mut FFI_ArrowArray, 1);
        schema.copy_to(schema_ptr as *mut FFI_ArrowSchema, 1);
    }
    unsafe { Ok(ArrowArray::try_from_raw(content_ptr, schema_ptr)?) }
}
```
[GitHub] [arrow-rs] hu6360567 commented on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 commented on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L132
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
[GitHub] [arrow-rs] hu6360567 edited a comment on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 edited a comment on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets for import/export over FFI.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L149
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
For the safety notes on `ArrowArray`, see:
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/ffi.rs#L638-L643
However, for the workflow of exporting to FFI, I couldn't find where the pointers created by `into_raw` are released.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L136
C++ `arrow::ImportArray` moves the payload out of the pointer, but cannot free the pointer itself, since it was allocated by an `Arc` in Rust.
Does this lead to a memory leak?
[GitHub] [arrow-rs] hu6360567 edited a comment on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 edited a comment on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets for import/export over FFI.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L149
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
For the safety notes on `ArrowArray`, see:
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/ffi.rs#L638-L643
However, for the workflow of exporting to FFI, I couldn't find where the pointers created by `into_raw` are released.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L136
C++ `arrow::ImportArray` moves the payload out of the pointer, but cannot free the pointer itself, since it was allocated by an `Arc` in Rust.
Does this lead to a memory leak?
My proposal:
```rust
pub type Result<T> = std::result::Result<T, error::Error>;

pub(crate) fn export_array(array: ArrowArray, content: *mut FFI_ArrowArray, schema: *mut FFI_ArrowSchema) -> Result<()> {
    if content.is_null() { Err(ArrowError::MemoryError("content is null".to_string()))? }
    if schema.is_null() { Err(ArrowError::MemoryError("schema is null".to_string()))? }
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(array);
    // write to mutable content/schema
    unsafe {
        content_ptr.copy_to(content, 1);
        schema_ptr.copy_to(schema, 1);
    }
    // release content_ptr/schema_ptr
    unsafe {
        let _ = ArrowArray::try_from_raw(content_ptr, schema_ptr);
    }
    Ok(())
}

pub(crate) fn import_array(content: *const FFI_ArrowArray, schema: *const FFI_ArrowSchema) -> Result<ArrowArray> {
    let empty_array = unsafe { ArrowArray::empty() };
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(empty_array);
    unsafe {
        content.copy_to(content_ptr as *mut FFI_ArrowArray, 1);
        schema.copy_to(schema_ptr as *mut FFI_ArrowSchema, 1);
    }
    unsafe { Ok(ArrowArray::try_from_raw(content_ptr, schema_ptr)?) }
}
```
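One thing worth checking in a `copy_to`-based export: a bitwise copy leaves the source and the destination pointing at the same payload, so if the source struct is dropped afterwards and its `Drop` releases the payload, the destination is left dangling. Swapping with the caller's (empty) slot avoids this, because exactly one struct owns the payload at any time. The sketch below shows the swap approach with a hypothetical `Owned` type standing in for a struct whose `Drop` frees its payload; it is an illustration under that assumption, not arrow-rs code.

```rust
use std::ptr;

/// Hypothetical stand-in for an FFI struct whose `Drop` releases its payload.
pub struct Owned {
    pub payload: *mut Vec<u8>,
}

impl Owned {
    /// An empty struct that owns nothing.
    pub fn empty() -> Self {
        Owned { payload: ptr::null_mut() }
    }

    /// A struct that owns a heap-allocated payload.
    pub fn new(data: Vec<u8>) -> Self {
        Owned { payload: Box::into_raw(Box::new(data)) }
    }
}

impl Drop for Owned {
    fn drop(&mut self) {
        if !self.payload.is_null() {
            // Free the payload exactly once.
            unsafe { drop(Box::from_raw(self.payload)) };
        }
    }
}

/// Move `src` into the caller-provided slot `dst` by swapping.
///
/// # Safety
/// `dst` must point to a valid, initialized `Owned`.
pub unsafe fn export_by_swap(mut src: Owned, dst: *mut Owned) {
    unsafe { ptr::swap(&mut src, dst) };
    // `src` now holds the previous contents of `dst` (for example an empty
    // struct) and is dropped here, so nothing is released twice.
}
```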
[GitHub] [arrow-rs] hu6360567 edited a comment on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 edited a comment on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets for import/export over FFI.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L149
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
For the safety notes on `ArrowArray`, see:
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/ffi.rs#L638-L643
However, for the workflow of exporting to FFI, I couldn't find where the pointers created by `into_raw` are released.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L136
C++ `arrow::ImportArray` moves the payload out of the pointer, but cannot free the pointer itself, since it was allocated by an `Arc` in Rust.
Does this lead to a memory leak?
My proposal:
```rust
pub(crate) fn export_array<T>(array: T, content: *mut FFI_ArrowArray, schema: *mut FFI_ArrowSchema) -> Result<()> where T: Array {
    if content.is_null() { Err(ArrowError::MemoryError("content is null".to_string()))? }
    if schema.is_null() { Err(ArrowError::MemoryError("schema is null".to_string()))? }
    let (content_ptr, schema_ptr) = array.to_raw().unwrap();
    println!("content{:p}, content_ptr{:p}, schema{:p}, schema_ptr{:p}", content, content_ptr, schema, schema_ptr);
    // swap content/content_ptr, schema/schema_ptr
    unsafe {
        content.swap(content_ptr as *mut FFI_ArrowArray);
        schema.swap(schema_ptr as *mut FFI_ArrowSchema);
    }
    // release content_ptr/schema_ptr
    unsafe {
        let _ = ArrowArray::try_from_raw(content_ptr, schema_ptr);
    }
    Ok(())
}

pub(crate) fn import_array(content: *mut FFI_ArrowArray, schema: *mut FFI_ArrowSchema) -> Result<ArrowArray> {
    let empty_array = unsafe { ArrowArray::empty() };
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(empty_array);
    unsafe {
        content.swap(content_ptr as *mut FFI_ArrowArray);
        schema.swap(schema_ptr as *mut FFI_ArrowSchema);
    }
    unsafe { Ok(ArrowArray::try_from_raw(content_ptr, schema_ptr)?) }
}
```
[GitHub] [arrow-rs] hu6360567 edited a comment on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 edited a comment on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets for import/export over FFI.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L149
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
For the safety notes on `ArrowArray`, see:
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/ffi.rs#L638-L643
However, for the workflow of exporting to FFI, I couldn't find where the pointers created by `into_raw` are released.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L136
`arrow::ImportArray` moves the payload out of the pointer, but cannot free the pointer itself, since it was allocated by an `Arc` in Rust.
Does this lead to a memory leak?
[GitHub] [arrow-rs] hu6360567 edited a comment on issue #994: Confused import/export API when interacting with C++
Posted by GitBox <gi...@apache.org>.
hu6360567 edited a comment on issue #994:
URL: https://github.com/apache/arrow-rs/issues/994#issuecomment-985254643
In `pyarrow.rs`, I found some snippets for import/export over FFI.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L110-L149
The intended workflow for importing an array over FFI:
1. Prepare an `ArrowArray` and leak both pointers to FFI
2. Write the C Data Interface structs into both pointers
3. Import from both pointers, which were allocated in the first step
For the safety notes on `ArrowArray`, see:
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/ffi.rs#L638-L643
However, for the workflow of exporting to FFI, I couldn't find where the pointers created by `into_raw` are released.
https://github.com/apache/arrow-rs/blob/e9be49d962560ce5b87544a2933d8b207322cf60/arrow/src/pyarrow.rs#L136
C++ `arrow::ImportArray` moves the payload out of the pointer, but cannot free the pointer itself, since it was allocated by an `Arc` in Rust.
Does this lead to a memory leak?
My proposal:
```rust
pub type Result<T> = std::result::Result<T, error::Error>;

pub(crate) fn export_array(array: ArrowArray, content: *mut FFI_ArrowArray, schema: *mut FFI_ArrowSchema) -> Result<()> {
    if content.is_null() { Err(ArrowError::MemoryError("content is null".to_string()))? }
    if schema.is_null() { Err(ArrowError::MemoryError("schema is null".to_string()))? }
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(array);
    // write to mutable content/schema
    unsafe {
        content_ptr.copy_to(content, 1);
        schema_ptr.copy_to(schema, 1);
    }
    // release content_ptr/schema_ptr
    unsafe {
        let _ = ArrowArray::try_from_raw(content_ptr, schema_ptr);
    }
    Ok(())
}

pub(crate) fn import_array(content: *const FFI_ArrowArray, schema: *const FFI_ArrowSchema) -> Result<ArrowArray> {
    let empty_array = unsafe { ArrowArray::empty() };
    let (content_ptr, schema_ptr) = ArrowArray::into_raw(empty_array);
    unsafe {
        content.copy_to(content_ptr as *mut FFI_ArrowArray, 1);
        schema.copy_to(schema_ptr as *mut FFI_ArrowSchema, 1);
    }
    unsafe { Ok(ArrowArray::try_from_raw(content_ptr, schema_ptr)?) }
}
```