You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/22 12:48:30 UTC

[GitHub] [arrow-rs] roee88 opened a new issue #596: FFI implementation deviates from specification for memory management on export

roee88 opened a new issue #596:
URL: https://github.com/apache/arrow-rs/issues/596


   The FFI implementation deviates from the C Data Interface specification in [Member allocation](https://arrow.apache.org/docs/format/CDataInterface.html#member-allocation).
   
   > The ArrowSchema and ArrowArray structures follow the same conventions for memory management. The term “base structure” below refers to the ArrowSchema or ArrowArray that is passed between producer and consumer – not any child structure thereof.
   >
   > Member allocation
   >
   > It is intended for the base structure to be stack- or heap-allocated by the consumer. In this case, the producer API should take a pointer to the consumer-allocated structure.
   
   Taking an _example_ of a pyarrow consumer and arrow-rs producer, my understanding is that the intended behavior is:
   1. pyarrow allocates empty ffi structure(s) 
   2. pyarrow pass the empty ffi structure(s) to arrow-rs
   3. arrow-rs fills the empty ffi structure(s) from a rust Array/Schema/RecordBatch
   4. when done with the data pyarrow calls the release callback on the ffi structure(s) that it allocated in step 1
   
   This doesn't seem to be supported currently. The current implementation for exporting through the C Data Interface only allows converting an arrow array/schema to ffi array/schema, both allocated by arrow-rs. A solution to this must also take into account that the release callback must be called explicitly by the consumer, avoiding calling release unintentionally by drop.
   
   The opposite path is correctly implemented as demonstrated in https://github.com/apache/arrow-rs/blob/9f40f899e439d072fc859e0b4abf46776387e0d1/arrow-pyarrow-integration-testing/src/lib.rs#L174 (rust allocates empty, pass to the the a python exporter that fills those structs).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] roee88 commented on issue #596: FFI implementation deviates from specification for memory management on export

Posted by GitBox <gi...@apache.org>.
roee88 commented on issue #596:
URL: https://github.com/apache/arrow-rs/issues/596#issuecomment-893456676


   I think that the current implementation also leaks memory. For example in https://github.com/apache/arrow-rs/blob/6bf1988852f87da21a163203eec4c83a7b692901/arrow-pyarrow-integration-testing/src/lib.rs#L192-L202 array.to_raw internally calls Arc::into_raw on the structures but nothing ever calls Arc::from_raw to release them. Is that correct? 
   cc @kszucs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jorgecarleitao commented on issue #596: FFI implementation deviates from specification for memory management on export

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on issue #596:
URL: https://github.com/apache/arrow-rs/issues/596#issuecomment-885158526


   I agree.
   
   What I am not following is which pyarrow API we should use to create the two pointers: `pyarrow.cffi` is not public, so we can't really allocate from within pyarrow/c++ to populate it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jorgecarleitao commented on issue #596: FFI implementation deviates from specification for memory management on export

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on issue #596:
URL: https://github.com/apache/arrow-rs/issues/596#issuecomment-885230150


   Ups, you are right; I was using an older version of pyarrow that did not have this exposed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] roee88 commented on issue #596: FFI implementation deviates from specification for memory management on export

Posted by GitBox <gi...@apache.org>.
roee88 commented on issue #596:
URL: https://github.com/apache/arrow-rs/issues/596#issuecomment-885228354


   This issue is not related to pyarrow, we're actually starting to experiment with other arrow implementations. However, what makes you say it's not public? I see it in https://github.com/apache/arrow/blob/apache-arrow-5.0.0/python/pyarrow/cffi.py and if I `pip install pyarrow` (pyarrow-4.0.1) I can successfully run `from pyarrow.cffi import ffi`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org