Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/07 03:04:56 UTC

[GitHub] [arrow] paddyhoran commented on pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

paddyhoran commented on pull request #8116:
URL: https://github.com/apache/arrow/pull/8116#issuecomment-687998129


   I mentioned this over on [ARROW-9921](https://github.com/apache/arrow/pull/8117), but I don't think we originally intended `From<Vec<_>>` to be used for anything other than testing.
   
   I thought that we needed to use the functions from [memory](https://github.com/apache/arrow/blob/master/rust/arrow/src/memory.rs) to allocate (to control alignment and padding), but this approach lets the `Vec` do the allocation (via `collect`). I think you would want to ensure that all arrays are allocated with a consistent alignment, i.e. by using memory.rs.
   
   From the spec:
   > Implementations are recommended to allocate memory on aligned addresses (multiple of 8- or 64-bytes) and pad (overallocate) to a length that is a multiple of 8 or 64 bytes. When serializing Arrow data for interprocess communication, these alignment and padding requirements are enforced.
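
To make the alignment and padding recommendation concrete, here is a minimal sketch using only `std::alloc`. It is not the code in memory.rs: `ALIGNMENT`, `padded_len`, and `AlignedBuf` are hypothetical names chosen for illustration, and the 64-byte value is just one of the two options the spec mentions.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::ptr::NonNull;

/// Alignment and padding granularity in bytes.
/// Assumption: the 64-byte option from the spec quote.
const ALIGNMENT: usize = 64;

/// Rounds `len` up to the next multiple of `ALIGNMENT`.
fn padded_len(len: usize) -> usize {
    (len + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT
}

/// Hypothetical owned byte buffer whose start address is a multiple of
/// `ALIGNMENT` and whose allocated size is padded to a multiple of it.
struct AlignedBuf {
    ptr: NonNull<u8>,
    len: usize,      // bytes requested by the caller
    capacity: usize, // bytes actually allocated (padded)
}

impl AlignedBuf {
    /// Allocates a zero-initialized buffer of `len` usable bytes.
    fn new(len: usize) -> Self {
        // Allocate at least one block so the layout size is never zero.
        let capacity = padded_len(len.max(1));
        let layout = Layout::from_size_align(capacity, ALIGNMENT).expect("invalid layout");
        // SAFETY: `layout` has a non-zero size.
        let raw = unsafe { alloc_zeroed(layout) };
        let ptr = NonNull::new(raw).expect("allocation failed");
        AlignedBuf { ptr, len, capacity }
    }

    /// Views the requested (unpadded) bytes.
    fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` points to `capacity >= len` initialized bytes.
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // Must match the layout used in `new`.
        let layout = Layout::from_size_align(self.capacity, ALIGNMENT).expect("invalid layout");
        // SAFETY: `ptr` was allocated in `new` with this exact layout.
        unsafe { dealloc(self.ptr.as_ptr(), layout) };
    }
}
```

With this sketch, `AlignedBuf::new(100)` allocates 128 bytes at a 64-byte-aligned address. By contrast, a `Vec<T>` only requests the alignment of `T` from the allocator (4 bytes for `Vec<f32>`, for example), so a buffer built via `collect` is not guaranteed to meet the 64-byte recommendation.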
   
   This approach might be fine for an application in the wild (one that won't use IPC), but DataFusion is part of the Arrow project itself and so *should* follow the rules/recommendations. Thoughts?

