Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/06 12:45:23 UTC

[GitHub] [arrow-rs] tustvold commented on issue #1799: ArrayData Layout Enumeration

tustvold commented on issue #1799:
URL: https://github.com/apache/arrow-rs/issues/1799#issuecomment-1147407543

   > In arrow-rs currently, as bitwise operations are related to Buffer but not BitMap, I guess Buffer is a better type for BooleanArray.
   
   I specifically want to change this, as we can support zero-copy slicing of BitMap, but not Buffer (as you can't slice at the bit-level). I will write a ticket up shortly.
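
To illustrate the difference, here is a minimal sketch of a bitmap that tracks its offset and length in bits, so a slice is just a new view over the same allocation. This is not the arrow-rs implementation: the `Bitmap` struct, its fields, and every method below are hypothetical names chosen for illustration. It assumes the least-significant-bit ordering that the Arrow format uses for validity bitmaps.

```rust
use std::sync::Arc;

/// Hypothetical bitmap: shared bytes plus a bit-level offset and length.
/// Not the arrow-rs type, just a sketch of the idea.
#[derive(Clone, Debug)]
pub struct Bitmap {
    bytes: Arc<[u8]>,
    offset: usize, // start of the view, in bits
    len: usize,    // length of the view, in bits
}

impl Bitmap {
    /// Pack bools into bytes, least-significant bit first.
    pub fn from_bools(values: &[bool]) -> Self {
        let mut bytes = vec![0u8; (values.len() + 7) / 8];
        for (i, v) in values.iter().enumerate() {
            if *v {
                bytes[i / 8] |= 1 << (i % 8);
            }
        }
        Self { bytes: bytes.into(), offset: 0, len: values.len() }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Read bit `i` of this view, accounting for the bit offset.
    pub fn get(&self, i: usize) -> bool {
        assert!(i < self.len, "index out of bounds");
        let bit = self.offset + i;
        (self.bytes[bit / 8] >> (bit % 8)) & 1 == 1
    }

    /// Zero-copy slice: only the offset and length change, and the
    /// offset does not need to fall on a byte boundary.
    pub fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        Self {
            bytes: Arc::clone(&self.bytes),
            offset: self.offset + offset,
            len,
        }
    }

    /// True if both views point at the same underlying allocation.
    pub fn shares_allocation_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.bytes, &other.bytes)
    }
}
```

With this shape, `bitmap.slice(3, 5)` starts in the middle of a byte without copying anything. A view that only records a byte offset cannot express that starting point.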
   
   > Curiously, why not directly declare each type of Array with Buffers, for example:
   
   A couple of reasons:
   
   * Some operations don't care about the underlying array type, e.g. IPC, FFI, nullif, etc. ArrayData gives them a single type-erased representation to work with
   * Buffer is not strongly typed and so is not a drop-in replacement for RawPtrBox (which I also have a separate plan to tweak)
   * It reduces code churn; in theory this might not even need to be a breaking change
   
   Basically I see Array and ArrayData filling different roles:
   
   * Array, ArrayBuilder, etc... are user-facing and should prioritise providing a strongly-typed, idiomatic, safe API for users
   * ArrayData is the low level API, with a focus on interoperability with other arrow systems
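
A rough sketch of that split, under simplifying assumptions: `RawData`, `TypeTag`, `Int32View` and `payload_size` are hypothetical stand-ins for illustration and are not the arrow-rs `ArrayData` or `Int32Array` APIs. The low-level type is untyped bytes plus a tag, generic code works on it directly, and the user-facing wrapper validates once and then exposes typed access.

```rust
/// Hypothetical type tag, standing in for a full data type.
#[derive(Clone, Debug, PartialEq)]
pub enum TypeTag {
    Int32,
    UInt32,
}

/// Hypothetical low-level, type-erased representation.
#[derive(Clone, Debug)]
pub struct RawData {
    pub tag: TypeTag,
    pub len: usize,
    pub bytes: Vec<u8>,
}

/// A type-agnostic operation, of the kind IPC or FFI code might need:
/// it never inspects the element type.
pub fn payload_size(data: &RawData) -> usize {
    data.bytes.len()
}

/// Hypothetical user-facing wrapper with a strongly-typed API.
pub struct Int32View {
    data: RawData,
}

impl Int32View {
    pub fn from_values(values: &[i32]) -> Self {
        let bytes = values.iter().flat_map(|v| v.to_le_bytes()).collect();
        Self { data: RawData { tag: TypeTag::Int32, len: values.len(), bytes } }
    }

    /// Validate the untyped data once, at the boundary.
    pub fn try_new(data: RawData) -> Result<Self, String> {
        if data.tag != TypeTag::Int32 {
            return Err(format!("expected Int32, got {:?}", data.tag));
        }
        if data.bytes.len() != data.len * 4 {
            return Err("buffer length does not match element count".to_string());
        }
        Ok(Self { data })
    }

    /// Typed access; no tag check is needed after construction.
    pub fn value(&self, i: usize) -> i32 {
        assert!(i < self.data.len, "index out of bounds");
        let b = &self.data.bytes[i * 4..i * 4 + 4];
        i32::from_le_bytes([b[0], b[1], b[2], b[3]])
    }

    /// Hand the untyped representation back to generic code.
    pub fn into_data(self) -> RawData {
        self.data
    }
}
```

User code would stay on `Int32View`, while something like an IPC writer would take `RawData` and never need to be generic over the element type.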
   
   > It seems like that ArrayData::data_type is redundant because ArrayData::layout can also tell you the type of the array?
   
   You still need the DataType to roundtrip the actual type, e.g. int32 vs uint32, the Field for nested types, etc...
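
As a small sketch of why the layout alone is not enough: the mapping from data type to layout is many-to-one, so it cannot be inverted. The `DataType` and `Layout` enums below are cut-down, hypothetical versions for illustration (the `Layout` variants in particular are assumptions, not the proposal's actual definition).

```rust
/// Cut-down, hypothetical data type enum.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DataType {
    Int32,
    UInt32,
    Boolean,
}

/// Hypothetical layout enum: describes only how the bytes are arranged.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Layout {
    Primitive { byte_width: usize },
    Bitmap,
}

/// Many-to-one: distinct data types can share one layout.
pub fn layout_of(dt: DataType) -> Layout {
    match dt {
        DataType::Int32 | DataType::UInt32 => Layout::Primitive { byte_width: 4 },
        DataType::Boolean => Layout::Bitmap,
    }
}

/// The same four bytes mean different things depending on the data type.
pub fn render(dt: DataType, bytes: [u8; 4]) -> String {
    match dt {
        DataType::Int32 => i32::from_le_bytes(bytes).to_string(),
        DataType::UInt32 => u32::from_le_bytes(bytes).to_string(),
        DataType::Boolean => (bytes[0] & 1 == 1).to_string(),
    }
}
```

Here `layout_of(DataType::Int32)` and `layout_of(DataType::UInt32)` are equal, yet the bytes `[0xFF; 4]` render as `-1` for one and `4294967295` for the other, so the data type has to be carried alongside the layout.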

