Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/28 04:55:19 UTC

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #9211: ARROW-11239: [Rust] Fixed equality with offsets and nulls

jorgecarleitao edited a comment on pull request #9211:
URL: https://github.com/apache/arrow/pull/9211#issuecomment-768798901


   Thanks @nevi-me. IMO the idea is good, but I think that, in Rust's terminology, that implementation would be unsound.
   
   `Buffer::offset` is measured in bytes, but `ArrayData::offset` is measured in slots. So, slicing a buffer by a slot offset will lead to an unaligned buffer. E.g. a buffer representing N `u32` has 4 bytes per slot, and doing `ArrayData::slice(1, 1)` would leave the buffer containing `3` bytes.
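
A minimal sketch of the mismatch, using plain std types. `slice_bytes`, `slot_to_byte_offset` and `to_bytes` are hypothetical helpers written for illustration and are not part of the arrow crate; the only assumption taken from above is that the buffer is byte-addressed while the array offset counts slots.

```rust
// Simplified stand-ins for illustration; these are not the arrow crate's types.

/// Slice a byte buffer by a *byte* offset, as a byte-addressed buffer would.
fn slice_bytes(buffer: &[u8], byte_offset: usize) -> &[u8] {
    &buffer[byte_offset..]
}

/// Convert a slot offset into a byte offset for fixed-width values.
fn slot_to_byte_offset(slot_offset: usize, slot_width: usize) -> usize {
    slot_offset * slot_width
}

/// Pack `u32` values into little-endian bytes (4 bytes per slot).
fn to_bytes(values: &[u32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(values.len() * 4);
    for v in values {
        bytes.extend_from_slice(&v.to_le_bytes());
    }
    bytes
}
```

With a single `u32` in the buffer, `slice_bytes(&bytes, 1)` leaves 3 bytes, which is no longer a whole number of slots, whereas `slice_bytes(&bytes, slot_to_byte_offset(1, 4))` lands on a slot boundary.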
   
   In the particular case of `ListArray`, I think that we should only offset the `ArrayData` and not the child array: the offset buffer (`ArrayData::buffers[0]`) will have all the information we need to extract the correct items from the child array. Of course we must use it to access the items, but IMO we already do that in `ListArray::value_offset`. We may not be doing that in the equality, though.
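
A rough sketch of that idea, assuming a simplified list of `i32` items without nulls. `ListSketch` and `list_equal` are hypothetical and do not mirror the real `ListArray` layout or API; the point is only that the slice offset is applied to the offsets buffer while the child stays unsliced.

```rust
/// Hypothetical, simplified list array of `i32` items (no null handling).
/// Not the arrow crate's `ListArray`.
struct ListSketch {
    offset: usize,     // in slots, applied to the offsets buffer only
    len: usize,        // number of lists visible through this slice
    offsets: Vec<i32>, // start of each list in `child`, plus one final end
    child: Vec<i32>,   // child values, never sliced
}

impl ListSketch {
    fn new(offsets: Vec<i32>, child: Vec<i32>) -> Self {
        let len = offsets.len() - 1;
        ListSketch { offset: 0, len, offsets, child }
    }

    /// Slicing only moves the parent's offset; the child is left untouched.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        ListSketch {
            offset: self.offset + offset,
            len,
            offsets: self.offsets.clone(),
            child: self.child.clone(),
        }
    }

    /// Read list `i` by going through the offsets buffer at `offset + i`.
    fn value(&self, i: usize) -> &[i32] {
        assert!(i < self.len);
        let start = self.offsets[self.offset + i] as usize;
        let end = self.offsets[self.offset + i + 1] as usize;
        &self.child[start..end]
    }
}

/// Equality that goes through the accessor, so the offset is honoured.
fn list_equal(a: &ListSketch, b: &ListSketch) -> bool {
    a.len == b.len && (0..a.len).all(|i| a.value(i) == b.value(i))
}
```

Under these assumptions, a slice of `[[1, 2], [3], [4, 5, 6]]` starting at slot 1 compares equal to an unsliced `[[3], [4, 5, 6]]`, even though the two child buffers differ.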
   
   In the case of `StructArray`, we have two options: increase the offset of the child by an equal amount and only support `StructArray` with `ArrayData::offset = 0`, or increase `ArrayData::offset` and change the equality code to take that into account.
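
The two options can be sketched side by side, assuming a struct with a single `i32` child and no nulls. `StructSketch`, `ChildSketch` and both `slice_*` methods are hypothetical names for illustration, not the arrow crate's API.

```rust
/// Hypothetical child column with its own slot offset.
#[derive(Clone)]
struct ChildSketch {
    offset: usize,
    values: Vec<i32>,
}

/// Hypothetical, simplified struct array with one child (no null handling).
struct StructSketch {
    offset: usize,
    len: usize,
    child: ChildSketch,
}

impl StructSketch {
    /// Option 1: push the offset into the child; the parent's offset is unchanged.
    fn slice_child(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        let mut child = self.child.clone();
        child.offset += offset;
        StructSketch { offset: self.offset, len, child }
    }

    /// Option 2: move only the parent's offset; the child is unchanged.
    fn slice_parent(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        StructSketch { offset: self.offset + offset, len, child: self.child.clone() }
    }

    /// Accessor that accounts for both offsets.
    fn field(&self, i: usize) -> i32 {
        assert!(i < self.len);
        self.child.values[self.offset + self.child.offset + i]
    }
}

/// Equality through the accessor, so either slicing strategy compares correctly.
fn struct_equal(a: &StructSketch, b: &StructSketch) -> bool {
    a.len == b.len && (0..a.len).all(|i| a.field(i) == b.field(i))
}
```

In this sketch both strategies expose the same logical values; the difference is whether equality code may read the child directly (option 1, parent offset stays 0) or must add the parent's offset (option 2).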
   
   In general, the child's `ArrayData` is insufficient on its own, either because the parent has a non-`None` null buffer or because the parent has an `offset`. So, as far as I understand, we will always need to use the parent's accessors to interact with child objects.
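
As an illustration of why the child alone is not enough, here is a minimal sketch in which the parent carries both a validity mask and an offset. `ParentSketch` is hypothetical, and the validity is modelled as a `Vec<bool>` rather than a packed bitmap to keep the example short.

```rust
/// Hypothetical parent array wrapping one `i32` child.
/// Not the arrow crate's `StructArray` or `ArrayData`.
struct ParentSketch {
    offset: usize,               // parent offset, in slots
    len: usize,                  // number of visible slots
    validity: Option<Vec<bool>>, // `None` means every slot is valid
    child_values: Vec<i32>,      // child data, unaware of the parent's state
}

impl ParentSketch {
    /// Whether slot `i` of the parent is valid, honouring the parent's offset.
    fn is_valid(&self, i: usize) -> bool {
        match &self.validity {
            Some(bits) => bits[self.offset + i],
            None => true,
        }
    }

    /// Read the child through the parent: apply the offset and the null mask.
    fn child_value(&self, i: usize) -> Option<i32> {
        assert!(i < self.len);
        if self.is_valid(i) {
            Some(self.child_values[self.offset + i])
        } else {
            None
        }
    }
}
```

Reading `child_values[0]` directly would return the first stored value regardless of the parent, whereas `child_value(0)` skips the slots before `offset` and reports a null where the parent's mask says so.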

