Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/28 10:56:10 UTC

[GitHub] [arrow-rs] alamb opened a new issue, #1625: UnionArray::is_null incorrect

alamb opened a new issue, #1625:
URL: https://github.com/apache/arrow-rs/issues/1625

   **Describe the bug**
   
   `UnionArray::is_null` checks the union's own validity map rather than the nullness of the child slot that each element refers to.
   
   Here is the default impl for `is_null`
   https://github.com/apache/arrow-rs/blob/master/arrow/src/array/array.rs#L164-L166
   
   There is no override in the union's impl of `Array`: https://github.com/apache/arrow-rs/blob/master/arrow/src/array/array_union.rs#L305-L313
   
   See discussion from @tustvold and @viirya here:  https://github.com/apache/arrow-rs/pull/1589/files#r854817015
   
   **Expected behavior**
   `UnionArray::is_null` should check the slot (the child's data) for nullness, rather than the union's own validity map.
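
A minimal sketch of the expected behavior, assuming a sparse union in which every child has the same length as the union. The types `Child` and `SparseUnion` and the helper `example_union` are hypothetical and exist only for illustration; they are not the arrow-rs API.

```rust
// Toy model only: these types are hypothetical and are NOT the arrow-rs API.

/// A child array with an optional validity mask (`None` means "all valid").
struct Child {
    validity: Option<Vec<bool>>,
}

impl Child {
    /// A slot is null when a mask exists and its entry is `false`.
    fn is_null(&self, i: usize) -> bool {
        match &self.validity {
            Some(mask) => !mask[i],
            None => false,
        }
    }
}

/// A sparse union: `type_ids[i]` selects which child holds the value for slot `i`.
struct SparseUnion {
    type_ids: Vec<usize>,
    children: Vec<Child>,
}

impl SparseUnion {
    /// Nullness is answered by the selected child, not by the union itself.
    fn is_null(&self, i: usize) -> bool {
        let child = &self.children[self.type_ids[i]];
        child.is_null(i)
    }
}

/// Builds a three-slot union over two children (illustrative data).
fn example_union() -> SparseUnion {
    SparseUnion {
        type_ids: vec![0, 1, 0],
        children: vec![
            Child { validity: Some(vec![true, true, false]) },
            Child { validity: Some(vec![true, false, true]) },
        ],
    }
}
```

With this data, slot 0 is valid in the child it selects, while slots 1 and 2 select child entries whose mask is `false`, so `is_null` reports them as null even though the union carries no validity map of its own.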
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] viirya commented on issue #1625: UnionArray::is_null incorrect

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112820120

   When constructing `UnionArray` in C++, `null_count` has been set to `0` explicitly:
   
   `SparseUnionArray`:
   https://github.com/apache/arrow/blob/c70426f73326b3852d1bd7c31d98be4743f3fcba/cpp/src/arrow/array/array_nested.cc#L696
   
   `DenseUnionArray`:
   https://github.com/apache/arrow/blob/c70426f73326b3852d1bd7c31d98be4743f3fcba/cpp/src/arrow/array/array_nested.cc#L750
   
   According to the default `IsNull` implementation, when there is no validity bitmap the result is `data_->null_count == data_->length`, which is always `false` here because `null_count` is `0` (except when the length is `0`, but in that case `IsNull(i)` doesn't make sense anyway).
   
   So it seems that always returning `false` and documenting that behavior should be good enough?
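
To make the reasoning above concrete, here is a small sketch in Rust. It assumes the default check consults the validity bitmap when one is present and otherwise compares `null_count` with the length. `ArrayData`, `default_is_null`, and `union_like` are hypothetical names for illustration, not the Arrow C++ or arrow-rs API.

```rust
// Toy model of the default null check discussed above; hypothetical types,
// not the Arrow C++ or arrow-rs API.

struct ArrayData {
    /// Validity mask; `None` models an array without a validity bitmap.
    validity: Option<Vec<bool>>,
    null_count: usize,
    len: usize,
}

impl ArrayData {
    /// With a bitmap, consult entry `i`. Without one, the array is treated as
    /// all-null only if `null_count == len`, otherwise as all-valid.
    fn default_is_null(&self, i: usize) -> bool {
        match &self.validity {
            Some(mask) => !mask[i],
            None => self.null_count == self.len,
        }
    }
}

/// Models a union as described in the comment: no bitmap, `null_count` set to 0.
fn union_like(len: usize) -> ArrayData {
    ArrayData { validity: None, null_count: 0, len }
}
```

Under these assumptions, `union_like(4).default_is_null(i)` is `false` for every `i`, which matches the conclusion that the default check never reports a union slot as null when the length is nonzero.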
   
   




[GitHub] [arrow-rs] tustvold commented on issue #1625: UnionArray::is_null incorrect

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112079686

   I think we should check what other Arrow implementations do. An argument could be made that the UnionArray is a union of potentially nullable arrays, and is not itself nullable.




[GitHub] [arrow-rs] tustvold closed issue #1625: UnionArray::is_null incorrect

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #1625: UnionArray::is_null incorrect
URL: https://github.com/apache/arrow-rs/issues/1625




[GitHub] [arrow-rs] alamb commented on issue #1625: UnionArray::is_null incorrect

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112553811

   🤔 So we could perhaps go with the Java implementation and add some documentation? Or maybe no one cares 🤷 given the C++ implementation?




[GitHub] [arrow-rs] viirya commented on issue #1625: UnionArray::is_null incorrect

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112508202

   Java:
   https://github.com/apache/arrow/blob/b76caf4964dc96d20b72b21291052f459fa9c68a/java/vector/src/main/codegen/templates/UnionVector.java#L697-L704
   
   It mentions:
   
   > IMPORTANT: Union types always return non null as there is no validity buffer.
   > To check validity correctly you must check the underlying vector.
   
   C++:
   Default `IsNull`: https://github.com/apache/arrow/blob/c70426f73326b3852d1bd7c31d98be4743f3fcba/cpp/src/arrow/array/array_base.h#L57-L62
   
   I don't see an overriding implementation in UnionArray.
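
The contract quoted from the Java documentation could be sketched as follows. This is only an illustration in Rust with hypothetical types (`ChildVector`, `UnionVector`, `sample`); it is not a translation of the Java code and not the arrow-rs API.

```rust
// Hypothetical toy types illustrating the quoted contract; not a real Arrow API.

struct ChildVector {
    values: Vec<Option<i32>>,
}

struct UnionVector {
    type_ids: Vec<usize>,
    children: Vec<ChildVector>,
}

impl UnionVector {
    /// The union has no validity buffer, so it always reports non-null.
    fn is_null(&self, _i: usize) -> bool {
        false
    }

    /// Callers that need real validity look at the child selected for slot `i`.
    fn child_is_null(&self, i: usize) -> bool {
        self.children[self.type_ids[i]].values[i].is_none()
    }
}

/// Two slots: slot 0 selects a present value, slot 1 selects a missing one.
fn sample() -> UnionVector {
    UnionVector {
        type_ids: vec![0, 1],
        children: vec![
            ChildVector { values: vec![Some(7), None] },
            ChildVector { values: vec![None, None] },
        ],
    }
}
```

Here `is_null(1)` is `false` while `child_is_null(1)` is `true`, which is exactly the situation the documentation note warns callers about.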
   

