Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/28 10:56:10 UTC
[GitHub] [arrow-rs] alamb opened a new issue, #1625: UnionArray::is_null incorrect
alamb opened a new issue, #1625:
URL: https://github.com/apache/arrow-rs/issues/1625
**Describe the bug**
UnionArray::is_null checks the Union's validity map rather than its slot.
Here is the default impl for `is_null`:
https://github.com/apache/arrow-rs/blob/master/arrow/src/array/array.rs#L164-L166
There is no override in the union's impl of `Array`: https://github.com/apache/arrow-rs/blob/master/arrow/src/array/array_union.rs#L305-L313
See discussion from @tustvold and @viirya here: https://github.com/apache/arrow-rs/pull/1589/files#r854817015
**To Reproduce**
Steps to reproduce the behavior:
**Expected behavior**
`UnionArray::is_null` should check the selected slot (the child's data) for nullness, rather than the union's own validity map.
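To make the difference concrete, here is a minimal sketch using a toy sparse union. Every name in it (`ToySparseUnion`, `is_null_own`, `is_null_slot`, `sample_union`) is hypothetical and chosen for illustration; none of it is the arrow-rs API. It only contrasts a null check that consults the union's own validity data with one that follows the type id to the selected child slot.

```rust
/// Toy sparse union over two nullable children. All names here are
/// hypothetical and do not mirror the arrow-rs API.
struct ToySparseUnion {
    /// One type id per slot: 0 selects `ints`, 1 selects `floats`.
    type_ids: Vec<i8>,
    ints: Vec<Option<i32>>,
    floats: Vec<Option<f64>>,
}

impl ToySparseUnion {
    /// Null check against the union's own validity data. The toy union
    /// has none, so every slot is reported as non-null.
    fn is_null_own(&self, _i: usize) -> bool {
        false
    }

    /// Null check that follows the type id to the selected child slot.
    fn is_null_slot(&self, i: usize) -> bool {
        match self.type_ids[i] {
            0 => self.ints[i].is_none(),
            1 => self.floats[i].is_none(),
            other => panic!("unknown type id {}", other),
        }
    }
}

/// Three slots: an int, a float, and a null int.
fn sample_union() -> ToySparseUnion {
    ToySparseUnion {
        type_ids: vec![0, 1, 0],
        ints: vec![Some(1), None, None],
        floats: vec![None, Some(2.5), None],
    }
}
```

In this sketch the third slot selects `ints`, whose entry there is `None`, so `is_null_slot(2)` reports a null while `is_null_own(2)` does not.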
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] viirya commented on issue #1625: UnionArray::is_null incorrect
Posted by GitBox <gi...@apache.org>.
viirya commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112820120
When a `UnionArray` is constructed in C++, `null_count` is explicitly set to `0`:
`SparseUnionArray`:
https://github.com/apache/arrow/blob/c70426f73326b3852d1bd7c31d98be4743f3fcba/cpp/src/arrow/array/array_nested.cc#L696
`DenseUnionArray`:
https://github.com/apache/arrow/blob/c70426f73326b3852d1bd7c31d98be4743f3fcba/cpp/src/arrow/array/array_nested.cc#L750
With no validity bitmap, the default `IsNull` implementation falls back to comparing the null count with the length, and `data_->null_count == data_->length` is always `false` when `null_count` is `0` (except when the length is `0`, but in that case `IsNull(i)` doesn't make sense either).
So it seems that returning `false` and adding some documentation there should be good enough?
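A rough model of that fallback, written in Rust for illustration. The struct and function names (`ToyArrayData`, `default_is_null`, `toy_union_data`) are hypothetical, and the sketch ignores array offsets; it assumes only what the comment above describes: a bitmap lookup when a validity bitmap exists, otherwise a comparison of null count and length.

```rust
/// Toy stand-in for the fields a default null check reads.
/// Field and function names are hypothetical.
struct ToyArrayData {
    /// Optional validity bitmap, one bit per slot, least significant bit first.
    validity: Option<Vec<u8>>,
    null_count: usize,
    len: usize,
}

/// Default-style null check: use the bitmap if present, otherwise
/// treat the array as all null or all valid.
fn default_is_null(data: &ToyArrayData, i: usize) -> bool {
    match &data.validity {
        // With a bitmap, a cleared bit marks a null slot.
        Some(bits) => (bits[i / 8] >> (i % 8)) & 1 == 0,
        // Without a bitmap, every slot is null only if null_count == len.
        None => data.null_count == data.len,
    }
}

/// Assumption taken from the comment above: a union has no validity
/// bitmap and its null count is set to 0.
fn toy_union_data(len: usize) -> ToyArrayData {
    ToyArrayData { validity: None, null_count: 0, len }
}
```

Under these assumptions `default_is_null(&toy_union_data(3), i)` is `false` for every slot, and only the degenerate zero-length case makes the comparison `true`.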
[GitHub] [arrow-rs] tustvold commented on issue #1625: UnionArray::is_null incorrect
Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112079686
I think we should check what other Arrow implementations do. An argument could be made that the `UnionArray` is a union of potentially nullable arrays, and is not itself nullable.
[GitHub] [arrow-rs] tustvold closed issue #1625: UnionArray::is_null incorrect
Posted by GitBox <gi...@apache.org>.
tustvold closed issue #1625: UnionArray::is_null incorrect
URL: https://github.com/apache/arrow-rs/issues/1625
[GitHub] [arrow-rs] alamb commented on issue #1625: UnionArray::is_null incorrect
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112553811
🤔 So we could perhaps go with the Java implementation and add some documentation? Or maybe no one cares 🤷 given the C++ implementation?
[GitHub] [arrow-rs] viirya commented on issue #1625: UnionArray::is_null incorrect
Posted by GitBox <gi...@apache.org>.
viirya commented on issue #1625:
URL: https://github.com/apache/arrow-rs/issues/1625#issuecomment-1112508202
Java:
https://github.com/apache/arrow/blob/b76caf4964dc96d20b72b21291052f459fa9c68a/java/vector/src/main/codegen/templates/UnionVector.java#L697-L704
It mentions:
> IMPORTANT: Union types always return non null as there is no validity buffer.
> To check validity correctly you must check the underlying vector.
C++:
Default `IsNull`: https://github.com/apache/arrow/blob/c70426f73326b3852d1bd7c31d98be4743f3fcba/cpp/src/arrow/array/array_base.h#L57-L62
I don't see that `UnionArray` overrides this implementation.