Posted to jira@arrow.apache.org by "Steve M. Kim (Jira)" <ji...@apache.org> on 2021/01/09 03:37:00 UTC
[jira] [Commented] (ARROW-7342) [Java] offset buffer for vector of
variable-width type with zero value count is empty
[ https://issues.apache.org/jira/browse/ARROW-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261744#comment-17261744 ]
Steve M. Kim commented on ARROW-7342:
-------------------------------------
This incompatibility is a potential source of bugs when sharing buffers between Java and C++ via JNI, as proposed in:
* https://issues.apache.org/jira/browse/ARROW-6720
* https://issues.apache.org/jira/browse/ARROW-7272
* https://issues.apache.org/jira/browse/ARROW-7808
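
To illustrate the kind of bug I have in mind, here is a minimal sketch in plain Python (standard library only, no Arrow code). It assumes that a consumer trusts the offset buffer to hold value count + 1 little-endian int32 entries and reads offsets[value_count] to learn how many data bytes to expect. The helper names build_offsets and total_data_bytes are hypothetical and are not part of any Arrow API; I have not checked whether the C++ readers behave this way.

```python
import struct


def build_offsets(values):
    """Pack len(values) + 1 little-endian int32 offsets, starting at 0."""
    offsets = [0]
    for v in values:
        offsets.append(offsets[-1] + len(v))
    return struct.pack("<%di" % len(offsets), *offsets)


def total_data_bytes(offset_buffer, value_count):
    """Read offsets[value_count], as a consumer trusting the n + 1 layout might."""
    (end,) = struct.unpack_from("<i", offset_buffer, 4 * value_count)
    return end


populated = build_offsets([])  # 4 bytes, like the pyarrow output in the report
empty = b""                    # 0 bytes, like the Java behaviour in the report

print(populated.hex())
print(total_data_bytes(populated, 0))
try:
    total_data_bytes(empty, 0)
except struct.error as exc:
    # The empty buffer has no offsets[0] to read.
    print("empty offset buffer:", exc)
```

With the populated buffer the read succeeds and yields zero data bytes. With the empty buffer the same read fails, which in native code could instead become an out-of-bounds read.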
> [Java] offset buffer for vector of variable-width type with zero value count is empty
> -------------------------------------------------------------------------------------
>
> Key: ARROW-7342
> URL: https://issues.apache.org/jira/browse/ARROW-7342
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java
> Reporter: Steve M. Kim
> Priority: Major
>
> I am reporting what I think might be two related bugs in {{org.apache.arrow.vector.BaseVariableWidthVector}}
> # The offset buffer is initialized as empty. I expect it to have 4 bytes that represent the integer zero.
> # The {{getBufferSize}} method returns 0 when value count is zero, instead of 4.
> Compare to the pyarrow implementation, which I believe correctly populates the offset buffer:
> {code:java}
> >>> import pyarrow as pa
> >>> array = pa.array([], type=pa.binary())
> >>> array
> <pyarrow.lib.BinaryArray object at 0x7f4f68b858e8>
> []
> >>> print([b.hex().decode() for b in array.buffers()])
> ['', '00000000', '']
> {code}
>