Posted to jira@arrow.apache.org by "Steve M. Kim (Jira)" <ji...@apache.org> on 2021/01/09 03:37:00 UTC

[jira] [Commented] (ARROW-7342) [Java] offset buffer for vector of variable-width type with zero value count is empty

    [ https://issues.apache.org/jira/browse/ARROW-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261744#comment-17261744 ] 

Steve M. Kim commented on ARROW-7342:
-------------------------------------

This incompatibility is a potential source of bugs when sharing buffers between Java and C++ via JNI, as proposed in the following issues:

* https://issues.apache.org/jira/browse/ARROW-6720

* https://issues.apache.org/jira/browse/ARROW-7272

* https://issues.apache.org/jira/browse/ARROW-7808

> [Java] offset buffer for vector of variable-width type with zero value count is empty
> -------------------------------------------------------------------------------------
>
>                 Key: ARROW-7342
>                 URL: https://issues.apache.org/jira/browse/ARROW-7342
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>            Reporter: Steve M. Kim
>            Priority: Major
>
> I am reporting what I think might be two related bugs in {{org.apache.arrow.vector.BaseVariableWidthVector}}:
>  # The offset buffer is initialized as empty. I expect it to have 4 bytes that represent the integer zero.
>  # The {{getBufferSize}} method returns 0 when the value count is zero, instead of 4.
> Compare with the pyarrow implementation, which I believe correctly populates the offset buffer:
> {code:java}
> >>> import pyarrow as pa
> >>> array = pa.array([], type=pa.binary())
> >>> array
> <pyarrow.lib.BinaryArray object at 0x7f4f68b858e8>
> []
> >>> print([b.hex().decode() for b in array.buffers()])
> ['', '00000000', '']
> {code}
>  
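
For reference, the expected size follows from the Arrow columnar format, in which the offsets buffer of a variable-width array holds (value count + 1) 32-bit integers. The following is only a minimal sketch in plain Python that mimics that layout with the standard library; build_offsets and expected_offset_buffer_size are hypothetical helper names used for illustration and are not part of pyarrow or the Arrow Java API.

```python
import struct

OFFSET_WIDTH = 4  # bytes per int32 offset (assumes 32-bit offsets, as for binary/utf8)


def build_offsets(values):
    """Pack (len(values) + 1) little-endian int32 offsets for a list of bytes values."""
    offsets = [0]  # the first offset is always present, even with zero values
    for value in values:
        offsets.append(offsets[-1] + len(value))
    return struct.pack("<%di" % len(offsets), *offsets)


def expected_offset_buffer_size(value_count):
    """The offsets buffer holds one more entry than there are values."""
    return (value_count + 1) * OFFSET_WIDTH


empty = build_offsets([])
print(empty.hex())                     # 00000000
print(expected_offset_buffer_size(0))  # 4
```

Under these assumptions an empty array still carries a 4-byte offsets buffer, which matches the '00000000' entry in the pyarrow output quoted above.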


