Posted to jira@arrow.apache.org by "Todd Farmer (Jira)" <ji...@apache.org> on 2022/09/05 16:51:00 UTC

[jira] [Commented] (ARROW-15382) SplitAndTransfer throws for (0,0) if vector empty

    [ https://issues.apache.org/jira/browse/ARROW-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17600491#comment-17600491 ] 

Todd Farmer commented on ARROW-15382:
-------------------------------------

This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per [project policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment]. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

> SplitAndTransfer throws for (0,0) if vector empty
> -------------------------------------------------
>
>                 Key: ARROW-15382
>                 URL: https://issues.apache.org/jira/browse/ARROW-15382
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>            Reporter: David Vogelbacher
>            Assignee: Frank Wong
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> I've hit a bug where `splitAndTransfer` on vectors throws if the vector is completely empty and the offset buffer is empty.
> An easy repro is:
> {noformat}
>         BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
>         ListVector listVector = ListVector.empty("listVector", allocator);
>         listVector.getTransferPair(listVector.getAllocator()).splitAndTransfer(0, 0);
> {noformat}
> This results in the following stacktrace:
> {noformat}
> java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> 	at io.netty.buffer.ArrowBuf.checkIndexD(ArrowBuf.java:335)
> 	at io.netty.buffer.ArrowBuf.chk(ArrowBuf.java:322)
> 	at io.netty.buffer.ArrowBuf.getInt(ArrowBuf.java:441)
> 	at org.apache.arrow.vector.complex.ListVector$TransferImpl.splitAndTransfer(ListVector.java:484)
> {noformat}
> In production we hit this when calling {{VectorSchemaRoot.slice}}. The schema root contains a {{ListVector}} with a {{VarCharVector}} value vector. The list vector isn't empty, but all the strings in the var char vector are. {{splitAndTransfer}} on the list vector works, but when the underlying var char vector is then split, we get the same exception:
> {noformat}
> java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> 	at io.netty.buffer.ArrowBuf.checkIndexD(ArrowBuf.java:335)
> 	at io.netty.buffer.ArrowBuf.chk(ArrowBuf.java:322)
> 	at io.netty.buffer.ArrowBuf.getInt(ArrowBuf.java:441)
> 	at org.apache.arrow.vector.BaseVariableWidthVector.splitAndTransferOffsetBuffer(BaseVariableWidthVector.java:728)
> 	at org.apache.arrow.vector.BaseVariableWidthVector.splitAndTransferTo(BaseVariableWidthVector.java:712)
> 	at org.apache.arrow.vector.VarCharVector$TransferImpl.splitAndTransfer(VarCharVector.java:321)
> 	at org.apache.arrow.vector.complex.ListVector$TransferImpl.splitAndTransfer(ListVector.java:496)
> 	at org.apache.arrow.vector.VectorSchemaRoot.lambda$slice$1(VectorSchemaRoot.java:308)
> 	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> 	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
> 	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> 	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> 	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
> 	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> 	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
> 	at org.apache.arrow.vector.VectorSchemaRoot.slice(VectorSchemaRoot.java:310)
> {noformat} 
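
Both stack traces above end in a 4-byte read at index 0 of a zero-length offset buffer. The following is a minimal sketch of that failure mode and of one possible guard. It uses only java.nio.ByteBuffer as a stand-in for the offset buffer; the class and method names (OffsetSliceSketch, sliceUnguarded, sliceGuarded) are hypothetical and this is not Arrow's implementation or its actual fix.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical model of slicing an offset buffer; not Arrow code.
class OffsetSliceSketch {
    // Each offset is a 4-byte int, matching "length: 4" in the exception message.
    static final int OFFSET_WIDTH = 4;

    // Always reads the offset at startIndex first, so it throws
    // IndexOutOfBoundsException when the buffer holds no offsets at all.
    static int[] sliceUnguarded(ByteBuffer offsets, int startIndex, int length) {
        int first = offsets.getInt(startIndex * OFFSET_WIDTH);
        int[] out = new int[length + 1];
        for (int i = 0; i <= length; i++) {
            // Rebase each offset so the slice starts at zero.
            out[i] = offsets.getInt((startIndex + i) * OFFSET_WIDTH) - first;
        }
        return out;
    }

    // Assumed guard: a zero-length slice never touches the buffer.
    static int[] sliceGuarded(ByteBuffer offsets, int startIndex, int length) {
        if (length == 0) {
            return new int[0];
        }
        return sliceUnguarded(offsets, startIndex, length);
    }

    public static void main(String[] args) {
        ByteBuffer empty = ByteBuffer.allocate(0);
        System.out.println(Arrays.toString(sliceGuarded(empty, 0, 0)));
        try {
            sliceUnguarded(empty, 0, 0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded read failed: " + e);
        }
    }
}
```

Whether a zero-length slice should yield no offsets at all or a single zero offset is a design choice that this sketch leaves open.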


