Posted to issues@drill.apache.org by "Venki Korukanti (JIRA)" <ji...@apache.org> on 2015/04/21 08:33:00 UTC

[jira] [Comment Edited] (DRILL-2835) IndexOutOfBoundsException in partition sender when doing streaming aggregate with LIMIT

    [ https://issues.apache.org/jira/browse/DRILL-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504416#comment-14504416 ] 

Venki Korukanti edited comment on DRILL-2835 at 4/21/15 6:32 AM:
-----------------------------------------------------------------

Issue here is:
One of the outgoing batches in PartitionSender receives a "terminate" message from its receiver, which causes the OutgoingBatch to go into "dropAll" mode (meaning it stops sending data to the receiver once a batch is filled). The problem is that in "dropAll" mode, as part of flush, we clear the vector buffers but don't allocate any new buffers, causing the next OutgoingBatch copy to fail.

This issue has been there for some time. It only started showing up because of recent fixes in fragment state updates.

There are two ways to resolve this issue:
1. Before copying a record into each OutgoingBatch, check whether the batch is in "dropAll" mode.
2. Reuse the same buffers without releasing them, by resetting the recordCount. This has the overhead of still copying records while the OutgoingBatch is in "dropAll" mode, which happens when a LIMIT clause is used.

The provided patch uses approach 2.
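
To make the failure and the two approaches concrete, here is a minimal Java sketch. It is a toy model only, not the code from the patch and not Drill's actual PartitionerTemplate: ToyOutgoingBatch, Mode, CAPACITY and the long[] array standing in for a value vector's buffer are all hypothetical names made up for illustration.

```java
final class DropAllSketch {
    /** Hypothetical modes: the original behavior and the two approaches described above. */
    enum Mode { BUGGY, CHECK_BEFORE_COPY, REUSE_BUFFERS }

    /** Toy stand-in for an outgoing batch; an assumption, not Drill's real class. */
    static final class ToyOutgoingBatch {
        static final int CAPACITY = 4;
        final Mode mode;
        long[] values = new long[CAPACITY]; // stands in for a vector buffer
        int recordCount = 0;
        int sent = 0;
        boolean dropAll = false;            // set when the receiver sends "terminate"

        ToyOutgoingBatch(Mode mode) { this.mode = mode; }

        void copy(long value) {
            // Approach 1: skip the copy entirely once the receiver has terminated.
            if (mode == Mode.CHECK_BEFORE_COPY && dropAll) {
                return;
            }
            values[recordCount] = value;    // fails if the buffer was cleared
            recordCount++;
            if (recordCount == CAPACITY) {
                flush();
            }
        }

        void flush() {
            if (dropAll) {
                if (mode != Mode.REUSE_BUFFERS) {
                    // Buffers are cleared but nothing is reallocated.
                    values = new long[0];
                }
                // Approach 2: keep the buffers and only reset the count.
                recordCount = 0;
                return;
            }
            sent += recordCount;            // normal path: "send" the batch
            values = new long[CAPACITY];    // then allocate fresh buffers
            recordCount = 0;
        }
    }

    /** Copies n records into a terminated batch; returns true if no exception is thrown. */
    static boolean survives(Mode mode, int n) {
        ToyOutgoingBatch batch = new ToyOutgoingBatch(mode);
        batch.dropAll = true;
        try {
            for (int i = 0; i < n; i++) {
                batch.copy(i);
            }
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " survives: " + survives(mode, 10));
        }
    }
}
```

In this toy model, approach 1 costs one extra branch per copied record, while approach 2 keeps the copy path unchanged and pays for copies whose results are then discarded.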


was (Author: vkorukanti):
Issue here is:
One of the outgoing batches in PartitionSender receives a "terminate" message from its receiver, which causes the OutgoingBatch to go into "dropAll" mode (meaning it stops sending data to the receiver once a batch is filled). The problem is that in "dropAll" mode, as part of flush, we clear the vector buffers but don't allocate any new buffers, causing the next OutgoingBatch copy to fail.

There are two ways to resolve this issue:
1. Before copying a record into each OutgoingBatch, check whether the batch is in "dropAll" mode.
2. Reuse the same buffers without releasing them, by resetting the recordCount. This has the overhead of still copying records while the OutgoingBatch is in "dropAll" mode, which happens when a LIMIT clause is used.

> IndexOutOfBoundsException in partition sender when doing streaming aggregate with LIMIT 
> ----------------------------------------------------------------------------------------
>
>                 Key: DRILL-2835
>                 URL: https://issues.apache.org/jira/browse/DRILL-2835
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC
>    Affects Versions: 0.8.0
>            Reporter: Aman Sinha
>            Assignee: Venki Korukanti
>         Attachments: DRILL-2835-1.patch
>
>
> The following CTAS, run against TPC-DS at the 100GB scale factor on a 10-node cluster, fails with the exception shown after the query:
> {code}
> alter session set `planner.enable_hashagg` = false;
> alter session set `planner.enable_multiphase_agg` = true;
> create table dfs.tmp.stream9 as 
> select cr_call_center_sk , cr_catalog_page_sk ,  cr_item_sk , cr_reason_sk , cr_refunded_addr_sk , count(*) from catalog_returns_dri100 
>  group by cr_call_center_sk , cr_catalog_page_sk ,  cr_item_sk , cr_reason_sk , cr_refunded_addr_sk
>  limit 100
> ;
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: index: 1023, length: 1 (expected: range(0, 0))
>         at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:200) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.chk(DrillBuf.java:222) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.setByte(DrillBuf.java:621) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at org.apache.drill.exec.vector.UInt1Vector$Mutator.set(UInt1Vector.java:342) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.NullableBigIntVector$Mutator.set(NullableBigIntVector.java:372) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.NullableBigIntVector.copyFrom(NullableBigIntVector.java:284) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.doEval(PartitionerTemplate.java:370) ~[na:na]
>         at org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.copy(PartitionerTemplate.java:249) ~[na:na]
>         at org.apache.drill.exec.test.generated.PartitionerGen4.doCopy(PartitionerTemplate.java:208) ~[na:na]
>         at org.apache.drill.exec.test.generated.PartitionerGen4.partitionBatch(PartitionerTemplate.java:176) ~[na:na]
> {code}


