Posted to issues@drill.apache.org by "Adam Gilmore (JIRA)" <ji...@apache.org> on 2015/01/08 07:14:34 UTC

[jira] [Commented] (DRILL-1948) Reading large parquet files via HDFS fails

    [ https://issues.apache.org/jira/browse/DRILL-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268874#comment-14268874 ] 

Adam Gilmore commented on DRILL-1948:
-------------------------------------

I seem to have worked out the cause.  This line is the ultimate culprit:

CompatibilityUtil.getBuf(input, directBuffer, pageLength);

which ends up calling input.read(directBuffer) (I couldn't work out where the source for CompatibilityUtil lives).

The fatal mistake CompatibilityUtil makes is assuming that input.read(ByteBuffer) will always read the remaining bytes in the buffer.  For HDFS, this is not always the case.  In my instance, it reads chunks of only 64 KB (65,535 bytes) at a time, so for large Parquet files the reader requests pages of 128 KB or so and gets back only the first 64 KB of them.
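
In other words, I'm assuming CompatibilityUtil does something of this shape (an illustrative sketch only, since I couldn't find the source; 'input', 'directBuffer' and 'pageLength' as in the call above):

// Single-shot read: fine for local streams, wrong for HDFS.
int read = input.read(directBuffer);   // on HDFS this can return after ~64 KB
// Nothing checks read < pageLength, so the page comes back truncated.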

The problem compounds: the first page read advances the stream position only to 65,535, so the next read lands in the middle of a page and tries to parse page data as a page header, hence the error.

There is probably a remedy to force HDFS to return larger chunks, but I'm not quite sure which setting would do that.  The real fix is to loop input.read() until the buffer is full.
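
Something along these lines should do it (a minimal, untested sketch against the Hadoop FSDataInputStream API, not the actual CompatibilityUtil code; error handling elided):

import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.fs.FSDataInputStream;

// Keep reading until the buffer is full instead of assuming a single
// read() call fills it; read(ByteBuffer) may return fewer bytes than requested.
static void readFully(FSDataInputStream input, ByteBuffer buf) throws IOException {
  while (buf.hasRemaining()) {
    int n = input.read(buf);
    if (n < 0) {
      throw new EOFException(buf.remaining() + " bytes still expected at end of stream");
    }
  }
}

CompatibilityUtil.getBuf() would then call readFully(input, directBuffer) in place of the bare input.read(directBuffer).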

> Reading large parquet files via HDFS fails
> ------------------------------------------
>
>                 Key: DRILL-1948
>                 URL: https://issues.apache.org/jira/browse/DRILL-1948
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 0.7.0
>         Environment: Hadoop 2.4.0 on Amazon EMR
>            Reporter: Adam Gilmore
>            Assignee: Parth Chandra
>            Priority: Critical
>
> There appears to be an issue with reading medium to large Parquet files via HDFS.  We have created a basic Parquet file with a schema like so:
> sellprice DOUBLE
> When filled with 10,000 double values, the following query in Drill works fine:
> select sum(sellprice) from hdfs.`/saleparquet`;
> When filled with 50,000 double values, the following error occurs:
> Query failed: Query stopped.[ 9aece851-48bc-4664-831e-d35bbfbcd1d5 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
> The full stack trace is:
> 2015-01-07 05:48:57,809 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
> java.lang.ArrayIndexOutOfBoundsException: null
> 2015-01-07 05:48:57,809 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.d.e.p.i.ScreenCreator$ScreenRoot - Error 88fe95c3-b088-4674-8b65-967a7f4c3cdf: Query stopped.
> java.lang.ArrayIndexOutOfBoundsException: null
> 2015-01-07 05:48:57,809 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error cd4123e4-7b9d-451d-90f0-3cc1ecf461e4: Failure while running fragment.
> java.lang.ArrayIndexOutOfBoundsException: null
> 2015-01-07 05:48:57,813 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.drill.exec.work.foreman.Foreman - Error 5db2c65b-cd10-4970-ba2b-f29b51fda923: Query failed: Failure while running fragment.[ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> [ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment.[ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> [ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:93) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:151) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:113) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:109) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:166) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> 2015-01-07 05:48:57,814 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] WARN  o.a.d.e.p.impl.SendingAccountor - Failure while waiting for send complete.
> java.lang.InterruptedException: null
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301) ~[na:1.7.0_71]
>         at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) ~[na:1.7.0_71]
>         at org.apache.drill.exec.physical.impl.SendingAccountor.waitForSendComplete(SendingAccountor.java:44) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.stop(ScreenCreator.java:186) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:144) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:117) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> If I fill with even more values (e.g. 100,000 or 1,000,000), I get a variety of other errors, such as:
> "Query failed: Query stopped., don't know what type: 14"
> coming from the Parquet engine.
> I am able to consistently replicate this in my environment with a basic Parquet file.  I can attach that file if necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)