Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/07/13 01:34:00 UTC

[jira] [Comment Edited] (DRILL-5669) Multiple TPCH queries failed due to OOM

    [ https://issues.apache.org/jira/browse/DRILL-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084971#comment-16084971 ] 

Paul Rogers edited comment on DRILL-5669 at 7/13/17 1:33 AM:
-------------------------------------------------------------

The ultimate cause of DC’s issue is how we allocate memory:

2 GB / max_width_per_node / (sum(sorts) + sum(hashAggs))

In the query that failed there are 2 sorts and 4 hash aggs. The max width per node is 32. Before Boaz’s change, we did:

2 GB / 32 / 2 = 32 MB per operator

After the change:

2 GB / 32 / 6 = 10 MB per operator

But we actually clamp the memory at the initial operator allocation, which is 20 MB. Hence, the sort got 20 MB.

But the incoming batch is ~27 MB, so even a single batch exceeds the sort's limit.

We had DC change the max memory per query to 6 GB. Now the query ran, because the sort again got 32 MB (6 GB / 32 / 6).
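
To make the arithmetic concrete, here is a minimal sketch of the division and the clamp at the initial operator allocation. The class, method, and constant names are hypothetical illustrations, not Drill's actual allocation code:

{code}
// Minimal sketch of the per-operator memory math described above.
// Names are hypothetical, not Drill's actual code.
public class OperatorMemoryMath {

  static final long MB = 1024 * 1024;
  static final long INITIAL_ALLOCATION = 20 * MB; // the clamp mentioned above

  // memoryPerQuery / maxWidthPerNode / (sorts + hashAggs),
  // never going below the given floor.
  static long perOperatorLimit(long memoryPerQuery, int maxWidthPerNode,
                               int sorts, int hashAggs, long floor) {
    long share = memoryPerQuery / maxWidthPerNode / (sorts + hashAggs);
    return Math.max(share, floor);
  }

  public static void main(String[] args) {
    long twoGB = 2048 * MB;
    long sixGB = 6144 * MB;

    // Before the change only the 2 sorts divide the budget: 32 MB each.
    System.out.println(perOperatorLimit(twoGB, 32, 2, 0, INITIAL_ALLOCATION) / MB); // 32

    // After the change, 2 sorts + 4 hashAggs divide it: ~10 MB, clamped up
    // to 20 MB, which is still below the ~27 MB incoming batch -> OOM.
    System.out.println(perOperatorLimit(twoGB, 32, 2, 4, INITIAL_ALLOCATION) / MB); // 20

    // With 6 GB per query the sort again gets 32 MB and the query runs.
    System.out.println(perOperatorLimit(sixGB, 32, 2, 4, INITIAL_ALLOCATION) / MB); // 32
  }
}
{code}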

Now the workaround.

Maybe we can just change the initial allocation from 20 MB to, say, 50 MB. Now, regardless of our crazy math, the sort and hashAgg will never get less than 50 MB.
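
For illustration, plugging a 50 MB floor into the hypothetical perOperatorLimit sketch above (again, not Drill's actual code):

{code}
// Same hypothetical math as in the sketch above, with the floor raised
// from 20 MB to 50 MB:
long limit = perOperatorLimit(2048L * MB, 32, 2, 4, 50 * MB);
// -> 50 MB: even with the post-1.11 divisor of 6, the sort and hashAgg stay
//    above the ~27 MB incoming batch, without raising query memory to 6 GB.
{code}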

Consequences: prior to 1.11, the hash join and hash agg made unlimited use of memory. (Sort was limited.) In 1.11, hash join is still unlimited. Sort and hash agg are limited (but, under the workaround, with a higher minimum). This will work if the benefit from limiting hash agg memory more than makes up for the cost of the higher minimum allocation.

If we agree, this is a two-character fix.



> Multiple TPCH queries failed due to OOM
> ---------------------------------------
>
>                 Key: DRILL-5669
>                 URL: https://issues.apache.org/jira/browse/DRILL-5669
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>         Environment: RHEL 6.4 2.6.32-358.el6.x86_64, 10+1 nodes cluster
>            Reporter: Dechang Gu
>            Assignee: Boaz Ben-Zvi
>             Fix For: 1.11.0
>
>         Attachments: 26999476-174e-98fd-e21e-fd53f79284c7.sys.drill
>
>
> Running TPCH SF100 Parquet (and CSV) tests, multiple queries failed due to OOM. For example, Q16 hit the following error:
> {code}
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> Fragment 1:11
> [Error Id: e58161a6-2383-48b1-a350-50db1b5408c6 on ucs-node10.perf.lab:31010]
>         at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
>         at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:593)
>         at org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:215)
>         at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:140)
>         at PipSQueak.fetchRows(PipSQueak.java:420)
>         at PipSQueak.runTest(PipSQueak.java:116)
>         at PipSQueak.main(PipSQueak.java:556)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> Fragment 1:11
> {code}
> And in drillbit.log:
> {code}
> 2017-07-12 11:34:11,670 ucs-node10.perf.lab [26999476-174e-98fd-e21e-fd53f79284c7:frag:1:11] INFO  o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: One or more nodes ran out of memory while executing the query.
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> [Error Id: e58161a6-2383-48b1-a350-50db1b5408c6 ]
>         at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:639) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:381) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:140) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:144) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at java.security.AccessController.doPrivileged(Native Method) [na:1.7.0_65]
>         at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_65]
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) [hadoop-common-2.7.0-mapr-1607.jar:na]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)