Posted to issues@drill.apache.org by "Idan Sheinberg (Jira)" <ji...@apache.org> on 2020/03/29 00:53:00 UTC
[jira] [Updated] (DRILL-7675) Very slow performance and Memory
exhaustion while querying on very small dataset of parquet files
[ https://issues.apache.org/jira/browse/DRILL-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Idan Sheinberg updated DRILL-7675:
----------------------------------
Description:
Per our discussion on Slack and the dev list, here are all the details and a sample dataset to reproduce the problematic query behavior:
* We are using Drill 1.18.0-SNAPSHOT built on March 6
* We are joining on two small Parquet datasets residing on S3 using the following query:
{code:java}
SELECT
CASE
WHEN tbl1.`timestamp` IS NULL THEN tbl2.`timestamp`
ELSE tbl1.`timestamp`
END AS ts, *
FROM `s3-store.state`.`/164` AS tbl1
FULL OUTER JOIN `s3-store.result`.`/164` AS tbl2
ON tbl1.`timestamp`*10 = tbl2.`timestamp`
ORDER BY ts ASC
LIMIT 500 OFFSET 0 ROWS
{code}
* We are running Drill in a single-node setup on a 16-core machine with 64 GB of RAM. Drill heap size is set to 16 GB, and max direct memory is set to 32 GB.
* As the dataset consists of very small files, Drill has been tuned to parallelize on small row counts by setting the following options:
{code:java}
planner.slice_target = 25
planner.width.max_per_node = 16 (to match the core count){code}
* Without the above parallelization, queries on the Parquet files are very slow (tens of seconds).
* While queries do work, we are seeing disproportionate direct memory and heap utilization (up to 20 GB of direct memory used, and a minimum of 12 GB of heap required).
* We're still encountering occasional out-of-memory errors (we're also seeing heap exhaustion, but I guess that's another indication of the same problem). Reducing the node parallelization width to, say, 8 reduces memory contention, though usage still reaches 8 GB of direct memory:
{code:java}
User Error Occurred: One or more nodes ran out of memory while executing the query. (null)
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.null[Error Id: 67b61fc9-320f-47a1-8718-813843a10ecc ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:338)
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: null
at org.apache.drill.exec.vector.complex.AbstractContainerVector.allocateNew(AbstractContainerVector.java:59)
at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.allocateOutgoingRecordBatch(PartitionerTemplate.java:380)
at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:400)
at org.apache.drill.exec.test.generated.PartitionerGen5.setup(PartitionerTemplate.java:126)
at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createClassInstances(PartitionSenderRootExec.java:263)
at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createPartitioner(PartitionSenderRootExec.java:218)
at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:188)
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93)
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:323)
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:310)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:310)
... 4 common frames omitted{code}
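For anyone reproducing this, a minimal sketch of how the two planner options listed above could be applied and how memory could be observed while the query runs. It assumes the options are set per session (the description does not say whether ALTER SESSION or ALTER SYSTEM was used) and uses Drill's sys.memory system table for the heap and direct memory figures:
{code:sql}
-- Assumption: options applied per session; the report does not state how they were set.
ALTER SESSION SET `planner.slice_target` = 25;
ALTER SESSION SET `planner.width.max_per_node` = 16;

-- Reduced width mentioned in the last bullet above, for comparison:
-- ALTER SESSION SET `planner.width.max_per_node` = 8;

-- Check heap and direct memory per Drillbit before and after running the query.
SELECT * FROM sys.memory;
{code}
Comparing the sys.memory output at widths 16 and 8 should show whether direct memory use follows the parallelization width, as the figures above suggest.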
I've attached a (real!) sample data-set to match the query above.
Help, please.
Idan
was:
Per our discussion on Slack and the dev list, here are all the details and a sample dataset to reproduce the problematic query behavior:
# We are using Drill 1.18.0-SNAPSHOT built on March 6
# We are joining on two small Parquet datasets residing on S3 using the following query:
SELECT
CASE
WHEN tbl1.`timestamp` IS NULL THEN tbl2.`timestamp`
ELSE tbl1.`timestamp`
END AS ts, *
FROM `s3-store.state`.`/164` AS tbl1
FULL OUTER JOIN `s3-store.result`.`/164` AS tbl2
ON tbl1.`timestamp`*10 = tbl2.`timestamp`
ORDER BY ts ASC
LIMIT 500 OFFSET 0 ROWS
# We are running Drill in a single-node setup on a 16-core machine with 64 GB of RAM. Drill heap size is set to 16 GB, and max direct memory is set to 32 GB.
# As the dataset consists of very small files, Drill has been tuned to parallelize on small row counts by setting the following options:
planner.slice_target = 25
planner.width.max_per_node = 16 (to match the core count)
# Without the above parallelization, queries on the Parquet files are very slow (tens of seconds).
# While queries do work, we are seeing disproportionate direct memory and heap utilization (up to 20 GB of direct memory used, and a minimum of 12 GB of heap required).
# We're still encountering occasional out-of-memory errors (we're also seeing heap exhaustion, but I guess that's another indication of the same problem). Reducing the node parallelization width to, say, 8 reduces memory contention, though usage still reaches 8 GB of direct memory:
```
User Error Occurred: One or more nodes ran out of memory while executing the query. (null)
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.null[Error Id: 67b61fc9-320f-47a1-8718-813843a10ecc ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:338)
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: null
at org.apache.drill.exec.vector.complex.AbstractContainerVector.allocateNew(AbstractContainerVector.java:59)
at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.allocateOutgoingRecordBatch(PartitionerTemplate.java:380)
at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:400)
at org.apache.drill.exec.test.generated.PartitionerGen5.setup(PartitionerTemplate.java:126)
at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createClassInstances(PartitionSenderRootExec.java:263)
at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createPartitioner(PartitionSenderRootExec.java:218)
at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:188)
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93)
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:323)
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:310)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:310)
... 4 common frames omitted
```
I've attached a (real!) sample data-set to match the query above.
Help, please.
Idan
> Very slow performance and Memory exhaustion while querying on very small dataset of parquet files
> -------------------------------------------------------------------------------------------------
>
> Key: DRILL-7675
> URL: https://issues.apache.org/jira/browse/DRILL-7675
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization, Storage - Parquet
> Affects Versions: 1.18.0
> Environment: [^sample-dataset.zip]
> Reporter: Idan Sheinberg
> Priority: Critical
> Attachments: sample-dataset.zip
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)