Posted to dev@drill.apache.org by "Chris Westin (JIRA)" <ji...@apache.org> on 2015/05/13 01:39:01 UTC
[jira] [Created] (DRILL-3044) Very deep record batch fetching stack
for single table query (TestTpchLimit0.tpch01)
Chris Westin created DRILL-3044:
-----------------------------------
Summary: Very deep record batch fetching stack for single table query (TestTpchLimit0.tpch01)
Key: DRILL-3044
URL: https://issues.apache.org/jira/browse/DRILL-3044
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 0.9.0
Reporter: Chris Westin
Assignee: Jinfeng Ni
I ran TestTpchLimit0 in a constrained memory environment while hunting for a memory leak.
Here are the startup parameters (from Eclipse's test launch configuration):
-Xms512m
-Xmx3g
-Ddrill.exec.http.enabled=false
-Ddrill.exec.sys.store.provider.local.write=false
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
-XX:MaxPermSize=256M -XX:MaxDirectMemorySize=3072M
-XX:+CMSClassUnloadingEnabled -ea
-Ddrill.exec.memory.top.max=67108864
Except for the last value, these were taken from the root pom.xml; the last value limits direct memory use to 64 MB (67108864 bytes). (We're looking for leaks that occur when a query fails to allocate memory, has to be cancelled, and isn't cleaned up properly.)
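As a quick sanity check on the numbers above, here is a minimal Java sketch that confirms the drill.exec.memory.top.max value is 64 MiB and compares it with the JVM's -XX:MaxDirectMemorySize=3072M setting. The class and method names are made up for illustration and are not part of Drill.

```java
// Illustrative only: DirectMemoryLimit is a hypothetical helper, not Drill code.
final class DirectMemoryLimit {
    // Value copied from -Ddrill.exec.memory.top.max in the launch configuration.
    static final long TOP_MAX_BYTES = 67_108_864L;

    // Value copied from -XX:MaxDirectMemorySize=3072M, expressed in MiB.
    static final long JVM_MAX_DIRECT_MIB = 3072L;

    // Converts mebibytes to bytes (1 MiB = 1024 * 1024 bytes).
    static long mebibytesToBytes(long mebibytes) {
        return mebibytes * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // The Drill limit is exactly 64 MiB.
        System.out.println(TOP_MAX_BYTES == mebibytesToBytes(64)); // prints "true"

        // The JVM direct memory ceiling is 48 times larger than the Drill limit.
        System.out.println(mebibytesToBytes(JVM_MAX_DIRECT_MIB) / TOP_MAX_BYTES); // prints "48"
    }
}
```

So the Drill-level limit, not the JVM's direct memory ceiling, is what forces allocation failures in this setup.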
I find that there is indeed a leak for tpch01 when the fragment is cleaned up. tpch01 looks like this:
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
cp.`tpch/lineitem.parquet`
where
l_shipdate <= date '1998-12-01' - interval '120' day (3)
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
This is basically a single-table query with a GROUP BY and an ORDER BY.
But in the trace file, this is the stack at the time the leaked allocator was created:
org.apache.drill.exec.ops.FragmentContext.getNewChildAllocator:302
org.apache.drill.exec.ops.OperatorContextImpl.<init>:43
org.apache.drill.exec.ops.FragmentContext.newOperatorContext:366
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch:70
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch:1
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:140
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121
org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec:96
org.apache.drill.exec.physical.impl.ImplCreator.getExec:77
org.apache.drill.exec.work.fragment.FragmentExecutor.run:199
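That seems too deep for this query: the stack shows 24 nested ImplCreator.getRecordBatch/getChildren pairs between getRootExec and the Parquet scan.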
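To measure the recursion depth in a trace like the one above without counting by hand, here is a minimal Java sketch. It assumes the trace is available as text with one frame per line in the "fully.qualified.Class.method:line" form shown in this report; StackDepthCounter is a hypothetical helper, not part of Drill.

```java
import java.util.Arrays;

// Illustrative only: counts how many frames in a textual trace belong to a given method.
final class StackDepthCounter {
    // Returns the number of lines of the form "<qualifiedMethod>:<lineNumber>".
    static long countFrames(String trace, String qualifiedMethod) {
        String prefix = qualifiedMethod + ":";
        return Arrays.stream(trace.split("\\R"))   // split on any line break
                .map(String::trim)
                .filter(line -> line.startsWith(prefix))
                .count();
    }

    public static void main(String[] args) {
        // A shortened sample in the same format as the trace in this report.
        String sample = String.join("\n",
                "org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:140",
                "org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163",
                "org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch:121",
                "org.apache.drill.exec.physical.impl.ImplCreator.getChildren:163",
                "org.apache.drill.exec.physical.impl.ImplCreator.getRootExec:96");

        long depth = countFrames(sample,
                "org.apache.drill.exec.physical.impl.ImplCreator.getChildren");
        System.out.println(depth); // prints "2"
    }
}
```

Each getChildren frame corresponds to one step down the operator tree, so the count approximates how many operators sit above the scan in the physical plan.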