Posted to issues@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/04/23 23:35:38 UTC
[jira] [Updated] (DRILL-2865) Drillbit runs out of memory on multiple consecutive CTAS
[ https://issues.apache.org/jira/browse/DRILL-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Victoria Markman updated DRILL-2865:
------------------------------------
Description:
Hardware configuration:
- single node
- 64GB RAM
Drill configuration:
DRILL_MAX_DIRECT_MEMORY="8G"
DRILL_MAX_HEAP="4G"
`planner.enable_multiphase_agg` = false;
`store.parquet.block-size` = 134217728;
`planner.enable_mux_exchange` = false;
`exec.min_hash_table_size` = 67108864;
`planner.enable_hashagg` = true;
`planner.width.max_per_node` = 23;
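A sketch of how the settings above are typically applied in Drill. The report does not say whether the options were set at system or session scope, so the use of ALTER SYSTEM here is an assumption. The two memory limits are environment variables, normally set in conf/drill-env.sh rather than through SQL.
{code}
-- Assumption: system scope. ALTER SESSION SET takes the same form per connection.
alter system set `planner.enable_multiphase_agg` = false;
alter system set `store.parquet.block-size` = 134217728;   -- 128 MB
alter system set `planner.enable_mux_exchange` = false;
alter system set `exec.min_hash_table_size` = 67108864;
alter system set `planner.enable_hashagg` = true;
alter system set `planner.width.max_per_node` = 23;
{code}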
Aggregation query on TPCDS scale factor 1:
{code}
select
ss_sold_date_sk ,
ss_sold_time_sk ,
ss_item_sk ,
ss_customer_sk ,
ss_cdemo_sk,
count(*) from store_sales
group by
ss_sold_date_sk ,
ss_sold_time_sk ,
ss_item_sk ,
ss_customer_sk ,
ss_cdemo_sk
;
{code}
1. Executing CTAS with this query and store.format = 'parquet' consistently fails on iteration #9 with this configuration.
2. Running the query by itself: 47 iterations completed successfully.
3. Running CTAS with this query and store.format = 'csv': 30 iterations did not reproduce the problem.
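For illustration, a single CTAS iteration from steps 1 and 3 might look like the following. The workspace dfs.tmp and the table name ctas_iter_1 are hypothetical; the statements actually used are in the attached scripts.tar.
{code}
-- Use 'parquet' for step 1, 'csv' for step 3.
alter session set `store.format` = 'parquet';
-- Hypothetical target table; each iteration would need a new name
-- or a drop of the previous table.
create table dfs.tmp.`ctas_iter_1` as
select
ss_sold_date_sk ,
ss_sold_time_sk ,
ss_item_sk ,
ss_customer_sk ,
ss_cdemo_sk,
count(*) from store_sales
group by
ss_sold_date_sk ,
ss_sold_time_sk ,
ss_item_sk ,
ss_customer_sk ,
ss_cdemo_sk
;
{code}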
Attached:
- drillbit.log
- scripts.tar (contains script that reproduces OOM)
was:
Hardware configuration:
- single node
- 64GB RAM
Drill configuration
DRILL_MAX_DIRECT_MEMORY="8G"
DRILL_MAX_HEAP="4G"
`planner.enable_multiphase_agg` = false;
`store.parquet.block-size` = 134217728;
`planner.enable_mux_exchange` = false;
`exec.min_hash_table_size` = 67108864;
`planner.enable_hashagg` = true;
`planner.width.max_per_node` = 23;
Aggregation query on TPCDS scale factor 1:
select
ss_sold_date_sk ,
ss_sold_time_sk ,
ss_item_sk ,
ss_customer_sk ,
ss_cdemo_sk,
count(*) from store_sales
group by
ss_sold_date_sk ,
ss_sold_time_sk ,
ss_item_sk ,
ss_customer_sk ,
ss_cdemo_sk
;
1. Executing CTAS with this query and store.format = 'parquet' fails on iteration #9 with this configuration consistently
2. Ran query by itself: 47 iterations successfully
3. Ran CTAS with this query and store.format = 'csv': - 30 iterations did not reproduce the problem
Attached:
- drillbit.log
- scripts.tar (contains script that reproduces OOM)
> Drillbit runs out of memory on multiple consecutive CTAS
> --------------------------------------------------------
>
> Key: DRILL-2865
> URL: https://issues.apache.org/jira/browse/DRILL-2865
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 0.9.0
> Reporter: Victoria Markman