Posted to issues@hawq.apache.org by "Ruilong Huo (JIRA)" <ji...@apache.org> on 2015/11/09 07:01:10 UTC

[jira] [Created] (HAWQ-139) Out of memory with 10 concurrent TPC-H workload in YARN mode

Ruilong Huo created HAWQ-139:
--------------------------------

             Summary: Out of memory with 10 concurrent TPC-H workload in YARN mode
                 Key: HAWQ-139
                 URL: https://issues.apache.org/jira/browse/HAWQ-139
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Resource Manager
            Reporter: Ruilong Huo
            Assignee: Lei Chang


On an 18-node HAWQ cluster with YARN configured, a workload of 10 concurrent TPC-H streams (10 GB of data per node) fails with an "out of memory" error.

Further analysis shows that one TPC-H query 9 session ran out of memory while using about 1.7 GB, although the query is expected to use about 1 GB.

As a long-term fix, we need to investigate the resource manager and the executor to identify action items. As a short-term fix, we give HAWQ an 8 GB memory buffer instead of 2 GB by default.

{code}
91265 [2015-11-03 12:31:15] select
 nation,
 o_year,
 sum(amount) as sum_profit
from
 (
 select
 n_name as nation,
 extract(year from o_orderdate) as o_year,
 l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
 from
 part,
 supplier,
 lineitem,
 partsupp,
 orders,
 nation
 where
 s_suppkey = l_suppkey
 and ps_suppkey = l_suppkey
 and ps_partkey = l_partkey
 and p_partkey = l_partkey
 and o_orderkey = l_orderkey
 and s_nationkey = n_nationkey
 and p_name like '%aquamarine%'
 ) as profit
group by
 nation,
 o_year
order by
 nation,
 o_year desc;
91272 [2015-11-03 12:31:21] psql:/data1/gpadmin/pulse2-agent/agents/agent1/work/HAWQ-main-SystemTest-yarn/rhel5_x86_64/lsp/report/20151103-114720/performance_tpch_concurrent/tpch_parquet_10gpn_nocomp_part_random_10c_gpadmin/tmp/1_8_TPCH_Query_09.sql:32: ERROR:  Canceling query because of high VMEM usage. Used: 1748MB, available 480MB, red zone: 9216MB (runaway_cleaner.c:135)  (seg74 bcn-w3:5532 pid=33619) (dispatcher.c:1681)
***|tpch_parquet_10gpn_nocomp_part_random_10c_gpadmin_1_8_TPCH_Query_09.sql|127665
{code}
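
When comparing failures across the concurrent sessions, it may help to pull the memory figures out of the error line automatically. The following is only a minimal sketch: the names VMEM_RE and parse_vmem_error are hypothetical helpers, not part of HAWQ, and the pattern assumes the message wording stays exactly as in the log above.

```python
import re

# Assumption: the message format matches the log line quoted in this issue.
# VMEM_RE and parse_vmem_error are illustrative names, not HAWQ APIs.
VMEM_RE = re.compile(
    r"Used: (?P<used>\d+)MB, available (?P<available>\d+)MB, "
    r"red zone: (?P<red_zone>\d+)MB"
)


def parse_vmem_error(line):
    """Return the Used / available / red zone figures in MB, or None."""
    match = VMEM_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


# Sample taken from the error reported above (path prefix omitted).
sample = (
    "ERROR:  Canceling query because of high VMEM usage. "
    "Used: 1748MB, available 480MB, red zone: 9216MB (runaway_cleaner.c:135)"
)

if __name__ == "__main__":
    print(parse_vmem_error(sample))
```

Running the helper over every session log would show whether other queries also exceed their expected memory, or whether query 9 is the only outlier.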


