Posted to issues@hawq.apache.org by "Ruilong Huo (JIRA)" <ji...@apache.org> on 2015/10/13 08:57:05 UTC
[jira] [Comment Edited] (HAWQ-12) "Cannot allocate memory" in parquet_compression test in installcheck-good with hawq dbg build
[ https://issues.apache.org/jira/browse/HAWQ-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952756#comment-14952756 ]
Ruilong Huo edited comment on HAWQ-12 at 10/13/15 6:56 AM:
-----------------------------------------------------------
After the evaluation, the memory consumption of the query is reasonable:
1) The "internal statistics" and "external monitoring" views of the query's memory usage match.
2) No single operator consumes excessive memory.
Details below:
1. From the "internal" perspective, the explain analyze result shows that:
1) The memory quota for the query is 2G.
2) There is one segment for the query, and it has 6 slices. The memory consumption of all slices is about 115564K bytes (about 112.8M bytes).
{noformat}
slice0 Executor memory: 712K bytes.
slice1 Executor memory: 11980K bytes (seg0:localhost.localdomain)
slice2 Executor memory: 11980K bytes (seg0:localhost.localdomain)
slice3 Executor memory: 11980K bytes (seg0:localhost.localdomain)
slice4 Executor memory: 15733K bytes (seg0:localhost.localdomain)
slice5 Executor memory: 63179K bytes (seg0:localhost.localdomain)
--------------------------------------------------------------------
Query Executor memory: 115564K bytes (112.8M bytes)
{noformat}
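As a sanity check, the per-slice executor memory figures above should add up to the reported query total; a minimal sketch (values in KB, copied from the EXPLAIN ANALYZE output):

```python
# Per-slice executor memory in KB, from the EXPLAIN ANALYZE output above.
slice_kb = [712, 11980, 11980, 11980, 15733, 63179]

total_kb = sum(slice_kb)
print(total_kb)          # 115564, matching the reported query total
print(total_kb / 1024)   # roughly 112.8M bytes
```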
3) No single operator consumes excessive memory.
For details, please refer to attached parquet_compression_explain_analyze.out and parquet_compression_explain_analyze.gif
2. From the "external" perspective, monitoring the rough memory usage during query execution shows:
{noformat}
        Memory usage                     Available memory on OS during query execution
Run 1:  317936K bytes (310.5M bytes)     Min: 3943444K bytes    Max: 4261380K bytes
Run 2:  241860K bytes (236.2M bytes)     Min: 4062304K bytes    Max: 4304164K bytes
{noformat}
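The per-run "memory usage" above appears to be the spread in available OS memory observed while the query ran; a minimal sketch under that assumption:

```python
# Assumption: external memory usage per run = Max - Min available OS memory
# (in KB) sampled during query execution, per the table above.
runs = {
    "Run 1": (3943444, 4261380),
    "Run 2": (4062304, 4304164),
}
for name, (min_kb, max_kb) in runs.items():
    used_kb = max_kb - min_kb
    print(name, used_kb, round(used_kb / 1024, 1))  # 317936K/310.5M, 241860K/236.2M
```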
3. There is a gap of about 124M ~ 198M bytes between the "internal" and "external" views of the query's memory consumption. The gap comes from:
1) There are 1 QD and 6 QEs. Each of them consumes 12M+ bytes, which adds up to about 100M bytes in total.
2) The memory consumption of some library functions (e.g., strcoll) is not covered by memory monitoring, because these functions bypass gp_malloc and call malloc/free directly.
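The accounting blind spot in 2) can be illustrated with a toy tracker (names here are hypothetical, not HAWQ's actual API; the real accounting lives in gp_malloc): only allocations routed through the tracked allocator are counted, while a library that calls the system allocator directly stays invisible to the statistics.

```python
# Toy model of memory accounting (hypothetical names, not HAWQ's real API).
class Tracker:
    def __init__(self):
        self.counted = 0

    def gp_malloc(self, nbytes):
        # Tracked path: the allocation is added to internal statistics.
        self.counted += nbytes
        return bytearray(nbytes)

    @staticmethod
    def raw_malloc(nbytes):
        # Direct path: allocation happens, but the tracker never sees it.
        return bytearray(nbytes)

t = Tracker()
buf1 = t.gp_malloc(1024)          # visible in "internal" statistics
buf2 = Tracker.raw_malloc(4096)   # e.g. strcoll allocating internally
print(t.counted)                  # 1024, not 5120: 4096 bytes go unaccounted
```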
was (Author: huor):
The explain analyze result shows that it uses about 2G memory. For details, please refer to attached parquet_compression_explain_analyze.out and parquet_compression_explain_analyze.gif
> "Cannot allocate memory" in parquet_compression test in installcheck-good with hawq dbg build
> ---------------------------------------------------------------------------------------------
>
> Key: HAWQ-12
> URL: https://issues.apache.org/jira/browse/HAWQ-12
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Storage
> Environment: Red Hat Enterprise Linux Server release 5.5 (Tikanga)
> Linux pbld3 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: Ruilong Huo
> Assignee: Ruilong Huo
> Attachments: parquet_compression_explain_analyze.gif, parquet_compression_explain_analyze.out
>
>
> When running installcheck-good with hawq dbg build on a Linux box (RHEL 5.5, 12G Memory, Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz with 4 processors), the parquet_compression test fails with "Cannot allocate memory" from time to time.
> Initial investigation shows that strcoll fails to allocate memory to complete string comparison with locale considered during outer join of two partitioned parquet tables with gzip compression.
> We need to understand: 1) the amount of memory used by the outer join query, and conclude whether it is expected; 2) fix the OOM if there are issues either with memory leak or with memory protection/enforcement.
> {noformat}
> 2015-09-25 00:31:22.852771 PDT,"gpadmin","regression",p9703,th-1437302464,"127.0.0.1","39230",2015-09-25 00:31:16 PDT,4502,con368,cmd50,seg-1,,,x4502,sx1,"ERROR","XX000","Unable to compare strings. Error: Cannot allocate memory. First string has length 1145620 and value (limited to 100 characters): 'large data value for text data typelarge data value for text data typelarge data value for text data'. Second string has length 1145620 and value (limited to 100 characters): 'large data value for text data typelarge data value for text data typelarge data value for text data' (string_wrapper.h:58) (seg0 pbld3:23011 pid=9715) (dispatcher.c:1681)",,,,,,"select count(*) from parquet_gzip_part c1 full outer join parquet_gzip_part_unc c2 on c1.p1=c2.p1 and c1.document=c2.document and c1.vch1=c2.vch1 and c1.bta1=c2.bta1 and c1.bitv1=c2.bitv1;",0,,"dispatcher.c",1681,"Stack trace:
> 1 0x9de185 postgres errstart (elog.c:473)
> 2 0xb856f2 postgres <symbol not found> (dispatcher.c:1679)
> 3 0xb84c45 postgres dispatch_catch_error (dispatcher.c:1342)
> 4 0x7384e0 postgres mppExecutorCleanup (execUtils.c:2267)
> 5 0x718b21 postgres ExecutorRun (execMain.c:1230)
> 6 0x900648 postgres <symbol not found> (pquery.c:1642)
> 7 0x900225 postgres PortalRun (pquery.c:1466)
> 8 0x8f6276 postgres <symbol not found> (postgres.c:1728)
> 9 0x8faec8 postgres PostgresMain (postgres.c:4693)
> 10 0x89db5a postgres <symbol not found> (postmaster.c:5846)
> 11 0x89cfe4 postgres <symbol not found> (postmaster.c:5438)
> 12 0x897702 postgres <symbol not found> (postmaster.c:2146)
> 13 0x8967d8 postgres PostmasterMain (postmaster.c:1432)
> 14 0x7b095e postgres main (main.c:226)
> 15 0x336e21d994 libc.so.6 __libc_start_main (??:0)
> 16 0x4b9109 postgres <symbol not found> (??:0)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)