Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/10/23 21:29:00 UTC

[jira] [Commented] (IMPALA-6596) Query failed with OOM on coordinator while remote fragments on other nodes continue to run

    [ https://issues.apache.org/jira/browse/IMPALA-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661314#comment-16661314 ] 

Tim Armstrong commented on IMPALA-6596:
---------------------------------------

[~kwho] do we think this is fixed by your status reporting changes? Just looking at critical JIRAs - we have a lot of them.

> Query failed with OOM on coordinator while remote fragments on other nodes continue to run
> ------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-6596
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6596
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.11.0
>            Reporter: Mostafa Mokhtar
>            Priority: Critical
>         Attachments: RowSizeFailureProfile.txt
>
>
> This is somewhat similar to IMPALA-2990.
> Query:
> {code:java}
> set NUM_SCANNER_THREADS=2;
> set MAX_ROW_SIZE=4194304;
> with cte as (
>   select group_concat(cast(fnv_hash(concat(o_comment, 'd')) as string)) as c1,
>          group_concat(cast(fnv_hash(concat(o_comment, 'e')) as string)) as c2
>   from orders
>   where o_orderkey < 1200000000 and o_orderdate < "1993-01-01"
>   union all
>   select group_concat(cast(fnv_hash(concat(o_comment, 'd')) as string)) as c1,
>          group_concat(cast(fnv_hash(concat(o_comment, 'e')) as string)) as c2
>   from orders
>   where o_orderkey < 1200000000 and o_orderdate < "1993-01-01"
> ),
> cte2 as (
>   select c1, c2, s_suppkey from cte, supplier
> )
> select count(*)
> from cte2 t1, cte2 t2
> where t1.s_suppkey = t2.s_suppkey
> group by t1.c1, t1.c2, t2.c1, t2.c2
> having count(*) = 1
> {code}
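> For context (not stated in the original report): group_concat builds one unbounded string per group, so each of these aggregations materializes a single row whose size grows with the number of qualifying orders rows, which is how the query ends up producing a multi-GB row. A minimal hypothetical sketch of the same pattern:
> {code:sql}
> -- Hypothetical illustration only: a single group_concat over a large table
> -- produces one output row whose size grows linearly with the input row count.
> select group_concat(cast(fnv_hash(o_comment) as string)) from orders;
> {code}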
> Failed on the coordinator node (which is also an executor) with:
> {code:java}
> Status: Row of size 1.82 GB could not be materialized in plan node with id 14. Increase the max_row_size query option (currently 4.00 MB) to process larger rows.
> {code}
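> For reference (not part of the original report), MAX_ROW_SIZE is the per-query option the error message refers to; a minimal sketch of raising it, with an illustrative value only, since the 1.82 GB row here would exceed any practical setting:
> {code:sql}
> -- Illustrative only: raise MAX_ROW_SIZE (in bytes) for the session before re-running.
> -- 16 MB shown as an example; it would not accommodate the 1.82 GB row in this report.
> set MAX_ROW_SIZE=16777216;
> {code}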
> The log on the coordinator has many entries like:
> {code:java}
> I0227 19:20:58.057637 62974 impala-server.cc:1196] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 9b439fc1ee1addb7:82d4156900000000
> I0227 19:20:58.152979 63129 impala-server.cc:1196] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 9b439fc1ee1addb7:82d4156900000000
> I0227 19:20:58.714336 63930 impala-server.cc:1196] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 9b439fc1ee1addb7:82d4156900000000
> I0227 19:20:58.718415 63095 impala-server.cc:1196] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 9b439fc1ee1addb7:82d4156900000000
> I0227 19:20:58.757306 63339 impala-server.cc:1196] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 9b439fc1ee1addb7:82d4156900000000
> I0227 19:20:58.762310 63406 impala-server.cc:1196] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 9b439fc1ee1addb7:82d4156900000000
> {code}
> From the memz tab on a different node. 
> h2. Memory Usage
> Memory consumption / limit: *142.87 GB* / *100.00 GB*
> h3. Breakdown
> {code}
> Process: memory limit exceeded. Limit=100.00 GB Total=141.05 GB Peak=144.25 GB
>   Buffer Pool: Free Buffers: Total=0
>   Buffer Pool: Clean Pages: Total=0
>   Buffer Pool: Unused Reservation: Total=-118.00 MB
>   TCMalloc Overhead: Total=1.68 GB
>   RequestPool=root.default: Total=40.10 GB Peak=89.40 GB
>     Query(9b439fc1ee1addb7:82d4156900000000): Reservation=122.00 MB ReservationLimit=80.00 GB OtherMemory=39.98 GB Total=40.10 GB Peak=40.10 GB
>       Unclaimed reservations: Reservation=72.00 MB OtherMemory=0 Total=72.00 MB Peak=226.00 MB
>       Fragment 9b439fc1ee1addb7:82d4156900000059: Reservation=0 OtherMemory=0 Total=0 Peak=632.88 KB
>         AGGREGATION_NODE (id=29): Total=0 Peak=76.12 KB
>         EXCHANGE_NODE (id=28): Reservation=0 OtherMemory=0 Total=0 Peak=0
>           DataStreamRecvr: Total=0 Peak=0
>         DataStreamSender (dst_id=30): Total=0 Peak=1.75 KB
>         CodeGen: Total=0 Peak=547.00 KB
>       Fragment 9b439fc1ee1addb7:82d4156900000050: Reservation=0 OtherMemory=0 Total=0 Peak=3.67 GB
>         AGGREGATION_NODE (id=15): Total=0 Peak=76.12 KB
>         HASH_JOIN_NODE (id=14): Reservation=0 OtherMemory=0 Total=0 Peak=1.85 GB
>           Hash Join Builder (join_node_id=14): Total=0 Peak=13.12 KB
>         EXCHANGE_NODE (id=26): Reservation=0 OtherMemory=0 Total=0 Peak=1.82 GB
>           DataStreamRecvr: Total=0 Peak=1.82 GB
>         EXCHANGE_NODE (id=27): Reservation=0 OtherMemory=0 Total=0 Peak=1.82 GB
>           DataStreamRecvr: Total=0 Peak=1.82 GB
>         DataStreamSender (dst_id=28): Total=0 Peak=15.75 KB
>         CodeGen: Total=0 Peak=1.81 MB
>       Fragment 9b439fc1ee1addb7:82d4156900000023: Reservation=26.00 MB OtherMemory=19.99 GB Total=20.02 GB Peak=20.02 GB
>         Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB OtherMemory=0 Total=2.00 MB Peak=2.00 MB
>         NESTED_LOOP_JOIN_NODE (id=6): Total=3.63 GB Peak=3.63 GB
>           Nested Loop Join Builder: Total=3.63 GB Peak=3.63 GB
>         HDFS_SCAN_NODE (id=5): Reservation=24.00 MB OtherMemory=2.88 MB Total=26.88 MB Peak=26.88 MB
>         EXCHANGE_NODE (id=20): Reservation=0 OtherMemory=0 Total=0 Peak=1.82 GB
>           DataStreamRecvr: Total=0 Peak=1.82 GB
>         DataStreamSender (dst_id=26): Total=16.36 GB Peak=16.36 GB
>         CodeGen: Total=628.00 B Peak=222.50 KB
>       Fragment 9b439fc1ee1addb7:82d4156900000007: Reservation=0 OtherMemory=0 Total=0 Peak=734.73 MB
>         AGGREGATION_NODE (id=2): Total=0 Peak=516.29 MB
>         HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 Total=0 Peak=350.27 MB
>         DataStreamSender (dst_id=16): Total=0 Peak=3.88 KB
>         CodeGen: Total=0 Peak=839.50 KB
>       Fragment 9b439fc1ee1addb7:82d4156900000014: Reservation=0 OtherMemory=0 Total=0 Peak=734.18 MB
>         AGGREGATION_NODE (id=4): Total=0 Peak=516.33 MB
>         HDFS_SCAN_NODE (id=3): Reservation=0 OtherMemory=0 Total=0 Peak=349.92 MB
>         DataStreamSender (dst_id=18): Total=0 Peak=3.88 KB
>         CodeGen: Total=0 Peak=839.50 KB
>       Fragment 9b439fc1ee1addb7:82d4156900000047: Reservation=24.00 MB OtherMemory=19.99 GB Total=20.02 GB Peak=20.02 GB
>         NESTED_LOOP_JOIN_NODE (id=13): Total=3.63 GB Peak=3.63 GB
>           Nested Loop Join Builder: Total=3.63 GB Peak=3.63 GB
>         HDFS_SCAN_NODE (id=12): Reservation=24.00 MB OtherMemory=2.88 MB Total=26.88 MB Peak=26.88 MB
>         EXCHANGE_NODE (id=25): Reservation=0 OtherMemory=0 Total=0 Peak=1.82 GB
>           DataStreamRecvr: Total=0 Peak=1.82 GB
>         DataStreamSender (dst_id=27): Total=16.36 GB Peak=16.36 GB
>         CodeGen: Total=234.00 B Peak=52.50 KB
>       Fragment 9b439fc1ee1addb7:82d415690000002b: Reservation=0 OtherMemory=0 Total=0 Peak=734.73 MB
>         AGGREGATION_NODE (id=9): Total=0 Peak=516.29 MB
>         HDFS_SCAN_NODE (id=8): Reservation=0 OtherMemory=0 Total=0 Peak=350.24 MB
>         DataStreamSender (dst_id=21): Total=0 Peak=3.88 KB
>         CodeGen: Total=0 Peak=839.50 KB
>       Fragment 9b439fc1ee1addb7:82d4156900000038: Reservation=0 OtherMemory=0 Total=0 Peak=734.73 MB
>         AGGREGATION_NODE (id=11): Total=0 Peak=516.29 MB
>         HDFS_SCAN_NODE (id=10): Reservation=0 OtherMemory=0 Total=0 Peak=350.52 MB
>         DataStreamSender (dst_id=23): Total=0 Peak=3.88 KB
>         CodeGen: Total=0 Peak=839.50 KB
>   Untracked Memory: Total=99.38 GB
> {code}
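> As a rough cross-check of the numbers above: Untracked Memory (99.38 GB) + Query (40.10 GB) + TCMalloc Overhead (1.68 GB) - Unused Reservation (0.12 GB) ≈ 141.04 GB, which lines up with the reported Total=141.05 GB, so the bulk of the consumption is untracked memory.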
>  
> It is strange that a consumption of 143.63 GB is reported while the system has 128 GB of RAM.
> {code}
> Hardware Info
> Cpu Info:
>   Model: Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz
>   Cores: 24
>   Max Possible Cores: 24
>   L1 Cache: 32.00 KB (Line: 64.00 B)
>   L2 Cache: 256.00 KB (Line: 64.00 B)
>   L3 Cache: 15.00 MB (Line: 64.00 B)
>   Hardware Supports: ssse3 sse4_1 sse4_2 popcnt avx pclmulqdq
>   Numa Nodes: 2
>   Numa Nodes of Cores: 0->0 | 1->0 | 2->0 | 3->0 | 4->0 | 5->0 | 6->1 | 7->1 | 8->1 | 9->1 | 10->1 | 11->1 | 12->0 | 13->0 | 14->0 | 15->0 | 16->0 | 17->0 | 18->1 | 19->1 | 20->1 | 21->1 | 22->1 | 23->1 |
> Physical Memory: 126.00 GB
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org