Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/06/09 22:12:00 UTC

[jira] [Commented] (IMPALA-12106) Union fragment without scan node can be overparallelized by backend scheduler by 1

    [ https://issues.apache.org/jira/browse/IMPALA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731107#comment-17731107 ] 

ASF subversion and git services commented on IMPALA-12106:
----------------------------------------------------------

Commit f1cca6c767696c9dcf145ec0eda167f8efe84628 in impala's branch refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=f1cca6c76 ]

IMPALA-12135: Deflake test_krpc_datastream_sender_shuffle

test_krpc_datastream_sender_shuffle has been failing with an OOM error in
the HDFS EC environment. This is because the fix introduced in IMPALA-12106
reduced the number of instances of a union fragment (F06) from 3 to 2.
Running the test with BATCH_SIZE=8 helps it pass while still holding the
assertion (KrpcDataStreamSender claiming megabytes of memory for
RowBatchSerialization).

Testing:
- test_krpc_datastream_sender_shuffle passes in both the regular minicluster
  and the HDFS EC setup.

Change-Id: I8c7961ad8dd489a4d62e738d364d4da1fa44d0cc
Reviewed-on: http://gerrit.cloudera.org:8080/20011
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Joe McDonnell <jo...@cloudera.com>


> Union fragment without scan node can be overparallelized by backend scheduler by 1
> ----------------------------------------------------------------------------------
>
>                 Key: IMPALA-12106
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12106
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.2.0
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>             Fix For: Impala 4.3.0
>
>         Attachments: profile_f04aed6e613865a1_acb29fb100000000.txt
>
>
> IMPALA-10973 has a bug where a union fragment without a scan node can be overparallelized by the backend scheduler by 1. This can be reproduced by running TPC-DS Q11 with MT_DOP=1.
> The planner plans 2 instances of F11.
> {code:java}
> |  F11:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
> |  Per-Instance Resources: mem-estimate=27.53MB mem-reservation=17.00MB thread-reservation=1
> |  14:UNION
> |  |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> |  |  tuple-ids=28 row-size=44B cardinality=14.80K
> |  |  in pipelines: 41(GETNEXT)
> |  |
> |  41:AGGREGATE [FINALIZE]
> |  |  output: sum:merge(ws_ext_list_price - ws_ext_discount_amt)
> |  |  group by: c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year
> |  |  having: sum(ws_ext_list_price - ws_ext_discount_amt) > CAST(0 AS DECIMAL(3,0))
> |  |  mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB thread-reservation=0
> |  |  tuple-ids=27 row-size=169B cardinality=14.80K
> |  |  in pipelines: 41(GETNEXT), 16(OPEN)
> |  |
> |  40:EXCHANGE [HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year)]
> |  |  mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
> |  |  tuple-ids=27 row-size=169B cardinality=148.00K
> |  |  in pipelines: 16(GETNEXT) {code}
> But the backend scheduler schedules 1 extra instance of F11.
> {code:java}
> |  F11:EXCHANGE SENDER           3      3  130.372us  157.038us                        52.86 KB      192.00 KB
> |  14:UNION                      3      3  113.073us  196.543us    1.33K      14.80K    8.00 KB              0
> |  41:AGGREGATE                  3      3    1.437ms    1.787ms    1.33K      14.80K   17.11 MB       17.00 MB  FINALIZE
> |  40:EXCHANGE                   3      3   82.561us  116.482us    1.33K     148.00K  104.00 KB       10.34 MB  HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year) {code}
> This is because the backend scheduler mistakenly thinks that this fragment is free to be assigned randomly, since it does not have a scan node and its number of input fragments is less than the number of backends.
> [https://github.com/apache/impala/blob/112bab64b77d6ed966b1c67bd503ed632da6f208/be/src/scheduling/scheduler.cc#L441]
> This branch should additionally check whether instances_per_host.empty().
> Attached is the full profile.
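
The condition described above can be illustrated with a minimal sketch. It is not the code from scheduler.cc: the function names, the parameter list, and the InstancesPerHost alias are hypothetical and chosen only to mirror the wording of the description. The host names and counts in the usage note are illustrative as well, not taken from the attached profile.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the per-host instance counts that the
// description refers to as instances_per_host.
using InstancesPerHost = std::map<std::string, int>;

// Condition as described for the buggy branch: a fragment with no scan node
// whose number of input fragments is below the number of backends is
// treated as free to be placed randomly.
bool CanAssignRandomlyBeforeFix(bool has_scan_node, int num_input_fragments,
                                int num_backends) {
  assert(num_input_fragments >= 0 && num_backends >= 0);
  return !has_scan_node && num_input_fragments < num_backends;
}

// Condition with the additional check proposed in the description: random
// placement is only allowed when no per-host instance counts were collected.
bool CanAssignRandomlyAfterFix(bool has_scan_node, int num_input_fragments,
                               int num_backends,
                               const InstancesPerHost& instances_per_host) {
  return CanAssignRandomlyBeforeFix(has_scan_node, num_input_fragments,
                                    num_backends) &&
         instances_per_host.empty();
}
```

Under these assumptions, a fragment without a scan node that already has instance counts recorded for some hosts (for example {"host-a": 1, "host-b": 1}) passes the first check but fails the second, so it would no longer fall into the random-assignment branch.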


