Posted to issues-all@impala.apache.org by "Riza Suminto (Jira)" <ji...@apache.org> on 2023/04/28 15:43:00 UTC

[jira] [Created] (IMPALA-12106) Union fragment without scan node can be overparallelized by backend scheduler by 1

Riza Suminto created IMPALA-12106:
-------------------------------------

             Summary: Union fragment without scan node can be overparallelized by backend scheduler by 1
                 Key: IMPALA-12106
                 URL: https://issues.apache.org/jira/browse/IMPALA-12106
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 4.2.0
            Reporter: Riza Suminto
            Assignee: Riza Suminto
         Attachments: profile_f04aed6e613865a1_acb29fb100000000.txt

IMPALA-10973 has a bug where a union fragment without a scan node can be overparallelized by the backend scheduler by 1 instance. This can be reproduced by running TPC-DS Q11 with MT_DOP=1.

The planner plans 2 instances of F11:
{code:java}
|  F11:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
|  Per-Instance Resources: mem-estimate=27.53MB mem-reservation=17.00MB thread-reservation=1
|  14:UNION
|  |  mem-estimate=0B mem-reservation=0B thread-reservation=0
|  |  tuple-ids=28 row-size=44B cardinality=14.80K
|  |  in pipelines: 41(GETNEXT)
|  |
|  41:AGGREGATE [FINALIZE]
|  |  output: sum:merge(ws_ext_list_price - ws_ext_discount_amt)
|  |  group by: c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year
|  |  having: sum(ws_ext_list_price - ws_ext_discount_amt) > CAST(0 AS DECIMAL(3,0))
|  |  mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB thread-reservation=0
|  |  tuple-ids=27 row-size=169B cardinality=14.80K
|  |  in pipelines: 41(GETNEXT), 16(OPEN)
|  |
|  40:EXCHANGE [HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year)]
|  |  mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
|  |  tuple-ids=27 row-size=169B cardinality=148.00K
|  |  in pipelines: 16(GETNEXT) {code}
But the backend scheduler schedules 1 extra instance of F11, for 3 instances in total:
{code:java}
|  F11:EXCHANGE SENDER           3      3  130.372us  157.038us                        52.86 KB      192.00 KB                                                                                                                    
|  14:UNION                      3      3  113.073us  196.543us    1.33K      14.80K    8.00 KB              0                                                                                                                    
|  41:AGGREGATE                  3      3    1.437ms    1.787ms    1.33K      14.80K   17.11 MB       17.00 MB  FINALIZE                                                                                                          
|  40:EXCHANGE                   3      3   82.561us  116.482us    1.33K     148.00K  104.00 KB       10.34 MB  HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year) {code}
This is because the backend scheduler mistakenly thinks that this fragment is free to be assigned randomly, since it does not have a scan node and its number of input fragments is less than the number of backends.
[https://github.com/apache/impala/blob/112bab64b77d6ed966b1c67bd503ed632da6f208/be/src/scheduling/scheduler.cc#L441] 

This branch should additionally check if instances_per_host.empty().
Attached is the full profile.
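
As a rough illustration of the proposed check, here is a minimal, self-contained sketch. It is not the code in scheduler.cc: only the name instances_per_host comes from this report, while the function names, the simplified conditions, and the host and backend counts are hypothetical and chosen to mirror the 2 planned vs. 3 scheduled instances shown above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the per-host instance counts that the scheduler
// has already derived for the fragment (host name -> number of instances).
using InstancesPerHost = std::map<std::string, int>;

// Simplified version of the condition as described in this report: the
// fragment has no scan node and fewer input fragments than backends.
bool RandomBranchBefore(bool has_scan_node, int num_inputs, int num_backends) {
  assert(num_backends > 0);
  return !has_scan_node && num_inputs < num_backends;
}

// Same condition with the additional check proposed in this report: take the
// random-assignment branch only if no instances have been placed yet.
bool RandomBranchProposed(bool has_scan_node, int num_inputs, int num_backends,
    const InstancesPerHost& instances_per_host) {
  return RandomBranchBefore(has_scan_node, num_inputs, num_backends)
      && instances_per_host.empty();
}

// Models the observed symptom (an assumption of this sketch): taking the
// random branch adds one instance on top of those already placed.
int ScheduledInstances(const InstancesPerHost& instances_per_host,
    bool takes_random_branch) {
  int total = 0;
  for (const auto& entry : instances_per_host) total += entry.second;
  return takes_random_branch ? total + 1 : total;
}
```

With two hosts already holding one instance each, 2 input fragments, and 3 backends (all illustrative values), RandomBranchBefore is true and the model yields 3 instances, while RandomBranchProposed is false and the count stays at the planned 2.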


