Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/07/20 01:04:00 UTC

[jira] [Commented] (IMPALA-11437) Set different parallelism for scan and non-scan fragments

    [ https://issues.apache.org/jira/browse/IMPALA-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568774#comment-17568774 ] 

Quanlong Huang commented on IMPALA-11437:
-----------------------------------------

Thanks for filing this!

Could you also share the results with MT_DOP set to the number of CPUs, or to half the number of CPUs?

I'm wondering whether we can set MT_DOP to a higher value, close to the number of scanner threads used with MT_DOP=0, and then intelligently reduce the parallelism of non-scan fragments via IMPALA-9808.
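
As a rough illustration of the runs requested above, here is a minimal Python sketch that produces the MT_DOP settings to try for a given core count. The helper names (mt_dop_candidates, set_statements) and the 16-core figure are hypothetical and not taken from this issue; only the MT_DOP query option itself comes from the discussion.

```python
def mt_dop_candidates(cpu_count):
    """Return the MT_DOP values to try: #CPU and half of #CPU (at least 1)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be positive")
    half = max(1, cpu_count // 2)
    # Deduplicate while keeping order (cpu_count == 1 yields a single value).
    return list(dict.fromkeys([cpu_count, half]))


def set_statements(cpu_count):
    """Build one 'SET MT_DOP=<n>;' statement per candidate value."""
    return [f"SET MT_DOP={value};" for value in mt_dop_candidates(cpu_count)]


if __name__ == "__main__":
    # 16 cores per node is an assumed example value, not a measured one.
    for statement in set_statements(16):
        print(statement)
```

Each generated statement would be issued before rerunning the same query, so that the resulting summaries can be compared with the mt_dop=0 and mt_dop=2 runs quoted below.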

CC [~drorke], [~rizaon]

> Set different parallelism for scan and non-scan fragments
> ---------------------------------------------------------
>
>                 Key: IMPALA-11437
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11437
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: YifanZhang
>            Priority: Major
>
> Currently we can only set the same parallelism for all fragments in a query by setting 'mt_dop'. But sometimes we can get faster scanning at mt_dop=0 than at mt_dop>0, because the scanners are also multithreaded when mt_dop=0.
>  
> In a 15-node Impala cluster we get the following results:
> {code:java}
> # set mt_dop=0; 
> +---------------------+--------+-------+----------+----------+---------+------------+-----------+---------------+---------------------------------+
> | Operator            | #Hosts | #Inst | Avg Time | Max Time | #Rows   | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail                          |
> +---------------------+--------+-------+----------+----------+---------+------------+-----------+---------------+---------------------------------+
> | F02:ROOT            | 1      | 1     | 43.64us  | 43.64us  |         |            | 0 B       | 0 B           |                                 |
> | 05:MERGING-EXCHANGE | 1      | 1     | 38.90us  | 38.90us  | 4       | 72.25M     | 64.00 KB  | 141.70 MB     | UNPARTITIONED                   |
> | F01:EXCHANGE SENDER | 14     | 14    | 43.51us  | 64.53us  |         |            | 1.05 KB   | 0 B           |                                 |
> | 02:SORT             | 14     | 14    | 147.32us | 210.49us | 4       | 72.25M     | 12.02 MB  | 590.60 MB     |                                 |
> | 04:AGGREGATE        | 14     | 14    | 1.39ms   | 1.65ms   | 4       | 72.25M     | 34.04 MB  | 128.00 MB     | FINALIZE                        |
> | 03:EXCHANGE         | 14     | 14    | 33.03us  | 126.70us | 56      | 72.25M     | 120.00 KB | 11.70 MB      | HASH(l_returnflag,l_linestatus) |
> | F00:EXCHANGE SENDER | 14     | 14    | 187.56us | 214.18us |         |            | 26.50 KB  | 0 B           |                                 |
> | 01:AGGREGATE        | 14     | 14    | 2.20s    | 2.44s    | 56      | 72.25M     | 34.28 MB  | 128.00 MB     | STREAMING                       |
> | 00:SCAN HDFS        | 14     | 14    | 163.41ms | 190.82ms | 600.04M | 72.25M     | 683.66 MB | 616.00 MB     | tpch_parquet_100g.lineitem      |
> +---------------------+--------+-------+----------+----------+---------+------------+-----------+---------------+---------------------------------+
>  
> # set mt_dop=2;
> +---------------------+--------+-------+----------+----------+---------+------------+-----------+---------------+---------------------------------+
> | Operator            | #Hosts | #Inst | Avg Time | Max Time | #Rows   | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail                          |
> +---------------------+--------+-------+----------+----------+---------+------------+-----------+---------------+---------------------------------+
> | F02:ROOT            | 1      | 1     | 32.11us  | 32.11us  |         |            | 0 B       | 0 B           |                                 |
> | 05:MERGING-EXCHANGE | 1      | 1     | 45.53us  | 45.53us  | 4       | 72.25M     | 64.00 KB  | 283.39 MB     | UNPARTITIONED                   |
> | F01:EXCHANGE SENDER | 14     | 28    | 42.33us  | 86.27us  |         |            | 1.05 KB   | 0 B           |                                 |
> | 02:SORT             | 14     | 28    | 168.09us | 257.61us | 4       | 72.25M     | 12.02 MB  | 295.30 MB     |                                 |
> | 04:AGGREGATE        | 14     | 28    | 1.49ms   | 1.77ms   | 4       | 72.25M     | 34.04 MB  | 128.00 MB     | FINALIZE                        |
> | 03:EXCHANGE         | 14     | 28    | 26.13us  | 134.21us | 112     | 72.25M     | 232.00 KB | 13.39 MB      | HASH(l_returnflag,l_linestatus) |
> | F00:EXCHANGE SENDER | 14     | 28    | 287.99us | 476.30us |         |            | 37.00 KB  | 0 B           |                                 |
> | 01:AGGREGATE        | 14     | 28    | 1.03s    | 1.18s    | 112     | 72.25M     | 34.28 MB  | 128.00 MB     | STREAMING                       |
> | 00:SCAN HDFS        | 14     | 28    | 2.09s    | 2.40s    | 600.04M | 72.25M     | 43.22 MB  | 88.00 MB      | tpch_parquet_100g.lineitem      |
> +---------------------+--------+-------+----------+----------+---------+------------+-----------+---------------+---------------------------------+ {code}
>  
> It would be good if we could set the degree of parallelism for all fragments except scan fragments, or set different parallelism for scan fragments and other fragments.
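
To make the comparison in the two summaries above easier to read, the following Python sketch converts the reported Avg Time values for the scan and streaming aggregation operators to seconds and computes the mt_dop=2 to mt_dop=0 ratio. The figures are copied from the tables; the helper name to_seconds is hypothetical. These are per-operator reported times, which may not be directly comparable across the two modes, so the ratios are indicative only.

```python
UNITS = {"us": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(text):
    """Convert a summary duration such as '163.41ms' or '2.20s' to seconds."""
    # Check the two-letter suffixes first so 'ms' and 'us' are not read as 's'.
    for suffix in ("us", "ms", "s"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognized duration: {text!r}")


# Avg Time values copied from the two summaries above.
avg_time_dop0 = {"01:AGGREGATE": "2.20s", "00:SCAN HDFS": "163.41ms"}
avg_time_dop2 = {"01:AGGREGATE": "1.03s", "00:SCAN HDFS": "2.09s"}

# Ratio > 1 means the operator reported more time at mt_dop=2.
ratio = {
    op: to_seconds(avg_time_dop2[op]) / to_seconds(avg_time_dop0[op])
    for op in avg_time_dop0
}

if __name__ == "__main__":
    for op, value in ratio.items():
        print(f"{op}: mt_dop=2 / mt_dop=0 = {value:.2f}")
```

In these numbers the streaming aggregation reports roughly half the average time at mt_dop=2, while the scan operator reports more than ten times as much, which is the asymmetry the issue describes.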



--
This message was sent by Atlassian Jira
(v8.20.10#820010)