Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/07/14 01:38:00 UTC

[jira] [Created] (IMPALA-9951) Skew in analytic sorts when partition key has low cardinality

Tim Armstrong created IMPALA-9951:
-------------------------------------

             Summary: Skew in analytic sorts when partition key has low cardinality
                 Key: IMPALA-9951
                 URL: https://issues.apache.org/jira/browse/IMPALA-9951
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Tim Armstrong
            Assignee: Tim Armstrong


In queries like TPC-DS Q67, the cardinality of the PARTITION BY expression of the analytic function may be much lower than the parallelism of the input fragment. In this case the rows can reach at most as many instances of the analytic fragment as there are distinct partition key values, so the runtime of the sort is skewed across instances. We could mitigate the problem by doing the expensive sort *before* the exchange, so that the analytic fragment only needs to merge its sorted inputs and evaluate the analytic function over the result.
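
To illustrate the idea, here is a minimal Python sketch. It is not Impala code: the function names (rows_per_instance, sort_before_exchange, merge_on_receiver), the (partition_key, order_value) row layout, and the simple modulo hash exchange are all assumptions made for the example. It shows that a low-cardinality key leaves most receiving instances without rows, and that sorting on the sending side reduces the receiver's work to a merge.

```python
import heapq
from collections import Counter


def rows_per_instance(rows, num_instances):
    """Count the rows a hash exchange on the partition key sends to each instance.

    Hypothetical model: each row is (partition_key, order_value) and the
    exchange picks the target as hash(partition_key) % num_instances.
    """
    counts = Counter(hash(key) % num_instances for key, _ in rows)
    return [counts[i] for i in range(num_instances)]


def sort_before_exchange(sender_batches):
    """Sort each sender's batch locally, so the sort cost is spread over all senders."""
    return [sorted(batch) for batch in sender_batches]


def merge_on_receiver(sorted_runs):
    """Merge the pre-sorted runs; the receiver no longer does a full sort."""
    return list(heapq.merge(*sorted_runs))


# 1000 rows with only 4 distinct partition keys, exchanged to 16 instances.
rows = [(i % 4, i) for i in range(1000)]
counts = rows_per_instance(rows, 16)
busy = sum(1 for c in counts if c > 0)  # cannot exceed the 4 distinct keys

# Alternative plan: 8 senders each sort their share, the receiver merges.
sender_batches = [rows[s::8] for s in range(8)]
merged = merge_on_receiver(sort_before_exchange(sender_batches))
```

In this model the number of busy receivers is bounded by the number of distinct keys, whatever the degree of parallelism, while the pre-exchange sort is divided evenly among the senders.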

The impact of this is greater with multithreading, so I am considering only changing the default when mt_dop > 0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)