Posted to issues@impala.apache.org by "Alexander Behm (JIRA)" <ji...@apache.org> on 2017/11/10 02:21:00 UTC

[jira] [Created] (IMPALA-6178) Heuristic for selecting the left-most table should consider parallelism.

Alexander Behm created IMPALA-6178:
--------------------------------------

             Summary: Heuristic for selecting the left-most table should consider parallelism.
                 Key: IMPALA-6178
                 URL: https://issues.apache.org/jira/browse/IMPALA-6178
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
    Affects Versions: Impala 2.10.0
            Reporter: Alexander Behm
            Priority: Critical


For each query block, Impala uses the estimated materialized size of tables as a heuristic for choosing the left-most table in a series of joins. Unfortunately, that heuristic does not factor in the number of hosts each table will execute on, even though the number of hosts of the left-most table dictates the degree of inter-node parallelism.

To handle this tradeoff between parallelism and size, we should use a heuristic like the one in IMPALA-5612.
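
To make the tradeoff concrete, here is a minimal Python sketch. It is not Impala's planner code and not the heuristic from IMPALA-5612. The TableRef type, both function names, the size_tolerance parameter, and the rule itself are hypothetical, and it assumes the size-only heuristic picks the table with the largest estimated materialized size.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TableRef:
    """Hypothetical stand-in for a table reference in one query block."""
    name: str
    est_materialized_bytes: int  # estimated materialized size
    num_hosts: int               # hosts the table's scan would run on


def pick_leftmost_by_size(tables: List[TableRef]) -> TableRef:
    # Size-only heuristic (assumed here: the largest table goes left-most).
    return max(tables, key=lambda t: t.est_materialized_bytes)


def pick_leftmost_with_parallelism(
    tables: List[TableRef], size_tolerance: float = 0.5
) -> TableRef:
    # Hypothetical rule: among tables whose size is at least
    # size_tolerance times the largest size, prefer the one with the
    # most hosts, breaking ties by size.
    largest = pick_leftmost_by_size(tables)
    threshold = size_tolerance * largest.est_materialized_bytes
    candidates = [t for t in tables if t.est_materialized_bytes >= threshold]
    return max(candidates, key=lambda t: (t.num_hosts, t.est_materialized_bytes))


tables = [
    TableRef("big_single_host", 100, 1),
    TableRef("slightly_smaller", 80, 10),
    TableRef("small", 5, 10),
]
by_size = pick_leftmost_by_size(tables)
by_parallelism = pick_leftmost_with_parallelism(tables)
```

With these made-up numbers, the size-only choice is big_single_host, which limits the join pipeline to one host, while the parallelism-aware variant selects slightly_smaller and runs on ten hosts.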

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)