Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/10/30 17:29:00 UTC

[jira] [Updated] (IMPALA-1374) Improve Join Order Planning

     [ https://issues.apache.org/jira/browse/IMPALA-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-1374:
----------------------------------
    Issue Type: Improvement  (was: Bug)

> Improve Join Order Planning
> ---------------------------
>
>                 Key: IMPALA-1374
>                 URL: https://issues.apache.org/jira/browse/IMPALA-1374
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 1.3.2
>            Reporter: Ryan Bosshart
>            Priority: Minor
>              Labels: performance, planner
>         Attachments: consolidatedqueries_fast, consolidatedqueries_slow
>
>
> The join order is determined entirely by total size (#rows * column width). This makes sense in general. However, when the size of the fact table (after partition pruning) is close to that of the dimension table, it can be the wrong choice, because the join key from the fact table is duplicated many times, which makes the hash chains very long.
> For two almost identical queries (similar join conditions, tables, and number of results), this led to a query time of ~10 seconds for one query and ~3 minutes for the other (measured to the first row fetched; queries attached).
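
The effect described above can be illustrated with a small sketch. This is not Impala's planner or hash join code; the function names, the 16-byte width, and the table contents are all hypothetical and chosen only to show how two inputs with the same total size (#rows * column width) can produce very different hash chain lengths when one side has heavily duplicated join keys.

```python
from collections import defaultdict


def estimated_size(num_rows, width_bytes):
    """Total-size estimate (#rows * column width), as described in the issue."""
    return num_rows * width_bytes


def build_hash_table(rows, key_index):
    """Group build-side rows by join key; each list stands in for one hash chain."""
    table = defaultdict(list)
    for row in rows:
        table[row[key_index]].append(row)
    return table


def longest_chain(table):
    """Length of the longest chain, i.e. the most rows sharing one join key."""
    return max((len(chain) for chain in table.values()), default=0)


# Hypothetical data: a dimension table with 1000 unique keys, and a pruned
# fact table with the same row count but only 10 distinct join keys.
dim_rows = [(k, "dim") for k in range(1000)]
fact_rows = [(i % 10, "fact") for i in range(1000)]

# Both inputs look identical to a purely size-based ordering (assumed width: 16 bytes).
dim_size = estimated_size(len(dim_rows), 16)
fact_size = estimated_size(len(fact_rows), 16)

# Chain lengths differ sharply depending on which side is used as the build side.
dim_chain = longest_chain(build_hash_table(dim_rows, 0))
fact_chain = longest_chain(build_hash_table(fact_rows, 0))

print("size estimates:", dim_size, fact_size)
print("longest chain (build on dim): ", dim_chain)
print("longest chain (build on fact):", fact_chain)
```

In this toy setup a size-only criterion cannot tell the two inputs apart, while the number of distinct join keys does; whether and how a planner should use that information is the open question of this issue.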


