Posted to issues@impala.apache.org by "Tianyi Wang (JIRA)" <ji...@apache.org> on 2018/04/17 21:44:00 UTC

[jira] [Resolved] (IMPALA-6822) Provide a query option to not shuffle on distinct exprs

     [ https://issues.apache.org/jira/browse/IMPALA-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tianyi Wang resolved IMPALA-6822.
---------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.12.0
                   Impala 3.0

> Provide a query option to not shuffle on distinct exprs
> -------------------------------------------------------
>
>                 Key: IMPALA-6822
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6822
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 2.13.0
>            Reporter: Tianyi Wang
>            Assignee: Tianyi Wang
>            Priority: Critical
>              Labels: performance, planner, regression
>             Fix For: Impala 3.0, Impala 2.12.0
>
>
> After IMPALA-4794, a distinct aggregation shuffles data on both the grouping exprs and the distinct expr. This works well if the NDV of the grouping exprs is low, but it is a regression otherwise. We should provide a query option to disable the IMPALA-4794 behavior, and probably look into smarter planning in the future.
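
The trade-off described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Impala's planner or executor code: the function names, the use of Python's built-in hash() as the partitioning function, and the (group, distinct) row layout are all hypothetical. It compares a single shuffle on the grouping key with a shuffle on (grouping key, distinct value) followed by a second shuffle on the grouping key, for a query of the form SELECT g, COUNT(DISTINCT d) ... GROUP BY g.

```python
from collections import defaultdict


def shuffle_on_group_only(rows, num_nodes):
    """One exchange: partition on the grouping key g only.

    Each node owns whole groups, so it can count distinct d values directly.
    With few distinct g values, only a few nodes receive any data.
    """
    nodes = [defaultdict(set) for _ in range(num_nodes)]
    for g, d in rows:
        nodes[hash(g) % num_nodes][g].add(d)
    result = {}
    for node in nodes:
        for g, values in node.items():
            result[g] = len(values)
    return result


def shuffle_on_group_and_distinct(rows, num_nodes):
    """Two exchanges: partition on (g, d), then on g.

    Returns the final counts and the number of partial rows sent through
    the second exchange (hypothetical cost measure for this sketch).
    """
    # Exchange 1: all copies of a (g, d) pair land on one node and are deduplicated.
    phase1 = [set() for _ in range(num_nodes)]
    for g, d in rows:
        phase1[hash((g, d)) % num_nodes].add((g, d))

    # Exchange 2: each node sends one partial count per group it has seen.
    phase2 = [defaultdict(int) for _ in range(num_nodes)]
    second_exchange_rows = 0
    for pairs in phase1:
        partial = defaultdict(int)
        for g, _ in pairs:
            partial[g] += 1
        for g, count in partial.items():
            phase2[hash(g) % num_nodes][g] += count
            second_exchange_rows += 1

    result = {}
    for node in phase2:
        result.update(node)
    return result, second_exchange_rows
```

Both functions return the same counts. In this model, when the grouping key has few distinct values the second exchange carries only a handful of partial rows and the first exchange spreads work across all nodes. When the grouping key has many distinct values, the number of partial rows in the second exchange grows toward the number of distinct (g, d) pairs, which is the extra cost the issue describes.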



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)