Posted to issues-all@impala.apache.org by "Riza Suminto (Jira)" <ji...@apache.org> on 2023/06/12 15:10:00 UTC

[jira] [Resolved] (IMPALA-12183) Maintain cardinality clamping across multi-phase aggregation

     [ https://issues.apache.org/jira/browse/IMPALA-12183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Riza Suminto resolved IMPALA-12183.
-----------------------------------
    Fix Version/s: Impala 4.3.0
       Resolution: Fixed

> Maintain cardinality clamping across multi-phase aggregation
> ------------------------------------------------------------
>
>                 Key: IMPALA-12183
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12183
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.2.0
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>             Fix For: Impala 4.3.0
>
>
> In the Impala planner, an aggregation node's cardinality is the sum of the cardinalities of its aggregation classes. The cardinality of an aggregation class is the product of the NDVs (numbers of distinct values) of its grouping columns. Because this product can exceed the aggregation node's input cardinality, each aggregation class cardinality is further clamped at the node's input cardinality.
> An aggregation operator can translate into a chain of multi-phase aggregation plan nodes. The longest possible chain of aggregation phases is as follows, from bottom to top:
>  # FIRST
>  # FIRST_MERGE
>  # SECOND
>  # SECOND_MERGE
>  # TRANSPOSE
> A FIRST_MERGE aggregation keeps its aggregation class cardinalities clamped at the input cardinality of its corresponding FIRST aggregation (SECOND_MERGE relates to SECOND in the same way). However, the SECOND aggregation was clamped at the FIRST_MERGE output cardinality instead of the FIRST input cardinality. This cardinality mispropagation can cause a cardinality explosion in the later aggregation phases and in the plan nodes above them.
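
The clamping arithmetic described above can be illustrated with a small Python sketch. This is not Impala's planner code: the function names, the NDV values, and the input cardinality of 1000 are all hypothetical, and for simplicity the same grouping-column NDVs are reused in every phase. It only shows how clamping SECOND at the FIRST_MERGE output, rather than at the FIRST input, can inflate the estimate.

```python
from math import prod

def class_cardinality(ndvs, clamp):
    # Product of grouping-column NDVs, clamped at the given cardinality.
    return min(prod(ndvs), clamp)

def node_cardinality(classes, clamp):
    # An aggregation node's cardinality: sum over its aggregation classes.
    return sum(class_cardinality(ndvs, clamp) for ndvs in classes)

# Hypothetical inputs: two aggregation classes and 1000 input rows to FIRST.
first_input = 1000
classes = [[100, 50], [200, 30]]  # NDV products: 5000 and 6000

# FIRST and FIRST_MERGE both clamp at the FIRST input cardinality.
first = node_cardinality(classes, first_input)        # 1000 + 1000 = 2000
first_merge = node_cardinality(classes, first_input)  # 2000

# Behavior described in the issue: SECOND clamps at the FIRST_MERGE output.
second_before = node_cardinality(classes, first_merge)  # 2000 + 2000 = 4000

# Intended behavior: SECOND keeps the clamp at the FIRST input cardinality.
second_after = node_cardinality(classes, first_input)   # 1000 + 1000 = 2000

print(first, first_merge, second_before, second_after)
```

In this sketch the SECOND estimate doubles (4000 instead of 2000) purely because the clamp value grew with the number of aggregation classes; any node above it would inherit the inflated figure.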



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
