Posted to issues@impala.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2019/02/19 18:56:01 UTC

[jira] [Created] (IMPALA-8220) Track adjusted NDV for each column through plan tree

Paul Rogers created IMPALA-8220:
-----------------------------------

             Summary: Track adjusted NDV for each column through plan tree
                 Key: IMPALA-8220
                 URL: https://issues.apache.org/jira/browse/IMPALA-8220
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
    Affects Versions: Impala 3.1.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers


See IMPALA-XXXX. To properly account for join cardinalities, we must track the adjusted NDV (number of distinct values) of columns as they pass through filters. IMPALA-8014, IMPALA-8015, and IMPALA-8213 suggest workarounds based on the current code design.

A better longer-term solution is to track the adjusted NDV for each column up the plan tree.

That is, suppose we have column {{c}} with an original NDV of {{|c|}}. The scan applies a filter of {{c = 10}}. Clearly, the NDV out of the scan, {{|c'|}}, is just 1.
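
As a minimal sketch of the adjustment described above, the following Java helper computes {{|c'|}} for an equality filter. The class and method names are hypothetical and are not part of the Impala frontend. The IN-list case is an added assumption for illustration (every listed value is distinct and could match); the issue itself only describes the equality case.

```java
/** Hypothetical helper, not Impala code: NDV of a column after a scan filter. */
final class AdjustedNdv {
    private AdjustedNdv() {}

    /** NDV after the filter c = constant: at most one distinct value survives. */
    static long afterEquality(long originalNdv) {
        return Math.min(originalNdv, 1L);
    }

    /**
     * NDV after the filter c IN (v1, ..., vk).
     * Assumption: the k listed values are distinct, so at most k values survive.
     */
    static long afterInList(long originalNdv, int listSize) {
        return Math.min(originalNdv, (long) listSize);
    }
}
```

For a column with an original NDV of 1000, {{afterEquality(1000)}} gives 1, matching the {{c = 10}} case above.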

By tracking the filtered NDV, calculations up the tree become local. At present, the join node must reach down through the tree to find filters and potentially reverse them. This is complex and can be replaced with per-column NDV tracking.
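
One way to picture the "local" calculation is the sketch below, in which each plan node hands its parent a row estimate plus an adjusted NDV per column, and the join looks only at those two inputs. Everything here is hypothetical: the {{NdvPlanSketch}} and {{Stats}} names, the use of fully qualified column names as map keys, and the join formula, which is the textbook {{|L| * |R| / max(ndv(L.k), ndv(R.k))}} estimate rather than Impala's actual code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not Impala code: per-column NDVs flow up the plan tree. */
final class NdvPlanSketch {
    /** Output of one plan node: row estimate plus adjusted NDV per column. */
    static final class Stats {
        final double rows;
        final Map<String, Double> ndv;

        Stats(double rows, Map<String, Double> ndv) {
            this.rows = rows;
            // A column cannot have more distinct values than there are rows.
            Map<String, Double> capped = new HashMap<>();
            for (Map.Entry<String, Double> e : ndv.entrySet()) {
                capped.put(e.getKey(), Math.min(e.getValue(), rows));
            }
            this.ndv = capped;
        }
    }

    /** Scan with no filter: table row count and column NDVs from table stats. */
    static Stats scan(double rows, Map<String, Double> ndv) {
        return new Stats(rows, ndv);
    }

    /** Applies col = constant: rows shrink by 1/NDV and the column's NDV becomes 1. */
    static Stats filterEquals(Stats in, String col) {
        double rows = in.rows / Math.max(1.0, in.ndv.get(col));
        Map<String, Double> out = new HashMap<>(in.ndv);
        out.put(col, 1.0);
        return new Stats(rows, out);
    }

    /** Equi-join that reads only its children's Stats; no reaching down the tree. */
    static Stats join(Stats l, String lKey, Stats r, String rKey) {
        double lNdv = l.ndv.get(lKey);
        double rNdv = r.ndv.get(rKey);
        double rows = l.rows * r.rows / Math.max(1.0, Math.max(lNdv, rNdv));
        Map<String, Double> out = new HashMap<>(l.ndv);
        out.putAll(r.ndv);
        // After the join both key columns hold the same surviving values.
        double keyNdv = Math.min(lNdv, rNdv);
        out.put(lKey, keyNdv);
        out.put(rKey, keyNdv);
        return new Stats(rows, out);
    }
}
```

For example, take a 1,000,000-row table whose join key has an NDV of 10,000, filtered by key = constant. The filter leaves 100 rows with a key NDV of 1. Joining that to a 10,000-row table with a key NDV of 10,000 gives 100 * 10,000 / 10,000 = 100 rows, computed without the join inspecting the filter below it.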


