Posted to dev@hive.apache.org by "Ashutosh Chauhan (JIRA)" <ji...@apache.org> on 2018/07/27 23:24:00 UTC

[jira] [Created] (HIVE-20260) NDV of a column shouldn't be scaled when row count is changed by filter on another column

Ashutosh Chauhan created HIVE-20260:
---------------------------------------

             Summary: NDV of a column shouldn't be scaled when row count is changed by filter on another column
                 Key: HIVE-20260
                 URL: https://issues.apache.org/jira/browse/HIVE-20260
             Project: Hive
          Issue Type: Improvement
          Components: Statistics
            Reporter: Ashutosh Chauhan


HIVE-17465 introduced progressive scaling of row counts in the presence of multiple filters. HIVE-19500 improved on that by also scaling column statistics (NDV) in such scenarios. However, the scaling should take into account which column each filter expression references, rather than scaling every column's NDV for every filter. For example,
consider the filter a = 1 and b = 2: the NDV of column b should not be scaled down by the row count change caused by a = 1.
Another way to say this is that the NDV of a particular column should be updated at the end of the row count computation for that operator.
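
To make the difference concrete, here is a minimal sketch in Python. It is not Hive's code. The Stats class and both functions are hypothetical names, and it assumes a textbook model that the issue does not specify: an equality predicate on a column has selectivity 1/NDV, predicates are independent, and "updating at the end" means capping each NDV at the final row count.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Hypothetical container: estimated row count plus per-column NDV."""
    rows: float
    ndv: dict  # column name -> number of distinct values


def estimate_scaling_every_step(stats, eq_columns):
    """Scale every column's NDV after each predicate.

    This mimics the behaviour the issue objects to: the filter on one
    column shrinks the NDV of the other columns before their own
    predicates are evaluated.
    """
    rows = stats.rows
    ndv = dict(stats.ndv)
    for col in eq_columns:
        new_rows = rows / max(ndv[col], 1.0)  # assumed selectivity 1/NDV
        ratio = new_rows / rows
        ndv = {c: max(1.0, n * ratio) for c, n in ndv.items()}
        rows = new_rows
    return Stats(rows, ndv)


def estimate_updating_at_end(stats, eq_columns):
    """Compute the row count from the original NDVs, then update NDVs once."""
    rows = stats.rows
    for col in eq_columns:
        rows = rows / max(stats.ndv[col], 1.0)  # original, unscaled NDV
    ndv = {}
    for c, n in stats.ndv.items():
        if c in eq_columns:
            ndv[c] = 1.0  # an equality filter leaves a single value
        else:
            ndv[c] = max(1.0, min(n, rows))  # assumed cap at final row count
    return Stats(rows, ndv)


base = Stats(rows=1000.0, ndv={"a": 10.0, "b": 50.0, "c": 200.0})
per_step = estimate_scaling_every_step(base, ["a", "b"])
at_end = estimate_updating_at_end(base, ["a", "b"])
```

Under these assumptions, with 1000 rows and NDVs of 10 for a and 50 for b, scaling at every step shrinks the NDV of b to about 5 before b = 2 is evaluated, which gives an estimate of about 20 rows. Using the original NDVs and updating them once at the end gives about 2 rows.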



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)