Posted to issues-all@impala.apache.org by "Tamas Mate (Jira)" <ji...@apache.org> on 2020/04/27 18:52:00 UTC

[jira] [Created] (IMPALA-9699) Skip '-1' values when aggregating num_null statistics

Tamas Mate created IMPALA-9699:
----------------------------------

             Summary: Skip '-1' values when aggregating num_null statistics
                 Key: IMPALA-9699
                 URL: https://issues.apache.org/jira/browse/IMPALA-9699
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog
    Affects Versions: Impala 3.3.0
            Reporter: Tamas Mate
            Assignee: Tamas Mate


IMPALA-7659 added the population of NULL counts while computing stats. Later, IMPALA-8566 fixed an accuracy issue caused by how the statistics were initialized: the initial value was changed from '-1' to '0'. The fix also slightly changed how the values are aggregated. Previously, negative values were excluded from the sum:
{code:java}
if (num_new_nulls >= 0) num_nulls += num_new_nulls;
{code}
while in the new implementation the condition was removed, on the assumption that these values can no longer be negative:
{code:java}
num_nulls += num_new_nulls;
{code}
This change causes no problems for stats created after the fix, but it can make table metadata unavailable when moving between earlier and newer releases. The metadata can become invalid if COMPUTE INCREMENTAL STATS is issued on a partition, because the '-1' values in previously computed stats can push the column-level num_nulls below '-1'. A num_nulls value smaller than '-1' later fails a precondition check when CatalogD tries to fetch the table metadata.

Restoring the condition should not cause any problems, and we should put it back for backward compatibility.
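
The failure mode described above can be illustrated with a minimal sketch. This is not Impala's actual code: the class name, the method names, and the shape of the precondition (num_nulls >= -1) are assumptions made for illustration, based only on the description in this issue.

```java
// Illustrative sketch only; not Impala's actual implementation.
// All names here are hypothetical.
final class NumNullsSketch {
    // Assumed sentinel: -1 means the NULL count of a partition is unknown.
    static final long UNKNOWN = -1;

    // Aggregation with the guard: negative (unknown) values are skipped.
    static long sumSkippingUnknown(long[] perPartitionNumNulls) {
        long numNulls = 0;
        for (long numNewNulls : perPartitionNumNulls) {
            if (numNewNulls >= 0) numNulls += numNewNulls;
        }
        return numNulls;
    }

    // Aggregation without the guard: -1 values are added like real counts.
    static long sumUnconditionally(long[] perPartitionNumNulls) {
        long numNulls = 0;
        for (long numNewNulls : perPartitionNumNulls) {
            numNulls += numNewNulls;
        }
        return numNulls;
    }

    // Hypothetical stand-in for the precondition hit while loading metadata:
    // a column-level value below -1 is treated as invalid.
    static boolean passesPrecondition(long numNulls) {
        return numNulls >= UNKNOWN;
    }
}
```

With per-partition values {-1, -1, 4}, the guarded sum is 4 while the unguarded sum is 2. With {-1, -1, -1}, the unguarded sum is -3, which fails the stand-in precondition, whereas the guarded sum stays at 0.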



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
