Posted to issues-all@impala.apache.org by "Tamas Mate (Jira)" <ji...@apache.org> on 2020/04/27 19:04:00 UTC

[jira] [Updated] (IMPALA-9699) Skip '-1' values when aggregating num_null statistics

     [ https://issues.apache.org/jira/browse/IMPALA-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Mate updated IMPALA-9699:
-------------------------------
    Summary: Skip '-1' values when aggregating num_null statistics  (was: Skip '-1' values when aggregating num_null incremental statistics)

> Skip '-1' values when aggregating num_null statistics
> -----------------------------------------------------
>
>                 Key: IMPALA-9699
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9699
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>    Affects Versions: Impala 3.3.0
>            Reporter: Tamas Mate
>            Assignee: Tamas Mate
>            Priority: Major
>              Labels: backwards-compatibility
>
> IMPALA-7659 added the population of NULL counts while computing stats. Later, IMPALA-8566 fixed an accuracy issue caused by the initialization of the statistics by changing the initial value from '-1' to '0'. That fix also slightly changed how the values are summed. Previously, negative values were excluded from the sum:
> {code:java}
> if (num_new_nulls >= 0) num_nulls += num_new_nulls;
> {code}
> In the new implementation the condition was removed, on the assumption that these values should no longer be negative:
> {code:java}
> num_nulls += num_new_nulls;
> {code}
> This change does not cause any problem for stats created after the fix. However, it can make table metadata unavailable when moving between earlier and newer releases. The metadata can become invalid if COMPUTE INCREMENTAL STATS is issued on a partition, because the existing '-1' values can push the column-level num_nulls below '-1'. A num_nulls value smaller than '-1' then fails a precondition check when CatalogD tries to fetch the table metadata.
> Restoring the condition should not cause any problems, and for backward compatibility we should put it back.
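
The effect described above can be reproduced with a minimal, self-contained sketch. It is not the Impala code: the class, the method names, and the `passesPrecondition` helper are hypothetical, and the sketch assumes only what the issue states, namely that '-1' marks an unknown NULL count, that the running total starts at '0', and that a total below '-1' is rejected when the metadata is loaded.

```java
// Illustrative sketch only; all names here are hypothetical, not Impala identifiers.
final class NumNullsAggregation {
    // Assumed convention from the issue text: -1 means "NULL count unknown".
    static final long UNKNOWN = -1;

    // Guarded aggregation: negative (unknown) per-partition values are skipped.
    static long sumSkippingUnknown(long[] perPartitionNumNulls) {
        long numNulls = 0;
        for (long numNewNulls : perPartitionNumNulls) {
            if (numNewNulls >= 0) numNulls += numNewNulls;
        }
        return numNulls;
    }

    // Unguarded aggregation: every value is added, including -1 markers.
    static long sumUnguarded(long[] perPartitionNumNulls) {
        long numNulls = 0;
        for (long numNewNulls : perPartitionNumNulls) {
            numNulls += numNewNulls;
        }
        return numNulls;
    }

    // Stand-in for the precondition mentioned in the issue: values below -1 are invalid.
    static boolean passesPrecondition(long numNulls) {
        return numNulls >= UNKNOWN;
    }

    public static void main(String[] args) {
        // Two partitions with stats from an earlier release, one with fresh stats.
        long[] stats = {UNKNOWN, UNKNOWN, 0};

        long guarded = sumSkippingUnknown(stats);
        long unguarded = sumUnguarded(stats);

        System.out.println("guarded=" + guarded
                + " valid=" + passesPrecondition(guarded));
        System.out.println("unguarded=" + unguarded
                + " valid=" + passesPrecondition(unguarded));
    }
}
```

With two legacy partitions carrying '-1' and one partition with a fresh count of '0', the guarded sum stays at 0, while the unguarded sum drops to -2 and would be rejected by a check of the form `numNulls >= -1`.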


