Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/11/09 06:23:00 UTC

[jira] [Commented] (IMPALA-11666) Consider revising the warning message when hasCorruptTableStats_ is true for a table

    [ https://issues.apache.org/jira/browse/IMPALA-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630770#comment-17630770 ] 

ASF subversion and git services commented on IMPALA-11666:
----------------------------------------------------------

Commit 4e6692b024b3e53afa0feae94b0def58e0ac8655 in impala's branch refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4e6692b02 ]

IMPALA-11686: Fix test_corrupt_stat after IMPALA-11666

IMPALA-11666 revised the message in the query plans when there are
potentially corrupt statistics, which broke test_corrupt_stat, an E2E
test only run in the exhaustive tests. This patch fixes the test file
accordingly.

Testing:
 - Verified locally that the patch passes test_corrupt_stat.

Change-Id: I817c7807a07bb89b93d795bce958b9872eff2eef
Reviewed-on: http://gerrit.cloudera.org:8080/19224
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Consider revising the warning message when hasCorruptTableStats_ is true for a table
> ------------------------------------------------------------------------------------
>
>                 Key: IMPALA-11666
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11666
>             Project: IMPALA
>          Issue Type: Task
>          Components: Frontend
>            Reporter: Fang-Yu Rao
>            Assignee: Fang-Yu Rao
>            Priority: Major
>
> Currently, '{{hasCorruptTableStats_}}' of an HDFS table is set to true when one of the following is true in [HdfsScanNode.java|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java].
>  # Its '{{cardinality_}}' is less than -1.
>  # The number of rows in one of its partitions is less than -1.
>  # The number of rows in one of its partitions is 0 but the total size of the files associated with this partition is greater than 0.
>  # The number of rows in the table is 0 but the total size of the files associated with this table is greater than 0.
> For such a table, the output of the {{EXPLAIN}} statement for queries involving the table contains the message "{{WARNING: The following tables have potentially corrupt table statistics. Drop and re-compute statistics to resolve this problem.}}"
> The warning message may be a bit too scary for an Impala user, especially since a table without corrupt statistics can still have its '{{hasCorruptTableStats_}}' set to true by Impala's frontend.
> Specifically, a table without corrupt statistics but with its '{{hasCorruptTableStats_}}' set to true can be created as follows after starting the Impala cluster.
>  # On the command line, execute "{{beeline -u "jdbc:hive2://localhost:11050/default"}}" to enter beeline.
>  # Create a transactional table in beeline via "{{create table test_db.test_tbl_01 (id int, name string) stored as orc tblproperties ('transactional'='true')}}".
>  # Insert a row into the table just created in beeline via "{{insert into table test_db.test_tbl_01 values (1, "Alex")}}".
>  # Delete the row just inserted in beeline via "{{delete from test_db.test_tbl_01 where id = 1}}".
>  # In Impala shell, execute "{{compute stats test_db.test_tbl_01}}".
>  # In Impala shell, execute "{{explain select * from test_db.test_tbl_01}}" to verify that the warning message described above appears in the output.
> The table '{{test_tbl_01}}' above has 0 rows but the total size of its associated files is greater than 0.
> It may be better to revise the warning message to something less scary, as shown below.
> {code:java}
> The number of rows in the following tables or in a partition of them has 0 or fewer than -1 row but positive total file size.
> This does not necessarily imply the existence of corrupt statistics.
> In the case of corrupt statistics, drop and re-compute statistics could resolve this problem.
> {code}
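
For illustration only, the four conditions listed in the description can be sketched as a single predicate. This is a minimal sketch under stated assumptions, not the code in HdfsScanNode.java: the class name CorruptStatsSketch, the PartitionStats holder, and the method looksCorrupt are all hypothetical, and only the comparisons themselves are taken from the description above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names come from Impala's code base.
final class CorruptStatsSketch {

    // Hypothetical holder for one partition's row count and total file size in bytes.
    static final class PartitionStats {
        final long numRows;
        final long sizeBytes;

        PartitionStats(long numRows, long sizeBytes) {
            this.numRows = numRows;
            this.sizeBytes = sizeBytes;
        }
    }

    // Returns true when any of the four conditions from the description holds.
    // A value of exactly -1 is not flagged by any of the listed conditions.
    static boolean looksCorrupt(long cardinality, long tableNumRows,
                                long tableSizeBytes, List<PartitionStats> partitions) {
        // Condition 1: cardinality less than -1.
        if (cardinality < -1) return true;
        for (PartitionStats p : partitions) {
            // Condition 2: a partition's row count is less than -1.
            if (p.numRows < -1) return true;
            // Condition 3: a partition has 0 rows but a positive file size.
            if (p.numRows == 0 && p.sizeBytes > 0) return true;
        }
        // Condition 4: the table has 0 rows but a positive file size.
        return tableNumRows == 0 && tableSizeBytes > 0;
    }

    public static void main(String[] args) {
        // Mirrors the test_tbl_01 scenario: 0 rows after the delete, files still on disk.
        // The file size of 512 bytes is an arbitrary illustrative value.
        boolean emptyButNonZeroSize = looksCorrupt(0, 0, 512, Collections.emptyList());
        System.out.println("flagged: " + emptyButNonZeroSize);

        // A table whose row counts are consistent with its file sizes is not flagged.
        boolean healthy = looksCorrupt(10, 10, 512,
                Arrays.asList(new PartitionStats(10, 512)));
        System.out.println("flagged: " + healthy);
    }
}
```

Under this sketch the test_tbl_01 scenario is flagged purely because of condition 4, which matches the point of the description: the flag can be set even though the computed statistics are accurate.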


