Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/03/27 23:32:01 UTC

[jira] [Commented] (IMPALA-9560) Changing version from 3.4.0-SNAPSHOT to 3.4.0-RELEASE breaks TestStatsExtrapolation

    [ https://issues.apache.org/jira/browse/IMPALA-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17069116#comment-17069116 ] 

ASF subversion and git services commented on IMPALA-9560:
---------------------------------------------------------

Commit e9dd5d3f8c1d533bc5ae94c7e0677820fcd851aa in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e9dd5d3 ]

IMPALA-9560: Fix TestStatsExtrapolation for release versions

When changing the Impala version from 3.4.0-SNAPSHOT to 3.4.0-RELEASE,
TestStatsExtrapolation::test_stats_extrapolation started failing due
to a difference in the expected cardinality (expected: 17.91K,
actual: 17.90K). This is because the Impala version gets embedded into
Parquet files, which causes a slight difference in file size, which in
turn translates into a slight difference in estimated cardinality.
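
To illustrate how a change of a few bytes can move the displayed value, here is a
minimal sketch. It is not Impala's planner code. The byte counts, the fixed
bytes-per-row figure, the assumption that each of the 36 files shrinks by exactly
one byte, and the two-decimal "K" formatting are all hypothetical, chosen only to
show an estimate sitting next to a rounding boundary.

```python
# Hypothetical model: rows are estimated as total file bytes divided by an
# assumed average row width. None of these numbers come from Impala itself.
ASSUMED_BYTES_PER_ROW = 16          # hypothetical average row width
NUM_FILES = 36                      # "files=36" in the plan output
SNAPSHOT_TOTAL_BYTES = 286_496      # hypothetical total size with -SNAPSHOT
# Assume each file is one byte smaller with -RELEASE (one character fewer).
RELEASE_TOTAL_BYTES = SNAPSHOT_TOTAL_BYTES - NUM_FILES


def estimate_rows(total_bytes: int, bytes_per_row: int) -> int:
    """Estimate the row count from the total file size (integer division)."""
    return total_bytes // bytes_per_row


def format_cardinality(rows: int) -> str:
    """Render a row count with two decimals and a K suffix, e.g. 17.91K."""
    return f"{rows / 1000:.2f}K"


snapshot = format_cardinality(estimate_rows(SNAPSHOT_TOTAL_BYTES, ASSUMED_BYTES_PER_ROW))
release = format_cardinality(estimate_rows(RELEASE_TOTAL_BYTES, ASSUMED_BYTES_PER_ROW))
print(snapshot, release)
```

With these made-up inputs the estimate drops from 17906 rows to 17903 rows, which
is enough to change the last displayed digit.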

This modifies TestStatsExtrapolation::test_stats_extrapolation to
allow any 17.9*K cardinality.
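
A tolerant check of this kind could look roughly like the sketch below. It uses
Python's standard re module and is not the actual change to the test file. The
helper name plan_has_expected_cardinality is hypothetical, and reading "17.9*K"
as "17.9 followed by exactly one digit" is an assumption.

```python
import re

# Accept any cardinality from 17.90K through 17.99K on the scan node's
# tuple-ids line. The single trailing digit is an assumed reading of "17.9*K".
CARDINALITY_LINE = re.compile(r"\s*tuple-ids=0 row-size=4B cardinality=17\.9\dK")


def plan_has_expected_cardinality(plan_lines):
    """Return True if any plan line matches the tolerant cardinality pattern."""
    return any(CARDINALITY_LINE.fullmatch(line) for line in plan_lines)


release_plan = [
    "00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]",
    "   tuple-ids=0 row-size=4B cardinality=17.90K",
]
print(plan_has_expected_cardinality(release_plan))
```

The pattern still rejects values outside the 17.9xK range, so a large shift in
the estimate would continue to fail the check.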

Testing:
 - Tested on master and on branch-3.4.0

Change-Id: Iebe538936f23c095ef58c808e425cfb7b31edd94
Reviewed-on: http://gerrit.cloudera.org:8080/15569
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Changing version from 3.4.0-SNAPSHOT to 3.4.0-RELEASE breaks TestStatsExtrapolation
> -----------------------------------------------------------------------------------
>
>                 Key: IMPALA-9560
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9560
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.4.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Critical
>              Labels: broken-build
>
> When working on the Impala 3.4 release, we changed the version on branch-3.4.0 from 3.4.0-SNAPSHOT to 3.4.0-RELEASE. 
> metadata/test_stats_extrapolation.py::TestStatsExtrapolation::test_stats_extrapolation() now fails with the following error:
> {noformat}
> metadata/test_stats_extrapolation.py:44: in test_stats_extrapolation
>     self.run_test_case('QueryTest/stats-extrapolation', vector, unique_database)
> common/impala_test_suite.py:690: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:246: in verify_query_result_is_subset
>     assert expected_literal_strings <= actual_literal_strings
> E   assert Items in expected results not found in actual results:
> E     '   tuple-ids=0 row-size=4B cardinality=17.91K'
> E     Items in actual results:
> E     '|  output exprs: id'
> E     ''
> E     '     table: rows=unavailable size=unavailable'
> E     '   stored statistics:'
> E     'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
> E     '     columns: unavailable'
> E     '     partitions: 0/24 rows=unavailable'
> E     '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
> E     '   tuple-ids=0 row-size=4B cardinality=17.90K'
> E     '|'
> E     'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
> E     'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
> E     '   HDFS partitions=24/24 files=36 size=281.43KB'
> E     'test_stats_extrapolation_5c6bdfd.alltypes'
> E     'PLAN-ROOT SINK'
> E     '|  mem-estimate=0B mem-reservation=0B thread-reservation=0'
> E     '|  Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
> E     '   in pipelines: 00(GETNEXT)'
> E     '   extrapolated-rows=unavailable max-scan-range-rows=unavailable'
> E     'Per-Host Resource Estimates: Memory=16MB'
> E     'WARNING: The following tables are missing relevant table and/or column statistics.'
> E     '   mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'{noformat}
> The test expects a cardinality of 17.91K, but the actual cardinality is 17.90K.
> The RELEASE version string is one character shorter than the SNAPSHOT version string. The version gets embedded in Parquet files, so each Parquet file is slightly smaller than before. The test estimates cardinality from the size of the Parquet files, and the estimate apparently sits right on the boundary between 17.90K and 17.91K.
> This test should tolerate this difference.


