Posted to issues-all@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/03/27 17:20:00 UTC

[jira] [Created] (IMPALA-9560) Changing version from 3.4.0-SNAPSHOT to 3.4.0-RELEASE breaks TestStatsExtrapolation

Joe McDonnell created IMPALA-9560:
-------------------------------------

             Summary: Changing version from 3.4.0-SNAPSHOT to 3.4.0-RELEASE breaks TestStatsExtrapolation
                 Key: IMPALA-9560
                 URL: https://issues.apache.org/jira/browse/IMPALA-9560
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
    Affects Versions: Impala 3.4.0
            Reporter: Joe McDonnell
            Assignee: Joe McDonnell


When working on the Impala 3.4 release, we changed the version on branch-3.4.0 from 3.4.0-SNAPSHOT to 3.4.0-RELEASE. 

metadata/test_stats_extrapolation.py::TestStatsExtrapolation::test_stats_extrapolation() now fails with the following error:
{noformat}
metadata/test_stats_extrapolation.py:44: in test_stats_extrapolation
    self.run_test_case('QueryTest/stats-extrapolation', vector, unique_database)
common/impala_test_suite.py:690: in run_test_case
    self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:523: in __verify_results_and_errors
    replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
    VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:246: in verify_query_result_is_subset
    assert expected_literal_strings <= actual_literal_strings
E   assert Items in expected results not found in actual results:
E     '   tuple-ids=0 row-size=4B cardinality=17.91K'
E     Items in actual results:
E     '|  output exprs: id'
E     ''
E     '     table: rows=unavailable size=unavailable'
E     '   stored statistics:'
E     'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
E     '     columns: unavailable'
E     '     partitions: 0/24 rows=unavailable'
E     '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
E     '   tuple-ids=0 row-size=4B cardinality=17.90K'
E     '|'
E     'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
E     'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
E     '   HDFS partitions=24/24 files=36 size=281.43KB'
E     'test_stats_extrapolation_5c6bdfd.alltypes'
E     'PLAN-ROOT SINK'
E     '|  mem-estimate=0B mem-reservation=0B thread-reservation=0'
E     '|  Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
E     '   in pipelines: 00(GETNEXT)'
E     '   extrapolated-rows=unavailable max-scan-range-rows=unavailable'
E     'Per-Host Resource Estimates: Memory=16MB'
E     'WARNING: The following tables are missing relevant table and/or column statistics.'
E     '   mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'
{noformat}
The test expects a cardinality of 17.91K, but the actual plan reports 17.90K.

The RELEASE version string is one character shorter than the SNAPSHOT version string. The version is embedded in Parquet files, so the Parquet file is slightly smaller than before. The test estimates cardinality from the size of the Parquet file, and the estimate is apparently right on a rounding edge: the small size change is enough to move the reported value from 17.91K to 17.90K.
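
To illustrate how a few bytes can flip the printed value, here is a minimal sketch of size-based extrapolation. It assumes the estimate is a sampled row count scaled by the ratio of total bytes to sampled bytes, and that the plan prints the result with two decimals and a K suffix. The function names and all numbers are hypothetical; this is not Impala's planner code or the real file sizes.

```python
def extrapolate_rows(sampled_rows, sampled_bytes, total_bytes):
    """Scale a known row count by total/sampled bytes (integer math, assumed model)."""
    return sampled_rows * total_bytes // sampled_bytes


def format_cardinality(rows):
    """Format a row count like the plan output does (assumed: two decimals, K suffix)."""
    if rows < 1000:
        return str(rows)
    return "%.2fK" % (rows / 1000.0)


# Hypothetical sizes that differ by only a few bytes.
before = format_cardinality(extrapolate_rows(1000, 16000, 286496))
after = format_cardinality(extrapolate_rows(1000, 16000, 286472))
print(before, after)
```

With these made-up inputs the row estimate moves from 17906 to 17904, which is enough to change the two-decimal string even though the estimate itself barely moved.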

This test should tolerate this difference.
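
One possible way to tolerate the difference, sketched under assumptions: parse the cardinality out of the expected and actual plan lines and compare the numbers within a relative tolerance, instead of comparing the strings exactly. The helper names, the 1% tolerance, and the K/M/B suffix set are all assumptions made for illustration; this is not the existing verifier in common/test_result_verifier.py and not necessarily the fix that should be applied.

```python
import re

# Assumed suffixes for abbreviated row counts in plan output.
_SUFFIXES = {"": 1, "K": 1e3, "M": 1e6, "B": 1e9}
_CARDINALITY_RE = re.compile(r"cardinality=(\d+(?:\.\d+)?)([KMB]?)")


def parse_cardinality(line):
    """Return the numeric cardinality in a plan line, or None if there is none."""
    match = _CARDINALITY_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)) * _SUFFIXES[match.group(2)]


def lines_match(expected_line, actual_line, rel_tol=0.01):
    """Compare two plan lines, allowing cardinality to differ by rel_tol."""
    expected = parse_cardinality(expected_line)
    actual = parse_cardinality(actual_line)
    if expected is None or actual is None:
        # No numeric cardinality (e.g. 'unavailable'): fall back to exact match.
        return expected_line == actual_line
    # Everything other than the cardinality value must still match exactly.
    rest_expected = _CARDINALITY_RE.sub("cardinality=<N>", expected_line)
    rest_actual = _CARDINALITY_RE.sub("cardinality=<N>", actual_line)
    if rest_expected != rest_actual:
        return False
    return abs(expected - actual) <= rel_tol * max(abs(expected), abs(actual))
```

Applied to the two lines from the failure above, `lines_match('   tuple-ids=0 row-size=4B cardinality=17.91K', '   tuple-ids=0 row-size=4B cardinality=17.90K')` accepts the pair, while a line whose other fields differ, or whose cardinality is far off, is still rejected.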


