Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/11/18 01:39:00 UTC
[jira] [Created] (IMPALA-10334) test_stats_extrapolation output doesn't match on erasure coding build

Tim Armstrong created IMPALA-10334:
--------------------------------------

             Summary: test_stats_extrapolation output doesn't match on erasure coding build
                 Key: IMPALA-10334
                 URL: https://issues.apache.org/jira/browse/IMPALA-10334
             Project: IMPALA
          Issue Type: Bug
          Components: Infrastructure
    Affects Versions: Impala 4.0
            Reporter: Tim Armstrong
            Assignee: Qifan Chen


{noformat}
Regression

metadata.test_stats_extrapolation.TestStatsExtrapolation.test_stats_extrapolation[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none] (from pytest)
Failing for the past 1 build (Since Failed#621 )
Took 8.8 sec.

Stacktrace

metadata/test_stats_extrapolation.py:44: in test_stats_extrapolation
    self.run_test_case('QueryTest/stats-extrapolation', vector, unique_database)
common/impala_test_suite.py:693: in run_test_case
    self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:529: in __verify_results_and_errors
    replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
    VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
    assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E     row_regex:.*Max Per-Host Resource Reservation: Memory=.* == 'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
E     row_regex:.*Per-Host Resource Estimates: Memory=.* == 'Per-Host Resource Estimates: Memory=16MB'
E     'Codegen disabled by planner' == 'Codegen disabled by planner'
E     row_regex:.*Analyzed query: SELECT id FROM test_stats_extrapolation_.*.alltypes.* == 'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
E     '' == ''
E     'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1' == 'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
E     row_regex:.*Per-Host Resources: mem-estimate=.* mem-reservation=.* == '|  Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
E     'PLAN-ROOT SINK' == 'PLAN-ROOT SINK'
E     '|  output exprs: id' == '|  output exprs: id'
E     row_regex:.*mem-estimate=.* mem-reservation=.* == '|  mem-estimate=0B mem-reservation=0B thread-reservation=0'
E     '|' == '|'
E     '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]' == '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
E     row_regex:.*partitions=12/12 files=12 size=.* == '   HDFS partitions=12/12 files=12 size=93.81KB'
E     '   stored statistics:' != '   erasure coded: files=12 size=93.81KB'
E     row_regex:.*table: rows=3.65K size=.* != '   stored statistics:'
E     '     partitions: 0/12 rows=unavailable' != '     table: rows=3.65K size=93.81KB'
E     '     columns: all' != '     partitions: 0/12 rows=unavailable'
E     row_regex:.* extrapolated-rows=3.65K .* != '     columns: all'
E     row_regex:.*mem-estimate=.* mem-reservation=.* != '   extrapolated-rows=3.65K max-scan-range-rows=307'
E     '   tuple-ids=0 row-size=4B cardinality=3.65K' != '   mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'
E     '   in pipelines: 00(GETNEXT)' != '   tuple-ids=0 row-size=4B cardinality=3.65K'
E     None != '   in pipelines: 00(GETNEXT)'
E     Number of rows returned (expected vs actual): 21 != 22

Standard Error

SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2020-10-31 18:50:27,206 INFO     MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2020-10-31 18:50:27,226 INFO     MainThread: Closing active operation
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
SET sync_ddl=False;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_stats_extrapolation_5c6bdfd` CASCADE;

-- 2020-10-31 18:50:30,980 INFO     MainThread: Started query 384f0c72b59374cd:cf6e5f9e00000000
-- 2020-10-31 18:50:30,983 INFO     MainThread: Starting new HTTP connection (1): 0.0.0.0
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
SET sync_ddl=False;
-- executing against localhost:21000

CREATE DATABASE `test_stats_extrapolation_5c6bdfd`;

-- 2020-10-31 18:50:30,996 INFO     MainThread: Started query a9448b3bd95d84a1:6680ea7800000000
-- 2020-10-31 18:50:30,998 INFO     MainThread: Created database "test_stats_extrapolation_5c6bdfd" for test ID "metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none]"
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
-- executing against localhost:21000

use test_stats_extrapolation_5c6bdfd;

-- 2020-10-31 18:50:31,002 INFO     MainThread: Started query d847216bd7fae3d5:9af8403900000000
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
SET explain_level=2;
SET batch_size=0;
SET num_nodes=1;
SET disable_codegen_rows_threshold=5000;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- 2020-10-31 18:50:31,003 INFO     MainThread: Loading query test file: /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
-- 2020-10-31 18:50:31,005 INFO     MainThread: Starting new HTTP connection (1): localhost
-- executing against localhost:21000

create table alltypes sort by (id) like functional_parquet.alltypes;

-- 2020-10-31 18:50:31,078 INFO     MainThread: Started query c74e0815ada71327:cc823c3c00000000
-- executing against localhost:21000


alter table alltypes set tblproperties("impala.enable.stats.extrapolation"="true");

-- 2020-10-31 18:50:35,014 INFO     MainThread: Started query c54959345920c672:8db7865200000000
-- executing against localhost:21000


insert into alltypes partition(year, month)
select * from functional_parquet.alltypes where year = 2009;

-- 2020-10-31 18:50:35,024 INFO     MainThread: Started query 6d4c89d58f2988bc:b34380fc00000000
-- executing against localhost:21000

explain select id from alltypes;

-- 2020-10-31 18:50:35,437 INFO     MainThread: Started query 744653d34cf0878b:d54b970900000000
-- executing against localhost:21000

SET DISABLE_HDFS_NUM_ROWS_ESTIMATE=1;

-- 2020-10-31 18:50:35,443 INFO     MainThread: Started query 5e45825e43253fd0:027e842e00000000
-- executing against localhost:21000


explain select id from alltypes;

-- 2020-10-31 18:50:35,450 INFO     MainThread: Started query 4e4ca07ae38321a0:d82f0f4900000000
-- executing against localhost:21000

SET DISABLE_HDFS_NUM_ROWS_ESTIMATE="0";

-- 2020-10-31 18:50:35,457 INFO     MainThread: Started query 9e4b8ce22884bdd8:dc6eefaa00000000
-- executing against localhost:21000

compute stats alltypes;

-- 2020-10-31 18:50:35,463 INFO     MainThread: Started query 794fc0e9c9141aef:bc6b392c00000000
-- executing against localhost:21000

show table stats alltypes;

-- 2020-10-31 18:50:35,971 INFO     MainThread: Started query a0462aa89ace75d0:253c408b00000000
-- executing against localhost:21000

explain select id from alltypes;

-- 2020-10-31 18:50:35,980 INFO     MainThread: Started query 404884ad85bf8458:8549f40a00000000
-- 2020-10-31 18:50:35,994 ERROR    MainThread: Comparing QueryTestResults (expected vs actual):
row_regex:.*Max Per-Host Resource Reservation: Memory=.* == 'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
row_regex:.*Per-Host Resource Estimates: Memory=.* == 'Per-Host Resource Estimates: Memory=16MB'
'Codegen disabled by planner' == 'Codegen disabled by planner'
row_regex:.*Analyzed query: SELECT id FROM test_stats_extrapolation_.*.alltypes.* == 'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
'' == ''
'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1' == 'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
row_regex:.*Per-Host Resources: mem-estimate=.* mem-reservation=.* == '|  Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
'PLAN-ROOT SINK' == 'PLAN-ROOT SINK'
'|  output exprs: id' == '|  output exprs: id'
row_regex:.*mem-estimate=.* mem-reservation=.* == '|  mem-estimate=0B mem-reservation=0B thread-reservation=0'
'|' == '|'
'00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]' == '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
row_regex:.*partitions=12/12 files=12 size=.* == '   HDFS partitions=12/12 files=12 size=93.81KB'
'   stored statistics:' != '   erasure coded: files=12 size=93.81KB'
row_regex:.*table: rows=3.65K size=.* != '   stored statistics:'
'     partitions: 0/12 rows=unavailable' != '     table: rows=3.65K size=93.81KB'
'     columns: all' != '     partitions: 0/12 rows=unavailable'
row_regex:.* extrapolated-rows=3.65K .* != '     columns: all'
row_regex:.*mem-estimate=.* mem-reservation=.* != '   extrapolated-rows=3.65K max-scan-range-rows=307'
'   tuple-ids=0 row-size=4B cardinality=3.65K' != '   mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'
'   in pipelines: 00(GETNEXT)' != '   tuple-ids=0 row-size=4B cardinality=3.65K'
None != '   in pipelines: 00(GETNEXT)'
Number of rows returned (expected vs actual): 21 != 22
{noformat}

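IMPALA-7097 added the extra line here: '   erasure coded: files=12 size=93.81KB'. Every later row of the plan is therefore shifted down by one relative to the expected output, which is why the comparison fails from that row onward and the row counts differ (21 expected vs 22 actual).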
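
To illustrate the shift, here is a minimal sketch of a row-by-row comparison. It is a simplified model, not Impala's actual test_result_verifier code: the helper names (row_matches, first_mismatch) are hypothetical, and treating a row_regex: row as a full-line regular expression is an assumption. The rows are a shortened excerpt of the output above.

```python
import re

ROW_REGEX_PREFIX = "row_regex:"


def row_matches(expected, actual):
    """Compare one expected row with one actual row.

    Assumption: a row starting with 'row_regex:' is a regular expression
    that must match the whole actual row; any other row must be equal.
    """
    if expected.startswith(ROW_REGEX_PREFIX):
        pattern = expected[len(ROW_REGEX_PREFIX):]
        return re.fullmatch(pattern, actual) is not None
    return expected == actual


def first_mismatch(expected_rows, actual_rows):
    """Return the index of the first differing row pair, or None if all match."""
    for i, (exp, act) in enumerate(zip(expected_rows, actual_rows)):
        if not row_matches(exp, act):
            return i
    if len(expected_rows) != len(actual_rows):
        # One side has leftover rows after the common prefix matched.
        return min(len(expected_rows), len(actual_rows))
    return None


# Shortened excerpt of the expected and actual plan rows from the report.
expected = [
    "row_regex:.*partitions=12/12 files=12 size=.*",
    "   stored statistics:",
    "row_regex:.*table: rows=3.65K size=.*",
    "     partitions: 0/12 rows=unavailable",
]
actual = [
    "   HDFS partitions=12/12 files=12 size=93.81KB",
    "   erasure coded: files=12 size=93.81KB",
    "   stored statistics:",
    "     table: rows=3.65K size=93.81KB",
    "     partitions: 0/12 rows=unavailable",
]

# With the extra line present, the comparison breaks at that row.
mismatch_index = first_mismatch(expected, actual)

# Dropping the 'erasure coded:' row realigns the remaining rows.
filtered = [row for row in actual if not row.lstrip().startswith("erasure coded:")]
mismatch_after_filter = first_mismatch(expected, filtered)
```

In this excerpt the first mismatch is at the 'erasure coded:' row, and the rows line up again once that row is filtered out.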

It might be OK to just skip this test on erasure coding builds, since it is not directly related to the erasure coding functionality.
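
A rough sketch of what such a skip could look like, using only the standard library. The environment variable name ERASURE_CODING, the helper is_erasure_coding_build, and the test class below are assumptions made for illustration; the real test suite may already provide its own skip markers, which would be preferable.

```python
import os
import unittest


def is_erasure_coding_build(env=None):
    """Return True if the build is configured for erasure coding.

    Assumption: the build exports ERASURE_CODING=true; the variable name
    is illustrative and may differ in the real environment.
    """
    env = os.environ if env is None else env
    return env.get("ERASURE_CODING", "").lower() == "true"


class TestStatsExtrapolationSketch(unittest.TestCase):
    """Hypothetical stand-in for the real test class."""

    @unittest.skipIf(
        is_erasure_coding_build(),
        "explain output has an extra 'erasure coded:' line on EC builds",
    )
    def test_stats_extrapolation(self):
        # Placeholder body; the real test runs QueryTest/stats-extrapolation.
        self.assertTrue(True)
```

Skipping keeps the expected output in the .test file unchanged, at the cost of losing stats extrapolation coverage on erasure coding builds.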


