Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/11/18 01:39:00 UTC
[jira] [Created] (IMPALA-10334) test_stats_extrapolation output doesn't match on erasure coding build
Tim Armstrong created IMPALA-10334:
--------------------------------------
Summary: test_stats_extrapolation output doesn't match on erasure coding build
Key: IMPALA-10334
URL: https://issues.apache.org/jira/browse/IMPALA-10334
Project: IMPALA
Issue Type: Bug
Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Tim Armstrong
Assignee: Qifan Chen
{noformat}
Regression
metadata.test_stats_extrapolation.TestStatsExtrapolation.test_stats_extrapolation[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none] (from pytest)
Failing for the past 1 build (Since Failed#621 )
Took 8.8 sec.
Stacktrace
metadata/test_stats_extrapolation.py:44: in test_stats_extrapolation
self.run_test_case('QueryTest/stats-extrapolation', vector, unique_database)
common/impala_test_suite.py:693: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:529: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
assert expected_results == actual_results
E assert Comparing QueryTestResults (expected vs actual):
E row_regex:.*Max Per-Host Resource Reservation: Memory=.* == 'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
E row_regex:.*Per-Host Resource Estimates: Memory=.* == 'Per-Host Resource Estimates: Memory=16MB'
E 'Codegen disabled by planner' == 'Codegen disabled by planner'
E row_regex:.*Analyzed query: SELECT id FROM test_stats_extrapolation_.*.alltypes.* == 'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
E '' == ''
E 'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1' == 'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
E row_regex:.*Per-Host Resources: mem-estimate=.* mem-reservation=.* == '| Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
E 'PLAN-ROOT SINK' == 'PLAN-ROOT SINK'
E '| output exprs: id' == '| output exprs: id'
E row_regex:.*mem-estimate=.* mem-reservation=.* == '| mem-estimate=0B mem-reservation=0B thread-reservation=0'
E '|' == '|'
E '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]' == '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
E row_regex:.*partitions=12/12 files=12 size=.* == ' HDFS partitions=12/12 files=12 size=93.81KB'
E ' stored statistics:' != ' erasure coded: files=12 size=93.81KB'
E row_regex:.*table: rows=3.65K size=.* != ' stored statistics:'
E ' partitions: 0/12 rows=unavailable' != ' table: rows=3.65K size=93.81KB'
E ' columns: all' != ' partitions: 0/12 rows=unavailable'
E row_regex:.* extrapolated-rows=3.65K .* != ' columns: all'
E row_regex:.*mem-estimate=.* mem-reservation=.* != ' extrapolated-rows=3.65K max-scan-range-rows=307'
E ' tuple-ids=0 row-size=4B cardinality=3.65K' != ' mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'
E ' in pipelines: 00(GETNEXT)' != ' tuple-ids=0 row-size=4B cardinality=3.65K'
E None != ' in pipelines: 00(GETNEXT)'
E Number of rows returned (expected vs actual): 21 != 22
Standard Error
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2020-10-31 18:50:27,206 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2020-10-31 18:50:27,226 INFO MainThread: Closing active operation
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_stats_extrapolation_5c6bdfd` CASCADE;
-- 2020-10-31 18:50:30,980 INFO MainThread: Started query 384f0c72b59374cd:cf6e5f9e00000000
-- 2020-10-31 18:50:30,983 INFO MainThread: Starting new HTTP connection (1): 0.0.0.0
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_stats_extrapolation_5c6bdfd`;
-- 2020-10-31 18:50:30,996 INFO MainThread: Started query a9448b3bd95d84a1:6680ea7800000000
-- 2020-10-31 18:50:30,998 INFO MainThread: Created database "test_stats_extrapolation_5c6bdfd" for test ID "metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none]"
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
-- executing against localhost:21000
use test_stats_extrapolation_5c6bdfd;
-- 2020-10-31 18:50:31,002 INFO MainThread: Started query d847216bd7fae3d5:9af8403900000000
SET client_identifier=metadata/test_stats_extrapolation.py::TestStatsExtrapolation::()::test_stats_extrapolation[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_t;
SET explain_level=2;
SET batch_size=0;
SET num_nodes=1;
SET disable_codegen_rows_threshold=5000;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- 2020-10-31 18:50:31,003 INFO MainThread: Loading query test file: /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
-- 2020-10-31 18:50:31,005 INFO MainThread: Starting new HTTP connection (1): localhost
-- executing against localhost:21000
create table alltypes sort by (id) like functional_parquet.alltypes;
-- 2020-10-31 18:50:31,078 INFO MainThread: Started query c74e0815ada71327:cc823c3c00000000
-- executing against localhost:21000
alter table alltypes set tblproperties("impala.enable.stats.extrapolation"="true");
-- 2020-10-31 18:50:35,014 INFO MainThread: Started query c54959345920c672:8db7865200000000
-- executing against localhost:21000
insert into alltypes partition(year, month)
select * from functional_parquet.alltypes where year = 2009;
-- 2020-10-31 18:50:35,024 INFO MainThread: Started query 6d4c89d58f2988bc:b34380fc00000000
-- executing against localhost:21000
explain select id from alltypes;
-- 2020-10-31 18:50:35,437 INFO MainThread: Started query 744653d34cf0878b:d54b970900000000
-- executing against localhost:21000
SET DISABLE_HDFS_NUM_ROWS_ESTIMATE=1;
-- 2020-10-31 18:50:35,443 INFO MainThread: Started query 5e45825e43253fd0:027e842e00000000
-- executing against localhost:21000
explain select id from alltypes;
-- 2020-10-31 18:50:35,450 INFO MainThread: Started query 4e4ca07ae38321a0:d82f0f4900000000
-- executing against localhost:21000
SET DISABLE_HDFS_NUM_ROWS_ESTIMATE="0";
-- 2020-10-31 18:50:35,457 INFO MainThread: Started query 9e4b8ce22884bdd8:dc6eefaa00000000
-- executing against localhost:21000
compute stats alltypes;
-- 2020-10-31 18:50:35,463 INFO MainThread: Started query 794fc0e9c9141aef:bc6b392c00000000
-- executing against localhost:21000
show table stats alltypes;
-- 2020-10-31 18:50:35,971 INFO MainThread: Started query a0462aa89ace75d0:253c408b00000000
-- executing against localhost:21000
explain select id from alltypes;
-- 2020-10-31 18:50:35,980 INFO MainThread: Started query 404884ad85bf8458:8549f40a00000000
-- 2020-10-31 18:50:35,994 ERROR MainThread: Comparing QueryTestResults (expected vs actual):
row_regex:.*Max Per-Host Resource Reservation: Memory=.* == 'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
row_regex:.*Per-Host Resource Estimates: Memory=.* == 'Per-Host Resource Estimates: Memory=16MB'
'Codegen disabled by planner' == 'Codegen disabled by planner'
row_regex:.*Analyzed query: SELECT id FROM test_stats_extrapolation_.*.alltypes.* == 'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
'' == ''
'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1' == 'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
row_regex:.*Per-Host Resources: mem-estimate=.* mem-reservation=.* == '| Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
'PLAN-ROOT SINK' == 'PLAN-ROOT SINK'
'| output exprs: id' == '| output exprs: id'
row_regex:.*mem-estimate=.* mem-reservation=.* == '| mem-estimate=0B mem-reservation=0B thread-reservation=0'
'|' == '|'
'00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]' == '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
row_regex:.*partitions=12/12 files=12 size=.* == ' HDFS partitions=12/12 files=12 size=93.81KB'
' stored statistics:' != ' erasure coded: files=12 size=93.81KB'
row_regex:.*table: rows=3.65K size=.* != ' stored statistics:'
' partitions: 0/12 rows=unavailable' != ' table: rows=3.65K size=93.81KB'
' columns: all' != ' partitions: 0/12 rows=unavailable'
row_regex:.* extrapolated-rows=3.65K .* != ' columns: all'
row_regex:.*mem-estimate=.* mem-reservation=.* != ' extrapolated-rows=3.65K max-scan-range-rows=307'
' tuple-ids=0 row-size=4B cardinality=3.65K' != ' mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'
' in pipelines: 00(GETNEXT)' != ' tuple-ids=0 row-size=4B cardinality=3.65K'
None != ' in pipelines: 00(GETNEXT)'
Number of rows returned (expected vs actual): 21 != 22
{noformat}
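IMPALA-7097 added an extra line to the scan node's plan output: ' erasure coded: files=12 size=93.81KB'. Every row after it is shifted by one, so the actual output has 22 rows where the test expects 21.
It might be OK to just skip this test on erasure coding builds, since it's not directly related to the erasure coding functionality.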
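To illustrate why a single extra line produces the cascade of mismatches in the log, here is a minimal sketch of a positional row comparison. It is a simplified model, not the code in common/test_result_verifier.py: the helper names (row_matches, first_mismatch, drop_erasure_coded_rows) are hypothetical, and the rows are copied from the output above. It also shows one possible alternative to skipping the test, namely filtering out the 'erasure coded:' line before comparing; whether that is appropriate for the real test is an open question.

```python
import re

ROW_REGEX_PREFIX = "row_regex:"


def row_matches(expected, actual):
    """Compare one expected row with one actual row.

    Rows that start with 'row_regex:' are treated as regular expressions;
    all other rows must be equal verbatim. (Simplified model, assumed here.)
    """
    if expected.startswith(ROW_REGEX_PREFIX):
        pattern = expected[len(ROW_REGEX_PREFIX):]
        return re.match(pattern, actual) is not None
    return expected == actual


def first_mismatch(expected_rows, actual_rows):
    """Return the index of the first differing row, or None if all match."""
    for i, (exp, act) in enumerate(zip(expected_rows, actual_rows)):
        if not row_matches(exp, act):
            return i
    if len(expected_rows) != len(actual_rows):
        # One side has leftover rows after the common prefix.
        return min(len(expected_rows), len(actual_rows))
    return None


def drop_erasure_coded_rows(actual_rows):
    """Hypothetical workaround: ignore the line added by IMPALA-7097."""
    return [r for r in actual_rows if not r.strip().startswith("erasure coded:")]


# Rows taken from the failing comparison above (excerpt).
expected = [
    "row_regex:.*partitions=12/12 files=12 size=.*",
    " stored statistics:",
    "row_regex:.*table: rows=3.65K size=.*",
]
actual = [
    " HDFS partitions=12/12 files=12 size=93.81KB",
    " erasure coded: files=12 size=93.81KB",
    " stored statistics:",
    " table: rows=3.65K size=93.81KB",
]

# The extra row makes every later position compare against the wrong row.
print(first_mismatch(expected, actual))
# With the extra row filtered out, the excerpt lines up again.
print(first_mismatch(expected, drop_erasure_coded_rows(actual)))
```

Because the comparison is positional, the first differing row is the inserted one, and every subsequent row is reported as a mismatch even though its content is unchanged.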