Posted to issues@impala.apache.org by "Michael Brown (JIRA)" <ji...@apache.org> on 2017/11/20 20:39:00 UTC
[jira] [Resolved] (IMPALA-6109) Hbase in minicluster appears to be flaky
[ https://issues.apache.org/jira/browse/IMPALA-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Brown resolved IMPALA-6109.
-----------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.11.0
{noformat}
commit 10334260d893c8bca03349f23e34ad8832aa1d51
Author: Lars Volker <lv...@cloudera.com>
Date: Fri Nov 17 15:24:22 2017 -0800

    IMPALA-6109: xfail TestHdfsUnknownErrors::test_hdfs_safe_mode_error_255

    The test puts the HDFS name node into safe mode to trigger an "Unknown
    Error 255" and verifies that the error details can be obtained correctly
    via the libHDFS API. However, putting the name node into safe mode can
    trip up HBase (HBASE-18738), which causes sporadic failures of our other
    HBase tests. To prevent this, we xfail the test until the HBase issue
    has been addressed (or we find a better way to trigger a 255 error).

    IMPALA-6212 tracks re-enabling the test in the future.

    Change-Id: I55979bed07147409949b798d4beb7a3b3b7ec5c3
    Reviewed-on: http://gerrit.cloudera.org:8080/8590
    Reviewed-by: Sailesh Mukil <sa...@cloudera.com>
    Tested-by: Impala Public Jenkins
{noformat}
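For readers unfamiliar with xfail, the change described in the commit message might look roughly like the sketch below, assuming the test suite uses pytest markers. The class and test names come from the commit subject; the `run=False` argument, the reason string, and the placeholder body are assumptions for illustration, not the contents of the actual change.

```python
import pytest


class TestHdfsUnknownErrors:
    # Assumption: run=False tells pytest not to execute the body at all, so the
    # name node is never put into safe mode while the marker is in place.
    @pytest.mark.xfail(
        run=False,
        reason="IMPALA-6109: safe mode can trip up HBase (HBASE-18738); "
               "re-enabling is tracked in IMPALA-6212",
    )
    def test_hdfs_safe_mode_error_255(self):
        # Placeholder: the real test body is not shown in this message.
        raise NotImplementedError("illustrative placeholder only")
```

A plain xfail still runs the test and only changes how a failure is reported, which would leave the safe-mode side effect in place. That is why this sketch sets `run=False`; whether the actual commit does the same is not stated in this message.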
> Hbase in minicluster appears to be flaky
> ----------------------------------------
>
> Key: IMPALA-6109
> URL: https://issues.apache.org/jira/browse/IMPALA-6109
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 2.11.0
> Reporter: Tim Armstrong
> Assignee: Lars Volker
> Priority: Critical
> Labels: flaky
> Fix For: Impala 2.11.0
>
>
> I saw a bunch of HBase-related tests failing with errors along the lines of those below:
> metadata.test_compute_stats.TestHbaseComputeStats.test_hbase_compute_stats[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> metadata.test_compute_stats.TestHbaseComputeStats.test_hbase_compute_stats_incremental[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> query_test.test_hbase_queries.TestHBaseQueries.test_hbase_scan_node[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> query_test.test_join_queries.TestJoinQueries.test_joins_against_hbase[batch_size: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> query_test.test_hbase_queries.TestHBaseQueries.test_hbase_row_key[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> query_test.test_observability.TestObservability.test_scan_summary
> query_test.test_hbase_queries.TestHBaseQueries.test_hbase_filters[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> query_test.test_scanners.TestScannersAllTableFormats.test_scanners[batch_size: 0 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> query_test.test_mt_dop.TestMtDop.test_mt_dop[mt_dop: 2 | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> query_test.test_hbase_queries.TestHBaseQueries.test_hbase_subquery[exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | location: CLOSE | action: FAIL | query: select 1 from alltypessmall order by id]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 0 | location: OPEN | action: CANCEL | query: select row_number() over (partition by int_col order by id) from alltypessmall]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | location: GETNEXT_SCANNER | action: MEM_LIMIT_EXCEEDED | query: select * from alltypes]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 0 | location: GETNEXT | action: MEM_LIMIT_EXCEEDED | query: select 1 from alltypessmall order by id limit 100]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 0 | location: PREPARE_SCANNER | action: MEM_LIMIT_EXCEEDED | query: select count(int_col) from alltypessmall group by id]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 0 | location: OPEN | action: MEM_LIMIT_EXCEEDED | query: select count(*) from alltypessmall]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 0 | location: CLOSE | action: MEM_LIMIT_EXCEEDED | query: select c from (select id c from alltypessmall order by id limit 10) v where c = 1]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 0 | location: CLOSE | action: MEM_LIMIT_EXCEEDED | query: select * from alltypessmall union all select * from alltypessmall]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 0 | location: CLOSE | action: MEM_LIMIT_EXCEEDED | query: select 1 from alltypessmall a join alltypessmall b on a.id != b.id]
> failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | location: PREPARE | action: FAIL | query: select 1 from alltypessmall a join alltypessmall b on a.id = b.id]
> {noformat}
> E Query aborted:RuntimeException: couldn't retrieve HBase table (functional_hbase.alltypessmall) info:
> E This server is in the failed servers list: localhost/127.0.0.1:16202
> E CAUSED BY: FailedServerException: This server is in the failed servers list: localhost/127.0.0.1:16202
> {noformat}
> {noformat}
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> E INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> E MESSAGE: RuntimeException: couldn't retrieve HBase table (functional_hbase.alltypessmall) info:
> E Connection refused
> E CAUSED BY: ConnectException: Connection refused
> {noformat}