Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/12/22 21:37:00 UTC

[jira] [Resolved] (IMPALA-8722) test_hbase_col_filter failure

     [ https://issues.apache.org/jira/browse/IMPALA-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-8722.
-----------------------------------
    Resolution: Cannot Reproduce

> test_hbase_col_filter failure
> -----------------------------
>
>                 Key: IMPALA-8722
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8722
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 3.3.0
>            Reporter: Bikramjeet Vig
>            Assignee: Vihang Karajgaonkar
>            Priority: Critical
>              Labels: broken-build, flaky
>
> test_hbase_col_filter failure with the following exec params failed in one of the exhaustive runs
> {noformat}
> query_test.test_hbase_queries.TestHBaseQueries.test_hbase_col_filter[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
> {noformat}
> From the logs it seems the insert query finished around 23:27:42, while the invalidate metadata in Impala was run around 23:27:32.
> I am not sure if this is due to the log being buffered and written later, but if the insert finished after the invalidate metadata, then Impala probably didn't pick up the necessary file data, and hence the query that expected 3 rows got none. Note: the insert is run in Hive using "self.run_stmt_in_hive(add_data, username='hdfs')"
> HiveServer2 logs:
> {noformat}
> 2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.TableScanOperator: Initializing operator TS[0]
> 2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: Initializing operator SEL[1]
> 2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: SELECT struct<tmp_values_col1:string,tmp_values_col2:string>
> 2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Initializing operator FS[2]
> 2019-06-26T23:27:42,465  INFO [LocalJobRunner Map Task Executor #0] hadoop.InternalParquetRecordReader: block read in memory in 17 ms. row count = 2133979
> 2019-06-26T23:27:42,469  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Using serializer : class org.apache.hadoop.hive.hbase.HBaseSerDe[[:key,cf:c:[k, c]:[string, string]]] and formatter : org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat@c73c2ac
> 2019-06-26T23:27:42,471  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: New Final Path: FS hdfs://localhost:20500/test-warehouse/test_hbase_col_filter_2598223d.db/_tmp.hbase_col_filter_testkeyx/000000_0
> 2019-06-26T23:27:42,479 ERROR [LocalJobRunner Map Task Executor #0] hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> 2019-06-26T23:27:42,482  INFO [LocalJobRunner Map Task Executor #0] hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 2142543 records.
> 2019-06-26T23:27:42,482  INFO [LocalJobRunner Map Task Executor #0] hadoop.InternalParquetRecordReader: at row 0. reading next block
> 2019-06-26T23:27:42,496  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$47/372191955@532dae72
> 2019-06-26T23:27:42,497  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
> 2019-06-26T23:27:42,498  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /0:0:0:0:0:0:0:1:55090, server: localhost/0:0:0:0:0:0:0:1:2181
> 2019-06-26T23:27:42,499  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x16b96a782de00c5, negotiated timeout = 90000
> 2019-06-26T23:27:42,503  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[2]: records written - 1
> 2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: MAP[0]: records read - 1
> 2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: MAP[0]: Total records read - 1. abort - false
> 2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: DESERIALIZE_ERRORS:0, RECORDS_IN:1, 
> 2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[2]: records written - 1
> 2019-06-26T23:27:42,511  INFO [SpillThread] mapred.MapTask: Finished spill 3
> 2019-06-26T23:27:42,514  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: TOTAL_TABLE_ROWS_WRITTEN:1, RECORDS_OUT_1_test_hbase_col_filter_2598223d.hbase_col_filter_testkeyx:1, 
> 2019-06-26T23:27:42,517  INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task:attempt_local258751042_0023_m_000000_0 is done. And is in the process of committing
> 2019-06-26T23:27:42,518  INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task 'attempt_local258751042_0023_m_000000_0' done.
> 2019-06-26T23:27:42,518  INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Final Counters for attempt_local258751042_0023_m_000000_0: Counters: 26
>         File System Counters
>                 FILE: Number of bytes read=823764465
>                 FILE: Number of bytes written=1184809057
>                 FILE: Number of read operations=0
>                 FILE: Number of large read operations=0
>                 FILE: Number of write operations=0
>                 HDFS: Number of bytes read=49758306
>                 HDFS: Number of bytes written=1721
>                 HDFS: Number of read operations=2308
>                 HDFS: Number of large read operations=0
>                 HDFS: Number of write operations=907
>                 HDFS: Number of bytes read erasure-coded=0
>         Map-Reduce Framework
>                 Map input records=1
>                 Map output records=0
>                 Input split bytes=276
>                 Spilled Records=0
>                 Failed Shuffles=0
>                 Merged Map outputs=0
>                 GC time elapsed (ms)=0
>                 Total committed heap usage (bytes)=1310720000
>         HIVE
>                 CREATED_FILES=1
>                 DESERIALIZE_ERRORS=0
>                 RECORDS_IN=1
>                 RECORDS_OUT_1_test_hbase_col_filter_2598223d.hbase_col_filter_t=1
>                 TOTAL_TABLE_ROWS_WRITTEN=1
>         File Input Format Counters 
>                 Bytes Read=0
>         File Output Format Counters 
>                 Bytes Written=0
> {noformat}
> failed test log:
> {noformat}
> Stacktrace
> query_test/test_hbase_queries.py:91: in test_hbase_col_filter
>     self.run_test_case('QueryTest/hbase-col-filter', vector, unique_database)
> common/impala_test_suite.py:592: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:431: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:449: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:271: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E     3 != 0
> Standard Error
> SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_hbase_col_filter_2598223d` CASCADE;
> -- 2019-06-26 23:27:31,609 INFO     MainThread: Started query d342e8d640a4497a:27d6b2ff00000000
> SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_hbase_col_filter_2598223d`;
> -- 2019-06-26 23:27:32,524 INFO     MainThread: Started query 6245118433382eee:491c3c5f00000000
> -- 2019-06-26 23:27:32,525 INFO     MainThread: Created database "test_hbase_col_filter_2598223d" for test ID "query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]"
> -- executing against localhost:21000
> invalidate metadata test_hbase_col_filter_2598223d.hbase_col_filter_testkeyx;
> -- 2019-06-26 23:27:43,606 INFO     MainThread: Started query 8042663402a431b0:d18c468900000000
> SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> -- executing against localhost:21000
> use test_hbase_col_filter_2598223d;
> -- 2019-06-26 23:27:43,613 INFO     MainThread: Started query b34ba175430fa8ab:d338ded600000000
> SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select count(*) from hbase_col_filter_testkeyx
> WHERE k != 'row1';
> -- 2019-06-26 23:27:47,629 INFO     MainThread: Started query 564e1694a90e24b3:2e5ccc1f00000000
> -- 2019-06-26 23:27:47,703 ERROR    MainThread: Comparing QueryTestResults (expected vs actual):
> 3 != 0
> {noformat}
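If the diagnosis above is right (the Impala-side count was read before the Hive insert became visible), one common mitigation is to poll the row count instead of reading it once. The sketch below is a hypothetical helper, not part of the Impala test harness; `count_fn` stands in for whatever callable runs the Impala count query:

```python
import time


def wait_for_row_count(count_fn, expected, timeout_s=30.0, interval_s=0.5):
    """Poll count_fn() until it returns `expected` or the timeout elapses.

    A sketch of one way to make the test robust against the insert/invalidate
    race: instead of assuming the Hive INSERT is visible immediately after
    INVALIDATE METADATA, re-check the row count for a bounded time.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        actual = count_fn()
        if actual == expected:
            return actual
        if time.monotonic() >= deadline:
            # Give the same signal the verifier would: expected vs actual.
            raise AssertionError(
                "expected %d rows, last saw %d" % (expected, actual))
        time.sleep(interval_s)
```

With this pattern, a transiently-empty table (0 rows) is retried until the expected 3 rows appear, and a genuine failure still surfaces as an assertion after the timeout.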



--
This message was sent by Atlassian Jira
(v8.3.4#803005)