You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Bikramjeet Vig (JIRA)" <ji...@apache.org> on 2019/06/27 19:12:00 UTC

[jira] [Created] (IMPALA-8722) test_hbase_col_filter failure

Bikramjeet Vig created IMPALA-8722:
--------------------------------------

             Summary: test_hbase_col_filter failure
                 Key: IMPALA-8722
                 URL: https://issues.apache.org/jira/browse/IMPALA-8722
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 3.3.0
            Reporter: Bikramjeet Vig
            Assignee: Vihang Karajgaonkar


test_hbase_col_filter failure with the following exec params failed in one of the exhaustive runs
{noformat}
query_test.test_hbase_queries.TestHBaseQueries.test_hbase_col_filter[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]
{noformat}
From the logs it seems like the insert query was executed completely around 23:27:42 and the invalidate metadata in impala was run around 23:27:32

I am not sure if this is due to the log being buffered and written later, but if the insert finished after the invalidate metadata then it probably didn't get the necessary file data and hence the query that expected 3 rows, didn't get any. Note: The insert call is run in hive using "self.run_stmt_in_hive(add_data, username='hdfs')"

hive server 2 logs:
{noformat}
2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.TableScanOperator: Initializing operator TS[0]
2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: Initializing operator SEL[1]
2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: SELECT struct<tmp_values_col1:string,tmp_values_col2:string>
2019-06-26T23:27:42,456  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Initializing operator FS[2]
2019-06-26T23:27:42,465  INFO [LocalJobRunner Map Task Executor #0] hadoop.InternalParquetRecordReader: block read in memory in 17 ms. row count = 2133979
2019-06-26T23:27:42,469  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Using serializer : class org.apache.hadoop.hive.hbase.HBaseSerDe[[:key,cf:c:[k, c]:[string, string]]] and formatter : org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat@c73c2ac
2019-06-26T23:27:42,471  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: New Final Path: FS hdfs://localhost:20500/test-warehouse/test_hbase_col_filter_2598223d.db/_tmp.hbase_col_filter_testkeyx/000000_0
2019-06-26T23:27:42,479 ERROR [LocalJobRunner Map Task Executor #0] hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
2019-06-26T23:27:42,482  INFO [LocalJobRunner Map Task Executor #0] hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 2142543 records.
2019-06-26T23:27:42,482  INFO [LocalJobRunner Map Task Executor #0] hadoop.InternalParquetRecordReader: at row 0. reading next block
2019-06-26T23:27:42,496  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$47/372191955@532dae72
2019-06-26T23:27:42,497  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
2019-06-26T23:27:42,498  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /0:0:0:0:0:0:0:1:55090, server: localhost/0:0:0:0:0:0:0:1:2181
2019-06-26T23:27:42,499  INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x16b96a782de00c5, negotiated timeout = 90000
2019-06-26T23:27:42,503  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[2]: records written - 1
2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: MAP[0]: records read - 1
2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: MAP[0]: Total records read - 1. abort - false
2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: DESERIALIZE_ERRORS:0, RECORDS_IN:1, 
2019-06-26T23:27:42,504  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[2]: records written - 1
2019-06-26T23:27:42,511  INFO [SpillThread] mapred.MapTask: Finished spill 3
2019-06-26T23:27:42,514  INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: TOTAL_TABLE_ROWS_WRITTEN:1, RECORDS_OUT_1_test_hbase_col_filter_2598223d.hbase_col_filter_testkeyx:1, 
2019-06-26T23:27:42,517  INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task:attempt_local258751042_0023_m_000000_0 is done. And is in the process of committing
2019-06-26T23:27:42,518  INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task 'attempt_local258751042_0023_m_000000_0' done.
2019-06-26T23:27:42,518  INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Final Counters for attempt_local258751042_0023_m_000000_0: Counters: 26
        File System Counters
                FILE: Number of bytes read=823764465
                FILE: Number of bytes written=1184809057
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=49758306
                HDFS: Number of bytes written=1721
                HDFS: Number of read operations=2308
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=907
                HDFS: Number of bytes read erasure-coded=0
        Map-Reduce Framework
                Map input records=1
                Map output records=0
                Input split bytes=276
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=1310720000
        HIVE
                CREATED_FILES=1
                DESERIALIZE_ERRORS=0
                RECORDS_IN=1
                RECORDS_OUT_1_test_hbase_col_filter_2598223d.hbase_col_filter_t=1
                TOTAL_TABLE_ROWS_WRITTEN=1
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=0
{noformat}
failed test log:
{noformat}
Stacktrace
query_test/test_hbase_queries.py:91: in test_hbase_col_filter
    self.run_test_case('QueryTest/hbase-col-filter', vector, unique_database)
common/impala_test_suite.py:592: in run_test_case
    self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:431: in __verify_results_and_errors
    replace_filenames_with_placeholder)
common/test_result_verifier.py:449: in verify_raw_results
    VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:271: in verify_query_result_is_equal
    assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E     3 != 0
Standard Error
SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_hbase_col_filter_2598223d` CASCADE;

-- 2019-06-26 23:27:31,609 INFO     MainThread: Started query d342e8d640a4497a:27d6b2ff00000000
SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_hbase_col_filter_2598223d`;

-- 2019-06-26 23:27:32,524 INFO     MainThread: Started query 6245118433382eee:491c3c5f00000000
-- 2019-06-26 23:27:32,525 INFO     MainThread: Created database "test_hbase_col_filter_2598223d" for test ID "query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: hbase/none]"
-- executing against localhost:21000
invalidate metadata test_hbase_col_filter_2598223d.hbase_col_filter_testkeyx;

-- 2019-06-26 23:27:43,606 INFO     MainThread: Started query 8042663402a431b0:d18c468900000000
SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
-- executing against localhost:21000
use test_hbase_col_filter_2598223d;

-- 2019-06-26 23:27:43,613 INFO     MainThread: Started query b34ba175430fa8ab:d338ded600000000
SET client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000
select count(*) from hbase_col_filter_testkeyx
WHERE k != 'row1';

-- 2019-06-26 23:27:47,629 INFO     MainThread: Started query 564e1694a90e24b3:2e5ccc1f00000000
-- 2019-06-26 23:27:47,703 ERROR    MainThread: Comparing QueryTestResults (expected vs actual):
3 != 0
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)