Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/01/13 23:19:00 UTC

[jira] [Commented] (IMPALA-9262) Blacklist test test_kill_impalad_with_running_queries fails in exhaustive mode

    [ https://issues.apache.org/jira/browse/IMPALA-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014722#comment-17014722 ] 

ASF subversion and git services commented on IMPALA-9262:
---------------------------------------------------------

Commit 915e811c2cc95615f2619589d0128e98a188507e in impala's branch refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=915e811 ]

IMPALA-9262: De-flake TestBlacklist::test_kill_impalad_with_running_queries

test_blacklist.py::TestBlacklist::test_kill_impalad_with_running_queries
runs a query asynchronously, waits for it to reach the RUNNING or FINISHED
state, kills an impalad, and then expects a fetch results request for
the query to fail. The test is flaky because the query can finish
successfully before the impalad is killed.

The fix is to make the query slower using a debug action.
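
The race described above can be illustrated with a small, deterministic model. This is only a sketch and not the actual test code: FakeQuery, fetch_outcome, and the timing values are hypothetical, and delay_s merely stands in for the slowdown a debug action would introduce.

```python
from dataclasses import dataclass


@dataclass
class FakeQuery:
    """Hypothetical stand-in for an asynchronously running query."""
    work_s: float         # time the query needs to finish on its own
    delay_s: float = 0.0  # extra delay, analogous to an injected debug action

    @property
    def finish_time_s(self) -> float:
        return self.work_s + self.delay_s


def fetch_outcome(query: FakeQuery, kill_time_s: float) -> str:
    """Model the result of fetching after an executor is killed.

    The fetch fails only if the kill takes effect while the query is
    still running; otherwise the query has already finished.
    """
    return "failed" if kill_time_s < query.finish_time_s else "succeeded"


# Assumed timings: the kill takes effect 0.2 s after the query starts.
KILL_TIME_S = 0.2

fast = FakeQuery(work_s=0.05)                 # finishes before the kill
slowed = FakeQuery(work_s=0.05, delay_s=5.0)  # still running at kill time

print(fetch_outcome(fast, KILL_TIME_S))
print(fetch_outcome(slowed, KILL_TIME_S))
```

In this model the fast query finishes before the kill, so the fetch succeeds and a test expecting a failure would break. Slowing the query makes its finish time much later than any plausible kill time, which removes the dependence on timing.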

Testing:
* Looped the test for several hours

Change-Id: I8129323a7eb62cef61f1c6c34da06f08cf6d4b06
Reviewed-on: http://gerrit.cloudera.org:8080/14985
Reviewed-by: Sahil Takiar <st...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Blacklist test test_kill_impalad_with_running_queries fails in exhaustive mode
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-9262
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9262
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>    Affects Versions: Impala 3.4.0
>            Reporter: Laszlo Gaal
>            Assignee: Sahil Takiar
>            Priority: Blocker
>              Labels: broken-build
>
> The test test_kill_impalad_with_running_queries in test_blacklist.py failed in multiple builds in exhaustive mode with identical symptoms.
> The backtrace is:
> {code}
> custom_cluster/test_blacklist.py:145: in test_kill_impalad_with_running_queries
>     assert "TransmitData() to " in str(e) or "EndDataStream() to " in str(e)
> E   assert ('TransmitData() to ' in 'Query was expected to fail\nassert False' or 'EndDataStream() to ' in 'Query was expected to fail\nassert False')
> E    +  where 'Query was expected to fail\nassert False' = str(AssertionError(u'Query was expected to fail\nassert False',))
> E    +  and   'Query was expected to fail\nassert False' = str(AssertionError(u'Query was expected to fail\nassert False',))
> {code}
> Standard error is:
> {code}
> 02:19:19 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 02:19:19 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 02:19:19 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 02:19:20 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 02:19:20 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 02:19:20 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 02:19:23 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 02:19:23 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 02:19:23 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25000
> 02:19:23 MainThread: Waiting for num_known_live_backends=3. Current value: 2
> 02:19:24 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 02:19:24 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25000
> 02:19:24 MainThread: Waiting for num_known_live_backends=3. Current value: 2
> 02:19:25 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 02:19:25 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25000
> 02:19:25 MainThread: num_known_live_backends has reached value: 3
> 02:19:25 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 02:19:25 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25001
> 02:19:25 MainThread: num_known_live_backends has reached value: 3
> 02:19:25 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 02:19:25 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25002
> 02:19:25 MainThread: num_known_live_backends has reached value: 3
> 02:19:25 MainThread: Impala Cluster Running with 3 nodes (1 coordinators, 2 executors).
> DEBUG:impala_cluster:Found 3 impalad/1 statestored/1 catalogd process(es)
> INFO:impala_service:Getting metric: statestore.live-backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25010
> INFO:impala_service:Metric 'statestore.live-backends' has reached desired value: 4
> DEBUG:impala_service:Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25000
> INFO:impala_service:num_known_live_backends has reached value: 3
> DEBUG:impala_service:Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25001
> INFO:impala_service:num_known_live_backends has reached value: 3
> DEBUG:impala_service:Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1066.vpc.cloudera.com:25002
> INFO:impala_service:num_known_live_backends has reached value: 3
> ERROR:impala.hiveserver2:Failed to open transport (tries_left=3)
> Traceback (most recent call last):
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py", line 1009, in _execute
>     return func(request)
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 176, in OpenSession
>     return self.recv_OpenSession()
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 188, in recv_OpenSession
>     (fname, mtype, rseqid) = iprot.readMessageBegin()
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin
>     sz = self.readI32()
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 206, in readI32
>     buff = self.trans.readAll(4)
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll
>     chunk = self.read(sz - have)
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 159, in read
>     self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
>   File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py", line 120, in read
>     message='TSocket read 0 bytes')
> TTransportException: TSocket read 0 bytes
> INFO:impala_cluster:Killing <ImpaladProcess PID: 23167 (/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/be/build/latest/service/impalad -is_coordinator=false -disconnected_session_timeout 3600 -kudu_client_rpc_timeout_ms 0 -kudu_master_hosts localhost -mem_limit=12884901888 -logbufsecs=5 -v=1 -max_log_files=0 -log_filename=impalad_node2 -log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests -beeswax_port=21002 -hs2_port=21052 -hs2_http_port=28002 -be_port=22002 -krpc_port=27002 -state_store_subscriber_port=23002 -webserver_port=25002 --default_query_options=)> with signal 9
> {code}
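
The backtrace in the quoted report shows the text "Query was expected to fail" inside the exception whose message is checked. This suggests the test's own "expected to fail" assertion was caught by its exception handler. The following is a minimal sketch of that pattern, assuming a try/except of this shape. The helper names are hypothetical and the code is not taken from test_blacklist.py.

```python
def fetch_that_succeeds():
    """Hypothetical stand-in for a fetch that unexpectedly succeeds."""
    return ["row"]


def message_from_expected_failure(fetch):
    """Call fetch(), expecting it to raise, and return the error text."""
    try:
        fetch()
        # Reached only when fetch() did not raise.
        assert False, "Query was expected to fail"
    except Exception as e:
        # AssertionError is a subclass of Exception, so the assertion
        # above is caught here together with genuine fetch errors.
        return str(e)


message = message_from_expected_failure(fetch_that_succeeds)

# The follow-up check on the error text then fails, as in the report.
looks_like_rpc_error = (
    "TransmitData() to " in message or "EndDataStream() to " in message
)
print(message)
print(looks_like_rpc_error)
```

Under this assumption, the failing message check is a symptom. The underlying problem is that the query completed before the impalad was killed.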


