Posted to issues-all@impala.apache.org by "Hao Hao (JIRA)" <ji...@apache.org> on 2019/08/13 22:10:00 UTC

[jira] [Commented] (IMPALA-8856) TestKuduHMSIntegration.test_drop_managed_kudu_table failed with "the table does not exist"

    [ https://issues.apache.org/jira/browse/IMPALA-8856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906662#comment-16906662 ] 

Hao Hao commented on IMPALA-8856:
---------------------------------

Looking at the Kudu master log, there were two DeleteTable requests for table "test_drop_managed_kudu_table_a82c250c.foo". Because dropping the HMS table took more than 10s to finish, the first request timed out and the client issued a retry. The retry failed because the table had already been deleted by the first request.

{noformat}
W0812 14:49:59.766089 29240 hms_client.cc:275] Time spent drop HMS table: real 13.625s	user 0.000s	sys 0.000s
W0812 14:49:59.766505 29179 rpcz_store.cc:253] Call kudu.master.MasterService.DeleteTable from 127.0.0.1:42892 (request call id 3) took 13631 ms (13.6 s). Client timeout 9999 ms (10 s)
{noformat}
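
For reference, a minimal test-side sketch (not a product fix) of connecting the kudu-python client with larger timeouts, so that a slow HMS table drop can complete within a single DeleteTable attempt instead of timing out and being retried. The host and port come from the log above; the 30000 ms value is arbitrary, and the admin_timeout_ms / rpc_timeout_ms keyword arguments assume kudu-python's connect() API:
{noformat}
import kudu

# Connect with larger client-side timeouts so a DeleteTable RPC that is
# waiting on a slow HMS drop (>10s in the log above) is not timed out and
# retried. 30000 ms is an illustrative value, not a recommendation.
kudu_client = kudu.connect(host='localhost', port=7051,
                           admin_timeout_ms=30000,
                           rpc_timeout_ms=30000)
{noformat}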

The end result is that the table has been deleted in both Kudu and the HMS. And since Kudu's table drop also involves the HMS, with the default timeout configuration the client can receive a 'Not Found' error on the retried request even though the table has already been deleted. So I don't think this is a blocker; rather, it is test flakiness.
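
As a test-only mitigation (a sketch, not the actual fix in the test), the drop step in test_drop_managed_kudu_table could tolerate a NotFound raised by a client-side retry, since that only happens after the first (timed-out) request has already deleted the table. This assumes the kudu_client / kudu_tbl_name names from the test and kudu-python's KuduNotFound exception:
{noformat}
from kudu.errors import KuduNotFound

assert kudu_client.table_exists(kudu_tbl_name)
try:
    kudu_client.delete_table(kudu_tbl_name)
except KuduNotFound:
    # The first DeleteTable request timed out client-side but succeeded on
    # the master, so the automatic retry sees the table as already gone.
    assert not kudu_client.table_exists(kudu_tbl_name)
{noformat}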

> TestKuduHMSIntegration.test_drop_managed_kudu_table failed with "the table does not exist"
> ------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-8856
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8856
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>    Affects Versions: Impala 3.3.0
>            Reporter: Tim Armstrong
>            Assignee: Hao Hao
>            Priority: Blocker
>              Labels: flaky
>         Attachments: catalogd.impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com.jenkins.log.INFO.20190812-144934.5595, hive-metastore.log, impalad.impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com.jenkins.log.INFO.20190812-144935.5670, impalad.impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com.jenkins.log.INFO.20190812-144935.5675, kudu-master.impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com.jenkins.log.INFO.20190812-144746.28998, kudu-tserver.impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com.jenkins.diagnostics.20190812-144746.0.28965, kudu-tserver.impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com.jenkins.log.INFO.20190812-144746.28981, kudu-tserver.impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com.jenkins.log.INFO.20190812-144746.29022
>
>
> {noformat}
> custom_cluster.test_kudu.TestKuduHMSIntegration.test_drop_managed_kudu_table (from pytest)
> Failing for the past 1 build (Since Failed#31 )
> Took 27 sec.
> add description
> Error Message
> KuduNotFound: the table does not exist: table_name: "test_drop_managed_kudu_table_a82c250c.foo"
> Stacktrace
> custom_cluster/test_kudu.py:256: in test_drop_managed_kudu_table
>     kudu_client.delete_table(kudu_tbl_name)
> kudu/client.pyx:392: in kudu.client.Client.delete_table (kudu/client.cpp:7106)
>     ???
> kudu/errors.pyx:56: in kudu.errors.check_status (kudu/errors.cpp:904)
>     ???
> E   KuduNotFound: the table does not exist: table_name: "test_drop_managed_kudu_table_a82c250c.foo"
> Standard Error
> -- 2019-08-12 14:49:32,854 INFO     MainThread: Starting cluster with command: /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50     --statestore_priority_update_frequency_ms=50     --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests --log_level=1 --impalad_args=--default_query_options=
> 14:49:33 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 14:49:33 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 14:49:34 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 14:49:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 14:49:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 14:49:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 14:49:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 14:49:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 14:49:38 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25000
> 14:49:38 MainThread: Debug webpage not yet available: ('Connection aborted.', error(111, 'Connection refused'))
> 14:49:40 MainThread: Debug webpage did not become available in expected time.
> 14:49:40 MainThread: Waiting for num_known_live_backends=3. Current value: None
> 14:49:41 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 14:49:41 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25000
> 14:49:41 MainThread: Waiting for num_known_live_backends=3. Current value: 0
> 14:49:42 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 14:49:42 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25000
> 14:49:42 MainThread: num_known_live_backends has reached value: 3
> 14:49:43 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 14:49:43 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25001
> 14:49:43 MainThread: num_known_live_backends has reached value: 3
> 14:49:43 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 14:49:43 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25002
> 14:49:43 MainThread: num_known_live_backends has reached value: 3
> 14:49:44 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors).
> -- 2019-08-12 14:49:44,203 DEBUG    MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> -- 2019-08-12 14:49:44,203 INFO     MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25010
> -- 2019-08-12 14:49:44,204 INFO     MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com
> -- 2019-08-12 14:49:44,206 INFO     MainThread: Metric 'statestore.live-backends' has reached desired value: 4
> -- 2019-08-12 14:49:44,206 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25000
> -- 2019-08-12 14:49:44,206 INFO     MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com
> -- 2019-08-12 14:49:44,207 INFO     MainThread: num_known_live_backends has reached value: 3
> -- 2019-08-12 14:49:44,207 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25001
> -- 2019-08-12 14:49:44,208 INFO     MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com
> -- 2019-08-12 14:49:44,209 INFO     MainThread: num_known_live_backends has reached value: 3
> -- 2019-08-12 14:49:44,209 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com:25002
> -- 2019-08-12 14:49:44,210 INFO     MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-1188.vpc.cloudera.com
> -- 2019-08-12 14:49:44,211 INFO     MainThread: num_known_live_backends has reached value: 3
> SET client_identifier=custom_cluster/test_kudu.py::TestKuduHMSIntegration::()::test_drop_managed_kudu_table;
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> -- 2019-08-12 14:49:44,348 INFO     MainThread: Closing active operation
> -- 2019-08-12 14:49:44,349 ERROR    MainThread: Failed to open transport (tries_left=3)
> Traceback (most recent call last):
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py", line 961, in _execute
>     return func(request)
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 176, in OpenSession
>     return self.recv_OpenSession()
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 188, in recv_OpenSession
>     (fname, mtype, rseqid) = iprot.readMessageBegin()
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin
>     sz = self.readI32()
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 206, in readI32
>     buff = self.trans.readAll(4)
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll
>     chunk = self.read(sz - have)
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 159, in read
>     self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
>   File "/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py", line 120, in read
>     message='TSocket read 0 bytes')
> TTransportException: TSocket read 0 bytes
> -- 2019-08-12 14:49:44,350 INFO     MainThread: Using database testkuduhmsintegration_12761_bypi6f as default
> W0812 14:49:44.382771 12761 client-internal.cc:655] Specified master server address 'localhost:7051' resolved to multiple IPs. Using 127.0.0.1:7051
> SET client_identifier=custom_cluster/test_kudu.py::TestKuduHMSIntegration::()::test_drop_managed_kudu_table;
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_drop_managed_kudu_table_a82c250c` CASCADE;
> -- 2019-08-12 14:49:44,497 INFO     MainThread: Started query a4467ba0d476591c:e3d6328a00000000
> SET client_identifier=custom_cluster/test_kudu.py::TestKuduHMSIntegration::()::test_drop_managed_kudu_table;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_drop_managed_kudu_table_a82c250c`;
> -- 2019-08-12 14:49:44,882 INFO     MainThread: Started query 96443159258ca73f:52af6c0e00000000
> -- 2019-08-12 14:49:44,884 INFO     MainThread: Created database "test_drop_managed_kudu_table_a82c250c" for test ID "custom_cluster/test_kudu.py::TestKuduHMSIntegration::()::test_drop_managed_kudu_table"
> -- 2019-08-12 14:49:46,129 INFO     MainThread: Starting new HTTP connection (1): localhost
> W0812 14:49:56.135144  6748 master_proxy_rpc.cc:192] Re-attempting DeleteTable request to leader Master (localhost:7051)
> {noformat}
> It looks like there is some kind of race here, since there is an existence check for the table before deleting it:
> {noformat}
>     assert kudu_client.table_exists(kudu_tbl_name)
>     kudu_client.delete_table(kudu_tbl_name)
> {noformat}
> I attached all the logs that included the table name.


