Posted to commits@samza.apache.org by "Xinyu Liu (JIRA)" <ji...@apache.org> on 2018/01/04 20:33:00 UTC

[jira] [Updated] (SAMZA-1403) Indefinite wait in TestZkLocalApplicationRunner.teardown

     [ https://issues.apache.org/jira/browse/SAMZA-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyu Liu updated SAMZA-1403:
-----------------------------
    Fix Version/s:     (was: 0.14.0)
                   0.15.0

> Indefinite wait in TestZkLocalApplicationRunner.teardown
> --------------------------------------------------------
>
>                 Key: SAMZA-1403
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1403
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Shanthoosh Venkataraman
>            Assignee: Shanthoosh Venkataraman
>            Priority: Minor
>             Fix For: 0.15.0
>
>
> We observed a long teardown phase in the ZooKeeper integration tests (TestZkLocalApplicationRunner, which spawns an EmbeddedZookeeper), leading to failures (integration tests have a maximum runtime limit of 2 minutes).
> Here’s the related stacktrace:
> {code:java}
> shouldReElectLeaderWhenLeaderDies FAILED
>     java.lang.Exception: test timed out after 120000 milliseconds
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:589)
>         at kafka.zk.ZkFourLetterWords$.sendStat(ZkFourLetterWords.scala:37)
>         at kafka.zk.EmbeddedZookeeper.kafka$zk$EmbeddedZookeeper$$isDown$1(EmbeddedZookeeper.scala:45)
>         at kafka.zk.EmbeddedZookeeper$$anonfun$shutdown$3.apply$mcZ$sp(EmbeddedZookeeper.scala:50)
>         at kafka.zk.EmbeddedZookeeper$$anonfun$shutdown$3.apply(EmbeddedZookeeper.scala:50)
>         at kafka.zk.EmbeddedZookeeper$$anonfun$shutdown$3.apply(EmbeddedZookeeper.scala:50)
>         at scala.collection.Iterator$$anon$9.next(Iterator.scala:162)
>         at scala.collection.Iterator$class.exists(Iterator.scala:919)
>         at scala.collection.AbstractIterator.exists(Iterator.scala:1336)
>         at kafka.zk.EmbeddedZookeeper.shutdown(EmbeddedZookeeper.scala:50)
>         at org.apache.samza.test.harness.AbstractZookeeperTestHarness$$anonfun$tearDown$2.apply$mcV$sp(AbstractZookeeperTestHarness.scala:54)
>         at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:79)
>         at kafka.utils.Logging$class.swallowWarn(Logging.scala:94)
>         at kafka.utils.CoreUtils$.swallowWarn(CoreUtils.scala:49)
>         at kafka.utils.Logging$class.swallow(Logging.scala:96)
>         at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:49)
>         at org.apache.samza.test.harness.AbstractZookeeperTestHarness.tearDown(AbstractZookeeperTestHarness.scala:54)
>         at org.apache.samza.test.harness.AbstractKafkaServerTestHarness.tearDown(AbstractKafkaServerTestHarness.scala:97)
>         at org.apache.samza.test.StandaloneIntegrationTestHarness.tearDown(StandaloneIntegrationTestHarness.java:73)
>         at org.apache.samza.test.processor.TestZkLocalApplicationRunner.tearDown(TestZkLocalApplicationRunner.java:159)
> {code}
> In EmbeddedZookeeper.shutdown, zkServer is polled indefinitely after zookeeper.shutdown to verify that the shutdown has completed. The stack trace shows that this polling never terminates.
> There are known deadlock problems in zookeeperServer.shutdown in ZooKeeper 3.4.6, the version that the Samza project is using.
> This issue is documented/discussed in the following places:
>  
> * http://zookeeper-user.578899.n2.nabble.com/ZooKeeperServer-shutdown-hangs-td7581821.html
> * https://issues.apache.org/jira/browse/ZOOKEEPER-2347
> * https://issues.apache.org/jira/browse/ZOOKEEPER-2687
> From the ZooKeeper JIRAs, it appears that upgrading zookeeperVersion to 3.4.9 should solve this problem.
> After the upgrade, this problem didn’t occur locally during multiple test runs. 
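
The unbounded polling described in the issue can also be guarded against in the test harness itself, independently of the ZooKeeper upgrade. The following is a minimal sketch of a bounded wait, not the Kafka or Samza implementation: the class BoundedShutdownWait and the methods awaitCondition and isConnectionRefused are hypothetical names introduced here for illustration. It polls a "server is down" check until a deadline passes, and the probe uses a connect timeout so that a single connection attempt cannot block forever.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of Kafka, Samza, or ZooKeeper.
final class BoundedShutdownWait {

    // Polls `condition` until it returns true or `timeoutMs` elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMs, long pollIntervalMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of polling forever
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    // Assumed probe: treats "connection refused" as evidence that nothing is
    // listening on host:port. A connect timeout or any other I/O error is
    // reported as "not confirmed down" so the caller keeps polling.
    static boolean isConnectionRefused(String host, int port, int connectTimeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            return false; // something accepted the connection
        } catch (ConnectException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A teardown method could then call something like BoundedShutdownWait.awaitCondition(() -> BoundedShutdownWait.isConnectionRefused("localhost", zkPort, 1000), 30000, 100), where zkPort stands for the embedded server's port, and log a warning when it returns false rather than letting the test run into its 2 minute limit.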



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)