Posted to issues@drill.apache.org by "Steven Phillips (JIRA)" <ji...@apache.org> on 2015/03/26 04:19:52 UTC

[jira] [Commented] (DRILL-2120) Bringing up multiple drillbits at same time results in synchronization failure

    [ https://issues.apache.org/jira/browse/DRILL-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381284#comment-14381284 ] 

Steven Phillips commented on DRILL-2120:
----------------------------------------

In the putIfAbsent() method of ZkAbstractStore, we check whether a znode exists using our local cache, and if it is not there, we attempt to create the znode. But in some cases the znode already exists in ZooKeeper by the time we try to create it, and this throws an exception that we are not handling.

The straightforward solution would be to catch the NodeExistsException and rebuild the node in the local cache.
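
A minimal sketch of that approach, using only in-memory stand-ins so it runs without ZooKeeper or Curator. The names FakeZk, CachedStore, and the local NodeExistsException class are hypothetical and are not the Drill or ZooKeeper classes; the real change would go in ZkAbstractStore.putIfAbsent() and would catch KeeperException.NodeExistsException.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PutIfAbsentSketch {

    /** Hypothetical stand-in for KeeperException.NodeExistsException. */
    static class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("NodeExists for " + path);
        }
    }

    /** Hypothetical stand-in for the shared ZooKeeper ensemble. */
    static class FakeZk {
        private final Map<String, String> nodes = new ConcurrentHashMap<>();

        /** Fails if the path already exists, as a ZooKeeper create does. */
        void create(String path, String value) throws NodeExistsException {
            if (nodes.putIfAbsent(path, value) != null) {
                throw new NodeExistsException(path);
            }
        }

        String get(String path) {
            return nodes.get(path);
        }
    }

    /** Hypothetical store whose local cache can lag behind FakeZk. */
    static class CachedStore {
        private final FakeZk zk;
        private final Map<String, String> localCache = new ConcurrentHashMap<>();

        CachedStore(FakeZk zk) {
            this.zk = zk;
        }

        /** Returns the existing value, or null if this call created the node. */
        String putIfAbsent(String path, String value) {
            String cached = localCache.get(path);
            if (cached != null) {
                return cached;
            }
            try {
                zk.create(path, value);
                localCache.put(path, value);
                return null;
            } catch (NodeExistsException e) {
                // Another process created the node first, so the local cache
                // was stale. Reload the entry instead of failing startup.
                // (This sketch has no deletes, so the read cannot return null.)
                String existing = zk.get(path);
                localCache.put(path, existing);
                return existing;
            }
        }
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        CachedStore first = new CachedStore(zk);
        CachedStore second = new CachedStore(zk);
        // Both stores start with empty caches, like drillbits started together.
        System.out.println(first.putIfAbsent("/sys.storage_plugins/cp", "config-from-first"));
        System.out.println(second.putIfAbsent("/sys.storage_plugins/cp", "config-from-second"));
    }
}
```

Here the second store's cache is empty, so it attempts the create, hits the exception, and falls back to reading the value the first store wrote rather than propagating the failure.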

> Bringing up multiple drillbits at same time results in synchronization failure
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-2120
>                 URL: https://issues.apache.org/jira/browse/DRILL-2120
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Ramana Inukonda Nagaraj
>            Assignee: Steven Phillips
>             Fix For: 0.9.0
>
>
> Repro:
> With a fresh ZK install, bring up 4 drillbits at the same time using something like clush:
> clush -g ats /opt/drill/bin/drillbit.sh start
> It looks like all 4 nodes query ZK to see if the node exists, and all of them try to create it at the same time. Some succeed, others don't. The ones that fail have incorrect information about the state of ZK, which would explain the stack trace below.
> {code}
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Exception in thread "main" org.apache.drill.exec.exception.DrillbitStartupException: Failure during initial startup of Drillbit.
>         at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:76)
>         at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:60)
>         at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:83)
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
>         at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.putIfAbsent(ZkAbstractStore.java:135)
>         at org.apache.drill.exec.store.StoragePluginRegistry.createPlugins(StoragePluginRegistry.java:150)
>         at org.apache.drill.exec.store.StoragePluginRegistry.init(StoragePluginRegistry.java:130)
>         at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:155)
>         at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:73)
>         ... 2 more
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
>         at org.apache.drill.exec.store.sys.zk.ZkPStore.createNodeInZK(ZkPStore.java:53)
>         at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.putIfAbsent(ZkAbstractStore.java:129)
>         ... 6 more
> Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /drill-ats-build/sys.storage_plugins/cp
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>         at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676)
>         at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660)
>         at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>         at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656)
>         at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441)
>         at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431)
>         at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44)
>         at org.apache.drill.exec.store.sys.zk.ZkPStore.createNodeInZK(ZkPStore.java:51)
>         ... 7 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)