Posted to dev@lucene.apache.org by "Scott Blum (JIRA)" <ji...@apache.org> on 2015/08/05 00:31:05 UTC

[jira] [Updated] (SOLR-7869) Overseer does not handle BadVersionException correctly

     [ https://issues.apache.org/jira/browse/SOLR-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Blum updated SOLR-7869:
-----------------------------
    Attachment: SOLR-7869.patch

Attached a test-only patch that reproduces the failure. This is not a fix.

> Overseer does not handle BadVersionException correctly
> ------------------------------------------------------
>
>                 Key: SOLR-7869
>                 URL: https://issues.apache.org/jira/browse/SOLR-7869
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.2.1
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>              Labels: difficulty-medium, impact-low
>             Fix For: 5.3, Trunk
>
>         Attachments: SOLR-7869.patch
>
>
> If /clusterstate.json is modified externally, the Overseer can go into an infinite loop upon a BadVersionException, alternately trying to execute the main queue and then the work queue:
> {code}
> ERROR - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer work queue loop
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /clusterstate.json
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270)
>         at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:362)
>         at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:359)
>         at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
>         at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:359)
>         at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:180)
>         at org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:67)
>         at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:286)
>         at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:168)
>         at java.lang.Thread.run(Thread.java:745)
> INFO  - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; processMessage: queueSize: 1, message = {
>   "operation":"state",
>   "state":"down",
>   "base_url":"http://127.0.1.1:7574/solr",
>   "core":"test_shard1_replica1",
>   "roles":null,
>   "node_name":"127.0.1.1:7574_solr",
>   "shard":null,
>   "collection":"test",
>   "core_node_name":"core_node1"} current state version: 9
> INFO  - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.overseer.ReplicaMutator; Update state numShards=null message={
>   "operation":"state",
>   "state":"down",
>   "base_url":"http://127.0.1.1:7574/solr",
>   "core":"test_shard1_replica1",
>   "roles":null,
>   "node_name":"127.0.1.1:7574_solr",
>   "shard":null,
>   "collection":"test",
>   "core_node_name":"core_node1"}
> INFO  - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.overseer.ReplicaMutator; shard=shard1 is already registered
> ERROR - 2015-08-04 18:49:56.225; [   ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main queue loop
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /clusterstate.json
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270)
>         at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:362)
>         at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:359)
>         at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
>         at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:359)
>         at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:180)
>         at org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:67)
>         at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:286)
>         at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:213)
>         at java.lang.Thread.run(Thread.java:745)
> INFO  - 2015-08-04 18:49:56.225; [   ] org.apache.solr.common.cloud.ZkStateReader; Updating data for gettingstarted to ver 8
> {code}
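
To illustrate the failure mode described above, here is a minimal, self-contained sketch of a version-checked write. It does not use Solr or ZooKeeper code: VersionedNode, the local BadVersionException, writeWithStaleVersion and writeWithRefresh are all hypothetical names invented for this example, loosely modeled on the semantics of ZooKeeper's setData(path, data, version). It is not the fix for this issue; it only shows why retrying with a cached version cannot succeed after an external modification.

```java
/** Sketch only: an in-memory stand-in for a version-checked ZooKeeper write. */
final class VersionedWriteSketch {

    /** Hypothetical stand-in for ZooKeeper's BadVersionException. */
    static final class BadVersionException extends Exception {
        BadVersionException(int expected, int actual) {
            super("BadVersion: expected " + expected + " but node is at " + actual);
        }
    }

    /** Hypothetical node holding data plus a version that increases on every write. */
    static final class VersionedNode {
        private String data;
        private int version;

        VersionedNode(String data) { this.data = data; }

        synchronized int getVersion() { return version; }
        synchronized String getData() { return data; }

        /** Writes only if expectedVersion matches the current version; returns the new version. */
        synchronized int setData(String newData, int expectedVersion) throws BadVersionException {
            if (expectedVersion != version) {
                throw new BadVersionException(expectedVersion, version);
            }
            data = newData;
            return ++version;
        }
    }

    /** Retries with the same cached version, so every attempt fails once the node changed externally. */
    static boolean writeWithStaleVersion(VersionedNode node, String data, int cachedVersion, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                node.setData(data, cachedVersion);
                return true;
            } catch (BadVersionException e) {
                // The cached version is never refreshed, so the next attempt fails identically.
            }
        }
        return false;
    }

    /** Re-reads the node's version after a conflict before retrying. */
    static boolean writeWithRefresh(VersionedNode node, String data, int cachedVersion, int maxAttempts) {
        int expected = cachedVersion;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                node.setData(data, expected);
                return true;
            } catch (BadVersionException e) {
                expected = node.getVersion(); // refresh the expected version, then retry
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        VersionedNode node = new VersionedNode("{}");
        int cached = node.getVersion();           // the writer caches version 0
        node.setData("{\"external\":true}", 0);   // an external edit bumps the node to version 1
        System.out.println(writeWithStaleVersion(node, "{\"a\":1}", cached, 5)); // prints false
        System.out.println(writeWithRefresh(node, "{\"a\":1}", cached, 5));      // prints true
    }
}
```

The bounded maxAttempts parameter is only there so the example terminates; an unbounded loop around the stale variant would spin forever, which resembles the alternating main queue / work queue errors in the log. Note that the refreshing variant blindly overwrites the external change. A real writer would presumably need to re-read the current state and re-apply its pending updates on top of it rather than only refreshing the version number.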


