Posted to solr-user@lucene.apache.org by Tim Vaillancourt <ti...@elementspace.com> on 2013/12/06 00:50:34 UTC

No /clusterstate.json updates on SolrCloud 4.3.1 Cores API UNLOAD/CREATE

Hey guys,

I've been having an issue with 1 of my 4 replicas being inconsistent, and
have been trying to fix it. At the core of this issue, I've noticed that
/clusterstate.json doesn't seem to receive updates when cores become
unhealthy, or even when they are added or removed.

Today I decided I would remove the "bad" replica from the SolrCloud and
force a sync of a new clean replica, so I ran a
'/admin/cores?command=UNLOAD&name=name' to drop it. After this, on the
instance with the "bad" replica, the core was removed from solr.xml but
strangely NOT from /clusterstate.json in Zookeeper - its entry remained
there unchanged, still with "state: active" :(.
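
For anyone trying to reproduce this, here is a minimal sketch of building
that kind of request. It assumes the parameter names from the CoreAdmin
documentation (action and core) rather than the shorthand quoted above; the
host, port and core name are hypothetical placeholders, not my real setup.

```python
from urllib.parse import urlencode


def core_admin_url(base_url, action, core):
    """Build a CoreAdmin request URL for the given action and core name."""
    # Parameter names "action" and "core" are assumed from the CoreAdmin docs.
    query = urlencode({"action": action, "core": core})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


# Hypothetical host and core name, for illustration only.
print(core_admin_url("http://localhost:8983/solr", "UNLOAD", "mycore"))
```

The function only builds the URL; it does not send the request, so it is
safe to run against nothing.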

So, I then manually edited the clusterstate.json with a Perl script,
removing the JSON data for the "bad" replica. I checked that all nodes saw
the change themselves, and things looked good. Then I brought the node
up/down to check that it was properly adding/removing itself from the
/live_nodes znode in Zookeeper. That all worked perfectly, too.
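
The manual edit can be sketched roughly as below. This is Python rather than
the Perl I actually used, and it is not my real script. It assumes the
clusterstate layout of collection -> "shards" -> shard -> "replicas" ->
replica with a "node_name" field; the collection, shard, replica and node
names in the sample are made up.

```python
import json


def remove_replicas_on_node(cluster_state, node_name):
    """Return (new_state, removed) with every replica on node_name dropped."""
    # Round-trip through JSON to get a deep copy and leave the input untouched.
    new_state = json.loads(json.dumps(cluster_state))
    removed = []
    for coll_name, coll in new_state.items():
        for shard_name, shard in coll.get("shards", {}).items():
            replicas = shard.get("replicas", {})
            # Copy the keys first so entries can be deleted while iterating.
            for replica_name in list(replicas):
                if replicas[replica_name].get("node_name") == node_name:
                    del replicas[replica_name]
                    removed.append((coll_name, shard_name, replica_name))
    return new_state, removed


# Hypothetical sample state, assumed layout only.
sample_state = {
    "collection1": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"state": "active", "node_name": "good:8983_solr"},
                    "core_node2": {"state": "active", "node_name": "bad:8983_solr"},
                }
            }
        }
    }
}

new_state, removed = remove_replicas_on_node(sample_state, "bad:8983_solr")
print(removed)
print(json.dumps(new_state, indent=2))
```

Writing the result back to Zookeeper is left out on purpose, since that step
depends on the client library in use.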

Here is the really odd part: when I created a new replica on this node (to
replace the "bad" replica), the core was created on the node, and NO update
was made to /clusterstate.json. At this point this node had no cores, no
cores with state in /clusterstate.json, and all data dirs deleted, so this
is quite confusing.

I checked the ACLs on /clusterstate.json, and it is world/anyone accessible:

"[zk: localhost:2181(CONNECTED) 18] getAcl /clusterstate.json
'world,'anyone
: cdrwa"

Also, keep in mind my external Perl script had no issue updating
/clusterstate.json. Can anyone suggest why /clusterstate.json isn't getting
updated when I create this new core?

One other thing I checked was the health of the Zookeeper ensemble: all 3
Zookeepers have the same mZxid, ctime, mtime, etc. for /clusterstate.json
and receive updates with no problem. It's just this node that isn't
updating Zookeeper somehow.
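
A rough sketch of that comparison, assuming the "key = value" lines that
zkCli prints for "stat /clusterstate.json" on each server. The sample values
below are invented for illustration and are not my real output.

```python
def parse_zk_stat(text):
    """Parse 'key = value' lines from zkCli stat output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # skip lines that are not key/value pairs
            fields[key.strip()] = value.strip()
    return fields


def stats_agree(stat_outputs, keys=("mZxid", "mtime", "dataVersion")):
    """True if every server reports the same, non-missing value per key."""
    parsed = [parse_zk_stat(text) for text in stat_outputs]
    for key in keys:
        values = {p.get(key) for p in parsed}
        if len(values) != 1 or None in values:
            return False
    return True


# Invented sample output, one string per Zookeeper server.
server_a = "cZxid = 0x10\nmZxid = 0x2a\nmtime = Thu Dec 05 2013\ndataVersion = 7"
server_b = "cZxid = 0x10\nmZxid = 0x2a\nmtime = Thu Dec 05 2013\ndataVersion = 7"
server_c = "cZxid = 0x10\nmZxid = 0x29\nmtime = Thu Dec 05 2013\ndataVersion = 6"

print(stats_agree([server_a, server_b]))
print(stats_agree([server_a, server_b, server_c]))
```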

Any thoughts are much appreciated!

Thanks!

Tim