Posted to solr-user@lucene.apache.org by Gilles Comeau <gi...@polecat.co> on 2012/11/13 11:26:21 UTC

Removing Shards from Zookeeper - no servers hosting shard

Hi all,

We've just updated to SOLR 4.0 production and Zookeeper 3.3.6 from a SOLR 4.0 development version circa November 2011. We keep 6 months of data online in our primary cluster, and archive off old stuff to a slower disk archive cluster. We used to remove SOLR cores with the following code, but the layout in Zookeeper has changed now.

Old code to remove cores from Zookeeper:


# Unload the core (URL quoted so the shell does not treat & as a background operator)
curl "http://127.0.0.1:8080/solr/admin/cores?action=UNLOAD&core=${SHARD}"

        echo "Removing indexes from all Zookeeper hosts"
        for (( i=0; i<${#ZK_HOSTS[*]}; i++ ))
        do
                $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD/$HOSTNAME:8080_solr_$SHARD
                $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD
        done

curl "http://solrmaster01:8080/solr/admin/cores?action=RELOAD&core=master"
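
For anyone adapting the old script, here is a dry-run sketch of the same two deletes. It only prints the commands. The znode paths are the pre-migration ones from the script above, so they are not expected to match the 4.0 production layout. ZK_CLI, znode_paths and print_zk_deletes are names made up for this sketch, and it assumes the stock zkCli.sh wrapper is on the PATH and that all ZooKeeper hosts form one ensemble, in which case one delete against a comma-separated connect string should be enough.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the ZooKeeper delete commands instead of running them.
# ZK_CLI, znode_paths and print_zk_deletes are hypothetical names for this example.
ZK_CLI="${ZK_CLI:-zkCli.sh}"   # assumed wrapper around org.apache.zookeeper.ZooKeeperMain

# Emit the child znode first, then its parent: ZooKeeper refuses to delete
# a znode that still has children.
znode_paths() {
  local shard="$1" host="$2"
  local base="/collections/polecat/shards/solrenglish:8080_solr_${shard}"
  printf '%s/%s:8080_solr_%s\n' "$base" "$host" "$shard"
  printf '%s\n' "$base"
}

# Usage: print_zk_deletes SHARD HOSTNAME ZKHOST [ZKHOST...]
print_zk_deletes() {
  local shard="$1" host="$2"
  shift 2
  local connect
  connect="$(IFS=,; echo "$*")"   # join the remaining arguments with commas
  znode_paths "$shard" "$host" | while read -r path; do
    echo "$ZK_CLI -server $connect delete $path"
  done
}
```

Calling `print_zk_deletes "$SHARD" "$HOSTNAME" "${ZK_HOSTS[@]}"` prints two lines that can be reviewed before anything is run by hand.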

Now that we have migrated, I have tried removing cores from Zookeeper by removing the stuff for the unloaded core in "leaders" and "leader_elect", but for some reason SOLR keeps sending the requests to the shard, and I end up with the "no servers hosting shard" error.

Does anyone know how to remove a SOLR core from a SOLR server and have Zookeeper updated, and have distributed queries still work? The only thing I know how to do now is stop tomcat, stop zookeeper, clear out the data directory and then restart both. This isn't really ideal for a process I'd like to have running each night, and surely it is something others have hit. I've tried google searching, and what I find is references to the bug where solr notifies zookeeper on core unloads, which is marked as fixed, and people talking about how it doesn't work but that if you run reloads on each core, it will work. (That also doesn't work when I do it.)

Regards,

Gilles Comeau

Re: Removing Shards from Zookeeper - no servers hosting shard

Posted by Mark Miller <ma...@gmail.com>.
Missed the list in my last reply:

This used to work properly - I'm guessing that the zk layout refactoring right before 4.0 broke it. We likely need a JIRA issue, a fix, and a test.

Mark



RE: Removing Shards from Zookeeper - no servers hosting shard

Posted by Gilles Comeau <gi...@polecat.co>.
Hi all,

I just wanted to make the simplest repro of this issue, which I now think might be related to the decision made in https://issues.apache.org/jira/browse/SOLR-3080. Is this the expected behaviour?

1.	Download SOLR 4 production and extract.
2.	Replace solr.xml in apache-solr-4.0.0/example/solr/solr.xml with:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
    <core shard="shard1" instanceDir="collection1/" name="collection1" collection="polecat"/>
    <core shard="shard1" instanceDir="collection2/" name="collection2" collection="polecat"/>
    <core schema="schema.xml" shard="core3" instanceDir="core3/" name="core3" config="solrconfig.xml" collection="polecat" dataDir="data"/>
  </cores>
</solr>

3.	Start solr with: java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -Dsolrcloud.skip.autorecovery=true  -jar start.jar
	(skip.autorecovery is used because the shards don't exist previously)

Then run this:
	Sanity query:  http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
	Remove the core: http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core3&deleteIndex=true
	Error query: http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
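
The three requests above can be scripted roughly as follows. This is only a sketch: the URLs are the ones listed, but SOLR_BASE, COLLECTION, CORE, RUN_REPRO and the function names are made up for the example, and nothing touches the network unless RUN_REPRO=1 is set.

```shell
#!/usr/bin/env bash
# Sketch of the repro requests. Defaults mirror the steps above; the variable
# and function names are hypothetical.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-polecat}"
CORE="${CORE:-core3}"

# Distributed match-all query against the collection.
select_url() {
  printf '%s/%s/select?q=*%%3A*&wt=xml&distrib=true\n' "$SOLR_BASE" "$COLLECTION"
}

# CoreAdmin UNLOAD request for the core, deleting its index.
unload_url() {
  printf '%s/admin/cores?action=UNLOAD&core=%s&deleteIndex=true\n' "$SOLR_BASE" "$CORE"
}

# Succeeds when the response body on stdin contains the error from this thread.
has_shard_error() {
  grep -q 'no servers hosting shard'
}

run_repro() {
  curl -s "$(select_url)" | has_shard_error && echo "unexpected: error before unload"
  curl -s "$(unload_url)" > /dev/null
  if curl -s "$(select_url)" | has_shard_error; then
    echo "reproduced: no servers hosting shard"
  else
    echo "not reproduced"
  fi
}

# Only send requests when explicitly asked to.
if [ "${RUN_REPRO:-0}" = "1" ]; then
  run_repro
fi
```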

The sanity query returns 0 records; the error query returns "no servers hosting shard:". The clusterstate.json then contains:  "core3":{"replicas":{}}}}
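
To spot such leftovers quickly, a saved copy of clusterstate.json can be scanned for shards whose replicas map is empty. This is a text-matching sketch rather than a real JSON parser, it assumes shard names contain no quotes or whitespace, and empty_shards is a name made up here.

```shell
#!/usr/bin/env bash
# Sketch: list shards whose "replicas" map is empty in a clusterstate.json dump.
# empty_shards is a hypothetical helper; it reads the JSON on stdin.
empty_shards() {
  # Strip whitespace so pretty-printed and compact dumps look the same,
  # pick out every "name":{"replicas":{}} entry, then keep only the name.
  tr -d ' \t\r\n' |
    grep -oE '"[^"]+":\{"replicas":\{\}\}' |
    sed -E 's/^"([^"]+)".*$/\1/'
}
```

For example, `empty_shards < clusterstate.json` would print one shard name per line for a dump saved to that file.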

Regards,

Gilles


RE: Removing Shards from Zookeeper - no servers hosting shard

Posted by Gilles Comeau <gi...@polecat.co>.
Sorry, I forgot that pictures are no good on the list. Here is the same information from clusterstate.json; the shard for the core I unloaded sticks around:  "solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}

Do I need a special command to delete the shard or something?  I've never seen a command that does that.

Regards, Gilles

{"experiment":{
    "solrexperiment:8080_solr_experiment_master":{"replicas":{"IS-17093:9090_solr_experiment_master":{
          "shard":"solrexperiment:8080_solr_experiment_master",
          "roles":null,
          "state":"active",
          "core":"experiment_master",
          "collection":"experiment",
          "node_name":"IS-17093:9090_solr",
          "base_url":"http://IS-17093:9090/solr",
          "leader":"true"}}},
    "solrexperiment:8080_solr_experiment_01_10_2012":{"replicas":{"IS-17093:9090_solr_01_10_2012_experiment":{
          "shard":"solrexperiment:8080_solr_experiment_01_10_2012",
          "roles":null,
          "state":"active",
          "core":"01_10_2012_experiment",
          "collection":"experiment",
          "node_name":"IS-17093:9090_solr",
          "base_url":"http://IS-17093:9090/solr",
          "leader":"true"}}},
    "solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}


From: Gilles Comeau [mailto:gilles.comeau@polecat.co]
Sent: 13 November 2012 16:29
To: solr-user@lucene.apache.org; markrmiller@gmail.com
Subject: RE: Removing Shards from Zookeeper - no servers hosting shard


When I do the unload through the UI, I see the below messages in the solr log.   Nothing in the zookeeper log.



Then right after I try:  http://217.147.83.124:9090/solr/experiment_master/select?q=*%3A*&wt=xml&distrib=true  and get  <str name="msg">no servers hosting shard:</str>.   Also, I still see the shard being referenced in the cloud tab in the UI.






Does this work for anyone else using SOLR 4.0 production with external zookeeper and distributed queries and if so, can you let me know exactly what versions and steps you take to not get this error? ☺   Anyone else have any problems getting this to work?



My setup is pretty basic:  Local external zookeeper  3.3.6, solr 4.0 with three cores seen above.



Regards, Gilles



INFO: [02_10_2012_experiment]  CLOSING SolrCore org.apache.solr.core.SolrCore@11e3c2c6

13-Nov-2012 16:19:13 org.apache.solr.core.SolrCore closeSearcher

INFO: [02_10_2012_experiment] Closing main searcher on request.

13-Nov-2012 16:19:13 org.apache.solr.search.SolrIndexSearcher close

FINE: Closing Searcher@7cd47880 main

        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=7,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=1,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

        queryResultCache{lookups=4,hits=3,hitratio=0.75,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=4,cumulative_hits=3,cumulative_hitratio=0.75,cumulative_inserts=1,cumulative_evictions=0}

        documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

13-Nov-2012 16:19:13 org.apache.solr.core.CachingDirectoryFactory close

FINE: Closing: CachedDir<<org.apache.lucene.store.MMapDirectory@/solr2/cores/02_10_2012/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@717757ad;refCount=1;path=/solr2/cores/02_10_2012/data/index;done=false>>

13-Nov-2012 16:19:13 org.apache.solr.update.DirectUpdateHandler2 close

INFO: closing DirectUpdateHandler2{commits=0,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}

13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref

INFO: SolrCoreState ref count has reached 0 - closing IndexWriter

13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref

INFO: Closing SolrCoreState - canceling any ongoing recovery

13-Nov-2012 16:19:13 org.apache.solr.core.CoreContainer persistFile

INFO: Persisting cores config to /solr2/solr.xml

13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal

FINE: null solr/cores/@adminPath=/admin/cores

13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode

FINE: null missing optional solr/cores/@shareSchema

13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal

FINE: null solr/cores/@hostPort=9090

13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal

FINE: null solr/cores/@zkClientTimeout=10000

13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal

FINE: null solr/cores/@hostContext=solr

13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode

FINE: null missing optional solr/cores/@leaderVoteWait

13-Nov-2012 16:19:13 org.apache.solr.core.SolrXMLSerializer persistFile

INFO: Persisting cores config to /solr2/solr.xml

13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader updateClusterState

INFO: Updating cloud state from ZooKeeper...

13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader$2 process

INFO: A cluster state change has occurred - updating...



-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: 13 November 2012 14:13
To: solr-user@lucene.apache.org
Subject: Re: Removing Shards from Zookeeper - no servers hosting shard



Odd...the unload command should be enough...




--

- Mark

RE: Removing Shards from Zookeeper - no servers hosting shard

Posted by Gilles Comeau <gi...@polecat.co>.
When I unload the core through the UI, I see the messages below in the Solr log. Nothing appears in the ZooKeeper log.



Right after that I try http://217.147.83.124:9090/solr/experiment_master/select?q=*%3A*&wt=xml&distrib=true and get <str name="msg">no servers hosting shard:</str>. The shard is also still referenced in the cloud tab of the UI.
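
A minimal sketch of how that check could be scripted for a nightly job. It assumes the response body is piped in (for example from curl) and that the error text appears exactly as quoted above; the function name has_no_servers_error and the sample response are hypothetical, not part of Solr.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when a Solr response body read from
# stdin contains the "no servers hosting shard" error quoted in this thread.
has_no_servers_error() {
    grep -q 'no servers hosting shard'
}

# Illustrative response fragment, modelled on the error message above.
sample_response='<str name="msg">no servers hosting shard:</str>'

if printf '%s\n' "$sample_response" | has_no_servers_error; then
    echo "distributed query failed: shard still referenced"
else
    echo "distributed query ok"
fi

# Against a live server the body would come from curl. The URL is quoted so
# the shell does not treat '&' as a background operator:
#   curl -s "http://217.147.83.124:9090/solr/experiment_master/select?q=*%3A*&wt=xml&distrib=true" | has_no_servers_error
```

The function only inspects text, so it can be exercised without a running cluster; the curl line is left as a comment for that reason.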



[Attached screenshot: image001.png]



Does this work for anyone else using Solr 4.0 production with an external ZooKeeper and distributed queries? If so, can you let me know exactly which versions you run and which steps you take to avoid this error? ☺ Has anyone else had problems getting this to work?



My setup is pretty basic: a local external ZooKeeper 3.3.6 and Solr 4.0 with the three cores seen above.



Regards, Gilles



INFO: [02_10_2012_experiment]  CLOSING SolrCore org.apache.solr.core.SolrCore@11e3c2c6
13-Nov-2012 16:19:13 org.apache.solr.core.SolrCore closeSearcher
INFO: [02_10_2012_experiment] Closing main searcher on request.
13-Nov-2012 16:19:13 org.apache.solr.search.SolrIndexSearcher close
FINE: Closing Searcher@7cd47880 main
        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=7,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=1,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        queryResultCache{lookups=4,hits=3,hitratio=0.75,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=4,cumulative_hits=3,cumulative_hitratio=0.75,cumulative_inserts=1,cumulative_evictions=0}
        documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
13-Nov-2012 16:19:13 org.apache.solr.core.CachingDirectoryFactory close
FINE: Closing: CachedDir<<org.apache.lucene.store.MMapDirectory@/solr2/cores/02_10_2012/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@717757ad;refCount=1;path=/solr2/cores/02_10_2012/data/index;done=false>>
13-Nov-2012 16:19:13 org.apache.solr.update.DirectUpdateHandler2 close
INFO: closing DirectUpdateHandler2{commits=0,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: SolrCoreState ref count has reached 0 - closing IndexWriter
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: Closing SolrCoreState - canceling any ongoing recovery
13-Nov-2012 16:19:13 org.apache.solr.core.CoreContainer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@adminPath=/admin/cores
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@shareSchema
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostPort=9090
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@zkClientTimeout=10000
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostContext=solr
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@leaderVoteWait
13-Nov-2012 16:19:13 org.apache.solr.core.SolrXMLSerializer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader updateClusterState
INFO: Updating cloud state from ZooKeeper...
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader$2 process
INFO: A cluster state change has occurred - updating...
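
The last two log lines show Solr re-reading the cluster state from ZooKeeper after the unload. One way to confirm whether the unloaded core was really dropped from that state is to dump it and search for the core name. A minimal sketch, assuming the cluster state is kept at /clusterstate.json (as in Solr 4.0 SolrCloud), that each replica entry carries a "core":"<name>" pair, and that zkCli.sh is on the PATH. The function name, the ZooKeeper address and the sample JSON are hypothetical, and the real layout may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads cluster state JSON on stdin and succeeds (exit 0)
# if any entry names the given core. Whitespace is stripped first so that
# "core" : "x" and "core":"x" are treated alike, then a fixed-string match runs.
core_in_cluster_state() {
    local core="$1"
    tr -d '[:space:]' | grep -Fq "\"core\":\"${core}\""
}

# Illustrative cluster state; the collection, shard and node names are made up.
sample_state='{"experiment":{"shard1":{"host:9090_solr_02_10_2012_experiment":
  {"core":"02_10_2012_experiment","state":"active"}}}}'

if printf '%s' "$sample_state" | core_in_cluster_state "02_10_2012_experiment"; then
    echo "core still listed in cluster state"
else
    echo "core no longer listed"
fi

# Against a live ensemble (address is an assumption):
#   zkCli.sh -server 127.0.0.1:2181 get /clusterstate.json | core_in_cluster_state "02_10_2012_experiment"
```

If the core is still listed after an unload, distributed queries will presumably keep being routed to it, which would be consistent with the "no servers hosting shard" error.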



-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: 13 November 2012 14:13
To: solr-user@lucene.apache.org
Subject: Re: Removing Shards from Zookeeper - no servers hosting shard



Odd...the unload command should be enough...




Re: Removing Shards from Zookeeper - no servers hosting shard

Posted by Mark Miller <ma...@gmail.com>.
Odd...the unload command should be enough...
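
For reference, a minimal sketch of issuing the unload from a script. It assumes the CoreAdmin handler lives at /solr/admin/cores, as in the script quoted in this thread, and that core names need no URL encoding; the function names and the example host are hypothetical. The URL is passed to curl in double quotes: left unquoted, the shell would treat '&' as a background operator and curl would never see the core parameter.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the CoreAdmin UNLOAD URL for a host:port and core.
unload_url() {
    local host_port="$1" core="$2"
    printf 'http://%s/solr/admin/cores?action=UNLOAD&core=%s\n' "$host_port" "$core"
}

# Hypothetical wrapper: sends the request. The command substitution is quoted
# so the whole URL, including '&core=...', reaches curl as one argument.
unload_core() {
    curl -s "$(unload_url "$1" "$2")"
}

# Print the URL that would be requested (no network access needed).
unload_url "127.0.0.1:8080" "02_10_2012_experiment"

# To actually unload the core:
#   unload_core "127.0.0.1:8080" "02_10_2012_experiment"
```

Keeping URL construction separate from the request makes it easy to log or dry-run what the nightly job would send before it touches the cluster.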

On Tue, Nov 13, 2012 at 5:26 AM, Gilles Comeau <gi...@polecat.co> wrote:
> Hi all,
>
> We've just updated to SOLR 4.0 production and Zookeeper 3.3.6 from SOLR 4.0 development version circa November 2011.  We keep 6 months of data online in our primary cluster, and archive off old stuff to a slower disk archive cluster.   We used to remove SOLR cores with the following code, but everything has changed in Zookeeper now.
>
> Old code to remove cores from Zookeeper:
>
>
> curl http://127.0.0.1:8080/solr/admin/cores?action=UNLOAD&core=${SHARD}
>
>         echo "Removing indexes from all Zookeeper hosts"
>         for (( i=0; i<${#ZK_HOSTS[*]}; i++ ))
>         do
>                 $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD/$HOSTNAME:8080_solr_$SHARD
>                 $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD
>         Done
>
> curl http://solrmaster01:8080/solr/admin/cores?action=RELOAD&core=master
>
> Now that we have migrated, I have tried removing cores from Zookeeper by removing the stuff for the unloaded core in "leaders" and "leader_elect", but for some reason SOLR keeps sending the requests to the shard, and I end up with the "no servers hosting shard" error.
>
> Does anyone know how to remove a SOLR core from a SOLR server and have Zookeeper updated, and have distributed queries still work?   The only thing I know how to do now is stop tomcat, stop zookeeper, clear out the data directory and then restart both.   This isn't really ideal for a process I'd like to have running each night, and surely it is something others have it.  I've tried google searching, and what I find is references to the bug where solr notifies zookeeper on core unloads which is marked as fixed, and people talking about how it doesn't work but if your run reloads on each core, it will work.  (also doesn't work when I do it)
>
> Regards,
>
> Gilles Comeau



-- 
- Mark