Posted to solr-user@lucene.apache.org by Gilles Comeau <gi...@polecat.co> on 2012/11/13 11:26:21 UTC
Removing Shards from Zookeeper - no servers hosting shard
Hi all,
We've just updated to SOLR 4.0 production and Zookeeper 3.3.6 from SOLR 4.0 development version circa November 2011. We keep 6 months of data online in our primary cluster, and archive off old stuff to a slower disk archive cluster. We used to remove SOLR cores with the following code, but everything has changed in Zookeeper now.
Old code to remove cores from Zookeeper:
# Quote the URL so the shell does not treat "&" as a background operator
curl "http://127.0.0.1:8080/solr/admin/cores?action=UNLOAD&core=${SHARD}"
echo "Removing indexes from all Zookeeper hosts"
for (( i=0; i<${#ZK_HOSTS[*]}; i++ ))
do
  # Delete the child node first; ZooKeeper will not delete a node that still has children
  $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD/$HOSTNAME:8080_solr_$SHARD
  $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD
done
curl "http://solrmaster01:8080/solr/admin/cores?action=RELOAD&core=master"
Now that we have migrated, I have tried removing cores from Zookeeper by removing the stuff for the unloaded core in "leaders" and "leader_elect", but for some reason SOLR keeps sending the requests to the shard, and I end up with the "no servers hosting shard" error.
Does anyone know how to remove a SOLR core from a SOLR server and have Zookeeper updated, and have distributed queries still work? The only thing I know how to do now is stop tomcat, stop zookeeper, clear out the data directory and then restart both. This isn't really ideal for a process I'd like to have running each night, and surely it is something others have run into. I've tried Google searching, and what I find is references to the bug where solr notifies zookeeper on core unloads, which is marked as fixed, and people talking about how it doesn't work, but if you run reloads on each core, it will work. (That also doesn't work when I do it.)
Regards,
Gilles Comeau
Re: Removing Shards from Zookeeper - no servers hosting shard
Posted by Mark Miller <ma...@gmail.com>.
Missed the list in my last reply:
This used to work properly - I'm guessing that the zk layout refactoring right before 4.0 broke it. We likely need a JIRA issue, a fix, and a test.
Mark
On Nov 14, 2012, at 6:43 AM, Gilles Comeau <gi...@polecat.co> wrote:
> Hi all,
>
> I just wanted to make the simplest repro of this issue, which now I am thinking might be related to the decision made in: https://issues.apache.org/jira/browse/SOLR-3080 ? And this is the expected behaviour?
>
> 1. Download SOLR 4 production and extract.
> 2. Replace solr.xml in apache-solr-4.0.0/example/solr/solr.xml with:
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <solr persistent="true">
> <cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
> <core shard="shard1" instanceDir="collection1/" name="collection1" collection="polecat"/>
> <core shard="shard1" instanceDir="collection2/" name="collection2" collection="polecat"/>
> <core schema="schema.xml" shard="core3" instanceDir="core3/" name="core3" config="solrconfig.xml" collection="polecat" dataDir="data"/>
> </cores>
> </solr>
>
> 3. Start solr with: java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -Dsolrcloud.skip.autorecovery=true -jar start.jar
> (skip.autorecovery is used because the shards don't exist previously)
>
> Then run this:
> Sanity query: http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
> Remove the core: http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core3&deleteIndex=true
> Error query: http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
>
> For the sanity query we receive 0 records; for the error query we get "no servers hosting shard:". And in clusterstate.json: "core3":{"replicas":{}}}}
>
> Regards,
>
> Gilles
RE: Removing Shards from Zookeeper - no servers hosting shard
Posted by Gilles Comeau <gi...@polecat.co>.
Hi all,
I just wanted to make the simplest repro of this issue, which now I am thinking might be related to the decision made in: https://issues.apache.org/jira/browse/SOLR-3080 ? And this is the expected behaviour?
1. Download SOLR 4 production and extract.
2. Replace solr.xml in apache-solr-4.0.0/example/solr/solr.xml with:
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
<cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
<core shard="shard1" instanceDir="collection1/" name="collection1" collection="polecat"/>
<core shard="shard1" instanceDir="collection2/" name="collection2" collection="polecat"/>
<core schema="schema.xml" shard="core3" instanceDir="core3/" name="core3" config="solrconfig.xml" collection="polecat" dataDir="data"/>
</cores>
</solr>
3. Start solr with: java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -Dsolrcloud.skip.autorecovery=true -jar start.jar
(skip.autorecovery is used because the shards don't exist previously)
Then run this:
Sanity query: http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
Remove the core: http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core3&deleteIndex=true
Error query: http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
For the sanity query we receive 0 records; for the error query we get "no servers hosting shard:". And in clusterstate.json: "core3":{"replicas":{}}}}
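The repro steps above could be scripted roughly as follows. This is only a sketch that assumes the stock example instance on localhost:8983; the SOLR_BASE variable and the function names select_url, unload_url and run_repro are illustrative and do not come from Solr. The block only defines functions, so nothing is sent until run_repro is called.

```shell
#!/usr/bin/env bash
# Sketch of the repro above. SOLR_BASE and all function names are
# hypothetical helpers, not part of Solr.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr}"

# Build the distributed select URL for a collection ($1).
select_url() {
  printf '%s/%s/select?q=*%%3A*&wt=xml&distrib=true' "$SOLR_BASE" "$1"
}

# Build the CoreAdmin UNLOAD URL for a core ($1), deleting its index.
unload_url() {
  printf '%s/admin/cores?action=UNLOAD&core=%s&deleteIndex=true' "$SOLR_BASE" "$1"
}

# Run the three requests in order: $1 = collection, $2 = core to unload.
run_repro() {
  echo "--- sanity query"
  curl -s "$(select_url "$1")"
  echo "--- unload $2"
  curl -s "$(unload_url "$2")"
  echo "--- query after unload"
  curl -s "$(select_url "$1")"
}
```

With the example instance started as in step 3, calling `run_repro polecat core3` should issue the same three requests as listed above.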
Regards,
Gilles
-----Original Message-----
From: Gilles Comeau [mailto:gilles.comeau@polecat.co]
Sent: 13 November 2012 16:39
To: solr-user@lucene.apache.org; markrmiller@gmail.com
Subject: RE: Removing Shards from Zookeeper - no servers hosting shard
Sorry, I forgot that pictures are no good. Here is the same information from clusterstate.json; the shard of the core I unloaded sticks around: “"solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}”
Do I need a special command to delete the shard or something? I’ve never seen a command that does that?
Regards, Gilles
"experiment":{
"solrexperiment:8080_solr_experiment_master":{"replicas":{"IS-17093:9090_solr_experiment_master":{
"shard":"solrexperiment:8080_solr_experiment_master",
"roles":null, "state":"active","core":"experiment_master","collection":"experiment","node_name":"IS-17093:9090_solr","base_url":"http://IS-17093:9090/solr","leader":"true"}}},
"solrexperiment:8080_solr_experiment_01_10_2012":{"replicas":{"IS-17093:9090_solr_01_10_2012_experiment":{
"shard":"solrexperiment:8080_solr_experiment_01_10_2012","roles":null,"state":"active","core":"01_10_2012_experiment",
"collection":"experiment","node_name":"IS-17093:9090_solr","base_url":"http://IS-17093:9090/solr","leader":"true"}}},
"solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}
RE: Removing Shards from Zookeeper - no servers hosting shard
Posted by Gilles Comeau <gi...@polecat.co>.
Sorry, I forgot that pictures are no good. Here is the same information from clusterstate.json; the shard of the core I unloaded sticks around: “"solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}”
Do I need a special command to delete the shard or something? I’ve never seen a command that does that?
Regards, Gilles
"experiment":{
"solrexperiment:8080_solr_experiment_master":{"replicas":{"IS-17093:9090_solr_experiment_master":{
"shard":"solrexperiment:8080_solr_experiment_master",
"roles":null, "state":"active","core":"experiment_master","collection":"experiment","node_name":"IS-17093:9090_solr","base_url":"http://IS-17093:9090/solr","leader":"true"}}},
"solrexperiment:8080_solr_experiment_01_10_2012":{"replicas":{"IS-17093:9090_solr_01_10_2012_experiment":{
"shard":"solrexperiment:8080_solr_experiment_01_10_2012","roles":null,"state":"active","core":"01_10_2012_experiment",
"collection":"experiment","node_name":"IS-17093:9090_solr","base_url":"http://IS-17093:9090/solr","leader":"true"}}},
"solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}
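To spot leftover shards like the last entry above without reading the JSON by eye, something like the following might help. It is a rough sketch: the function name empty_shards is made up for illustration, it does plain text matching rather than real JSON parsing, and it assumes shard names contain no whitespace or double quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read cluster state JSON on stdin and print the
# name of every shard whose "replicas" map is empty.
empty_shards() {
  # Strip whitespace so pretty-printed and compact JSON look the same,
  # then pick out entries of the form "<name>":{"replicas":{}}
  tr -d ' \t\r\n' \
    | grep -o '"[^"]*":{"replicas":{}}' \
    | sed 's/^"\([^"]*\)".*/\1/'
}
```

For example, `empty_shards < clusterstate.json` on a saved copy of the state shown above would be expected to print the 02_10_2012 shard only.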
From: Gilles Comeau [mailto:gilles.comeau@polecat.co]
Sent: 13 November 2012 16:29
To: solr-user@lucene.apache.org; markrmiller@gmail.com
Subject: RE: Removing Shards from Zookeeper - no servers hosting shard
When I do the unload through the UI, I see the below messages in the solr log. Nothing in the zookeeper log.
Then right after I try: http://217.147.83.124:9090/solr/experiment_master/select?q=*%3A*&wt=xml&distrib=true and get <str name="msg">no servers hosting shard:</str>. Also, I still see the shard being referenced in the cloud tab in the UI.
Does this work for anyone else using SOLR 4.0 production with external zookeeper and distributed queries and if so, can you let me know exactly what versions and steps you take to not get this error? ☺ Anyone else have any problems getting this to work?
My setup is pretty basic: Local external zookeeper 3.3.6, solr 4.0 with three cores seen above.
Regards, Gilles
INFO: [02_10_2012_experiment] CLOSING SolrCore org.apache.solr.core.SolrCore@11e3c2c6
13-Nov-2012 16:19:13 org.apache.solr.core.SolrCore closeSearcher
INFO: [02_10_2012_experiment] Closing main searcher on request.
13-Nov-2012 16:19:13 org.apache.solr.search.SolrIndexSearcher close
FINE: Closing Searcher@7cd47880 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=7,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=1,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
queryResultCache{lookups=4,hits=3,hitratio=0.75,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=4,cumulative_hits=3,cumulative_hitratio=0.75,cumulative_inserts=1,cumulative_evictions=0}
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
13-Nov-2012 16:19:13 org.apache.solr.core.CachingDirectoryFactory close
FINE: Closing: CachedDir<<org.apache.lucene.store.MMapDirectory@/solr2/cores/02_10_2012/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@717757ad;refCount=1;path=/solr2/cores/02_10_2012/data/index;done=false>>
13-Nov-2012 16:19:13 org.apache.solr.update.DirectUpdateHandler2 close
INFO: closing DirectUpdateHandler2{commits=0,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: SolrCoreState ref count has reached 0 - closing IndexWriter
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: Closing SolrCoreState - canceling any ongoing recovery
13-Nov-2012 16:19:13 org.apache.solr.core.CoreContainer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@adminPath=/admin/cores
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@shareSchema
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostPort=9090
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@zkClientTimeout=10000
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostContext=solr
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@leaderVoteWait
13-Nov-2012 16:19:13 org.apache.solr.core.SolrXMLSerializer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader updateClusterState
INFO: Updating cloud state from ZooKeeper...
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader$2 process
INFO: A cluster state change has occurred - updating...
-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: 13 November 2012 14:13
To: solr-user@lucene.apache.org
Subject: Re: Removing Shards from Zookeeper - no servers hosting shard
Odd...the unload command should be enough...
--
- Mark
RE: Removing Shards from Zookeeper - no servers hosting shard
Posted by Gilles Comeau <gi...@polecat.co>.
When I do the unload through the UI, I see the below messages in the solr log. Nothing in the zookeeper log.
Then right after I try: http://217.147.83.124:9090/solr/experiment_master/select?q=*%3A*&wt=xml&distrib=true and get <str name="msg">no servers hosting shard:</str>. Also, I still see the shard being referenced in the cloud tab in the UI.
Does this work for anyone else running the Solr 4.0 production release with an external ZooKeeper and distributed queries? If so, can you let me know exactly which versions you use and which steps you take to avoid this error? ☺ Has anyone else had problems getting this to work?
My setup is pretty basic: a local external ZooKeeper 3.3.6 and Solr 4.0 with the three cores seen above.
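For anyone trying to reproduce this, the unload-then-query sequence can be scripted roughly as follows. This is only a sketch: SOLR_BASE defaults to a placeholder local address, CORE and QUERY_CORE are hypothetical variables you would set to the core being unloaded and the core being queried, and the helper function names are made up for illustration. The CoreAdmin UNLOAD action and the select URL are the same ones used in this thread. Note that the URLs are quoted when passed to curl so the shell does not treat "&" as a background operator.

```shell
#!/bin/sh
# Sketch: unload a core, then check whether a distributed query still works.
# SOLR_BASE, CORE and QUERY_CORE are placeholders (assumptions), not fixed names.

SOLR_BASE="${SOLR_BASE:-http://127.0.0.1:9090/solr}"

# Build the CoreAdmin UNLOAD URL for the given core name.
unload_url() {
  printf '%s/admin/cores?action=UNLOAD&core=%s\n' "$SOLR_BASE" "$1"
}

# Build a distributed match-all query URL against the given core.
select_url() {
  printf '%s/%s/select?q=*%%3A*&wt=xml&distrib=true\n' "$SOLR_BASE" "$1"
}

# Return 0 unless the response body contains the error from this thread.
response_ok() {
  case "$1" in
    *"no servers hosting shard"*) return 1 ;;
    *) return 0 ;;
  esac
}

# Only touch the network when RUN=1 is set explicitly.
if [ "${RUN:-0}" = 1 ]; then
  curl -s "$(unload_url "$CORE")" >/dev/null
  body="$(curl -s "$(select_url "$QUERY_CORE")")"
  if response_ok "$body"; then
    echo "distributed query OK"
  else
    echo "query is still routed to the unloaded shard" >&2
    exit 1
  fi
fi
```

Run it as, for example, RUN=1 CORE=02_10_2012_experiment QUERY_CORE=experiment_master sh check_unload.sh (the script file name is arbitrary).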
Regards, Gilles
INFO: [02_10_2012_experiment] CLOSING SolrCore org.apache.solr.core.SolrCore@11e3c2c6
13-Nov-2012 16:19:13 org.apache.solr.core.SolrCore closeSearcher
INFO: [02_10_2012_experiment] Closing main searcher on request.
13-Nov-2012 16:19:13 org.apache.solr.search.SolrIndexSearcher close
FINE: Closing Searcher@7cd47880 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=7,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=1,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
queryResultCache{lookups=4,hits=3,hitratio=0.75,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=4,cumulative_hits=3,cumulative_hitratio=0.75,cumulative_inserts=1,cumulative_evictions=0}
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
13-Nov-2012 16:19:13 org.apache.solr.core.CachingDirectoryFactory close
FINE: Closing: CachedDir<<org.apache.lucene.store.MMapDirectory@/solr2/cores/02_10_2012/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@717757ad;refCount=1;path=/solr2/cores/02_10_2012/data/index;done=false>>
13-Nov-2012 16:19:13 org.apache.solr.update.DirectUpdateHandler2 close
INFO: closing DirectUpdateHandler2{commits=0,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: SolrCoreState ref count has reached 0 - closing IndexWriter
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: Closing SolrCoreState - canceling any ongoing recovery
13-Nov-2012 16:19:13 org.apache.solr.core.CoreContainer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@adminPath=/admin/cores
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@shareSchema
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostPort=9090
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@zkClientTimeout=10000
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostContext=solr
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@leaderVoteWait
13-Nov-2012 16:19:13 org.apache.solr.core.SolrXMLSerializer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader updateClusterState
INFO: Updating cloud state from ZooKeeper...
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader$2 process
INFO: A cluster state change has occurred - updating...
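The last log lines show the node re-reading cloud state from ZooKeeper, so one way to see whether the unload actually changed anything is to look at that state directly. A minimal sketch, assuming the cluster state is stored in /clusterstate.json (as I understand it to be in Solr 4.0) and that the ZooKeeper command-line client is available as zkCli.sh. ZK_HOST, ZKCLI, CORE and the function name are placeholders for illustration, and the check is a plain substring match, so a core name that is a prefix of another core name will also match.

```shell
#!/bin/sh
# Sketch: check whether an unloaded core is still listed in the cluster state.
# Assumptions: ZK_HOST, ZKCLI and CORE are placeholders; the state is read
# from /clusterstate.json, which may differ in other Solr versions or setups.

ZK_HOST="${ZK_HOST:-127.0.0.1:2181}"
ZKCLI="${ZKCLI:-zkCli.sh}"

# Return 0 if the core name ($2) appears anywhere in the state text ($1).
core_in_state() {
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only contact ZooKeeper when RUN=1 is set explicitly.
if [ "${RUN:-0}" = 1 ]; then
  state="$("$ZKCLI" -server "$ZK_HOST" get /clusterstate.json 2>/dev/null)"
  if core_in_state "$state" "$CORE"; then
    echo "$CORE is still present in /clusterstate.json"
  else
    echo "$CORE is no longer listed"
  fi
fi
```

If the core is still listed after the UNLOAD has returned, that would be consistent with queries still being routed to the shard.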
-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: 13 November 2012 14:13
To: solr-user@lucene.apache.org
Subject: Re: Removing Shards from Zookeeper - no servers hosting shard
Odd...the unload command should be enough...
On Tue, Nov 13, 2012 at 5:26 AM, Gilles Comeau <gi...@polecat.co> wrote:
> Hi all,
>
> We've just updated to SOLR 4.0 production and Zookeeper 3.3.6 from SOLR 4.0 development version circa November 2011. We keep 6 months of data online in our primary cluster, and archive off old stuff to a slower disk archive cluster. We used to remove SOLR cores with the following code, but everything has changed in Zookeeper now.
>
> Old code to remove cores from Zookeeper:
>
>
> curl http://127.0.0.1:8080/solr/admin/cores?action=UNLOAD&core=${SHARD}
>
> echo "Removing indexes from all Zookeeper hosts"
> for (( i=0; i<${#ZK_HOSTS[*]}; i++ ))
> do
> $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD/$HOSTNAME:8080_solr_$SHARD
> $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD
> Done
>
> curl http://solrmaster01:8080/solr/admin/cores?action=RELOAD&core=master
>
> Now that we have migrated, I have tried removing cores from Zookeeper by removing the stuff for the unloaded core in "leaders" and "leader_elect", but for some reason SOLR keeps sending the requests to the shard, and I end up with the "no servers hosting shard" error.
>
> Does anyone know how to remove a SOLR core from a SOLR server and have Zookeeper updated, and have distributed queries still work? The only thing I know how to do now is stop tomcat, stop zookeeper, clear out the data directory and then restart both. This isn't really ideal for a process I'd like to have running each night, and surely it is something others have it. I've tried google searching, and what I find is references to the bug where solr notifies zookeeper on core unloads which is marked as fixed, and people talking about how it doesn't work but if your run reloads on each core, it will work. (also doesn't work when I do it)
>
> Regards,
>
> Gilles Comeau
--
- Mark
Re: Removing Shards from Zookeeper - no servers hosting shard
Posted by Mark Miller <ma...@gmail.com>.
Odd...the unload command should be enough...
On Tue, Nov 13, 2012 at 5:26 AM, Gilles Comeau <gi...@polecat.co> wrote:
> Hi all,
>
> We've just updated to SOLR 4.0 production and Zookeeper 3.3.6 from SOLR 4.0 development version circa November 2011. We keep 6 months of data online in our primary cluster, and archive off old stuff to a slower disk archive cluster. We used to remove SOLR cores with the following code, but everything has changed in Zookeeper now.
>
> Old code to remove cores from Zookeeper:
>
>
> curl http://127.0.0.1:8080/solr/admin/cores?action=UNLOAD&core=${SHARD}
>
> echo "Removing indexes from all Zookeeper hosts"
> for (( i=0; i<${#ZK_HOSTS[*]}; i++ ))
> do
> $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD/$HOSTNAME:8080_solr_$SHARD
> $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete /collections/polecat/shards/solrenglish:8080_solr_$SHARD
> Done
>
> curl http://solrmaster01:8080/solr/admin/cores?action=RELOAD&core=master
>
> Now that we have migrated, I have tried removing cores from Zookeeper by removing the stuff for the unloaded core in "leaders" and "leader_elect", but for some reason SOLR keeps sending the requests to the shard, and I end up with the "no servers hosting shard" error.
>
> Does anyone know how to remove a SOLR core from a SOLR server and have Zookeeper updated, and have distributed queries still work? The only thing I know how to do now is stop tomcat, stop zookeeper, clear out the data directory and then restart both. This isn't really ideal for a process I'd like to have running each night, and surely it is something others have it. I've tried google searching, and what I find is references to the bug where solr notifies zookeeper on core unloads which is marked as fixed, and people talking about how it doesn't work but if your run reloads on each core, it will work. (also doesn't work when I do it)
>
> Regards,
>
> Gilles Comeau
--
- Mark