Posted to solr-user@lucene.apache.org by deniz <de...@gmail.com> on 2012/11/19 07:21:56 UTC

SolrCloud Error after leader restarts

Hello,

For test purposes, I am running two ZooKeeper instances on ports 2181 and 2182, and I
have two Solr instances running on different machines.

For the one that runs on my local machine and acts as leader:
java -Dbootstrap_conf=true -DzkHost=localhost:2181 -jar start.jar

And for the one that acts as follower, on a remote machine:
java -Djetty.port=7574 -DzkHost=<address-of-mylocal>:2182 -jar start.jar

Up to this point everything runs smoothly, and I can see the configs on both
ZooKeeper hosts when I connect with zkCli.sh.
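
A note on the ZooKeeper addresses: if the two ZooKeeper servers are members of one ensemble (an assumption; the post does not say so), the zkHost parameter can list every member, comma-separated, instead of a single one, so a Solr node is not tied to one ZooKeeper server. A minimal sketch follows; join_zk_hosts is a hypothetical helper, and localhost is only correct on the machine that runs ZooKeeper.

```shell
#!/bin/sh
# Hypothetical helper: join host:port pairs into the comma-separated
# form that -DzkHost accepts.
join_zk_hosts() {
  ( IFS=,; printf '%s\n' "$*" )
}

# Assumption: both ZooKeeper servers run on the leader's machine and
# form one ensemble. The remote follower would use that machine's
# address in place of "localhost".
ZK_HOST="$(join_zk_hosts localhost:2181 localhost:2182)"

# Print the start command rather than running it, so it can be checked first.
echo "java -Dbootstrap_conf=true -DzkHost=$ZK_HOST -jar start.jar"
```

The printed command is the leader command from the post with only the zkHost value changed.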

Just to see what happens and to check recovery, I killed the Solr instance
running on my local machine and tried to index some files through the
follower, which failed. This is normal, as writes are routed to the
leader.

The point that I don't understand is this:

When I restart the leader with the same command in the terminal, after the
normal logs it starts showing this:


Nov 19, 2012 2:15:18 PM org.apache.solr.common.SolrException log
SEVERE: SnapPull failed :org.apache.solr.common.SolrException: Index fetch failed : 
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:400)
	at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:297)
	at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:151)
	at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:405)
	at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:220)
Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in org.apache.lucene.store.RAMDirectory@1e75e89 lockFactory=org.apache.lucene.store.NativeFSLockFactory@128e909: files: []
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:741)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:630)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:343)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:639)
	at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:75)
	at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:62)
	at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:191)
	at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:77)
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:354)
	... 4 more

Nov 19, 2012 2:15:18 PM org.apache.solr.common.SolrException log
SEVERE: Error while trying to recover:org.apache.solr.common.SolrException: Replication for recovery failed.
	at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:154)
	at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:405)
	at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:220)


It fails to recover after the shutdown. Why does this happen?
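
A note on reading the trace: the "Caused by" line is the informative part. It names the Lucene Directory implementation that was opened and lists the files it holds; "files: []" means the directory is empty. A small sketch for pulling the directory class out of a saved log is shown here. The helper name directory_in_trace and the temporary log file are hypothetical, and the pattern only matches messages worded like the one in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the Lucene Directory class named in an
# IndexNotFoundException message ("... found in <class>@<hash> ...").
directory_in_trace() {
  sed -n 's/.*found in \(org\.apache\.lucene\.store\.[A-Za-z]*\)@.*/\1/p' "$1"
}

# Example input: the message from this thread, written to a temporary file.
LOG_FILE="$(mktemp)"
cat > "$LOG_FILE" <<'EOF'
Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file
found in org.apache.lucene.store.RAMDirectory@1e75e89
lockFactory=org.apache.lucene.store.NativeFSLockFactory@128e909: files: []
EOF

DIR_CLASS="$(directory_in_trace "$LOG_FILE")"
echo "$DIR_CLASS"   # prints org.apache.lucene.store.RAMDirectory
rm -f "$LOG_FILE"
```

A RAMDirectory in that position means the index lived only in memory, which is consistent with it being empty after a restart.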


 



-----
Smart, but doesn't work... If he worked, he could do it...
--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Error-after-leader-restarts-tp4020985.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud Error after leader restarts

Posted by deniz <de...@gmail.com>.
Mark Miller-3 wrote
> On Nov 19, 2012, at 9:11 PM, deniz <denizdurmus87@> wrote:
> 
>> so in case i use ramdir with 5x cloud, it will still not do the recovery? i
>> mean it will not get the data from the leader and fill its ramdir again?
> 
> Yes, in 5x RAM directory should be able to recover.
> 
> - Mark

Thank you so much for your patience with me :)



View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Error-after-leader-restarts-tp4020985p4021209.html

Re: SolrCloud Error after leader restarts

Posted by Mark Miller <ma...@gmail.com>.
On Nov 19, 2012, at 9:11 PM, deniz <de...@gmail.com> wrote:

> so in case i use ramdir with 5x cloud, it will still not do the recovery? i
> mean it will not get the data from the leader and fill its ramdir again?

Yes, in 5x a RAM directory should be able to recover.

- Mark

Re: SolrCloud Error after leader restarts

Posted by deniz <de...@gmail.com>.
I know the facts about RAMDirectory, actually. I am just running some perf
tests on our dev env right now.

So if I use a RAM directory with 5x SolrCloud, will it still not do the
recovery? I mean, will it not get the data from the leader and fill its RAM
directory again?



View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Error-after-leader-restarts-tp4020985p4021203.html

Re: SolrCloud Error after leader restarts

Posted by Mark Miller <ma...@gmail.com>.
It's generally not a good choice to use a RAM directory.

No, 4x SolrCloud does not work with it; 5x does. In any case, a RAM directory is not persistent, so when you restart Solr you will lose the data.

MMap is generally the right directory to use.

- Mark
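
For reference, one way to try the MMap directory without editing files is to pass the factory class as a system property. This is only a sketch under an assumption: the core's solrconfig.xml must take its directory factory class from the solr.directoryFactory property, as the stock Solr 4 example config does. A config that hard-codes a RAM directory factory would ignore the property. The rest of the command is the leader command from the original post.

```shell
#!/bin/sh
# Assumption: solrconfig.xml contains
#   <directoryFactory name="DirectoryFactory"
#                     class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
# If the class is hard-coded (for example to solr.RAMDirectoryFactory),
# change that attribute instead; the system property is then ignored.
DIR_FACTORY="solr.MMapDirectoryFactory"
SOLR_CMD="java -Dsolr.directoryFactory=$DIR_FACTORY -Dbootstrap_conf=true -DzkHost=localhost:2181 -jar start.jar"

# Print the command for review; run it from the Solr example directory.
echo "$SOLR_CMD"
```

With an on-disk directory the index survives a restart, so a restarted node has a local index to start from.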



Re: SolrCloud Error after leader restarts

Posted by deniz <de...@gmail.com>.
Yes, I am using RAM.

Does SolrCloud not work with a RAM directory?



View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Error-after-leader-restarts-tp4020985p4021194.html

Re: SolrCloud Error after leader restarts

Posted by Mark Miller <ma...@gmail.com>.
You're using a RAM dir?
