Posted to solr-user@lucene.apache.org by sling <sl...@gmail.com> on 2013/10/21 11:20:51 UTC

how to avoid recover? how to ensure a recover success?

Hi, guys:

I run an online application on SolrCloud 4.1, but I get syncpeer errors
every 2 or 3 weeks...
As I understand it, a recovery occurs when a replica cannot sync its data
with its leader successfully.

I have seen the thread
http://lucene.472066.n3.nabble.com/SolrCloud-5x-Errors-while-recovering-td4022542.html
and https://issues.apache.org/jira/i#browse/SOLR-4032, but why do I still
get similar errors in SolrCloud 4.1?

So, are there any settings for syncpeer?
How can I reduce the probability of this error?
When a recovery happens, how can I ensure that it succeeds?



The errors I get look like this:
[2013.10.21 10:39:13.482]2013-10-21 10:39:13,482 WARN
[org.apache.solr.handler.SnapPuller] - Error in fetching packets 
[2013.10.21 10:39:13.482]java.io.EOFException
[2013.10.21 10:39:13.482]       at
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
[2013.10.21 10:39:13.482]       at
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
[2013.10.21 10:39:13.482]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1136)
[2013.10.21 10:39:13.482]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1099)
[2013.10.21 10:39:13.482]       at
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:738)
[2013.10.21 10:39:13.482]       at
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:395)
[2013.10.21 10:39:13.482]       at
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:274)
[2013.10.21 10:39:13.482]       at
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:153)
[2013.10.21 10:39:13.482]       at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:409)
[2013.10.21 10:39:13.482]       at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:223)
[2013.10.21 10:39:13.485]2013-10-21 10:39:13,485 WARN
[org.apache.solr.handler.SnapPuller] - Error in fetching packets 
[2013.10.21 10:39:13.485]java.io.EOFException
[2013.10.21 10:39:13.485]       at
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
[2013.10.21 10:39:13.485]       at
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
[2013.10.21 10:39:13.485]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1136)
[2013.10.21 10:39:13.485]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1099)
[2013.10.21 10:39:13.485]       at
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:738)
[2013.10.21 10:39:13.485]       at
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:395)
[2013.10.21 10:39:13.485]       at
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:274)
[2013.10.21 10:39:13.485]       at
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:153)
[2013.10.21 10:39:13.485]       at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:409)
[2013.10.21 10:39:13.485]       at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:223)
[2013.10.21 10:41:08.461]2013-10-21 10:41:08,461 ERROR
[org.apache.solr.handler.ReplicationHandler] - SnapPull failed
:org.apache.solr.common.SolrException: Unable to download
_fi05_Lucene41_0.pos completely. Downloaded 0!=1485
[2013.10.21 10:41:08.461]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1230)
[2013.10.21 10:41:08.461]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1110)
[2013.10.21 10:41:08.461]       at
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:738)
[2013.10.21 10:41:08.461]       at
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:395)
[2013.10.21 10:41:08.461]       at
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:274)
[2013.10.21 10:41:08.461]       at
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:153)
[2013.10.21 10:41:08.461]       at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:409)
[2013.10.21 10:41:08.461]       at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:223)
[2013.10.21 10:41:08.461]
[2013.10.21 10:41:08.461]2013-10-21 10:41:08,461 ERROR
[org.apache.solr.cloud.RecoveryStrategy] - Error while trying to
recover:org.apache.solr.common.SolrException: Replication for recovery
failed.
[2013.10.21 10:41:08.461]       at
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:156)
[2013.10.21 10:41:08.461]       at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:409)
[2013.10.21 10:41:08.461]       at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:223)
[2013.10.21 10:41:08.461]
[2013.10.21 10:41:08.555]2013-10-21 10:41:08,462 ERROR
[org.apache.solr.handler.ReplicationHandler] - SnapPull failed
:org.apache.solr.common.SolrException: Unable to download _2mgr_gh.del
completely. Downloaded 0!=13971
[2013.10.21 10:41:08.555]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1230)
[2013.10.21 10:41:08.555]       at
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1110)
[2013.10.21 10:41:08.555]       at
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:738)
[2013.10.21 10:41:08.555]       at
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:395)
[2013.10.21 10:41:08.555]       at
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:274)
[2013.10.21 10:41:08.555]       at
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:153)
[2013.10.21 10:41:08.555]       at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:409)
[2013.10.21 10:41:08.555]       at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:223)
[2013.10.21 10:41:08.555]
[2013.10.21 10:41:08.556]2013-10-21 10:41:08,556 ERROR
[org.apache.solr.cloud.RecoveryStrategy] - Error while trying to
recover:org.apache.solr.common.SolrException: Replication for recovery
failed.
[2013.10.21 10:41:08.556]       at
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:156)
[2013.10.21 10:41:08.556]       at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:409)
[2013.10.21 10:41:08.556]       at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:223)
[2013.10.21 10:41:08.556]
[2013.10.21 10:41:08.574]2013-10-21 10:41:08,574 ERROR
[org.apache.solr.cloud.RecoveryStrategy] - Recovery failed - trying again...
(0) core=cgindex
[2013.10.21 10:41:08.650]2013-10-21 10:41:08,574 ERROR
[org.apache.solr.cloud.RecoveryStrategy] - Recovery failed - trying again...
(0) core=doc
[2013.10.21 10:41:37.548]2013-10-21 10:41:37,548 WARN
[org.apache.solr.update.UpdateLog] - Starting log replay
tlog{file=/home/app/workspace/cmsindex/solr/solrhome/pic/data/tlog/tlog.0000000000000014967
refcount=2} active=true starting pos=0
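
For anyone triaging a log like the one above, here is a minimal sketch (not part of Solr) that pulls out which index files failed to download and how many bytes arrived. It assumes the log is available as plain text and that the message keeps the "Unable to download ... completely. Downloaded X!=Y" wording shown above; `failed_downloads` is a hypothetical helper name.

```python
import re

# Matches the SnapPuller failure message quoted in the log above.
# \s+ tolerates the line wrapping introduced by the mail archive.
FAILED_DOWNLOAD = re.compile(
    r"Unable\s+to\s+download\s+(\S+)\s+completely\.\s+"
    r"Downloaded\s+(\d+)\s*!=\s*(\d+)"
)


def failed_downloads(log_text):
    """Return (file_name, bytes_downloaded, bytes_expected) per failure."""
    return [
        (name, int(got), int(expected))
        for name, got, expected in FAILED_DOWNLOAD.findall(log_text)
    ]


# Two messages copied from the log above, wrapped as they appear there.
sample = """\
[org.apache.solr.handler.ReplicationHandler] - SnapPull failed
:org.apache.solr.common.SolrException: Unable to download
_fi05_Lucene41_0.pos completely. Downloaded 0!=1485
[org.apache.solr.handler.ReplicationHandler] - SnapPull failed
:org.apache.solr.common.SolrException: Unable to download _2mgr_gh.del
completely. Downloaded 0!=13971
"""

for name, got, expected in failed_downloads(sample):
    print(f"{name}: got {got} of {expected} bytes")
```

In both messages the downloaded count is 0, so the transfer appears to have been cut off before any data for those files arrived, which is consistent with the EOFException warnings logged just before.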




--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-avoid-recover-how-to-ensure-a-recover-success-tp4096777.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: how to avoid recover? how to ensure a recover success?

Posted by deniz <de...@gmail.com>.
I had a similar problem before, but the patch included in version 4.1
fixed it... I couldn't reproduce the problem with the patch applied...

Is anyone able to reproduce this exception?



-----
Smart, but doesn't work... Would manage it if he did...
--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-avoid-recover-how-to-ensure-a-recover-success-tp4096777p4098166.html

Re: how to avoid recover? how to ensure a recover success?

Posted by sling <sl...@gmail.com>.
Each replica holds an 8 GB index, and the application has 8 nodes, 4 shards
and 4 collections.

In the test environment it can reach 20 qps with no pressure, but the index
there is small too...



--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-avoid-recover-how-to-ensure-a-recover-success-tp4096777p4096963.html