You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Vijay Sekhri <se...@gmail.com> on 2015/01/26 22:43:21 UTC

solr cloud replicas goes in recovery mode after update

Hi Erick,

The older message seems to be deleted so I am sending a new one
http://osdir.com/ml/solr-user.lucene.apache.org/2015-01/msg00773.html


In solr.xml file I had zk timeout set to*  <int
name="zkClientTimeout">${zkClientTimeout:450000}</int>*
One thing that made a it a bit better now is the zk tick time and syncLimit
settings. I set it to a higher value as below. This may not be advisable
though.

tickTime=30000
initLimit=30
syncLimit=20

Now we observed that replicas do not go in recovery that often as before.
In the whole cluster at a given time I would have a couple of replicas in
recovery whereas earlier it were multiple replicas from every shard .
On the wiki https://wiki.apache.org/solr/SolrCloud it says the "The maximum
is 20 times the tickTime." in the FAQ so I decided to increase the tick
time. Is this the correct approach ?

One question I have is that if auto commit settings has anything to do with
this or not ? Does it induce extra work for the searchers because of which
this would happen? I have tried with following settings
*  <autoSoftCommit>*
*            <maxDocs>500000</maxDocs>*
*            <maxTime>900000</maxTime>*
*        </autoSoftCommit>*

*        <autoCommit>*
*            <maxDocs>200000</maxDocs>*
*            <maxTime>30000</maxTime>*
*            <openSearcher>false</openSearcher>*
*        </autoCommit>*

I have increased  the  heap size to 15GB for each JVM instance . I
monitored during full indexing how the heap usage looks like and it never
goes beyond 8 GB .  I don't see any Full GC happening at any point .


 Our rate is a variable rate . It is not a sustained rate of 6000/second ,
however there are intervals where it would reach that much and come down
and grow again and come down.  So if I would take an average it would be
600/second only but that is not real rate at any given time.
Version of solr cloud is *4.10*.  All indexers are basically java programs
running on different host using CloudSolrServer api.
As I mentioned it is much better now than before , however not completely
as expected .If possible we would like to have none of them go in recovery

I captured some logs before and after recovery

14:16:52,774 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: delete
"_5r2_1.del"
14:16:52,774 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: 0 msec to
checkpoint
14:16:52,774 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: now
checkpoint "_4qe(4.10.0):C4312879/1nts ; isCommit = false]
14:16:52,774 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: delete
"_5r4_1.del"
14:16:52,774 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: 0 msec to
checkpoint
14:16:52,775 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [DW][recoveryExecutor-7-thread-1]:
recoveryExecutor-7-thread-1 finishFullFlush success=true
14:16:52,775 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
findMerges: 34 segments
14:16:52,777 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_554(4.10.0):C3995865/780418:delGen=23 size=3669.307 MB [skip: too
large]
14:16:52,777 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_4qe(4.10.0):C4312879/1370113:delGen=57 size=3506.254 MB [skip: too
large]
14:16:52,777 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5co(4.10.0):C871785/93995:delGen=11 size=853.668 MB
14:16:52,777 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5kb(4.10.0):C424868/49572:delGen=12 size=518.704 MB
14:16:52,777 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5hm(4.10.0):C457977/83353:delGen=12 size=470.422 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_56u(4.10.0):C286775/11906:delGen=15 size=312.952 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5f5(4.10.0):C116528/43621:delGen=2 size=95.529 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5m7(4.10.0):C122852/54010:delGen=12 size=84.949 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5p7(4.10.0):C37457/649:delGen=2 size=54.241 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5of(4.10.0):C38545/5677:delGen=1 size=50.672 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qm(4.10.0):C29770/4:delGen=2 size=34.052 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5o8(4.10.0):C27155/7531:delGen=1 size=31.008 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5r2(4.10.0):C32769/4:delGen=2 size=27.410 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5r3(4.10.0):C26057 size=23.893 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5r4(4.10.0):C23934/3:delGen=2 size=22.004 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5nx(4.10.0):C33236/20669:delGen=2 size=19.861 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5r5(4.10.0):C24222/1:delGen=1 size=19.546 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5q0(4.10.0):C13610/2625:delGen=7 size=12.480 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qv(4.10.0):C3973/1:delGen=1 size=3.402 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qo(4.10.0):C3456/2:delGen=2 size=3.147 MB
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qq(4.10.0):C1997 size=1.781 MB [floored]
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5q2(4.10.0):C779 size=1.554 MB [floored]
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qh(4.10.0):C1765/1:delGen=1 size=1.549 MB [floored]
14:16:52,778 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qc(4.10.0):C1828/1:delGen=1 size=1.401 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qr(4.10.0):C1468 size=1.390 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qp(4.10.0):C1729/1:delGen=1 size=1.351 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qa(4.10.0):C967 size=1.235 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qi(4.10.0):C1241 size=1.146 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qy(4.10.0):C1407 size=1.050 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5q1(4.10.0):C402/1:delGen=1 size=0.954 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qw(4.10.0):C802/1:delGen=1 size=0.821 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qx(4.10.0):C638 size=0.818 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qt(4.10.0):C30 size=0.072 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
seg=_5qu(4.10.0):C27 size=0.063 MB [floored]
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
allowedSegmentCount=31 vs count=34 (eligible count=32) tooBigCount=2
14:16:52,779 INFO  [org.apache.solr.update.LoggingInfoStream]
(recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]:
maybe=_5co(4.10.0):C871785/93995:delGen=11
_5kb(4.10.0):C424868/49572:delGen=12 _5hm(4.10.0):C457977/83353:delGen=12
_56u(4.10.0):C286775/11906:delGen=15 _5f5(4.10.0):C116528/43621:delGen=2
_5m7(4.10.0):C122852/54010:delGen=12 _5p7(4.10.0):C37457/649:delGen=2
_5of(4.10.0):C38545/5677:delGen=1 _5qm(4.10.0):C29770/4:delGen=2
_5o8(4.10.0):C27155/7531:delGen=1 score=0.7313777596572674 skew=0.341
nonDelRatio=0.852 tooLarge=false size=2506.203 MB
.....

14:16:58,367 INFO  [org.apache.solr.update.LoggingInfoStream]
(searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete
"_5p7_1.del"
14:16:58,367 INFO  [org.apache.solr.update.LoggingInfoStream]
(searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete
"_5nx_1.del"
14:16:58,367 INFO  [org.apache.solr.update.LoggingInfoStream]
(searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete
"_5co_a.del"
14:16:58,367 INFO  [org.apache.solr.update.LoggingInfoStream]
(searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete
"_5q0_6.del"
14:16:58,367 INFO  [org.apache.solr.update.LoggingInfoStream]
(searcherExecutor-6-thread-1) [

-- 
*********************************************
Vijay Sekhri
*********************************************