Posted to solr-user@lucene.apache.org by lame <la...@o2.pl> on 2011/03/14 14:04:51 UTC

Solr 1.4 replication - partial index on slave while indexing master

Hi guys,
I have master-slave replication enabled. The slave replicates every 3
minutes, and I encounter problems while I'm running a full-import
command on the master (which takes about 7 minutes).
The slave replicates a partial index: about 200k documents out of 700k.
After the next replication the full index is replicated successfully.
We wrote simple scripts that check how many docs are indexed on both
master and slave, and it turns out that while the slave has the
partial index, the master has not been reloaded yet and still has all
documents (from the previous full index).
My question is: what is the best way to avoid a partial index on the
slaves? Should I disable replication while indexing on the master, or
should I use core swapping?

Thanks.

Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by Markus Jelsma <ma...@openindex.io>.
Yes, commits from the application will indeed interfere. If your business
scenario allows always serving optimized indices, you might choose to
replicate only on optimize.
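
A minimal sketch of what "replicate only on optimize" could look like in the
master's solrconfig.xml. It assumes the standard ReplicationHandler is
registered at /replication; the confFiles list is a placeholder and is not
taken from this thread. As far as I know, a DIH full-import optimizes by
default (optimize=true), so the finished import would still be published.

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- Expose a new index version to polling slaves only after an
         optimize, so intermediate commits are not replicated. -->
    <str name="replicateAfter">optimize</str>
    <!-- Placeholder: config files to ship along with the index. -->
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>
```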

On Monday 14 March 2011 18:45:15 lame wrote:
> We also have commits from the application (besides the full import);
> maybe that is the cause.
> If you don't have any other ideas I'll probably try reindexing a second
> core, then swap cores and run a delta import (to import documents added
> in the meantime).
> 
> Thanks

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by lame <la...@o2.pl>.
We also have commits from the application (besides the full import);
maybe that is the cause.
If you don't have any other ideas I'll probably try reindexing a second
core, then swap cores and run a delta import (to import documents added
in the meantime).
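
The swap plan presupposes a multi-core setup. A rough sketch of a solr.xml
with two cores follows; the core names "live" and "rebuild" are hypothetical
and not from our actual setup. After the full import into "rebuild", the
CoreAdmin request /admin/cores?action=SWAP&core=live&other=rebuild would
exchange the two names.

```xml
<!-- persistent="true" lets Solr write core changes such as a swap
     back to this file so they survive a restart. -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- Core that serves queries and is replicated to the slaves. -->
    <core name="live" instanceDir="live" />
    <!-- Core that receives the full import before being swapped in. -->
    <core name="rebuild" instanceDir="rebuild" />
  </cores>
</solr>
```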

2011/3/14 Markus Jelsma <ma...@openindex.io>:
> These settings don't affect a commit. maxPendingDeletes might, but I'm
> unsure. If you commit on the master and the slaves are configured to
> replicate on commit, they should all have the same index version.

Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by Markus Jelsma <ma...@openindex.io>.
These settings don't affect a commit. maxPendingDeletes might, but I'm
unsure. If you commit on the master and the slaves are configured to
replicate on commit, they should all have the same index version.

On Monday 14 March 2011 14:42:51 lame wrote:
> It looks like this (we don't have an autocommit section in
> solr.DirectUpdateHandler2; is ramBufferSizeMB responsible for
> that?):
> 
>   <indexDefaults>
>     <useCompoundFile>false</useCompoundFile>
>     <mergeFactor>10</mergeFactor>
>     <ramBufferSizeMB>320</ramBufferSizeMB>
>     <maxMergeDocs>2147483647</maxMergeDocs>
>     <maxFieldLength>10000</maxFieldLength>
>     <writeLockTimeout>1000</writeLockTimeout>
>     <commitLockTimeout>10000</commitLockTimeout>
>     <lockType>single</lockType>
>   </indexDefaults>
> 
>   <mainIndex>
>     <useCompoundFile>false</useCompoundFile>
>     <ramBufferSizeMB>320</ramBufferSizeMB>
>     <mergeFactor>10</mergeFactor>
>     <maxMergeDocs>2147483647</maxMergeDocs>
>     <maxFieldLength>10000</maxFieldLength>
>     <unlockOnStartup>false</unlockOnStartup>
>   </mainIndex>
> 
>   <updateHandler class="solr.DirectUpdateHandler2">
>     <maxPendingDeletes>100000</maxPendingDeletes>
>   </updateHandler>
> 
> As you said before, the slave replicates after a commit, but in that
> case shouldn't the master also be updated with the new index? Our scripts
> showed that the master still has the old index (see my first email).
> 
> Thanks


Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by lame <la...@o2.pl>.
It looks like this (we don't have an autocommit section in
solr.DirectUpdateHandler2; is ramBufferSizeMB responsible for
that?):

  <indexDefaults>
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <ramBufferSizeMB>320</ramBufferSizeMB>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>10000</maxFieldLength>
    <writeLockTimeout>1000</writeLockTimeout>
    <commitLockTimeout>10000</commitLockTimeout>
    <lockType>single</lockType>
  </indexDefaults>

  <mainIndex>
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>320</ramBufferSizeMB>
    <mergeFactor>10</mergeFactor>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>10000</maxFieldLength>
    <unlockOnStartup>false</unlockOnStartup>
  </mainIndex>

  <updateHandler class="solr.DirectUpdateHandler2">
    <maxPendingDeletes>100000</maxPendingDeletes>
  </updateHandler>

As you said before, the slave replicates after a commit, but in that
case shouldn't the master also be updated with the new index? Our scripts
showed that the master still has the old index (see my first email).

Thanks


2011/3/14 Markus Jelsma <ma...@openindex.io>:
> In solrconfig.xml there might be an autocommit section enabled.

Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by Markus Jelsma <ma...@openindex.io>.
In solrconfig.xml there might be an autocommit section enabled.
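
For reference, this is roughly what such a section looks like inside the
update handler. The threshold values are illustrative only and are not taken
from anyone's actual configuration.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- If a block like this is present, Solr commits on its own
       once either threshold is reached. -->
  <autoCommit>
    <maxDocs>10000</maxDocs>  <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime>  <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```

Removing or commenting out the autoCommit element leaves commits to the
indexing client or to DIH.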

On Monday 14 March 2011 14:18:42 lame wrote:
> I don't commit at all; we use the DataImportHandler, but I have a feeling
> that the commit could be done by DIH (autocommit, is that possible?).


Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by lame <la...@o2.pl>.
I don't commit at all; we use the DataImportHandler, but I have a feeling
that the commit could be done by DIH (autocommit, is that possible?).

2011/3/14 Markus Jelsma <ma...@openindex.io>:
> Do you commit too often? Slaves won't replicate while the master is indexing
> if you don't send commits. Can you commit only once the indexing finishes?

Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by Markus Jelsma <ma...@openindex.io>.
Do you commit too often? Slaves won't replicate while the master is indexing
if you don't send commits. Can you commit only once the indexing finishes?

On Monday 14 March 2011 14:04:51 lame wrote:
> Hi guys,
> I have master-slave replication enabled. The slave replicates every 3
> minutes, and I encounter problems while I'm running a full-import
> command on the master (which takes about 7 minutes).
> The slave replicates a partial index: about 200k documents out of 700k.
> After the next replication the full index is replicated successfully.
> We wrote simple scripts that check how many docs are indexed on both
> master and slave, and it turns out that while the slave has the
> partial index, the master has not been reloaded yet and still has all
> documents (from the previous full index).
> My question is: what is the best way to avoid a partial index on the
> slaves? Should I disable replication while indexing on the master, or
> should I use core swapping?
> 
> Thanks.


Re: Solr 1.4 replication - partial index on slave while indexing master

Posted by Bill Bell <bi...@gmail.com>.
Turn off all autocommitting.

On 3/14/11 7:04 AM, "lame" <la...@o2.pl> wrote:

>Hi guys,
>I have master-slave replication enabled. The slave replicates every 3
>minutes, and I encounter problems while I'm running a full-import
>command on the master (which takes about 7 minutes).
>The slave replicates a partial index: about 200k documents out of 700k.
>After the next replication the full index is replicated successfully.
>We wrote simple scripts that check how many docs are indexed on both
>master and slave, and it turns out that while the slave has the
>partial index, the master has not been reloaded yet and still has all
>documents (from the previous full index).
>My question is: what is the best way to avoid a partial index on the
>slaves? Should I disable replication while indexing on the master, or
>should I use core swapping?
>
>Thanks.