Posted to solr-user@lucene.apache.org by Ben McCarthy <Be...@TraderMedia.co.uk> on 2012/03/23 12:33:03 UTC

Simple Slave Replication Question

Hello,

I'm looking at replication from a master to a number of slaves.  I have configured it and it appears to be working.  When updating 40K records on the master, is it standard to always copy over the full index, currently 5 GB in size?  If this is standard, what do people with massive 200 GB indexes do?  Does it not take a while to bring the slaves in line with the master?

Thanks
Ben

________________________________________


This e-mail is sent on behalf of Trader Media Group Limited, Registered Office: Auto Trader House, Cutbush Park Industrial Estate, Danehill, Lower Earley, Reading, Berkshire, RG6 4UT(Registered in England No. 4768833). This email and any files transmitted with it are confidential and may be legally privileged, and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the sender. This email message has been swept for the presence of computer viruses. 


RE: Simple Slave Replication Question

Posted by Ben McCarthy <Be...@TraderMedia.co.uk>.
That's great information.

Thanks for all the help and guidance, it's been invaluable.

Thanks
Ben



Re: Simple Slave Replication Question

Posted by Erick Erickson <er...@gmail.com>.
It's the optimize step. Optimize essentially forces all the segments to
be copied into a single new segment, which means that your entire index
will be replicated to the slaves.

In recent Solr versions, there's usually no need to optimize, so unless and
until you can demonstrate that optimizing makes a noticeable difference, I'd
just leave the optimize step off. In fact, trunk renames it to forceMerge or
something just because it's so common for people to think "of course I want to
optimize my index!" and get the unintended consequences you're seeing, even
though the optimize doesn't actually do that much good in most cases.

Some people just do the optimize once a day (or week or whatever) during
off-peak hours as a compromise.

Best
Erick
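
As a rough illustration of the advice above, the ingest code could commit after every run but optimize only inside an off-peak window. This is a minimal sketch in plain Java, not SolrJ code: the IndexClient interface is a hypothetical stand-in for the server object used in this thread (only commit() and optimize() mirror the calls quoted there), and the 02:00 to 04:00 window is an assumed value.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

public class IngestFinisher {
    /** Hypothetical stand-in for the SolrJ server object; not a real SolrJ type. */
    interface IndexClient {
        void commit();
        void optimize();
    }

    /** Demo client that only records which calls were made. */
    static class RecordingClient implements IndexClient {
        final List<String> calls = new ArrayList<>();
        public void commit() { calls.add("commit"); }
        public void optimize() { calls.add("optimize"); }
    }

    // Assumed off-peak window (start inclusive, end exclusive).
    static final LocalTime WINDOW_START = LocalTime.of(2, 0);
    static final LocalTime WINDOW_END = LocalTime.of(4, 0);

    static boolean isOffPeak(LocalTime now) {
        return !now.isBefore(WINDOW_START) && now.isBefore(WINDOW_END);
    }

    /** Always commit; optimize only when the run ends inside the window. */
    static void finishIngest(IndexClient client, LocalTime now) {
        client.commit();
        if (isOffPeak(now)) {
            client.optimize();
        }
    }

    public static void main(String[] args) {
        RecordingClient daytime = new RecordingClient();
        finishIngest(daytime, LocalTime.of(14, 0));
        System.out.println(daytime.calls); // prints [commit]

        RecordingClient night = new RecordingClient();
        finishIngest(night, LocalTime.of(3, 0));
        System.out.println(night.calls);   // prints [commit, optimize]
    }
}
```

With a policy like this, daytime ingests should only ship the newly written segments to the slaves, and the full-index copy is confined to the nightly run.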



RE: Simple Slave Replication Question

Posted by Ben McCarthy <Be...@TraderMedia.co.uk>.
Hello,

Had to leave the office so didn't get a chance to reply.  Nothing in the logs.  Just ran one through from the ingest tool.

Same result: a full copy of the index.

Is it something to do with:

server.commit();
server.optimize();

I call this at the end of the ingestion.

Would optimize then work across the whole index?

Thanks
Ben




Re: Simple Slave Replication Question

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
Also, what happens if, instead of adding the 40K docs, you add just one and
commit?


Re: Simple Slave Replication Question

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
Have you changed the mergeFactor or are you using 10 as in the example
solrconfig?

What do you see in the slave's log during replication? Do you see any line
like "Skipping download for..."?


RE: Simple Slave Replication Question

Posted by Ben McCarthy <Be...@TraderMedia.co.uk>.
I just have an index directory.

I push the documents through with a change to a field.  I'm using SolrJ to do this.  I'm using the guide from the wiki to set up the replication.  When the feed of updates to the master finishes I call a commit, again using SolrJ.  I then have a poll period of 5 minutes from the slave.  When it kicks in I see a new version of the index and then it copies the full 5 GB index.

Thanks
Ben

-----Original Message-----
From: Tomás Fernández Löbbe [mailto:tomasflobbe@gmail.com] 
Sent: 23 March 2012 14:29
To: solr-user@lucene.apache.org
Subject: Re: Simple Slave Replication Question

Hi Ben, only new segments are replicated from master to slave. In a situation where all the segments are new, this will cause the index to be fully replicated, but this rarely happens with incremental updates. It can also happen if the slave Solr assumes it has an "invalid" index.
Are you committing or optimizing on the slaves? After replication, is the index directory on the slaves called "index" or "index.<timestamp>"?

Tomás
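
To make the point about new segments concrete, here is a toy model in plain Java. It is not Solr's replication code. It treats an index as a map from segment name to size and assumes the slave fetches only the entries it does not already hold, which is the behaviour described above. The segment names and sizes are invented for illustration, and the model ignores deletion files and background merges.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SegmentDiff {
    /** Returns the segments present on the master but missing on the slave. */
    static Map<String, Long> filesToFetch(Map<String, Long> master, Map<String, Long> slave) {
        Map<String, Long> missing = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : master.entrySet()) {
            if (!slave.containsKey(e.getKey())) {
                missing.put(e.getKey(), e.getValue());
            }
        }
        return missing;
    }

    /** Sums the sizes (in MB) of the given segments. */
    static long totalMb(Map<String, Long> files) {
        long sum = 0;
        for (long size : files.values()) {
            sum += size;
        }
        return sum;
    }

    public static void main(String[] args) {
        // The slave already holds two large segments (sizes in MB, invented).
        Map<String, Long> slave = new LinkedHashMap<>();
        slave.put("_a", 3000L);
        slave.put("_b", 2000L);

        // Incremental commit: old segments stay, one small new segment appears.
        Map<String, Long> afterCommit = new LinkedHashMap<>(slave);
        afterCommit.put("_c", 50L);

        // Optimize: everything is rewritten into a single new segment.
        Map<String, Long> afterOptimize = new LinkedHashMap<>();
        afterOptimize.put("_d", 5050L);

        System.out.println(totalMb(filesToFetch(afterCommit, slave)));   // prints 50
        System.out.println(totalMb(filesToFetch(afterOptimize, slave))); // prints 5050
    }
}
```

In this model a plain commit costs the slave 50 MB, while the optimized index shares no segment with the slave's copy, so all 5050 MB has to be transferred.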

On Fri, Mar 23, 2012 at 11:18 AM, Ben McCarthy < Ben.McCarthy@tradermedia.co.uk> wrote:

> So do you just simply address this with a big NIC and network pipes?
>
> -----Original Message-----
> From: Martin Koch [mailto:mak@issuu.com]
> Sent: 23 March 2012 14:07
> To: solr-user@lucene.apache.org
> Subject: Re: Simple Slave Replication Question
>
> I guess this would depend on network bandwidth, but we move around 
> 150G/hour when hooking up a new slave to the master.
>
> /Martin
>
> On Fri, Mar 23, 2012 at 12:33 PM, Ben McCarthy < 
> Ben.McCarthy@tradermedia.co.uk> wrote:
>
> > Hello,
> >
> > Im looking at the replication from a master to a number of slaves.  
> > I have configured it and it appears to be working.  When updating 
> > 40K records on the master is it standard to always copy over the 
> > full index, currently 5gb in size.  If this is standard what do 
> > people do who have massive 200gb indexs, does it not take a while to 
> > bring the
> slaves inline with the master?
> >
> > Thanks
> > Ben
> >
> > ________________________________________
> >
> >
> > This e-mail is sent on behalf of Trader Media Group Limited, 
> > Registered
> > Office: Auto Trader House, Cutbush Park Industrial Estate, Danehill, 
> > Lower Earley, Reading, Berkshire, RG6 4UT(Registered in England No.
> 4768833).
> > This email and any files transmitted with it are confidential and 
> > may be legally privileged, and intended solely for the use of the 
> > individual or entity to whom they are addressed. If you have 
> > received this email in error please notify the sender. This email 
> > message has been swept for the presence of computer viruses.
> >
> >
>
> ________________________________________
>
>
> This e-mail is sent on behalf of Trader Media Group Limited, 
> Registered
> Office: Auto Trader House, Cutbush Park Industrial Estate, Danehill, 
> Lower Earley, Reading, Berkshire, RG6 4UT(Registered in England No. 4768833).
> This email and any files transmitted with it are confidential and may 
> be legally privileged, and intended solely for the use of the 
> individual or entity to whom they are addressed. If you have received 
> this email in error please notify the sender. This email message has 
> been swept for the presence of computer viruses.
>
>



Re: Simple Slave Replication Question

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
Hi Ben, only new segments are replicated from master to slave. If all the
segments are new, the index will be fully replicated, but this rarely
happens with incremental updates. It can also happen if the slave Solr
assumes it has an "invalid" index.
Are you committing or optimizing on the slaves? After replication, is the
index directory on the slaves called "index" or "index.<timestamp>"?

Tomás
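
To illustrate the point about new segments, here is a simplified model of how a slave could decide which files to pull. It is a sketch under stated assumptions, not Solr's actual replication code: the function files_to_fetch, the file names, and the sizes are all made up for illustration, and a file counts as "new" here when the slave has no file with the same name and size.

```python
def files_to_fetch(master_files, slave_files):
    """Return names of master index files the slave lacks or holds with a different size."""
    return sorted(
        name
        for name, size in master_files.items()
        if slave_files.get(name) != size
    )


# Hypothetical file listings: file name -> size in bytes.
slave = {"_0.cfs": 4_000, "_1.cfs": 900, "segments_2": 20}

# Incremental commit: the old segments are untouched, one small segment is added.
master_after_commit = {"_0.cfs": 4_000, "_1.cfs": 900, "_2.cfs": 100, "segments_3": 20}

# Every segment rewritten into a single new one (for example by an optimize).
master_after_rewrite = {"_3.cfs": 5_000, "segments_4": 20}

incremental = files_to_fetch(master_after_commit, slave)
full = files_to_fetch(master_after_rewrite, slave)

print(incremental, sum(master_after_commit[n] for n in incremental))
print(full, sum(master_after_rewrite[n] for n in full))
```

In the first case only the newly added files are transferred. In the second case none of the slave's files match, so the whole index is copied again.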

On Fri, Mar 23, 2012 at 11:18 AM, Ben McCarthy <
Ben.McCarthy@tradermedia.co.uk> wrote:

> So do you just simply address this with big NICs and network pipes?

RE: Simple Slave Replication Question

Posted by Ben McCarthy <Be...@TraderMedia.co.uk>.
So do you just simply address this with big NICs and network pipes?

-----Original Message-----
From: Martin Koch [mailto:mak@issuu.com] 
Sent: 23 March 2012 14:07
To: solr-user@lucene.apache.org
Subject: Re: Simple Slave Replication Question

I guess this would depend on network bandwidth, but we move around 150G/hour when hooking up a new slave to the master.

/Martin




Re: Simple Slave Replication Question

Posted by Martin Koch <ma...@issuu.com>.
I guess this would depend on network bandwidth, but we move around
150G/hour when hooking up a new slave to the master.

/Martin
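
As a rough worked example of what that rate means for the index sizes mentioned in this thread, the arithmetic can be sketched as below. It assumes "150G/hour" means 150 gigabytes per hour and that the same sustained rate applies to other setups, which may not hold. The function name transfer_minutes is hypothetical.

```python
def transfer_minutes(index_gb, rate_gb_per_hour=150.0):
    """Estimated minutes to copy index_gb gigabytes at a sustained rate."""
    return index_gb / rate_gb_per_hour * 60.0


# Index sizes taken from the original question: 5 GB today, 200 GB as the large case.
for size_gb in (5, 200):
    print(f"{size_gb} GB -> about {transfer_minutes(size_gb):.0f} minutes")
```

At that rate a full copy of a 5 GB index takes about 2 minutes and a 200 GB index about 80 minutes.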

On Fri, Mar 23, 2012 at 12:33 PM, Ben McCarthy <
Ben.McCarthy@tradermedia.co.uk> wrote:

> Hello,
>
> I'm looking at replication from a master to a number of slaves.  I have
> configured it and it appears to be working.  When updating 40K records on
> the master, is it standard to always copy over the full index, currently
> 5 GB in size?  If so, what do people with massive 200 GB indexes do? Does
> it not take a while to bring the slaves in line with the master?
>
> Thanks
> Ben