Posted to solr-user@lucene.apache.org by Eric Pugh <ep...@opensourceconnections.com> on 2012/03/20 12:36:44 UTC

Staggering Replication start times

I am playing with an index that is sharded many times, between 64 and 128 shards.  One thing I noticed is that with replication set to happen every 5 minutes, every slave hits the master at the same moment asking for updates:  :00:00, :05:00, :10:00, :15:00, etc.  Replication takes very little time, so it seems like I may be flooding the network with a burst of requests that then goes away.

I tweaked the replication start time code to instead start polling 5 minutes after a shard starts up, which means that instead of all of the slaves hitting at the same moment, they are a bit staggered:  :00:00, :00:01, :00:02, :00:04, etcetera.  That presumably will use my network pipe more efficiently.
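
The difference between the two schedules can be sketched as follows. This is a minimal illustration, not the actual Solr replication code: the function names, the use of plain seconds, and the four startup times are all assumptions made for the example. Only the 5-minute interval comes from the post.

```python
POLL_INTERVAL = 300  # seconds; the 5-minute interval described above


def aligned_poll_times(start_time, count, interval=POLL_INTERVAL):
    """Poll on clock multiples of `interval`, regardless of when the slave started."""
    first = (start_time // interval + 1) * interval
    return [first + i * interval for i in range(count)]


def startup_relative_poll_times(start_time, count, interval=POLL_INTERVAL):
    """Poll every `interval` seconds, counted from the slave's own startup time."""
    first = start_time + interval
    return [first + i * interval for i in range(count)]


# Hypothetical startup times (in seconds) for four slaves started moments apart.
starts = [0, 1, 2, 4]

aligned = [aligned_poll_times(s, 2) for s in starts]
staggered = [startup_relative_poll_times(s, 2) for s in starts]

print(aligned)    # every slave polls at the same instants
print(staggered)  # each slave keeps its own startup offset
```

With the aligned schedule every slave requests updates at the same second. With the startup-relative schedule the requests are spread out by however far apart the slaves happened to start.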

Any thoughts on this?  I know it means the slaves are more likely to be slightly out of sync, but over a 5 minute window they will get back in sync.

Eric

-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Apache Solr 3 Enterprise Search Server available from http://www.packtpub.com/apache-solr-3-enterprise-search-server/book	
This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.

Re: Staggering Replication start times

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.

What William said was the original motivation for syncing all slaves to poll
at approximately the same time.

On Tue, Mar 20, 2012 at 10:38 PM, William Bell <bi...@gmail.com> wrote:

> For our use case this is a no-no. When the index is updated, we need
> all indexes to be updated at the same time.
>
> We put all indexes (slaves) behind a load balancer and the user would
> expect the same results from page to page.



-- 
Regards,
Shalin Shekhar Mangar.

Re: Staggering Replication start times

Posted by William Bell <bi...@gmail.com>.

For our use case this is a no-no. When the index is updated, we need
all indexes to be updated at the same time.

We put all indexes (slaves) behind a load balancer and the user would
expect the same results from page to page.
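
The consistency concern can be made concrete with a small sketch. It is illustrative only and assumes a simplified model in which each slave polls at a fixed offset plus multiples of the interval and picks up a master commit at its first poll at or after the commit. The function name, the offsets, and the commit time are hypothetical.

```python
POLL_INTERVAL = 300  # seconds; the 5-minute interval from the original post


def next_poll_at_or_after(t, offset, interval=POLL_INTERVAL):
    """First poll time >= t for a slave that polls at offset + k * interval."""
    k = -(-(t - offset) // interval)  # ceiling division
    return offset + max(k, 0) * interval


def inconsistency_window(commit_time, offsets, interval=POLL_INTERVAL):
    """Seconds between the first and the last slave picking up the same commit."""
    pickups = [next_poll_at_or_after(commit_time, o, interval) for o in offsets]
    return max(pickups) - min(pickups)


commit_time = 450  # hypothetical master commit, in seconds

# All slaves polling together versus slaves spread evenly across the interval.
synchronized = inconsistency_window(commit_time, [0, 0, 0, 0, 0])
spread_out = inconsistency_window(commit_time, [0, 60, 120, 180, 240])

print(synchronized, spread_out)
```

Under this model, slaves that poll together switch to the new index at the same poll, while staggered slaves can serve different index versions for a period bounded by the poll interval. During that period a load balancer may route consecutive page requests to slaves with different results.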



-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076