Posted to solr-user@lucene.apache.org by "Vauthrin, Laurent" <La...@disney.com> on 2009/03/17 20:04:50 UTC

More replication questions

Hello,

 

I have a couple of questions relating to replication in Solr.  As far as
I understand it, the replication approach for both 1.3 and 1.4 involves
having the slaves poll the master for updates to the index.  We're
curious to know if it's possible to have a more dynamic/quicker way to
propagate updates.

 

1. Is there a built-in mechanism for pushing out
updates (inserts/deletes) received by the master to the slaves?

2. Is it discouraged to post updates to multiple Solr instances?
(All instances can receive updates and fulfill query requests.)

3. If that sort of capability is not supported, why was it not
implemented this way?  (So that we don't repeat any mistakes.)

4. Has anyone else on the list attempted to do this?  The intent
here is to achieve optimal performance while having the freshest data
possible.

 

Thanks,
Laurent


Re: More replication questions

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
It depends on a few things:
1) the number of docs added
2) whether the index is optimized
3) autowarming

If only a few docs are added and the index is not optimized, the
replication will be done in milliseconds (the changed files
will be small). If there is no autowarming, there should be no delay
in seeing the new data.
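
To illustrate why the changed files stay small, here is a rough Python sketch that compares a master and a slave file listing and reports what the slave would still need to download. The file names, sizes, and the `files_to_fetch` helper are all made up for illustration; Solr's real comparison logic is not described in this thread, so treat this as a simplified model only.

```python
from typing import Dict


def files_to_fetch(master: Dict[str, int], slave: Dict[str, int]) -> Dict[str, int]:
    """Return the master files (name -> size in bytes) that the slave
    either lacks or holds with a different size."""
    return {name: size for name, size in master.items() if slave.get(name) != size}


# Hypothetical index listings; names and sizes are invented for this example.
slave_index = {"_0.cfs": 500_000_000, "_1.cfs": 20_000_000, "segments_2": 300}

# Small commit, no optimize: existing segments are untouched and one
# small new segment appears.
after_commit = {
    "_0.cfs": 500_000_000,
    "_1.cfs": 20_000_000,
    "_2.cfs": 40_000,
    "segments_3": 320,
}

# Optimize: everything is merged into a single new segment, so no file
# on the slave can be reused.
after_optimize = {"_3.cfs": 520_040_000, "segments_4": 280}

small = files_to_fetch(after_commit, slave_index)
large = files_to_fetch(after_optimize, slave_index)

print("bytes to fetch after small commit:", sum(small.values()))
print("bytes to fetch after optimize:", sum(large.values()))
```

In this model a frequent poll against an unoptimized index only moves the newest small segment, while an optimize forces the whole index across.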


On Thu, Mar 19, 2009 at 6:23 AM, Vauthrin, Laurent
<La...@disney.com> wrote:
> Thanks for the responses.
>
> If we used a poll interval of one second (for 1.4), wouldn't we still have to wait for the replication to finish?  In that case, couldn't it take minutes (depending on index size) to get that data on the slave?  Or would there be a lot less data to pull down because of the high replication frequency (i.e. Will it only have small files to replicate)?



-- 
--Noble Paul

RE: More replication questions

Posted by "Vauthrin, Laurent" <La...@disney.com>.
Thanks for the responses.

If we used a poll interval of one second (for 1.4), wouldn't we still have to wait for the replication to finish?  In that case, couldn't it take minutes (depending on index size) to get that data on the slave?  Or would there be a lot less data to pull down because of the high replication frequency (i.e. Will it only have small files to replicate)?


Re: More replication questions

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Wed, Mar 18, 2009 at 12:34 AM, Vauthrin, Laurent
<La...@disney.com> wrote:
> Hello,
>
>
>
> I have a couple of questions relating to replication in Solr.  As far as
> I understand it, the replication approach for both 1.3 and 1.4 involves
> having the slaves poll the master for updates to the index.  We're
> curious to know if it's possible to have a more dynamic/quicker way to
> propagate updates.
>
>
>
> 1.       Is there a built-in mechanism for pushing out
> updates(/inserts/deletes) received by the master to the slaves?
The pull mechanism in 1.4 can be good enough. The 'pollInterval' can
be as small as 1 sec, so you will get the updates within a second.
Isn't that good enough?
>
> 2.       Is it discouraged to post updates to multiple Solr instances?
> (all instances can receive updates and fulfill query requests)
This is prone to serious errors; the Solr instances may not all be in sync.
>
> 3.       If that sort of capability is not supported, why was it not
> implemented this way?  (So that we don't repeat any mistakes)
Push-based replication is in the cards, but the implementation is not
trivial. In Solr, commits are already expensive, so a second's delay may
be all right.
>
> 4.       Has anyone else on the list attempted to do this?  The intent
> here is to achieve optimal performance while having the freshest data
> possible.
>
>
>
> Thanks,
> Laurent
>
>



-- 
--Noble Paul
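
For readers who want to try the short poll interval, or to approximate a push by asking each slave to pull right after a commit, the following Python sketch may help. It assumes the Solr 1.4 ReplicationHandler is mounted at /replication on each slave, that 'pollInterval' takes an HH:mm:ss string, and that the handler accepts command=fetchindex. The host names and both helper functions are hypothetical. This is still a pull initiated on the slave, not the push-based replication discussed above.

```python
from urllib.parse import urlencode


def to_poll_interval(seconds: int) -> str:
    """Format a number of seconds as HH:mm:ss, the format assumed here
    for the 'pollInterval' setting."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def fetchindex_url(slave_base_url: str) -> str:
    """Build the URL that asks one slave to pull from its master now.
    The /replication path and the fetchindex command are assumptions."""
    query = urlencode({"command": "fetchindex"})
    return f"{slave_base_url.rstrip('/')}/replication?{query}"


# Hypothetical slave addresses.
SLAVES = [
    "http://slave1.example.com:8983/solr",
    "http://slave2.example.com:8983/solr/",
]

urls = [fetchindex_url(base) for base in SLAVES]

print("pollInterval for one second:", to_poll_interval(1))
for url in urls:
    # A script run after each commit on the master could issue an HTTP GET
    # to each URL (for example with urllib.request.urlopen); only the URLs
    # are printed here so the sketch runs without a Solr server.
    print(url)
```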