
Posted to solr-user@lucene.apache.org by ratabora <ra...@gmail.com> on 2012/06/04 05:57:41 UTC

Re: Round Robin concept in distributed Solr

Hey Erick,

It looks like the thread you mentioned talks about how to configure the
shards parameter in the Solr query. I am more interested in the 'main' shard
you query against when you make Solr queries (the main shard being the shard
you direct the query against, e.g.
mainshard/select?q=*:*&shards=shard1,shard2,shard3).

I think Suneel's original question is still unanswered: is it better to use
Scenario A or Scenario B? I suppose the 'main' shard is going to send a
sub-query to the rest of the shards defined in the shards parameter, but I am
still wondering whether querying the same main shard every time is going
to have a load/performance impact.


Suneel wrote
> 
>> So scenario A (round-robin):
>>
>> query 1: /solr-shard-1/select?q=dog... shards=shard-1,shard2
>> query 2: /solr-shard-2/select?q=dog... shards=shard-1,shard2
>> query 3: /solr-shard-1/select?q=dog... shards=shard-1,shard2
>> etc.
>>
>> or scenario B (fixed):
>>
>> query 1: /solr-shard-1/select?q=dog... shards=shard-1,shard2
>> query 2: /solr-shard-1/select?q=dog... shards=shard-1,shard2
>> query 3: /solr-shard-1/select?q=dog... shards=shard-1,shard2 
> 
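
For illustration, the two scenarios above can be sketched in Python as follows. This is only a sketch under stated assumptions: the host names in SHARDS and the build_query_url helper are hypothetical placeholders modeled on the example URLs, and nothing is actually sent to a Solr server.

```python
from itertools import cycle
from urllib.parse import urlencode

# Hypothetical shard hosts, assumed for illustration only.
SHARDS = ["solr-shard-1", "solr-shard-2"]


def build_query_url(entry_shard, query, shards=SHARDS):
    """Build a distributed-search URL directed at entry_shard (the 'main' shard)."""
    # Keep ',', ':' and '*' unescaped so the URL reads like the examples above.
    params = urlencode({"q": query, "shards": ",".join(shards)}, safe=",:*")
    return f"/{entry_shard}/select?{params}"


def scenario_a(queries):
    """Round-robin: rotate the entry shard on every query."""
    entry = cycle(SHARDS)
    return [build_query_url(next(entry), q) for q in queries]


def scenario_b(queries):
    """Fixed: always direct the query at the first shard."""
    return [build_query_url(SHARDS[0], q) for q in queries]


if __name__ == "__main__":
    for url in scenario_a(["dog"] * 3):
        print(url)
```

In both scenarios the shards parameter is identical; only the node that receives the request changes.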

Thank you for any help.

Regards,
Ryan Tabora


--
View this message in context: http://lucene.472066.n3.nabble.com/Round-Robin-concept-in-distributed-Solr-tp3636345p3987494.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Round Robin concept in distributed Solr

Posted by ratabora <ra...@gmail.com>.

Thanks Erick,

While it probably does not have any major impact on performance, I think it makes sense to load balance the main shard, if only to avoid a single point of failure.

Regards,
Ryan Tabora

On Jun 4, 2012, at 2:51 AM, Erick Erickson [via Lucene] wrote:

> The "main" shard has some extra work to do. Namely 
> 1> create the sub-requests 
> 2> collate the results from all the sub-requests (including itself). 
> 
> But this work is generally a small amount of the actual work being 
> done, so it's often unnoticeable. 
> 
> That said, I'd just put all my slaves behind a load-balancer and let
> that mechanism send the requests to the various slaves, if for no other
> reason than that your LB should be able to detect if one of your machines
> goes down and send requests to the slaves still running.
>
> I even know of one situation where the user's "main" slave has _no_
> index on it; it serves solely to distribute requests/aggregate results....
> 
> Best 
> Erick 



-----
http://ryantabora.com

Re: Round Robin concept in distributed Solr

Posted by Erick Erickson <er...@gmail.com>.

The "main" shard has some extra work to do. Namely
1> create the sub-requests
2> collate the results from all the sub-requests (including itself).

But this work is generally a small amount of the actual work being
done, so it's often unnoticeable.
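
As a rough sketch of those two steps (not Solr's actual implementation), the Python below fans a query out to every shard and merges the partial results by score. The FAKE_INDEX data, the shard names, and the fake_shard_search helper are all made up for illustration; in a real deployment each sub-request would be an HTTP call to a shard.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Made-up per-shard results as (score, doc_id) pairs, for illustration only.
FAKE_INDEX = {
    "shard1": [(0.9, "a"), (0.4, "d")],
    "shard2": [(0.7, "b"), (0.2, "e")],
    "shard3": [(0.8, "c")],
}


def fake_shard_search(shard, query, rows):
    """Stand-in for the sub-request the 'main' shard would send to one shard."""
    return sorted(FAKE_INDEX[shard], reverse=True)[:rows]


def distributed_search(query, shards, rows=3):
    # Step 1: create one sub-request per shard and run them concurrently.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: fake_shard_search(s, query, rows), shards))
    # Step 2: collate the partial results into the overall top `rows` hits.
    return heapq.nlargest(rows, (hit for part in partials for hit in part))


if __name__ == "__main__":
    print(distributed_search("dog", ["shard1", "shard2", "shard3"]))
```

In this sketch the merge touches at most rows times the number of shards entries, which fits the point that the coordinating work is usually small next to the per-shard search itself.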

That said, I'd just put all my slaves behind a load-balancer and let
that mechanism send the requests to the various slaves, if for no other
reason than that your LB should be able to detect if one of your machines
goes down and send requests to the slaves still running.
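
To make the failover idea concrete, here is a toy round-robin balancer in Python that skips hosts marked as down. It is a minimal sketch, not a substitute for a real load balancer: the RoundRobinBalancer class and the host names are hypothetical, and health detection is reduced to explicit mark_down/mark_up calls rather than real health checks.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer that skips hosts marked as down."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.down = set()
        self._ring = cycle(self.hosts)

    def mark_down(self, host):
        """Record that a host failed its (assumed external) health check."""
        self.down.add(host)

    def mark_up(self, host):
        """Record that a host is healthy again."""
        self.down.discard(host)

    def next_host(self):
        # Try each host at most once per call so we never loop forever.
        for _ in range(len(self.hosts)):
            host = next(self._ring)
            if host not in self.down:
                return host
        raise RuntimeError("no healthy hosts available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["slave1", "slave2", "slave3"])
    lb.mark_down("slave2")
    print([lb.next_host() for _ in range(4)])
```

Whichever host next_host returns would then act as the "main" shard for that request.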

I even know of one situation where the user's "main" slave has _no_
index on it; it serves solely to distribute requests/aggregate results....

Best
Erick
