Posted to solr-user@lucene.apache.org by "Keene, David" <dk...@soe.sony.com> on 2008/02/06 23:34:22 UTC

solrj and multiple slaves

Hey guys,

I have a quick question about using solrj to connect to multiple slaves.
My application is deployed on multiple boxes that have to talk to
multiple solr slaves.  In order to take advantage of the queryResult
cache, each request from one of my app boxes should be redirected to the
same solr slave.

I'm using Apache to load balance between the slaves using sticky
sessions with jk2 (jsessionId cookie).  Is this the right way to go
about load balancing multiple solr slaves when using solrj? If so, should
I look into making a patch for solrj so that each query can optionally
take a cookie parameter so the underlying HttpClient knows what
jsessionId to attach to the request?

On the other hand, I could be going about this all wrong.

Thanks,

Dave


Re: solrj and multiple slaves

Posted by Walter Underwood <wu...@netflix.com>.
On 2/11/08 8:42 PM, "Chris Hostetter" <ho...@fucit.org> wrote:

> if you want to worry about smart load balancing, try to load balance based
> on the nature of the URL query string ... make your load balancer pick
> a slave by hashing on the "q" param for example.

This is very effective. We used this at Infoseek ten years ago.

An easy way to do this is to have the client code do the hash and
add it as an extra parameter. Then have the load balancer switch
based on that param. Something like this:

   &preferred_server=2
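
A minimal Java sketch of the client-side half of this idea (Java because
solrj is a Java client). The class and method names are hypothetical and
not part of solrj, and the server count is assumed to be known to the
client. Only the "preferred_server" parameter name comes from the example
above.

```java
// Sketch only: PreferredServer, indexFor and asParam are hypothetical names,
// not part of solrj or Solr.
final class PreferredServer {
    private PreferredServer() {}

    /** Maps a query string to a server index in [0, serverCount). */
    static int indexFor(String q, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        // String.hashCode() is specified by the JDK, so every app box computes
        // the same value for the same query. floorMod keeps the result
        // non-negative even when the hash code is negative.
        return Math.floorMod(q.hashCode(), serverCount);
    }

    /** Builds the extra request parameter for the load balancer to switch on. */
    static String asParam(String q, int serverCount) {
        return "&preferred_server=" + indexFor(q, serverCount);
    }
}
```

The client would append the result of asParam(q, n) to the request URL,
and the load balancer would need a rule that routes on that parameter.
The rule itself depends on the load balancer in use and is not shown.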

wunder


Re: solrj and multiple slaves

Posted by Chris Hostetter <ho...@fucit.org>.
: I have a quick question about using solrj to connect to multiple slaves.
: My application is deployed on multiple boxes that have to talk to
: multiple solr slaves.  In order to take advantage of the queryResult
: cache, each request from one of my app boxes should be redirected to the
: same solr slave.

i've never once worried about "session affinity" when dealing with Solr 
... if a query is common/important enough that it's going to be a cache 
hit, it will probably be a cache hit on all the servers.  besides which: 
just because two queries come from the same client doesn't mean they have 
anything to do with each other - i'm typically just as likely to get the 
same query from two different clients as i am twice from the same client.  

if you want to worry about smart load balancing, try to load balance based 
on the nature of the URL query string ... make your load balancer pick 
a slave by hashing on the "q" param for example.
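
As an illustration of the selection logic only, here is a rough Java
sketch of what "hashing on the q param" could look like. It assumes the
balancer sees the raw URL query string and holds a fixed list of slave
base URLs. The class name, method names and host names are hypothetical,
and a real load balancer would express this in its own configuration
rather than in Java.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Sketch only: QueryHashBalancer is a hypothetical name, not a Solr or solrj class.
final class QueryHashBalancer {
    private final List<String> slaves;

    QueryHashBalancer(List<String> slaves) {
        if (slaves.isEmpty()) {
            throw new IllegalArgumentException("at least one slave is required");
        }
        this.slaves = new ArrayList<>(slaves);
    }

    /** Returns the decoded value of the "q" parameter, or "" if it is absent. */
    static String extractQ(String rawQuery) {
        for (String pair : rawQuery.split("&")) {
            if (pair.startsWith("q=")) {
                try {
                    // Decode so "hello+world" and "hello%20world" hash alike.
                    return URLDecoder.decode(pair.substring(2), "UTF-8");
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalStateException(e); // UTF-8 is always supported
                }
            }
        }
        return "";
    }

    /** Picks the same slave for the same "q", whatever the other parameters are. */
    String pick(String rawQuery) {
        int index = Math.floorMod(extractQ(rawQuery).hashCode(), slaves.size());
        return slaves.get(index);
    }
}
```

With this scheme, "q=ipod&rows=10" and "rows=20&q=ipod" land on the same
slave no matter which client sent them. Adding or removing a slave changes
the mapping for most queries, so the caches would need to warm up again.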

the one situation where i worry about sending certain traffic to some Solr 
boxes and other traffic to other Solr boxes is when i know that the client 
apps have very different query usage patterns ... then i have two separate 
tiers of Slaves -- identical indexes, but different solrconfigs.  the 
clients that hit my custom faceting plugin use one tier with a big custom 
cache and filterCache.  the clients that do more traditional searching 
using dismax hit a second tier which has no custom cache, a smaller 
filterCache and a bigger queryResultCache ... but even then i don't worry 
about session IDs ... i just configure the two client applications with 
different DNS aliases.




-Hoss