Posted to solr-user@lucene.apache.org by Akshay <ak...@gmail.com> on 2011/06/06 15:18:21 UTC

Auto-scaling solr setup

I am trying to set up an auto-scaling search system of EC2 Solr slaves
that scales up as the number of requests increases, and back down as it drops.
Here is what I have:
1. A Solr master with (scalable) slaves underneath it, and an elastic load
balancer to distribute the load.
2. The EC2 auto-scaling setup fires up nodes when traffic increases. However,
the time it takes to replicate the index from the master
varies for these newly fired nodes.
3. I want to avoid adding a node to the load balancer until it has
completed its initial replication and has a warmed-up cache.
    For this I need a way to check whether the initial replication has
completed, and also a way of warming up the cache afterwards.

I can think of doing this via a shell script/awk (checking times
replicated/index size). Is there a cleaner way?

On a side note, any suggestions or pointers on how others have set up their
scalable Solr deployment in the cloud (mainly AWS) would be helpful.

Regards,
Akshay

Re: Auto-scaling solr setup

Posted by Akshay <ak...@gmail.com>.
Yes, sadly I don't have much of a clue about AWS either.

The SolrReplication API doesn't give me exactly what I want. For the time
being I have hacked the replication check into the Amazon image's bootstrap
as a shell script (curl and awk, a very dirty way). Once the
check succeeds I enable the server using the Solr healthcheck for
load balancers. I was wondering if anyone has moved to the cloud, especially
Amazon auto-scaling, where they don't have control over when a new node is
fired up. All the scenarios I encountered were people creating a node, warming
up the cache and then adding it under the HAProxy LB.
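
The bootstrap gate described above could be sketched roughly as follows, in
Python rather than shell. This is only an illustration under assumptions: it
assumes the health check is file-based (the ping fails until a configured file
exists), and the names enable_when_ready, is_replicated and healthcheck_file
are hypothetical, not part of Solr.

```python
import time
from pathlib import Path


def enable_when_ready(is_replicated, healthcheck_file, poll_seconds=10,
                      max_attempts=60, sleep=time.sleep):
    """Poll is_replicated(); create the health-check file once it returns True.

    is_replicated    -- hypothetical callable returning True when the slave
                        has finished its initial replication
    healthcheck_file -- assumed path of the file the load balancer's health
                        check depends on (site-specific, not a Solr default)
    """
    for _ in range(max_attempts):
        if is_replicated():
            # Creating the file is what makes the node look healthy.
            Path(healthcheck_file).touch()
            return True
        sleep(poll_seconds)
    # Gave up: leave the node disabled so it never serves empty results.
    return False
```

Returning False instead of enabling the node after a timeout keeps a slave
with an empty index out of the load balancer, which is the failure mode this
thread is mostly worried about.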

I guess warmup is not as big an issue as an empty response.
Thanks for your response :)

Regards,
Akshay

On Mon, Jun 6, 2011 at 6:33 PM, Erick Erickson <er...@gmail.com> wrote:

> The HTTP interface (http://wiki.apache.org/solr/SolrReplication#HTTP_API)
> can be used to control lots of parts of replication.
>
> As to warmups, I don't know of a good way to test that. I don't know whether
> getting the current status on the slave includes whether warmup is completed
> or not. At worst, after replication is complete you could wait an interval
> (see the warmup times on your running servers) before routing requests to
> the slave.
>
> I haven't any clue at all about AWS...
>
> Best
> Erick

Re: Auto-scaling solr setup

Posted by Erick Erickson <er...@gmail.com>.
The HTTP interface (http://wiki.apache.org/solr/SolrReplication#HTTP_API)
can be used to control lots of parts of replication.
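
As a rough sketch of one way to use that HTTP interface for a "has the slave
caught up?" check: ask the master for its index version and compare it with
what the slave reports. The indexversion and details commands are the ones
listed on the wiki page; the JSON field names used below ("indexversion" and
details.indexVersion) are assumptions about the response layout and should be
checked against your Solr version. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def master_index_version(master_url, fetch=fetch_json):
    # Assumption: the response has a top-level "indexversion" field.
    url = master_url.rstrip("/") + "/replication?command=indexversion&wt=json"
    return int(fetch(url)["indexversion"])


def slave_index_version(slave_url, fetch=fetch_json):
    # Assumption: the slave's version is reported at details.indexVersion.
    url = slave_url.rstrip("/") + "/replication?command=details&wt=json"
    return int(fetch(url)["details"]["indexVersion"])


def slave_caught_up(master_url, slave_url, fetch=fetch_json):
    """True once the slave reports the master's (non-zero) index version."""
    master = master_index_version(master_url, fetch)
    # A version of 0 is treated as "nothing to replicate yet", not as a match.
    return master != 0 and slave_index_version(slave_url, fetch) == master
```

A call such as slave_caught_up("http://master:8983/solr",
"http://localhost:8983/solr") would then replace the curl and awk parsing; the
host names here are placeholders.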

As to warmups, I don't know of a good way to test that. I don't know whether
getting the current status on the slave includes whether warmup is completed
or not. At worst, after replication is complete you could wait an interval (see
the warmup times on your running servers) before routing requests to the
slave.
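
A minimal sketch of that "wait an interval" idea, with one addition that is
not from the message above: replaying a few representative queries against the
slave before the wait, so the caches have something in them. The query list,
the settle time and the function names are all hypothetical; /select is
assumed to be the slave's standard search handler.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def send_query(slave_url, query, timeout=10):
    """Run one search against the slave and discard the result."""
    url = slave_url.rstrip("/") + "/select?" + urlencode({"q": query, "rows": 10})
    with urlopen(url, timeout=timeout) as resp:
        resp.read()


def warm_up(slave_url, queries, settle_seconds, send=send_query, sleep=time.sleep):
    """Replay queries against the slave, then wait before routing traffic.

    Returns the number of warm-up queries that failed.
    """
    failures = 0
    for query in queries:
        try:
            send(slave_url, query)
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses;
            # a failed warm-up query is counted but not treated as fatal.
            failures += 1
    # Interval picked from the warmup times seen on running servers.
    sleep(settle_seconds)
    return failures
```

The settle time would come from the warmup times observed on the servers
already in rotation, as suggested above.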

I haven't any clue at all about AWS...

Best
Erick

Re: Auto-scaling solr setup

Posted by jwang <je...@hotmail.com>.
One option is to wrap your Solr slave in Elastic Beanstalk and have it take
care of the auto-scaling.
