Posted to users@solr.apache.org by Jan Høydahl <ja...@cominvent.com> on 2021/03/11 18:51:15 UTC

Best throttling / push-back strategy for updates?

Hi,

When sending updates to Solr, you often need to run multi-threaded to utilize the CPU on the Solr side.
But how can the client (whether it is pure HTTP POST or SolrJ) know whether Solr is happy with the indexing speed?

I'm thinking of a feedback mechanism where Solr can check its load level, indexing queue fill rate or other metrics as desired, and respond to the caller with an HTTP 503, or a custom Solr HTTP code "533 Slow down".
Clients will then know that they should pause for a while and retry, and can implement an exponential backoff strategy to adjust their indexing rate.
A bonus with such a system would be that Solr could "tell" indexing to slow down during periods of heavy query traffic, background merge activity, recovery, replication, too-slow warming (max warming searchers), and so on.
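
For illustration, a minimal SolrJ-style sketch of the client side. This is hypothetical: Solr does not emit such a push-back code today, so the 503 handling below assumes the mechanism being proposed here.

import java.io.IOException;
import java.util.List;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class BackoffIndexer {
    // Retries one update batch, doubling the pause each time Solr pushes back.
    static void addWithBackoff(HttpSolrClient client, List<SolrInputDocument> batch)
            throws SolrServerException, IOException, InterruptedException {
        long delayMs = 100;                      // first pause
        final long maxDelayMs = 30_000;          // cap the pause length
        while (true) {
            try {
                client.add(batch);               // send the batch
                return;                          // accepted
            } catch (SolrException e) {
                if (e.code() != 503) throw e;    // only back off on the push-back code
                Thread.sleep(delayMs);
                delayMs = Math.min(delayMs * 2, maxDelayMs);  // exponential backoff
            }
        }
    }
}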

I know Elastic has something similar. Is there already something in our APIs that I don't know about?

Jan

Re: Best throttling / push-back strategy for updates?

Posted by Walter Underwood <wu...@wunderwood.org>.
Circuit breakers only cancel searches. They don’t touch updates.
I was in that code a few weeks ago and have a patch waiting for approval.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 11, 2021, at 4:33 PM, Mike Drob <md...@mdrob.com> wrote:
> 
> The new circuit breakers might be able to offer some rate limiting.


Re: Best throttling / push-back strategy for updates?

Posted by Mike Drob <md...@mdrob.com>.
The new circuit breakers might be able to offer some rate limiting.


Re: Best throttling / push-back strategy for updates?

Posted by Dwane Hall <dw...@hotmail.com>.
I really like the idea. I too have had instances in the past where (some) updates fail because of long(ish) GC pause times due to overloading, and having the option to pause indexing and give Solr a chance to catch up would be very useful. I typically have a retry clause managing these issues, but I'm generally catching generic errors, so a specific error code that you could catch, sleep on, and retry at future intervals has some merit in my opinion.

Thanks,

Dwane


Re: Best throttling / push-back strategy for updates?

Posted by Jan Høydahl <ja...@cominvent.com>.
Yes, that is what I'm recommending to customers right now: manually match indexing threads with CPUs. That is the "manual" way.

My question was rather whether we have, or want to add, some dynamic backoff system so that clients can just go full speed until told to back off, and thus adjust perfectly to what Solr can swallow.
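
One way to read "full speed until told to back off" is client-side AIMD (additive increase, multiplicative decrease), the same idea TCP uses for congestion control. A sketch, assuming the hypothetical push-back signal from the first mail; none of this is an existing Solr API:

public class AimdRateController {
    private double batchesPerSecond = 1.0;         // start gently
    private static final double STEP = 1.0;        // additive increase per accepted batch
    private static final double CUT = 0.5;         // multiplicative decrease on push-back
    private static final double MAX = 500.0;       // safety ceiling
    private static final double MIN = 0.1;         // floor so the client keeps probing

    synchronized void onAccepted()  { batchesPerSecond = Math.min(batchesPerSecond + STEP, MAX); }
    synchronized void onPushBack()  { batchesPerSecond = Math.max(batchesPerSecond * CUT, MIN); }
    synchronized long pauseMillis() { return (long) (1000.0 / batchesPerSecond); }
}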

I had a client the other day who ingested too fast into a system with other query load going on at the same time, and it caused some serious slowdown and even GC pauses.

Jan



Re: Best throttling / push-back strategy for updates?

Posted by Walter Underwood <wu...@wunderwood.org>.
In a master/slave system, it is OK to run as fast as possible to the master.
In a cloud system, we want to keep the indexing load at a level that doesn’t
interfere with queries.

I do this by matching the number of indexing threads to the number of CPUs.
Very roughly, two threads will keep one CPU busy: one thread waiting for
the CPU to finish the current batch and another sending the next batch.

With an 8 CPU machine, use 16 threads to use 100%. Or use 4 threads 
to use 25% (2 CPUs).
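
A sketch of that rule of thumb as a fixed sender pool (the class and method names are just illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class IndexerPool {
    // Two sender threads per Solr CPU you want busy: one waits on the
    // in-flight batch while the other prepares and sends the next one.
    static ExecutorService forBusyCpus(int targetBusyCpus) {
        return Executors.newFixedThreadPool(2 * targetBusyCpus);
    }
}
// forBusyCpus(8) -> 16 threads, ~100% of an 8-CPU node
// forBusyCpus(2) -> 4 threads,  ~25%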

In a sharded system, the indexing is spread over the leaders. For example,
in our system with 8 shards, 64 threads will keep 2 CPUs busy on each 
leader. That number of threads runs at nearly a half-million updates per
minute, so we don’t need further tuning. 2 busy CPUs is just fine on hosts
with 72 CPUs.

Also, we don’t use the cloud-aware client stuff, we just throw update batches
at the load balancer. One loader is a simple Python program, so it sends
it all as JSON. That is the one doing 480k/min with 64 threads.
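
The same pattern in Java for flavor, a sketch that throws one JSON batch at the load balancer (host, port, collection, and field names are placeholders):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class JsonBatchLoader {
    public static void main(String[] args) throws Exception {
        // Solr's /update handler accepts a bare JSON array of documents
        String url = "http://lb.example.com:8983/solr/mycollection/update";
        String batch = "[{\"id\":\"1\",\"title_t\":\"doc one\"},"
                     + "{\"id\":\"2\",\"title_t\":\"doc two\"}]";
        HttpRequest req = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(batch))
                .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
                .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode() + " " + resp.body());
    }
}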

Finally, we use a separate load balancer for indexing. That lets us set different
response time alert levels for query traffic and update traffic. It also allows us
to see anomalous bursts of query traffic separate from updates.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)
