Posted to dev@pulsar.apache.org by Joe F <jo...@gmail.com> on 2017/09/07 14:44:22 UTC
Re: [GitHub] jai1 opened a new issue #745: Reduce the backoff time in Clients
Jai,
If I understand this correctly, this would almost double the number of
attempts in the first 2 seconds (5 -> 8), and similarly for the first
10 seconds too?
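
One way to arrive at those counts, as a rough sketch: it assumes the numbers refer to backoff values shorter than the window (not cumulative elapsed time), an initial backoff of 100 ms as in the issue, and windows well below the 60-second cap. The function name is hypothetical and is not part of the Pulsar client.

```python
def count_backoffs_below(window_ms, multiplier, initial_ms=100.0):
    """Count backoff values strictly shorter than window_ms.

    Assumption: the 60 s cap is ignored because the windows of interest
    (2 s and 10 s) are far below it.
    """
    if multiplier <= 1 or initial_ms <= 0:
        raise ValueError("need multiplier > 1 and initial_ms > 0")
    count = 0
    delay = initial_ms
    while delay < window_ms:
        count += 1
        delay *= multiplier
    return count


if __name__ == "__main__":
    for window in (2_000, 10_000):
        current = count_backoffs_below(window, 2)
        proposed = count_backoffs_below(window, 1.5)
        print(f"window {window} ms: multiplier 2 -> {current}, multiplier 1.5 -> {proposed}")
```

Counting retries by cumulative elapsed time instead would give different numbers, so the interpretation matters when comparing the two multipliers.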
Joe
On Wed, Sep 6, 2017 at 4:26 PM, <gi...@git.apache.org> wrote:
> jai1 opened a new issue #745: Reduce the backoff time in Clients
> URL: https://github.com/apache/incubator-pulsar/issues/745
>
>
> We have a use-case where the customer is latency sensitive, has a huge
> number of topics and ideally, never wants timeouts to occur (2 seconds).
>
> When brokers are restarted, it takes around 1 to 4 seconds for bundles
> to unload, especially bundles with 800+ topics, which leads to
> timeouts. To make matters worse, our backoff logic works as follows:
>
> | Try number | Backoff value | Time of next try (cumulative) |
> | ------------- |:-------------:| -----:|
> | 1 | 100ms | 100ms |
> | 2 | 200ms | 300ms |
> | 3 | 400ms | 700ms |
> | 4 | 800ms | **1500ms** |
> | 5 | 1600ms | **3100ms** |
> | 6 | 3200ms | 6300ms |
>
> As highlighted, if the connection attempt at 1.5 seconds fails, the next
> attempt is not made until 3.1 seconds, after the 2-second timeout has
> already expired. We lose the remaining 0.5 seconds in which a retry could
> potentially have got the message through.
>
>
> We could make the initial backoff value (100 ms) and the multiplier (2)
> configurable, but as pointed out by @rdhabalia, not many clients will be
> interested in configuring these values, so we can maybe just hard-code
> the multiplier to 1.5 instead of 2.
>
> Difference: as seen below, the backoff time grows more slowly with a
> multiplier of 1.5, but it still reaches the max value (60000 ms) as
> desired.
>
> | Failure number | Current backoff (ms), multiplier = 2 | New backoff (ms), multiplier = 1.5 |
> |----------------|----------------------|------------------|
> | 1 | 100 | 100 |
> | 2 | 200 | 150 |
> | 3 | 400 | 225 |
> | 4 | 800 | 338 |
> | 5 | 1600 | 506 |
> | 6 | 3200 | 759 |
> | 7 | 6400 | 1139 |
> | 8 | 12800 | 1709 |
> | 9 | 25600 | 2563 |
> | 10 | 51200 | 3844 |
> | 11 | 60000 | 5767 |
> | 12 | 60000 | 8650 |
> | 13 | 60000 | 12975 |
> | 14 | 60000 | 19462 |
> | 15 | 60000 | 29193 |
> | 16 | 60000 | 43789 |
> | 17 | 60000 | 60000 |
> | 18 | 60000 | 60000 |
>
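
The table above can be reproduced with a short sketch. It assumes an initial backoff of 100 ms, a cap of 60000 ms, and values rounded to the nearest millisecond, which is what the table appears to use. The function name and signature are hypothetical and do not come from the Pulsar client code.

```python
import math


def backoff_schedule(multiplier, failures, initial_ms=100.0, max_ms=60_000.0):
    """Return (failure number, backoff ms, cumulative ms) for each failure.

    Assumptions: exponential growth by `multiplier`, capped at max_ms,
    with values rounded half-up to whole milliseconds for display.
    """
    rows = []
    delay = initial_ms
    elapsed = 0.0
    for failure in range(1, failures + 1):
        capped = min(delay, max_ms)
        elapsed += capped
        rows.append((failure, math.floor(capped + 0.5), math.floor(elapsed + 0.5)))
        delay *= multiplier
    return rows


if __name__ == "__main__":
    current = backoff_schedule(2, 18)
    proposed = backoff_schedule(1.5, 18)
    for (n, old_ms, _), (_, new_ms, _) in zip(current, proposed):
        print(f"| {n} | {old_ms} | {new_ms} |")
```

The third element of each row is the cumulative time at which the next try happens, which matches the "next try" column of the first table for a multiplier of 2.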
>
> @saandrews @rdhabalia @merlimat - The code change for this is very
> small - just wanted consent on the approach.
>