Posted to dev@pulsar.apache.org by Joe F <jo...@gmail.com> on 2017/09/07 14:44:22 UTC

Re: [GitHub] jai1 opened a new issue #745: Reduce the backoff time in Clients

Jai,

If I understand this correctly, this would almost double the number of
attempts in the first 2 secs (5 -> 8), and similarly for the first 10 secs
too?

Joe

On Wed, Sep 6, 2017 at 4:26 PM, <gi...@git.apache.org> wrote:

> jai1 opened a new issue #745: Reduce the backoff time in Clients
> URL: https://github.com/apache/incubator-pulsar/issues/745
>
>
>    We have a use case where the customer is latency-sensitive, has a huge
> number of topics and, ideally, never wants timeouts (2 seconds) to occur.
>
>    When brokers are restarted, it takes around 1 to 4 seconds for bundles
> to unload, especially bundles with 800+ topics, which leads to timeouts.
> To make matters worse, our backoff logic works as follows:
>
>    | Try number | Backoff Value | Next try   |
>    |------------|:-------------:|-----------:|
>    | 1          | 100ms         | 100ms      |
>    | 2          | 200ms         | 300ms      |
>    | 3          | 400ms         | 700ms      |
>    | 4          | 800ms         | **1500ms** |
>    | 5          | 1600ms        | **3100ms** |
>    | 6          | 3200ms        | 6300ms     |
>
>    As highlighted, if the connection attempt at 1.5 seconds fails, the
> next attempt is made at 3.1 seconds, so we lose out on the remaining 0.5
> seconds before the 2-second timeout, in which we could have potentially
> got the message.
>
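
The "Next try" column above is just the running sum of the backoff values,
and the 0.5 seconds is the gap between the last retry that fits inside the
2-second timeout and the timeout itself. A minimal Python sketch of that
arithmetic follows; the function names are made up for illustration and this
is not the Pulsar client code.

```python
from itertools import accumulate


def retry_times(initial_ms=100, multiplier=2, tries=6):
    """Elapsed time (ms) at which each retry fires, i.e. the 'Next try' column."""
    backoffs = [initial_ms * multiplier ** i for i in range(tries)]
    return list(accumulate(backoffs))


def idle_before_timeout(times_ms, timeout_ms=2000):
    """Gap (ms) between the last retry that fits in the timeout and the timeout."""
    last = max((t for t in times_ms if t <= timeout_ms), default=0)
    return timeout_ms - last


if __name__ == "__main__":
    times = retry_times()
    print(times)
    print(idle_before_timeout(times))
```

With the defaults this reproduces the table (retries at 100, 300, 700, 1500,
3100 and 6300 ms) and reports the 500 ms in which no attempt is made.
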
>
>    We could make the initial backoff value (100 ms) and multiplier (2)
> configurable but, as pointed out by @rdhabalia, not many clients will be
> interested in configuring this value, hence we could maybe just hard-code
> the multiplier to 1.5 instead of 2.
>
>    Difference: As seen below, the backoff time grows more slowly with a
> multiplier of 1.5, but it soon catches up and reaches the max value as
> desired.
>
>    | Failure Number | Current backoff (ms) | New backoff (ms) |
>    |----------------|----------------------|------------------|
>    |                | Multiplier = 2       | Multiplier = 1.5 |
>    | 1              | 100                  | 100              |
>    | 2              | 200                  | 150              |
>    | 3              | 400                  | 225              |
>    | 4              | 800                  | 338              |
>    | 5              | 1600                 | 506              |
>    | 6              | 3200                 | 759              |
>    | 7              | 6400                 | 1139             |
>    | 8              | 12800                | 1709             |
>    | 9              | 25600                | 2563             |
>    | 10             | 51200                | 3844             |
>    | 11             | 60000                | 5767             |
>    | 12             | 60000                | 8650             |
>    | 13             | 60000                | 12975            |
>    | 14             | 60000                | 19462            |
>    | 15             | 60000                | 29193            |
>    | 16             | 60000                | 43789            |
>    | 17             | 60000                | 60000            |
>    | 18             | 60000                | 60000            |
>
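
The two columns above can be regenerated with a short Python sketch, assuming
the backoff is multiplied after every failure, rounded to the nearest
millisecond for display, and capped at 60000 ms. The function name and
defaults are assumptions for illustration, not the actual client
implementation.

```python
def backoff_schedule(initial_ms=100, multiplier=2.0, max_ms=60000, failures=18):
    """Backoff (ms) applied after each consecutive failure, capped at max_ms."""
    schedule = []
    delay = float(initial_ms)
    for _ in range(failures):
        # Round only for display; keep the unrounded value for the next step.
        schedule.append(min(round(delay), max_ms))
        delay = min(delay * multiplier, max_ms)
    return schedule


if __name__ == "__main__":
    current = backoff_schedule(multiplier=2.0)
    proposed = backoff_schedule(multiplier=1.5)
    for n, (cur, new) in enumerate(zip(current, proposed), start=1):
        print(f"{n:>2}  {cur:>6}  {new:>6}")
```

Under those assumptions the multiplier of 2 hits the cap on failure 11 and
the multiplier of 1.5 on failure 17, matching the table.
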
>
>    @saandrews @rdhabalia @merlimat  - The code change for this is very
> small - just wanted consent on the approach.
>