Posted to users@kafka.apache.org by Jose Manuel Vega Monroy <jo...@williamhill.com> on 2019/11/08 17:18:50 UTC

Message order and retries question

Hi there,

I have a question about message order and retries.

After checking the official documentation and asking for your feedback, we set this Kafka client configuration in each producer:


    retries = 1

    # note: to ensure ordering we set enable.idempotence=true, which requires acks=all and max.in.flight.requests.per.connection<=5

    enable.idempotence = true

    max.in.flight.requests.per.connection = 4

    acks = "all"

However, during a rolling upgrade we saw the producer retrying many times (for example, 16 times) and finally sending successfully once the broker was back up and running, with exceptions like this:

Cause: org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number..
Cause: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition..

Is that behaviour expected? Is that retries configuration right for ensuring message order, or should we remove the retries configuration from our producers?

We also found this related to retries:

The default value for the producer's retries config was changed to Integer.MAX_VALUE, as we introduced delivery.timeout.ms in KIP-91, which sets an upper bound on the total time between sending a record and receiving acknowledgement from the broker. By default, the delivery timeout is set to 2 minutes.


Allowing retries without setting `max.in.flight.requests.per.connection` to `1` will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first. Note additionally that produce requests will be failed before the number of retries has been exhausted if the timeout configured by delivery.timeout.ms expires first before successful acknowledgement. Users should generally prefer to leave this config unset and instead use delivery.timeout.ms to control retry behavior.

Note that we hit this during a rolling upgrade from 2.1.1 to 2.2.1.

Thanks

Jose Manuel Vega Monroy
Java Developer / Software Developer Engineer in Test
Direct: +0035 0 2008038 (Ext. 8038)
Email: jose.monroy@williamhill.com
William Hill | 6/1 Waterport Place | Gibraltar | GX11 1AA




Re: Message order and retries question

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Since you enable idempotence, you should set retries to
`Integer.MAX_VALUE` -- for newer versions, in which the default is
`Integer.MAX_VALUE`, you can of course remove the config.

This will give you strict ordering guarantees, assuming that your topic
is configured correctly, i.e., `min.insync.replicas=2` and
`replication.factor=3` (to tolerate a single broker failure without
losing data or availability).
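
For illustration, the producer settings recommended above could be collected as follows. This is a minimal sketch, not a definitive setup: the class name `OrderedProducerConfigSketch`, the `build` helper and the `localhost:9092` address are hypothetical, and the keys are written as plain strings so the example runs without the Kafka client library on the classpath. A real producer would also need key and value serializers.

```java
import java.util.Properties;

final class OrderedProducerConfigSketch {

    // Builds the producer settings discussed in this thread using plain
    // string keys (hypothetical helper, not part of the Kafka API).
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Idempotence is what preserves ordering across retries.
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        // Must stay <= 5 when idempotence is enabled.
        props.setProperty("max.in.flight.requests.per.connection", "4");
        // On newer clients this is already the default and can be omitted.
        props.setProperty("retries", Integer.toString(Integer.MAX_VALUE));
        // Bounds the total time spent on a record, retries included (2 minutes).
        props.setProperty("delivery.timeout.ms", "120000");
        return props;
    }

    public static void main(String[] args) {
        // The resulting Properties would be passed to a KafkaProducer
        // constructor together with serializer settings.
        Properties props = build("localhost:9092");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The topic-level settings (`min.insync.replicas`, `replication.factor`) are configured on the broker or topic, not on the producer, so they do not appear in this sketch.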

With retries=1 I am not surprised that you get exceptions if a broker
failover occurs.

> Allowing retries without setting `max.in.flight.requests.per.connection` to `1` will potentially change the ordering of records...

This applies only if idempotence is disabled. Hence, you can leave
the max.in.flight config at `4` and still have ordering guarantees.
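
The difference can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not the broker's actual implementation: `ToyPartition` and `run` are hypothetical names, and the real protocol tracks sequence numbers per producer id and partition. It only shows why a sequence check stops a later batch from landing ahead of a retried earlier one when several requests are in flight.

```java
import java.util.ArrayList;
import java.util.List;

final class RetryOrderingSketch {

    /** Toy partition log that can optionally enforce consecutive sequence numbers. */
    static final class ToyPartition {
        private final boolean checkSequence;
        private final List<String> log = new ArrayList<>();
        private int nextExpectedSeq = 0;

        ToyPartition(boolean checkSequence) {
            this.checkSequence = checkSequence;
        }

        /** Returns true if the batch was appended, false if it was rejected as out of order. */
        boolean append(int seq, String batch) {
            if (checkSequence && seq != nextExpectedSeq) {
                return false; // loosely analogous to an out-of-order sequence error
            }
            log.add(batch);
            nextExpectedSeq = seq + 1;
            return true;
        }

        List<String> log() {
            return log;
        }
    }

    /** Batches A (seq 0) and B (seq 1) are both in flight; A's first attempt is lost. */
    static List<String> run(boolean checkSequence) {
        ToyPartition partition = new ToyPartition(checkSequence);
        // A's first attempt never reaches the partition (e.g. the leader moved).
        boolean bAccepted = partition.append(1, "B"); // B arrives first
        partition.append(0, "A");                     // retry of A
        if (!bAccepted) {
            partition.append(1, "B");                 // retry of B, now after A
        }
        return partition.log();
    }

    public static void main(String[] args) {
        System.out.println("without sequence check: " + run(false)); // [B, A]
        System.out.println("with sequence check:    " + run(true));  // [A, B]
    }
}
```

In this model the rejected batch has to be retried, which is why a small retries value can surface errors during a failover even though ordering itself is protected.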


-Matthias






Re: Message order and retries question

Posted by "M. Manna" <ma...@gmail.com>.
Hi,

On Fri, 8 Nov 2019 at 17:19, Jose Manuel Vega Monroy <
jose.monroy@williamhill.com> wrote:

> Hi there,
>
> I have a question about message order and retries.
>
> After checking official documentation, and asking your feedback, we set
> this kafka client configuration in each producer:
>
>     retries = 1
>
>     # note to ensure order enable.idempotence=true, which forcing to acks=all and max.in.flight.requests.per.connection<=5
>
>     enable.idempotence = true
>
>     max.in.flight.requests.per.connection = 4
>
>     acks = "all"

The documentation also says:

> Allowing retries without setting max.in.flight.requests.per.connection to
> 1 will potentially change the ordering of records because if two batches
> are sent to a single partition, and the first fails and is retried but the
> second succeeds, then the records in the second batch may appear first.
> Note additionally that produce requests will be failed before the number of
> retries has been exhausted if the timeout configured by
> delivery.timeout.ms expires first before successful acknowledgement.
> Users should generally prefer to leave this config unset and instead use
> delivery.timeout.ms to control retry behavior.


Are you planning to do it via delivery.timeout.ms?
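
If retry behaviour is controlled through delivery.timeout.ms, the producer documentation states that its value should be greater than or equal to the sum of request.timeout.ms and linger.ms. The check can be sketched as below; the class and method names are hypothetical, and the sample values (120000, 0 and 30000 ms) are assumed to match the 2.x client defaults rather than taken from this thread.

```java
final class DeliveryTimeoutSketch {

    // True when delivery.timeout.ms leaves room for at least one full
    // request after any linger time (hypothetical helper).
    static boolean isConsistent(long deliveryTimeoutMs, long lingerMs, long requestTimeoutMs) {
        return deliveryTimeoutMs >= lingerMs + requestTimeoutMs;
    }

    public static void main(String[] args) {
        // Assumed defaults: delivery.timeout.ms=120000, linger.ms=0, request.timeout.ms=30000.
        System.out.println(isConsistent(120_000, 0, 30_000)); // true
        // A delivery timeout shorter than a single request timeout is inconsistent.
        System.out.println(isConsistent(20_000, 5, 30_000));  // false
    }
}
```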

