Posted to users@kafka.apache.org by Anatoliy Soldatov <ak...@avito.ru.INVALID> on 2019/03/01 13:23:51 UTC

Re: Errors when broker rejoins the cluster 5+ mins after clean shutdown

Hello, Marcos!

Take a look at max.block.ms in your producer config. I guess you need to tune this parameter (in your case it should be something close to max.block.ms=360000).
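
For reference, here is a minimal sketch of raising it with the plain Java client (the bootstrap servers and serializers below are just placeholders, adjust them to your setup):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class TunedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder broker list; point this at your own cluster.
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            // Raise max.block.ms so send()/partitionsFor() can block longer while
            // waiting for metadata or buffer space as a broker rejoins, instead of
            // failing with "Failed to allocate memory within the configured max
            // blocking time".
            props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "360000");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // ... send records as usual ...
            }
        }
    }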

Tolya

> On 27 Feb 2019, at 22:09, Marcos Juarez <mj...@gmail.com> wrote:
>
> We're doing some testing on Kafka 1.1 brokers in AWS EC2. Specifically,
> we're cleanly shutting brokers down for 5 mins or more, and then restarting
> them, while producing and consuming from the cluster all the time. In
> theory, this should be relatively seamless to both producers and consumers.
>
> However, we're getting these errors from the producer application when we
> test with a broker down for over 5 mins:
>
> ERROR - 2019-02-27 04:34:24.946 - message size [2494]
>       -Expiring 7 record(s) for topic2-0: 30033 ms has passed since last append
>
> And then we see the first of many errors similar to this:
>
> ERROR - 2019-02-27 04:35:13.098; Topic [topic2], message size [2494]
>       -Failed to allocate memory within the configured max blocking
> time 60000 ms.
>
> At that point, the long-running producers become non-responsive, and every
> producer request fails with that same Failed to allocate memory error. I
> tried to search online for similar issues, but all I could find was an old
> Kafka JIRA ticket that was resolved in 0.10.1.1, so it shouldn't apply to
> the newer 1.1 version we're using.
>
> https://issues.apache.org/jira/browse/KAFKA-3651
>
> We have attempted a lot of different scenarios, including changing the
> producer configuration back to Kafka defaults, to see if anything would
> help, but we always run into that problem whenever a broker comes back into
> the cluster after being down for 5 mins or more.
>
> I also posted on SO:
> https://stackoverflow.com/questions/54911991/failed-to-allocate-memory-and-expiring-record-error-after-kafka-broker-is-do
> Any idea what we might be doing wrong?
>
> Thanks,
>
> Marcos

