Posted to users@kafka.apache.org by Omer Litov <om...@imperva.com> on 2018/05/22 15:32:49 UTC

Failed on publish to Kafka

Hi,
I have 4 producers writing to a Kafka cluster with 12 partitions.
Periodically (about once every hour or two), we get exceptions while trying to send messages to Kafka.
The exceptions are “Expiring 12 record(s) for <topic_name>-<partition>: 49193 ms has passed since batch creation plus linger time”.
We get that for all partitions (some more than others).
When those exceptions arrive, they usually come a few hundred together in under a second, and then stop.
We don’t see any indication of retries (even though they are configured).

We first used a cluster of 3 brokers with version 0.10.1.
Then we upgraded to version 1.0.1 (with 3 brokers), and eventually added 2 more brokers.
On all of those settings there was no change in the described behavior.

I would appreciate any lead on investigating the issue.

Thanks,
Omer
-------------------------------------------
NOTICE:
This email and all attachments are confidential, may be proprietary, and may be privileged or otherwise protected from disclosure. They are intended solely for the individual or entity to whom the email is addressed. However, mistakes sometimes happen in addressing emails. If you believe that you are not an intended recipient, please stop reading immediately. Do not copy, forward, or rely on the contents in any way. Notify the sender and/or Imperva, Inc. by telephone at +1 (650) 832-6006 and then delete or destroy any copy of this email and its attachments. The sender reserves and asserts all rights to confidentiality, as well as any privileges that may apply. Any disclosure, copying, distribution or action taken or omitted to be taken by an unintended recipient in reliance on this message is prohibited and may be unlawful.
Please consider the environment before printing this email.

Re: Failed on publish to Kafka

Posted by Omer Litov <om...@imperva.com>.

Hi,
Thanks, I will try it.
I just don’t understand a couple of things:
1. How could reducing the buffer size help here?
2. Why is using a dummy topic relevant? Can’t I just continue to send to the existing topic?


Thanks

On 23/05/2018, 9:33 AM, "M. Manna" <ma...@gmail.com> wrote:

    Can you try the following?
    
    1) Reduce the buffer size to exactly 2x your batch size.
    
    2) Back up and save your old properties file.
    
    3) Create a dummy topic.
    
    4) Produce and consume messages using the new config.
    
    Let us know.
    
Re: Failed on publish to Kafka

Posted by "M. Manna" <ma...@gmail.com>.

Can you try the following?

1) Reduce the buffer size to exactly 2x your batch size.

2) Back up and save your old properties file.

3) Create a dummy topic.

4) Produce and consume messages using the new config.

Let us know.
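
A minimal sketch of steps 1 and 2, assuming the producer settings are held in a java.util.Properties object. The class and method names (TrialConfig, withReducedBuffer) are hypothetical and only for illustration; the batch.size and buffer.memory values are the ones quoted below. The original object is left untouched so the old configuration stays available.

```java
import java.util.Properties;

// Hypothetical helper, not part of the Kafka client library.
final class TrialConfig {

    // Returns a copy of the given settings with buffer.memory set to 2 x batch.size.
    // The original Properties object is not modified, so it doubles as the backup.
    static Properties withReducedBuffer(Properties original) {
        Properties trial = new Properties();
        trial.putAll(original);
        long batchSize = Long.parseLong(original.getProperty("batch.size"));
        trial.setProperty("buffer.memory", Long.toString(2 * batchSize));
        return trial;
    }

    public static void main(String[] args) {
        Properties current = new Properties();
        current.setProperty("batch.size", "16384");
        current.setProperty("buffer.memory", "33554432");

        Properties trial = withReducedBuffer(current);
        // With batch.size = 16384 this prints "buffer.memory = 32768".
        System.out.println("buffer.memory = " + trial.getProperty("buffer.memory"));
    }
}
```

The trial settings would then be used only by a producer writing to the dummy topic, which is an assumption about how steps 3 and 4 are meant to be carried out.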

On Wed, 23 May 2018 at 07:02, Omer Litov <om...@imperva.com> wrote:

> Hi,
> Thanks for the quick response.
>
> What I meant is that we had those exceptions on both versions (0.10.1 and
> 1.0.1), so upgrading didn't solve the problem. The same goes for increasing
> the cluster size.
>
> Here are the properties we set:
>
> "key.serializer" = "org.apache.kafka.common.serialization.StringSerializer"
> "value.serializer" = "org.apache.kafka.common.serialization.ByteArraySerializer"
> "acks" = "all"
> "retries" = 5
> "buffer.memory" = 33554432
> "batch.size" = 16384
> "request.timeout.ms" = 50000
>
>
> thanks,
> Omer

Re: Failed on publish to Kafka

Posted by Omer Litov <om...@imperva.com>.

Hi,
Thanks for the quick response.

What I meant is that we had those exceptions on both versions (0.10.1 and 1.0.1), so upgrading didn't solve the problem. The same goes for increasing the cluster size.

Here are the properties we set:

"key.serializer" = "org.apache.kafka.common.serialization.StringSerializer"
"value.serializer" = "org.apache.kafka.common.serialization.ByteArraySerializer"
"acks" = "all"
"retries" = 5
"buffer.memory" = 33554432
"batch.size" = 16384
"request.timeout.ms" = 50000
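
For reference, a minimal sketch of how the settings above might be assembled in Java. The class name ProducerSettings is hypothetical. bootstrap.servers is left out because it is not given in this thread, although a real producer needs it. Numeric values are written as strings, which the Kafka client accepts.

```java
import java.util.Properties;

// Hypothetical helper, not part of the Kafka client library.
final class ProducerSettings {

    // Builds the producer settings exactly as listed in this message.
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.setProperty("acks", "all");
        props.setProperty("retries", "5");
        props.setProperty("buffer.memory", "33554432");   // 32 MiB
        props.setProperty("batch.size", "16384");          // 16 KiB
        props.setProperty("request.timeout.ms", "50000");  // 50 seconds
        // bootstrap.servers must be added before creating a producer (not given here).
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // Print each setting so it can be compared with the values the producer logs at startup.
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " = " + props.getProperty(name));
        }
    }
}
```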


thanks,
Omer

On 22/05/2018, 8:14 PM, "M. Manna" <ma...@gmail.com> wrote:

    Hi,
    
    You said "On all of those settings there was no change in the described
    behavior" - this is slightly confusing. Could you please clarify this? If
    there is no change, does that mean everything is working :) ?
    
    From the provided exception stack, it seems as if you are waiting to batch
    a lot of requests but they are getting timed out for some reason.
    How did you upgrade the cluster to 1.0.1, and have you changed any config?
    
    Regards,
    
    
Re: Failed on publish to Kafka

Posted by "M. Manna" <ma...@gmail.com>.

Hi,

You said "On all of those settings there was no change in the described
behavior" - this is slightly confusing. Could you please clarify this? If
there is no change, does that mean everything is working :) ?

From the provided exception stack, it seems as if you are waiting to batch
a lot of requests but they are getting timed out for some reason.
How did you upgrade the cluster to 1.0.1, and have you changed any config?

Regards,


On 22 May 2018 at 16:32, Omer Litov <om...@imperva.com> wrote:

> Hi,
> I have 4 producers writing to a Kafka cluster with 12 partitions.
> Periodically (about once every hour or two), we get exceptions while trying
> to send messages to Kafka.
> The exceptions are “Expiring 12 record(s) for <topic_name>-<partition>:
> 49193 ms has passed since batch creation plus linger time”.
> We get that for all partitions (some more than others).
> When those exceptions arrive, they usually come a few hundred together in
> under a second, and then stop.
> We don’t see any indication of retries (even though they are configured).
>
> We first used a cluster of 3 brokers with version 0.10.1.
> Then we upgraded to version 1.0.1 (with 3 brokers), and eventually added
> 2 more brokers.
> On all of those settings there was no change in the described behavior.
>
>
> I would appreciate any lead on investigating the issue.
>
>
> Thanks,
> Omer