Posted to users@kafka.apache.org by Mihaela Stoycheva <mi...@gmail.com> on 2018/03/27 12:25:46 UTC

Question about Kafka Streams error message when a message is larger than the maximum size the server will accept

Hello,

I have a Kafka Streams application that is consuming from two topics and
internally aggregating, transforming and joining data. One of the
aggregation steps adds an id to an ArrayList of ids. Naturally, since
there was a lot of data, the changelog message became too big and was not
sent to the changelog topic, failing with the following exception:

[ERROR]  (1-producer)
org.apache.kafka.streams.processor.internals.RecordCollectorImpl   -
task [2_2] Error sending record (key {"eventId":432897452,"version":1}
value [<byte array>] timestamp 1521832424795) to topic
<application-id>-KSTREAM-AGGREGATE-STATE-STORE-0000000016-changelog
due to {}; No more records will be sent and no more offsets will be
recorded for this task.
org.apache.kafka.common.errors.RecordTooLargeException: The request included
a message larger than the max message size the server will accept.
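
For reference, the aggregation in question looks roughly like the sketch
below (topic and store names are made up, and JsonArrayListSerde is just a
stand-in for our own JSON Serde). The point is that the aggregated
ArrayList, and therefore the serialized changelog value, grows with every
incoming record:

import java.util.ArrayList;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class IdAggregationSketch {

    public static void main(final String[] args) {
        final StreamsBuilder builder = new StreamsBuilder();

        // Made-up input topic of (eventKey, id) records; key/value serdes are
        // assumed to come from the default serde configuration.
        final KStream<String, Long> events = builder.stream("events");

        // Collect every id seen for a key into an ArrayList. The list, and so
        // the serialized value written to the state store's changelog topic,
        // grows with each record and can eventually exceed the broker's
        // maximum message size.
        events.groupByKey()
              .aggregate(
                  ArrayList::new,
                  (key, id, ids) -> { ids.add(id); return ids; },
                  Materialized.<String, ArrayList<Long>, KeyValueStore<Bytes, byte[]>>as("id-list-store")
                      .withKeySerde(Serdes.String())
                      // Stand-in for our own JSON Serde for the list value.
                      .withValueSerde(new JsonArrayListSerde()));

        // builder.build() would then be passed to new KafkaStreams(...).
    }
}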

In the error message above, the key is nicely formatted JSON, as it should
be, but the value is an enormous byte array instead of JSON. I checked the
corresponding changelog topic and the messages that were logged before that
are JSON strings. I am also using Serdes for both the key and value classes.
My question is: why is the key logged as JSON, but the value logged as a
byte array instead of JSON?
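
As far as I understand, the "max message size the server will accept" is the
broker's message.max.bytes limit (or a per-topic max.message.bytes override
on the changelog topic), and the producer that Kafka Streams creates
internally additionally enforces its own max.request.size. For completeness,
the producer side would be raised from the Streams configuration roughly
like this (values are purely illustrative):

import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class MessageSizeConfigSketch {

    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // Raise the request size limit of the producers created internally by
        // Kafka Streams, which also write to the changelog topics.
        props.put(StreamsConfig.producerPrefix(ProducerConfig.MAX_REQUEST_SIZE_CONFIG),
                  2 * 1024 * 1024); // illustrative: 2 MB

        // Note: the broker-side message.max.bytes (server.properties) and the
        // topic-level max.message.bytes on the changelog topic must also be
        // large enough, or the broker will keep rejecting the record with
        // RecordTooLargeException.
    }
}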

Regards,
Mihaela Stoycheva

Re: Question about Kafka Streams error message when a message is larger than the maximum size the server will accept

Posted by Guozhang Wang <wa...@gmail.com>.
Yes, that is related; thanks for the reminder.

I've updated the JIRA to reflect your observations as well.


Guozhang


On Wed, Mar 28, 2018 at 12:41 AM, Mihaela Stoycheva <
mihaela.stoycheva@gmail.com> wrote:

> Hello Guozhang,
>
> Thank you for the answer; that could explain what is happening. Is it
> possible that this is related in some way to
> https://issues.apache.org/jira/browse/KAFKA-6538?
>
> Mihaela
>
> On Wed, Mar 28, 2018 at 2:21 AM, Guozhang Wang <wa...@gmail.com> wrote:
>
> > Hello Mihaela,
> >
> > It is possible that when you have caching enabled, the value of the
> > record has already been serialized before being sent to the changelogger,
> > while the key was not. Admittedly, the related log4j entries are not very
> > friendly for troubleshooting.
> >
> >
> > Guozhang
> >
> >
> > On Tue, Mar 27, 2018 at 5:25 AM, Mihaela Stoycheva <
> > mihaela.stoycheva@gmail.com> wrote:
> >
> > > Hello,
> > >
> > > I have a Kafka Streams application that is consuming from two topics
> > > and internally aggregating, transforming and joining data. One of the
> > > aggregation steps adds an id to an ArrayList of ids. Naturally, since
> > > there was a lot of data, the changelog message became too big and was
> > > not sent to the changelog topic, failing with the following exception:
> > >
> > > [ERROR]  (1-producer)
> > > org.apache.kafka.streams.processor.internals.RecordCollectorImpl   -
> > > task [2_2] Error sending record (key {"eventId":432897452,"version":1}
> > > value [<byte array>] timestamp 1521832424795) to topic
> > > <application-id>-KSTREAM-AGGREGATE-STATE-STORE-0000000016-changelog
> > > due to {}; No more records will be sent and no more offsets will be
> > > recorded for this task.
> > > org.apache.kafka.common.errors.RecordTooLargeException: The request included
> > > a message larger than the max message size the server will accept.
> > >
> > > In the error message above, the key is nicely formatted JSON, as it
> > > should be, but the value is an enormous byte array instead of JSON. I
> > > checked the corresponding changelog topic and the messages that were
> > > logged before that are JSON strings. I am also using Serdes for both
> > > the key and value classes. My question is: why is the key logged as
> > > JSON, but the value logged as a byte array instead of JSON?
> > >
> > > Regards,
> > > Mihaela Stoycheva
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang

Re: Question about Kafka Streams error message when a message is larger than the maximum size the server will accept

Posted by Mihaela Stoycheva <mi...@gmail.com>.
Hello Guozhang,

Thank you for the answer; that could explain what is happening. Is it
possible that this is related in some way to
https://issues.apache.org/jira/browse/KAFKA-6538?

Mihaela

On Wed, Mar 28, 2018 at 2:21 AM, Guozhang Wang <wa...@gmail.com> wrote:

> Hello Mihaela,
>
> It is possible that when you have caching enabled, the value of the record
> has already been serialized before being sent to the changelogger, while
> the key was not. Admittedly, the related log4j entries are not very
> friendly for troubleshooting.
>
>
> Guozhang
>
>
> On Tue, Mar 27, 2018 at 5:25 AM, Mihaela Stoycheva <
> mihaela.stoycheva@gmail.com> wrote:
>
> > Hello,
> >
> > I have a Kafka Streams application that is consuming from two topics and
> > internally aggregating, transforming and joining data. One of the
> > aggregation steps adds an id to an ArrayList of ids. Naturally, since
> > there was a lot of data, the changelog message became too big and was not
> > sent to the changelog topic, failing with the following exception:
> >
> > [ERROR]  (1-producer)
> > org.apache.kafka.streams.processor.internals.RecordCollectorImpl   -
> > task [2_2] Error sending record (key {"eventId":432897452,"version":1}
> > value [<byte array>] timestamp 1521832424795) to topic
> > <application-id>-KSTREAM-AGGREGATE-STATE-STORE-0000000016-changelog
> > due to {}; No more records will be sent and no more offsets will be
> > recorded for this task.
> > org.apache.kafka.common.errors.RecordTooLargeException: The request included
> > a message larger than the max message size the server will accept.
> >
> > In the error message above, the key is nicely formatted JSON, as it should
> > be, but the value is an enormous byte array instead of JSON. I checked the
> > corresponding changelog topic and the messages that were logged before that
> > are JSON strings. I am also using Serdes for both the key and value
> > classes. My question is: why is the key logged as JSON, but the value
> > logged as a byte array instead of JSON?
> >
> > Regards,
> > Mihaela Stoycheva
> >
>
>
>
> --
> -- Guozhang
>

Re: Question about Kafka Streams error message when a message is larger than the maximum size the server will accept

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Mihaela,

It is possible that when you have caching enabled, the value of the record
has already been serialized before being sent to the changelogger, while the
key was not. Admittedly, the related log4j entries are not very friendly for
troubleshooting.
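
In case it helps, the caching I am referring to is the record cache sitting
in front of the state stores. It can be sized down globally or disabled per
store; a rough sketch (store name and serdes are only placeholders, and
turning the cache off changes buffering behavior, not the size limit itself):

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class CachingConfigSketch {

    public static void main(final String[] args) {
        // Global switch: give the record cache zero bytes.
        final Properties props = new Properties();
        props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);

        // Per-store switch on the Materialized used by the aggregation
        // (placeholder store name and serdes).
        final Materialized<String, String, KeyValueStore<Bytes, byte[]>> store =
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("id-list-store")
                        .withKeySerde(Serdes.String())
                        .withValueSerde(Serdes.String())
                        .withCachingDisabled();
    }
}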


Guozhang


On Tue, Mar 27, 2018 at 5:25 AM, Mihaela Stoycheva <
mihaela.stoycheva@gmail.com> wrote:

> Hello,
>
> I have a Kafka Streams application that is consuming from two topics and
> internally aggregating, transforming and joining data. One of the
> aggregation steps adds an id to an ArrayList of ids. Naturally, since
> there was a lot of data, the changelog message became too big and was not
> sent to the changelog topic, failing with the following exception:
>
> [ERROR]  (1-producer)
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl   -
> task [2_2] Error sending record (key {"eventId":432897452,"version":1}
> value [<byte array>] timestamp 1521832424795) to topic
> <application-id>-KSTREAM-AGGREGATE-STATE-STORE-0000000016-changelog
> due to {}; No more records will be sent and no more offsets will be
> recorded for this task.
> org.apache.kafka.common.errors.RecordTooLargeException: The request included
> a message larger than the max message size the server will accept.
>
> In the error message above, the key is nicely formatted JSON, as it should
> be, but the value is an enormous byte array instead of JSON. I checked the
> corresponding changelog topic and the messages that were logged before that
> are JSON strings. I am also using Serdes for both the key and value
> classes. My question is: why is the key logged as JSON, but the value
> logged as a byte array instead of JSON?
>
> Regards,
> Mihaela Stoycheva
>



-- 
-- Guozhang