Posted to users@kafka.apache.org by l vic <lv...@gmail.com> on 2019/08/21 13:51:42 UTC

OOM for large messages with compression?

I have to deal with large (~16 MB) text messages in my Kafka system, so I
increased several message size limits on the broker, producer, and consumer
side, and now the system is able to get them through... I also tried to
enable compression in the producer:

compression.type=gzip

but to my surprise I ended up with OOM exceptions on the producer side:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:300)
    at java.lang.StringCoding.encode(StringCoding.java:344)
    at java.lang.String.getBytes(String.java:918)
    at org.apache.kafka.common.serialization.StringSerializer.serialize(StringSerializer.java:43)
    at org.apache.kafka.common.serialization.StringSerializer.serialize(StringSerializer.java:24)
    at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:326)
    at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:248)

Shouldn't I be able to save memory with compression? Why does compression
have the opposite effect?
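For reference, here is roughly what my producer setup looks like (a minimal
sketch; the broker address, topic name, and sizes are illustrative, not my
exact configuration; the broker and consumer also need their own limits
raised, e.g. message.max.bytes and max.partition.fetch.bytes):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LargeMessageProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");           // compression from the question
        props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 32 * 1024 * 1024); // must exceed the ~16 MB payload
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64 * 1024 * 1024);    // accumulator room for big batches

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String hugeText = "...";                                         // stands in for the ~16 MB message
            producer.send(new ProducerRecord<>("large-messages", hugeText)); // illustrative topic name
        }
    }
}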

Re: OOM for large messages with compression?

Posted by Pere Urbón Bayes <pe...@gmail.com>.
Hi,

https://www.enterpriseintegrationpatterns.com/patterns/messaging/StoreInLibrary.html

and

https://docs.microsoft.com/en-us/azure/architecture/patterns/claim-check

are good sources of information.

At the end of the day, in my experience Apache Kafka performs well with
messages of at most 6 to 8 MB. It is also important to know whether the
large messages are sporadic or regular, etc.

My best recommendation is to implement this pattern.
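
A minimal sketch of the producing side of the pattern (the BlobStore
interface here is hypothetical; in practice it would be S3, HDFS, a
database, etc.):

import java.util.UUID;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ClaimCheckProducer {

    /** Hypothetical external store; in practice S3, HDFS, a database, etc. */
    public interface BlobStore {
        void put(String key, byte[] payload);
    }

    public static void sendViaClaimCheck(KafkaProducer<String, String> producer,
                                         BlobStore store, String topic,
                                         byte[] largePayload) {
        String claimCheck = UUID.randomUUID().toString();       // the "claim check": a small reference key
        store.put(claimCheck, largePayload);                    // park the large body outside Kafka
        producer.send(new ProducerRecord<>(topic, claimCheck)); // Kafka carries only the reference
    }
}

The consumer then looks the payload up by the key it receives, so Kafka
itself only ever moves a few bytes per message.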

-- Pere

Re: OOM for large messages with compression?

Posted by l vic <lv...@gmail.com>.
Thank you, can you elaborate on "Claim Checks"? I didn't find anything with
a search...
Thank you again,


Re: OOM for large messages with compression?

Posted by Pere Urbón Bayes <pe...@gmail.com>.
Hi,
  I agree with Liam: the OOM looks to happen at produce time. I would
look into that.

But with your message size, I would recommend investigating implementing
Claim Checks (the Claim Check pattern). It will be much easier and will
reduce the average message size.
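
To sketch the consuming side as well (again the BlobStore interface is
hypothetical, the counterpart of the producer-side sketch above; the real
backing store is an implementation choice):

import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ClaimCheckConsumer {

    /** Hypothetical external store matching the producer-side sketch. */
    public interface BlobStore {
        byte[] get(String key);
    }

    public static void consume(KafkaConsumer<String, String> consumer, BlobStore store) {
        consumer.subscribe(Collections.singletonList("large-messages")); // illustrative topic name
        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                byte[] payload = store.get(record.value()); // redeem the claim check for the real body
                process(payload);
            }
        }
    }

    private static void process(byte[] payload) {
        // application logic for the retrieved payload goes here
    }
}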

-- Pere

Re: OOM for large messages with compression?

Posted by Liam Clarke <li...@adscale.co.nz>.
Hi l vic,

Your OOM is happening before any compression is applied. It occurs when the
StringSerializer converts the string to bytes. Looking deeper into
StringCoding.encode, it first allocates a byte array big enough to fit your
string, and this is where your OOM happens: line 300 of StringCoding.java
is byte[] ba = new byte[en];

Compression is applied after the string is serialized to bytes, so you'll
need to increase your heap size to support this.
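
To make that concrete, here is a minimal sketch of what the serializer
effectively does (the class and message size here are illustrative):

import java.nio.charset.StandardCharsets;

public class SerializerAllocation {
    public static void main(String[] args) {
        // Stand-in for a ~16 MB string message.
        String largeString = new String(new char[16 * 1024 * 1024]).replace('\0', 'a');
        // In essence what StringSerializer.serialize() does: the whole string
        // is materialized as one byte array up front, before the producer's
        // compression ever touches it.
        byte[] ba = largeString.getBytes(StandardCharsets.UTF_8);
        System.out.println("Allocated " + ba.length + " bytes before any compression");
    }
}

The heap is raised with the JVM's -Xmx flag on the producer application,
e.g. java -Xmx2g (the value is illustrative; it needs headroom for the
uncompressed bytes plus the producer's buffer.memory).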

Hope that helps :)

Liam Clarke
