Posted to jira@kafka.apache.org by "Robert Wagner (Jira)" <ji...@apache.org> on 2020/10/20 16:30:00 UTC

[jira] [Comment Edited] (KAFKA-10470) zstd decompression with small batches is slow and causes excessive GC

    [ https://issues.apache.org/jira/browse/KAFKA-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217734#comment-17217734 ] 

Robert Wagner edited comment on KAFKA-10470 at 10/20/20, 4:29 PM:
------------------------------------------------------------------

I ran a test with the zstd streams wrapped in buffered streams, as [~yuzawa-san] mentioned. Together with the upgraded zstd-jni package, this seems to give zstd a throughput similar to gzip when using small message batches, with no additional discernible broker memory overhead.  I think this is the way to go.

I don't think the inability to re-use the BufferedOutputStream or BufferedInputStream buffers matters much, as those buffers seem to be connection scoped rather than message batch scoped.
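
For readers following along, the wrapping pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: zstd-jni is not part of the JDK, so the JDK's GZIPInputStream and GZIPOutputStream stand in for the zstd streams here, and the class name, method names, and 16 KB buffer size are hypothetical rather than taken from the Kafka patch.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class BufferedCodecSketch {
    // Assumed buffer size, chosen for illustration only.
    static final int BUFFER_SIZE = 16 * 1024;

    // GZIPOutputStream is a stand-in for the zstd output stream.
    // The buffered wrapper collects small writes before they reach the codec.
    static OutputStream wrapForOutput(OutputStream raw) throws IOException {
        return new BufferedOutputStream(new GZIPOutputStream(raw), BUFFER_SIZE);
    }

    // GZIPInputStream is a stand-in for the zstd input stream.
    // The buffered wrapper serves small reads from memory instead of the codec.
    static InputStream wrapForInput(InputStream raw) throws IOException {
        return new BufferedInputStream(new GZIPInputStream(raw), BUFFER_SIZE);
    }

    static byte[] compress(byte[] payload) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = wrapForOutput(sink)) {
            for (byte b : payload) {
                out.write(b); // single-byte writes land in the buffer first
            }
        } // close() flushes the buffer and finishes the compressed stream
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = wrapForInput(new ByteArrayInputStream(compressed))) {
            int b;
            while ((b = in.read()) != -1) { // single-byte reads hit the buffer
                result.write(b);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "small batch payload".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decompress(compress(original));
        System.out.println(Arrays.equals(original, roundTrip));
    }
}
```

With zstd-jni on the classpath, the same shape would presumably apply with its ZstdInputStream and ZstdOutputStream in place of the gzip classes; whether the throughput gain carries over depends on the workload.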


was (Author: wolfchimneyrock):
I ran a test with the zstd streams wrapped in buffered streams, as [~yuzawa-san] mentioned. Together with the upgraded zstd-jni package, this seems to give zstd a throughput similar to gzip when using small message batches, with no additional discernible broker memory overhead.  I think this is the way to go.

> zstd decompression with small batches is slow and causes excessive GC
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-10470
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10470
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Robert Wagner
>            Priority: Major
>
> Similar to KAFKA-5150 but for zstd instead of LZ4: it appears that the large decompression buffer (128 KB) that zstd-jni creates per batch is causing a significant performance bottleneck.
> The upcoming version of zstd-jni (1.4.5-7) will have a new constructor for ZstdInputStream that allows the client to pass its own buffer.  A fix similar to [PR #2967|https://github.com/apache/kafka/pull/2967] could be used to have the ZstdConstructor use a BufferSupplier to re-use the decompression buffer.
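
The buffer re-use idea in the description can be illustrated with a minimal sketch. The pool class below is hypothetical and single-threaded; it is not Kafka's BufferSupplier or any zstd-jni type, and it only shows how handing the same buffer back for each batch avoids a fresh 128 KB allocation per batch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

public class ReusableBufferSketch {
    /** Hypothetical, non-thread-safe pool that recycles released buffers. */
    static final class SimpleBufferPool {
        private final Deque<ByteBuffer> free = new ArrayDeque<>();
        int allocations = 0; // counts how often a new buffer had to be created

        ByteBuffer get(int capacity) {
            ByteBuffer cached = free.pollFirst();
            if (cached != null && cached.capacity() >= capacity) {
                cached.clear(); // reset position and limit before re-use
                return cached;
            }
            // No cached buffer, or the cached one was too small and is dropped.
            allocations++;
            return ByteBuffer.allocate(capacity);
        }

        void release(ByteBuffer buffer) {
            free.addFirst(buffer);
        }
    }

    // Simulates decompressing `batches` batches, borrowing one buffer per batch.
    static int allocationsFor(int batches, int bufferSize) {
        SimpleBufferPool pool = new SimpleBufferPool();
        for (int i = 0; i < batches; i++) {
            ByteBuffer buf = pool.get(bufferSize);
            // ... a real decompressor would write one batch into buf here ...
            pool.release(buf);
        }
        return pool.allocations;
    }

    public static void main(String[] args) {
        // 128 KB matches the buffer size mentioned in the description.
        System.out.println(allocationsFor(1000, 128 * 1024));
    }
}
```

In this sketch the allocation count stays at one regardless of the number of batches, which is the effect the proposed fix aims for; without a pool, each batch would allocate its own buffer and leave it for the garbage collector.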


