Posted to jira@kafka.apache.org by "Robert Wagner (Jira)" <ji...@apache.org> on 2020/10/20 14:02:00 UTC

[jira] [Commented] (KAFKA-10470) zstd decompression with small batches is slow and causes excessive GC

    [ https://issues.apache.org/jira/browse/KAFKA-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217623#comment-17217623 ] 

Robert Wagner commented on KAFKA-10470:
---------------------------------------


[zstd-jni version 1.4.5-7|https://github.com/luben/zstd-jni/releases/tag/v1.4.5-7] has been released; it internally implements a BufferPool to re-use the decompression buffer without changing its API.  This fixes the GC-pressure issue but doesn't address [~yuzawa-san]'s concern about JNI boundary crossings.
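
To exercise the pooled path explicitly rather than relying on the library's default, something along the following lines should work. Treat it as a sketch: RecyclingBufferPool and the ZstdInputStream(InputStream, BufferPool) overload are my reading of the 1.4.5-7 release, not code lifted from Kafka.

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

import com.github.luben.zstd.RecyclingBufferPool;
import com.github.luben.zstd.ZstdInputStream;
import com.github.luben.zstd.ZstdOutputStream;

public class ZstdPoolingCheck {
    public static void main(String[] args) throws Exception {
        // Compress a payload roughly the size of a small record batch.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (ZstdOutputStream out = new ZstdOutputStream(compressed)) {
            out.write("small batch payload".getBytes(StandardCharsets.UTF_8));
        }

        // Decompress with the shared RecyclingBufferPool so the internal
        // decompression buffer is re-used across streams instead of a fresh
        // ~128 KB allocation per batch.
        ByteArrayOutputStream decompressed = new ByteArrayOutputStream();
        try (ZstdInputStream in = new ZstdInputStream(
                new ByteArrayInputStream(compressed.toByteArray()),
                RecyclingBufferPool.INSTANCE)) {
            byte[] buf = new byte[256];
            for (int n = in.read(buf); n != -1; n = in.read(buf)) {
                decompressed.write(buf, 0, n);
            }
        }
        System.out.println(decompressed.toString(StandardCharsets.UTF_8.name()));
    }
}
{code}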

My initial test with this updated dependency shows memory allocations and GC pressure comparable to gzip, but overall throughput with small message batches still seems to lag behind all other codecs.


> zstd decompression with small batches is slow and causes excessive GC
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-10470
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10470
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Robert Wagner
>            Priority: Major
>
> Similar to KAFKA-5150 but for zstd instead of LZ4, it appears that a large decompression buffer (128 KB) created by zstd-jni per batch is causing a significant performance bottleneck.
> The upcoming version of zstd-jni (1.4.5-7) will have a new constructor for ZstdInputStream that allows the client to pass its own buffer.  A fix similar to [PR #2967|https://github.com/apache/kafka/pull/2967] could be used to have the ZstdConstructor use a BufferSupplier to re-use the decompression buffer.
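
For what it's worth, with the BufferPool hook now exposed, the BufferSupplier-based approach proposed above would presumably reduce to a small adapter.  Rough sketch only: the BufferPool get/release signatures are my reading of zstd-jni 1.4.5-7, and BufferSupplierPool / wrapForInput are hypothetical names, not the eventual Kafka change.

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

import com.github.luben.zstd.BufferPool;
import com.github.luben.zstd.ZstdInputStream;
import org.apache.kafka.common.utils.BufferSupplier;

/**
 * Hypothetical adapter: lets zstd-jni draw its decompression buffer from
 * Kafka's per-thread BufferSupplier instead of allocating a fresh buffer
 * for every batch.
 */
public class BufferSupplierPool implements BufferPool {
    private final BufferSupplier supplier;

    public BufferSupplierPool(BufferSupplier supplier) {
        this.supplier = supplier;
    }

    @Override
    public ByteBuffer get(int capacity) {
        // BufferSupplier hands back a cached buffer of at least this capacity when available.
        return supplier.get(capacity);
    }

    @Override
    public void release(ByteBuffer buffer) {
        // Return the buffer so the next batch decompressed on this thread can re-use it.
        supplier.release(buffer);
    }

    // Example wiring for the consumer-side decompression path.
    public static InputStream wrapForInput(InputStream compressed, BufferSupplier supplier) throws IOException {
        return new ZstdInputStream(compressed, new BufferSupplierPool(supplier));
    }
}
{code}

A caller would create one BufferSupplier per fetch thread (e.g. BufferSupplier.create()) and pass it through, similar to what [PR #2967|https://github.com/apache/kafka/pull/2967] did for LZ4.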



--
This message was sent by Atlassian Jira
(v8.3.4#803005)