You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2015/02/23 17:21:12 UTC

[jira] [Updated] (KAFKA-527) Compression support does numerous byte copies

     [ https://issues.apache.org/jira/browse/KAFKA-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-527:
--------------------------------
    Component/s: compression

> Compression support does numerous byte copies
> ---------------------------------------------
>
>                 Key: KAFKA-527
>                 URL: https://issues.apache.org/jira/browse/KAFKA-527
>             Project: Kafka
>          Issue Type: Bug
>          Components: compression
>            Reporter: Jay Kreps
>         Attachments: KAFKA-527.message-copy.history, java.hprof.no-compression.txt, java.hprof.snappy.text
>
>
> The data path for compressing or decompressing messages is extremely inefficient. We do something like 7 (?) complete copies of the data, often for simple things like adding a 4 byte size to the front. I am not sure how this went by unnoticed.
> This is likely the root cause of the performance issues we saw in doing bulk recompression of data in mirror maker.
> The mismatch between the InputStream and OutputStream interfaces and the Message/MessageSet interfaces which are based on byte buffers is the cause of many of these.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)