Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2011/08/18 11:11:27 UTC

[jira] [Created] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

CompressionUtils introduces a GZIP header while compressing empty message sets
------------------------------------------------------------------------------

                 Key: KAFKA-109
                 URL: https://issues.apache.org/jira/browse/KAFKA-109
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.7
            Reporter: Neha Narkhede


The CompressionUtils helper class takes a sequence of messages and compresses them using the appropriate codec. But even if it receives an empty sequence, it still adds a GZIP compression header to the data, effectively "adding" data to the resulting ByteBuffer. This does not match the behavior for uncompressed empty message sets, which produce an empty buffer. CompressionUtils should be fixed by removing this side effect.
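
The effect described above can be reproduced outside Kafka with the JDK's own GZIP stream. The sketch below is only an illustration: the class name EmptyGzipDemo and its gzip helper are hypothetical and are not part of CompressionUtils, whose internals are not shown in this thread. It assumes the GZIP codec ultimately writes through java.util.zip.GZIPOutputStream or something equivalent.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not Kafka code.
final class EmptyGzipDemo {

    // Compresses the given bytes with GZIP and returns the encoded output.
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] compressed = gzip(new byte[0]);
        // The length is non-zero even though no payload was written,
        // because the GZIP framing (header and trailer) is always emitted.
        System.out.println("compressed length = " + compressed.length);
    }
}
```

Since the GZIP format defines a 10-byte header and an 8-byte trailer, the output for an empty input is at least 18 bytes, whereas an uncompressed empty message set occupies none.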

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-109:
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.7
           Status: Resolved  (was: Patch Available)


[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-109:
--------------------------------

    Attachment: KAFKA-109.patch


[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-109:
--------------------------------

    Status: Patch Available  (was: Open)

This patch fixes how ByteBufferMessageSet handles compression of an empty list of messages. In that case, ByteBufferMessageSet now creates an empty byte buffer instead of attaching a GZIP header to it. There are two reasons to do this:

1. To maintain consistent behavior between an empty uncompressed message set and an empty compressed message set
2. To avoid attaching extraneous header information to nonexistent data, which would occupy space on disk for no payload
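
The approach described above can be sketched as a guard in front of the compression step. This is not the attached patch: EmptySetGuardSketch and its compress method are hypothetical names, messages are modelled as raw byte arrays, and Kafka's actual message framing is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; not the code from KAFKA-109.patch.
final class EmptySetGuardSketch {

    // Builds the buffer for a GZIP-compressed set of messages.
    static ByteBuffer compress(List<byte[]> messages) {
        // Guard: an empty input yields an empty buffer, matching the
        // uncompressed case, instead of a buffer holding only GZIP framing.
        if (messages.isEmpty()) {
            return ByteBuffer.allocate(0);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            for (byte[] message : messages) {
                gz.write(message);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(out.toByteArray());
    }
}
```

With this guard, compress(Collections.emptyList()) returns a buffer with zero remaining bytes, so nothing is written to disk for an empty set.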


[jira] [Commented] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087116#comment-13087116 ] 

Jun Rao commented on KAFKA-109:
-------------------------------

Actually, this doesn't cover javaapi.ByteBufferMessageSet. It seems that javaapi.ByteBufferMessageSet duplicates some of the constructor code in ByteBufferMessageSet. We should avoid doing that.


[jira] [Commented] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087320#comment-13087320 ] 

Jun Rao commented on KAFKA-109:
-------------------------------

+1


[jira] [Commented] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087079#comment-13087079 ] 

Jun Rao commented on KAFKA-109:
-------------------------------

+1. The patch looks good.


[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP header while compressing empty message sets

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-109:
--------------------------------

    Attachment: KAFKA-109.patch

This is a revised patch that refactors the constructors of both the Java and Scala ByteBufferMessageSet into a common API in MessageSet. This ensures that the bug fix applies to both the Java API and the Scala API.
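
The shape of that refactoring can be illustrated roughly as follows. All names here (SharedBufferFactory, CoreMessageSetSketch, JavaApiMessageSetSketch) are hypothetical stand-ins rather than the classes in the patch, and the encoding step is reduced to copying raw bytes so that the delegation is the only thing on display.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical shared helper, standing in for the common API in MessageSet.
final class SharedBufferFactory {
    static ByteBuffer create(List<byte[]> messages) {
        // The empty-set rule lives in exactly one place.
        if (messages.isEmpty()) {
            return ByteBuffer.allocate(0);
        }
        int size = 0;
        for (byte[] message : messages) {
            size += message.length;
        }
        // Placeholder encoding: real framing and compression are omitted.
        ByteBuffer buffer = ByteBuffer.allocate(size);
        for (byte[] message : messages) {
            buffer.put(message);
        }
        buffer.flip(); // make the written bytes readable from position 0
        return buffer;
    }
}

// Stand-in for the core (Scala) message set.
final class CoreMessageSetSketch {
    final ByteBuffer buffer;

    CoreMessageSetSketch(List<byte[]> messages) {
        this.buffer = SharedBufferFactory.create(messages);
    }
}

// Stand-in for the javaapi wrapper; it delegates instead of duplicating logic.
final class JavaApiMessageSetSketch {
    final ByteBuffer buffer;

    JavaApiMessageSetSketch(byte[]... messages) {
        this.buffer = SharedBufferFactory.create(Arrays.asList(messages));
    }
}
```

Because both constructors go through the same helper, a change to the empty-set handling cannot reach one API and miss the other.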
