Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2011/08/18 11:11:27 UTC
[jira] [Created] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
CompressionUtils introduces a GZIP header while compressing empty message sets
------------------------------------------------------------------------------
Key: KAFKA-109
URL: https://issues.apache.org/jira/browse/KAFKA-109
Project: Kafka
Issue Type: Bug
Affects Versions: 0.7
Reporter: Neha Narkhede
The CompressionUtils helper class takes a sequence of messages and compresses them using the appropriate codec. However, even when it receives an empty sequence, it still adds a GZIP compression header to the output, effectively "adding" data to the resulting ByteBuffer. This does not match the behavior for uncompressed empty message sets, which produce an empty buffer. CompressionUtils should be fixed to remove this side effect.
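The effect described above can be reproduced with the JDK alone. The following is a minimal sketch, not the CompressionUtils code itself; the class name GzipEmptyDemo and its gzip helper are hypothetical names introduced only for illustration. It shows that running zero bytes through java.util.zip.GZIPOutputStream still yields a non-empty result that begins with the GZIP magic bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; it is not part of Kafka.
final class GzipEmptyDemo {

    // Compresses the given bytes with GZIP and returns the raw compressed bytes.
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The GZIP header is written when the stream is constructed and the
        // trailer when it is closed, regardless of how much data is written.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // True if the bytes start with the GZIP magic number 0x1f 0x8b.
    static boolean startsWithGzipMagic(byte[] bytes) {
        return bytes.length >= 2
                && (bytes[0] & 0xff) == 0x1f
                && (bytes[1] & 0xff) == 0x8b;
    }
}
```

Calling GzipEmptyDemo.gzip(new byte[0]) returns an array with a length greater than zero, which is the extra data this issue refers to.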
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-109:
--------------------------
Resolution: Fixed
Fix Version/s: 0.7
Status: Resolved (was: Patch Available)
[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede updated KAFKA-109:
--------------------------------
Attachment: KAFKA-109.patch
[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede updated KAFKA-109:
--------------------------------
Status: Patch Available (was: Open)
This patch changes how ByteBufferMessageSet handles compression of an empty list of messages. In that case, ByteBufferMessageSet now creates an empty byte buffer instead of attaching a GZIP header to it. There are two reasons to do this:
1. To keep the behavior of an empty uncompressed message set and an empty compressed message set consistent
2. To avoid attaching extraneous header information to nonexistent data, which would otherwise occupy space on disk
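The guard described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the contents of KAFKA-109.patch: the class EmptyAwareCompressor and its compress method are hypothetical, and messages are modeled as plain byte arrays that are simply concatenated, without Kafka's real message framing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; names and message representation are assumptions.
final class EmptyAwareCompressor {

    // Returns an empty buffer for an empty list; otherwise GZIP-compresses
    // the concatenated payloads.
    static ByteBuffer compress(List<byte[]> payloads) {
        if (payloads.isEmpty()) {
            // Short-circuit before any GZIP stream is opened, so that no
            // header or trailer bytes are ever produced.
            return ByteBuffer.allocate(0);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            for (byte[] payload : payloads) {
                gzip.write(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(out.toByteArray());
    }
}
```

With this check in place, an empty input yields a buffer with zero remaining bytes, matching the uncompressed case, while a non-empty input is still compressed as before.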
[jira] [Commented] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087116#comment-13087116 ]
Jun Rao commented on KAFKA-109:
-------------------------------
Actually, this doesn't cover javaapi.ByteBufferMessageSet. It seems that javaapi.ByteBufferMessageSet duplicates some of the constructor code in ByteBufferMessageSet. We should avoid doing that.
[jira] [Commented] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087320#comment-13087320 ]
Jun Rao commented on KAFKA-109:
-------------------------------
+1
[jira] [Commented] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087079#comment-13087079 ]
Jun Rao commented on KAFKA-109:
-------------------------------
+1. The patch looks good.
[jira] [Updated] (KAFKA-109) CompressionUtils introduces a GZIP
header while compressing empty message sets
Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede updated KAFKA-109:
--------------------------------
Attachment: KAFKA-109.patch
This is a revised patch that refactors the constructors of both the Java and the Scala ByteBufferMessageSet into a common API in MessageSet. This ensures that the bug fix applies to both the Java API and the Scala API.
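The shape of that refactoring might look roughly like the sketch below. All names here (SharedFactorySketch, createByteBuffer, ScalaStyleMessageSet, JavaApiMessageSet) are hypothetical stand-ins rather than the actual Kafka classes or the revised patch, and compression is left out to keep the example short. The point is only that both wrappers delegate to one factory, so the empty-input check lives in a single place.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of two API wrappers sharing one buffer factory.
final class SharedFactorySketch {

    // Single place that turns messages into a buffer; stands in for the
    // common API that the comment says was moved into MessageSet.
    public static ByteBuffer createByteBuffer(List<byte[]> messages) {
        if (messages.isEmpty()) {
            return ByteBuffer.allocate(0); // the fix lives here, once
        }
        int size = 0;
        for (byte[] message : messages) {
            size += message.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        for (byte[] message : messages) {
            buffer.put(message);
        }
        buffer.flip(); // make the written bytes readable
        return buffer;
    }

    // Stand-in for the Scala-facing message set.
    public static final class ScalaStyleMessageSet {
        public final ByteBuffer buffer;

        public ScalaStyleMessageSet(List<byte[]> messages) {
            this.buffer = createByteBuffer(messages);
        }
    }

    // Stand-in for the Java-facing message set; it delegates to the same
    // factory instead of duplicating the constructor logic.
    public static final class JavaApiMessageSet {
        public final ByteBuffer buffer;

        public JavaApiMessageSet(byte[]... messages) {
            this.buffer = createByteBuffer(Arrays.asList(messages));
        }
    }
}
```

Because neither wrapper builds its own buffer, a later change to the empty-input handling cannot reach one API and miss the other, which is the duplication concern raised earlier in the thread.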