Posted to dev@kafka.apache.org by "Jun Rao (Created) (JIRA)" <ji...@apache.org> on 2012/03/24 00:11:32 UTC
[jira] [Created] (KAFKA-315) enable shallow iterator in
ByteBufferMessageSet to allow mirroring data without decompression
enable shallow iterator in ByteBufferMessageSet to allow mirroring data without decompression
---------------------------------------------------------------------------------------------
Key: KAFKA-315
URL: https://issues.apache.org/jira/browse/KAFKA-315
Project: Kafka
Issue Type: Improvement
Components: core
Reporter: Jun Rao
Assignee: Jun Rao
Fix For: 0.7.1
Attachments: kafka-315.patch
Currently, the iterator of ByteBufferMessageSet does deep iteration, i.e., if messages are compressed, they are decompressed during the iteration. This adds CPU overhead. For mirroring data between two Kafka clusters, we can use a shallow iterator to avoid the decompression overhead.
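To illustrate the difference between the two iteration modes, here is a minimal sketch in Python. It is not the Kafka implementation: the Message class, the 4-byte length framing, and the helper names (pack, unpack, deep_iter, shallow_iter) are all hypothetical, and zlib stands in for whatever codec the producer used.

```python
import zlib
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Message:
    """Toy message (hypothetical, not Kafka's Message class)."""
    payload: bytes
    compressed: bool = False  # True when payload wraps a compressed batch


def pack(messages: List[Message]) -> Message:
    """Compress a batch of plain messages into one wrapper message."""
    # Toy framing: a 4-byte big-endian length precedes each payload.
    raw = b"".join(len(m.payload).to_bytes(4, "big") + m.payload for m in messages)
    return Message(zlib.compress(raw), compressed=True)


def unpack(wrapper: Message) -> Iterator[Message]:
    """Decompress a wrapper message and yield the messages inside it."""
    raw = zlib.decompress(wrapper.payload)
    pos = 0
    while pos < len(raw):
        size = int.from_bytes(raw[pos:pos + 4], "big")
        pos += 4
        yield Message(raw[pos:pos + size])
        pos += size


def deep_iter(message_set: Iterable[Message]) -> Iterator[Message]:
    """Deep iteration: pay the decompression cost and yield inner messages."""
    for m in message_set:
        if m.compressed:
            yield from unpack(m)
        else:
            yield m


def shallow_iter(message_set: Iterable[Message]) -> Iterator[Message]:
    """Shallow iteration: yield top-level messages as-is, still compressed."""
    yield from message_set


message_set = [pack([Message(b"a"), Message(b"b")]), Message(b"c")]
deep = [m.payload for m in deep_iter(message_set)]
shallow = list(shallow_iter(message_set))
```

In this sketch a mirroring process would forward the items from shallow_iter unchanged, so the compressed wrapper is never opened on the way from one cluster to the other.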
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-315) enable shallow iterator in
ByteBufferMessageSet to allow mirroring data without decompression
Posted by "Joel Koshy (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238579#comment-13238579 ]
Joel Koshy commented on KAFKA-315:
----------------------------------
Patch looks good. It would be great if you could give some idea of the performance overhead of decompression vs. compression, based on the analysis that you did.
Also, we can add this config under "Important configuration parameters for a mirror" in the mirroring wiki.
One question: ConsumerIterator still decodes the event, which would generally only make sense if you are using the DefaultDecoder, right? So maybe we can add a condition there to return item.message directly if enableShallowIterator is true, and also require that the decoder is an instance of DefaultDecoder?
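The condition suggested in this comment could look roughly like the sketch below. It is written in Python for brevity and is not the Kafka code: the classes are stand-ins for the decoders named above, and make_event_reader and its enable_shallow_iterator parameter are hypothetical names chosen for illustration.

```python
class DefaultDecoder:
    """Stand-in for an identity decoder: returns the message unchanged."""

    def to_event(self, message: bytes) -> bytes:
        return message


class StringDecoder:
    """Stand-in for a decoder that interprets the payload as UTF-8 text."""

    def to_event(self, message: bytes) -> str:
        return message.decode("utf-8")


def make_event_reader(decoder, enable_shallow_iterator: bool):
    """Return the function a consumer iterator would apply to each message.

    With shallow iteration the payload may still be compressed, so decoding
    it is meaningless; the raw message is returned and any decoder other
    than DefaultDecoder is rejected up front.
    """
    if enable_shallow_iterator:
        if not isinstance(decoder, DefaultDecoder):
            raise ValueError("shallow iteration requires DefaultDecoder")
        return lambda message: message
    return decoder.to_event
```

Rejecting the mismatch when the reader is built is a design choice of this sketch; it reports the problem before any message is consumed rather than on the first failed decode.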
[jira] [Updated] (KAFKA-315) enable shallow iterator in
ByteBufferMessageSet to allow mirroring data without decompression
Posted by "Jun Rao (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-315:
--------------------------
Attachment: kafka-315.patch
[jira] [Updated] (KAFKA-315) enable shallow iterator in
ByteBufferMessageSet to allow mirroring data without decompression
Posted by "Jun Rao (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-315:
--------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Thanks for the review. Committed to trunk.
For performance, with a socket buffer and a fetch size of 2MB, this patch improved the cross-DC mirroring throughput between two brokers from about 9MB/sec to 30MB/sec.
As for the decoder, shallowIterator only works with the DefaultDecoder. If the wrong decoder is used, the user will get an exception and realize the decoder problem. So, this is probably not a big concern.
[jira] [Updated] (KAFKA-315) enable shallow iterator in
ByteBufferMessageSet to allow mirroring data without decompression
Posted by "Jun Rao (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-315:
--------------------------
Status: Patch Available (was: Open)
Patch attached.