Posted to dev@kafka.apache.org by "Jun Rao (Created) (JIRA)" <ji...@apache.org> on 2012/03/24 00:11:32 UTC

[jira] [Created] (KAFKA-315) enable shallow iterator in ByteBufferMessageSet to allow mirroring data without decompression

enable shallow iterator in ByteBufferMessageSet to allow mirroring data without decompression
---------------------------------------------------------------------------------------------

                 Key: KAFKA-315
                 URL: https://issues.apache.org/jira/browse/KAFKA-315
             Project: Kafka
          Issue Type: Improvement
          Components: core
            Reporter: Jun Rao
            Assignee: Jun Rao
             Fix For: 0.7.1
         Attachments: kafka-315.patch

Currently, the iterator of ByteBufferMessageSet does deep iteration, i.e., if messages are compressed, they are decompressed during the iteration. This adds CPU overhead. For mirroring data between two Kafka clusters, we can use a shallow iterator that returns the compressed messages as they are, avoiding the decompression overhead.
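
To make the distinction concrete, here is a minimal, self-contained Java sketch of deep versus shallow iteration. It is not taken from kafka-315.patch and does not use Kafka's classes or wire format: the Message class, the GZIP batch layout, and the shallow/deep method names are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ShallowIterationSketch {

    /** Hypothetical toy message: a payload plus a flag marking it as a compressed batch. */
    static final class Message {
        final boolean compressed;
        final byte[] payload;

        Message(boolean compressed, byte[] payload) {
            this.compressed = compressed;
            this.payload = payload;
        }
    }

    /** Packs several payloads into one GZIP-compressed wrapper message (assumed layout). */
    static Message compress(List<byte[]> payloads) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(bytes))) {
            out.writeInt(payloads.size());
            for (byte[] p : payloads) {
                out.writeInt(p.length);
                out.write(p);
            }
        }
        return new Message(true, bytes.toByteArray());
    }

    /** Shallow iteration: hand back the top-level messages untouched, no decompression. */
    static List<Message> shallow(List<Message> set) {
        return new ArrayList<>(set);
    }

    /** Deep iteration: expand every compressed wrapper into its inner messages. */
    static List<Message> deep(List<Message> set) throws IOException {
        List<Message> result = new ArrayList<>();
        for (Message m : set) {
            if (!m.compressed) {
                result.add(m);
                continue;
            }
            try (DataInputStream in = new DataInputStream(
                    new GZIPInputStream(new ByteArrayInputStream(m.payload)))) {
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    byte[] p = new byte[in.readInt()];
                    in.readFully(p);
                    result.add(new Message(false, p));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> inner = new ArrayList<>();
        for (String s : new String[] {"a", "b", "c"}) {
            inner.add(s.getBytes(StandardCharsets.UTF_8));
        }
        List<Message> set = new ArrayList<>();
        set.add(compress(inner));                                              // one wrapper holding 3 messages
        set.add(new Message(false, "d".getBytes(StandardCharsets.UTF_8)));     // one plain message

        System.out.println("shallow: " + shallow(set).size() + ", deep: " + deep(set).size());
    }
}
```

In this sketch a mirror would forward the output of shallow() as is, so a compressed wrapper crosses clusters without ever being inflated; only deep() pays the decompression cost.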

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-315) enable shallow iterator in ByteBufferMessageSet to allow mirroring data without decompression

Posted by "Joel Koshy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238579#comment-13238579 ] 

Joel Koshy commented on KAFKA-315:
----------------------------------

Patch looks good - would be great if you can provide some idea on the performance overhead of decompression vs. compression from the analysis that you did.

Also, we can add this config under "Important configuration parameters for a mirror" in the mirroring wiki.

One question: ConsumerIterator still decodes the event, which would generally only make sense if you are using the DefaultDecoder, right? So maybe we can add a condition there to just return item.message if enableShallowIterator is true, and also require that the decoder is an instance of DefaultDecoder?
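
A rough sketch of the guard suggested above, written as standalone Java rather than against the real ConsumerIterator. The Decoder interface, the two decoder classes, and the next method here are simplified stand-ins invented for illustration, not Kafka's actual signatures.

```java
import java.nio.charset.StandardCharsets;

public class ShallowGuardSketch {

    /** Stand-in for a decoder that turns a raw message into an application event. */
    interface Decoder<T> {
        T toEvent(byte[] message);
    }

    /** Stand-in for a pass-through decoder: the event is the raw message itself. */
    static final class DefaultDecoder implements Decoder<byte[]> {
        public byte[] toEvent(byte[] message) {
            return message;
        }
    }

    /** Stand-in for any decoder that interprets the payload. */
    static final class StringDecoder implements Decoder<String> {
        public String toEvent(byte[] message) {
            return new String(message, StandardCharsets.UTF_8);
        }
    }

    /**
     * With shallow iteration the message may still be a compressed wrapper, so
     * decoding it is meaningless: insist on the pass-through decoder and return
     * the raw message. Otherwise decode as usual.
     */
    static Object next(byte[] message, Decoder<?> decoder, boolean enableShallowIterator) {
        if (enableShallowIterator) {
            if (!(decoder instanceof DefaultDecoder)) {
                throw new IllegalArgumentException(
                        "shallow iteration requires the pass-through decoder");
            }
            return message;
        }
        return decoder.toEvent(message);
    }

    public static void main(String[] args) {
        byte[] raw = "payload".getBytes(StandardCharsets.UTF_8);
        Object shallowResult = next(raw, new DefaultDecoder(), true);
        System.out.println(shallowResult == raw);
        System.out.println(next(raw, new StringDecoder(), false));
    }
}
```

The point of failing fast in the guard is that a misconfigured decoder is reported up front instead of surfacing later as a decoding error on compressed bytes.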

                

        

[jira] [Updated] (KAFKA-315) enable shallow iterator in ByteBufferMessageSet to allow mirroring data without decompression

Posted by "Jun Rao (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-315:
--------------------------

    Attachment: kafka-315.patch
    

        

[jira] [Updated] (KAFKA-315) enable shallow iterator in ByteBufferMessageSet to allow mirroring data without decompression

Posted by "Jun Rao (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-315:
--------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Thanks for the review. Committed to trunk. 

For performance, with a socket buffer and a fetch size of 2MB, this patch improved the cross-DC mirroring throughput between two brokers from about 9MB/sec to 30MB/sec.

As for the decoder, shallowIterator only works with the DefaultDecoder. If a wrong decoder is used, the user will get an exception and realize the decoder problem. So, this is probably not a big concern.
                

        

[jira] [Updated] (KAFKA-315) enable shallow iterator in ByteBufferMessageSet to allow mirroring data without decompression

Posted by "Jun Rao (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-315:
--------------------------

    Status: Patch Available  (was: Open)

Patch attached.
                