Posted to dev@kafka.apache.org by "Neha Narkhede (Created) (JIRA)" <ji...@apache.org> on 2012/03/19 22:33:39 UTC

[jira] [Created] (KAFKA-310) Incomplete message set validation checks in kafka.log.Log's append API can corrupt on disk log

Incomplete message set validation checks in kafka.log.Log's append API can corrupt on disk log
----------------------------------------------------------------------------------------------

                 Key: KAFKA-310
                 URL: https://issues.apache.org/jira/browse/KAFKA-310
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.7
            Reporter: Neha Narkhede
            Priority: Critical


When the ByteBufferMessageSet's iterator finds trailing bytes that cannot be deserialized into a Kafka message, it silently ignores them and returns false. The append API in Log iterates through a ByteBufferMessageSet and validates the checksum of each message. However, when appending data to the log, it writes the entire underlying ByteBuffer that backs the ByteBufferMessageSet. So if, due to some bug, the ByteBuffer has trailing data, that data gets appended to the on-disk log too. This can corrupt the log.
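
The gap between what is validated and what is written can be illustrated with a minimal sketch. It does not use Kafka's classes or its real wire format: the framing ([4-byte size][4-byte CRC32][payload]) and the names TrailingBytesDemo, putEntry and validBytes are assumptions made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class TrailingBytesDemo {
    // Hypothetical framing, not Kafka's: [int size][int crc][payload],
    // where size counts the crc and the payload.
    static void putEntry(ByteBuffer buf, byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        buf.putInt(4 + payload.length);
        buf.putInt((int) crc.getValue());
        buf.put(payload);
    }

    // Walks complete entries, checks each checksum, and returns the number
    // of bytes they cover. Like the iterator described above, it stops
    // silently when the remaining bytes do not form a complete entry.
    static int validBytes(ByteBuffer source) {
        ByteBuffer buf = source.duplicate(); // leave the caller's position alone
        int valid = 0;
        while (buf.remaining() >= 4) {
            int size = buf.getInt();
            if (size < 4 || size > buf.remaining()) {
                break; // incomplete or nonsensical trailing data
            }
            int storedCrc = buf.getInt();
            byte[] payload = new byte[size - 4];
            buf.get(payload);
            CRC32 crc = new CRC32();
            crc.update(payload);
            if ((int) crc.getValue() != storedCrc) {
                throw new IllegalStateException("checksum mismatch");
            }
            valid += 4 + size;
        }
        return valid;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        putEntry(buf, "a".getBytes(StandardCharsets.UTF_8));
        putEntry(buf, "bc".getBytes(StandardCharsets.UTF_8));
        buf.put(new byte[] {0, 0, 0}); // stray trailing bytes
        buf.flip();
        System.out.println("validated bytes: " + validBytes(buf));     // 19
        System.out.println("bytes in the buffer: " + buf.remaining()); // 22
    }
}
```

Every checksum passes, yet writing the whole buffer would put three unvalidated bytes on disk after the last complete entry.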

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-310) Incomplete message set validation checks in kafka.log.Log's append API can corrupt on disk log

Posted by "Jun Rao (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235675#comment-13235675 ] 

Jun Rao commented on KAFKA-310:
-------------------------------

+1 on v2.
                


[jira] [Updated] (KAFKA-310) Incomplete message set validation checks in kafka.log.Log's append API can corrupt on disk log

Posted by "Neha Narkhede (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-310:
--------------------------------

    Attachment: kafka-310-v2.patch

That's true. This was probably overlooked in KAFKA-277. Fixed it and uploaded v2.
                


[jira] [Updated] (KAFKA-310) Incomplete message set validation checks in kafka.log.Log's append API can corrupt on disk log

Posted by "Neha Narkhede (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-310:
--------------------------------

    Attachment: kafka-310.patch

In this patch, Log's append API truncates the ByteBufferMessageSet to validBytes before appending its backing byte buffer to the FileChannel.
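
The idea of truncating to validBytes before the write can be sketched with plain java.nio. This is not the patch itself: the class TruncatingAppendSketch and the method appendValid are hypothetical names, and the caller is assumed to have computed validBytes already.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TruncatingAppendSketch {
    // Writes only the first validBytes bytes of source to the channel and
    // returns the number of bytes written. Anything past validBytes stays
    // out of the file.
    static int appendValid(FileChannel channel, ByteBuffer source, int validBytes)
            throws IOException {
        if (validBytes < 0 || validBytes > source.remaining()) {
            throw new IllegalArgumentException("validBytes out of range: " + validBytes);
        }
        ByteBuffer view = source.duplicate();     // shares content, own position/limit
        view.limit(view.position() + validBytes); // cut off the trailing bytes
        int written = 0;
        while (view.hasRemaining()) {
            written += channel.write(view);       // write() may be partial, so loop
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("log-segment", ".bin");
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5, 6, 7});
            int written = appendValid(channel, buf, 5); // pretend 5 bytes validated
            System.out.println("wrote " + written + " of " + buf.remaining() + " bytes");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Working on a duplicate keeps the caller's buffer position and limit unchanged, so the same buffer can still be inspected after the write.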
                


[jira] [Commented] (KAFKA-310) Incomplete message set validation checks in kafka.log.Log's append API can corrupt on disk log

Posted by "Jun Rao (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235636#comment-13235636 ] 

Jun Rao commented on KAFKA-310:
-------------------------------

ByteBufferMessageSet.validBytes currently makes a deep iteration of all messages, which means that we need to decompress messages. To avoid this overhead, we should change ByteBufferMessageSet.validBytes to use a shallow iterator.
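
The cost difference between the two kinds of iteration can be sketched as follows. This is an illustration under assumptions, not Kafka's code: the framing ([4-byte size][1-byte compressed flag][payload]), the use of gzip, and the names ShallowVsDeepSketch, putEntry, validBytes and the inflations counter are all invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class ShallowVsDeepSketch {
    static int inflations = 0; // how many payloads were decompressed

    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    static byte[] gunzip(byte[] packed) throws IOException {
        inflations++;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[256];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Hypothetical framing: [int size][byte compressedFlag][payload],
    // where size counts the flag and the payload.
    static void putEntry(ByteBuffer buf, boolean compressed, byte[] payload) {
        buf.putInt(1 + payload.length);
        buf.put((byte) (compressed ? 1 : 0));
        buf.put(payload);
    }

    // Counts the bytes covered by complete top-level entries. With deep=true
    // it also inflates compressed payloads, as a deep iterator must in order
    // to reach the inner messages; with deep=false it only reads size headers.
    static int validBytes(ByteBuffer source, boolean deep) throws IOException {
        ByteBuffer buf = source.duplicate();
        int valid = 0;
        while (buf.remaining() >= 4) {
            int size = buf.getInt();
            if (size < 1 || size > buf.remaining()) {
                break; // trailing bytes that do not form a complete entry
            }
            boolean compressed = buf.get() == 1;
            if (deep && compressed) {
                byte[] payload = new byte[size - 1];
                buf.get(payload);
                gunzip(payload);
            } else {
                buf.position(buf.position() + size - 1); // skip the payload
            }
            valid += 4 + size;
        }
        return valid;
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(256);
        putEntry(buf, true, gzip("inner messages".getBytes(StandardCharsets.UTF_8)));
        putEntry(buf, false, "plain".getBytes(StandardCharsets.UTF_8));
        buf.put(new byte[] {9, 9}); // stray trailing bytes
        buf.flip();
        int shallow = validBytes(buf, false);
        int afterShallow = inflations;
        int deep = validBytes(buf, true);
        System.out.println("shallow=" + shallow + " inflations=" + afterShallow);
        System.out.println("deep=" + deep + " inflations=" + inflations);
    }
}
```

Both walks report the same byte count, because the count depends only on the top-level size headers. Only the deep walk pays for decompression.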
                


[jira] [Resolved] (KAFKA-310) Incomplete message set validation checks in kafka.log.Log's append API can corrupt on disk log

Posted by "Neha Narkhede (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede resolved KAFKA-310.
---------------------------------

    Resolution: Fixed
      Assignee: Neha Narkhede

Thanks for the review. Committed this to trunk.
                
