Posted to dev@avro.apache.org by "Thiruvalluvan M. G. (JIRA)" <ji...@apache.org> on 2012/05/21 07:39:40 UTC

[jira] [Created] (AVRO-1097) BinaryDecoder does not detect EOF sometimes

Thiruvalluvan M. G. created AVRO-1097:
-----------------------------------------

             Summary: BinaryDecoder does not detect EOF sometimes
                 Key: AVRO-1097
                 URL: https://issues.apache.org/jira/browse/AVRO-1097
             Project: Avro
          Issue Type: Bug
          Components: java
            Reporter: Thiruvalluvan M. G.
            Assignee: Thiruvalluvan M. G.
             Fix For: 1.7.0


This is the first problem reported in AVRO-1058.

The trouble is that, at end of stream, ensureBounds() does not actually ensure that the requisite number of real bytes is available in the buffer. It merely ensures that there won't be an array index overflow. readInt() and readLong() check for overflow only at the very end; before that, they read and interpret whatever bytes happen to be in the buffer. If those bytes do not really belong to the stream (because of EOF), they need not be a valid zigzag encoding. That is why we get the "Invalid int encoding" exception.
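
To make the failure mode concrete, here is a minimal Java sketch of a late EOF check. It is not Avro's BinaryDecoder code: the class LateEofCheckSketch, the method readIntLateCheck, and the (buf, pos, limit) parameters are hypothetical names chosen for illustration. Only buf[pos..limit) is assumed to hold real stream data; anything at or beyond limit stands in for stale buffer contents.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical, simplified model of a buffered varint reader; not Avro's BinaryDecoder. */
final class LateEofCheckSketch {

  /**
   * Decodes a zigzag varint int starting at buf[pos].
   * The caller must guarantee at least 5 addressable bytes from pos, which
   * prevents an index overflow but says nothing about how many bytes are real.
   */
  static int readIntLateCheck(byte[] buf, int pos, int limit) throws IOException {
    int n = 0;
    int shift = 0;
    int b;
    do {
      if (shift > 28) {
        // Five bytes were read and the fifth still had its continuation bit set.
        throw new IOException("Invalid int encoding");
      }
      b = buf[pos++] & 0xff;      // may be a stale byte at or past limit
      n |= (b & 0x7f) << shift;
      shift += 7;
    } while (b > 0x7f);
    if (pos > limit) {
      // The EOF check comes only after decoding, so it is skipped whenever
      // the stale bytes were malformed and the exception above fired first.
      throw new EOFException();
    }
    return (n >>> 1) ^ -(n & 1);  // zigzag back to two's complement
  }
}
```

With a buffer full of stale 0xff bytes and limit equal to pos, this sketch throws IOException("Invalid int encoding") instead of EOFException. If the stale byte happens to be a valid one-byte varint such as 0x02, the final check does raise EOFException, which is why the problem shows up only sometimes.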

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (AVRO-1097) BinaryDecoder does not detect EOF sometimes

Posted by "Thiruvalluvan M. G. (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvalluvan M. G. updated AVRO-1097:
--------------------------------------

    Status: Patch Available  (was: Open)
    

[jira] [Updated] (AVRO-1097) BinaryDecoder does not detect EOF sometimes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-1097:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I committed this.  Thanks, Thiru!
                

[jira] [Updated] (AVRO-1097) BinaryDecoder does not detect EOF sometimes

Posted by "Thiruvalluvan M. G. (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvalluvan M. G. updated AVRO-1097:
--------------------------------------

    Attachment: AVRO-1097-test.patch

+1. I'm fine with Doug's patch.

The attached test reproduces the problem.


                

[jira] [Updated] (AVRO-1097) BinaryDecoder does not detect EOF sometimes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-1097:
-------------------------------

    Attachment: AVRO-1097.patch

This doesn't seem like it should be that high of a priority, since it's just about getting a more meaningful exception for bad data.

Could we instead just put the check into ensureBounds()?  That should keep this from affecting performance.  Here's a patch that does that.

The change from EOFException to IOException looks right, but it causes some other tests to fail.  Perhaps we should update those other tests?

Also, we should add some new tests for the specific case being fixed (EOF reading int or long).
                

[jira] [Updated] (AVRO-1097) BinaryDecoder does not detect EOF sometimes

Posted by "Thiruvalluvan M. G. (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvalluvan M. G. updated AVRO-1097:
--------------------------------------

    Attachment: AVRO-1097.patch

This patch addresses the issue by checking whether there are indeed bytes to be decoded. If not, it throws an EOFException. If there are some bytes and they do not constitute a valid zigzag encoding, it throws an "Invalid int/long encoding" exception.
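
The behaviour described above could look roughly like the following Java sketch, which confirms that a real byte is available before reading it. This is an illustration under stated assumptions, not the attached patch: the class EarlyEofCheckSketch, the method readInt, and the (buf, pos, limit) parameters are hypothetical, and only buf[pos..limit) is assumed to hold real stream data.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical sketch of an availability-first varint reader; not the AVRO-1097 patch. */
final class EarlyEofCheckSketch {

  /** Decodes a zigzag varint int from buf[pos..limit), never touching bytes at or past limit. */
  static int readInt(byte[] buf, int pos, int limit) throws IOException {
    int n = 0;
    int shift = 0;
    int b;
    do {
      if (shift > 28) {
        // Five real bytes were consumed and the fifth still asked for more.
        throw new IOException("Invalid int encoding");
      }
      if (pos >= limit) {
        // No real byte left: report end of stream instead of decoding stale data.
        throw new EOFException();
      }
      b = buf[pos++] & 0xff;
      n |= (b & 0x7f) << shift;
      shift += 7;
    } while (b > 0x7f);
    return (n >>> 1) ^ -(n & 1);  // zigzag back to two's complement
  }
}
```

In this sketch, a truncated value yields EOFException, while five real bytes that all carry the continuation bit yield IOException("Invalid int encoding"). Checking per byte is the simplest way to show the idea; it says nothing about where the check is cheapest to place in the real decoder.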
                