Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2017/07/31 21:36:00 UTC

[jira] [Commented] (KUDU-2085) Seek past last element of a prefix-encoded binary block may crash

    [ https://issues.apache.org/jira/browse/KUDU-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16108003#comment-16108003 ] 

Todd Lipcon commented on KUDU-2085:
-----------------------------------

Here's an example of a block that is incorrectly handled:

{code}
000000: 1080 3930 1000 0009 d92e e41b ddb6 bf21 ..90...........!
          NN      RR   AABB XX

NN: number of elements = 128
RR: restart interval = 16
AA: number of 'shared' (prefix) chars for first element = 0
BB: number of 'non-shared' chars for first element = 9
XX: offset 7 (pointed to by incorrectly-interpreted offset at
              end of block)
    <actually the beginning of the data for the first element>

000010: 3008 0131 0801 3208 0133 0801 3408 0135 0..1..2..3..4..5
...<snip>...
0001d0: 340a 0135 0a01 360a 0137 3f00 0000 7a00 4..5..6..7?...z.
0001e0: 0000 b400 0000 ef00 0000 2901 0000 6301 ..........)...c.
0001f0: 0000 9f01 0000 0700 0000 
                       <------->
                      number of restarts
                     (misinterpreted as a restart offset)
{code}

Here the num_restarts=7 at the end of the block is instead interpreted as an offset to a value, so we try to decode the data 'd9 2e e4 1b' as two varints: <shared=5977, non-shared=3556>. These lengths clearly point past the end of the block, which causes the Corruption status: "Could not decode value length data at idx 128".
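
The varint arithmetic above can be checked with a short sketch. This is illustrative Python, not Kudu's C++ decoder: the helper name decode_varint32 is hypothetical, and it assumes the usual little-endian base-128 varint layout (7 payload bits per byte, high bit set on every byte except the last). The block size of 506 bytes is read off the dump (rows up to 0x1f0 plus 10 trailing bytes).

```python
def decode_varint32(buf, pos=0):
    """Decode one base-128 varint starting at buf[pos].

    Returns (value, next_pos). Hypothetical helper for illustration only.
    """
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not (byte & 0x80):              # high bit clear: last byte
            return result, pos
        shift += 7


# The four bytes that get misread as an element header.
data = bytes.fromhex("d92ee41b")
shared, pos = decode_varint32(data)
non_shared, pos = decode_varint32(data, pos)

# Total block length as shown in the dump: 0x1f0 + 10 bytes.
BLOCK_SIZE = 0x1F0 + 10

# A value of this length cannot fit inside the block.
overruns = non_shared > BLOCK_SIZE

print(shared, non_shared, overruns)
```

Both decoded lengths are far larger than the 506-byte block, which is why the decoder reports corruption rather than returning a value.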


> Seek past last element of a prefix-encoded binary block may crash
> -----------------------------------------------------------------
>
>                 Key: KUDU-2085
>                 URL: https://issues.apache.org/jira/browse/KUDU-2085
>             Project: Kudu
>          Issue Type: Bug
>          Components: cfile
>    Affects Versions: 1.0.1, 1.1.0, 1.2.0, 1.3.1, 1.4.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>
> Similar to KUDU-2049, the binary prefix block encoder has a bug when seeking past the end of the block (i.e., to the offset past the last element). The bug only causes issues in very specific circumstances:
> - the number of elements in the block has to be a multiple of 16 (the "restart interval")
> -- this causes the code to interpret the "restart count" at the end of the block data as an offset instead of part of the footer.
> - this value, when interpreted as an offset, points to a piece of data in the block which, when interpreted as a varint, ends up large enough to point past the end of the block.
> This results in an error like:
> F0730 09:56:07.291882 124055 binary_prefix_block.cc:325] Check failed: _s.ok() Bad status: Corruption: Could not decode value length data at idx 32
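
The first condition in the description can be illustrated with the trailer bytes from the dump in the comment above. This is a rough Python sketch, not the actual code in binary_prefix_block.cc. It assumes the trailer is an array of little-endian uint32 restart offsets followed by a uint32 restart count, and that the restart for element 0 is implicit (not stored). The function naive_restart_offset is hypothetical and deliberately omits the bounds check.

```python
import struct

# Trailer bytes copied from the dump: seven restart offsets, then num_restarts.
trailer = bytes.fromhex(
    "3f000000 7a000000 b4000000 ef000000"
    "29010000 63010000 9f010000 07000000"
)

# The last four bytes are the restart count (part of the footer).
num_restarts = struct.unpack_from("<I", trailer, len(trailer) - 4)[0]
restarts = struct.unpack_from("<%dI" % num_restarts, trailer, 0)


def naive_restart_offset(elem_idx, interval=16):
    """Look up the restart offset covering elem_idx with no bounds check.

    Hypothetical sketch: assumes restart 0 is implicit, so restart r (r >= 1)
    lives in slot r - 1 of the stored array.
    """
    r = elem_idx // interval
    if r == 0:
        return None  # first restart: data starts right after the header
    return struct.unpack_from("<I", trailer, (r - 1) * 4)[0]


# Seeking to the last stored restart (element 112) reads a real offset.
last_valid = naive_restart_offset(112)

# Seeking to element 128 (one past the last element) reads slot 7,
# which is the num_restarts field rather than an offset.
past_end = naive_restart_offset(128)

print(restarts, last_valid, past_end)
```

Under these assumptions, the out-of-range slot is reached only when the seek index is an exact multiple of the restart interval, which matches the first condition in the description.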


