Posted to issues@kudu.apache.org by "Bankim Bhavsar (Jira)" <ji...@apache.org> on 2019/10/09 20:58:00 UTC

[jira] [Updated] (KUDU-2968) RleDecoder::GetNextRun() may attempt decoding past the last byte leading to assertion failure

     [ https://issues.apache.org/jira/browse/KUDU-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bankim Bhavsar updated KUDU-2968:
---------------------------------
    Status: In Review  (was: Open)

> RleDecoder::GetNextRun() may attempt decoding past the last byte leading to assertion failure
> ---------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2968
>                 URL: https://issues.apache.org/jira/browse/KUDU-2968
>             Project: Kudu
>          Issue Type: Bug
>          Components: util
>            Reporter: Bankim Bhavsar
>            Assignee: Bankim Bhavsar
>            Priority: Major
>
> RLE encoding may encode "literally" when it doesn't find sufficient repeated values.
> See [https://github.com/apache/kudu/blob/master/src/kudu/util/rle-encoding.h#L28]
> Consider a scenario where consecutive (non-repeated) integers are encoded using RLE encoding. In that case the values are encoded literally. The literal count is also encoded, and it is always a multiple of 8.
> When the number of values is not a multiple of 8, the literal count is rounded up to the next multiple of 8.
> For example, if the number of values is 100, then literal_count is 104, but max_bytes is correctly set to 100 for the int8 datatype.
> In this scenario, after the last value is read while {{ret}} is 0, literal_count is still 4.
> Hence the next {{GetValue}} call returns false, since it tries to read beyond {{max_bytes}}, and {{DCHECK(result)}} fails.
> https://github.com/apache/kudu/blob/master/src/kudu/util/rle-encoding.h#L319
> {code}
>       DCHECK(literal_count_ > 0);
>       if (ret == 0) {
>         bool has_more = bit_reader_.GetValue(bit_width_, val);
>         DCHECK(has_more);
>         literal_count_--;
>         ret++;
>         rem--;
>       }
>       while (literal_count_ > 0) {
>         bool result = bit_reader_.GetValue(bit_width_, &current_value_);
>         DCHECK(result);
>         if (current_value_ != *val || rem == 0) {
>           bit_reader_.Rewind(bit_width_);
>           return ret;
>         }
>         ret++;
>         rem--;
>         literal_count_--;
>       }
>     }
>   }
> {code}
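
The arithmetic in the description can be reproduced with a small standalone sketch. This is not Kudu code: RoundUpToGroupOf8, BoundedReader, and CountReadableLiterals are hypothetical names introduced only for illustration, and the sketch assumes one byte per value (bit width 8, as in the int8 example). It shows that a literal count of 104 over a 100-byte buffer leaves 4 padding slots with no data behind them, and that a loop which stops on a failed read, rather than asserting on it, returns only the 100 real values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: rounds a value count up to the next multiple of 8,
// mirroring the literal-count rounding described in the report.
inline int RoundUpToGroupOf8(int num_values) {
  assert(num_values >= 0);
  return ((num_values + 7) / 8) * 8;
}

// Hypothetical stand-in for a reader bounded by max_bytes.
// Assumes one byte per value (bit width 8).
class BoundedReader {
 public:
  explicit BoundedReader(const std::vector<uint8_t>& buf) : buf_(buf), pos_(0) {}

  // Returns false instead of reading past the end of the buffer.
  bool GetValue(uint8_t* out) {
    if (pos_ >= buf_.size()) return false;
    *out = buf_[pos_++];
    return true;
  }

 private:
  const std::vector<uint8_t>& buf_;
  size_t pos_;
};

// Counts how many literal values are actually backed by data when the
// advertised literal_count may include padding. A failed read is treated
// as "end of data" rather than as an invariant violation.
int CountReadableLiterals(const std::vector<uint8_t>& data, int literal_count) {
  BoundedReader reader(data);
  int read = 0;
  uint8_t value = 0;
  while (literal_count > 0) {
    if (!reader.GetValue(&value)) break;  // padding slot, nothing to decode
    ++read;
    --literal_count;
  }
  return read;
}
```

With 100 one-byte values, RoundUpToGroupOf8(100) yields 104, and CountReadableLiterals stops after 100 reads. Whether the actual fix should stop on a failed read or correct the literal count up front is a design choice this sketch does not settle.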



--
This message was sent by Atlassian Jira
(v8.3.4#803005)