You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2018/10/17 13:08:00 UTC

[jira] [Comment Edited] (LUCENE-8533) Incorrect readVInt: negative number could be returned

    [ https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653507#comment-16653507 ] 

Uwe Schindler edited comment on LUCENE-8533 at 10/17/18 1:07 PM:
-----------------------------------------------------------------

In contrast to readVLong, this is explicitely implemented like this for backwards compatibility with older indexes. Also it allows negative numbers (and they were used quite often). The explanation is here: LUCENE-3738 (this is also why the Exception looks different for readVLong and readVInt).

See also the inverse, writeVLong: https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/store/DataOutput.html#writeVInt-int-

bq. Writes an int in a variable-length format. Writes between one and five bytes. Smaller values take fewer bytes. Negative numbers are supported, but should be avoided.

I think the documentation in readVLong is just wrong (was copypasted).

But as we no longer support old indexes, we may be able to improve this, so your patch can be investigated. In that case also writeVInt needs to be changed to disallow negative numbers.

In addition, If we change this, the exception needs to be consistent with readVLong! Please change that, too.


was (Author: thetaphi):
In contrast to readVLong, this is explicitely implemented like this for backwards compatibility with older indexes. Also it allows negative numbers (and they were used quite often). The explanation is here: LUCENE-3738 (this is also why the Exception looks different for readVLong and readVInt).

See also the inverse, writeVLong: https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/store/DataOutput.html#writeVInt-int-

bq. Writes an int in a variable-length format. Writes between one and five bytes. Smaller values take fewer bytes. Negative numbers are supported, but should be avoided. I think the documentation in readVLong is just wrong (was copypasted).

But as we no longer support old indexes, we may be able to improve this, so your patch can be investigated. In that case also writeVInt needs to be changed to disallow negative numbers.

In addition, If we change this, the exception needs to be consistent with readVLong! Please change that, too.

> Incorrect readVInt: negative number could be returned
> -----------------------------------------------------
>
>                 Key: LUCENE-8533
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8533
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Vladimir Dolzhenko
>            Assignee: Uwe Schindler
>            Priority: Major
>         Attachments: readVInt.patch
>
>
> {{readVInt()}} has to return positive numbers (and zero), throw some exception in case of negative numbers.
> While for the sequence of bytes {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.
> simplifying [readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113] up to last readByte (exclusive):
> {code:java}
> int i = ((byte)-1) & 0x7F;
> i |= (((byte)-1) & 0x7F) << 7;
> i |= (((byte)-1) & 0x7F) << 14;
> i |= (((byte)-1) & 0x7F) << 21;
> {code}
> Here {{i = 268435455}} or in binary format is {{00001111_11111111_11111111_11111111}}
> Keeping in mind that {{int}} is a signed type we have only 3 more bits before overflow happens or in another words {{(Integer.MAX_VALUE - i) >> 28 == 7}} - that's max value could be stored in 5th byte to avoid overflow.
> Instead of 
> {code:java}
> i |= (b & 0x0F) << 28;
> if ((b & 0xF0) == 0) return i;
> {code}
> has to be
> {code:java}
> i |= (b & 0x07) << 28;
> if ((b & 0xF8) == 0) return i;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org