Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2019/01/14 06:16:00 UTC

[jira] [Comment Edited] (HBASE-20928) Rewrite calculation of midpoint in binarySearch functions to prevent overflow

    [ https://issues.apache.org/jira/browse/HBASE-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741782#comment-16741782 ] 

Lars Hofhansl edited comment on HBASE-20928 at 1/14/19 6:15 AM:
----------------------------------------------------------------

The cited article _explicitly_ states that

{{int mid = (low + high) >>> 1;}}

is OK and likely faster. I'm somewhat confused then about this change, especially since said article is given as the reason.
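
For reference, a minimal sketch comparing the three usual midpoint formulas when {{low + high}} exceeds {{Integer.MAX_VALUE}}. The class and method names are hypothetical and are not taken from the HBase patch. It only illustrates the arithmetic and assumes {{low}} and {{high}} are non-negative array indices with {{low <= high}}.

```java
// Hypothetical demo class; none of these names come from HBase.
class MidpointDemo {
    // Signed division: the int sum wraps to a negative value, and so does the result.
    static int midSignedDivide(int low, int high) {
        return (low + high) / 2;
    }

    // Unsigned shift: the wrapped sum is read as an unsigned 32-bit value,
    // so the result is correct for non-negative low and high.
    static int midUnsignedShift(int low, int high) {
        return (low + high) >>> 1;
    }

    // Subtraction form: never computes low + high, so nothing overflows
    // as long as 0 <= low <= high.
    static int midSubtract(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000; // low + high = 3_500_000_000 > Integer.MAX_VALUE

        System.out.println(midSignedDivide(low, high));  // -397483648
        System.out.println(midUnsignedShift(low, high)); // 1750000000
        System.out.println(midSubtract(low, high));      // 1750000000
    }
}
```

Under those assumptions, the unsigned-shift and subtraction forms agree, and only the signed-division form returns a negative index.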

[~stack]

I'm testing HBase 2 against HBase 1 right now and find that its scan performance is 50% worse in many cases (with ROW_INDEX_V1 at least). I'm not saying this change is the cause, but it came up as a change in that code path.


was (Author: lhofhansl):
The cited article _explicitly_ states that

{{int mid = (low + high) >>> 1;}}

is OK and likely faster. I'm somewhat confused then about this change, especially since said article is given as the reason.

 

> Rewrite calculation of midpoint in binarySearch functions to prevent overflow
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-20928
>                 URL: https://issues.apache.org/jira/browse/HBASE-20928
>             Project: HBase
>          Issue Type: Bug
>          Components: io
>            Reporter: saurabh singh
>            Assignee: saurabh singh
>            Priority: Minor
>             Fix For: 1.5.0, 2.2.0
>
>         Attachments: HBASE-20928-addendum.patch, HBASE-20928-fix-binarySearch-v5.patch, HBASE-20928-fix-binarySearch-v5.patch
>
>
> There are a couple of issues in the function:
>  * The {{>>>}} operator would mess up the values if {{low}} + {{high}} ends up being negative. This shouldn't happen, but I don't see anything that prevents it.
>  * The code fails around boundary values of {{low}} and {{high}}. This is a well-known binary search pitfall. [https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html]
>  
> Most of the code should already be covered by tests. I would have liked to add a test that actually fails without the fix, but given that these are private methods I am not sure about the best place to add it. Suggestions?
