Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2012/06/17 11:24:42 UTC

[jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix

    [ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393509#comment-13393509 ] 

Lars Hofhansl commented on HBASE-6200:
--------------------------------------

The problem is that families are not compared at all (so in the case above '' is less than 'a').
Actually, that is correct as far as the StoreScanners, etc, are concerned, as there need not be an explicit ordering between KVs in different CFs - they are stored in different stores. So the cycles comparing families can be spared.

Of course, when we want to sort KVs from multiple stores (as in this case a ResultSet) this comes back to bite us.

It seems we need an extra comparator that includes families, for use in cases where KVs from multiple stores are compared.
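
To illustrate the difference, here is a minimal, self-contained sketch. It does not use the HBase API: the class and method names are hypothetical, and it assumes the family-blind comparison amounts to comparing the family and qualifier bytes as one continuous run. Under that assumption "f:a" sorts after "f1:", while a comparator that looks at the family first puts "f:a" before "f1:".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not HBase code: contrasts a family-blind column
// comparison with one that compares the family before the qualifier.
public class ColumnCompareSketch {

    // Lexicographic comparison of unsigned bytes; on a shared prefix the
    // shorter array sorts first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    static byte[] concat(byte[] x, byte[] y) {
        byte[] out = new byte[x.length + y.length];
        System.arraycopy(x, 0, out, 0, x.length);
        System.arraycopy(y, 0, out, x.length, y.length);
        return out;
    }

    // Family-blind (assumed behaviour): family and qualifier are treated
    // as a single byte string, so the family boundary is lost.
    static int compareConcatenated(byte[] fam1, byte[] qual1,
                                   byte[] fam2, byte[] qual2) {
        return compareBytes(concat(fam1, qual1), concat(fam2, qual2));
    }

    // Family-aware: compare families first, qualifiers only on a tie.
    static int compareFamilyThenQualifier(byte[] fam1, byte[] qual1,
                                          byte[] fam2, byte[] qual2) {
        int d = compareBytes(fam1, fam2);
        return d != 0 ? d : compareBytes(qual1, qual2);
    }

    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "f:a" versus "f1:" (family "f1", empty qualifier)
        System.out.println(
            compareConcatenated(b("f"), b("a"), b("f1"), b("")) > 0);        // true
        System.out.println(
            compareFamilyThenQualifier(b("f"), b("a"), b("f1"), b("")) < 0); // true
    }
}
```

The family-aware variant costs one extra comparison per call, which is the overhead a store-local comparator can skip because all of its KVs share one family.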

I forgot why we decided not to remove all the comparators from the KeyValue class, but in this case there could be one comparator in hbase-server that ignores families and another in common (or, in the future, client) that includes them.

                
> KeyComparator.compareWithoutRow can be wrong when families have the same prefix
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-6200
>                 URL: https://issues.apache.org/jira/browse/HBASE-6200
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.6, 0.92.1, 0.94.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
>
> As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh
> Basically, {{KeyComparator.compareWithoutRow}} doesn't differentiate families from qualifiers, so "f:a" is said to be bigger than "f1:", which is false. The KVs are returned in the right order from the RS, but {{Result.binarySearch}} then uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined.
> I added some debug output and can see that the data is returned in the right order, but {{Arrays.binarySearch}} returns the wrong KV. That KV is then verified against the passed family and qualifier, the check fails, and null is returned.
> I don't know how common it is for users to have families that share a prefix, but those who do, and who use those families at the same time, will have big correctness issues. This is why I marked this as a blocker.
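
The binary-search failure described above can be reproduced in isolation. The following is a hedged sketch rather than the actual {{Result}} code: columns are modeled as hypothetical {family, qualifier} string pairs, ASCII strings stand in for byte arrays, and the family-blind ordering is assumed to compare family and qualifier as one concatenated string. The array is sorted family-aware (as the RS returns it), so searching it with the family-blind comparator misses an entry that is present.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch, not HBase code: binary search over data sorted with
// one ordering but probed with another.
public class BinarySearchMismatchSketch {

    // Family first, then qualifier.
    static final Comparator<String[]> FAMILY_AWARE = (x, y) -> {
        int d = x[0].compareTo(y[0]);
        return d != 0 ? d : x[1].compareTo(y[1]);
    };

    // Assumed family-blind behaviour: family and qualifier run together.
    static final Comparator<String[]> FAMILY_BLIND =
        (x, y) -> (x[0] + x[1]).compareTo(y[0] + y[1]);

    // Columns f:a, f:b, f1:, f1:x in family-aware sorted order.
    static final String[][] COLUMNS = {
        {"f", "a"}, {"f", "b"}, {"f1", ""}, {"f1", "x"}
    };

    // Returns the index of the column, or a negative value if not found.
    static int find(String family, String qualifier, Comparator<String[]> cmp) {
        return Arrays.binarySearch(COLUMNS, new String[] {family, qualifier}, cmp);
    }

    public static void main(String[] args) {
        System.out.println(find("f1", "", FAMILY_AWARE)); // 2
        System.out.println(find("f1", "", FAMILY_BLIND)); // -1 (entry missed)
    }
}
```

Arrays.binarySearch only gives defined results when the array is sorted according to the comparator passed to it, which is why the mismatch surfaces as a missing column rather than an exception.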
