Posted to dev@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2008/08/14 19:29:44 UTC

[jira] Created: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Problem with row keys beginnig with characters < than ',' and the region location cache
---------------------------------------------------------------------------------------

                 Key: HBASE-832
                 URL: https://issues.apache.org/jira/browse/HBASE-832
             Project: Hadoop HBase
          Issue Type: Bug
          Components: client
    Affects Versions: 0.2.0
            Reporter: Jean-Daniel Cryans
             Fix For: 0.2.1, 0.3.0


We currently have a problem with the way we design .META. row keys. When user table row keys begin with a character that sorts lower than ',' (such as '$'), any operation will fail when:

- A client has a certain set of regions in cache
- One region with the faulty row key splits 
- The client receives a request for a row in the split region

The reason is that the client first gets a NotServingRegionException (NSRE), then tries to locate a region using the passed row key. For example:
Row in META: entities,,1216750777411
Row passed: entities,$-94f9386f-e235-4cbd-aacc-37210a870991,99999999999999

The passed row sorts lower than the row in .META., because '$' (0x24) is less than ',' (0x2C) in a byte-wise comparison.
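
To make the ordering problem concrete, here is a minimal sketch that compares the two example keys byte by byte. The class and method names (RawMetaKeyOrder, compareBytes, bytes) are made up for illustration and are not HBase code; the only assumption is that keys are compared lexicographically as unsigned bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: shows how a flat byte-wise
// comparison orders the two keys from the example above.
final class RawMetaKeyOrder {
    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] inMeta = bytes("entities,,1216750777411");
        byte[] passed = bytes("entities,$-94f9386f-e235-4cbd-aacc-37210a870991,99999999999999");
        // Both keys share the prefix "entities,"; the next bytes are
        // ',' (0x2C) in the META row and '$' (0x24) in the passed row.
        System.out.println(compareBytes(passed, inMeta) < 0);
    }
}
```

Because the passed key sorts before the row of the table's first region, a "closest row at or before" lookup cannot land on that region.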


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626384#action_12626384 ] 

stack commented on HBASE-832:
-----------------------------

J-D: One thing to keep in mind is that going between 0.2.0 and 0.2.1, a migration should not be necessary.  I'm afraid that changing the format of the key in meta will require a migration.



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622698#action_12622698 ] 

stack commented on HBASE-832:
-----------------------------

If the region is a 'meta' region, could we write row keys with a subclass of HSK (HStoreKey) named something like MetaHSK?  MetaHSK would not treat the row as a byte array but would instead do a simple parse to pull out the table name and timestamp components.  The remainder would be the start key.  It should then be possible to do a compare that is not susceptible to changed ordering just because the start key contains the delimiter.

We might need one version for root and another for meta, given that the root's rows are made of the meta table's rows.
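
A rough sketch of that idea follows. It is not the patch attached to this issue: MetaRowKey is a hypothetical stand-in for the suggested MetaHSK, and it assumes meta rows look like "table,startkey,timestamp", that table names contain no ',', and that the trailing component parses as a long.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a parse-aware meta row key, not HBase code.
final class MetaRowKey implements Comparable<MetaRowKey> {
    final String table;
    final byte[] startKey;
    final long regionId;

    private MetaRowKey(String table, byte[] startKey, long regionId) {
        this.table = table;
        this.startKey = startKey;
        this.regionId = regionId;
    }

    /** Splits at the first and last ','; whatever lies between is the start key. */
    static MetaRowKey parse(String row) {
        int first = row.indexOf(',');
        int last = row.lastIndexOf(',');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("not a meta row: " + row);
        }
        String table = row.substring(0, first);
        byte[] startKey = row.substring(first + 1, last).getBytes(StandardCharsets.UTF_8);
        long regionId = Long.parseLong(row.substring(last + 1));
        return new MetaRowKey(table, startKey, regionId);
    }

    /** Compares table, then start key (unsigned bytes), then the numeric suffix. */
    @Override
    public int compareTo(MetaRowKey other) {
        int c = table.compareTo(other.table);
        if (c != 0) {
            return c;
        }
        c = compareBytes(startKey, other.startKey);
        if (c != 0) {
            return c;
        }
        return Long.compare(regionId, other.regionId);
    }

    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

With this ordering, the passed row from the issue description compares greater than "entities,,1216750777411", because the empty start key sorts before every non-empty one regardless of the delimiter byte.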



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626730#action_12626730 ] 

Jim Kellerman commented on HBASE-832:
-------------------------------------

Reviewed patch. Some issues and questions:
- Missing javadoc for @param in HStoreKey, Memcache
- lines too long in HStoreKey and HStore
- I am still unclear on why the table name needs to be put in and stripped out for the ROOT and Meta regions. Can you explain?




[jira] Updated: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-832:
-------------------------------------

    Attachment: hbase-832-v1.patch

Patch that attempts to compare rows from META and ROOT differently. Please review.



[jira] Assigned: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans reassigned HBASE-832:
----------------------------------------

    Assignee: Jean-Daniel Cryans

I will try to do it the Bigtable way: build the meta row keys with only the table and the end row, since it is the "," between the two keys that is problematic.



[jira] Issue Comment Edited: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622657#action_12622657 ] 

apurtell edited comment on HBASE-832 at 8/14/08 12:42 PM:
----------------------------------------------------------------

Of the set of bytes less than ',', '$' and '*' are commonly used for special keys in various applications. '!' is also a possibility.

Why not use the space character instead of comma? 

      was (Author: apurtell):
    Of the set of bytes < ','  '$' and '*' are commonly used for special keys in various operations. '!' is a possibility also. 

Why not use the space character instead of comma? 
  


[jira] Updated: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-832:
-------------------------------------

      Component/s: regionserver
    Fix Version/s:     (was: 0.19.0)
                   0.18.0
                   0.2.1
         Priority: Blocker  (was: Major)

Getting worse. Here is another symptom:

{code}
org.apache.hadoop.hbase.regionserver.WrongRegionException: org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for HRegion web_pages,http://www.altitude737.com/choix_centre.html,1219807997979, startKey='http://www.altitude737.com/choix_centre.html', getEndKey()='http://www.amoll.qc.ca/', row='http://www.amoll.qc.ca/%8elections.html'
	at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:1703)
	at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1759)
	at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1383)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1145)
	at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
{code}

Made this a blocker for 0.2.1 and 0.18.0
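
The exception message reads like the result of a range check on the region's start and end keys. The sketch below is an assumption about what such a check looks like, not the actual HRegion.checkRow code; RowRangeCheck and its methods are hypothetical names, and it assumes an empty key means "unbounded" on that side.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a region range check, not the HBase implementation.
final class RowRangeCheck {
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** True when startKey <= row < endKey; an empty key is treated as unbounded (assumption). */
    static boolean rowInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart = startKey.length == 0 || compareBytes(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compareBytes(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    public static void main(String[] args) {
        byte[] start = bytes("http://www.altitude737.com/choix_centre.html");
        byte[] end = bytes("http://www.amoll.qc.ca/");
        byte[] row = bytes("http://www.amoll.qc.ca/%8elections.html");
        // The end key is a strict prefix of the row, so the row is not below it.
        System.out.println(rowInRange(start, end, row));
    }
}
```

Under these assumptions the row from the trace is at or past the region's end key, which is consistent with the client having been pointed at the wrong region.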



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627070#action_12627070 ] 

stack commented on HBASE-832:
-----------------------------

Would suggest that tests do actual full HSK compares rather than just row positions.

Maybe add a note to the javadoc of the data member 'tablename' that it's not serialized as part of HSK (point at this issue?).

I should look closer, but does it need to be passed into HSK?  Or does HStoreKey.compareTwoRowKeys not suffice in all cases?

If you passed HRI instead of table name to HSK, you could do HRI.isMetatable and HRI.isRoottable rather than do the table name compares you're currently doing.





[jira] Updated: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-832:
-------------------------------------

    Attachment: hbase-832-v3-trunk.patch

Cleaner patch, adds a test and uses compareTo. Review please.



[jira] Updated: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-832:
--------------------------------

    Fix Version/s:     (was: 0.18.0)
                   0.19.0



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622671#action_12622671 ] 

stack commented on HBASE-832:
-----------------------------

Taking a quick look: when inserting into memcache and out to store files, we use HStoreKey.  The comparator in HSK knows how to make sense of our row/column/ts keys.  So the problem is elsewhere in the system; somewhere we are looking at the 'row' as a raw byte array and using the dumb Bytes.BYTE_COMPARATOR when it should be a byte comparator that knows it's an HSK and parses it appropriately (i.e. going left to right up to the DELIMITER, first compare the row component, then parse out the timestamp, and the remainder is the column; compare this, then the ts).  I wonder if it's in our MR classes where we're using ImmutableBytesWritable?  Perhaps we should be using something 'smarter'.
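
To illustrate the difference between a "dumb" byte comparator and one that understands the key's components, here is a small sketch. StoreKeySketch is a made-up class, not HStoreKey; the ',' delimiter in flatten() and the ascending timestamp order are arbitrary choices for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;

// Hypothetical illustration, not HBase code: component-wise ordering
// versus byte-wise ordering of a flattened, delimited key.
final class StoreKeySketch {
    final String row;
    final String column;
    final long timestamp;

    StoreKeySketch(String row, String column, long timestamp) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
    }

    /** Compares row first, then column, then timestamp (ascending here by choice). */
    static final Comparator<StoreKeySketch> COMPONENT_ORDER = (a, b) -> {
        int c = compareBytes(bytes(a.row), bytes(b.row));
        if (c != 0) {
            return c;
        }
        c = compareBytes(bytes(a.column), bytes(b.column));
        if (c != 0) {
            return c;
        }
        return Long.compare(a.timestamp, b.timestamp);
    };

    /** Flattens the key into one delimited byte array, as a raw comparator would see it. */
    byte[] flatten() {
        return bytes(row + "," + column + "," + timestamp);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

For rows "a" and "a!" with the same column and timestamp, COMPONENT_ORDER puts "a" first, while compareBytes on the flattened keys puts "a!" first, because '!' (0x21) sorts below the ',' delimiter (0x2C).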



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626601#action_12626601 ] 

Jean-Daniel Cryans commented on HBASE-832:
------------------------------------------

Yeah, I have it in mind. For 0.2.1, we should specify in the release notes that this bug is very important and that it will be fixed in 0.3.0.



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626732#action_12626732 ] 

Jim Kellerman commented on HBASE-832:
-------------------------------------

19 javadoc warnings. Run

ant clean javadoc





[jira] Resolved: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-832.
-------------------------

    Resolution: Fixed

Looks good, J-D.  I tested it by running a decent load on a cluster and ran all unit tests.  The patch is ugly, but we can fix that later when we have the luxury of a migration (HBASE-859).  Applied to trunk and branch.

> Problem with row keys beginnig with characters < than ',' and the region location cache
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-832
>                 URL: https://issues.apache.org/jira/browse/HBASE-832
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.2.1, 0.18.0
>
>         Attachments: hbase-832-v1.patch, hbase-832-v2.patch, hbase-832-v3-trunk.patch


[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627074#action_12627074 ] 

Jean-Daniel Cryans commented on HBASE-832:
------------------------------------------

bq. Would suggest that tests do actual full HSK compares rather than just row positions.

You mean using compareTo?

bq. Maybe add a note to the javadoc of the data member 'tablename' that it's not serialized as part of HSK (point at this issue?).

Indeed.

bq. I should look closer, but does it need to be passed into HSK? Or does HStoreKey.compareTwoRowKeys not suffice in all cases?

I passed it when the compareTo method was used. Sometimes in the code it was a row comparison, other times it was a HSK comparison in which I had to make sure that we checked the rows correctly.

bq. If you passed HRI instead of table name to HSK, you could do HRI.isMetatable and HRI.isRoottable rather than do the table name compares you're currently doing.

Indeed.

> Problem with row keys beginnig with characters < than ',' and the region location cache
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-832
>                 URL: https://issues.apache.org/jira/browse/HBASE-832
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.2.1, 0.18.0
>
>         Attachments: hbase-832-v1.patch, hbase-832-v2.patch
>
>
> We currently have a problem the way we design .META. row keys. When user table row keys begin with characters lesser than ',' like a '$', any operation will fail when:
> - A client has a certain set of regions in cache
> - One region with the faulty row key splits 
> - The client receives a request for a row in the split region
> The reason is that it will first get a NSRE then it will try to locate a region using the passed row key. For example: 
> Row in META: entities,,1216750777411
> Row passed: entities,$-94f9386f-e235-4cbd-aacc-37210a870991,99999999999999
> The passed row is lesser then the row in .META.



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627133#action_12627133 ] 

stack commented on HBASE-832:
-----------------------------

Yeah, I mean doing compareTo... Add tests to the testHStoreKey method that take the messy meta keys.



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622668#action_12622668 ] 

Jonathan Gray commented on HBASE-832:
-------------------------------------

I think it's a very bad idea to put any kind of constraint on byte[]'s anywhere.

Moving to byte[] has given the flexibility to put anything you want, wherever you want.  I can use serialized java objects as row keys if I want to.

Having any kind of reserved characters limits your ability to blindly store binary objects wherever you'd like.  Often the encoding of a binary object is implemented in a library you don't touch; if it just so happens to use a reserved byte as its first byte, you're SOL or stuck doing manual munging to ensure you fit the constraints on what are valid byte[]'s.
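
As a small illustration of the point about binary encodings, here is a minimal sketch (not taken from HBase; the class and method names are hypothetical) showing that a very ordinary encoding, a big-endian 4-byte integer, yields a key whose first byte sorts below ',' (0x2C):

```java
import java.nio.ByteBuffer;

public class BinaryKeySketch {
    // Hypothetical helper: encode an int as a 4-byte big-endian row key.
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // First byte of the key, read as an unsigned value (0..255).
    static int firstUnsignedByte(byte[] key) {
        return key[0] & 0xff;
    }

    public static void main(String[] args) {
        byte[] key = encodeInt(42);                        // bytes: 00 00 00 2A
        System.out.println(firstUnsignedByte(key));        // prints 0
        System.out.println(firstUnsignedByte(key) < ',');  // prints true
    }
}
```

Any small non-negative integer encoded this way starts with a 0x00 byte, so a rule that reserves the bytes up to ',' would exclude such keys.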

> Problem with row keys beginnig with characters < than ',' and the region location cache
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-832
>                 URL: https://issues.apache.org/jira/browse/HBASE-832
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.3.0
>
>
> We currently have a problem the way we design .META. row keys. When user table row keys begin with characters lesser than ',' like a '$', any operation will fail when:
> - A client has a certain set of regions in cache
> - One region with the faulty row key splits 
> - The client receives a request for a row in the split region
> The reason is that it will first get a NSRE then it will try to locate a region using the passed row key. For example: 
> Row in META: entities,,1216750777411
> Row passed: entities,$-94f9386f-e235-4cbd-aacc-37210a870991,99999999999999
> The passed row is lesser then the row in .META.



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622678#action_12622678 ] 

stack commented on HBASE-832:
-----------------------------

Ignore my comment above.  It's off the mark.  The problem is not full row key sorting.  It's the sort of the row component of a key made of row/column/timestamp.  The issue is in meta, where rows are made of the tablename, delimiter, rowname, delimiter, timestamp.  If the rowname in the table has the delimiter in it, or bytes that are < the delimiter, then the sort can be off.
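
To make the sort problem concrete, here is a minimal, self-contained sketch using the two rows from the issue description. It is not the code from the attached patches: the class and method names are hypothetical, and the field-aware comparison assumes that table names never contain the delimiter and that the last field is a decimal number.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MetaRowSortSketch {
    static final byte DELIMITER = ',';

    /** Unsigned, byte-by-byte comparison of two byte arrays. */
    static int compareRaw(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Splits "table,startrow,id" at the first and the last delimiter. */
    static byte[][] split(byte[] key) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < key.length; i++) {
            if (key[i] == DELIMITER) {
                if (first < 0) {
                    first = i;
                }
                last = i;
            }
        }
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("expected table,startrow,id");
        }
        return new byte[][] {
            Arrays.copyOfRange(key, 0, first),
            Arrays.copyOfRange(key, first + 1, last),
            Arrays.copyOfRange(key, last + 1, key.length)
        };
    }

    /** Compares table name, then start row, then the numeric id, field by field. */
    static int compareMetaRows(byte[] a, byte[] b) {
        byte[][] x = split(a);
        byte[][] y = split(b);
        int c = compareRaw(x[0], y[0]);
        if (c != 0) {
            return c;
        }
        c = compareRaw(x[1], y[1]);
        if (c != 0) {
            return c;
        }
        long idA = Long.parseLong(new String(x[2], StandardCharsets.US_ASCII));
        long idB = Long.parseLong(new String(y[2], StandardCharsets.US_ASCII));
        return Long.compare(idA, idB);
    }

    public static void main(String[] args) {
        byte[] inMeta = "entities,,1216750777411".getBytes(StandardCharsets.UTF_8);
        byte[] passed = "entities,$-94f9386f-e235-4cbd-aacc-37210a870991,99999999999999"
            .getBytes(StandardCharsets.UTF_8);
        // Whole-key comparison: '$' (0x24) sorts before ',' (0x2C).
        System.out.println(compareRaw(passed, inMeta) < 0);      // prints true
        // Field-aware comparison: a non-empty start row sorts after the empty one.
        System.out.println(compareMetaRows(passed, inMeta) > 0); // prints true
    }
}
```

With the whole-key comparison the passed row lands before the table's first region, which matches the failure described in the issue; comparing field by field puts it after the empty start row, as expected.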





[jira] Updated: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-832:
-------------------------------------

    Fix Version/s:     (was: 0.2.1)

Won't fix for 0.2.1



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626314#action_12626314 ] 

stack commented on HBASE-832:
-----------------------------

Ugh.  +1 on fixing for 0.2.1

> Problem with row keys beginnig with characters < than ',' and the region location cache
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-832
>                 URL: https://issues.apache.org/jira/browse/HBASE-832
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.2.1, 0.18.0
>
>
> We currently have a problem the way we design .META. row keys. When user table row keys begin with characters lesser than ',' like a '$', any operation will fail when:
> - A client has a certain set of regions in cache
> - One region with the faulty row key splits 
> - The client receives a request for a row in the split region
> The reason is that it will first get a NSRE then it will try to locate a region using the passed row key. For example: 
> Row in META: entities,,1216750777411
> Row passed: entities,$-94f9386f-e235-4cbd-aacc-37210a870991,99999999999999
> The passed row is lesser then the row in .META.



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626827#action_12626827 ] 

Jean-Daniel Cryans commented on HBASE-832:
------------------------------------------

Jim, sorry for the sloppy patch; it wasn't my best one. I will correct the last javadoc issue, then try it using the 2h MR job on which it fails.

Clint, I guess further refactoring would make this relatively easy to do. We'll discuss it later.



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622657#action_12622657 ] 

Andrew Purtell commented on HBASE-832:
--------------------------------------

Of the set of bytes < ',', '$' and '*' are commonly used for special keys in various operations. '!' is a possibility also.

Why not use the space character instead of comma? 



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626784#action_12626784 ] 

Jim Kellerman commented on HBASE-832:
-------------------------------------

javadoc still incorrect for:

public Memcache(final long ttl, HRegionInfo regionInfo)

otherwise +1



[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622654#action_12622654 ] 

stack commented on HBASE-832:
-----------------------------

For 0.2.1, should we reject a row if its key starts with a byte that is <= ','?

My guess is that there is little extant data with row keys that start with a byte <= ',', else we would have heard about it (the table wouldn't work).

Minimally, let's add a release note that this is a known issue for 0.2.1.

Good find J-D.
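
The rejection proposed above could look roughly like the following sketch. This is only an illustration under stated assumptions: `checkRowKey` is a hypothetical helper, not an existing HBase method, and treating an empty row key as acceptable is an assumption made here for simplicity.

```java
public class RowKeyCheckSketch {
    static final int META_DELIMITER = ',';

    /**
     * Hypothetical guard: throws if the first byte of the row key,
     * read as unsigned, is less than or equal to the meta delimiter.
     */
    static void checkRowKey(byte[] row) {
        if (row.length > 0 && (row[0] & 0xff) <= META_DELIMITER) {
            throw new IllegalArgumentException(
                "row key starts with byte 0x" + Integer.toHexString(row[0] & 0xff)
                + ", which is <= ','");
        }
    }

    public static void main(String[] args) {
        checkRowKey("abc".getBytes());       // accepted: 'a' (0x61) > ','
        try {
            checkRowKey("$abc".getBytes());  // rejected: '$' (0x24) <= ','
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```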





[jira] Commented: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626800#action_12626800 ] 

Clint Morgan commented on HBASE-832:
------------------------------------

Hey J-D,

When working on this, did you get a sense of how easy it would be to add an arbitrary row key comparator (per table)? I'm probably gonna need this (HBASE-661) in the next few weeks.

cheers,



[jira] Updated: (HBASE-832) Problem with row keys beginnig with characters < than ',' and the region location cache

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-832:
-------------------------------------

    Attachment: hbase-832-v2.patch

Fixes the javadoc and the timestamp I forgot to check. That check would have been critical when comparing the rows of split regions in catalog tables. Passes unit tests. Please review.
