Posted to issues@hbase.apache.org by "Doug Meil (JIRA)" <ji...@apache.org> on 2012/11/26 22:56:57 UTC

[jira] [Created] (HBASE-7221) RowKey utility class for rowkey construction

Doug Meil created HBASE-7221:
--------------------------------

             Summary: RowKey utility class for rowkey construction
                 Key: HBASE-7221
                 URL: https://issues.apache.org/jira/browse/HBASE-7221
             Project: HBase
          Issue Type: Improvement
            Reporter: Doug Meil
            Assignee: Doug Meil
            Priority: Minor


A common question on the dist-lists is how to construct rowkeys, particularly composite keys.  Put/Get/Scan specify byte[] as the rowkey, but it's up to you to sensibly populate that byte-array, and that's where things tend to go off the rails.

This RowKey utility class isn't meant to add functionality to Put/Get/Scan, but rather to make it simpler for folks to construct said arrays.  Example:

{code}
   RowKey key = RowKey.create(RowKey.SIZEOF_MD5_HASH + RowKey.SIZEOF_LONG);
   key.addHash(a);
   key.add(b);
   byte bytes[] = key.getBytes();
{code} 
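
For readers without the patch at hand, here is a minimal, self-contained sketch of the same idea using only the JDK. It assumes that `a` is a string to be MD5-hashed and `b` is a long; `CompositeKeySketch`, `build`, and `md5` are hypothetical names for illustration and are not the API in the attached patch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for the proposed utility; not an HBase class.
final class CompositeKeySketch {
  static final int SIZEOF_MD5_HASH = 16;      // an MD5 digest is always 16 bytes
  static final int SIZEOF_LONG = Long.BYTES;  // 8 bytes

  // Builds a fixed-length 24-byte key: MD5(a) followed by b as a big-endian long.
  static byte[] build(String a, long b) {
    ByteBuffer buf = ByteBuffer.allocate(SIZEOF_MD5_HASH + SIZEOF_LONG);
    buf.put(md5(a));
    buf.putLong(b); // ByteBuffer writes big-endian by default
    return buf.array();
  }

  static byte[] md5(String s) {
    try {
      return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }
}
```

Calling `CompositeKeySketch.build("user42", 7L)` yields a byte array that could be handed to a Put, Get, or Scan as the rowkey.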

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505716#comment-13505716 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

re:  "classpath"

Ahhh... good point.  hbase-common, then?

re:  name.

RowKeyBuilder?  
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509729#comment-13509729 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

Additionally, as for other encodings, RowKeySchema can be extended with different options down the line.
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509529#comment-13509529 ] 

Lars Hofhansl commented on HBASE-7221:
--------------------------------------

Interface looks good (so does this idea, generally).

But we are now (kind of) prescribing a way in which HBase should do its encoding.
It seems this is best handled by an external library (orderly or lily do this).

Encoding numbers into correctly sorted byte[] is not entirely trivial, and neither is separating variable-length parts of the key. Different users will have different needs. Are strings UTF-8? What about special sorting for languages?
What about floats? They need to be encoded to sort correctly even considering their exponents (again, there are libraries out there doing this already).

At the same time everybody using HBase is building something like this, so maybe HBase should ship with a reasonable set of default encodings.
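
To illustrate why this is not trivial, the following sketch shows one common way to make signed longs and doubles sort correctly under unsigned byte-by-byte comparison, which is how HBase orders rowkeys. It is not taken from orderly, Lily, or the patch; `SortableEncodingSketch` and its methods are hypothetical names, and NaN handling is left out.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of order-preserving encodings; not an HBase class.
final class SortableEncodingSketch {
  // Flipping the sign bit makes negative longs sort before positive ones
  // when the big-endian bytes are compared as unsigned values.
  static byte[] encodeLong(long v) {
    return ByteBuffer.allocate(Long.BYTES).putLong(v ^ Long.MIN_VALUE).array();
  }

  static byte[] encodeDouble(double d) {
    long bits = Double.doubleToLongBits(d);
    // Negative values: invert every bit so larger magnitudes sort first.
    // Non-negative values: set the sign bit so they sort after all negatives.
    bits = (bits < 0) ? ~bits : (bits ^ Long.MIN_VALUE);
    return ByteBuffer.allocate(Long.BYTES).putLong(bits).array();
  }

  // Unsigned lexicographic comparison of two byte arrays.
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```

Writing the raw two's-complement or IEEE 754 bytes without these adjustments would place negative numbers after positive ones in a scan.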
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506812#comment-13506812 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

re:  "Builder"

Yeah, I really wasn't going for a builder pattern.  Elliott had a concern about the name "RowKey" (I must admit I'm still partial to it because there isn't a class with that name anywhere in the codebase).

I wasn't really aiming for a builder pattern in the first place because I didn't want to necessarily force people to destroy and re-create the RowKey/Builder for each rowkey they create - that's why the reset method is there.  The only thing that would have to get reset was the backing byte array.

re:  "fixed size"

I wanted any particular instance to have a fixed size so that the backing byte-array didn't have to resize like an ArrayList (and wind up burning a lot of byte-arrays in the process).  So it's "easier" to create rowkeys than without the utility, but it still requires some thought.
 
If your table had multiple length keys, there's nothing wrong with creating 2 different instances, one for each length.

That's where I was coming from.
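
A rough sketch of the reuse pattern described above, assuming a JDK `ByteBuffer` as the backing store. `ReusableKeySketch` and its methods are hypothetical and may differ from what the patch actually does.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; the class in the patch may differ.
final class ReusableKeySketch {
  private final ByteBuffer buf;

  ReusableKeySketch(int size) {
    this.buf = ByteBuffer.allocate(size); // allocated once, never resized
  }

  // Adding past the fixed size throws java.nio.BufferOverflowException.
  ReusableKeySketch add(long v) {
    buf.putLong(v);
    return this;
  }

  ReusableKeySketch add(int v) {
    buf.putInt(v);
    return this;
  }

  // Returns a copy so the caller can keep the key after reset().
  byte[] getBytes() {
    if (buf.hasRemaining()) {
      throw new IllegalStateException("key not fully populated");
    }
    return buf.array().clone();
  }

  // Rewinds the write position; the backing array is reused as-is.
  void reset() {
    buf.clear();
  }
}
```

One instance can then be reused in a loop: populate it, call `getBytes()`, call `reset()`, and repeat for the next rowkey. A table with two key lengths would use two instances.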

re:  "formatting"

I'll fix that.  Doh!  

Thanks!

                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504216#comment-13504216 ] 

Hadoop QA commented on HBASE-7221:
----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12554921/HBASE_7221.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified tests.

    {color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 98 warning messages.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 27 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3406//console

This message is automatically generated.
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Elliott Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505784#comment-13505784 ] 

Elliott Clark commented on HBASE-7221:
--------------------------------------

Common's much better than server, so if you're against examples then common seems the best.

RowKeyBuilder seems better.  Thanks
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506661#comment-13506661 ] 

stack commented on HBASE-7221:
------------------------------

Hey boss... fix the formatting.  It's all over the place -- see the patch.  You have tabs in there?   If you look at other code you'll see it uses spaces for tabs and two spaces at that.

Builder is a good name but you don't seem to follow the general builder pattern... i.e. get a 'builder', then do things against it and on the end call 'build' to return the result.  That could confuse (You have some of it w/ your static to create an instance...)

You keep adding to the backing array... just let the ArrayIndexOutOfBoundsException happen if they try to add off the end?

Why does the key have to be of fixed size?

Good stuff.

                

[jira] [Updated] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil updated HBASE-7221:
-----------------------------

    Attachment: HBASE_7221.patch
    

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Elliott Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505703#comment-13505703 ] 

Elliott Clark commented on HBASE-7221:
--------------------------------------

So coming soon (HBASE-7012) most users won't have anything in hbase-server on their client classpath.  With that in mind, it seems like hbase-examples will actually have more visibility for people just starting out with HBase.  (hadoop-examples seems like a great parallel here; more starting users read that code than hadoop-common)

Also, should this be renamed to something like CompositeRowKey so that the name isn't confused with a row key that is used inside hbase-server?
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Elliott Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504356#comment-13504356 ] 

Elliott Clark commented on HBASE-7221:
--------------------------------------

Seems like this would be better in the hbase-examples module.
                

[jira] [Updated] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil updated HBASE-7221:
-----------------------------

    Status: Patch Available  (was: Open)
    

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504863#comment-13504863 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

Elliott, the reason I'd lobby against "examples" is that this is such a common question and so easy to screw up unless you really know what you're doing.  The people who used HBase back when it was 0.20 were adventurers; as the release gets closer to 1.0, it just needs to make the right thing happen.

Additionally, there are other key-utilities in the util package already...

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Keying.html
                

[jira] [Updated] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil updated HBASE-7221:
-----------------------------

    Attachment: hbase-common_hbase_7221_v3.patch

Ok, I think this one's a winner.  :-)

There is a RowKeySchema and a RowKey.  This creates fixed-length keys without delimiters (generally considered to be a best practice), and enforces the defined lengths when the elements are set.  It's also bi-directional, so that you can pass in a byte-array (i.e., rowkey) from a table and then read the key elements back.

Creation example...
{code}
    int[] elements = {RowKeySchema.SIZEOF_MD5_HASH, RowKeySchema.SIZEOF_INT, RowKeySchema.SIZEOF_LONG};
    RowKeySchema schema = new RowKeySchema(elements);

    RowKey rowkey = schema.createRowKey();
    rowkey.setHash(0, hashVal);
    rowkey.setInt(1, intVal);
    rowkey.setLong(2, longVal);

    byte[] bytes = rowkey.getBytes();
{code}
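
As a rough, JDK-only approximation of how a schema of fixed element sizes can enforce lengths and read elements back from an existing rowkey, consider the sketch below. `SchemaSketch`, `Key`, `wrap`, and the accessor names are hypothetical; the classes in the v3 patch may be structured differently.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; this is not the code in the attached patch.
final class SchemaSketch {
  private final int[] sizes;
  private final int[] offsets;
  private final int total;

  SchemaSketch(int... sizes) {
    this.sizes = sizes.clone();
    this.offsets = new int[sizes.length];
    int off = 0;
    for (int i = 0; i < sizes.length; i++) {
      offsets[i] = off; // each element starts where the previous one ended
      off += sizes[i];
    }
    this.total = off;
  }

  Key createKey() {
    return new Key(new byte[total]);
  }

  // Wraps an existing rowkey so its elements can be read back.
  Key wrap(byte[] rowkey) {
    if (rowkey.length != total) {
      throw new IllegalArgumentException("expected " + total + " bytes");
    }
    return new Key(rowkey.clone());
  }

  final class Key {
    private final ByteBuffer buf;

    private Key(byte[] bytes) {
      this.buf = ByteBuffer.wrap(bytes);
    }

    // Rejects reads and writes whose width differs from the schema.
    private void check(int i, int size) {
      if (sizes[i] != size) {
        throw new IllegalArgumentException(
            "element " + i + " is " + sizes[i] + " bytes, not " + size);
      }
    }

    void setBytes(int i, byte[] v) {
      check(i, v.length);
      System.arraycopy(v, 0, buf.array(), offsets[i], v.length);
    }

    void setInt(int i, int v) {
      check(i, Integer.BYTES);
      buf.putInt(offsets[i], v);
    }

    void setLong(int i, long v) {
      check(i, Long.BYTES);
      buf.putLong(offsets[i], v);
    }

    int getInt(int i) {
      check(i, Integer.BYTES);
      return buf.getInt(offsets[i]);
    }

    long getLong(int i) {
      check(i, Long.BYTES);
      return buf.getLong(offsets[i]);
    }

    byte[] getBytes() {
      return buf.array().clone();
    }
  }
}
```

Because every element has a fixed offset, no delimiters are needed, and a rowkey read from a table can be passed to `wrap` to recover the individual fields.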
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Ian Varley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509780#comment-13509780 ] 

Ian Varley commented on HBASE-7221:
-----------------------------------

Cool! Looks good, we do something similar internally (as I'm sure do most shops). Definitely lots of room for making this experience more natural for beginners, good on you Doug.

My first thought, like Lars, was: why are we "blessing" int, long & MD5 hash? As opposed to setting bytes only, and having this class just help with the arranging part? Sure, you can always just use the "setBytes/getBytes" methods and ignore the other stuff, but I feel like adding specific types to the list is a slippery slope (but I have no data to back that feeling up. :)

Questions it raises for me:
 - As Lars mentioned, do you want to use standard ints, or binary comparable like Lily does (where the sign's at the end)?
 - What about Date objects? They'll be really common in row keys, of course. But, they're also easy to change into a long.
 - What about ways to indicate that something should be reverse ordered (descending), via bit inversion?
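
The last two questions can be sketched together: a Date reduced to its epoch milliseconds and encoded so that newer values sort first. This is only one possible approach, assuming unsigned byte-wise comparison of rowkeys; `DescendingSketch` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.Date;

// Hypothetical illustration of a descending encoding; not an HBase class.
final class DescendingSketch {
  // The sign-bit flip gives ascending order under unsigned comparison;
  // inverting every bit afterwards reverses that order.
  static byte[] encodeDescending(long v) {
    return ByteBuffer.allocate(Long.BYTES).putLong(~(v ^ Long.MIN_VALUE)).array();
  }

  static byte[] encodeDescending(Date d) {
    return encodeDescending(d.getTime()); // milliseconds since the epoch
  }

  // Unsigned lexicographic comparison of two byte arrays.
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```

With this encoding, a scan over keys that end in a timestamp would return the most recent entries first.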

If you stripped this down to just getByte(s)/setByte(s), would it still be useful? Seems like that's got the lion's share of the pattern there. Maybe then a subclass that adds common encodings (doing it as a subclass maybe makes it more obvious that this is just one set of encodings, and anybody else can do likewise).

Anyway, what you have seems straightforward enough that adding it might point some people in the right direction, without getting too fancy.

Also, seems like this should go in the forthcoming hbase-client module, y? Elliott, what's the timeline for that?
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510472#comment-13510472 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

So what now?  Seems like there is general agreement that something like this would be a good idea.  And we all agree that there are plenty of edge cases that this doesn't cover.

One thing to mention re: alternate encodings...   I think this pattern is extensible and RowKeySchema can be augmented for different encoding strategies, while still keeping the ease of use that exists in the RowKey class.

As for class-names, Elliott isn't crazy about the name RowKey.  FixedLengthRowKey?

As for variable length keys (e.g., for people that still insist on using Strings in keys), that's not a pattern that this class supports.  I think you'd have to use delimiters between fields in that case, but that seems like it could be supported in a subsequent patch (e.g., VariableLengthRowKey) in a different ticket. 
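
For illustration only, here is a rough sketch of what the delimiter approach could look like. It assumes a 0x00 delimiter and UTF-8 fields that never contain U+0000; DelimitedKeyDemo, join and split are hypothetical names, not anything in the patch:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a delimiter-based variable-length key.
final class DelimitedKeyDemo {
    // Assumption: no field contains U+0000, so 0x00 never appears in its UTF-8 bytes.
    static final byte DELIMITER = 0x00;

    /** Concatenates the UTF-8 bytes of each field, separated by the delimiter. */
    public static byte[] join(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                out.write(DELIMITER);
            }
            byte[] b = fields[i].getBytes(StandardCharsets.UTF_8);
            out.write(b, 0, b.length);
        }
        return out.toByteArray();
    }

    /** Splits a key produced by join() back into its fields. */
    public static List<String> split(byte[] key) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= key.length; i++) {
            if (i == key.length || key[i] == DELIMITER) {
                fields.add(new String(key, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split(join("user", "2012-11-26"))); // [user, 2012-11-26]
    }
}
```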

Thanks everybody for the review effort!
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505167#comment-13505167 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

I do think, though, that some code examples using this would be a good idea in "examples", but this class belongs in the util package like the other class.  I've heard that request from some folks about having more "cookbook" examples.
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505877#comment-13505877 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

Gotcha, I'll re-submit with those changes.

As an aside, that "Keying" class I cited above should move to common too.  That's clearly a client utility and shouldn't be in server.
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506848#comment-13506848 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

One more thought on size:  then again, I could do what ArrayList does with its overloaded constructor - use that size initially, and then auto-size if needed.  But you could still define the exact size if you wanted for performance purposes.  That's probably the nicest possible approach.  
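
Roughly, the ArrayList-style sizing could look like the sketch below. GrowableKeyBuffer is a hypothetical name for illustration and the doubling growth policy is an assumption, not what the patch does:

```java
import java.util.Arrays;

// Hypothetical sketch: start at the requested size, grow only when a field does not fit.
final class GrowableKeyBuffer {
    private byte[] buf;
    private int length;

    public GrowableKeyBuffer(int initialCapacity) {
        buf = new byte[initialCapacity];
    }

    /** Appends a field, growing the backing array (at least doubling) if needed. */
    public GrowableKeyBuffer append(byte[] field) {
        int needed = length + field.length;
        if (needed > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(needed, buf.length * 2));
        }
        System.arraycopy(field, 0, buf, length, field.length);
        length = needed;
        return this;
    }

    /** Returns a copy trimmed to the bytes actually written. */
    public byte[] toBytes() {
        return Arrays.copyOf(buf, length);
    }

    public static void main(String[] args) {
        byte[] key = new GrowableKeyBuffer(2)
            .append(new byte[] {1, 2})
            .append(new byte[] {3, 4, 5})
            .toBytes();
        System.out.println(Arrays.toString(key)); // [1, 2, 3, 4, 5]
    }
}
```

If the caller passes the exact size up front, no reallocation happens and the only extra cost is the final trimmed copy.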


                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509431#comment-13509431 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

Check out the v3 version of the patch with the RowKeySchema and RowKey.  Thanks!  (it was uploaded earlier today, but I thought I'd state that explicitly)
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509823#comment-13509823 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

Thanks Ian!

re:  "why are we "blessing" int, long & MD5 hash?"

For this I have to refer to my previous comment about HBase making anything possible but not much "easy" in terms of rowkey construction.  I think those datatypes cover the commonly used key-element datatypes, and the class makes them easy (i.e., it will do the encoding/decoding for you).

But you can always use setBytes if you want to do something custom (and getBytes for that position too).

re:  "Anyway, what you have seems straightforward enough that adding it might point some people in the right direction, without getting too fancy."

Yep, that was the intent.  You can always drop down to doing it 100% yourself, but this handles a lot of cases.
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504845#comment-13504845 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

Ramkrishna, regarding addHash, there is addHash(int), addHash(long), addHash(byte[]), and addHash(String).  So the API already supports adding the "common" datatypes, plus a catch-all for a byte[].  What did you feel needed to be added?

Personally, for the common "key" datatypes, I'd rather not force people to call Bytes.toBytes on them as I think that's kind of clunky.
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504352#comment-13504352 ] 

ramkrishna.s.vasudevan commented on HBASE-7221:
-----------------------------------------------

{code}
addHash(s.getBytes());
{code}

Can we change this to 'addHash(Bytes.toBytes(s));' as well?
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509495#comment-13509495 ] 

Lars Hofhansl commented on HBASE-7221:
--------------------------------------

I think this will just paper over the interesting issues.
A composite rowkey needs to be designed carefully for correct sorting.
-1 should sort before +1, etc. Just encoding/decoding with Bytes won't work.
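
To make the sorting problem concrete, here is a small self-contained sketch. It uses ByteBuffer to produce the same big-endian two's-complement layout as Bytes.toBytes(long), and shows one common fix (flipping the sign bit). SignedLongKey and its methods are illustrative names, not an existing HBase API:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of an order-preserving encoding for signed longs.
final class SignedLongKey {
    /** Big-endian two's-complement bytes, the same layout as Bytes.toBytes(long). */
    public static byte[] plain(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    /** Flips the sign bit so negatives sort before positives as unsigned bytes. */
    public static byte[] orderPreserving(long v) {
        return plain(v ^ Long.MIN_VALUE);
    }

    /** Reverses orderPreserving(). */
    public static long decode(byte[] b) {
        return ByteBuffer.wrap(b).getLong() ^ Long.MIN_VALUE;
    }

    /** Unsigned lexicographic comparison, the order HBase applies to row keys. */
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 is 0xFF..FF, so it sorts AFTER +1.
        System.out.println(compareUnsigned(plain(-1L), plain(1L)) > 0);                     // true
        // Sign bit flipped: -1 now sorts before +1.
        System.out.println(compareUnsigned(orderPreserving(-1L), orderPreserving(1L)) < 0); // true
    }
}
```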
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508408#comment-13508408 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

I am overhauling this with some new ideas.  Stay tuned.
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Elliott Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509521#comment-13509521 ] 

Elliott Clark commented on HBASE-7221:
--------------------------------------

I still am against naming anything RowKey.  It just invites confusion. I don't think that there's any plan to support passing these objects as a row key.  So in my opinion they shouldn't be named that way.  When a user searches on google/bing/duck duck go for hbase row key they should get the documentation about how they should structure a row key not some class that's named row key, but isn't actually the type to be passed in as a row key.

Why not a more general builder style ?  

{code:java}
// Could just use magic numbers to represent replace.  Not sure how I feel about that.
enum HashPosition {
  PREPEND,
  APPEND,
  REPLACE
}
enum Order {
  ASCENDING, // Smaller numbers first
  DESCENDING // Larger numbers first
}
// API sketch only: method bodies are omitted.
class CompositeRowKeyBuilder {
  public CompositeRowKeyBuilder();
  public CompositeRowKeyBuilder(int numFields); // Throw an exception if the number of fields is off when build() is called.
  public CompositeRowKeyBuilder(int numFields, int expectedBytes); // Same as above, but allows a size hint for the ByteBuffer.
  private ByteBuffer bb; // Allocated by the constructors, e.g. ByteBuffer.allocate(expectedBytes).
  public CompositeRowKeyBuilder addHash(Hash hash, HashPosition position);
  public CompositeRowKeyBuilder add(String s);
  public CompositeRowKeyBuilder add(String s, int length); // We should be trying to encourage fixed keys if possible.
  public CompositeRowKeyBuilder add(Integer i);
  public CompositeRowKeyBuilder add(Integer i, Order o);
  public CompositeRowKeyBuilder add(Long l);
  public CompositeRowKeyBuilder add(Long l, Order o);
  public CompositeRowKeyBuilder add(Double d); // Use something like Orderly's formatting, allowing the sorting of double and float.
  public CompositeRowKeyBuilder add(Double d, Order o);
  public CompositeRowKeyBuilder add(Float f);
  public CompositeRowKeyBuilder add(Float f, Order o);
  public CompositeRowKeyBuilder add(byte[] bytes);
  public byte[] build(); // Yes, I know this breaks the builder pattern a little bit.  But I think it's worth it.
}
class ExampleUsage {
  public static void main(String[] args) {
    CompositeRowKeyBuilder builder = new CompositeRowKeyBuilder();
    // rk should = MURMUR_HASH(100.toBytes + "TestString".getBytes) + 100.toBytes + "TestString".getBytes
    byte[] rk = builder.addHash(Hash.MURMUR, HashPosition.PREPEND).add(100).add("TestString").build();

    // rkTwo = "MyOtherTestString".reverse().getBytes
    byte[] rkTwo = builder.addHash(Hash.REVERSE, HashPosition.REPLACE).add("MyOtherTestString").build();
  }
}
{code}

Thoughts ?
                

[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509726#comment-13509726 ] 

Doug Meil commented on HBASE-7221:
----------------------------------

re:  "Interface looks good (so does this idea, generally)."

Thanks!

re:  "At the same time everybody using HBase is building something like this, so maybe HBase should ship with a reasonable set of default encodings."

This is precisely why I'm pushing this.  Key construction is a common question on the dist-list, and HBase makes anything possible - but not much easy - in this critical area.

re:  "RowKey"

Change this class to FixedLengthRowKey?

There are a lot of situations that are handled by the v3 proposal, and I also need to stress that this can not only create byte-arrays for keys, but also read them back (that was an addition in the v3 patch).  You can pass in a rowkey when processing a table and then pick out the parts if needed (not something that you can do with a builder).  
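
As a rough illustration of the read-back idea (not the v3 patch itself), the sketch below builds the MD5-hash-plus-long key from the issue description with plain JDK classes and then picks the parts out by fixed offset. FixedLengthKeyDemo, build, hashPart and longPart are hypothetical names:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch of a fixed-length key: 16-byte MD5 hash followed by an 8-byte long.
final class FixedLengthKeyDemo {
    static final int SIZEOF_MD5_HASH = 16;
    static final int SIZEOF_LONG = Long.BYTES;

    public static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is a required algorithm on every Java platform
        }
    }

    /** Writes hash(a) followed by b into a single fixed-length array. */
    public static byte[] build(String a, long b) {
        return ByteBuffer.allocate(SIZEOF_MD5_HASH + SIZEOF_LONG).put(md5(a)).putLong(b).array();
    }

    /** Reads the hash back from its fixed position at the start of the key. */
    public static byte[] hashPart(byte[] key) {
        return Arrays.copyOfRange(key, 0, SIZEOF_MD5_HASH);
    }

    /** Reads the long back from its fixed offset after the hash. */
    public static long longPart(byte[] key) {
        return ByteBuffer.wrap(key).getLong(SIZEOF_MD5_HASH);
    }

    public static void main(String[] args) {
        byte[] key = build("user", 42L);
        System.out.println(key.length);    // 24
        System.out.println(longPart(key)); // 42
    }
}
```

Because every field sits at a known offset, reading a part back needs no parsing, which is the main thing a write-only builder does not give you.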

re:  other encodings

As for exotic encodings, RowKey (referring to the v3 name) does allow for a setBytes(position, byte[]) for you to do the encoding yourself if you want.  I was picking the most common datatypes used in key-construction, but this does not represent an exhaustive list.

There are plenty of advanced cases that this doesn't handle, such as variable length keys.  I think a builder is probably better for that pattern.  


                

[jira] [Updated] (HBASE-7221) RowKey utility class for rowkey construction

Posted by "Doug Meil (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil updated HBASE-7221:
-----------------------------

    Attachment: hbase-common_hbase_7221_2.patch

Moved from server to common and renamed the class to RowKeyBuilder.  (7221_2.patch)
                