You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Lars Hofhansl (Created) (JIRA)" <ji...@apache.org> on 2012/01/31 03:12:10 UTC

[jira] [Created] (HBASE-5304) Pluggable split key policy

Pluggable split key policy
--------------------------

                 Key: HBASE-5304
                 URL: https://issues.apache.org/jira/browse/HBASE-5304
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: Lars Hofhansl
            Assignee: Lars Hofhansl
             Fix For: 0.94.0


We need a way to specify custom policies to determine split keys.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199100#comment-13199100 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

{code}
+    if (this.region.getExplicitSplitPoint() != null) {
+      return this.region.getExplicitSplitPoint();
{code}
Can we store the return value from region.getExplicitSplitPoint() in the above case ?
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Status: Patch Available  (was: Open)
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199197#comment-13199197 ] 

Hadoop QA commented on HBASE-5304:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12513024/5304-v7.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -136 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 156 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.io.hfile.TestHFileBlock
                  org.apache.hadoop.hbase.client.TestAdmin

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/896//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/896//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/896//console

This message is automatically generated.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197473#comment-13197473 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

It would be nice if the new patch is published on review board.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197537#comment-13197537 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

Since there is a chance that SplitKeyPolicy may decide not to split, I think the following code should be prepared:
{code}
+      SplitTransaction st = new SplitTransaction(parent, parent
+          .getSplitKeyPolicy().getSplitKey(parent, midKey));
{code}
Meaning, we should check whether a SplitTransaction is needed.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198409#comment-13198409 ] 

Todd Lipcon commented on HBASE-5304:
------------------------------------

Oh, I see... indeed you're right. Nope, no objection (though I did not review the code, only the discussion here)
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197892#comment-13197892 ] 

jiraposter@reviews.apache.org commented on HBASE-5304:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3717/#review4744
-----------------------------------------------------------


Looks pretty good.  I'm not strong of the subtleties of the ruby parts -- I'll leave that to someone else to review.

Could we add a test case for the split veto case? (that is new functionality and a new unexercised code path).  





http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
<https://reviews.apache.org/r/3717/#comment10477>

    nit



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<https://reviews.apache.org/r/3717/#comment10473>

    maybe debug level log saying splitKeyPolicy loaded?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitKeyPolicy.java
<https://reviews.apache.org/r/3717/#comment10474>

    sp: sat->say



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ConfigurationUtil.java
<https://reviews.apache.org/r/3717/#comment10476>

    nit



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ConfigurationUtil.java
<https://reviews.apache.org/r/3717/#comment10478>

    I'm assuming this handles binary encoding (CDATA xml stuff?) and escaping?  What happens if we insert a goofy string with '<', '>' or other xml control chars?


- jmhsieh


On 2012-02-01 04:52:33, Lars Hofhansl wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3717/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-02-01 04:52:33)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This patch allows for simple pluggable SplitKeyPolicies. A SplitKeyPolicy determines the actual to key to be split on after a SplitRequest was issued.
bq.  
bq.  A SplitKeyPolicy can optionally be configured via a Configuration object that is serialized (as XML) into anb HTableDescriptor (in analogy to the Constraints feature - some common between the two features is factored out).
bq.  
bq.  DefaultSplitKeyPolicy implements the current logic (which splits along a store's midKey as determined by HFileReaderV*).
bq.  
bq.  No alternate SplitKeyPolicy is provided as part of this patch.
bq.  
bq.  Some changes to the HBase Shell are included to allow setting arbitrary HTableDescriptor values.
bq.  
bq.  Jesse Yates and I worked together on the initial version of this patch.
bq.  
bq.  
bq.  This addresses bug HBASE-5304.
bq.      https://issues.apache.org/jira/browse/HBASE-5304
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultSplitKeyPolicy.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitKeyPolicy.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ConfigurationUtil.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1238830 
bq.  
bq.  Diff: https://reviews.apache.org/r/3717/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Existing tests. No functional changes are introduced with this.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Lars
bq.  
bq.


                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198348#comment-13198348 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

I don't have much of a preference one way or the other.

Just that we have need for controlling "what key should we split at" to be able to co-locate certain parts of the data that belong to the same tenant...

Note that RegionSplitPolicy already determines the split point, that part was not just not extensible and did not work correctly for forced-splits.
This is a small change to make the existing functionality more useful.

Not opposed separating the two. In a different jira?

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196644#comment-13196644 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

Jesse Yates and I are working pluggable split key policy. Will attach a patch momentarily.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment:     (was: 5304.txt)
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Status: Patch Available  (was: Open)
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196712#comment-13196712 ] 

Hadoop QA commented on HBASE-5304:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12512513/5304.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -136 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 161 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.io.TestHeapSize

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/879//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/879//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/879//console

This message is automatically generated.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to trunk.
Thanks for the reviews Jesse and Ted.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Jesse Yates (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199094#comment-13199094 ] 

Jesse Yates commented on HBASE-5304:
------------------------------------

A lot of it can be if jenkins is sketchy. Can we rerun it on jenkins to see if we get the failures again?
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197642#comment-13197642 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

@Matt: Do you think what we propose is flexible enough. Another idea we have floated here is to make the run() method of SplitRequest pluggable, or have even deeper "pluggability". The current proposal is very simple and provided for the use-cases that we could think of.

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197532#comment-13197532 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

@Ted
* Region too big... That is why we make this pluggable. If you cannot manage your region size then do not write a pluggable split key policy. This is like a coprocessor calling System.exit(). You need to know what you are doing.
The whole point of this is to not put any specific split key policy into the trunk, so that trunk does not have to be changed and folks can hood up their own policy.
Surely you will not argue that nobody will ever want to change the split key policy. We do the same for the existing split policy (what if that leads to too big regions?)
* Will publish the next version of RB. Good point. I'll actually remove the PrefixSplitKeyPolicy, which should not be part of the patch. Instead we'll add documentation of SplitKeyPolicy on how to add a new policy.
* DefaultSplitKeyPolicy is not configurable. So passing in the config might be confusing.

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198427#comment-13198427 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

@Jesse: Thanks for the comments.
* Good point on making PrefixSplitKeyPolicy it's own class. I'll do that. Need to make it a bit nicer then. Or I can add a section to the HBase book (reference guide) with PrefixSplitKeyPolicy as an example.
* The splitKey is always validated against the region before the actual split. If it is not valid you get an error message (see HRegion.checkSplit and SplitTransaction.prepare) and the split is not performed. Falling back to the default instead might invalidate other assumptions that the application code had based on the provided SplitPolicy. Generally I see this as a server extension. If you wanted to call System.exit() from the code you can, and it'll kill your server on every split :)
* RegionSplitPolicy is already optionally loaded per table as well as per configuration (see RegionSplitPolicy.getSplitPolicyClass)

I think shell support would be for generally for setting HTableDescriptor values. I have some ideas, but some refactoring would be needed to get it right, I'd rather do that in another jira.

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment: 5304-v5.txt

Addressing Jesse's comments.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Status: Patch Available  (was: Open)

Getting another test run.
I also ran TestInstantSchemaChangeSplit in a loop for a bit.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Status: Open  (was: Patch Available)
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199999#comment-13199999 ] 

Hudson commented on HBASE-5304:
-------------------------------

Integrated in HBase-TRUNK #2653 (See [https://builds.apache.org/job/HBase-TRUNK/2653/])
    HBASE-5304 Pluggable split key policy

larsh : 
Files : 
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/PrefixSplitKeyPolicy.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionSplitPolicy.java

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196644#comment-13196644 ] 

Lars Hofhansl edited comment on HBASE-5304 at 1/31/12 2:14 AM:
---------------------------------------------------------------

Jesse Yates and I are working pluggable split key policies. Will attach a patch momentarily.
                
      was (Author: lhofhansl):
    Jesse Yates and I are working pluggable split key policy. Will attach a patch momentarily.
                  
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Status: Patch Available  (was: Open)

Might as well make sure that DefaultSplitKeyPolicy did not break anything.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198359#comment-13198359 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

And of course deciding what key we split on, would also allow us to do some changes atomically across multiple rows (also something we are going to need soon)
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199070#comment-13199070 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

I noticed... Will check it out of course.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198463#comment-13198463 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

The test classes are not part of the public API, so I cannot link to it. And I do not want to place PrefixSplitKeyPolicy outside the test directory becuase then it becomes an official part of HBase.

Agree on the camel stuff.

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199109#comment-13199109 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

Table creation may take some time.
The test failure was caused by findRSWithOnlineRegionFor() not retrying.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197954#comment-13197954 ] 

jiraposter@reviews.apache.org commented on HBASE-5304:
------------------------------------------------------



bq.  On 2012-02-01 15:06:48, jmhsieh wrote:
bq.  > Looks pretty good.  I'm not strong of the subtleties of the ruby parts -- I'll leave that to someone else to review.
bq.  > 
bq.  > Could we add a test case for the split veto case? (that is new functionality and a new unexercised code path).  
bq.  > 
bq.  > 
bq.  >

The ruby part does two things:
1. Allow arbitrary HTableDescriptor values to be set (strange that nobody needed that, yet).
2. If the value to be set is actually a configuration it is automatically serialized (as XML).

Neither of them are stricly needed for this, but make it possible to set/configure a SplitKeyPolicy via the shell.


bq.  On 2012-02-01 15:06:48, jmhsieh wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 503
bq.  > <https://reviews.apache.org/r/3717/diff/2/?file=71604#file71604line503>
bq.  >
bq.  >     maybe debug level log saying splitKeyPolicy loaded?

Ah yes, forgot that. Will add.


bq.  On 2012-02-01 15:06:48, jmhsieh wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ConfigurationUtil.java, line 76
bq.  > <https://reviews.apache.org/r/3717/diff/2/?file=71607#file71607line76>
bq.  >
bq.  >     I'm assuming this handles binary encoding (CDATA xml stuff?) and escaping?  What happens if we insert a goofy string with '<', '>' or other xml control chars?

Yep.

I think I am going to backpaddle on prescribing that SplitKeyPolicies need to be configured via a Configuration. It should be up to the implementor of a SplitKeyPolicy how it is configured.

This utility class is still useful, so I'll leave them in, and show in an example how one could them.


- Lars


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3717/#review4744
-----------------------------------------------------------


On 2012-02-01 04:52:33, Lars Hofhansl wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3717/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-02-01 04:52:33)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This patch allows for simple pluggable SplitKeyPolicies. A SplitKeyPolicy determines the actual to key to be split on after a SplitRequest was issued.
bq.  
bq.  A SplitKeyPolicy can optionally be configured via a Configuration object that is serialized (as XML) into anb HTableDescriptor (in analogy to the Constraints feature - some common between the two features is factored out).
bq.  
bq.  DefaultSplitKeyPolicy implements the current logic (which splits along a store's midKey as determined by HFileReaderV*).
bq.  
bq.  No alternate SplitKeyPolicy is provided as part of this patch.
bq.  
bq.  Some changes to the HBase Shell are included to allow setting arbitrary HTableDescriptor values.
bq.  
bq.  Jesse Yates and I worked together on the initial version of this patch.
bq.  
bq.  
bq.  This addresses bug HBASE-5304.
bq.      https://issues.apache.org/jira/browse/HBASE-5304
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultSplitKeyPolicy.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitKeyPolicy.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java 1238830 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ConfigurationUtil.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1238830 
bq.  
bq.  Diff: https://reviews.apache.org/r/3717/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Existing tests. No functional changes are introduced with this.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Lars
bq.  
bq.


                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197567#comment-13197567 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

SplitTransaction just returns when null or a key that does not fall inside the parent region is passed. I guess we could make this more explicit: If SplitKeyPolicy.getSplitKey(...) returns null or an empty byte[], it means we will not split.
I'll have a new patch soon.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199537#comment-13199537 ] 

Hudson commented on HBASE-5304:
-------------------------------

Integrated in HBase-TRUNK-security #96 (See [https://builds.apache.org/job/HBase-TRUNK-security/96/])
    HBASE-5304 Pluggable split key policy

larsh : 
Files : 
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/PrefixSplitKeyPolicy.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionSplitPolicy.java

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Jesse Yates (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199018#comment-13199018 ] 

Jesse Yates commented on HBASE-5304:
------------------------------------

+1 ship it.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Status: Open  (was: Patch Available)
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment:     (was: 5304-v4.txt)
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198378#comment-13198378 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

All the failed tests in the run pass locally, it's a small patch, scratches an itch that we need to scratch, so from my side this is good to go.

@Todd: Are you opposed to the change?
@Ted: Any comments?
@Jesse: Any comments?

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199046#comment-13199046 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

Wait, TestInstantSchemaChangeSplit was a new test failure.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment: 5304.txt

Initial sketch (no testing done, yet) that Jesse and I came up with.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197522#comment-13197522 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

How do we deal with the case where a single region gets too big (according to the first N bytes) ?

I think we should poll dev@hbase to see how useful this feature is.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment: 5304-v4.txt
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198504#comment-13198504 ] 

Hadoop QA commented on HBASE-5304:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12512899/5304-v6.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -136 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 156 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.client.TestInstantSchemaChangeSplit
                  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/892//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/892//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/892//console

This message is automatically generated.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199280#comment-13199280 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

Ran test admin locally... passes fine. Do I get a +1 Ted? :)
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment: 5304-v7.txt
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198012#comment-13198012 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

In the end it might be cleaner to unify this with RegionSplitPolicy, rather than having two mechanisms to define part of the split policy.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197465#comment-13197465 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

SplitKeyPolicy.java and SplitConfigurationUtil.java are missing license.

Along with SPLIT_KEY_CONFIG_KEY we're storing another config in HTableDescription. Is there some measure that we can take to make sure the config isn't bloated ?
{code}
+      } else {
+        splitKeyPolicy = new DefaultSplitKeyPolicy();
+      }
{code}
I think DefaultSplitKeyPolicy should be given the config for SPLIT_KEY_CONFIG_KEY if the config is present in HTableDescriptor.

                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment: 5304-v4.txt

Here's an entirely different patch that used RegionSplitPolicy instead.
(Seems RegionSplitPolicy needs to be documented more prominently).

Consider the original patch abandoned.

Only two changes:
1. Allow classes outside of org.apache.hadoop.hbase.regionserver to extend RegionSplitPolicy or ConstantSizeRegionSplitPolicy
2. Deal correctly with forced splits.

Please have a look. Thanks.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Jesse Yates (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198433#comment-13198433 ] 

Jesse Yates commented on HBASE-5304:
------------------------------------

bq. I can add a section to the HBase book (reference guide) with PrefixSplitKeyPolicy as an example

Can you do both? This would be a good start to that Tips and Tricks section that has been mentioned recently.

bq. The splitKey is always validated against the region before the actual split. If it is not valid you get an error message (see HRegion.checkSplit and SplitTransaction.prepare) and the split is not performed. Falling back to the default instead might invalidate other assumptions that the application code had based on the provided SplitPolicy. 

Good call. +1 on that and everything else.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199088#comment-13199088 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

TestInstantSchemaChangeSplit passes locally. Not sure what is going on with precommit and why we see all these spurious failures.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Matt Corgan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197632#comment-13197632 ] 

Matt Corgan commented on HBASE-5304:
------------------------------------

This would be useful for time series data.  Let's say you split a region every 2 weeks.  The current split policy would be back in time 1 week ago.  It would be better if it split at the current time and just let the old region go to sleep (maybe it gets a few more inserts from lagging clients).  Or you could get more intricate and split on nice rounded hour or day intervals.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Attachment: 5304-v6.txt

Removed the verbatim copy and just linked to RegionSplitPolicy and CostantSizeRegionSplitPolicy... That's better anyway.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196745#comment-13196745 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

TestHeapSize is a legit failure, need to fix FIXED_OVERHEAD in HRegion by a REFERENCE.
We'll add specific tests for this tomorrow.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199148#comment-13199148 ] 

Lars Hofhansl commented on HBASE-5304:
--------------------------------------

@Ted: re: storing the value... Sure. I'll do that at commit, unless you have other objections.
I'll reattach the same file to another jenkins build.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197578#comment-13197578 ] 

jiraposter@reviews.apache.org commented on HBASE-5304:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3717/
-----------------------------------------------------------

Review request for hbase.


Summary
-------

This patch allows for simple pluggable SplitKeyPolicies. A SplitKeyPolicy determines the actual to key to be split on after a SplitRequest was issued.

A SplitKeyPolicy can optionally be configured via a Configuration object that is serialized (as XML) into anb HTableDescriptor (in analogy to the Constraints feature - some common between the two features is factored out).

DefaultSplitKeyPolicy implements the current logic (which splits along a store's midKey as determined by HFileReaderV*).

No alternate SplitKeyPolicy is provided as part of this patch.

Some changes to the HBase Shell are included to allow setting arbitrary HTableDescriptor values.

Jesse Yates and I worked together on the initial version of this patch.


This addresses bug HBASE-5304.
    https://issues.apache.org/jira/browse/HBASE-5304


Diffs
-----

  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1238830 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java 1238830 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultSplitKeyPolicy.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1238830 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitKeyPolicy.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java 1238830 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ConfigurationUtil.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1238830 

Diff: https://reviews.apache.org/r/3717/diff


Testing
-------

Existing tests. No functional changes are introduced with this.


Thanks,

Lars


                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Jesse Yates (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198419#comment-13198419 ] 

Jesse Yates commented on HBASE-5304:
------------------------------------

@Lars:

I was thinking you could make PrefixSplitKeyPolicy its own full class in the test package. This would help make it more obvious to people that it is around. Also, some docs on it's usage would be awesome.

I'm also worried that the test for the split policy needs to be run on a minicluster to ensure that we "do the right thing" for messed up splits. This goes into Ted's point of about what if we put in a bad policy. My first inclination on that was that we fall back on the default split policy and push and angry message (possibly tying that into the log), rather than just bailing (taking a preference here for uptime)

Excuse the ignorance here, nbut since the policy is loaded just based on the configuration of the RS, is there a way to make it just table based? Or will it be global across the cluster? If the former, then I think a little more work needs to be put into make that work. Also, shell support?
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199284#comment-13199284 ] 

Zhihong Yu commented on HBASE-5304:
-----------------------------------

Good to go.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt, 5304-v6.txt, 5304-v7.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198288#comment-13198288 ] 

Todd Lipcon commented on HBASE-5304:
------------------------------------

I'm a little skeptical of combining this with RegionSplitPolicy -- but perhaps that class needs a better name. In my mind, the decision of "when should we split?" is orthogonal to "what key should we split at". Of course there is some interaction between the two - eg if the split policy says we should split, but the region has only one row, we can't actually do so... but for the common case they seem to be separate decisions.

For example, there's a proposal (which I never got around to implementing) of a split policy which would make small regions "early on" as a table's created, and then once a table has many regions, start to up the region size automatically. That should work whether you do the current mid-point splits or you do a custom split policy that avoids "entity groups".
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198191#comment-13198191 ] 

Hadoop QA commented on HBASE-5304:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12512795/5304-v4.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -136 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 156 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.replication.TestReplicationPeer
                  org.apache.hadoop.hbase.coprocessor.TestMasterObserver
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/889//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/889//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/889//console

This message is automatically generated.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5304) Pluggable split key policy

Posted by "Jesse Yates (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198457#comment-13198457 ] 

Jesse Yates commented on HBASE-5304:
------------------------------------

I don't think the program listing for the prefixSplitKeyPolicy should be copied into the docs, but instead just use a link to the javadocs for the the class and then just include the bit about setting the values on the HTD.

Also, I not loving the camel casing or the underscore in "PREFIX_LENGTH_KEY = "PrefixSplitKeyPolicy.prefix_length", but I can live with it.

otherwise, +1.
                
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt, 5304-v5.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5304) Pluggable split key policy

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5304:
---------------------------------

    Status: Open  (was: Patch Available)
    
> Pluggable split key policy
> --------------------------
>
>                 Key: HBASE-5304
>                 URL: https://issues.apache.org/jira/browse/HBASE-5304
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5304-v4.txt
>
>
> We need a way to specify custom policies to determine split keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira