Posted to issues@hbase.apache.org by "David S. Wang (Created) (JIRA)" <ji...@apache.org> on 2012/02/29 06:46:02 UTC

[jira] [Created] (HBASE-5489) Add HTable accessor to get regions for a key range

Add HTable accessor to get regions for a key range
--------------------------------------------------

                 Key: HBASE-5489
                 URL: https://issues.apache.org/jira/browse/HBASE-5489
             Project: HBase
          Issue Type: Improvement
          Components: client
            Reporter: David S. Wang
            Assignee: David S. Wang
            Priority: Minor


It would be nice to have an accessor to find all regions that overlap with a particular range of keys. Right now, the only way to accomplish that is to call HTable.getStartEndKeys(), then follow that with calls to getRegionLocation() for the range of keys you are interested in.  This algorithm has 2 drawbacks:

* It returns more keys than is necessary most of the time.  This is especially evident if there are a lot of regions comprising the table and the range of keys is small.
* It always does a scan of .META. via MetaScannerVisitor for at least HTable.getStartEndKeys(), and perhaps for HRegionLocations that are not already cached by the client.

An accessor that limited its scans to a specified range could avoid scanning .META. at all if the HRegionLocations being fetched were already cached by the client, thereby potentially making this operation faster in common cases.
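
For illustration, the current workaround might look roughly like the sketch below. It does not call HBase: the two arrays stand in for the start/end key pairs that HTable.getStartEndKeys() returns, and the names RegionOverlapSketch, overlappingRegions, compare and row are hypothetical. It assumes that an empty key means "start of table" or "end of table", and it treats endKey as inclusive, as in the proposal.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class RegionOverlapSketch {

  /** Unsigned, byte-by-byte lexicographic comparison of two row keys. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  /**
   * Returns the indices of the regions whose [start, end) span overlaps the
   * inclusive range [startKey, endKey]. Assumption: an empty endKey means
   * "up to the end of the table".
   */
  static List<Integer> overlappingRegions(byte[][] regionStarts, byte[][] regionEnds,
      byte[] startKey, byte[] endKey) {
    List<Integer> hits = new ArrayList<>();
    // Every region's key pair is examined, however small the range is.
    for (int i = 0; i < regionStarts.length; i++) {
      // The region ends after the range begins (empty end key = unbounded).
      boolean endsAfterStart = regionEnds[i].length == 0
          || compare(regionEnds[i], startKey) > 0;
      // The region begins at or before the inclusive end of the range.
      boolean startsBeforeEnd = endKey.length == 0
          || compare(regionStarts[i], endKey) <= 0;
      if (endsAfterStart && startsBeforeEnd) {
        hits.add(i);
      }
    }
    return hits;
  }

  /** Helper for the example: turns a string into a row key. */
  static byte[] row(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Four regions: ["", g), [g, n), [n, t), [t, "")
    byte[][] starts = { row(""), row("g"), row("n"), row("t") };
    byte[][] ends = { row("g"), row("n"), row("t"), row("") };
    System.out.println(overlappingRegions(starts, ends, row("h"), row("n")));
  }
}
```

In a real client each matching index would then be passed to getRegionLocation(). Even in this small example all four key pairs are fetched and compared although only two regions match, which is the first drawback listed above.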

Here's a proposal for the accessor:

  /**
   * Get the corresponding regions for an arbitrary range of keys.
   * <p>
   * @param startKey Starting row in range, inclusive
   * @param endKey Ending row in range, inclusive
   * @return A list of HRegionLocations corresponding to the regions that
   * contain the specified range
   * @throws IOException if a remote or network exception occurs
   */
  public List<HRegionLocation> getRegionsInRange(final byte [] startKey,
    final byte [] endKey) throws IOException

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220587#comment-13220587 ] 

Hudson commented on HBASE-5489:
-------------------------------

Integrated in HBase-0.94 #8 (See [https://builds.apache.org/job/HBase-0.94/8/])
    HBASE-5489 Addendum (Revision 1296010)

     Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTable.java

                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch, HBASE-5489-4.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220563#comment-13220563 ] 

Hadoop QA commented on HBASE-5489:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12516762/HBASE-5489-4.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
                       org.apache.hadoop.hbase.TestZooKeeper
                       org.apache.hadoop.hbase.mapred.TestTableMapReduce
                       org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1077//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1077//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1077//console

This message is automatically generated.
                

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220326#comment-13220326 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------



bq.  On 2012-03-01 06:44:36, Lars Hofhansl wrote:
bq.  > Looks good to me.
bq.  > Curious: Do you have a specific usecase in mind for this API?
bq.  
bq.  David Wang wrote:
bq.      Yes, I would like to avoid being forced to scan .META. every time my client just wants the regions for a particular range when that information is already cached in the client.  This is also more convenient for the caller than having to parse through all of the start/end keys in the table every time.

Wait. TableInputFormat is already configured with a Scan object, which does exactly the same thing (via a scanner).
You don't need a special InputFormat for this.


- Lars


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5490
-----------------------------------------------------------




                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219828#comment-13219828 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/
-----------------------------------------------------------

(Updated 2012-03-01 06:05:46.664490)


Review request for hbase.


Changes
-------

Made endKey exclusive.  Also added a few more unit tests and fixed a logic error for the termination of the main loop in a couple of corner cases regarding when the specified end key to getRegionsInRange() is EMPTY_END_ROW, and/or when the last region in the range's end key is EMPTY_END_ROW.
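
The loop termination described above can be sketched as follows. This is not the patch itself: a TreeMap keyed by region start key stands in for the client's region lookup (getRegionLocation() in HTable), regions are identified only by their start keys, and the names RangeWalkSketch, regionsInRange, addRegion, compareRows and row are hypothetical. The sketch assumes the regions cover the whole key space, that an empty key marks the end of the table, and that endKey is exclusive.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class RangeWalkSketch {

  /** Unsigned, byte-by-byte lexicographic comparison of two row keys. */
  static int compareRows(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  /** Helper for examples: turns a string into a row key. */
  static byte[] row(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Region start key -> region end key; an empty end key means "end of table".
  private final TreeMap<byte[], byte[]> regions =
      new TreeMap<>(RangeWalkSketch::compareRows);

  void addRegion(byte[] startKey, byte[] endKey) {
    regions.put(startKey, endKey);
  }

  /**
   * Walks the regions from the one holding startKey up to, but not
   * including, endKey, and returns their start keys. The region holding
   * startKey is always returned. An empty endKey means "to end of table".
   */
  List<byte[]> regionsInRange(byte[] startKey, byte[] endKey) {
    boolean endKeyIsEndOfTable = endKey.length == 0;
    if (!endKeyIsEndOfTable && compareRows(startKey, endKey) > 0) {
      throw new IllegalArgumentException("start key is after end key");
    }
    List<byte[]> result = new ArrayList<>();
    byte[] currentKey = startKey;
    do {
      // Stand-in for a (possibly cached) lookup of the region holding currentKey.
      byte[] regionStart = regions.floorKey(currentKey);
      result.add(regionStart);
      // Hop to the next region by continuing from this region's end key.
      currentKey = regions.get(regionStart);
    } while (currentKey.length != 0              // last region of the table reached
        && (endKeyIsEndOfTable                   // caller asked for "to end of table"
            || compareRows(currentKey, endKey) < 0));
    return result;
  }
}
```

With regions ["", g), [g, n), [n, t) and [t, ""), the range from "h" to "n" yields only the region starting at "g", because the exclusive end key coincides with that region's end key. An empty end key walks through to the last region and stops there, since that region's own end key is empty.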


Summary
-------

getRegionsInRange() will retrieve the HRegionLocations for the regions associated with the specified key range, using client-side cache if possible.

I have one question: right now the endKey specified to getRegionsInRange() is treated as inclusive.  I followed the behavior that I saw in HRegionInfo.containsRange().  However, other HBase code such as Scan treats the endKey as exclusive.  So I am not clear as to which way we should go here.  I can easily change the patch if we want the endKey to be exclusive; please let me know.  Thanks in advance.


This addresses bug HBASE-5489.
    https://issues.apache.org/jira/browse/HBASE-5489


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/client/HTable.java 29b8004 
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java bdeaefe 

Diff: https://reviews.apache.org/r/4117/diff


Testing
-------

Ran the TestFromClientSide unit tests repeatedly; they passed each time.

Ran test-patch.sh with the following results:

-1 overall.  

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version ) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.


Thanks,

David


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>

        

[jira] [Work started] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Work started) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-5489 started by David S. Wang.


        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222871#comment-13222871 ] 

Hudson commented on HBASE-5489:
-------------------------------

Integrated in HBase-TRUNK #2672 (See [https://builds.apache.org/job/HBase-TRUNK/2672/])
    HBASE-5489 Addendum (Revision 1296011)
HBASE-5489 Add HTable accessor to get regions for a key range (Revision 1295729)

     Result = SUCCESS
larsh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

                

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220641#comment-13220641 ] 

Hudson commented on HBASE-5489:
-------------------------------

Integrated in HBase-0.92 #312 (See [https://builds.apache.org/job/HBase-0.92/312/])
    HBASE-5489 Addendum (Revision 1296008)

     Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HTable.java

                

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220214#comment-13220214 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/
-----------------------------------------------------------

(Updated 2012-03-01 18:24:18.762517)


Review request for hbase.


Changes
-------

Made minor changes as per Stack's comments.


Summary
-------

getRegionsInRange() will retrieve the HRegionLocations for the regions associated with the specified key range, using client-side cache if possible.

I have one question: right now the endKey specified to getRegionsInRange() is treated as inclusive.  I followed the behavior that I saw in HRegionInfo.containsRange().  However, other HBase code such as Scan treats the endKey as exclusive.  So I am not clear as to which way we should go here.  I can easily change the patch if we want the endKey to be exclusive; please let me know.  Thanks in advance.


This addresses bug HBASE-5489.
    https://issues.apache.org/jira/browse/HBASE-5489


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/client/HTable.java 29b8004 
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java bdeaefe 

Diff: https://reviews.apache.org/r/4117/diff


Testing
-------

Ran the TestFromClientSide unit tests repeatedly; they passed each time.

Ran test-patch.sh with the following results:

-1 overall.  

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version ) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.


Thanks,

David


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3.patch
>
>
> It would be nice to have an accessor to find all regions that overlap with a particular range of keys. Right now, the only way to accomplish that is to call HTable.getStartEndKeys(), then follow that with calls to getRegionLocation() for the range of keys you are interested in.  This algorithm has 2 drawbacks:
> * It returns more keys than is necessary most of the time.  This is especially evident if there are a lot of regions comprising the table and the range of keys is small.
> * It always does a scan of .META. via MetaScannerVisitor for at least HTable.getStartEndKeys(), and perhaps for HRegionLocations that are not already cached by the client.
> An accessor that limited its scans to a specified range could avoid scanning .META. at all if the HRegionLocations being fetched were already cached by the client, thereby potentially making this operation faster in common cases.
> Here's a proposal for the accessor:
>   /**
>    * Get the corresponding regions for an arbitrary range of keys.
>    * <p>
>    * @param startKey Starting row in range, inclusive
>    * @param endKey Ending row in range, inclusive
>    * @return A list of HRegionLocations corresponding to the regions that
>    * contain the specified range
>    * @throws IOException if a remote or network exception occurs
>    */
>   public List<HRegionLocation> getRegionsInRange(final byte [] startKey,
>     final byte [] endKey) throws IOException

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219692#comment-13219692 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/
-----------------------------------------------------------

Review request for hbase.


Summary
-------

getRegionsInRange() will retrieve the HRegionLocations for the regions associated with the specified key range, using client-side cache if possible.
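
As a rough illustration of the lookup described in the summary, the sketch below walks a table region by region: it starts at the region holding the start key and follows each region's end key until the range is covered. Everything in it is a stand-in rather than the code from the patch. RegionRangeSketch, Region, and locate() are hypothetical names, keys are plain Strings instead of byte arrays, an empty end key is assumed to mean "end of table", and the end key is assumed to be exclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class RegionRangeSketch {
    /** Hypothetical stand-in for a region covering [startKey, endKey); "" means unbounded. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Regions keyed by start key; plays the role of the client-side location cache.
    private final TreeMap<String, Region> byStartKey = new TreeMap<>();

    void add(String startKey, String endKey) {
        byStartKey.put(startKey, new Region(startKey, endKey));
    }

    /** Finds the region whose start key is the greatest one not after rowKey. */
    Region locate(String rowKey) {
        return byStartKey.floorEntry(rowKey).getValue();
    }

    /** Collects every region overlapping [startKey, endKey), end key exclusive. */
    List<Region> regionsInRange(String startKey, String endKey) {
        boolean endKeyIsEndOfTable = endKey.isEmpty();
        if (!endKeyIsEndOfTable && startKey.compareTo(endKey) > 0) {
            throw new IllegalArgumentException("start key sorts after end key");
        }
        List<Region> result = new ArrayList<>();
        String currentKey = startKey;
        do {
            Region region = locate(currentKey);
            result.add(region);
            currentKey = region.endKey; // the next region starts where this one ends
        } while (!currentKey.isEmpty()
                && (endKeyIsEndOfTable || currentKey.compareTo(endKey) < 0));
        return result;
    }
}
```

With three regions split at "d" and "m", a small range such as "e" to "f" touches only the middle region, so only one location lookup is needed instead of a pass over every start/end key in the table.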


This addresses bug HBASE-5489.
    https://issues.apache.org/jira/browse/HBASE-5489


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/client/HTable.java 29b8004 
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java bdeaefe 

Diff: https://reviews.apache.org/r/4117/diff


Testing
-------

Ran the TestFromClientSide unit tests; they passed repeatedly.

Ran test-patch.sh with the following results:

-1 overall.  

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version ) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.


Thanks,

David


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219723#comment-13219723 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5477
-----------------------------------------------------------


I'd go with exclusive. That way it is easy to express both an inclusive end key (just append a binary 0 to the key) and an exclusive one.
Patch looks good otherwise.
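
To make the "add a binary 0" remark concrete, here is a minimal sketch of turning an inclusive end key into the equivalent exclusive one. StopKeys and inclusiveToExclusive are hypothetical names used only for illustration; the sketch relies on nothing but java.util.Arrays and is not taken from the patch.

```java
import java.util.Arrays;

final class StopKeys {
    /**
     * Returns a copy of the key with one 0x00 byte appended. In unsigned
     * lexicographic order this is the smallest key that sorts after the
     * original, so using it as an exclusive end key still includes the
     * original key itself.
     */
    static byte[] inclusiveToExclusive(byte[] inclusiveEndKey) {
        // Arrays.copyOf pads the extra position with 0x00.
        return Arrays.copyOf(inclusiveEndKey, inclusiveEndKey.length + 1);
    }
}
```

A caller who wants rows up to and including "row5" would pass inclusiveToExclusive("row5" bytes) as the exclusive end key; a caller who wants to stop before "row5" passes the key unchanged.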

- Lars


On 2012-03-01 01:51:42, David Wang wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4117/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-01 01:51:42)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  getRegionsInRange() will retrieve the HRegionLocations for the regions associated with the specified key range, using client-side cache if possible.
bq.  
bq.  I have one question: right now the endKey specified to getRegionsInRange() is treated as inclusive.  I followed the behavior that I saw in HRegionInfo.containsRange().  However, other HBase code such as Scan treats the endKey as exclusive.  So I am not clear as to which way we should go here.  I can easily change the patch if we want the endKey to be exclusive; please let me know.  Thanks in advance.
bq.  
bq.  
bq.  This addresses bug HBASE-5489.
bq.      https://issues.apache.org/jira/browse/HBASE-5489
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java 29b8004 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java bdeaefe 
bq.  
bq.  Diff: https://reviews.apache.org/r/4117/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Ran the TestFromClientSide unit tests; they passed repeatedly.
bq.  
bq.  Ran test-patch.sh with the following results:
bq.  
bq.  -1 overall.  
bq.  
bq.      +1 @author.  The patch does not contain any @author tags.
bq.  
bq.      +1 tests included.  The patch appears to include 3 new or modified tests.
bq.  
bq.      -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.
bq.  
bq.      +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
bq.  
bq.      +1 findbugs.  The patch does not introduce any new Findbugs (version ) warnings.
bq.  
bq.      +1 release audit.  The applied patch does not increase the total number of release audit warnings.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  David
bq.  
bq.


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220185#comment-13220185 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------



bq.  On 2012-03-01 17:03:25, Michael Stack wrote:
bq.  > Minor comments below.  Otherwise the patch looks good to me.  If you want to skip making a new version, just say so.   As per LarsH, what do you intend to use this for?

Answered in my previous comment, in response to Lars's question.


bq.  On 2012-03-01 17:03:25, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 516
bq.  > <https://reviews.apache.org/r/4117/diff/2/?file=86911#file86911line516>
bq.  >
bq.  >     Ditto.  No need to compare to true?

I'll make the changes and resubmit shortly.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5500
-----------------------------------------------------------


On 2012-03-01 06:05:46, David Wang wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4117/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-01 06:05:46)
bq.  
bq.  


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220327#comment-13220327 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------



bq.  On 2012-03-01 06:44:36, Lars Hofhansl wrote:
bq.  > Looks good to me.
bq.  > Curious: Do you have a specific usecase in mind for this API?
bq.  
bq.  David Wang wrote:
bq.      Yes, I would like to avoid being forced to scan .META. every time my client just wants the regions for a particular range and that information is already cached in the client.  This is also more convenient for the caller than having to parse through all of the start/end keys in the table every time.
bq.  
bq.  Lars Hofhansl wrote:
bq.      Wait. TableInputFormat is already configured with a Scan object, which does exactly the same thing (via a scanner).
bq.      You don't need a special InputFormat for this.

Sorry, that last comment was in response to your email where you say that you want to "make a TableInputFormat equivalent that only scans a sub-range of the table".


- Lars


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5490
-----------------------------------------------------------


On 2012-03-01 18:24:18, David Wang wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4117/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-01 18:24:18)
bq.  
bq.  


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David S. Wang updated HBASE-5489:
---------------------------------

    Attachment: HBASE-5489-3.patch
    
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3.patch
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David S. Wang updated HBASE-5489:
---------------------------------

    Attachment: HBASE-5489-3-0.92.1.patch

This is the patch for the backport to 0.92.1.  I also pulled in HBASE-5177, as HBASE-5489 depends on the getRegionLocation() variant that was introduced by HBASE-5177.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220161#comment-13220161 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5500
-----------------------------------------------------------

Ship it!


Minor comments below.  Otherwise the patch looks good to me.  If you want to skip making a new version, just say so.   As per LarsH, what do you intend to use this for?


src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/4117/#comment11921>

    Write it as:
    
    if ((Bytes.compareTo(startKey, endKey) > 0) && !endKeyIsEndOfTable)
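
    For context, a self-contained sketch of how a check written this way might sit in an argument validation step. RangeCheck, checkRange, and EMPTY_END_ROW are hypothetical stand-ins, Arrays.compareUnsigned (Java 9+) is used in place of Bytes.compareTo, and the assumption that an empty end key means "end of table" is mine, not something stated in the review.

```java
import java.util.Arrays;

final class RangeCheck {
    // Assumed convention: an empty end key stands for "end of table".
    static final byte[] EMPTY_END_ROW = new byte[0];

    /** Rejects a range whose start key sorts after a non-empty end key. */
    static void checkRange(byte[] startKey, byte[] endKey) {
        final boolean endKeyIsEndOfTable = Arrays.equals(endKey, EMPTY_END_ROW);
        // The boolean is used directly; "endKeyIsEndOfTable == false" adds nothing.
        if ((Arrays.compareUnsigned(startKey, endKey) > 0) && !endKeyIsEndOfTable) {
            throw new IllegalArgumentException("Invalid range: start key sorts after end key");
        }
    }
}
```

    Any non-empty start key sorts after the empty end key, which is why the end-of-table case has to be exempted from the comparison.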



src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/4117/#comment11922>

    Ditto.  No need to compare to true?


- Michael


On 2012-03-01 06:05:46, David Wang wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4117/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-01 06:05:46)
bq.  
bq.  


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5489:
-------------------------

    Fix Version/s:     (was: 0.96.0)
                   0.92.1

Committed to 0.92 branch too... thanks for patch David
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220270#comment-13220270 ] 

Hadoop QA commented on HBASE-5489:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12516706/HBASE-5489-3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1075//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1075//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1075//console

This message is automatically generated.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3.patch
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David S. Wang updated HBASE-5489:
---------------------------------

    Fix Version/s: 0.96.0
                   0.94.0
                   0.92.1
           Status: Patch Available  (was: In Progress)

The patch is in:

https://reviews.apache.org/r/4117/

I will submit it here for the Hadoop QA robot once it is approved.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David S. Wang updated HBASE-5489:
---------------------------------

    Status: Patch Available  (was: Reopened)

Submitted patch to fix comment bug.  Sorry for the inconvenience.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch, HBASE-5489-4.patch
>
>

        

[jira] [Reopened] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David S. Wang reopened HBASE-5489:
----------------------------------


The comment for getRegionsInRange() has an error.  Submitting patch to fix that.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch, HBASE-5489-4.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219318#comment-13219318 ] 

stack commented on HBASE-5489:
------------------------------

+1
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220302#comment-13220302 ] 

Hudson commented on HBASE-5489:
-------------------------------

Integrated in HBase-0.94 #7 (See [https://builds.apache.org/job/HBase-0.94/7/])
    HBASE-5489 Add HTable accessor to get regions for a key range (Revision 1295728)

     Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David S. Wang updated HBASE-5489:
---------------------------------

    Attachment: HBASE-5489-4.patch

Patch to fix comment error.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch, HBASE-5489-4.patch
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5489:
-------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.92.1)
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Committed to 0.94 and to trunk.  Patch does not apply to 0.92, David.  If you want me to apply it there, make me one.  Good on you.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3.patch
>
>

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5489:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed addendum to 0.92, 0.94, and trunk.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch, HBASE-5489-4.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220114#comment-13220114 ] 

Hadoop QA commented on HBASE-5489:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12516685/HBASE-5489-2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.replication.TestReplicationPeer
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1067//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1067//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1067//console

This message is automatically generated.
                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5489-2.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219714#comment-13219714 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/
-----------------------------------------------------------

(Updated 2012-03-01 01:51:42.601163)


Review request for hbase.


Changes
-------

Added question to description about whether endKey should be inclusive or exclusive.


Summary (updated)
-------

getRegionsInRange() will retrieve the HRegionLocations for the regions associated with the specified key range, using client-side cache if possible.

I have one question: right now the endKey specified to getRegionsInRange() is treated as inclusive.  I followed the behavior that I saw in HRegionInfo.containsRange().  However, other HBase code such as Scan treats the endKey as exclusive.  So I am not clear as to which way we should go here.  I can easily change the patch if we want the endKey to be exclusive; please let me know.  Thanks in advance.
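To make the inclusive/exclusive question concrete, here is a minimal sketch of the range walk, not the code in the patch. It assumes regions can be modeled as a sorted map from region start key to a region name, that an empty start key marks the first region, and that an empty endKey means "to the end of the table". The class name, the sample table, and the endInclusive flag are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RegionRangeSketch {
    /** Unsigned lexicographic byte order, used here to sort row keys. */
    static final Comparator<byte[]> KEY_ORDER = RegionRangeSketch::compareKeys;

    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length; // a shorter key sorts before its extensions
    }

    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Hypothetical four-region table: start key -> region name. */
    static TreeMap<byte[], String> sampleTable() {
        TreeMap<byte[], String> table = new TreeMap<>(KEY_ORDER);
        table.put(key(""), "r1");
        table.put(key("g"), "r2");
        table.put(key("m"), "r3");
        table.put(key("t"), "r4");
        return table;
    }

    /** Collects the regions that overlap [startKey, endKey) or [startKey, endKey]. */
    static List<String> regionsInRange(TreeMap<byte[], String> regionsByStartKey,
                                       byte[] startKey, byte[] endKey,
                                       boolean endInclusive) {
        List<String> result = new ArrayList<>();
        // The region holding startKey has the greatest start key <= startKey.
        byte[] first = regionsByStartKey.floorKey(startKey);
        if (first == null) {
            return result;
        }
        boolean toEndOfTable = endKey.length == 0; // assumption: empty means unbounded
        for (Map.Entry<byte[], String> e
                : regionsByStartKey.tailMap(first, true).entrySet()) {
            int cmp = compareKeys(e.getKey(), endKey);
            // Exclusive: a region starting exactly at endKey is already out of range.
            boolean pastEnd = !toEndOfTable && (endInclusive ? cmp > 0 : cmp >= 0);
            if (pastEnd) {
                break;
            }
            result.add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<byte[], String> table = sampleTable();
        System.out.println(regionsInRange(table, key("h"), key("m"), false));
        System.out.println(regionsInRange(table, key("h"), key("m"), true));
    }
}
```

The two calls in main differ only when endKey coincides with a region boundary: the inclusive variant also returns the region that starts at endKey, while the exclusive variant stops before it.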


This addresses bug HBASE-5489.
    https://issues.apache.org/jira/browse/HBASE-5489


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/client/HTable.java 29b8004 
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java bdeaefe 

Diff: https://reviews.apache.org/r/4117/diff


Testing
-------

Ran the TestFromClientSide unit tests; they passed repeatedly.

Ran test-patch.sh with the following results:

-1 overall.  

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version ) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.


Thanks,

David


                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0, 0.96.0
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222115#comment-13222115 ] 

Hudson commented on HBASE-5489:
-------------------------------

Integrated in HBase-0.92-security #96 (See [https://builds.apache.org/job/HBase-0.92-security/96/])
    HBASE-5489 Addendum (Revision 1296008)
    HBASE-5489 Add HTable accessor to get regions for a key range (Revision 1295767)
    HBASE-5489 Add HTable accessor to get regions for a key range (Revision 1295727)

     Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HTable.java

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt

                
> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch, HBASE-5489-4.patch
>
>

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220181#comment-13220181 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------



bq.  On 2012-03-01 06:44:36, Lars Hofhansl wrote:
bq.  > Looks good to me.
bq.  > Curious: Do you have a specific use case in mind for this API?

Yes. I would like to avoid scanning .META. every time my client just wants the regions for a particular range when that information is already cached in the client.  This is also more convenient for the caller than parsing through all of the start/end keys in the table every time.
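As a rough illustration of that use case, the sketch below consults a client-side cache first and falls back to a catalog lookup only on a miss. It is not HBase's client code: the Region class, the catalogLookup function (standing in for a .META. lookup), and the String row keys are simplifying assumptions.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

final class CachedRegionLookupSketch {
    /** Hypothetical region descriptor covering [startKey, endKey); empty endKey = end of table. */
    static final class Region {
        final String startKey;
        final String endKey;
        final String server;

        Region(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    // Cached regions, keyed by start key so floorEntry() finds the candidate region.
    private final TreeMap<String, Region> cache = new TreeMap<>();
    // Stand-in for the expensive catalog (.META.) lookup.
    private final Function<String, Region> catalogLookup;
    private int catalogLookups = 0;

    CachedRegionLookupSketch(Function<String, Region> catalogLookup) {
        this.catalogLookup = catalogLookup;
    }

    Region locate(String row) {
        Map.Entry<String, Region> candidate = cache.floorEntry(row);
        // The cache may be partial, so check that the candidate really covers the row.
        if (candidate != null && candidate.getValue().contains(row)) {
            return candidate.getValue();
        }
        catalogLookups++;
        Region fetched = catalogLookup.apply(row);
        cache.put(fetched.startKey, fetched);
        return fetched;
    }

    int catalogLookups() {
        return catalogLookups;
    }

    public static void main(String[] args) {
        Region r1 = new Region("", "m", "server1");
        Region r2 = new Region("m", "", "server2");
        CachedRegionLookupSketch client =
            new CachedRegionLookupSketch(row -> row.compareTo("m") < 0 ? r1 : r2);
        client.locate("a"); // miss: goes to the catalog
        client.locate("b"); // hit: same region is already cached
        System.out.println("catalog lookups: " + client.catalogLookups());
    }
}
```

Once every region in the requested range has been located this way, repeated range queries need no catalog access at all, which is the saving described above.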


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5490
-----------------------------------------------------------


On 2012-03-01 06:05:46, David Wang wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4117/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-01 06:05:46)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  getRegionsInRange() will retrieve the HRegionLocations for the regions associated with the specified key range, using client-side cache if possible.
bq.  
bq.  I have one question: right now the endKey specified to getRegionsInRange() is treated as inclusive.  I followed the behavior that I saw in HRegionInfo.containsRange().  However, other HBase code such as Scan treats the endKey as exclusive.  So I am not clear as to which way we should go here.  I can easily change the patch if we want the endKey to be exclusive; please let me know.  Thanks in advance.
bq.  
bq.  
bq.  This addresses bug HBASE-5489.
bq.      https://issues.apache.org/jira/browse/HBASE-5489
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java 29b8004 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java bdeaefe 
bq.  
bq.  Diff: https://reviews.apache.org/r/4117/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Ran the TestFromClientSide unit tests; they passed repeatedly.
bq.  
bq.  Ran test-patch.sh with the following results:
bq.  
bq.  -1 overall.  
bq.  
bq.      +1 @author.  The patch does not contain any @author tags.
bq.  
bq.      +1 tests included.  The patch appears to include 3 new or modified tests.
bq.  
bq.      -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.
bq.  
bq.      +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
bq.  
bq.      +1 findbugs.  The patch does not introduce any new Findbugs (version ) warnings.
bq.  
bq.      +1 release audit.  The applied patch does not increase the total number of release audit warnings.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  David
bq.  
bq.


                

        

[jira] [Updated] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "David S. Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David S. Wang updated HBASE-5489:
---------------------------------

    Attachment: HBASE-5489-2.patch
    

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219850#comment-13219850 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5490
-----------------------------------------------------------

Ship it!


Looks good to me.
Curious: Do you have a specific use case in mind for this API?


src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/4117/#comment11892>

    Ah yes, I like this better, since we're guaranteed to never scan past the end of the table.


- Lars




                

        

[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219750#comment-13219750 ] 

jiraposter@reviews.apache.org commented on HBASE-5489:
------------------------------------------------------



bq.  On 2012-03-01 02:05:25, Lars Hofhansl wrote:
bq.  > I'd go with exclusive. That way it is easy to get both behaviors: exclusive by default, and inclusive by appending a binary 0 to the end key.
bq.  > Patch looks good otherwise.

Thanks!  I'll redo the patch and retest.  New patch should be up soon.
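For reference, the "add a binary 0" suggestion can be sketched as follows. This is an illustration only; the class and method names are hypothetical. It relies on keys being compared as unsigned bytes in lexicographic order, where key + 0x00 is the smallest key strictly greater than key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class InclusiveEndKeySketch {
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison of two keys. */
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Turns an inclusive end key into the equivalent exclusive one by appending 0x00. */
    static byte[] inclusiveToExclusive(byte[] inclusiveEndKey) {
        // Arrays.copyOf pads the extra position with a zero byte.
        return Arrays.copyOf(inclusiveEndKey, inclusiveEndKey.length + 1);
    }

    /** True if row lies in [startKey, endKeyExclusive). */
    static boolean inRange(byte[] row, byte[] startKey, byte[] endKeyExclusive) {
        return compareKeys(row, startKey) >= 0 && compareKeys(row, endKeyExclusive) < 0;
    }

    public static void main(String[] args) {
        byte[] start = key("row1");
        byte[] end = key("row5");
        System.out.println(inRange(key("row5"), start, end));
        System.out.println(inRange(key("row5"), start, inclusiveToExclusive(end)));
    }
}
```

With an exclusive API, a caller who wants the end row included passes inclusiveToExclusive(endKey); no key can sort between endKey and the padded key, so nothing extra is pulled in.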


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4117/#review5477
-----------------------------------------------------------




                