Posted to issues@hbase.apache.org by "Nicholas Telford (Created) (JIRA)" <ji...@apache.org> on 2012/01/16 14:10:40 UTC

[jira] [Created] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Allow setting Scan start/stop row individually in TableInputFormat
------------------------------------------------------------------

                 Key: HBASE-5208
                 URL: https://issues.apache.org/jira/browse/HBASE-5208
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce
            Reporter: Nicholas Telford
            Priority: Minor


Currently, TableInputFormat initializes a serialized Scan from "hbase.mapreduce.scan". Alternatively, it will instantiate a new Scan using properties defined in "hbase.mapreduce.scan.*". However, of these properties the "start row" and "stop row" (arguably the most pertinent) are missing.

TableInputFormat should permit the specification of a start/stop row, as with the other fields, using a new pair of properties: "hbase.mapreduce.scan.row.start" and "hbase.mapreduce.scan.row.stop".

The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows.
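
A minimal sketch of the proposal, using only the JDK. The Map stands in for a Hadoop Configuration, and ScanRange is a hypothetical stand-in for the row bounds of an HBase Scan; neither is taken from the actual patch. The stop-row key follows the name given in the release note for this issue ("hbase.mapreduce.scan.row.stop"), and treating an unset property as an empty, unbounded row is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the row bounds of a scan; not an HBase class. */
final class ScanRange {
    final byte[] startRow;
    final byte[] stopRow;

    ScanRange(byte[] startRow, byte[] stopRow) {
        this.startRow = startRow;
        this.stopRow = stopRow;
    }
}

final class ScanRangeFromConfig {
    // Property names as given in the release note for this issue.
    static final String SCAN_ROW_START = "hbase.mapreduce.scan.row.start";
    static final String SCAN_ROW_STOP = "hbase.mapreduce.scan.row.stop";

    /** Builds row bounds from plain string properties (Map stands in for Configuration). */
    static ScanRange fromConfig(Map<String, String> conf) {
        return new ScanRange(toBytes(conf.get(SCAN_ROW_START)),
                             toBytes(conf.get(SCAN_ROW_STOP)));
    }

    // Assumption: an unset property becomes an empty row, read here as "unbounded".
    private static byte[] toBytes(String value) {
        return value == null ? new byte[0] : value.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SCAN_ROW_START, "app");
        ScanRange range = fromConfig(conf);
        System.out.println("start=" + new String(range.startRow, StandardCharsets.UTF_8)
            + " stopLength=" + range.stopRow.length);
    }
}
```

Because both bounds are ordinary string properties, a tool that can only set job configuration values (such as an Oozie workflow) could supply them without serializing a Scan.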

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186945#comment-13186945 ] 

Zhihong Yu commented on HBASE-5208:
-----------------------------------

@Nicholas:
You should use --no-prefix to generate your patch so that Hadoop QA can run it.

This is a useful feature. Can you add a unit test for it?

Thanks
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187065#comment-13187065 ] 

Zhihong Yu commented on HBASE-5208:
-----------------------------------

Looks like testScan() is always followed by testScanFromConfiguration() with the same parameters:
{code}
     testScan(null, "app", "apo");
+    testScanFromConfiguration(null, "app", "apo");
{code}
I suggest adding an intermediary method that calls both testScanFromConfiguration() and testScan().

So using the existing TestTableInputFormatScan should be fine.
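
One way the suggested intermediary method could look, as a JDK-only sketch. The two helpers are stubbed so that they merely record their calls (the real ones run a MapReduce job), and the name testScanBothWays plus the parameter names are hypothetical, not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

final class ScanTestHelperSketch {
    // Records which checks ran; stands in for actually running jobs.
    static final List<String> calls = new ArrayList<>();

    // Stub for the existing helper that passes a serialized Scan.
    static void testScan(String start, String stop, String last) {
        calls.add("testScan(" + start + "," + stop + "," + last + ")");
    }

    // Stub for the new helper that passes start/stop via configuration properties.
    static void testScanFromConfiguration(String start, String stop, String last) {
        calls.add("testScanFromConfiguration(" + start + "," + stop + "," + last + ")");
    }

    // Hypothetical intermediary: one call exercises both code paths with the same arguments.
    static void testScanBothWays(String start, String stop, String last) {
        testScan(start, stop, last);
        testScanFromConfiguration(start, stop, last);
    }

    public static void main(String[] args) {
        testScanBothWays(null, "app", "apo");
        for (String call : calls) {
            System.out.println(call);
        }
    }
}
```

Each test case then shrinks to a single line, and the two variants cannot drift apart in their parameters.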
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187838#comment-13187838 ] 

Zhihong Yu commented on HBASE-5208:
-----------------------------------

@Nicholas:
See MAPREDUCE-3583 for the cause of NumberFormatException.
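
For context, the stack trace reported in this thread ends in Long.parseLong with the input "18446743988250694508". The JDK-only sketch below only demonstrates that this value is larger than Long.MAX_VALUE, which is enough for parseLong to reject it. It makes no claim about how MAPREDUCE-3583 resolves the problem; the class and method names are illustrative.

```java
import java.math.BigInteger;

final class ParseLongOverflowSketch {
    // Value copied from the stack trace reported in this thread.
    static final String INPUT = "18446743988250694508";

    /** Returns true if Long.parseLong rejects the text. */
    static boolean failsAsSignedLong(String text) {
        try {
            Long.parseLong(text);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    /** Returns true if the decimal text is larger than Long.MAX_VALUE. */
    static boolean exceedsSignedLong(String text) {
        return new BigInteger(text).compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0;
    }

    public static void main(String[] args) {
        System.out.println("parseLong fails: " + failsAsSignedLong(INPUT));
        System.out.println("exceeds Long.MAX_VALUE: " + exceedsSignedLong(INPUT));
    }
}
```

Since the failing frame is in Hadoop's ProcfsBasedProcessTree rather than in HBase code, the sketch is consistent with the failures being unrelated to this patch.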
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187843#comment-13187843 ] 

Zhihong Yu commented on HBASE-5208:
-----------------------------------

{code}
+    assertTrue(job.isComplete());
{code}
Can we add more validation on top of the above?

Also, TestTableInputFormatScan took 885 seconds on Jenkins. Is there a way to shorten it?
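
One possible direction for the extra validation, as a JDK-only sketch: a job that has completed has not necessarily succeeded, so the test could check both. Hadoop's Job class does offer isSuccessful() alongside isComplete(), but the JobStatusView interface and the assertJobSucceeded helper below are hypothetical stand-ins, not code from the patch.

```java
final class JobValidationSketch {
    /** Hypothetical stand-in for the two status checks of a finished job. */
    interface JobStatusView {
        boolean isComplete();
        boolean isSuccessful();
    }

    /** Fails unless the job both finished and succeeded. */
    static void assertJobSucceeded(JobStatusView job) {
        if (!job.isComplete()) {
            throw new AssertionError("job did not complete");
        }
        if (!job.isSuccessful()) {
            throw new AssertionError("job completed but did not succeed");
        }
    }

    /** Builds a fixed status for demonstration purposes. */
    static JobStatusView status(final boolean complete, final boolean successful) {
        return new JobStatusView() {
            @Override public boolean isComplete() { return complete; }
            @Override public boolean isSuccessful() { return successful; }
        };
    }

    public static void main(String[] args) {
        assertJobSucceeded(status(true, true)); // passes silently
        try {
            assertJobSucceeded(status(true, false));
        } catch (AssertionError e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A stronger check would also compare the rows the job actually saw against the expected start and stop rows, which this sketch does not attempt.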
                

[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

    Release Note: Added "hbase.mapreduce.scan.row.start" and "hbase.mapreduce.scan.row.stop" for defining start and stop rows for a MapReduce job without having to serialize a Scan object.
          Status: Patch Available  (was: Open)
    

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186964#comment-13186964 ] 

Hadoop QA commented on HBASE-5208:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510705/HBASE-5208-002.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -145 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/777//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/777//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/777//console

This message is automatically generated.
                

[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

    Attachment: HBASE-5208-002.txt

Git patches seem to break the QA bot.

Manually edited to remove the "a/" prefixes.
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186962#comment-13186962 ] 

Nicholas Telford commented on HBASE-5208:
-----------------------------------------

Tests were excluded from the patch because, for now, I'm unable to get the "large" tests to run in my environment, even from a clean trunk. I do have a patch with tests, but I'm not happy submitting them until I can get them working.
                

[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

    Attachment: HBASE-5208-001.txt

Adds "hbase.mapreduce.scan.row.start" and "hbase.mapreduce.scan.row.stop" options to TableInputFormat to permit defining start/stop row separately.
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187683#comment-13187683 ] 

Nicholas Telford commented on HBASE-5208:
-----------------------------------------

Not entirely sure why there are (unrelated) tests failing. Looking at the error, they all appear to be caused by the following. Can someone verify whether or not this is caused by something in my patch?

java.lang.NumberFormatException: For input string: "18446743988250694508"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
	at java.lang.Long.parseLong(Long.java:422)
	at java.lang.Long.parseLong(Long.java:468)
	at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
	at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
	at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
	at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

As for the findbugs and JavaDoc issues: JavaDoc is reporting a negative number of problems, so I'm disregarding it. Findbugs doesn't seem to be finding anything in my new code, although it's difficult to be sure given the volume of warnings.
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186936#comment-13186936 ] 

Hadoop QA commented on HBASE-5208:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510704/HBASE-5208-001.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/776//console

This message is automatically generated.
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187604#comment-13187604 ] 

Hadoop QA commented on HBASE-5208:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510830/HBASE-5208-004.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -144 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/791//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/791//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/791//console

This message is automatically generated.
                

[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

    Attachment: HBASE-5208-003.txt

Adds tests for Scans defined by a Configuration.

Getting the largeTests suite running proved difficult and I think this actually makes the test run too long - I had to comment out the old testScan() tests to get it to complete in a reasonable time (i.e. without being killed for taking too long).

Should I have separated this out into a separate test file?
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187060#comment-13187060 ] 

Hadoop QA commented on HBASE-5208:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510724/HBASE-5208-003.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -145 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.replication.TestReplicationPeer
                  org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/779//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/779//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/779//console

This message is automatically generated.
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188406#comment-13188406 ] 

Nicholas Telford commented on HBASE-5208:
-----------------------------------------

Regarding test length: even without my additions (i.e. on a clean trunk), the tests take a very long time. Most of that time is spent in the original tests, as they spin up 11 MapReduce jobs. I imagine running the tests in parallel might improve things, but I haven't tested that.

My additional test adds another MapReduce job, so it will increase the test length, but not substantially.
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187071#comment-13187071 ] 

Nicholas Telford commented on HBASE-5208:
-----------------------------------------

That was my intention. I can extract that into an intermediary method if that's preferable; however, that doesn't really solve the problem that doubling the number of MR jobs spun up causes the test to time out. Any ideas on that one?
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187132#comment-13187132 ] 

Zhihong Yu commented on HBASE-5208:
-----------------------------------

From https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2634/console:
{code}
Running org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 454.79 sec
{code}
It was indeed long, even without the new test cases.

Can you pick only a few of the test cases from TestTableInputFormatScan for your new tests?


                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189639#comment-13189639 ] 

Hudson commented on HBASE-5208:
-------------------------------

Integrated in HBase-TRUNK-security #82 (See [https://builds.apache.org/job/HBase-TRUNK-security/82/])
    HBASE-5208 Allow setting Scan start/stop row individually in TableInputFormat

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormat.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableInputFormatScan.java

                

[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Nicholas Telford (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

    Attachment: HBASE-5208-004.txt

Test broken out into a single test that doesn't cause the test suite to time out.
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189011#comment-13189011 ] 

Hudson commented on HBASE-5208:
-------------------------------

Integrated in HBase-TRUNK #2639 (See [https://builds.apache.org/job/HBase-TRUNK/2639/])
    HBASE-5208 Allow setting Scan start/stop row individually in TableInputFormat

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormat.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableInputFormatScan.java

                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187143#comment-13187143 ] 

Zhihong Yu commented on HBASE-5208:
-----------------------------------

Running the test based on patch v3 timed out. Here is the stack trace:
{code}
"main" prio=5 tid=101801000 nid=0x100601000 waiting on condition [1005fe000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1295)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:498)
	at org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan.testScanFromConfiguration(TestTableInputFormatScan.java:355)
	at org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan.testScanYZYToEmpty(TestTableInputFormatScan.java:319)
{code}

You can use the following command to verify that the new test case passes (just an example):
{code}
mvn test -P localTests -Dtest=TestTableInputFormatScan#testScanEmptyToEmpty
{code}
                

[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188709#comment-13188709 ] 

Zhihong Yu commented on HBASE-5208:
-----------------------------------

Integrated to TRUNK.

Thanks for the patch, Nicholas.
                