You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@hbase.apache.org by "Lars George (JIRA)" <ji...@apache.org> on 2009/03/25 13:12:52 UTC

[jira] Created: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
---------------------------------------------------------------------

                 Key: HBASE-1287
                 URL: https://issues.apache.org/jira/browse/HBASE-1287
             Project: Hadoop HBase
          Issue Type: Bug
          Components: mapred
            Reporter: Lars George
            Assignee: Lars George


Upon checking the available utility methods in TableMapReduceUtil I came across this code:

{code}
  public static void initTableReduceJob(String table,
    Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
  throws IOException {
    job.setOutputFormat(TableOutputFormat.class);
    job.setReducerClass(reducer);
    job.set(TableOutputFormat.OUTPUT_TABLE, table);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(BatchUpdate.class);
    if (partitioner != null) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
    }
  }
{code}

It seems though as it should be

{code}
    if (partitioner != null) {
      job.setPartitionerClass(partitioner);
{code}

and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Issue Comment Edited: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702037#action_12702037 ] 

Billy Pearson edited comment on HBASE-1287 at 4/23/09 11:06 AM:
----------------------------------------------------------------

Edited the to not set partitioner if null is as in from 75 passed 
it gives java.lang.NullPointerException otherwise
+1

      was (Author: viper799):
    Edited the to not set partitioner if null is as in from 75 passed 
+1
  
> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287-5.patch, 1287-6.patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702333#action_12702333 ] 

Lars George commented on HBASE-1287:
------------------------------------

+1

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287-5.patch, 1287-6.patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689112#action_12689112 ] 

stack commented on HBASE-1287:
------------------------------

Yeah, that looks silly Lars; always using HRegionPartitioner regardless of whats passed.

Does the rest of that function belong inside the '    if (partitioner != null) { }' loop?  The bit where its doing:

{code}
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
{code}

Should it be rather something like if partition == HRegionPartition.class, do above?  Otherwise, presume the custom partitioner is figuring the number of reduce tasks?

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Pearson updated HBASE-1287:
---------------------------------

    Status: Patch Available  (was: Open)

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Issue Comment Edited: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698606#action_12698606 ] 

Billy Pearson edited comment on HBASE-1287 at 4/13/09 6:25 PM:
---------------------------------------------------------------

yea that was my mess up should have used partitioner in the function but just hard linked the Hregion one for no reasion.

The idea behind setting the number of reducers is to lower it if the number of reducers if it is set > then we have regions the 
HRegionPartitioner will handle if the number of reduce is set lower then the number of regions it just converts back to using default hashPartitioner
if greater then number of regions it will still work but will be a wast of launching reducers that will have no work to do.

{code}
if (partitioner != null) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
    }
{code}

should be something like this

{code}
if (partitioner == HRegionPartitioner.class) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
  } else {
    job.setPartitionerClass(partitioner);
  }
{code}



      was (Author: viper799):
    yea that was my mess up should have used partitioner in the function but just hard linked the Hregion one for no reasion.

The idea behind setting the number of reducers is to lower it if the number of reducers if it is set > then we have regions the 
HRegionPartitioner will handle if the number of reduce is set lower then the number of regions it just converts back to using default hashPartitioner
if greater then number of regions it will still work but will be a wast of launching reducers that will have no work to do.

{code}
if (partitioner != null) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
    }
{code}

should be something like this

{code}
if (partitioner == HRegionPartitioner.class) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
  } else {
    job.setPartitionerClass(HRegionPartitioner.class);
  }
{code}


  
> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1287:
-------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to branch and trunk.  Thanks Lars and Billy for figuring this patch.

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287-5.patch, 1287-6.patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1287:
-------------------------------

    Attachment: 1287-2.patch

As discussed. Moved code that ensured upper limits to separate methods and added generic setters to be able to specify the number of mappers or reducers for a given job. Also added setting the default HRegionPartitioner back in as that seemed like it was the original intention. So either the given one is set or the default one when that specific method is called. The initTableReduceJob without the partitioner parameter is unchanged and does not set anything specific related to the partitioner class.

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287-2.patch, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1287:
-------------------------------

    Description: 
Upon checking the available utility methods in TableMapReduceUtil I came across this code

{code}
  public static void initTableReduceJob(String table,
    Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
  throws IOException {
    job.setOutputFormat(TableOutputFormat.class);
    job.setReducerClass(reducer);
    job.set(TableOutputFormat.OUTPUT_TABLE, table);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(BatchUpdate.class);
    if (partitioner != null) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
    }
  }
{code}

It seems though as it should be

{code}
    if (partitioner != null) {
      job.setPartitionerClass(partitioner);
{code}

and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

  was:
Upon checking the available utility methods in TableMapReduceUtil I came across this code:

{code}
  public static void initTableReduceJob(String table,
    Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
  throws IOException {
    job.setOutputFormat(TableOutputFormat.class);
    job.setReducerClass(reducer);
    job.set(TableOutputFormat.OUTPUT_TABLE, table);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(BatchUpdate.class);
    if (partitioner != null) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
    }
  }
{code}

It seems though as it should be

{code}
    if (partitioner != null) {
      job.setPartitionerClass(partitioner);
{code}

and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.


> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702033#action_12702033 ] 

Billy Pearson commented on HBASE-1287:
--------------------------------------

question?
the default partitioner should be hash defualt in hadoop
if we pass null as we do in line 75
and its set in line 105 as null will hadoop use default hash or will this break the partitioner?

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287-5.patch, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1287:
-------------------------------

    Attachment: 1287.patch

Fixes the usage of the partitioner class as opposed to be using the hardcoded default class.

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Pearson updated HBASE-1287:
---------------------------------

    Attachment: 1287-6.patch.txt

Edited the to not set partitioner if null is as in from 75 passed 
+1

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287-5.patch, 1287-6.patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Pearson updated HBASE-1287:
---------------------------------

    Status: Open  (was: Patch Available)

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701904#action_12701904 ] 

Lars George commented on HBASE-1287:
------------------------------------

No. His patch is negating on the comparison, while I assume from his comment that he wanted to have an equal:

{code}-    if (partitioner != null) {
+    if (partitioner != HRegionPartitioner.class) { {code}

should be:

{code}-    if (partitioner != null) {
+    if (partitioner == HRegionPartitioner.class) { {code}

?

Also I would like to keep the setters I added if there is no objection against it. Because I need them and I have also talked to others that need those too by the looks. 

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689115#action_12689115 ] 

Lars George commented on HBASE-1287:
------------------------------------

I was wondering about this too Michael and was about to ask. I am not sure who added it and why (I have yet to figure out how to search the file history for a specific change as opposed to browse them one by one). Anyhow, I do a similar thing, computing the number of mappers, in my main MR program. Why this is here and why it is executed just when a custom partitioner is given eludes me. Either you do it always, or never - or have an extra flag that triggers it. 

It also sets the upper limit only when the given number of reducers is higher than the actual regions. Why though? I am using this only to compute the number of mappers so that I do not have to specify this on the command line. 

All in all I would drop the whole thing if there is no reason to keep it and rather ask the caller to set the boundaries. Or add another helper method that computes these on demand, for example

{code}
public setNumMapTasks(String table, JobConf job) {...}
public setNumReduceTasks(String table, JobConf job) {...}
{code}

where both simply do the same thing, get the number of region and assign that value to the respective field in the job configuration.

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701799#action_12701799 ] 

stack commented on HBASE-1287:
------------------------------

@Lars  You good w/ Billy's patch?  (It looks good to me).

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1287:
-------------------------------

    Attachment: 1287-5.patch

Combined patch including BIlly's patch and the additional functions I see being useful (and use myself). Also cleaned up JavaDoc of all other methods.

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287-5.patch, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Pearson updated HBASE-1287:
---------------------------------

    Fix Version/s: 0.20.0
                   0.19.2
           Status: Patch Available  (was: Open)

forgot to include it will patch trunk and branch-0.19

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694287#action_12694287 ] 

stack commented on HBASE-1287:
------------------------------

Sounds great Lars.

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698606#action_12698606 ] 

Billy Pearson commented on HBASE-1287:
--------------------------------------

yea that was my mess up should have used partitioner in the function but just hard linked the Hregion one for no reasion.

The idea behind setting the number of reducers is to lower it if the number of reducers if it is set > then we have regions the 
HRegionPartitioner will handle if the number of reduce is set lower then the number of regions it just converts back to using default hashPartitioner
if greater then number of regions it will still work but will be a wast of launching reducers that will have no work to do.

{code}
if (partitioner != null) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
    }
{code}

should be something like this

{code}
if (partitioner == HRegionPartitioner.class) {
      job.setPartitionerClass(HRegionPartitioner.class);
      HTable outputTable = new HTable(new HBaseConfiguration(job), table);
      int regions = outputTable.getRegionsInfo().size();
      if (job.getNumReduceTasks() > regions){
    	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
      }
  } else {
    job.setPartitionerClass(HRegionPartitioner.class);
  }
{code}



> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287-2.patch, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702029#action_12702029 ] 

Billy Pearson commented on HBASE-1287:
--------------------------------------

+1 I like the added functions I can see where they will be used often in hbase MR jobs

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287-5.patch, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701964#action_12701964 ] 

stack commented on HBASE-1287:
------------------------------

Billy, want to make a new patch?  Or want Lars or I to do it?  Thanks.

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Pearson updated HBASE-1287:
---------------------------------

    Attachment: 1287-4-patch.txt

Sorry about that fixed

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287-4-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693731#action_12693731 ] 

stack commented on HBASE-1287:
------------------------------

Hey Lars:  I did svn blame and svn log against that file.  Looks like its my commit of a Billy Pearson patch, HBASE-987 that added the partitioner test.  As is, as you say, its broke.  Regards your suggestion, would you still be able to pass a custom partitioner?

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Pearson updated HBASE-1287:
---------------------------------

    Attachment: 1287-3-patch.txt

upload patch to fix I thank what you are talking about I would like to keep the setNumReduceTasks for HRegion Partitioner

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Lars George (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693907#action_12693907 ] 

Lars George commented on HBASE-1287:
------------------------------------

Hi Stack, yes, that would stay the same, I would simply move the next few lines below the partitioner part to specific helper methods that can be called if needed - as opposed to implicitly assuming that is always what a caller wants when they set a custom partitioner. Does that make sense?

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Pearson reassigned HBASE-1287:
------------------------------------

    Assignee: Billy Pearson  (was: Lars George)

> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Billy Pearson
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>     	job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems though as it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call or a custom one can be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.