Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2009/02/25 21:47:01 UTC

[jira] Updated: (HBASE-1172) Modify TableInputFormat splitting algorithm to allow any number of mappers

     [ https://issues.apache.org/jira/browse/HBASE-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-1172:
---------------------------------

    Fix Version/s: 0.20.0
                       (was: 0.19.1)

Fix for 0.20.0

> Modify TableInputFormat splitting algorithm to allow any number of mappers
> --------------------------------------------------------------------------
>
>                 Key: HBASE-1172
>                 URL: https://issues.apache.org/jira/browse/HBASE-1172
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>             Fix For: 0.20.0
>
>
> Currently, the number of mappers specified when using TableInputFormat is followed strictly only if it is less than the total number of regions in the input table.  If it is greater, the number of regions is used instead.
> This change will modify the splitting algorithm to do the following:
> - If you specify 0 mappers, # mappers = # regions.
> - If you specify fewer mappers than regions, exactly the number you specify is used, based on the current algorithm.
> - If you specify more mappers than regions, each region is divided into sub-ranges of the form [start,X) [X,end).  The number of mappers will always be a multiple of the number of regions, so that no scanner spans multiple regions.
> There is an additional issue: the default number of map tasks in JobConf is 1, so if a user does not explicitly set the number of map tasks, a single mapper will be used.  I'm going to deal with that in a separate jira, because the issue already exists today, there are a number of ways to implement a fix, and it's not required to complete this issue.
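
The three cases described above can be sketched roughly as follows. This is only an illustration, not the patch attached to the issue: the class SplitSketch and the methods effectiveMappers and splitRegion are hypothetical names, row keys are simplified to long values (real HBase row keys are byte arrays), and rounding a large request down to the nearest multiple of the region count is an assumption, since the description only says the result is a multiple of the number of regions.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    /**
     * Hypothetical helper: number of map tasks that result from a request.
     * Assumption: a request larger than the region count is rounded DOWN
     * to the nearest multiple of the region count.
     */
    public static int effectiveMappers(int requested, int regions) {
        if (regions <= 0 || requested < 0) {
            throw new IllegalArgumentException("regions must be > 0 and requested >= 0");
        }
        if (requested == 0) {
            return regions;              // 0 means one mapper per region
        }
        if (requested <= regions) {
            return requested;            // regions are grouped, as in the current algorithm
        }
        return (requested / regions) * regions;  // same number of sub-ranges per region
    }

    /**
     * Hypothetical helper: divide one region's key range [start, end) into
     * contiguous sub-ranges [start, X), [X, end), ... so that no sub-range
     * crosses a region boundary. Keys are longs here purely for illustration.
     */
    public static List<long[]> splitRegion(long start, long end, int parts) {
        List<long[]> ranges = new ArrayList<>();
        long width = end - start;
        for (int i = 0; i < parts; i++) {
            long lo = start + width * i / parts;
            long hi = start + width * (i + 1) / parts;
            ranges.add(new long[] {lo, hi});
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMappers(0, 4));   // prints 4
        System.out.println(effectiveMappers(3, 4));   // prints 3
        System.out.println(effectiveMappers(10, 4));  // prints 8
        for (long[] r : splitRegion(0L, 100L, 2)) {
            System.out.println("[" + r[0] + "," + r[1] + ")");
        }
    }
}
```

With 4 regions and 10 requested mappers, the sketch yields 8 map tasks, i.e. two sub-ranges per region.  Note that with long keys a region narrower than the number of parts would produce empty sub-ranges; a real implementation would have to pick split points in byte-array key space instead.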

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.