Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2010/02/25 20:16:27 UTC

[jira] Created: (HIVE-1199) configure total number of mappers

configure total number of mappers
---------------------------------

                 Key: HIVE-1199
                 URL: https://issues.apache.org/jira/browse/HIVE-1199
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain
             Fix For: 0.6.0


For users, it can be very difficult to control the number of mappers. There are many parameters, which confuses users;
for CombineHiveInputFormat, a different set of parameters is required to control the number of mappers.

In general, users should have a way to specify the total number of mappers, and that number should be obeyed. This will be very difficult
to guarantee, since the query might be reading from a large number of partitions, and a mapper can only span one partition.
What if the number of mappers that the user wants is less than the total number of partitions?
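
To make the partition constraint concrete, here is a minimal Python sketch of one possible policy. It is not Hive code: the helper name allocate_mappers and the proportional rounding rule are assumptions made purely for illustration. Because a mapper cannot span partitions, each non-empty partition gets at least one mapper, so a request smaller than the partition count cannot be met.

```python
def allocate_mappers(partition_bytes, requested_mappers):
    """Spread a requested mapper count over partitions.

    Hypothetical helper (not part of Hive): assumes a mapper never spans
    two partitions, so every non-empty partition gets at least one mapper.
    """
    if requested_mappers < 1:
        raise ValueError("requested_mappers must be at least 1")

    nonempty = {name: size for name, size in partition_bytes.items() if size > 0}
    if not nonempty:
        return {}

    # Floor: one mapper per non-empty partition, even if that exceeds the request.
    allocation = {name: 1 for name in nonempty}

    # Hand out the rest of the budget in proportion to partition size.
    spare = requested_mappers - len(nonempty)
    if spare > 0:
        total = sum(nonempty.values())
        for name, size in nonempty.items():
            allocation[name] += spare * size // total  # rounds down

    return allocation


if __name__ == "__main__":
    sizes = {"ds=2010-02-23": 100, "ds=2010-02-24": 300, "ds=2010-02-25": 0}
    print(allocate_mappers(sizes, 6))  # request can be met
    print(allocate_mappers(sizes, 1))  # request is below the partition count
```

In the second call the request is for one mapper, but the two non-empty partitions still receive one mapper each, so the requested total can only be treated as a best-effort target.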

It would be a very useful heuristic to have - a simple use case that Joy had is as follows:

A query needs to be run on one table, which has a lot of small files - it would be easier for him to specify the total number of mappers
rather than the various rack-local/node-local CombineFileInputFormat parameters.
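
For this single-table case, one way such a heuristic could work is to derive a split size from the requested mapper count. The Python sketch below is illustrative only and not the implementation proposed in this issue: the helper names are hypothetical, and it assumes that the maximum combined split size is controlled by the Hadoop property mapred.max.split.size, which may differ by version.

```python
def split_size_for(total_input_bytes, requested_mappers):
    """Return a split size (bytes) that yields about requested_mappers splits.

    Hypothetical helper: assumes input can be combined freely into splits,
    which only holds within a single table or partition.
    """
    if total_input_bytes < 0:
        raise ValueError("total_input_bytes must be non-negative")
    if requested_mappers < 1:
        raise ValueError("requested_mappers must be at least 1")
    # Integer ceiling division, so the number of splits never exceeds the request.
    return max(1, -(-total_input_bytes // requested_mappers))


def hive_set_statement(total_input_bytes, requested_mappers):
    """Format a 'set' line for the assumed split-size property."""
    size = split_size_for(total_input_bytes, requested_mappers)
    # Assumption: mapred.max.split.size caps the combined split size.
    return "set mapred.max.split.size={};".format(size)


if __name__ == "__main__":
    ten_gib = 10 * 1024 ** 3
    print(hive_set_statement(ten_gib, 40))
```

The actual mapper count would still depend on file boundaries and data locality, so the result is an approximation of the requested total rather than a guarantee.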
