Posted to mapreduce-user@hadoop.apache.org by Pedro Costa <ps...@gmail.com> on 2011/03/18 18:04:52 UTC

set number of map tasks in GridMix2

Hi,

I would like to define the number of map tasks to use in GridMix2.

For example, I would like to run GridMixMonsterQuery in GridMix2
with 5 maps, then with 10 maps, and then with 20 maps.

How can I do that?

Thanks,

-- 
Pedro

Re: set number of map tasks in GridMix2

Posted by Denny Ye <de...@gmail.com>.
Hi Pedro,

You are right: the number of map tasks is determined by the number of
input splits. By default, one DFS block corresponds to one split.

In your first example, each file occupies a single DFS block, so the job
has 10 map tasks for the 10 blocks. In the second example, the default DFS
block size is 64 MB, so each 1 GB file has 16 blocks:
10 files * 16 splits = 160 map tasks.

I think there are two ways to change the number of map tasks through
configuration parameters:
  Block size: set dfs.blocksize to a suitable size.
  Input split limits: set mapreduce.input.fileinputformat.split.maxsize and
  mapreduce.input.fileinputformat.split.minsize. The split size will fall
  between the minimum size and the maximum size.
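
To make the arithmetic above concrete, here is a minimal sketch of how the split count could be estimated. It assumes splittable (uncompressed) input files, non-empty files, and the split-size rule max(minSize, min(maxSize, blockSize)) used by FileInputFormat, including its 10% tolerance on the last split. The class and method names (SplitEstimate, computeSplitSize, countSplits) are made up for illustration and are not part of Hadoop or GridMix2, and the thread does not confirm that every GridMix2 job honors these settings.

```java
public class SplitEstimate {
    // FileInputFormat lets the last split be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    // Split size rule: the block size, clamped between the min and max split sizes.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Estimated number of splits (map tasks) for one non-empty, splittable file.
    public static long countSplits(long fileLength, long splitSize) {
        long splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining > 0) {
            splits++; // the final, possibly smaller (or slightly larger) split
        }
        return splits;
    }

    public static void main(String[] args) {
        long KB = 1024L, MB = 1024 * KB, GB = 1024 * MB;
        long blockSize = 64 * MB;        // default DFS block size from this thread
        long minSize = 1;                // assumed: no effective lower limit
        long maxSize = Long.MAX_VALUE;   // assumed: no effective upper limit

        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        System.out.println(10 * countSplits(1 * KB, splitSize)); // 10 files of 1 KB -> 10
        System.out.println(10 * countSplits(1 * GB, splitSize)); // 10 files of 1 GB -> 160

        // Capping the split size at 32 MB halves each split, doubling the map count.
        long smaller = computeSplitSize(blockSize, minSize, 32 * MB);
        System.out.println(10 * countSplits(1 * GB, smaller));   // 320
    }
}
```

Note that the split limits only change how each file is divided. One non-empty file still yields at least one split, so 10 small files cannot be reduced to 5 maps this way.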

Regards,
-- Denny Ye

On Sat, Mar 19, 2011 at 1:18 AM, Pedro Costa <ps...@gmail.com> wrote:

> I have another question.
>
> Is the number of map tasks defined by the number of input splits?
> For example, if I run an example that reads 10 txt files of 1 KB each,
> does it mean that 10 map tasks will run?
> And if I have 10 txt files of 1 GB each, how many map tasks will run?
>
> Thanks,

Re: set number of map tasks in GridMix2

Posted by Pedro Costa <ps...@gmail.com>.
I have another question.

Is the number of map tasks defined by the number of input splits?
For example, if I run an example that reads 10 txt files of 1 KB each,
does it mean that 10 map tasks will run?
And if I have 10 txt files of 1 GB each, how many map tasks will run?

Thanks,

-- 
Pedro