Posted to user@hadoop.apache.org by anil gupta <an...@gmail.com> on 2012/12/26 19:19:58 UTC

Setting number of mappers in Teragen

Hi All,

I have 5 worker nodes with 4 map slots per node, so I have 20 map slots in
my cluster. But when I start my Teragen job, it spawns only 2 mappers for
the entire job. I have even tried using the option -Dmapred.map.tasks = 20.
Can anyone tell me how to force Teragen to use 20 mappers for generating
the data? I am using CDH 4.1.2 with MapReduce v1 (Hadoop 0.20.2).
-- 
Thanks & Regards,
Anil Gupta

Re: Setting number of mappers in Teragen

Posted by anil gupta <an...@gmail.com>.
Hi Harsh,

Fixed it. I was putting -Dmapred.map.tasks=20 after the directory argument
instead of before it. I had completely forgotten that Hadoop's
GenericOptionsParser only picks up generic options that come first.
Thanks a lot. :)
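
A minimal Python sketch of the corrected argument order. The jar name, row
count, and output path are placeholders rather than values from this thread,
and teragen_command is a hypothetical helper, not part of Hadoop:

```python
import shlex


def teragen_command(examples_jar, rows, output_dir, properties):
    """Build a teragen argv with all -D generic options placed right after
    the program name "teragen" and before the positional arguments."""
    generic = [f"-D{key}={value}" for key, value in properties.items()]
    return ["hadoop", "jar", examples_jar, "teragen", *generic, str(rows), output_dir]


# Placeholder values, for illustration only.
cmd = teragen_command(
    "hadoop-examples.jar",       # assumed jar name; it varies by distribution
    10_000_000,                  # number of rows to generate
    "/tmp/teragen-out",          # output directory
    {"mapred.map.tasks": 20},    # no spaces around "=" once rendered
)
print(shlex.join(cmd))
```

Building the argv as a list keeps each option a single token, which avoids
the stray spaces around "=" that appeared in the first attempt.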

On Wed, Dec 26, 2012 at 10:33 AM, Harsh J <ha...@cloudera.com> wrote:

> The MR1 teragen's mappers # depends on the total number of rows and
> demanded # of maps.
>
> How are you passing -Dmapred.map.tasks=20 (no spaces) exactly? All
> generic options must go in before any other options do, so it should
> appear right after the word "teragen" in your command.
>
> On Wed, Dec 26, 2012 at 11:49 PM, anil gupta <an...@gmail.com>
> wrote:
> > Hi All,
> >
> > I have 5 worker nodes and i have 4 map slots per node. So, i have 20 map
> > slots in my cluster. But when, i start my Teragen job, it only spawns 2
> > mappers for entire job. I have even tried using the option
> > -Dmapred.map.tasks = 20 . Can anyone tell me how to force teragen to use 20
> > mappers for generating the data? I am using cdh4.1.2 with Mapreducev1(Hadoop
> > 0.20.2)
> > --
> > Thanks & Regards,
> > Anil Gupta
>
>
>
> --
> Harsh J
>



-- 
Thanks & Regards,
Anil Gupta

Re: Setting number of mappers in Teragen

Posted by Harsh J <ha...@cloudera.com>.
The number of mappers in MR1 teragen depends on the total number of rows
and the requested number of maps.

How are you passing -Dmapred.map.tasks=20 (no spaces) exactly? All
generic options must go in before any other options do, so it should
appear right after the word "teragen" in your command.
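
Why the position matters can be sketched with a toy parser in Python. This
is an illustrative model under the assumption that generic-option parsing
stops at the first argument that is not a -D option; it is not Hadoop's
GenericOptionsParser, and split_generic_options is a hypothetical name:

```python
def split_generic_options(args):
    """Toy model: consume leading -Dkey=value options and stop at the
    first argument that is not one. Returns (properties, remaining_args)."""
    conf = {}
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-D" and i + 1 < len(args):
            pair = args[i + 1]        # form: -D key=value
            i += 2
        elif arg.startswith("-D") and len(arg) > 2:
            pair = arg[2:]            # form: -Dkey=value
            i += 1
        else:
            break                     # first positional argument ends parsing
        key, _, value = pair.partition("=")
        conf[key] = value
    return conf, args[i:]


# Option first: it is recognised as a configuration property.
early_conf, early_rest = split_generic_options(
    ["-Dmapred.map.tasks=20", "10000000", "/tmp/teragen-out"])

# Option last: it is never parsed and stays among the program's own arguments.
late_conf, late_rest = split_generic_options(
    ["10000000", "/tmp/teragen-out", "-Dmapred.map.tasks=20"])

print(early_conf, early_rest)
print(late_conf, late_rest)
```

In the second call the property never reaches the configuration, so the job
would run with whatever map count is already configured.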

On Wed, Dec 26, 2012 at 11:49 PM, anil gupta <an...@gmail.com> wrote:
> Hi All,
>
> I have 5 worker nodes and i have 4 map slots per node. So, i have 20 map
> slots in my cluster. But when, i start my Teragen job, it only spawns 2
> mappers for entire job. I have even tried using the option
> -Dmapred.map.tasks = 20 . Can anyone tell me how to force teragen to use 20
> mappers for generating the data? I am using cdh4.1.2 with Mapreducev1(Hadoop
> 0.20.2)
> --
> Thanks & Regards,
> Anil Gupta



-- 
Harsh J
