Posted to user@hadoop.apache.org by jamal sasha <ja...@gmail.com> on 2012/11/20 20:38:54 UTC

number of reducers

Hi,

  I wrote a simple MapReduce job in Hadoop Streaming.

I am wondering if I am doing something wrong.

The number of mappers is projected to be around 1700, but there is just 1 reducer?

It's a couple of TB worth of data.

What can I do to address this?

Basically, the mapper looks like this:

import sys
for line in sys.stdin:
    sys.stdout.write(line)  # pass each input line through unchanged

Reducer:

import sys
for line in sys.stdin:
    new_line = process_line(line)  # process_line() is defined elsewhere in the script
    print(new_line)





Thanks

Re: number of reducers

Posted by Harsh J <ha...@cloudera.com>.
Hey Jamal,

I'd recommend first going over the whole tutorial to get a good grip
on how Hadoop MR is designed to work:
http://hadoop.apache.org/docs/stable/mapred_tutorial.html




-- 
Harsh J

Re: number of reducers

Posted by al...@aim.com.
 What is the relationship between the number of reducers and CPU cores in your setup? I read somewhere that it must be 0.5 times the number of CPU cores.

Thanks.
Alex.
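
For what it's worth, the MapReduce tutorial linked elsewhere in this thread gives a rule of thumb based on reduce slots rather than raw CPU cores: roughly 0.95 or 1.75 times (number of nodes * mapred.tasktracker.reduce.tasks.maximum). A minimal sketch of that arithmetic follows; the function name and the cluster figures in the note after it are made up for illustration and are not from anyone's actual setup.

```python
def suggested_reduce_tasks(nodes, reduce_slots_per_node, factor=0.95):
    """Rule-of-thumb reducer count: factor * total reduce slots.

    factor=0.95 aims to let all reduces start as soon as the maps finish;
    factor=1.75 aims to let faster nodes run a second wave of reduces.
    """
    if nodes < 1 or reduce_slots_per_node < 1:
        raise ValueError("nodes and reduce_slots_per_node must be positive")
    total_slots = nodes * reduce_slots_per_node
    # Never suggest fewer than one reducer.
    return max(1, int(total_slots * factor))
```

For a hypothetical cluster of 10 nodes with 2 reduce slots each, this suggests 19 reducers with the 0.95 factor and 35 with the 1.75 factor. Treat the result as a starting point to tune, not a fixed rule.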

 

 

-----Original Message-----
From: Kartashov, Andy <An...@mpac.ca>
To: user <us...@hadoop.apache.org>; bejoy.hadoop <be...@gmail.com>
Sent: Tue, Nov 20, 2012 1:51 pm
Subject: RE: number of reducers



I specify mine inside mapred-site.xml
 
<property>
    <name>mapred.reduce.tasks</name>
    <value>20</value>
</property>
 
Rgds,
AK47

From: Bejoy KS [mailto:bejoy.hadoop@gmail.com]
Sent: Tuesday, November 20, 2012 3:10 PM
To: user@hadoop.apache.org
Subject: Re: number of reducers

 
Hi Sasha

By default the number of reducers is set to 1. If you want more, you need to specify it, for example:

hadoop jar myJar.jar myClass -D mapred.reduce.tasks=20 ...

Regards
Bejoy KS

Sent from handheld, please excuse typos.




NOTICE: This e-mail message and any attachments are confidential, subject to copyright and may be privileged. Any unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient, please delete and contact the sender immediately. Please consider the environment before printing this e-mail. AVIS : le présent courriel et toute pièce jointe qui l'accompagne sont confidentiels, protégés par le droit d'auteur et peuvent être couverts par le secret professionnel. Toute utilisation, copie ou divulgation non autorisée est interdite. Si vous n'êtes pas le destinataire prévu de ce courriel, supprimez-le et contactez immédiatement l'expéditeur. Veuillez penser à l'environnement avant d'imprimer le présent courriel
 

RE: number of reducers

Posted by "Kartashov, Andy" <An...@mpac.ca>.
I specify mine inside mapred-site.xml

<property>
    <name>mapred.reduce.tasks</name>
    <value>20</value>
</property>
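
To see what a setting like this changes, here is a rough sketch of how map output keys get spread across reducers. Hadoop's default partitioner assigns each key to (hash of key) mod (number of reduce tasks); the sketch uses zlib.crc32 as a stand-in hash, so the partition numbers will not match what a real cluster computes. The function names and sample keys are made up for illustration.

```python
import zlib
from collections import Counter

def partition_for(key, num_reduce_tasks):
    """Pick a reducer index for a key: stable hash modulo reducer count.

    zlib.crc32 is a stand-in for Hadoop's own key hash (assumption).
    """
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks

def partition_counts(keys, num_reduce_tasks):
    """Count how many keys land on each reducer index."""
    return Counter(partition_for(k, num_reduce_tasks) for k in keys)

if __name__ == "__main__":
    sample_keys = ["line-%d" % i for i in range(1000)]  # hypothetical keys
    # With one reduce task, every key maps to index 0.
    print(sorted(partition_counts(sample_keys, 1).items()))
    # With 20 reduce tasks, the keys are split across indices 0..19.
    print(sorted(partition_counts(sample_keys, 20).items()))
```

With a single reduce task every key lands on reducer 0, which is why one process ends up handling the whole dataset; raising the count splits the same keys across several reducers.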

Rgds,
AK47

From: Bejoy KS [mailto:bejoy.hadoop@gmail.com]
Sent: Tuesday, November 20, 2012 3:10 PM
To: user@hadoop.apache.org
Subject: Re: number of reducers

Hi Sasha

By default the number of reducers is set to 1. If you want more, you need to specify it, for example:

hadoop jar myJar.jar myClass -D mapred.reduce.tasks=20 ...

Regards
Bejoy KS

Sent from handheld, please excuse typos.

Re: number of reducers

Posted by jamal sasha <ja...@gmail.com>.
Awesome, thanks. Works great now.

On Tuesday, November 20, 2012, Bejoy KS <be...@gmail.com> wrote:
> Hi Sasha
>
> By default the number of reducers is set to 1. If you want more, you need to specify it, for example:
>
> hadoop jar myJar.jar myClass -D mapred.reduce.tasks=20 ...
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.

Re: number of reducers

Posted by Bejoy KS <be...@gmail.com>.
Hi Sasha

By default the number of reducers is set to 1. If you want more, you need to specify it, for example:

hadoop jar myJar.jar myClass -D mapred.reduce.tasks=20 ...
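
Since the original job uses Hadoop Streaming, the same property can be passed to the streaming jar as a generic option, which has to come before the streaming-specific options such as -input and -mapper. A minimal sketch that assembles such a command line is below; the function name, jar path, HDFS paths, and script names are placeholders, not values from this thread.

```python
def build_streaming_command(streaming_jar, input_path, output_path,
                            mapper, reducer, num_reducers):
    """Return the argv list for a Hadoop Streaming job.

    Generic options (-D ...) are placed before the streaming options.
    """
    if num_reducers < 0:
        raise ValueError("num_reducers must be >= 0")
    return [
        "hadoop", "jar", streaming_jar,
        "-D", "mapred.reduce.tasks=%d" % num_reducers,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
        # Ship the local scripts to the cluster with the job.
        "-file", mapper,
        "-file", reducer,
    ]

if __name__ == "__main__":
    cmd = build_streaming_command(
        "/path/to/hadoop-streaming.jar",      # placeholder; location varies by install
        "/user/me/input", "/user/me/output",  # hypothetical HDFS paths
        "mapper.py", "reducer.py", 20)
    print(" ".join(cmd))
```

The returned list can be handed to subprocess.call on a machine where the hadoop command is on the PATH, or printed and pasted into a shell.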


Regards
Bejoy KS

Sent from handheld, please excuse typos.


