Posted to mapreduce-user@hadoop.apache.org by Abdul Navaz <na...@gmail.com> on 2014/12/12 07:36:39 UTC

Where the output of mappers are saved ?

Hello,


I am interested in efficiently managing Hadoop shuffle traffic and
utilizing the network bandwidth effectively. To do this, I want to know how
much shuffle traffic is generated by each DataNode. Shuffle traffic is
essentially the output of the mappers, so where is this mapper output saved?
How can I get the size of the mapper output on each DataNode in real time?
I would appreciate your help.

Thanks & Regards,

Abdul Navaz




Re: Where the output of mappers are saved ?

Posted by Abdul Navaz <na...@gmail.com>.
Hello,

According to the Hadoop documentation, mapper output is not saved in HDFS. It
is saved on temporary local disk, in a location we can change using
mapred.local.dir in the mapred-site.xml file. I was able to see the mapper
output in this directory; once the job is done, the data is deleted from this
temporary directory because it is no longer needed.
My question was how to get the size of this mapper output. Anyhow, I have
figured it out now: executing "du -ah" in that directory while the MapReduce
job is running gives me the size of every file and directory, including
subdirectories.
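
For anyone who wants a single number instead of the full "du" listing, here is a minimal Python sketch of the same measurement. It assumes that the intermediate map output files are named file.out and sit somewhere below the mapred.local.dir directory; that file name is my assumption about the MRv1 TaskTracker layout and is not confirmed in this thread, so adjust it for your cluster. The function name map_output_bytes is made up for illustration.

```python
import os


def map_output_bytes(local_dir, filename="file.out"):
    """Sum the sizes (in bytes) of all files called `filename` under local_dir.

    `filename` is an assumed name for the intermediate map output file;
    change it if your Hadoop version uses a different one.
    """
    total = 0
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            if name != filename:
                continue
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file may be cleaned up between the directory listing
                # and the size lookup while the job is running; skip it.
                pass
    return total


if __name__ == "__main__":
    # Path taken from the mapred.local.dir value quoted in this thread.
    print(map_output_bytes("/app/hadoop/tmp/myoutput"))
```

Like "du", this only gives a snapshot, so it has to be run repeatedly on each node while the job is active.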


Thanks & Regards,

Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX



From:  "bit1129@163.com" <bi...@163.com>
Reply-To:  <us...@hadoop.apache.org>
Date:  Tuesday, December 16, 2014 at 2:12 AM
To:  user <us...@hadoop.apache.org>
Subject:  Re: Re: Where the output of mappers are saved ?

Thanks Susheel !, understood.


bit1129@163.com
>  
> From: Susheel Kumar Gadalay <ma...@gmail.com>
> Date: 2014-12-16 15:27
> To: user <ma...@hadoop.apache.org>
> Subject: Re: Re: Where the output of mappers are saved ?
> I don't think so. It will be a single output file per reducer.
>  
> If u want multiple small size output files then specify the number of
> reducers in the job configuration.
>  
> On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
>> > Thanks Susheel!!
>> > One more question.. If  part-r-XXXX is extremely large,say, 2G, will the
>> > file be splitted into more files under the output directory,that is, one
>> > reducer could product more than one files.
>> >
>> >
>> >
>> > bit1129@163.com
>> >
>> > From: Susheel Kumar Gadalay
>> > Date: 2014-12-16 14:17
>> > To: user
>> > Subject: Re: Re: Where the output of mappers are saved ?
>> > Yes, the map outputs will be cleaned on job completion.
>> >
>> > If u want to see the map outputs give number of reducers as zero
>> > and verify the files part-m-0000, part-m-0001....
>> >
>> > On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
>>> >> Do they only exist during the map/reduce process and will be removed
>>> >> after
>>> >> the MR finished?
>>> >>
>>> >> When the reduce finished,I only see  part-m-0000, part-m-0001 ...., which
>>> >> are reduce results.
>>> >>
>>> >>
>>> >>
>>> >> bit1129@163.com
>>> >>
>>> >> From: Susheel Kumar Gadalay
>>> >> Date: 2014-12-16 13:05
>>> >> To: user
>>> >> Subject: Re: Where the output of mappers are saved ?
>>> >> Map outputs will be in hdfs under your user name and output directory.
>>> >>
>>> >> They will have name like part-m-0000, part-m-0001 ....
>>> >>
>>> >>
>>> >> On 12/16/14, Abdul Navaz <na...@gmail.com> wrote:
>>>> >>> Hello,
>>>> >>>
>>>> >>>
>>>> >>> Second Try !
>>>> >>>
>>>> >>>
>>>> >>> I  have created a directory to store this mapper output as below.
>>>> >>>  <property>
>>>> >>>  <name>mapred.local.dir</name>
>>>> >>>  <value>/app/hadoop/tmp/myoutput</value>
>>>> >>>  </property>
>>>> >>> and i looked at
>>>> >>>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>>>> >>>  total 16
>>>> >>>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>>>> >>>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>>>> >>>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>>>> >>>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
>>>> >>> and i couldnot find anything here when i run the map reduce job . Where
>>>> >>> by
>>>> >>> default mapper output is saved and how can I get the size of mapper
>>>> >>> output
>>>> >>> in bytes
>>>> >>>
>>>> >>>
>>>> >>> Thanks.
>>>> >>>
>>>> >>>
>>>> >>> From:  Abdul Navaz <na...@gmail.com>
>>>> >>> Date:  Friday, December 12, 2014 at 12:36 AM
>>>> >>> To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>>> >>> Subject:  Where the output of mappers are saved ?
>>>> >>>
>>>> >>> Hello,
>>>> >>>
>>>> >>>
>>>> >>> I am interested in efficiently manage the Hadoop shuffling traffic and
>>>> >>> utilize the network bandwidth effectively. To do this I want to know how
>>>> >>> much shuffling traffic generated by each Datanodes ? Shuffling traffic is
>>>> >>> nothing but the output of mappers. So where this mapper output is saved ?
>>>> >>> How can i get the size of mapper output from each datanodes in a real time ?
>>>> >>> Appreciate your help.
>>>> >>>
>>>> >>> Thanks & Regards,
>>>> >>>
>>>> >>> Abdul Navaz
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>> >>
>> >



Re: Re: Where the output of mappers are saved ?

Posted by "bit1129@163.com" <bi...@163.com>.
Thanks, Susheel! Understood.



bit1129@163.com
 
From: Susheel Kumar Gadalay
Date: 2014-12-16 15:27
To: user
Subject: Re: Re: Where the output of mappers are saved ?
I don't think so. It will be a single output file per reducer.
 
If u want multiple small size output files then specify the number of
reducers in the job configuration.
 
On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
> Thanks Susheel!!
> One more question.. If  part-r-XXXX is extremely large,say, 2G, will the
> file be splitted into more files under the output directory,that is, one
> reducer could product more than one files.
>
>
>
> bit1129@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-16 14:17
> To: user
> Subject: Re: Re: Where the output of mappers are saved ?
> Yes, the map outputs will be cleaned on job completion.
>
> If u want to see the map outputs give number of reducers as zero
> and verify the files part-m-0000, part-m-0001....
>
> On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
>> Do they only exist during the map/reduce process and will be removed
>> after
>> the MR finished?
>>
>> When the reduce finished,I only see  part-m-0000, part-m-0001 ...., which
>> are reduce results.
>>
>>
>>
>> bit1129@163.com
>>
>> From: Susheel Kumar Gadalay
>> Date: 2014-12-16 13:05
>> To: user
>> Subject: Re: Where the output of mappers are saved ?
>> Map outputs will be in hdfs under your user name and output directory.
>>
>> They will have name like part-m-0000, part-m-0001 ....
>>
>>
>> On 12/16/14, Abdul Navaz <na...@gmail.com> wrote:
>>> Hello,
>>>
>>>
>>> Second Try !
>>>
>>>
>>> I  have created a directory to store this mapper output as below.
>>>  <property>
>>>  <name>mapred.local.dir</name>
>>>  <value>/app/hadoop/tmp/myoutput</value>
>>>  </property>
>>> and i looked at
>>>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>>>  total 16
>>>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>>>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>>>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>>>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
>>> and i couldnot find anything here when i run the map reduce job . Where
>>> by
>>> default mapper output is saved and how can I get the size of mapper
>>> output
>>> in bytes
>>>
>>>
>>> Thanks.
>>>
>>>
>>> From:  Abdul Navaz <na...@gmail.com>
>>> Date:  Friday, December 12, 2014 at 12:36 AM
>>> To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject:  Where the output of mappers are saved ?
>>>
>>> Hello,
>>>
>>>
>>> I am interested in efficiently manage the Hadoop shuffling traffic and
>>> utilize the network bandwidth effectively. To do this I want to know how
>>> much shuffling traffic generated by each Datanodes ? Shuffling traffic
>>> is
>>> nothing but the output of mappers. So where this mapper output is saved
>>> ?
>>> How can i get the size of mapper output from each datanodes in a real
>>> time
>>> ?
>>> Appreciate your help.
>>>
>>> Thanks & Regards,
>>>
>>> Abdul Navaz
>>>
>>>
>>>
>>>
>>
>

Re: Re: Where the output of mappers are saved ?

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
I don't think so. It will be a single output file per reducer.

If you want multiple smaller output files, specify a larger number of
reducers in the job configuration.
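
To illustrate why the file count follows the reducer count, here is a toy Python model, not Hadoop code. It assumes keys are assigned to reducers by "hash of key modulo number of reducers", and it uses zlib.crc32 as a stand-in for the real hash function. The function names partition and output_files are made up for this sketch.

```python
import zlib


def partition(key, num_reducers):
    """Pick a reducer index for a key: non-negative hash modulo reducer count."""
    return (zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF) % num_reducers


def output_files(keys, num_reducers):
    """Return a dict mapping each reducer's output file name to its keys.

    Every reducer gets exactly one entry, even if it received no keys,
    which models "a single output file per reducer".
    """
    files = {"part-r-%05d" % i: [] for i in range(num_reducers)}
    for key in keys:
        files["part-r-%05d" % partition(key, num_reducers)].append(key)
    return files


if __name__ == "__main__":
    keys = ["apple", "banana", "cherry", "date", "elderberry"]
    for name, assigned in sorted(output_files(keys, 3).items()):
        print(name, assigned)
```

However large the data assigned to one reducer becomes, it stays in that reducer's single file in this model; only a higher reducer count produces more files.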

On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
> Thanks Susheel!!
> One more question.. If  part-r-XXXX is extremely large,say, 2G, will the
> file be splitted into more files under the output directory,that is, one
> reducer could product more than one files.
>
>
>
> bit1129@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-16 14:17
> To: user
> Subject: Re: Re: Where the output of mappers are saved ?
> Yes, the map outputs will be cleaned on job completion.
>
> If u want to see the map outputs give number of reducers as zero
> and verify the files part-m-0000, part-m-0001....
>
> On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
>> Do they exist only during the map/reduce process, and are they removed
>> after the MR job finishes?
>>
>> When the reduce finished, I only saw part-m-0000, part-m-0001 ..., which
>> are the reduce results.
>>
>>
>>
>> bit1129@163.com
>>
>> From: Susheel Kumar Gadalay
>> Date: 2014-12-16 13:05
>> To: user
>> Subject: Re: Where the output of mappers are saved ?
>> Map outputs will be in HDFS under your user name and output directory.
>>
>> They will have names like part-m-0000, part-m-0001 ....
>>
>>
>> On 12/16/14, Abdul Navaz <na...@gmail.com> wrote:
>>> Hello,
>>>
>>>
>>> Second Try !
>>>
>>>
>>> I have created a directory to store the mapper output, as below:
>>>  <property>
>>>  <name>mapred.local.dir</name>
>>>  <value>/app/hadoop/tmp/myoutput</value>
>>>  </property>
>>> and I looked at it:
>>>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>>>  total 16
>>>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>>>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>>>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>>>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
>>> but I could not find any mapper output there when I ran the MapReduce job.
>>> Where is mapper output saved by default, and how can I get its size in
>>> bytes?
>>>
>>>
>>> Thanks.
>>>
>>>
>>> From:  Abdul Navaz <na...@gmail.com>
>>> Date:  Friday, December 12, 2014 at 12:36 AM
>>> To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject:  Where the output of mappers are saved ?
>>>
>>> Hello,
>>>
>>>
>>> I am interested in efficiently managing Hadoop shuffle traffic and using
>>> the network bandwidth effectively. To do this, I want to know how much
>>> shuffle traffic each DataNode generates. Shuffle traffic is nothing but
>>> the output of the mappers, so where is this mapper output saved? How can
>>> I get the size of the mapper output from each DataNode in real time?
>>> Appreciate your help.
>>>
>>> Thanks & Regards,
>>>
>>> Abdul Navaz
>>>
>>>
>>>
>>>
>>
>

Re: Re: Where the output of mappers are saved ?

Posted by "bit1129@163.com" <bi...@163.com>.
Thanks Susheel!!
One more question: if part-r-XXXX is extremely large, say 2 GB, will the file be split into several files under the output directory? That is, could one reducer produce more than one file?



bit1129@163.com
 
From: Susheel Kumar Gadalay
Date: 2014-12-16 14:17
To: user
Subject: Re: Re: Where the output of mappers are saved ?
Yes, the map outputs are cleaned up on job completion.

If you want to see the map outputs, set the number of reducers to zero
and check the files part-m-0000, part-m-0001 ....
 
On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
> Do they exist only during the map/reduce process, and are they removed after
> the MR job finishes?
>
> When the reduce finished, I only saw part-m-0000, part-m-0001 ..., which
> are the reduce results.
>
>
>
> bit1129@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-16 13:05
> To: user
> Subject: Re: Where the output of mappers are saved ?
> Map outputs will be in HDFS under your user name and output directory.
>
> They will have names like part-m-0000, part-m-0001 ....
>
>
> On 12/16/14, Abdul Navaz <na...@gmail.com> wrote:
>> Hello,
>>
>>
>> Second Try !
>>
>>
>> I have created a directory to store the mapper output, as below:
>>  <property>
>>  <name>mapred.local.dir</name>
>>  <value>/app/hadoop/tmp/myoutput</value>
>>  </property>
>> and I looked at it:
>>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>>  total 16
>>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
>> but I could not find any mapper output there when I ran the MapReduce job.
>> Where is mapper output saved by default, and how can I get its size in
>> bytes?
>>
>>
>> Thanks.
>>
>>
>> From:  Abdul Navaz <na...@gmail.com>
>> Date:  Friday, December 12, 2014 at 12:36 AM
>> To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject:  Where the output of mappers are saved ?
>>
>> Hello,
>>
>>
>> I am interested in efficiently managing Hadoop shuffle traffic and using
>> the network bandwidth effectively. To do this, I want to know how much
>> shuffle traffic each DataNode generates. Shuffle traffic is nothing but
>> the output of the mappers, so where is this mapper output saved? How can
>> I get the size of the mapper output from each DataNode in real time?
>> Appreciate your help.
>>
>> Thanks & Regards,
>>
>> Abdul Navaz
>>
>>
>>
>>
>

Re: Re: Where the output of mappers are saved ?

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
Yes, the map outputs are cleaned up on job completion.

If you want to see the map outputs, set the number of reducers to zero
and check the files part-m-0000, part-m-0001 ....
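
Once a zero-reducer job has written its part-m files, their sizes can be totalled with a few lines of Python. This is only a sketch under assumptions: it reads a local directory (for example, a copy of the job output directory), and `map_output_sizes` and the demo file names are hypothetical names made up for this example.

```python
import os
import re
import tempfile

# Matches names such as part-m-0000 or part-m-00000.
PART_M = re.compile(r"^part-m-\d+$")


def map_output_sizes(output_dir):
    """Return {file_name: size_in_bytes} for part-m-* files directly under output_dir."""
    sizes = {}
    for name in sorted(os.listdir(output_dir)):
        path = os.path.join(output_dir, name)
        if PART_M.match(name) and os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


# Demo on a throwaway directory holding made-up files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, payload in [("part-m-00000", b"a\t1\n"),
                          ("part-m-00001", b"bb\t2\n"),
                          ("_SUCCESS", b"")]:
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(payload)
    sizes = map_output_sizes(demo_dir)
    print(sizes, "total bytes:", sum(sizes.values()))
```

Files that do not match the part-m pattern, such as marker files, are ignored, so the total reflects only the map output files.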

On 12/16/14, bit1129@163.com <bi...@163.com> wrote:
> Do they exist only during the map/reduce process, and are they removed after
> the MR job finishes?
>
> When the reduce finished, I only saw part-m-0000, part-m-0001 ..., which
> are the reduce results.
>
>
>
> bit1129@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-16 13:05
> To: user
> Subject: Re: Where the output of mappers are saved ?
> Map outputs will be in HDFS under your user name and output directory.
>
> They will have names like part-m-0000, part-m-0001 ....
>
>
> On 12/16/14, Abdul Navaz <na...@gmail.com> wrote:
>> Hello,
>>
>>
>> Second Try !
>>
>>
>> I have created a directory to store the mapper output, as below:
>>  <property>
>>  <name>mapred.local.dir</name>
>>  <value>/app/hadoop/tmp/myoutput</value>
>>  </property>
>> and I looked at it:
>>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>>  total 16
>>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
>> but I could not find any mapper output there when I ran the MapReduce job.
>> Where is mapper output saved by default, and how can I get its size in
>> bytes?
>>
>>
>> Thanks.
>>
>>
>> From:  Abdul Navaz <na...@gmail.com>
>> Date:  Friday, December 12, 2014 at 12:36 AM
>> To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject:  Where the output of mappers are saved ?
>>
>> Hello,
>>
>>
>> I am interested in efficiently managing Hadoop shuffle traffic and using
>> the network bandwidth effectively. To do this, I want to know how much
>> shuffle traffic each DataNode generates. Shuffle traffic is nothing but
>> the output of the mappers, so where is this mapper output saved? How can
>> I get the size of the mapper output from each DataNode in real time?
>> Appreciate your help.
>>
>> Thanks & Regards,
>>
>> Abdul Navaz
>>
>>
>>
>>
>

Re: Re: Where the output of mappers are saved ?

Posted by "bit1129@163.com" <bi...@163.com>.
Do they exist only during the map/reduce process, and are they removed after the MR job finishes?

When the reduce finished, I only saw part-m-0000, part-m-0001 ..., which are the reduce results.



bit1129@163.com
 
From: Susheel Kumar Gadalay
Date: 2014-12-16 13:05
To: user
Subject: Re: Where the output of mappers are saved ?
Map outputs will be in HDFS under your user name and output directory.

They will have names like part-m-0000, part-m-0001 ....
 
 
On 12/16/14, Abdul Navaz <na...@gmail.com> wrote:
> Hello,
>
>
> Second Try !
>
>
> I have created a directory to store the mapper output, as below:
>  <property>
>  <name>mapred.local.dir</name>
>  <value>/app/hadoop/tmp/myoutput</value>
>  </property>
> and I looked at it:
>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>  total 16
>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
> but I could not find any mapper output there when I ran the MapReduce job.
> Where is mapper output saved by default, and how can I get its size in
> bytes?
>
>
> Thanks.
>
>
> From:  Abdul Navaz <na...@gmail.com>
> Date:  Friday, December 12, 2014 at 12:36 AM
> To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject:  Where the output of mappers are saved ?
>
> Hello,
>
>
> I am interested in efficiently managing Hadoop shuffle traffic and using
> the network bandwidth effectively. To do this, I want to know how much
> shuffle traffic each DataNode generates. Shuffle traffic is nothing but
> the output of the mappers, so where is this mapper output saved? How can
> I get the size of the mapper output from each DataNode in real time?
> Appreciate your help.
>
> Thanks & Regards,
>
> Abdul Navaz
>
>
>
>

Re: Where the output of mappers are saved ?

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
Map outputs will be in HDFS under your user name and output directory.

They will have names like part-m-0000, part-m-0001 ....


On 12/16/14, Abdul Navaz <na...@gmail.com> wrote:
> Hello,
>
>
> Second Try !
>
>
> I  have created a directory to store this mapper output as below.
>  <property>
>  <name>mapred.local.dir</name>
>  <value>/app/hadoop/tmp/myoutput</value>
>  </property>
> and i looked at
>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>  total 16
>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
> and i couldnot find anything here when i run the map reduce job . Where by
> default mapper output is saved and how can I get the size of mapper output
> in bytes
>
>
> Thanks.
>
>
> From:  Abdul Navaz <na...@gmail.com>
> Date:  Friday, December 12, 2014 at 12:36 AM
> To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject:  Where the output of mappers are saved ?
>
> Hello,
>
>
> I am interested in efficiently manage the Hadoop shuffling traffic and
> utilize the network bandwidth effectively. To do this I want to know how
> much shuffling traffic generated by each Datanodes ? Shuffling traffic is
> nothing but the output of mappers. So where this mapper output is saved ?
> How can i get the size of mapper output from each datanodes in a real time
> ?
> Appreciate your help.
>
> Thanks & Regards,
>
> Abdul Navaz
>
>
>
>

Re: Where the output of mappers are saved ?

Posted by Abdul Navaz <na...@gmail.com>.
Hello,


Second try!


I have created a directory to store the mapper output, configured as below:
 <property>
 <name>mapred.local.dir</name>
 <value>/app/hadoop/tmp/myoutput</value>
 </property>       
and I looked in that directory:
 hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
 total 16
 drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
 drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
 drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
 drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
but I could not find anything here when I run the MapReduce job. Where is
mapper output saved by default, and how can I get the size of the mapper
output in bytes?
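
One way to get a byte count is to sum the sizes of the files under the
configured local directory while the job is still running, since the
intermediate files are removed once the job is done. Below is a minimal
Python sketch of that idea, roughly what du reports for the directory. The
function name dir_size_bytes is hypothetical, and the path is just the
mapred.local.dir value from the configuration above; this is an illustration
under those assumptions, not a Hadoop API.

```python
import os


def dir_size_bytes(root):
    """Return the total size in bytes of all files found under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat counts a symlink as the link itself, not its target.
                total += os.lstat(path).st_size
            except OSError:
                # Task files can disappear mid-scan when a task finishes.
                continue
    return total


if __name__ == "__main__":
    # Assumed path: the mapred.local.dir value from the configuration above.
    local_dir = "/app/hadoop/tmp/myoutput"
    print(dir_size_bytes(local_dir))
```

Subdirectories that the current user cannot read (for example ttprivate in
the listing above) are skipped silently by os.walk, so the script would need
to run as the user that owns the Hadoop directories. Unlike du, it adds up
file lengths rather than disk blocks, so the two numbers can differ slightly.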


Thanks.


From:  Abdul Navaz <na...@gmail.com>
Date:  Friday, December 12, 2014 at 12:36 AM
To:  "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject:  Where the output of mappers are saved ?

Hello,


I am interested in efficiently managing the Hadoop shuffle traffic and
utilizing the network bandwidth effectively. To do this, I want to know how
much shuffle traffic is generated by each DataNode. Shuffle traffic is
nothing but the output of mappers. So where is this mapper output saved?
How can I get the size of the mapper output from each DataNode in real time?
I would appreciate your help.

Thanks & Regards,

Abdul Navaz



