
Posted to common-user@hadoop.apache.org by sangroya <sa...@gmail.com> on 2011/07/04 11:12:15 UTC

measure the time taken to complete map and reduce phase

Hi,

I am trying to monitor the time taken to complete the map phase and the
reduce phase in Hadoop. Is there any way to measure the time taken to
complete the map and reduce phases on a cluster?

Thanks,
Amit

--
View this message in context: http://lucene.472066.n3.nabble.com/measure-the-time-taken-to-complete-map-and-reduce-phase-tp3136991p3136991.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.

Re: measure the time taken to complete map and reduce phase

Posted by Bharath Mundlapudi <bh...@yahoo.com>.

Short answer: Job History Logs (JHL).
Long answer: JHL stores all the meta information related to the jobs that were executed, such as Job start/finish, Task start/finish, and TaskAttempt start/finish.

Job.Map_Phase = Function ( Job.Map(first).StartTime, Job.Map(last).FinishTime)
Job.Reduce_Phase = Function ( Job.Reduce(first).StartTime, Job.Reduce(last).FinishTime)
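
One possible reading of the two formulas above, as a minimal Java sketch. It assumes that the unspecified Function is simply the span from the earliest task start to the latest task finish, and that the start/finish times have already been extracted from the history log as milliseconds. The names PhaseTimes, TaskTime and phaseDurationMillis, and the sample timestamps, are made up for illustration; they are not part of Hadoop.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not a Hadoop class. */
final class PhaseTimes {

    /** Start and finish time of one task, in milliseconds. */
    static final class TaskTime {
        final long startMillis;
        final long finishMillis;

        TaskTime(long startMillis, long finishMillis) {
            this.startMillis = startMillis;
            this.finishMillis = finishMillis;
        }
    }

    /** Assumed definition of a phase's duration: latest finish minus earliest start. */
    static long phaseDurationMillis(List<TaskTime> tasks) {
        if (tasks.isEmpty()) {
            throw new IllegalArgumentException("no tasks in phase");
        }
        long firstStart = Long.MAX_VALUE;
        long lastFinish = Long.MIN_VALUE;
        for (TaskTime t : tasks) {
            firstStart = Math.min(firstStart, t.startMillis);
            lastFinish = Math.max(lastFinish, t.finishMillis);
        }
        return lastFinish - firstStart;
    }

    public static void main(String[] args) {
        // Made-up timestamps for three map tasks.
        List<TaskTime> maps = Arrays.asList(
                new TaskTime(1000L, 4000L),
                new TaskTime(1500L, 6000L),
                new TaskTime(2000L, 5000L));
        System.out.println("map phase ms: " + phaseDurationMillis(maps));
    }
}
```

The same method can be called with the reduce tasks' times. Note that the two spans can overlap, since reduce tasks may be launched before the last map finishes.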
  

-Bharath




Re: measure the time taken to complete map and reduce phase

Posted by "real great.." <gr...@gmail.com>.

Hey Amit,
it shows up in the web interface too.



-- 
Regards,
R.V.

Re: measure the time taken to complete map and reduce phase

Posted by Alberto Andreotti <al...@gmail.com>.

Hi Amit,

I think you can just measure how much it takes to complete the job. I mean,
from submission until it's done. This would require just regular Java calls.
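
A minimal sketch of that idea in plain Java. It assumes the job is submitted through some blocking call (such as JobClient.runJob(conf) in the old mapred API), which is represented here by a Runnable so the example runs without Hadoop. JobTimer and timeMillis are hypothetical names, and the sleep is only a stand-in for the job.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not a Hadoop class. */
final class JobTimer {

    /** Runs the given blocking action and returns its wall-clock duration in milliseconds. */
    static long timeMillis(Runnable blockingAction) {
        long start = System.nanoTime();
        blockingAction.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-in for a blocking job submission; a real driver would submit
        // the job here and wait for it to finish.
        long elapsed = timeMillis(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("job took " + elapsed + " ms");
    }
}
```

This yields one number for the whole job, from submission to completion; it does not by itself separate the map phase from the reduce phase.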

Alberto.




-- 
José Pablo Alberto Andreotti.
Tel: 54 351 4730292
Móvil: 54351156526363.
MSN: albertoandreotti@gmail.com
Skype: andreottialberto

Re: measure the time taken to complete map and reduce phase

Posted by madhu phatak <ph...@gmail.com>.

The console will tell you how much time the job took.

Re: Re: measure the time taken to complete map and reduce phase

Posted by sangroya <sa...@gmail.com>.

Hi,

Thanks for the response!

I have the following questions regarding the Job History file.

I want to know what TOTAL_MAPS in the job history represents.

Also, does FINISHED_MAPS represent TOTAL_MAPS or (TOTAL_MAPS -
FAILED_MAPS)?

Does FINISHED_MAPS represent the successfully executed maps?

I have the same questions for REDUCE tasks.

Thanks,
Amit




Re: Re: Re: measure the time taken to complete map and reduce phase

Posted by "hailong.yang1115" <ha...@gmail.com>.

I think TOTAL_MAPS has the same meaning as FINISHED_MAPS, which represents the total number of map tasks successfully executed. There is, however, another metric named Launched_map_tasks, which is the sum of FINISHED_MAPS, FAILED_MAPS and KILLED_MAPS.

And it is the same for Reduce tasks.
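
To check these counts on your own history files, here is a minimal Java sketch. It assumes that a history line carries its attributes as KEY="value" pairs and that values contain no escaped quotes; the sample line is invented for illustration, and HistoryLine, parse and count are hypothetical names, not Hadoop APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not a Hadoop class. */
final class HistoryLine {

    // Assumed attribute syntax: KEY="value", keys in upper case with underscores.
    private static final Pattern PAIR = Pattern.compile("([A-Z_]+)=\"([^\"]*)\"");

    /** Collects every KEY="value" pair found on one line. */
    static Map<String, String> parse(String line) {
        Map<String, String> attrs = new HashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            attrs.put(m.group(1), m.group(2));
        }
        return attrs;
    }

    /** Reads a numeric attribute, treating a missing key as 0. */
    static int count(Map<String, String> attrs, String key) {
        String value = attrs.get(key);
        return value == null ? 0 : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        // Invented sample line, for illustration only.
        String line = "Job JOBID=\"job_201107062010_0759\" FINISHED_MAPS=\"4\" "
                + "FAILED_MAPS=\"1\" FINISHED_REDUCES=\"1\" FAILED_REDUCES=\"0\"";
        Map<String, String> attrs = parse(line);
        System.out.println("finished maps: " + count(attrs, "FINISHED_MAPS"));
        System.out.println("failed maps:   " + count(attrs, "FAILED_MAPS"));
    }
}
```

Comparing the parsed FINISHED_MAPS and FAILED_MAPS values with TOTAL_MAPS for a few real jobs is one way to confirm how the fields relate on a given Hadoop version.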


Cheers!

Hailong

2011-07-08 



***********************************************
* Hailong Yang, PhD. Candidate 
* Sino-German Joint Software Institute, 
* School of Computer Science&Engineering, Beihang University
* Phone: (86-010)82315908
* Email: hailong.yang1115@gmail.com
* Address: G413, New Main Building in Beihang University, 
*              No.37 XueYuan Road,HaiDian District, 
*              Beijing,P.R.China,100191
***********************************************




Re: Re: measure the time taken to complete map and reduce phase

Posted by "hailong.yang1115" <ha...@gmail.com>.

Hi sangroya,

I think you may be interested in reading the following piece of code from JobHistory.java in Hadoop.

    /**
     * Generates the job history filename for a new job
     */
    private static String getNewJobHistoryFileName(JobConf jobConf, JobID id) {
      return JOBTRACKER_UNIQUE_STRING
             + id.toString() + "_" + getUserName(jobConf) + "_" 
             + trimJobName(getJobName(jobConf));
    }
    
    /**
     * Trims the job-name if required
     */
    private static String trimJobName(String jobName) {
      if (jobName.length() > JOB_NAME_TRIM_LENGTH) {
        jobName = jobName.substring(0, JOB_NAME_TRIM_LENGTH);
      }
      return jobName;
    }
    
Roughly speaking, the history file name is composed in the following way:

hostname of JT + "_" + start time of JT + "_" + job id + "_" + user name + "_" + trimmed job name
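
Going the other way, a file name of this shape can be split back into its parts. The following is a minimal Java sketch under stated assumptions: the job id looks like job_<digits>_<digits> and the user name contains no underscore, neither of which is guaranteed by the code above. HistoryFileName and its fields are hypothetical names, not part of Hadoop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not a Hadoop class. */
final class HistoryFileName {

    // hostname _ JT start time _ job id _ user name _ (trimmed) job name.
    // Assumes job ids of the form job_<digits>_<digits> and user names
    // without underscores.
    private static final Pattern NAME =
            Pattern.compile("^(.+)_(\\d+)_(job_\\d+_\\d+)_([^_]+)_(.*)$");

    final String jobTrackerHost;
    final long jobTrackerStartTime;
    final String jobId;
    final String user;
    final String jobName;

    private HistoryFileName(Matcher m) {
        jobTrackerHost = m.group(1);
        jobTrackerStartTime = Long.parseLong(m.group(2));
        jobId = m.group(3);
        user = m.group(4);
        jobName = m.group(5);
    }

    /** Splits a history file name into its components, or throws if it does not fit the assumed shape. */
    static HistoryFileName parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("unexpected history file name: " + fileName);
        }
        return new HistoryFileName(m);
    }

    public static void main(String[] args) {
        // File name taken from the question in this thread.
        HistoryFileName f = parse(
                "localhost_1309975809398_job_201107062010_0759_sangroya_word+count");
        System.out.println(f.jobTrackerHost + " | " + f.jobTrackerStartTime + " | "
                + f.jobId + " | " + f.user + " | " + f.jobName);
    }
}
```

The job id is matched explicitly because it contains underscores itself, so a plain split on "_" would cut it into pieces.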

Cheers!

Hailong

2011-07-07 



***********************************************
* Hailong Yang, PhD. Candidate 
* Sino-German Joint Software Institute, 
* School of Computer Science&Engineering, Beihang University
* Phone: (86-010)82315908
* Email: hailong.yang1115@gmail.com
* Address: G413, New Main Building in Beihang University, 
*              No.37 XueYuan Road,HaiDian District, 
*              Beijing,P.R.China,100191
***********************************************




Re: measure the time taken to complete map and reduce phase

Posted by sangroya <sa...@gmail.com>.

Hi,

Thanks!

I am able to parse the Job History Logs (JHL). But I need to know how
Hadoop assigns a name to a file in the Job History Logs.

I can see that files are named like this on my local single-node cluster:

localhost_1309975809398_job_201107062010_0759_sangroya_word+count.

I am just wondering what the exact pattern is for naming every file
like this.

Best Regards,
Amit

On Tue, Jul 5, 2011 at 6:53 AM, Hailong [via Lucene]
<ml...@n3.nabble.com> wrote:
> Hi sangroya,
>
> You can look at the job administration portal on port 50030 of your
> JobTracker, for example 'http://localhost:50030'. At the bottom of the
> web page there is an item named 'Job Tracker History'; click into it and
> find your job by its job id. That is where you will find the information
> you want.
>
>
> Cheers!
>
> Hailong
>
> 2011-07-05