Posted to common-dev@hadoop.apache.org by achilles852 <fa...@gmail.com> on 2009/09/30 02:57:35 UTC

last map task taking too long

Hey, I am trying to write a small MapReduce program. I launch a few map
tasks, each of which should complete within a certain time (say 5 minutes).
All the tasks finish within 5 minutes except the last one, which takes
around 10 times longer than the other map tasks. Any idea why this is
happening?

I am using Hadoop version 0.19.2, and have tried running it both locally and
on EC2.
-- 
View this message in context: http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.


RE: last map task taking too long

Posted by achilles852 <fa...@gmail.com>.
Hi,
The input is a plain text file. I use the parameters specified in the input
file to launch a process on each machine and then collect the results back.
I am not using cached files; everything needed is contained in the job jar
file. Each map task is supposed to finish within one minute.

Here is the output from the reduce phase, where things get stuck. This run
is in pseudo-distributed mode:

[code]
2009-09-30 06:27:38,601 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 2 map output(s) where 0 is
already in progress
2009-09-30 06:27:38,603 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:27:38,603 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:28:33,623 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 1 new map-outputs
2009-09-30 06:28:33,624 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 1 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:28:33,628 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling
2 bytes (6 raw bytes) into RAM from attempt_200909292242_0017_m_000007_0
2009-09-30 06:28:33,628 INFO org.apache.hadoop.mapred.ReduceTask: Read 2
bytes from map-output for attempt_200909292242_0017_m_000007_0
2009-09-30 06:28:33,629 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1
from attempt_200909292242_0017_m_000007_0 -> (-1, -1) from pc01
2009-09-30 06:28:40,624 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:28:40,625 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:28:40,626 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:29:40,639 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:29:40,640 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:29:40,641 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:30:40,655 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:30:40,657 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:30:40,657 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:31:40,677 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:31:40,679 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:31:40,679 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:32:40,692 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:32:40,693 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:32:40,694 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:33:40,708 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:33:40,710 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:33:40,710 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:34:40,731 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:34:40,733 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:34:40,733 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:35:40,753 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:35:40,755 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 1 new map-outputs
[/code]
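A mapper of the shape described above (read one parameter line, launch an
external process, collect the result back) might look roughly like the
following. This is a hypothetical sketch against the old
org.apache.hadoop.mapred API, not the code from this job; the detail worth
noting is the reporter.progress() call while the child process runs, since an
attempt that reports no progress for mapred.task.timeout (10 minutes by
default) is declared failed and re-run.

[code]
// Hypothetical sketch only, not the poster's code. Assumes the old
// org.apache.hadoop.mapred API (Hadoop 0.19) and one parameter line per record.
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ProcessLaunchMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // Launch the external process described by this input line.
    Process p = new ProcessBuilder("/bin/sh", "-c", value.toString()).start();
    try {
      while (true) {
        try {
          int exit = p.exitValue();      // throws until the process has exited
          output.collect(value, new Text("exit=" + exit));
          return;
        } catch (IllegalThreadStateException stillRunning) {
          reporter.progress();           // keep the TaskTracker from timing us out
          Thread.sleep(1000);
        }
      }
    } catch (InterruptedException e) {
      throw new IOException(e.toString());
    }
  }
}
[/code]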




Amogh Vasekar-2 wrote:
> 
> Hi,
> Can you provide info on the input like compression etc? Also, are you
> using cached files in your map tasks? It might be helpful if you paste the
> logs here after blanking your system specific info., as then one can find
> out where till the reduce it went or if the copy phase started at all.
> 
> Thanks,
> Amogh
> 
> -----Original Message-----
> From: achilles852 [mailto:faheemkhan@gmail.com] 
> Sent: Wednesday, September 30, 2009 6:38 AM
> To: core-dev@hadoop.apache.org
> Subject: Re: last map task taking too long
> 
> 
> Basically, it finishes what it is supposed to do (I view the logs to find
> out), but does not move onto the reduce stage.
> 
> 
> Ted Dunning wrote:
>> 
>> Is that last map task actually running, or is it pending?
>> 
>> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com>
>> wrote:
>> 
>>>
>>> Hey.. I am trying to write a small mapreduce program. I launch a few map
>>> tasks, each of which should complete within a certain time (say 5
>>> minutes)... all the tasks complete within 5 minutes except the last one
>>> -
>>> which takes around 10 times more the time taken by all other map
>>> tasks.....any idea why this is happening?
>>>
>>> I am using Hadoop version 0.19.2, tried running it locally as well as on
>>> EC2.
>>> --
>>> View this message in context:
>>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
>>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>>>
>>>
>> 
>> 
>> -- 
>> Ted Dunning, CTO
>> DeepDyve
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.html
> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
> 
> 
> 
-- 
View this message in context: http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25675439.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.


Re: last map task taking too long

Posted by achilles852 <fa...@gmail.com>.
Yes, I have tried everything I could think of, but nothing has worked so far. I
even set the number of reducers to zero, just to be sure. What I don't
understand is why, if all of the map tasks run the same code, the first few
complete normally while the last one takes ages.
Do you think using a different version of Hadoop would help?
Any help would be highly appreciated.
Thanks.
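
A hypothetical driver sketch of those two settings (zero reducers,
speculative map attempts turned off) against the old JobConf API is below;
the speculative flag is worth checking because the JobTracker log quoted
further down schedules second attempts such as
attempt_200909300419_0001_m_000002_1, which is what speculation produces for
maps it considers slow. The class names are made up, and the mapper class
sketched earlier in the thread is reused here only as a placeholder.

[code]
// Hypothetical driver sketch (class and path names are made up), using the
// old JobConf API from Hadoop 0.19.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class LaunchJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(LaunchJob.class);
    conf.setJobName("launch-processes");

    conf.setMapperClass(ProcessLaunchMapper.class); // placeholder mapper class
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);

    // Map-only job: mapper output is written directly, no shuffle or reduce.
    conf.setNumReduceTasks(0);

    // Do not launch duplicate (speculative) attempts of maps that look slow.
    conf.setMapSpeculativeExecution(false);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}
[/code]

With speculation off, any extra attempt of the last map would have to come
from an actual failure rather than from the scheduler second-guessing a
deliberately long-running task.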


thushw wrote:
> 
> did you look for network-related errors in the logs. i've found instances
> when a VPN connection confused networking in hadoop. (could be a JVM
> issue,
> didn't spend the time to know for sure)
> 
> thanks,
> thushara
> 
> On Wed, Sep 30, 2009 at 1:46 AM, achilles852 <fa...@gmail.com> wrote:
> 
>>
>> Running Hadoop on EC2 using 32 bit small instances with 5 slaves. The job
>> has
>> 5 map tasks and 1 reduce task. Each map task is 5 minutes long.
>> After 5 minutes and 14 seconds, 4 map tasks completed but the last one is
>> now stalling the entire process.
>>
>> Here are the logs for JobTracker:
>> ----------------------------------------
>> [start of JobTracker Logs]
>> ----------------------------------------
>> 2009-09-30 04:31:24,468 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000006_0' to tip
>> task_200909300419_0001_m_000006, for tracker
>> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:58857'
>> 2009-09-30 04:31:28,009 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000006_0' has completed
>> task_200909300419_0001_m_000006 successfully.
>> 2009-09-30 04:31:28,012 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000001_0' to tip
>> task_200909300419_0001_m_000001, for tracker
>> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:58857'
>> 2009-09-30 04:31:28,015 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000001
>> 2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000002_0' to tip
>> task_200909300419_0001_m_000002, for tracker
>> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:60031'
>> 2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000002
>> 2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000000_0' to tip
>> task_200909300419_0001_m_000000, for tracker
>> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:54570'
>> 2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000000
>> 2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000003_0' to tip
>> task_200909300419_0001_m_000003, for tracker
>> 'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:42359'
>> 2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000003
>> 2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000004_0' to tip
>> task_200909300419_0001_m_000004, for tracker
>> 'tracker_domU-12-31-39-02-25-A1.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:56865'
>> 2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000004
>> 2009-09-30 04:31:33,095 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_r_000000_0' to tip
>> task_200909300419_0001_r_000000, for tracker
>> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:58857'
>> 2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000000_0' has completed
>> task_200909300419_0001_m_000000 successfully.
>> 2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000000 was 803/200 = 4.015
>> 2009-09-30 04:36:33,941 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 4.015
>> 2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000002_1' to tip
>> task_200909300419_0001_m_000002, for tracker
>> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:54570'
>> 2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000002
>> 2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000002_0' has completed
>> task_200909300419_0001_m_000002 successfully.
>> 2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000002 was 317/190 =
>> 1.668421052631579
>> 2009-09-30 04:36:34,794 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 2.841710526315789
>> 2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000003_1' to tip
>> task_200909300419_0001_m_000003, for tracker
>> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:60031'
>> 2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000003
>> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000003_0' has completed
>> task_200909300419_0001_m_000003 successfully.
>> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000003 was 73/189 =
>> 0.3862433862433862
>> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 2.023221479624988
>> 2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000001_1' to tip
>> task_200909300419_0001_m_000001, for tracker
>> 'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:42359'
>> 2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000001
>> 2009-09-30 04:36:35,854 INFO org.apache.hadoop.mapred.JobTracker:
>> attempt_200909300419_0001_m_000002_1 is 1909 ms debug.
>> 2009-09-30 04:36:36,038 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000004_0' has completed
>> task_200909300419_0001_m_000004 successfully.
>> 2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000004 was 918/187 =
>> 4.909090909090909
>> 2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 2.744688836991468
>> 2009-09-30 04:36:43,966 INFO org.apache.hadoop.mapred.JobTracker: Removed
>> completed task 'attempt_200909300419_0001_m_000002_1' from
>> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:54570'
>> 2009-09-30 04:36:44,853 INFO org.apache.hadoop.mapred.JobTracker: Removed
>> completed task 'attempt_200909300419_0001_m_000003_1' from
>> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:60031'
>> ----------------------------------------
>> [End of JobTracker Logs]
>> ----------------------------------------
>>
>>
>> Amogh Vasekar-2 wrote:
>> >
>> > Hi,
>> > Can you provide info on the input like compression etc? Also, are you
>> > using cached files in your map tasks? It might be helpful if you paste
>> the
>> > logs here after blanking your system specific info., as then one can
>> find
>> > out where till the reduce it went or if the copy phase started at all.
>> >
>> > Thanks,
>> > Amogh
>> >
>> > -----Original Message-----
>> > From: achilles852 [mailto:faheemkhan@gmail.com]
>> > Sent: Wednesday, September 30, 2009 6:38 AM
>> > To: core-dev@hadoop.apache.org
>> > Subject: Re: last map task taking too long
>> >
>> >
>> > Basically, it finishes what it is supposed to do (I view the logs to
>> find
>> > out), but does not move onto the reduce stage.
>> >
>> >
>> > Ted Dunning wrote:
>> >>
>> >> Is that last map task actually running, or is it pending?
>> >>
>> >> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com>
>> >> wrote:
>> >>
>> >>>
>> >>> Hey.. I am trying to write a small mapreduce program. I launch a few
>> map
>> >>> tasks, each of which should complete within a certain time (say 5
>> >>> minutes)... all the tasks complete within 5 minutes except the last
>> one
>> >>> -
>> >>> which takes around 10 times more the time taken by all other map
>> >>> tasks.....any idea why this is happening?
>> >>>
>> >>> I am using Hadoop version 0.19.2, tried running it locally as well as
>> on
>> >>> EC2.
>> >>> --
>> >>> View this message in context:
>> >>>
>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
>> >>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>> >>>
>> >>>
>> >>
>> >>
>> >> --
>> >> Ted Dunning, CTO
>> >> DeepDyve
>> >>
>> >>
>> >
>> > --
>> > View this message in context:
>> >
>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.html
>> > Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>> >
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25677160.html
>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25691439.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.


Re: last map task taking too long

Posted by Thushara Wijeratna <th...@gmail.com>.
Did you look for network-related errors in the logs? I've found instances
where a VPN connection confused networking in Hadoop. (It could be a JVM
issue; I didn't spend the time to know for sure.)

thanks,
thushara

On Wed, Sep 30, 2009 at 1:46 AM, achilles852 <fa...@gmail.com> wrote:

>
> Running Hadoop on EC2 using 32 bit small instances with 5 slaves. The job
> has
> 5 map tasks and 1 reduce task. Each map task is 5 minutes long.
> After 5 minutes and 14 seconds, 4 map tasks completed but the last one is
> now stalling the entire process.
>
> Here are the logs for JobTracker:
> ----------------------------------------
> [start of JobTracker Logs]
> ----------------------------------------
> 2009-09-30 04:31:24,468 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000006_0' to tip
> task_200909300419_0001_m_000006, for tracker
> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
> 127.0.0.1:58857'
> 2009-09-30 04:31:28,009 INFO org.apache.hadoop.mapred.JobInProgress: Task
> 'attempt_200909300419_0001_m_000006_0' has completed
> task_200909300419_0001_m_000006 successfully.
> 2009-09-30 04:31:28,012 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000001_0' to tip
> task_200909300419_0001_m_000001, for tracker
> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
> 127.0.0.1:58857'
> 2009-09-30 04:31:28,015 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000001
> 2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000002_0' to tip
> task_200909300419_0001_m_000002, for tracker
> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
> 127.0.0.1:60031'
> 2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000002
> 2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000000_0' to tip
> task_200909300419_0001_m_000000, for tracker
> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
> 127.0.0.1:54570'
> 2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000000
> 2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000003_0' to tip
> task_200909300419_0001_m_000003, for tracker
> 'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/
> 127.0.0.1:42359'
> 2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000003
> 2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000004_0' to tip
> task_200909300419_0001_m_000004, for tracker
> 'tracker_domU-12-31-39-02-25-A1.compute-1.internal:localhost.localdomain/
> 127.0.0.1:56865'
> 2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000004
> 2009-09-30 04:31:33,095 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_r_000000_0' to tip
> task_200909300419_0001_r_000000, for tracker
> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
> 127.0.0.1:58857'
> 2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.JobInProgress: Task
> 'attempt_200909300419_0001_m_000000_0' has completed
> task_200909300419_0001_m_000000 successfully.
> 2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.ResourceEstimator:
> measured blowup on task_200909300419_0001_m_000000 was 803/200 = 4.015
> 2009-09-30 04:36:33,941 INFO org.apache.hadoop.mapred.ResourceEstimator:
> new
> estimate is blowup = 4.015
> 2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000002_1' to tip
> task_200909300419_0001_m_000002, for tracker
> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
> 127.0.0.1:54570'
> 2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000002
> 2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.JobInProgress: Task
> 'attempt_200909300419_0001_m_000002_0' has completed
> task_200909300419_0001_m_000002 successfully.
> 2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.ResourceEstimator:
> measured blowup on task_200909300419_0001_m_000002 was 317/190 =
> 1.668421052631579
> 2009-09-30 04:36:34,794 INFO org.apache.hadoop.mapred.ResourceEstimator:
> new
> estimate is blowup = 2.841710526315789
> 2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000003_1' to tip
> task_200909300419_0001_m_000003, for tracker
> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
> 127.0.0.1:60031'
> 2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000003
> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.JobInProgress: Task
> 'attempt_200909300419_0001_m_000003_0' has completed
> task_200909300419_0001_m_000003 successfully.
> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
> measured blowup on task_200909300419_0001_m_000003 was 73/189 =
> 0.3862433862433862
> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
> new
> estimate is blowup = 2.023221479624988
> 2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200909300419_0001_m_000001_1' to tip
> task_200909300419_0001_m_000001, for tracker
> 'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/
> 127.0.0.1:42359'
> 2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_200909300419_0001_m_000001
> 2009-09-30 04:36:35,854 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_200909300419_0001_m_000002_1 is 1909 ms debug.
> 2009-09-30 04:36:36,038 INFO org.apache.hadoop.mapred.JobInProgress: Task
> 'attempt_200909300419_0001_m_000004_0' has completed
> task_200909300419_0001_m_000004 successfully.
> 2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
> measured blowup on task_200909300419_0001_m_000004 was 918/187 =
> 4.909090909090909
> 2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
> new
> estimate is blowup = 2.744688836991468
> 2009-09-30 04:36:43,966 INFO org.apache.hadoop.mapred.JobTracker: Removed
> completed task 'attempt_200909300419_0001_m_000002_1' from
> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
> 127.0.0.1:54570'
> 2009-09-30 04:36:44,853 INFO org.apache.hadoop.mapred.JobTracker: Removed
> completed task 'attempt_200909300419_0001_m_000003_1' from
> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
> 127.0.0.1:60031'
> ----------------------------------------
> [End of JobTracker Logs]
> ----------------------------------------
>
>
> Amogh Vasekar-2 wrote:
> >
> > Hi,
> > Can you provide info on the input like compression etc? Also, are you
> > using cached files in your map tasks? It might be helpful if you paste
> the
> > logs here after blanking your system specific info., as then one can find
> > out where till the reduce it went or if the copy phase started at all.
> >
> > Thanks,
> > Amogh
> >
> > -----Original Message-----
> > From: achilles852 [mailto:faheemkhan@gmail.com]
> > Sent: Wednesday, September 30, 2009 6:38 AM
> > To: core-dev@hadoop.apache.org
> > Subject: Re: last map task taking too long
> >
> >
> > Basically, it finishes what it is supposed to do (I view the logs to find
> > out), but does not move onto the reduce stage.
> >
> >
> > Ted Dunning wrote:
> >>
> >> Is that last map task actually running, or is it pending?
> >>
> >> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com>
> >> wrote:
> >>
> >>>
> >>> Hey.. I am trying to write a small mapreduce program. I launch a few
> map
> >>> tasks, each of which should complete within a certain time (say 5
> >>> minutes)... all the tasks complete within 5 minutes except the last one
> >>> -
> >>> which takes around 10 times more the time taken by all other map
> >>> tasks.....any idea why this is happening?
> >>>
> >>> I am using Hadoop version 0.19.2, tried running it locally as well as
> on
> >>> EC2.
> >>> --
> >>> View this message in context:
> >>>
> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
> >>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
> >>>
> >>>
> >>
> >>
> >> --
> >> Ted Dunning, CTO
> >> DeepDyve
> >>
> >>
> >
> > --
> > View this message in context:
> >
> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.html
> > Sent from the Hadoop core-dev mailing list archive at Nabble.com.
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25677160.html
> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>
>

RE: last map task taking too long

Posted by achilles852 <fa...@gmail.com>.
I am running Hadoop on EC2 using 32-bit small instances with 5 slaves. The job
has 5 map tasks and 1 reduce task, and each map task runs for 5 minutes.
After 5 minutes and 14 seconds, 4 map tasks had completed, but the last one is
now stalling the entire process.
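
If the intent is exactly one map task per parameter line, that usually has to
be forced rather than left to the default size-based splits; NLineInputFormat
in the old mapred API does this. A minimal sketch with a made-up class name,
offered as an assumption about the setup rather than a description of it:

[code]
// Hypothetical sketch, made-up class name; assumes the job wants exactly one
// map task per line of the parameter file (old mapred API, Hadoop 0.19).
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

public class OneMapPerLine {
  static void configure(JobConf conf) {
    // NLineInputFormat splits the input so each map task gets N lines; N = 1 here.
    conf.setInputFormat(NLineInputFormat.class);
    conf.setInt("mapred.line.input.format.linespermap", 1);
  }
}
[/code]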

Here are the logs for JobTracker:
----------------------------------------
[start of JobTracker Logs]
----------------------------------------
2009-09-30 04:31:24,468 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000006_0' to tip
task_200909300419_0001_m_000006, for tracker
'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/127.0.0.1:58857'
2009-09-30 04:31:28,009 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000006_0' has completed
task_200909300419_0001_m_000006 successfully.
2009-09-30 04:31:28,012 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000001_0' to tip
task_200909300419_0001_m_000001, for tracker
'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/127.0.0.1:58857'
2009-09-30 04:31:28,015 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000001
2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000002_0' to tip
task_200909300419_0001_m_000002, for tracker
'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/127.0.0.1:60031'
2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000002
2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000000_0' to tip
task_200909300419_0001_m_000000, for tracker
'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/127.0.0.1:54570'
2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000000
2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000003_0' to tip
task_200909300419_0001_m_000003, for tracker
'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/127.0.0.1:42359'
2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000003
2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000004_0' to tip
task_200909300419_0001_m_000004, for tracker
'tracker_domU-12-31-39-02-25-A1.compute-1.internal:localhost.localdomain/127.0.0.1:56865'
2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000004
2009-09-30 04:31:33,095 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_r_000000_0' to tip
task_200909300419_0001_r_000000, for tracker
'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/127.0.0.1:58857'
2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000000_0' has completed
task_200909300419_0001_m_000000 successfully.
2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000000 was 803/200 = 4.015
2009-09-30 04:36:33,941 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 4.015
2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000002_1' to tip
task_200909300419_0001_m_000002, for tracker
'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/127.0.0.1:54570'
2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000002
2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000002_0' has completed
task_200909300419_0001_m_000002 successfully.
2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000002 was 317/190 =
1.668421052631579
2009-09-30 04:36:34,794 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 2.841710526315789
2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000003_1' to tip
task_200909300419_0001_m_000003, for tracker
'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/127.0.0.1:60031'
2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000003
2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000003_0' has completed
task_200909300419_0001_m_000003 successfully.
2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000003 was 73/189 =
0.3862433862433862
2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 2.023221479624988
2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000001_1' to tip
task_200909300419_0001_m_000001, for tracker
'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/127.0.0.1:42359'
2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000001
2009-09-30 04:36:35,854 INFO org.apache.hadoop.mapred.JobTracker:
attempt_200909300419_0001_m_000002_1 is 1909 ms debug.
2009-09-30 04:36:36,038 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000004_0' has completed
task_200909300419_0001_m_000004 successfully.
2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000004 was 918/187 =
4.909090909090909
2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 2.744688836991468
2009-09-30 04:36:43,966 INFO org.apache.hadoop.mapred.JobTracker: Removed
completed task 'attempt_200909300419_0001_m_000002_1' from
'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/127.0.0.1:54570'
2009-09-30 04:36:44,853 INFO org.apache.hadoop.mapred.JobTracker: Removed
completed task 'attempt_200909300419_0001_m_000003_1' from
'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/127.0.0.1:60031'
----------------------------------------
[End of JobTracker Logs]
----------------------------------------


Amogh Vasekar-2 wrote:
> 
> Hi,
> Can you provide info on the input like compression etc? Also, are you
> using cached files in your map tasks? It might be helpful if you paste the
> logs here after blanking your system specific info., as then one can find
> out where till the reduce it went or if the copy phase started at all.
> 
> Thanks,
> Amogh
> 
> -----Original Message-----
> From: achilles852 [mailto:faheemkhan@gmail.com] 
> Sent: Wednesday, September 30, 2009 6:38 AM
> To: core-dev@hadoop.apache.org
> Subject: Re: last map task taking too long
> 
> 
> Basically, it finishes what it is supposed to do (I view the logs to find
> out), but does not move onto the reduce stage.
> 
> 
> Ted Dunning wrote:
>> 
>> Is that last map task actually running, or is it pending?
>> 
>> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com>
>> wrote:
>> 
>>>
>>> Hey.. I am trying to write a small mapreduce program. I launch a few map
>>> tasks, each of which should complete within a certain time (say 5
>>> minutes)... all the tasks complete within 5 minutes except the last one
>>> -
>>> which takes around 10 times more the time taken by all other map
>>> tasks.....any idea why this is happening?
>>>
>>> I am using Hadoop version 0.19.2, tried running it locally as well as on
>>> EC2.
>>> --
>>> View this message in context:
>>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
>>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>>>
>>>
>> 
>> 
>> -- 
>> Ted Dunning, CTO
>> DeepDyve
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.html
> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25677160.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.


RE: last map task taking too long

Posted by Amogh Vasekar <am...@yahoo-inc.com>.
Hi,
Can you provide some info on the input, such as whether it is compressed? Also, are you using cached files in your map tasks? It might help if you paste the logs here after blanking out your system-specific info, so one can see how far the reduce phase got, or whether the copy phase even started.

Thanks,
Amogh
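
For reference, "cached files" here means the DistributedCache, which copies
side files to every task node before the maps start. A minimal, hypothetical
usage sketch with a made-up path:

[code]
// Hypothetical sketch of DistributedCache usage; the path is made up.
import java.net.URI;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class CacheSideFile {
  static void addSideFile(JobConf conf) throws Exception {
    // Register an HDFS file to be copied to the local cache of every task node.
    DistributedCache.addCacheFile(new URI("/user/someone/lookup.dat"), conf);
  }
}
[/code]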

-----Original Message-----
From: achilles852 [mailto:faheemkhan@gmail.com] 
Sent: Wednesday, September 30, 2009 6:38 AM
To: core-dev@hadoop.apache.org
Subject: Re: last map task taking too long


Basically, it finishes what it is supposed to do (I view the logs to find
out), but does not move onto the reduce stage.


Ted Dunning wrote:
> 
> Is that last map task actually running, or is it pending?
> 
> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com> wrote:
> 
>>
>> Hey.. I am trying to write a small mapreduce program. I launch a few map
>> tasks, each of which should complete within a certain time (say 5
>> minutes)... all the tasks complete within 5 minutes except the last one -
>> which takes around 10 times more the time taken by all other map
>> tasks.....any idea why this is happening?
>>
>> I am using Hadoop version 0.19.2, tried running it locally as well as on
>> EC2.
>> --
>> View this message in context:
>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Ted Dunning, CTO
> DeepDyve
> 
> 

-- 
View this message in context: http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.


Re: last map task taking too long

Posted by achilles852 <fa...@gmail.com>.
Basically, it finishes what it is supposed to do (I can see this in the logs),
but the job does not move on to the reduce stage.


Ted Dunning wrote:
> 
> Is that last map task actually running, or is it pending?
> 
> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com> wrote:
> 
>>
>> Hey.. I am trying to write a small mapreduce program. I launch a few map
>> tasks, each of which should complete within a certain time (say 5
>> minutes)... all the tasks complete within 5 minutes except the last one -
>> which takes around 10 times more the time taken by all other map
>> tasks.....any idea why this is happening?
>>
>> I am using Hadoop version 0.19.2, tried running it locally as well as on
>> EC2.
>> --
>> View this message in context:
>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Ted Dunning, CTO
> DeepDyve
> 
> 

-- 
View this message in context: http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.


Re: last map task taking too long

Posted by Ted Dunning <te...@gmail.com>.
Is that last map task actually running, or is it pending?

On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com> wrote:

>
> Hey.. I am trying to write a small mapreduce program. I launch a few map
> tasks, each of which should complete within a certain time (say 5
> minutes)... all the tasks complete within 5 minutes except the last one -
> which takes around 10 times more the time taken by all other map
> tasks.....any idea why this is happening?
>
> I am using Hadoop version 0.19.2, tried running it locally as well as on
> EC2.
> --
> View this message in context:
> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>
>


-- 
Ted Dunning, CTO
DeepDyve