Posted to common-dev@hadoop.apache.org by achilles852 <fa...@gmail.com> on 2009/10/01 02:23:30 UTC

Re: last map task taking too long

Yes, I did everything I could think of, but nothing has worked so far. I even
set the number of reducers to zero, just to be sure. What I don't understand
is: if all of the map tasks are running the same code, why do the first
few tasks complete normally while the last one takes ages?
Do you think using a different version of Hadoop would help?
Any help will be highly appreciated.
Thanks.
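(For reference, setting the number of reducers to zero in Hadoop 0.19 can be done per job with JobConf.setNumReduceTasks(0), or through the job configuration. A minimal sketch of the configuration property involved:)

```
<!-- mapred-site.xml (or per-job configuration): run a map-only job -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>0</value>
</property>
```

With zero reduces, map output is written directly to the output path and no shuffle/sort phase runs, which makes it a useful way to isolate whether a stall is in the map side at all.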


thushw wrote:
> 
> Did you look for network-related errors in the logs? I've found instances
> where a VPN connection confused networking in Hadoop (it could be a JVM
> issue; I didn't spend the time to know for sure).
> 
> thanks,
> thushara
> 
> On Wed, Sep 30, 2009 at 1:46 AM, achilles852 <fa...@gmail.com> wrote:
> 
>>
>> Running Hadoop on EC2 using 32-bit small instances with 5 slaves. The job
>> has 5 map tasks and 1 reduce task. Each map task is 5 minutes long.
>> After 5 minutes and 14 seconds, 4 map tasks had completed, but the last
>> one is now stalling the entire process.
>>
>> Here are the logs for JobTracker:
>> ----------------------------------------
>> [start of JobTracker Logs]
>> ----------------------------------------
>> 2009-09-30 04:31:24,468 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000006_0' to tip
>> task_200909300419_0001_m_000006, for tracker
>> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:58857'
>> 2009-09-30 04:31:28,009 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000006_0' has completed
>> task_200909300419_0001_m_000006 successfully.
>> 2009-09-30 04:31:28,012 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000001_0' to tip
>> task_200909300419_0001_m_000001, for tracker
>> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:58857'
>> 2009-09-30 04:31:28,015 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000001
>> 2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000002_0' to tip
>> task_200909300419_0001_m_000002, for tracker
>> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:60031'
>> 2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000002
>> 2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000000_0' to tip
>> task_200909300419_0001_m_000000, for tracker
>> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:54570'
>> 2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000000
>> 2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000003_0' to tip
>> task_200909300419_0001_m_000003, for tracker
>> 'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:42359'
>> 2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000003
>> 2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000004_0' to tip
>> task_200909300419_0001_m_000004, for tracker
>> 'tracker_domU-12-31-39-02-25-A1.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:56865'
>> 2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000004
>> 2009-09-30 04:31:33,095 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_r_000000_0' to tip
>> task_200909300419_0001_r_000000, for tracker
>> 'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:58857'
>> 2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000000_0' has completed
>> task_200909300419_0001_m_000000 successfully.
>> 2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000000 was 803/200 = 4.015
>> 2009-09-30 04:36:33,941 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 4.015
>> 2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000002_1' to tip
>> task_200909300419_0001_m_000002, for tracker
>> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:54570'
>> 2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000002
>> 2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000002_0' has completed
>> task_200909300419_0001_m_000002 successfully.
>> 2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000002 was 317/190 =
>> 1.668421052631579
>> 2009-09-30 04:36:34,794 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 2.841710526315789
>> 2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000003_1' to tip
>> task_200909300419_0001_m_000003, for tracker
>> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:60031'
>> 2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000003
>> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000003_0' has completed
>> task_200909300419_0001_m_000003 successfully.
>> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000003 was 73/189 =
>> 0.3862433862433862
>> 2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 2.023221479624988
>> 2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_200909300419_0001_m_000001_1' to tip
>> task_200909300419_0001_m_000001, for tracker
>> 'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:42359'
>> 2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing data-local task task_200909300419_0001_m_000001
>> 2009-09-30 04:36:35,854 INFO org.apache.hadoop.mapred.JobTracker:
>> attempt_200909300419_0001_m_000002_1 is 1909 ms debug.
>> 2009-09-30 04:36:36,038 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> 'attempt_200909300419_0001_m_000004_0' has completed
>> task_200909300419_0001_m_000004 successfully.
>> 2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> measured blowup on task_200909300419_0001_m_000004 was 918/187 =
>> 4.909090909090909
>> 2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
>> new
>> estimate is blowup = 2.744688836991468
>> 2009-09-30 04:36:43,966 INFO org.apache.hadoop.mapred.JobTracker: Removed
>> completed task 'attempt_200909300419_0001_m_000002_1' from
>> 'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:54570'
>> 2009-09-30 04:36:44,853 INFO org.apache.hadoop.mapred.JobTracker: Removed
>> completed task 'attempt_200909300419_0001_m_000003_1' from
>> 'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/
>> 127.0.0.1:60031'
>> ----------------------------------------
>> [End of JobTracker Logs]
>> ----------------------------------------
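(As an aside, the ResourceEstimator "blowup" figures in the log above appear to be per-task output/input byte ratios folded into a running mean. A small standalone sketch, using the numbers from the log; the class and variable names are illustrative, not Hadoop's actual implementation:)

```java
// Illustrative reconstruction of the ResourceEstimator "blowup" numbers
// above: each finished map reports outputBytes/inputBytes, and the job-wide
// estimate is the running mean of those ratios across completed maps.
public class BlowupEstimate {
    public static void main(String[] args) {
        // {output bytes, input bytes} per completed map, from the log
        double[][] maps = {{803, 200}, {317, 190}, {73, 189}, {918, 187}};
        double sum = 0.0;
        for (int i = 0; i < maps.length; i++) {
            double blowup = maps[i][0] / maps[i][1];
            sum += blowup;
            System.out.printf("measured blowup = %.6f, new estimate = %.6f%n",
                    blowup, sum / (i + 1));
        }
    }
}
```

The final running mean, (4.015 + 1.668... + 0.386... + 4.909...) / 4, works out to about 2.7447, matching the last "new estimate is blowup" line. Note also that the _1 attempts in the log (e.g. attempt_200909300419_0001_m_000002_1) look like speculative second attempts of maps that were still running; in 0.19 these can be disabled with mapred.map.tasks.speculative.execution if they are muddying the picture.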
>>
>>
>> Amogh Vasekar-2 wrote:
>> >
>> > Hi,
>> > Can you provide info on the input, like compression etc.? Also, are you
>> > using cached files in your map tasks? It might be helpful if you paste
>> > the logs here after blanking out your system-specific info, as then one
>> > can find out how far the reduce got, or whether the copy phase started
>> > at all.
>> >
>> > Thanks,
>> > Amogh
>> >
>> > -----Original Message-----
>> > From: achilles852 [mailto:faheemkhan@gmail.com]
>> > Sent: Wednesday, September 30, 2009 6:38 AM
>> > To: core-dev@hadoop.apache.org
>> > Subject: Re: last map task taking too long
>> >
>> >
>> > Basically, it finishes what it is supposed to do (I checked the logs to
>> > find out), but does not move on to the reduce stage.
>> >
>> >
>> > Ted Dunning wrote:
>> >>
>> >> Is that last map task actually running, or is it pending?
>> >>
>> >> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <fa...@gmail.com>
>> >> wrote:
>> >>
>> >>>
>> >>> Hey, I am trying to write a small MapReduce program. I launch a few
>> >>> map tasks, each of which should complete within a certain time (say 5
>> >>> minutes). All the tasks complete within 5 minutes except the last one,
>> >>> which takes around 10 times longer than the other map tasks. Any idea
>> >>> why this is happening?
>> >>>
>> >>> I am using Hadoop version 0.19.2, and tried running it locally as
>> >>> well as on EC2.
>> >>> --
>> >>> View this message in context:
>> >>>
>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
>> >>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>> >>>
>> >>>
>> >>
>> >>
>> >> --
>> >> Ted Dunning, CTO
>> >> DeepDyve
>> >>
>> >>
>> >
>> >
>> >
>> >
>>
>>
>>
> 
> 
