Posted to common-dev@hadoop.apache.org by "Runping Qi (JIRA)" <ji...@apache.org> on 2008/05/02 07:10:56 UTC
[jira] Issue Comment Edited: (HADOOP-3327) Shuffling fetchers waited too long between map output fetch re-tries

    [ https://issues.apache.org/jira/browse/HADOOP-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593435#action_12593435 ] 

runping edited comment on HADOOP-3327 at 5/1/08 10:10 PM:
-------------------------------------------------------------

Here are the related lines from the job tracker log:

2008-04-30 17:00:01,346 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200804301615_0003_m_000756_0' to tip tip_200804301615_0003_m_000756, for tracker 'tracker_xxxx'
2008-04-30 17:07:04,827 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200804301615_0003_m_000756_0' has completed tip_200804301615_0003_m_000756 successfully.
2008-04-30 17:32:49,981 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #1 for task task_200804301615_0003_m_000756_0
2008-04-30 17:45:38,438 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #2 for task task_200804301615_0003_m_000756_0
2008-04-30 17:56:43,950 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #3 for task task_200804301615_0003_m_000756_0
2008-04-30 17:56:43,950 INFO org.apache.hadoop.mapred.JobInProgress: Too many fetch-failures for output of task: task_200804301615_0003_m_000756_0 ... killing it
2008-04-30 17:56:43,950 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200804301615_0003_m_000756_0: Too many fetch-failures
2008-04-30 17:56:43,952 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200804301615_0003_m_000756_1' to tip tip_200804301615_0003_m_000756, for tracker 'tracker_xxxx'
2008-04-30 17:56:45,377 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200804301615_0003_m_000756_0' from 'tracker_xxxx'
2008-04-30 18:02:17,893 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200804301615_0003_m_000756_1' has completed tip_200804301615_0003_m_000756 successfully.
2008-04-30 18:03:16,193 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200804301615_0003_m_000756_0' from 'tracker_xxxx'
2008-04-30 18:03:16,471 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200804301615_0003_m_000756_1' from 'tracker_xxxx'

The lines above show that about 24 minutes passed between the first notification of a failed map output fetch and the third.
That means the reducer waited about 12 minutes between re-tries!
The map took only about 7 minutes on its first run, and under 6 minutes when re-executed!
During that time interval between fetch failure notifications,
there were very few tasks active.
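
The intervals discussed above can be checked directly from the log timestamps. The following is a minimal Python sketch, not part of the original report: the event labels and the minutes_between helper are hypothetical names chosen for illustration, and only the timestamps are taken from the quoted log lines.

```python
from datetime import datetime

# Timestamp layout used in the job tracker log: date, time, comma, milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# Timestamps copied from the log lines quoted in this comment.
# The keys are illustrative labels, not Hadoop identifiers.
EVENTS = {
    "attempt_0_assigned": "2008-04-30 17:00:01,346",
    "attempt_0_completed": "2008-04-30 17:07:04,827",
    "fetch_failure_1": "2008-04-30 17:32:49,981",
    "fetch_failure_2": "2008-04-30 17:45:38,438",
    "fetch_failure_3": "2008-04-30 17:56:43,950",
    "attempt_1_assigned": "2008-04-30 17:56:43,952",
    "attempt_1_completed": "2008-04-30 18:02:17,893",
}

TIMES = {name: datetime.strptime(ts, LOG_TIME_FORMAT) for name, ts in EVENTS.items()}


def minutes_between(start, end):
    """Return the elapsed time in minutes between two labelled log events."""
    return (TIMES[end] - TIMES[start]).total_seconds() / 60.0


if __name__ == "__main__":
    pairs = [
        ("fetch_failure_1", "fetch_failure_2"),
        ("fetch_failure_2", "fetch_failure_3"),
        ("fetch_failure_1", "fetch_failure_3"),
        ("attempt_0_assigned", "attempt_0_completed"),
        ("attempt_1_assigned", "attempt_1_completed"),
    ]
    for start, end in pairs:
        print(f"{start} -> {end}: {minutes_between(start, end):.1f} min")
```

Run as written, this should report roughly 12.8 and 11.1 minutes between successive notifications, about 23.9 minutes from the first notification to the third, about 7.1 minutes for the first map attempt, and about 5.6 minutes for the re-executed attempt, each measured from assignment to completion as seen by the job tracker.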




      was (Author: runping):
    
Here are the related lines from the job tracker log:

2008-04-30 17:00:01,346 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200804301615_0003_m_000756_0' to tip tip_200804301615_0003_m_000756, for tracker 'tracker_gs205506.inktomisearch.com:gs205506.inktomisearch.com/76.13.187.122:52870'
2008-04-30 17:07:04,827 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200804301615_0003_m_000756_0' has completed tip_200804301615_0003_m_000756 successfully.
2008-04-30 17:32:49,981 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #1 for task task_200804301615_0003_m_000756_0
2008-04-30 17:45:38,438 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #2 for task task_200804301615_0003_m_000756_0
2008-04-30 17:56:43,950 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #3 for task task_200804301615_0003_m_000756_0
2008-04-30 17:56:43,950 INFO org.apache.hadoop.mapred.JobInProgress: Too many fetch-failures for output of task: task_200804301615_0003_m_000756_0 ... killing it
2008-04-30 17:56:43,950 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200804301615_0003_m_000756_0: Too many fetch-failures
2008-04-30 17:56:43,952 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200804301615_0003_m_000756_1' to tip tip_200804301615_0003_m_000756, for tracker 'tracker_gs205368.inktomisearch.com:gs205368.inktomisearch.com/76.13.186.156:52243'
2008-04-30 17:56:45,377 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200804301615_0003_m_000756_0' from 'tracker_gs205506.inktomisearch.com:gs205506.inktomisearch.com/76.13.187.122:52870'
2008-04-30 18:02:17,893 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200804301615_0003_m_000756_1' has completed tip_200804301615_0003_m_000756 successfully.
2008-04-30 18:03:16,193 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200804301615_0003_m_000756_0' from 'tracker_gs205506.inktomisearch.com:gs205506.inktomisearch.com/76.13.187.122:52870'
2008-04-30 18:03:16,471 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200804301615_0003_m_000756_1' from 'tracker_gs205368.inktomisearch.com:gs205368.inktomisearch.com/76.13.186.156:52243'

The lines above show that about 24 minutes passed between the first notification of a failed map output fetch and the third.
That means the reducer waited about 12 minutes between re-tries!
The map took only about 7 minutes on its first run, and under 6 minutes when re-executed!
During that time interval between fetch failure notifications,
there were very few tasks active.



  
> Shuffling fetchers waited too long between map output fetch re-tries
> --------------------------------------------------------------------
>
>                 Key: HADOOP-3327
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3327
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.