Posted to common-dev@hadoop.apache.org by "Runping Qi (JIRA)" <ji...@apache.org> on 2008/05/01 03:34:55 UTC
[jira] Created: (HADOOP-3332) improving the logging during shuffling
improving the logging during shuffling
--------------------------------------
Key: HADOOP-3332
URL: https://issues.apache.org/jira/browse/HADOOP-3332
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Runping Qi
Below is an excerpt from the log file of a reducer.
The same set of messages about fetch scheduling is logged every second.
Yet, the critical information --- which hosts were slow --- was not there.
2008-05-01 00:33:13,215 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Got 2 known map output location(s); scheduling...
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Got 2 known map output location(s); scheduling...
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:16,218 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
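For illustration only, the two complaints above (the same line repeated every second, and slow hosts counted but never named) could be addressed along these lines. This is a minimal sketch, not the attached patch: the class ShuffleScheduleLog, its methods, and the example host names are all hypothetical and do not exist in ReduceTask.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration; not part of Hadoop's ReduceTask. */
final class ShuffleScheduleLog {
  private String lastMessage;                      // last line actually emitted
  private final List<String> emitted = new ArrayList<>();

  /** Builds the status line, naming the slow hosts instead of only counting them. */
  static String format(String taskId, int scheduled, int known,
                       Set<String> slowHosts, int dupHosts) {
    // TreeSet gives a stable, sorted host order, so equal sets yield equal lines.
    return taskId + " Scheduled " + scheduled + " of " + known
        + " known outputs (" + slowHosts.size() + " slow hosts "
        + new TreeSet<>(slowHosts) + " and " + dupHosts + " dup hosts)";
  }

  /** Emits the line only if it differs from the previous one; returns true if emitted. */
  boolean logIfChanged(String message) {
    if (Objects.equals(message, lastMessage)) {
      return false;                                // unchanged state: stay quiet
    }
    lastMessage = message;
    emitted.add(message);
    System.out.println(message);
    return true;
  }

  List<String> emitted() {
    return emitted;
  }

  public static void main(String[] args) {
    ShuffleScheduleLog log = new ShuffleScheduleLog();
    Set<String> slow = new TreeSet<>();
    slow.add("host7.example.com");                 // made-up host names
    slow.add("host3.example.com");
    String line = format("task_200804302255_0002_r_000720_0", 0, 2, slow, 0);
    log.logIfChanged(line);                        // printed
    log.logIfChanged(line);                        // identical line, suppressed
  }
}
```

Suppressing unchanged lines is only one possible policy; a time-based or progress-based throttle would cut the volume as well.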
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595913#action_12595913 ]
Devaraj Das commented on HADOOP-3332:
-------------------------------------
Arun, actually there was already a call to currentTimeMillis. I moved it up so that it can also be reused for logging. This is the place in the patch from which the currentTime assignment is removed:
@@ -1098,7 +1123,7 @@
Iterator<MapOutputLocation> locIt = knownOutputs.iterator();
- currentTime = System.currentTimeMillis();
+
while (locIt.hasNext()) {
MapOutputLocation loc = locIt.next();
@@ -1112,8 +1137,12 @@
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595371#action_12595371 ]
Runping Qi commented on HADOOP-3332:
------------------------------------
Does this patch apply to 0.17?
I don't think we can deploy 0.17 without cutting down the logging volume.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596305#action_12596305 ]
Arun C Murthy commented on HADOOP-3332:
---------------------------------------
bq. The call to System.currentTimeMillis was within that earlier (to be precise within, synchronized (scheduledCopies) ), I moved it outside the "synchronized (scheduledCopies)" and I think I moved it to the place where it should be in the loop.
Ok, I missed that one - sorry!
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Blocker
> Fix For: 0.17.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597122#action_12597122 ]
Hudson commented on HADOOP-3332:
--------------------------------
Integrated in Hadoop-trunk #492 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/492/])
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Blocker
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595459#action_12595459 ]
Devaraj Das commented on HADOOP-3332:
-------------------------------------
No, this patch doesn't apply to 0.17. I will put up a version for 0.17.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595257#action_12595257 ]
Runping Qi commented on HADOOP-3332:
------------------------------------
Does this patch address the issue of excessive logging (resulting in huge log files)?
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during shuffling
Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nigel Daley updated HADOOP-3332:
--------------------------------
Priority: Blocker (was: Critical)
Affects Version/s: 0.17.0
Fix Version/s: 0.17.0 (was: 0.18.0)
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Blocker
> Fix For: 0.17.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3332:
--------------------------------
Attachment: 3332.patch
Straightforward patch.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596157#action_12596157 ]
Arun C Murthy commented on HADOOP-3332:
---------------------------------------
I might be missing something, but the patch moves it down into the while loop, from outside it...
Can we do something simple like log every 1% of shuffle progress or something like that? i.e. log status once we copy 1% of map-outputs...
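As a rough illustration of the "log every 1% of shuffle progress" idea, the check could look like the sketch below. ShuffleProgressLogger and its method names are hypothetical, and the committed patch may throttle differently.

```java
/** Hypothetical helper sketching "log once per 1% of map outputs copied". */
final class ShuffleProgressLogger {
  private final int totalMaps;
  private int lastLoggedPercent = -1;   // nothing logged yet

  ShuffleProgressLogger(int totalMaps) {
    if (totalMaps <= 0) {
      throw new IllegalArgumentException("totalMaps must be positive");
    }
    this.totalMaps = totalMaps;
  }

  /** Prints a status line and returns true only when the whole-number percentage advances. */
  boolean maybeLog(int copiedMaps) {
    // long arithmetic avoids int overflow for very large map counts
    int percent = (int) (100L * copiedMaps / totalMaps);
    if (percent <= lastLoggedPercent) {
      return false;                     // still within the same 1% bucket
    }
    lastLoggedPercent = percent;
    System.out.println("Copied " + copiedMaps + " of " + totalMaps
        + " map outputs (" + percent + "%)");
    return true;
  }

  public static void main(String[] args) {
    ShuffleProgressLogger progress = new ShuffleProgressLogger(2000);
    for (int copied = 0; copied <= 2000; copied++) {
      progress.maybeLog(copied);        // prints once per percentage point
    }
  }
}
```

This bounds the number of progress lines per reducer at 101 regardless of how long the shuffle takes, but it says nothing while a reducer is stuck, so it would likely need to be paired with some time-based message about slow hosts.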
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during shuffling
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595136#action_12595136 ]
Arun C Murthy commented on HADOOP-3332:
---------------------------------------
bq. It turns out that this problem is much more severe that it looks initially. [...]
+1, I've seen this happen too. The HOD log-harvester was eating up CPU on each of the task-trackers...
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Priority: Critical
[jira] Commented: (HADOOP-3332) improving the logging during
shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596167#action_12596167 ]
Devaraj Das commented on HADOOP-3332:
-------------------------------------
Hey Arun, if you look at the fetchOutputs method, everything is inside one big while loop: "while (!neededOutputs.isEmpty() && mergeThrowable == null) {". The call to System.currentTimeMillis was already inside that loop earlier (to be precise, inside "synchronized (scheduledCopies)"). I moved it outside the "synchronized (scheduledCopies)" block, and I think it is now where it should be in the loop.
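To illustrate the placement described above, here is a minimal sketch, not the actual patch. Only the names neededOutputs and scheduledCopies come from the comment; the class, the method signature, the String element type, and the omission of the mergeThrowable check are assumptions made to keep the example self-contained. It shows the clock being read once per pass of the outer loop, before the lock is taken.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch; this is not the ReduceTask code. */
class FetchLoopSketch {

  /** Drains neededOutputs and returns how many times the clock was read. */
  static int fetchOutputs(Deque<String> neededOutputs, List<String> scheduledCopies) {
    int clockReads = 0;
    while (!neededOutputs.isEmpty()) {
      // One clock read per pass of the outer loop, taken before the lock,
      // so the clock call is not made while holding the monitor.
      long now = System.currentTimeMillis();
      clockReads++;

      synchronized (scheduledCopies) {
        // The timestamp is only consumed inside the lock, never re-read here.
        scheduledCopies.add(neededOutputs.poll() + "@" + now);
      }
    }
    return clockReads;
  }

  public static void main(String[] args) {
    Deque<String> needed = new ArrayDeque<String>();
    needed.add("map_0");
    needed.add("map_1");
    List<String> scheduled = new ArrayList<String>();
    System.out.println(fetchOutputs(needed, scheduled) + " clock reads, " + scheduled);
  }
}
```

The number of clock reads equals the number of outer-loop passes, which is the cost raised in the review question; whether that cost matters in the real shuffle loop would have to be measured there.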
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during
shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3332:
--------------------------------
Fix Version/s: 0.18.0
Assignee: Devaraj Das
Status: Patch Available (was: Open)
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during
shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3332:
--------------------------------
Attachment: 3332.branch17.patch
This patch is for the 0.17 branch
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during
shuffling
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-3332:
-------------------------------
Description:
Below is an excerpt from the log file of a reducer.
A same set of of messages about fetching schedule is logged every second.
Yet, the critical information --- which hosts were slow --- was not there.
2008-05-01 00:33:13,215 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Got 2 known map output location(s); scheduling...
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Got 2 known map output location(s); scheduling...
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:16,218 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
was:
Below is an excerpt from the log file of a reducer.
A same set of of messages about fetching schedule is logged every second.
Yet, the critical information --- which hosts were slow --- was not there.
2008-05-01 00:33:13,215 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Got 2 known map output location(s); scheduling...
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)
2008-05-01 00:33:14,216 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Got 2 known map output location(s); scheduling...
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)
2008-05-01 00:33:15,217 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0 Need another 3 map output(s) where 1 is already in progress
2008-05-01 00:33:16,218 INFO org.apache.hadoop.mapred.ReduceTask: task_200804302255_0002_r_000720_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
Priority: Critical (was: Major)
It turns out that this problem is much more severe than it looked initially.
For any reasonably sized job where the shuffle takes some time, the userlog/syslog file of each reducer task may
grow unreasonably large (0.5GB, say). This places a big burden on hod when it harvests the log files while deallocating
a cluster. Also, if those log files are archived on a DFS (as hod does now), the space required on the DFS
will be quite significant.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Priority: Critical
[jira] Commented: (HADOOP-3332) improving the logging during
shuffling
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595889#action_12595889 ]
Arun C Murthy commented on HADOOP-3332:
---------------------------------------
I'm a little concerned about adding a 'gettimeofday'-ish call in the inner loop, especially when it is just for logging...
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during
shuffling
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595447#action_12595447 ]
Hadoop QA commented on HADOOP-3332:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381688/3332.patch
against trunk revision 654315.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2430/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2430/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2430/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2430/console
This message is automatically generated.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch, 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during
shuffling
Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley updated HADOOP-3332:
----------------------------------
Affects Version/s: 0.18.0 (was: 0.17.0)
Fix Version/s: 0.18.0 (was: 0.17.0)
This was caused by HADOOP-3297, which was only committed to 0.18. The confusion happened because Runping was running with it applied to 0.17.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Blocker
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during
shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3332:
--------------------------------
Attachment: 3332.patch
Here is another stab at this. The patch logs only when updates happen or when a fixed time limit (1 minute) expires. I hope I have covered all the logging that happens frequently in the shuffle.
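The policy described above (log on change, or after one minute of silence) can be sketched as follows. This is an illustration only and is not taken from the attached patch: the class ThrottledStatusLog, its method names, and the in-memory list standing in for the real logger are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper: records a status line only when the text changes
 * or when at least intervalMillis have passed since the last recorded line.
 */
class ThrottledStatusLog {
  private final long intervalMillis;
  private final List<String> sink = new ArrayList<String>(); // stand-in for LOG.info
  private String lastMessage = null;
  private long lastLogTime = 0L;

  ThrottledStatusLog(long intervalMillis) {
    this.intervalMillis = intervalMillis;
  }

  /** Returns true if the message was recorded. The caller supplies the clock reading. */
  boolean maybeLog(long nowMillis, String message) {
    boolean changed = !message.equals(lastMessage);
    boolean expired = nowMillis - lastLogTime >= intervalMillis;
    if (!changed && !expired) {
      return false; // identical status inside the quiet interval: suppress it
    }
    sink.add(message);
    lastMessage = message;
    lastLogTime = nowMillis;
    return true;
  }

  List<String> lines() {
    return sink;
  }

  public static void main(String[] args) {
    ThrottledStatusLog log = new ThrottledStatusLog(60000L);
    long start = System.currentTimeMillis();
    // Simulate five once-per-second passes reporting the same status.
    for (int i = 0; i < 5; i++) {
      log.maybeLog(start + i * 1000L, "Scheduled 0 of 2 known outputs");
    }
    System.out.println(log.lines().size() + " line(s) recorded");
  }
}
```

Passing the timestamp in as a parameter lets the caller reuse a clock reading it already has, and it keeps the helper easy to test with fixed times.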
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during
shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595260#action_12595260 ]
Devaraj Das commented on HADOOP-3332:
-------------------------------------
*sigh* I forgot about that point. Will upload one shortly.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during
shuffling
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595786#action_12595786 ]
Runping Qi commented on HADOOP-3332:
------------------------------------
The patch looks good.
It should be checked into the 0.17 branch too.
Without this fix, the log file of EACH reduce task can easily reach gigabytes in size; any useful information
will be buried in the huge volume of repeated noise and will be hard to find. Log archiving will also be problematic.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Commented: (HADOOP-3332) improving the logging during
shuffling
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596270#action_12596270 ]
Amar Kamat commented on HADOOP-3332:
------------------------------------
bq. .... actually there was a call to currentTimeMillis already. I moved it up so that it can be reused for logging also. ....
+1
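For readers following the thread, the idea of reusing one timestamp for both scheduling and logging can be illustrated with a minimal sketch. This is NOT the code from the attached patches: the class name ThrottledShuffleLog, the maybeLog method, the 60-second interval, and the host names are all hypothetical, chosen only to show how a single currentTimeMillis value could gate repeated status lines while still naming the slow hosts.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of time-throttled shuffle logging; not the HADOOP-3332 patch. */
public class ThrottledShuffleLog {
    private final long minIntervalMillis; // assumed minimum gap between status lines
    private long lastLogTime;
    private boolean loggedOnce = false;

    public ThrottledShuffleLog(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Returns a status line if this is the first call or if at least
     * minIntervalMillis has elapsed since the last emitted line; otherwise null.
     * The caller passes in a timestamp it has already obtained, so no extra
     * clock call is made just for logging.
     */
    public String maybeLog(long now, String taskId, int needed, List<String> slowHosts) {
        if (loggedOnce && now - lastLogTime < minIntervalMillis) {
            return null; // suppress the repeated message
        }
        loggedOnce = true;
        lastLogTime = now;
        return taskId + " Need another " + needed
            + " map output(s); slow hosts: " + slowHosts;
    }

    public static void main(String[] args) {
        ThrottledShuffleLog log = new ThrottledShuffleLog(60000L);
        List<String> slow = Arrays.asList("host-a.example.com", "host-b.example.com");
        // One clock read per scheduling pass, reused for the logging decision.
        long now = System.currentTimeMillis();
        for (int pass = 0; pass < 5; pass++) {
            String line = log.maybeLog(now + pass * 1000L, "task_example_r_000720_0", 3, slow);
            if (line != null) {
                System.out.println(line);
            }
        }
    }
}
```

With one scheduling pass per second, as in the excerpt, a gate like this would emit one line per interval instead of four lines per second, and the line that does appear carries the host names the original report found missing.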
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Blocker
> Fix For: 0.17.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch
[jira] Updated: (HADOOP-3332) improving the logging during
shuffling
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3332:
--------------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
I just committed this to the trunk.
> improving the logging during shuffling
> --------------------------------------
>
> Key: HADOOP-3332
> URL: https://issues.apache.org/jira/browse/HADOOP-3332
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Runping Qi
> Assignee: Devaraj Das
> Priority: Blocker
> Fix For: 0.18.0
>
> Attachments: 3332.branch17.patch, 3332.patch, 3332.patch