Posted to common-user@hadoop.apache.org by Sébastien Rainville <se...@gmail.com> on 2008/11/04 05:41:28 UTC

map tasks fail to report status

Hi,

I have a map task that works most of the time but fails on some data. I keep
getting these exceptions:

Task attempt_200811031947_0003_m_000095_0 failed to report status for 600
seconds. Killing!


I noticed that the tasks that fail have a lot of these at the end of the
syslogs:

2008-11-03 21:05:52,745 INFO org.apache.hadoop.mapred.Merger: Merging 41 sorted segments
2008-11-03 21:05:52,746 INFO org.apache.hadoop.mapred.Merger: Merging 5 intermediate segments out of a total of 41
2008-11-03 21:05:53,016 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 37
2008-11-03 21:05:53,147 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 28
2008-11-03 21:05:53,329 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 19
2008-11-03 21:05:53,525 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 7866139 bytes
2008-11-03 21:05:53,848 INFO org.apache.hadoop.mapred.MapTask: Index: (2465254733, 7866121, 7866121)
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 41 sorted segments
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 5 intermediate segments out of a total of 41
2008-11-03 21:05:53,963 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 37
2008-11-03 21:05:53,976 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 28
2008-11-03 21:05:53,996 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 19
2008-11-03 21:05:54,013 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 4290611 bytes
...


The tasks that succeed have these lines too, but the number of segments is
always significantly lower:

2008-11-03 20:42:38,214 INFO org.apache.hadoop.mapred.MapTask: Index: (125745724, 351203, 351203)
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 345895 bytes
2008-11-03 20:42:38,226 INFO org.apache.hadoop.mapred.MapTask: Index: (126096927, 345893, 345893)
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 364718 bytes
2008-11-03 20:42:38,237 INFO org.apache.hadoop.mapred.MapTask: Index: (126442820, 364716, 364716)
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 440435 bytes
2008-11-03 20:42:38,247 INFO org.apache.hadoop.mapred.MapTask: Index: (126807536, 440433, 440433)
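
To put numbers on that difference, here is a rough sketch of how the
segment counts could be tallied from a task's syslog. It is not part of the
job itself: the script name segment_counts.py and the function name
segment_counts are made up for illustration, and the pattern only assumes
the "Merging N sorted segments" wording that appears in the logs above.

```python
import re
import sys
from collections import Counter

# Matches the "Merging N sorted segments" lines logged by
# org.apache.hadoop.mapred.Merger. Using \s+ keeps the pattern working
# when a mail client or pager has wrapped the line after the number.
MERGE_START = re.compile(r"Merger: Merging\s+(\d+)\s+sorted\s+segments")


def segment_counts(syslog_text):
    """Map each starting segment count to how many merges began with it."""
    return Counter(int(n) for n in MERGE_START.findall(syslog_text))


if __name__ == "__main__":
    # Hypothetical usage: python segment_counts.py path/to/syslog [more ...]
    for path in sys.argv[1:]:
        with open(path) as f:
            print(path, dict(segment_counts(f.read())))
```

Running it over the syslog of a failed attempt and of a successful one
should make it easy to see whether the failing tasks always start their
merges with far more segments.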


I don't get any exceptions besides the timeouts, which happen because the
tasks don't report their status. So, my questions are:
- What exactly is the Merger? Why is it only merging at the end of the
tasks? Why does it seem to merge the same data several times?
- Can it really be causing the problem, or should I look somewhere else
(there's no exception, after all)? It's most probably in my code, but I
don't see any exception, so it's kind of hard to tell what's happening.

Thanks in advance,
Sebastien