Posted to mapreduce-user@hadoop.apache.org by Simon Willnauer <si...@googlemail.com> on 2009/12/09 19:31:14 UTC

Copyphase dies on timeout

Hi folks,

I have a question regarding a map-reduce job that gets killed
during the copy phase due to a timeout. I don't fully understand the
output, so I'm first looking for some help to understand what is
going on.
There are about 50 mappers running and each mapper succeeds. The job
is configured with a single reducer (I assume this could be closely
related to the problem?!)
So this is the output:

2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Merging
5 sorted segments
2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 5 segments left of total size: 216092233
bytes
2009-12-09 15:01:50,727 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Merge of the 5 files in-memory
complete. Local file is
/data/hadoop-data/mapred/local/taskTracker/jobcache/job_200912081402_0042/attempt_200912081402_0042_r_000000_0/output/map_275.out
of size 216092229
2009-12-09 15:01:50,728 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Thread waiting: Thread for
merging on-disk files
2009-12-09 15:01:58,109 INFO org.apache.hadoop.mapred.ReduceTask: Read
53281055 bytes from map-output for
attempt_200912081402_0042_m_000253_0
2009-12-09 15:01:58,148 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200912081402_0042_m_000253_0 -> (71, 103) from
hadoop32
2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Need another 44 map output(s)
where 3 is already in progress
2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
and41 dup hosts)
2009-12-09 15:02:40,174 INFO org.apache.hadoop.mapred.ReduceTask: Read
81297732 bytes from map-output for
attempt_200912081402_0042_m_000112_0
2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200912081402_0042_m_000112_0 -> (129, 153) from
hadoop31
2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
and16 dup hosts)
2009-12-09 15:02:40,223 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 82364774 bytes (82364778 raw bytes) into RAM from
attempt_200912081402_0042_m_000121_0
2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Read
82722679 bytes from map-output for
attempt_200912081402_0042_m_000196_0
2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200912081402_0042_m_000196_0 -> (114, 137) from
hadoop21
2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
and32 dup hosts)
2009-12-09 15:02:45,143 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 82639206 bytes (82639210 raw bytes) into RAM from
attempt_200912081402_0042_m_000195_0
2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
where 3 is already in progress
2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
and39 dup hosts)
2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
where 3 is already in progress
2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
and39 dup hosts)
......

Task attempt_200912081402_0042_r_000000_0 failed to report status for
902 seconds. Killing!

My first guess would be some IO problem, but I don't have any
indicator of that in these logs. If somebody can clarify or give me a
hint, that would be great. Even if it is something like: use more reducers
because of ... <-- this is the interesting part at the moment!
btw: this is hadoop-0.20

simon

RE: Copyphase dies on timeout

Posted by Fuad Efendi <fu...@efendi.ca>.
> >> 2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
> >> and39 dup hosts)

- Reduce was started
- you are probably using some kind of web crawler (39 duplicate hosts, 0 slow hosts)


> >> Task attempt_200912081402_0042_r_000000_0 failed to report status for
> >> 902 seconds. Killing!

- Try increasing <name>mapred.task.timeout</name> by a factor of 10; you are currently using 900 seconds.
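
For illustration, a minimal sketch of what bumping that value could look like from driver code, assuming the old (0.20) JobConf API; the helper class and the 10x value of 9,000,000 ms are just examples, not taken from Simon's job:

import org.apache.hadoop.mapred.JobConf;

public class TimeoutConfig {
    // Hypothetical helper, not part of Hadoop: raise the task timeout tenfold.
    // mapred.task.timeout is given in milliseconds; a value of 0 disables it.
    public static void raiseTaskTimeout(JobConf conf) {
        // current ~900 s x 10 = 9,000 s = 9,000,000 ms
        conf.setLong("mapred.task.timeout", 9000000L);
    }
}

The same value can of course be set in mapred-site.xml or the per-job configuration instead.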


Another case I had before (with a Reducer): Tika hangs forever on some weird HTML.

And another case (Bixo): HttpClient configured with 20 attempts and a 30 second timeout, which makes 20*30 = 600 seconds of "non-reporting delay", so the default mapred.task.timeout setting of 600 seconds is not enough.



Fuad Efendi
+1 416-993-2060
http://www.tokenizer.ca
Data Mining, Vertical Search



> -----Original Message-----
> From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
> Sent: December-09-09 1:46 PM
> To: Fuad Efendi
> Cc: mapreduce-user@hadoop.apache.org
> Subject: Re: Copyphase dies on timeout
> 
> Hmm, it looks like the reducer hasn't even started, and I am not using
> any external resource at all.
> 
> simon
> 
> On Wed, Dec 9, 2009 at 7:39 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> > I had similar issues before; the reduce can do some 'additional' processing
> > and have dependencies on external services.
> >
> > For instance, in my case it sends documents to SOLR (Lucene based), so I
> > have a dependency on Lucene in a reducer. And Lucene sometimes needs
> > 30-50 minutes to perform an index merge. If your timeout settings are not
> > high enough, the job will be killed. You may also execute reporter.keepAlive()
> > every 60 seconds (in a separate thread spawned by your reducer).
> >
> > See <name>mapred.task.timeout</name>, and use reporter.progress(),
> > keepAlive(), etc.
> >
> >
> > Fuad Efendi
> > +1 416-993-2060
> > http://www.tokenizer.ca
> > Data Mining, Vertical Search
> >
> >
> >> -----Original Message-----
> >> From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
> >> Sent: December-09-09 1:31 PM
> >> To: mapreduce-user@hadoop.apache.org
> >> Subject: Copyphase dies on timeout
> >>
> >> Hi folks,
> >>
> >> I have a question regarding a map-reduce job that gets killed
> >> during the copy phase due to a timeout. I don't fully understand the
> >> output, so I'm first looking for some help to understand what is
> >> going on.
> >> There are about 50 mappers running and each mapper succeeds. The job
> >> is configured with a single reducer (I assume this could be closely
> >> related to the problem?!)
> >> So this is the output:
> >>
> >> 2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Merging
> >> 5 sorted segments
> >> 2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Down to
> >> the last merge-pass, with 5 segments left of total size: 216092233
> >> bytes
> >> 2009-12-09 15:01:50,727 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Merge of the 5 files in-memory
> >> complete. Local file is
> >> /data/hadoop-
> >>
> data/mapred/local/taskTracker/jobcache/job_200912081402_0042/attempt_20091
> >> 2081402_0042_r_000000_0/output/map_275.out
> >> of size 216092229
> >> 2009-12-09 15:01:50,728 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Thread waiting: Thread for
> >> merging on-disk files
> >> 2009-12-09 15:01:58,109 INFO org.apache.hadoop.mapred.ReduceTask: Read
> >> 53281055 bytes from map-output for
> >> attempt_200912081402_0042_m_000253_0
> >> 2009-12-09 15:01:58,148 INFO org.apache.hadoop.mapred.ReduceTask: Rec
> >> #1 from attempt_200912081402_0042_m_000253_0 -> (71, 103) from
> >> hadoop32
> >> 2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Need another 44 map output(s)
> >> where 3 is already in progress
> >> 2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
> >> and41 dup hosts)
> >> 2009-12-09 15:02:40,174 INFO org.apache.hadoop.mapred.ReduceTask: Read
> >> 81297732 bytes from map-output for
> >> attempt_200912081402_0042_m_000112_0
> >> 2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask: Rec
> >> #1 from attempt_200912081402_0042_m_000112_0 -> (129, 153) from
> >> hadoop31
> >> 2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
> >> and16 dup hosts)
> >> 2009-12-09 15:02:40,223 INFO org.apache.hadoop.mapred.ReduceTask:
> >> Shuffling 82364774 bytes (82364778 raw bytes) into RAM from
> >> attempt_200912081402_0042_m_000121_0
> >> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Read
> >> 82722679 bytes from map-output for
> >> attempt_200912081402_0042_m_000196_0
> >> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Rec
> >> #1 from attempt_200912081402_0042_m_000196_0 -> (114, 137) from
> >> hadoop21
> >> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
> >> and32 dup hosts)
> >> 2009-12-09 15:02:45,143 INFO org.apache.hadoop.mapred.ReduceTask:
> >> Shuffling 82639206 bytes (82639210 raw bytes) into RAM from
> >> attempt_200912081402_0042_m_000195_0
> >> 2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
> >> where 3 is already in progress
> >> 2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
> >> and39 dup hosts)
> >> 2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
> >> where 3 is already in progress
> >> 2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
> >> and39 dup hosts)
> >> ......
> >>
> >> Task attempt_200912081402_0042_r_000000_0 failed to report status for
> >> 902 seconds. Killing!
> >>
> >> My first guess would be some IO problem, but I don't have any
> >> indicator of that in these logs. If somebody can clarify or give me a
> >> hint, that would be great. Even if it is something like: use more reducers
> >> because of ... <-- this is the interesting part at the moment!
> >> btw: this is hadoop-0.20
> >>
> >> simon
> >
> >
> >



Re: Copyphase dies on timeout

Posted by Simon Willnauer <si...@googlemail.com>.
Hmm, it looks like the reducer hasn't even started, and I am not using
any external resource at all.

simon

On Wed, Dec 9, 2009 at 7:39 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> I had similar issues before; the reduce can do some 'additional' processing and have dependencies on external services.
>
> For instance, in my case it sends documents to SOLR (Lucene based), so I have a dependency on Lucene in a reducer. And Lucene sometimes needs 30-50 minutes to perform an index merge. If your timeout settings are not high enough, the job will be killed. You may also execute reporter.keepAlive() every 60 seconds (in a separate thread spawned by your reducer).
>
> See <name>mapred.task.timeout</name>, and use reporter.progress(), keepAlive(), etc.
>
>
> Fuad Efendi
> +1 416-993-2060
> http://www.tokenizer.ca
> Data Mining, Vertical Search
>
>
>> -----Original Message-----
>> From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
>> Sent: December-09-09 1:31 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: Copyphase dies on timeout
>>
>> Hi folks,
>>
>> I have a question regarding a map-reduce job that gets killed
>> during the copy phase due to a timeout. I don't fully understand the
>> output, so I'm first looking for some help to understand what is
>> going on.
>> There are about 50 mappers running and each mapper succeeds. The job
>> is configured with a single reducer (I assume this could be closely
>> related to the problem?!)
>> So this is the output:
>>
>> 2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Merging
>> 5 sorted segments
>> 2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Down to
>> the last merge-pass, with 5 segments left of total size: 216092233
>> bytes
>> 2009-12-09 15:01:50,727 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Merge of the 5 files in-memory
>> complete. Local file is
>> /data/hadoop-
>> data/mapred/local/taskTracker/jobcache/job_200912081402_0042/attempt_20091
>> 2081402_0042_r_000000_0/output/map_275.out
>> of size 216092229
>> 2009-12-09 15:01:50,728 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Thread waiting: Thread for
>> merging on-disk files
>> 2009-12-09 15:01:58,109 INFO org.apache.hadoop.mapred.ReduceTask: Read
>> 53281055 bytes from map-output for
>> attempt_200912081402_0042_m_000253_0
>> 2009-12-09 15:01:58,148 INFO org.apache.hadoop.mapred.ReduceTask: Rec
>> #1 from attempt_200912081402_0042_m_000253_0 -> (71, 103) from
>> hadoop32
>> 2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Need another 44 map output(s)
>> where 3 is already in progress
>> 2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
>> and41 dup hosts)
>> 2009-12-09 15:02:40,174 INFO org.apache.hadoop.mapred.ReduceTask: Read
>> 81297732 bytes from map-output for
>> attempt_200912081402_0042_m_000112_0
>> 2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask: Rec
>> #1 from attempt_200912081402_0042_m_000112_0 -> (129, 153) from
>> hadoop31
>> 2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
>> and16 dup hosts)
>> 2009-12-09 15:02:40,223 INFO org.apache.hadoop.mapred.ReduceTask:
>> Shuffling 82364774 bytes (82364778 raw bytes) into RAM from
>> attempt_200912081402_0042_m_000121_0
>> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Read
>> 82722679 bytes from map-output for
>> attempt_200912081402_0042_m_000196_0
>> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Rec
>> #1 from attempt_200912081402_0042_m_000196_0 -> (114, 137) from
>> hadoop21
>> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
>> and32 dup hosts)
>> 2009-12-09 15:02:45,143 INFO org.apache.hadoop.mapred.ReduceTask:
>> Shuffling 82639206 bytes (82639210 raw bytes) into RAM from
>> attempt_200912081402_0042_m_000195_0
>> 2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
>> where 3 is already in progress
>> 2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
>> and39 dup hosts)
>> 2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
>> where 3 is already in progress
>> 2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
>> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
>> and39 dup hosts)
>> ......
>>
>> Task attempt_200912081402_0042_r_000000_0 failed to report status for
>> 902 seconds. Killing!
>>
>> My first guess would be some IO problem, but I don't have any
>> indicator of that in these logs. If somebody can clarify or give me a
>> hint, that would be great. Even if it is something like: use more reducers
>> because of ... <-- this is the interesting part at the moment!
>> btw: this is hadoop-0.20
>>
>> simon
>
>
>

RE: Copyphase dies on timeout

Posted by Fuad Efendi <fu...@efendi.ca>.
I had similar issues before; the reduce can do some 'additional' processing and have dependencies on external services.

For instance, in my case it sends documents to SOLR (Lucene based), so I have a dependency on Lucene in a reducer. And Lucene sometimes needs 30-50 minutes to perform an index merge. If your timeout settings are not high enough, the job will be killed. You may also execute reporter.keepAlive() every 60 seconds (in a separate thread spawned by your reducer).

See <name>mapred.task.timeout</name>, and use reporter.progress(), keepAlive(), etc.
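
As a rough illustration of that keep-alive idea, a minimal sketch assuming the old (0.20) org.apache.hadoop.mapred API; the ReduceHeartbeat class is made up for this example, and it only calls reporter.progress(), which is the standard way to tell the framework the task is still alive:

import org.apache.hadoop.mapred.Reporter;

public class ReduceHeartbeat {
    private final Thread thread;
    private volatile boolean running = true;

    public ReduceHeartbeat(final Reporter reporter) {
        thread = new Thread(new Runnable() {
            public void run() {
                while (running) {
                    reporter.progress();           // signal liveness to the TaskTracker
                    try {
                        Thread.sleep(60 * 1000L);  // report every 60 seconds
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "reduce-heartbeat");
        thread.setDaemon(true);
        thread.start();
    }

    public void stop() {
        running = false;
        thread.interrupt();
    }
}

You would typically start it just before the long-running work inside reduce() and call stop() once that work is done.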


Fuad Efendi
+1 416-993-2060
http://www.tokenizer.ca
Data Mining, Vertical Search


> -----Original Message-----
> From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
> Sent: December-09-09 1:31 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Copyphase dies on timeout
> 
> Hi folks,
> 
> I have a question regarding a map-reduce job that gets killed
> during the copy phase due to a timeout. I don't fully understand the
> output, so I'm first looking for some help to understand what is
> going on.
> There are about 50 mappers running and each mapper succeeds. The job
> is configured with a single reducer (I assume this could be closely
> related to the problem?!)
> So this is the output:
> 
> 2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Merging
> 5 sorted segments
> 2009-12-09 15:01:45,103 INFO org.apache.hadoop.mapred.Merger: Down to
> the last merge-pass, with 5 segments left of total size: 216092233
> bytes
> 2009-12-09 15:01:50,727 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Merge of the 5 files in-memory
> complete. Local file is
> /data/hadoop-
> data/mapred/local/taskTracker/jobcache/job_200912081402_0042/attempt_20091
> 2081402_0042_r_000000_0/output/map_275.out
> of size 216092229
> 2009-12-09 15:01:50,728 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Thread waiting: Thread for
> merging on-disk files
> 2009-12-09 15:01:58,109 INFO org.apache.hadoop.mapred.ReduceTask: Read
> 53281055 bytes from map-output for
> attempt_200912081402_0042_m_000253_0
> 2009-12-09 15:01:58,148 INFO org.apache.hadoop.mapred.ReduceTask: Rec
> #1 from attempt_200912081402_0042_m_000253_0 -> (71, 103) from
> hadoop32
> 2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Need another 44 map output(s)
> where 3 is already in progress
> 2009-12-09 15:01:58,273 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
> and41 dup hosts)
> 2009-12-09 15:02:40,174 INFO org.apache.hadoop.mapred.ReduceTask: Read
> 81297732 bytes from map-output for
> attempt_200912081402_0042_m_000112_0
> 2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask: Rec
> #1 from attempt_200912081402_0042_m_000112_0 -> (129, 153) from
> hadoop31
> 2009-12-09 15:02:40,175 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
> and16 dup hosts)
> 2009-12-09 15:02:40,223 INFO org.apache.hadoop.mapred.ReduceTask:
> Shuffling 82364774 bytes (82364778 raw bytes) into RAM from
> attempt_200912081402_0042_m_000121_0
> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Read
> 82722679 bytes from map-output for
> attempt_200912081402_0042_m_000196_0
> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask: Rec
> #1 from attempt_200912081402_0042_m_000196_0 -> (114, 137) from
> hadoop21
> 2009-12-09 15:02:45,044 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Scheduled 1 outputs (0 slow hosts
> and32 dup hosts)
> 2009-12-09 15:02:45,143 INFO org.apache.hadoop.mapred.ReduceTask:
> Shuffling 82639206 bytes (82639210 raw bytes) into RAM from
> attempt_200912081402_0042_m_000195_0
> 2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
> where 3 is already in progress
> 2009-12-09 15:02:58,274 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
> and39 dup hosts)
> 2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Need another 42 map output(s)
> where 3 is already in progress
> 2009-12-09 15:03:58,275 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_200912081402_0042_r_000000_0 Scheduled 0 outputs (0 slow hosts
> and39 dup hosts)
> ......
> 
> Task attempt_200912081402_0042_r_000000_0 failed to report status for
> 902 seconds. Killing!
> 
> My first guess would be some IO problem, but I don't have any
> indicator of that in these logs. If somebody can clarify or give me a
> hint, that would be great. Even if it is something like: use more reducers
> because of ... <-- this is the interesting part at the moment!
> btw: this is hadoop-0.20
> 
> simon