Posted to mapreduce-user@hadoop.apache.org by Joseph McMahon <jo...@iswcorp.com> on 2012/02/06 20:15:09 UTC

SequenceFile problem

I have a 654 MB sequence file <Text,BytesWritable> that I'm using as the
input to a MapReduce job.  I have it loaded into HDFS on my cluster.  The
first job is simple: iterate through the text files in the sequence file
and generate some counts.  Nothing CPU intensive.  The job seems to stall
periodically: no map tasks are executing, and all of them are waiting for
the next key/value pair.

Task attempts time out after 600 seconds and are then killed, and the map
progress percentage reverts.  I added logging to the map code, and it runs
start to finish in milliseconds to seconds.  Another map task just doesn't
seem to start up.
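
For reference, the 600-second limit matches the default task timeout. In
Hadoop 1.x that setting is the mapred.task.timeout property, given in
milliseconds. The fragment below is only a sketch of how it would appear
in mapred-site.xml, assuming the cluster runs with the default value (an
assumption, not something verified for this cluster):

```xml
<!-- Assumption: Hadoop 1.x property name, shown with its default value. -->
<property>
  <name>mapred.task.timeout</name>
  <!-- Milliseconds a task may go without reading input, writing output,
       or updating its status before it is killed. 600000 ms = 600 s. -->
  <value>600000</value>
</property>
```

Raising this value would only delay the kill; it would not explain why
the tasks stop receiving key/value pairs.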

Thanks.