Posted to common-user@hadoop.apache.org by Aaron Baff <Aa...@telescope.tv> on 2011/03/16 01:24:15 UTC

Reducer error/hang on small single file Map input

I'm trying to figure out why, when I use a single, relatively small input file (< 3 MB, ~95K records), my Reducer hangs at 33% and the attempt never completes, with the output below. It looks like there is a problem when it tries to write to the log file: the write gets interrupted for some reason, the resulting InterruptedIOException is never caught, and the thread terminates. The Reducer task then sits there until the JobTracker kills it and starts another attempt, which inevitably hits the same problem.

When I take that same data and split it up into multiple files (5, 10, whichever, just >1), it runs fine with no issues. Very strange. Does anyone have any ideas?
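For what it's worth, the suspected failure mode (an exception escaping a worker thread's run() and silently killing it) can be sketched in plain Java. This is only an illustration, not the actual Hadoop/log4j code; the class and field names are made up:

```java
// Minimal sketch: an exception that escapes run() terminates the thread,
// and nothing else in the process is notified unless an
// UncaughtExceptionHandler is installed. This mirrors the suspected
// Fetcher death above. All names here are illustrative.
public class ThreadDeathDemo {
    static volatile boolean fetcherFinished = false;
    static volatile Throwable uncaught = null;

    public static void main(String[] args) throws InterruptedException {
        Thread fetcher = new Thread(() -> {
            // Simulates log4j's flush propagating an InterruptedIOException
            // out of the fetcher's run(); the "finished" flag is never set.
            throw new RuntimeException(
                new java.io.InterruptedIOException("flush interrupted"));
        });
        // Without a handler, the thread's death is invisible and the
        // reducer would just wait on it forever; the handler only makes
        // the death observable for this demo.
        fetcher.setUncaughtExceptionHandler((t, e) -> uncaught = e);
        fetcher.start();
        fetcher.join();
        System.out.println("finished normally: " + fetcherFinished);
        System.out.println("uncaught cause: " + uncaught.getCause());
    }
}
```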


--Aaron


stderr logs

log4j:ERROR Failed to flush writer,
java.io.InterruptedIOException
            at java.io.FileOutputStream.writeBytes(Native Method)
            at java.io.FileOutputStream.write(FileOutputStream.java:260)
            at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
            at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
            at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:276)
            at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122)
            at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
            at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:58)
            at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:316)
            at org.apache.log4j.WriterAppender.append(WriterAppender.java:160)
            at org.apache.hadoop.mapred.TaskLogAppender.append(TaskLogAppender.java:58)
            at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
            at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
            at org.apache.log4j.Category.callAppenders(Category.java:206)
            at org.apache.log4j.Category.forcedLog(Category.java:391)
            at org.apache.log4j.Category.log(Category.java:856)
            at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:199)
            at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.freeHost(ShuffleScheduler.java:345)
            at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152)



syslog logs

2011-03-15 17:02:32,258 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId=
2011-03-15 17:02:32,265 WARN org.apache.hadoop.conf.Configuration: user.name is deprecated. Instead, use mapreduce.job.user.name
2011-03-15 17:02:32,360 WARN org.apache.hadoop.conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2011-03-15 17:02:32,413 INFO org.apache.hadoop.mapreduce.task.reduce.MergeManager: MergerManager: memoryLimit=130514944, maxSingleShuffleLimit=32628736, mergeThreshold=86139864, ioSortFactor=10, memToMemMergeOutputsThreshold=10
2011-03-15 17:02:32,416 INFO org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_201103141158_0104_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
2011-03-15 17:02:32,423 INFO org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_201103141158_0104_r_000000_0: Got 1 new map-outputs
2011-03-15 17:02:32,423 INFO org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Assiging 10.10.11.50:50060 with 1 to fetcher#5
2011-03-15 17:02:32,423 INFO org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: assigned 1 of 1 to 10.10.11.50:50060 to fetcher#5
2011-03-15 17:02:32,730 INFO org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=50060/mapOutput?job=job_201103141158_0104&reduce=0&map=attempt_201103141158_0104_m_000000_0 sent hash and receievd reply
2011-03-15 17:02:32,735 INFO org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#5 about to shuffle output of map attempt_201103141158_0104_m_000000_0 decomp: 2050216 len: 2050220 to MEMORY
2011-03-15 17:02:32,790 INFO org.apache.hadoop.mapreduce.task.reduce.Fetcher: Read 2050216 bytes from map-output for attempt_201103141158_0104_m_000000_0
2011-03-15 17:02:32,790 INFO org.apache.hadoop.mapreduce.task.reduce.MergeManager: closeInMemoryFile -> map-output of size: 2050216, inMemoryMapOutputs.size() -> 1