Posted to common-dev@hadoop.apache.org by Devaraj Das <dd...@yahoo-inc.com> on 2007/03/01 04:56:17 UTC

RE: some reducers stuck in copying stage

Weird! This looks like a different problem, one that happened while
merging the map outputs at the Reduce task. The copying stage itself
went through fine. This requires some more analysis.
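
If it shows up again, one quick sanity check is whether the segment
files being merged are readable at all, independent of the merge. A
minimal scan along these lines might help (this assumes the current
SequenceFile.Reader API; the key/value classes below are placeholders
and must match whatever the job actually writes):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.SequenceFile;
  import org.apache.hadoop.io.Text;

  // Scan one suspect file end to end. A checksum error or NPE here
  // would point at the file itself rather than at the merge logic.
  public class ScanSeqFile {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      Path file = new Path(args[0]);
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
      try {
        LongWritable key = new LongWritable(); // placeholder key class
        Text value = new Text();               // placeholder value class
        long records = 0;
        while (reader.next(key, value)) {
          records++;
        }
        System.out.println("read " + records + " records cleanly");
      } finally {
        reader.close();
      }
    }
  }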

> -----Original Message-----
> From: Mike Smith [mailto:mike.smith.dev@gmail.com]
> Sent: Thursday, March 01, 2007 3:44 AM
> To: hadoop-dev@lucene.apache.org
> Subject: Re: some reducers stuck in copying stage
> 
> Devaraj,
> 
> After applying patch 1043, the copying problem is solved. But I am now
> getting new exceptions. The tasks do finish after being reassigned to
> another tasktracker, so the job gets done eventually. However, I never
> had this exception before applying the patch (or could it be because of
> changing the back-off time to 5 sec?):
> 
> java.lang.NullPointerException
>     at org.apache.hadoop.fs.FSDataInputStream$Buffer.seek(FSDataInputStream.java:74)
>     at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:121)
>     at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:217)
>     at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:163)
>     at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>     at java.io.DataInputStream.readFully(DataInputStream.java:178)
>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>     at org.apache.hadoop.io.SequenceFile$UncompressedBytes.reset(SequenceFile.java:427)
>     at org.apache.hadoop.io.SequenceFile$UncompressedBytes.access$700(SequenceFile.java:414)
>     at org.apache.hadoop.io.SequenceFile$Reader.nextRawValue(SequenceFile.java:1665)
>     at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawValue(SequenceFile.java:2579)
>     at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.next(SequenceFile.java:2351)
>     at org.apache.hadoop.io.SequenceFile$Sorter.writeFile(SequenceFile.java:2226)
>     at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2442)
>     at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2164)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:270)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1444)
> 
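> For the back-off change, a sketch of how it could be set from the job
> conf, assuming the copier reads it from a property
> (mapred.reduce.copy.backoff is the name used by later Hadoop versions;
> in 0.12 the value may well be hard-coded instead):
> 
>   import org.apache.hadoop.mapred.JobConf;
> 
>   public class ShortBackoff {
>     public static JobConf apply(JobConf job) {
>       // Hypothetical knob: cap the reduce-side copier's back-off at
>       // 5 seconds after a failed fetch. The property name is taken
>       // from later Hadoop versions and may not exist in 0.12.
>       job.setInt("mapred.reduce.copy.backoff", 5);
>       return job;
>     }
>   }
> 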
> 
> On 2/28/07, Mike Smith <mi...@gmail.com> wrote:
> >
> > Thanks Devaraj, patch 1042 seems to be committed already. Also, the
> > system never recovered even after 1 min or 300 sec; it stayed stuck
> > there for hours. I will try patch 1043 and also decrease the back-off
> > time to see if those help.
> >
> >