Posted to user@nutch.apache.org by bhawna singh <si...@gmail.com> on 2011/03/10 22:44:45 UTC

Spill Failed Error while fetching

Hi All,
I am crawling a list of 300K URLs, and after fetching around 200K of them I see
an "IOException: Spill failed" error.
The stack trace is below.

Does anyone have any insight into what I am running into and how I can
overcome this issue?
Thanks in advance,
Bhawna

Stack Trace:

2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - java.io.IOException: Spill failed
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:899)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:647)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill26.out
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - fetcher caught:java.io.IOException: Spill failed

Re: Spill Failed Error while fetching

Posted by Ken Krugler <kk...@transpac.com>.
On Mar 10, 2011, at 1:44pm, bhawna singh wrote:

> Hi All,
> I am crawling a list of 300K URLs, and after fetching around 200K of them I see
> an "IOException: Spill failed" error.
> The stack trace is below.
>
> Does anyone have any insight into what I am running into and how I can
> overcome this issue?

I believe that can happen when you run out of free local disk space to
use during the shuffle phase of a Hadoop job.

-- Ken
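
One way to test the disk-space theory is to look at how much room is left on the volume that holds Hadoop's local working directory. The class below is only a rough sketch, not part of Nutch or Hadoop: the class name, the method names and the 1 GiB threshold are made up for illustration, and the directories to check are passed on the command line. In Hadoop releases of this period the spill files normally go under mapred.local.dir, which by default sits below hadoop.tmp.dir (usually /tmp/hadoop-<username>), but that should be confirmed against the local configuration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical, not a Nutch or Hadoop class):
// reports usable disk space for candidate Hadoop local directories.
public class LocalDirSpaceCheck {

    // Usable bytes on the volume holding dir. If dir does not exist yet,
    // walk up to the nearest existing ancestor and measure that instead.
    public static long usableBytes(File dir) {
        File f = dir.getAbsoluteFile();
        while (f != null && !f.exists()) {
            f = f.getParentFile();
        }
        return f == null ? 0L : f.getUsableSpace();
    }

    // Directories whose volume has fewer than minFreeBytes usable.
    public static List<File> lowOnSpace(List<File> dirs, long minFreeBytes) {
        List<File> low = new ArrayList<>();
        for (File d : dirs) {
            if (usableBytes(d) < minFreeBytes) {
                low.add(d);
            }
        }
        return low;
    }

    public static void main(String[] args) {
        List<File> dirs = new ArrayList<>();
        for (String a : args) {
            dirs.add(new File(a));
        }
        if (dirs.isEmpty()) {
            // Assumption: fall back to the JVM temp dir when nothing is given.
            dirs.add(new File(System.getProperty("java.io.tmpdir")));
        }
        long minFree = 1L << 30; // 1 GiB, an arbitrary illustrative threshold
        for (File d : dirs) {
            System.out.printf("%s: %d MiB usable%n",
                    d, usableBytes(d) / (1024 * 1024));
        }
        for (File d : lowOnSpace(dirs, minFree)) {
            System.out.println("LOW: " + d);
        }
    }
}
```

Running it with the configured local directory as the argument while the fetch is in progress should show whether the free space shrinks towards zero before the spill fails. If it does, pointing hadoop.tmp.dir (or mapred.local.dir) at a larger volume is the usual remedy.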


--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g