Posted to user@nutch.apache.org by SIP COP 009 <si...@gmail.com> on 2008/01/12 07:08:57 UTC

Error while crawling

Folks,

I have been using Nutch to crawl a set of documents. The crawl ran fine for
about five days, but then it failed with the following error:

2008-01-11 20:30:23,364 WARN  mapred.LocalJobRunner - job_local_1
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for map_0000/intermediate.29
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:281)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2732)
        at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2392)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:552)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:607)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:132)


Any idea what this is?
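
My own guess is that this is a disk problem rather than a Nutch bug, since the
exception comes from LocalDirAllocator while Hadoop is merging intermediate map
output, i.e. it cannot find a local directory with room left to write to. If I
understand the config right, that output goes under mapred.local.dir, which by
default lives under hadoop.tmp.dir, so pointing those at a bigger partition in
hadoop-site.xml might be the fix. Something like this is what I have in mind
(the /bigdisk paths are just placeholders for a larger disk on my box):

    <!-- hadoop-site.xml: my guess at the relevant properties -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/bigdisk/hadoop-tmp</value>
    </property>
    <property>
      <name>mapred.local.dir</name>
      <value>/bigdisk/mapred-local</value>
    </property>

Is that the right knob, or am I off base?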

Also, is there a way I can continue from the place where it failed? I do not
want to start from scratch again. Can I make use of the already crawled data?
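
For what it is worth, my understanding is that the crawl command just chains
the individual steps (inject, generate, fetch, updatedb, and so on), so I am
hoping I can keep the existing crawl/crawldb and crawl/segments directories and
simply rerun from the failed step, roughly like this (I may have the exact
arguments wrong):

    # reuse crawl/crawldb and crawl/segments from the earlier run
    bin/nutch generate crawl/crawldb crawl/segments
    # <segment> stands for whichever new segment generate creates
    bin/nutch fetch crawl/segments/<segment>
    bin/nutch updatedb crawl/crawldb crawl/segments/<segment>

Does that sound like a sane way to pick up where the crawl died?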

Thanks in advance.
ashutosh