Posted to user@nutch.apache.org by Edward Quick <ed...@hotmail.com> on 2005/09/21 22:23:05 UTC
resuming intranet crawl
Hi,
I ran out of disk space whilst crawling our intranet (the crawl has taken
24 hours so far). Is there a way to pick up the crawl from where it left
off, or do I have to restart it?
Thanks,
Ed.
050921 151225 Processing document 52000
050921 151300 Finishing update
Exception in thread "main" java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at org.apache.nutch.fs.LocalFileSystem$LocalNFSFileOutputStream.write(LocalFileSystem.java:126)
at org.apache.nutch.fs.NFSDataOutputStream$PositionCache.write(NFSDataOutputStream.java:36)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:66)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:110)
at java.io.DataOutputStream.write(DataOutputStream.java:85)
at org.apache.nutch.io.SequenceFile$Writer.append(SequenceFile.java:154)
at org.apache.nutch.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:753)
at org.apache.nutch.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:654)
at org.apache.nutch.io.SequenceFile$Sorter.mergePass(SequenceFile.java:591)
at org.apache.nutch.io.SequenceFile$Sorter.sort(SequenceFile.java:419)
at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:535)
at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:321)
at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:371)
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:141)