Posted to user@nutch.apache.org by meh <me...@gmail.com> on 2009/10/10 12:56:28 UTC

NUTCH_CRAWLING

Hi,

bin/nutch crawl urls -dir crawl_NEW1 -depth 3 -topN 50

I used the above command to run a crawl.
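
(For context, urls here is a directory containing plain-text seed lists, one URL per line. A minimal setup looks like the following; the file name seed.txt and the example URL are placeholders.)

mkdir -p urls
echo "http://www.example.com/" > urls/seed.txt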

I am getting the following error:

Dedup: adding indexes in: crawl_NEW1/indexes
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
        at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
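
From what I understand, "Job failed!" here is just a generic wrapper, and the underlying exception should be recorded in logs/hadoop.log (the default Nutch log location; this is an assumption if log4j.properties has been customized). Something like the following should surface it:

# show the most recent entries around the failure
tail -n 100 logs/hadoop.log

# the first ERROR lines usually name the root cause
grep -n ERROR logs/hadoop.log | head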


Can anyone help me resolve this problem?

Thank you in advance.


