Posted to user@nutch.apache.org by Ramakrishna <ra...@dioxe.com> on 2013/07/12 14:54:41 UTC
Exception while running Nutch
Injector: starting at 2013-07-12 18:17:41
Injector: crawlDb: bin/nutch
Injector: urlDir: crawl
Injector: Converting injected urls to crawl db entries.
Injector: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/D:/apache-nutch/crawl
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
at org.apache.nutch.crawl.Injector.run(Injector.java:248)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Injector.main(Injector.java:238)
Can anybody help me resolve this exception? I think it is a basic one, but I
don't know how to solve it.
--
View this message in context: http://lucene.472066.n3.nabble.com/Exception-while-running-Nutch-tp4077551.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Re: Exception while running Nutch
Posted by Ramakrishna <ra...@dioxe.com>.
Crawl urls -dir crawl -threads 3 -depth 3 -topN 5
This is the command I'm using. Is there anything wrong with it?
Re: Exception while running Nutch
Posted by Lewis John Mcgibbney <le...@gmail.com>.
Please check the syntax you are using for the CLI arguments. It is
incorrect.
You can find the correct usage syntax in the Nutch tutorial or in the
command-line options.
hth
Lewis
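
For readers following the thread: the log reports "crawlDb: bin/nutch" and "urlDir: crawl", which suggests the first two argument tokens were read positionally and the seed directory ended up pointing at a path that does not exist. The sketch below only illustrates that kind of mapping. It is not Nutch code: the helper names `parse_inject_args` and `check_url_dir` and the example token list are hypothetical, and it assumes the tool treats its first two arguments as the crawl db and the URL directory.

```python
from pathlib import Path


def parse_inject_args(argv):
    """Hypothetical helper: map the first two tokens positionally,
    mirroring the crawlDb/urlDir lines printed in the log."""
    if len(argv) < 2:
        raise SystemExit("usage: <crawldb> <url_dir>")
    return {"crawlDb": argv[0], "urlDir": argv[1]}


def check_url_dir(url_dir, base="."):
    """Hypothetical helper: fail early if the seed directory is missing."""
    path = Path(base) / url_dir
    if not path.is_dir():
        raise FileNotFoundError(f"Input path does not exist: {path}")
    return path


# Assumed token list: if the launcher name and sub-command are passed
# along as ordinary arguments, they are taken as crawlDb and urlDir.
shifted = parse_inject_args(["bin/nutch", "crawl", "urls", "-dir", "crawl"])
print(shifted)
```

Comparing the printed mapping with the directories that actually exist on disk is a quick way to see whether the arguments have been shifted before a job is submitted.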