Posted to user@nutch.apache.org by Ramakrishna <ra...@dioxe.com> on 2013/07/12 14:54:41 UTC

Exception while running Nutch

Injector: starting at 2013-07-12 18:17:41
Injector: crawlDb: bin/nutch
Injector: urlDir: crawl
Injector: Converting injected urls to crawl db entries.
Injector: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/D:/apache-nutch/crawl
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
	at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
	at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
	at org.apache.nutch.crawl.Injector.run(Injector.java:248)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.crawl.Injector.main(Injector.java:238)


Can anybody help me clear this exception? I think it is a basic one, but I
don't know how to solve it.



--
View this message in context: http://lucene.472066.n3.nabble.com/Exception-while-running-Nutch-tp4077551.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Exception while running Nutch

Posted by Ramakrishna <ra...@dioxe.com>.

Crawl urls -dir crawl -threads 3 -depth 3 -topN 5

This is the command I'm using. Is there anything wrong with it?




Re: Exception while running Nutch

Posted by Lewis John Mcgibbney <le...@gmail.com>.

Please check the syntax you are using for the CLI arguments. It is all
wrong.
You can see the correct usage syntax in the Nutch tutorial or in the
command-line options.
hth
Lewis
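
For anyone landing on this thread: the log shows the Injector taking
"bin/nutch" as the crawl db and "crawl" as the URL directory, which suggests
the arguments were shifted by one position, and the job then failed because
D:/apache-nutch/crawl does not exist. Below is a minimal shell sketch of a
pre-flight check on the seed directory. The helper name check_seed_dir, the
directory name urls, and the file name seed.txt are illustrative assumptions
and not part of Nutch. The commented-out invocation follows the form in the
Nutch 1.x tutorial and assumes a release that still ships the crawl command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nutch): succeed only if the seed
# directory exists and contains at least one non-empty file.
check_seed_dir() {
  local dir="$1"
  # The Injector fails with InvalidInputException when this path is missing.
  [ -d "$dir" ] || { echo "missing seed directory: $dir" >&2; return 1; }
  # List regular files larger than 0 bytes; keep only the first match.
  [ -n "$(find "$dir" -type f -size +0c | head -n 1)" ] || {
    echo "no seed files in: $dir" >&2
    return 2
  }
}

# Assumed layout: run from the Nutch install directory.
# Left commented out so the sketch runs without a Nutch install.
#   mkdir -p urls
#   echo "http://nutch.apache.org/" > urls/seed.txt
#   check_seed_dir urls && \
#     bin/nutch crawl urls -dir crawl -threads 3 -depth 3 -topN 5
```

In that invocation, urls is the seed directory that gets injected and crawl
is only the output directory passed with -dir, so the input path Hadoop
looks for is urls rather than crawl.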
