Posted to user@nutch.apache.org by MOHIT GOYAL <go...@students.iiit.ac.in> on 2007/08/24 14:03:36 UTC

Re: protocol not found for url=file


-- 
I tried to crawl files in a local directory by putting links to that
directory in urls. I got the following error.

command:
bin/nutch crawl ../urls -dir crawlresult_localfs1


please help

-------------------------------------------------
failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=file
fetching file:///root/Desktop/csiro-split/CSIRO002







MOHIT GOYAL
CSE
200502013






Re: protocol not found for url=file

Posted by "kevin.Y" <02...@163.com>.
Hi MOHIT!
I have no experience with crawling over the file protocol, but I think you
should check the plugin.includes property in nutch-default.xml. Are you sure
you have enabled the protocol-file plugin?

Regards,
Keven
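
For readers hitting the same error, here is a rough sketch of what enabling
the protocol-file plugin might look like. It assumes the usual Nutch
convention of overriding defaults in conf/nutch-site.xml rather than editing
nutch-default.xml directly. The only change that matters for this thread is
adding "file" next to "http" in the protocol part of the pattern. The rest of
the value below is illustrative and is an assumption: copy the actual
plugin.includes value from your own nutch-default.xml and add protocol-file
to it.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: local overrides for nutch-default.xml -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- The value is a regular expression matched against plugin ids.
         "protocol-(http|file)" enables protocol-file alongside protocol-http.
         Everything after the first "|" is a placeholder list (assumption):
         use the value from your own nutch-default.xml instead. -->
    <value>protocol-(http|file)|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)</value>
  </property>
</configuration>
```

If ProtocolNotFound still appears for file: URLs after this change, it may
be worth confirming that a protocol-file directory actually exists under the
plugins folder of your Nutch installation.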


