Posted to user@nutch.apache.org by MOHIT GOYAL <go...@students.iiit.ac.in> on 2007/08/24 14:03:36 UTC
Re: protocol not found for url=file
--
I tried to crawl files in a local directory by putting file: links to that
directory in my urls list. I got the following error.
command:
bin/nutch crawl ../urls -dir crawlresult_localfs1
Please help.
-------------------------------------------------
failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=file
fetching file:///root/Desktop/csiro-split/CSIRO002
MOHIT GOYAL
CSE
200502013
Re: protocol not found for url=file
Posted by "kevin.Y" <02...@163.com>.
Hi MOHIT!
I don't have any experience with crawling over the file protocol, but I think
you should check the plugin.includes property in nutch-default.xml. Are you
sure you have enabled the protocol-file plugin?
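
As a rough sketch of what that check might lead to: overrides normally go in
conf/nutch-site.xml rather than nutch-default.xml, and the plugin.includes
value is a regular expression over plugin ids. Everything in the value below
other than protocol-file is an assumption for illustration; start from
whatever value your own nutch-default.xml contains and only add the file
protocol to it.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: local overrides of nutch-default.xml -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumed example value: copy the real one from your nutch-default.xml
         and change only the protocol part so that it also matches
         protocol-file. -->
    <value>protocol-(http|file)|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)</value>
  </property>
</configuration>
```

If the error goes away but nothing is fetched, it may also be worth checking
whether the URL filter file used by the crawl command still has a rule that
skips file: URLs.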
Regards,
Keven
MOHIT GOYAL wrote:
> --
> I tried to crawl the local directory files by giving links to local
> directory in urls.I got the following error.
>
> command:
> bin/nutch crawl ../urls -dir crawlresult_localfs1
>
> please help
>
> -------------------------------------------------
> failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not
> found for url=file
> fetching file:///root/Desktop/csiro-split/CSIRO002
>
> MOHIT GOYAL
> CSE
> 200502013