Posted to user@nutch.apache.org by Arun Kaundal <ar...@gmail.com> on 2005/12/05 06:56:47 UTC

org.apache.nutch.protocol.ProtocolNotFound: protocol

Nutch Geeks-

   I am facing a problem with parsing: the protocol for a particular
type of file is not found. How do I parse the content of those files?
What configuration changes are required (if any), or is it a problem with
a particular library? Please reply as soon as you can. Thanks a ton.

The complete log is attached in the errorlog.txt file.

051205 104856 fetching file:///F:/atsd/Crawl_Files/v4n.txt
051205 104856 fetching file:///F:/atsd/Crawl_Files/FetcherTask.html
051205 104856 fetch of file:///F:/atsd/Crawl_Files/FetcherTask.html failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=file
051205 104856 fetch of file:///F:/atsd/Crawl_Files/v4n.txt failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=file
051205 104856 Unable to parse [null].Reason is [java.net.MalformedURLException]
051205 104856 Could not clean the content-type [], Reason is [org.apache.nutch.util.mime.MimeTypeException: The type can not be null or empty]. Using its raw version...
051205 104856 Could not clean the content-type [], Reason is [org.apache.nutch.util.mime.MimeTypeException: The type can not be null or empty]. Using its raw version...
java.lang.NullPointerException
        at org.apache.nutch.parse.ParserFactory.findExtensions(ParserFactory.java:280)
        at org.apache.nutch.parse.ParserFactory.getExtensions(ParserFactory.java:254)
        at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:149)
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:58)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.handleFetch(Fetcher.java:252)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:204)
java.lang.NullPointerException
        at org.apache.nutch.parse.ParserFactory.findExtensions(ParserFactory.java:280)
        at org.apache.nutch.parse.ParserFactory.getExtensions(ParserFactory.java:254)
        at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:149)
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:58)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.handleFetch(Fetcher.java:252)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:204)
051205 104856 fetch okay, but can't parse file:///F:/atsd/Crawl_Files/v4n.txt, reason: failed(2,200): java.lang.NullPointerException
051205 104856 fetch okay, but can't parse file:///F:/atsd/Crawl_Files/FetcherTask.html, reason: failed(2,200): java.lang.NullPointerException