Posted to user@nutch.apache.org by Vinci <vi...@polyu.edu.hk> on 2008/01/30 20:24:41 UTC
Re: Fetch issue with Feeds (SOLVED)
Hi,
I finally figured out the solution:
go to conf/
rename the old mime-types.xml to anything else,
then copy tika-mimetypes.xml into the same directory under the name
mime-types.xml
The crawler should work now.
In short, this happens because 1.0-dev uses Tika, but the old
mime-detection config file is still being loaded.
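
For anyone who prefers to script it, here is a rough sketch of the steps above in Python. The function name swap_mime_config and the backup name mime-types.xml.old are only illustrative choices of mine (any other name for the old file works just as well), and it assumes both files sit directly in the conf/ directory as described.

```python
import shutil
from pathlib import Path


def swap_mime_config(conf_dir, backup_name="mime-types.xml.old"):
    """Set aside the old mime-types.xml and replace it with a copy of
    tika-mimetypes.xml. Both names inside conf_dir follow the post above;
    backup_name is a hypothetical choice."""
    conf = Path(conf_dir)
    old = conf / "mime-types.xml"
    tika = conf / "tika-mimetypes.xml"

    # Refuse to touch anything if the Tika file is not where we expect it.
    if not tika.is_file():
        raise FileNotFoundError(f"{tika} not found; nothing was changed")

    # Keep the old config under another name so the change can be undone.
    if old.exists():
        old.rename(conf / backup_name)

    # Copy rather than move, so tika-mimetypes.xml itself stays in place.
    shutil.copyfile(tika, old)
    return old
```

Calling swap_mime_config("conf") from the Nutch install directory should leave conf/mime-types.xml with the Tika definitions and the previous file saved next to it.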
Vinci wrote:
>
> Hi,
>
> Here is some additional information: before the exception appears, Nutch
> prints two messages:
>
> fetching http://cnn.com
> org.apache.tika.mime.MimeUtils load
> INFO loading [mime-types.xml]
> fetch of http://www.cnn.com/ failed with: java.lang.NullPointerException
> Fetcher: done
>
> It seems the mime-type loading has a problem... do I need to configure the
> file it loads?
>
>
>
> Vinci wrote:
>>
>> Hi All,
>>
>> I get the same exception when I try the nightly build on a
>> static page. Can anyone help?
>>
>>
>> Vicious wrote:
>>>
>>> Hi All,
>>>
>>> Using the latest nightly build, I am trying to run a crawl. I have set
>>> the agent property and all relevant plugins. However, as soon as I run the
>>> crawl I get the following error in hadoop.log. I read all the posts here,
>>> and the only suggestion was that the http.agent property should not be
>>> empty. Well, in my case it isn't, and yet I see the error. Any help will be
>>> appreciated.
>>>
>>> Thanks-
>>>
>>> fetcher.Fetcher - fetch of http://feeds.wired.com/CultOfMac failed
>>> with: java.lang.NullPointerE
>>> http.Http - java.lang.NullPointerException
>>> http.Http - at org.apache.nutch.protocol.Content.getContentType(Content.java:327)
>>> http.Http - at org.apache.nutch.protocol.Content.<init>(Content.java:95)
>>> http.Http - at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
>>> http.Http - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:164)
>>>
>>
>>
>
>
--
View this message in context: http://www.nabble.com/Fetch-issue-with-Feeds-tp15114911p15189897.html