Posted to user@nutch.apache.org by Hut <mm...@gmail.com> on 2008/07/04 03:44:44 UTC

Re: problem running nutch from eclipse 3.2 in ubuntu hardy.

Hi, I checked out the Nutch/Hadoop source code and added their configuration
files under the conf/ folder to the build path.

IDE: IntelliJ 7.02.  OS: Ubuntu 8.04.
-----------------------------------------------------------------------------------------------------------------------------------------------------------
08/07/04 21:21:21 INFO conf.Configuration: crawl-urlfilter.txt not found
08/07/04 21:21:21 FATAL api.RegexURLFilterBase: Can't find resource:
crawl-urlfilter.txt
08/07/04 21:21:21 INFO mapred.LocalJobRunner:
file:/home/hut/installfiles/nutch-0.9/urls/anhuiligong:0+23
08/07/04 21:21:21 WARN regex.RegexURLNormalizer: can't find rules for scope
'inject', using default
08/07/04 21:21:21 WARN crawl.Injector: Skipping
http://www.aust.edu.cn/:java.lang.NullPointerException
08/07/04 21:21:21 INFO mapred.LocalJobRunner:
file:/home/hut/installfiles/nutch-0.9/urls/anhuiligong:0+23
-----------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------
08/07/04 21:24:17 INFO conf.Configuration: crawl-urlfilter.txt not found
08/07/04 21:24:17 FATAL api.RegexURLFilterBase: Can't find resource:
crawl-urlfilter.txt
08/07/04 21:24:17 WARN regex.RegexURLNormalizer: can't find rules for scope
'inject', using default
08/07/04 21:24:17 WARN crawl.Injector: Skipping
http://www2.ahut.edu.cn/:java.lang.NullPointerException
08/07/04 21:24:17 INFO mapred.LocalJobRunner:
file:/home/hut/installfiles/nutch-0.9/urls/anhuigongye:0+24
08/07/04 21:24:18 INFO mapred.LocalJobRunner: reduce > reduce
-----------------------------------------------------------------------------------------------------------------------------------------------------------

crawl-urlfilter.txt is never found when I run from IntelliJ. In fact, the
file has already been placed on the IDE's build path, just like the XML
configuration files. Is this related to the .txt extension, or to something else?
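
One way to narrow this down is to check whether the file is visible to the
class loader at run time, since the "Can't find resource" message suggests a
classpath lookup is failing. Below is a minimal sketch, assuming that Nutch
resolves this file through the class loader in the same way as its XML
configuration. ResourceCheck is a hypothetical helper class, not part of Nutch
or Hadoop, and nutch-default.xml is used only as an example of an XML file
expected to be on the classpath.

```java
import java.net.URL;

// Hypothetical diagnostic helper; not part of Nutch or Hadoop.
public class ResourceCheck {

    // Returns true if the named resource is visible to the class loader
    // that the running thread would use for classpath lookups.
    public static boolean isOnClasspath(String name) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader if no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        URL url = loader.getResource(name);
        return url != null;
    }

    public static void main(String[] args) {
        // Assumed file names: one XML file and the .txt file from the log.
        String[] names = {"nutch-default.xml", "crawl-urlfilter.txt"};
        for (String name : names) {
            System.out.println(name + " on classpath: " + isOnClasspath(name));
        }
    }
}
```

If this is run with the same run configuration as the crawl and the XML file
is reported as present while the .txt file is not, the IDE may not be copying
.txt files to its output directory, and the compiler's resource settings
would be worth checking.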

Thanks.


