Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/08/09 17:06:27 UTC

[jira] [Commented] (NUTCH-839) nutch doesnt run under 0.20.2+228-1~karmic-cdh3b1 version of hadoop

    [ https://issues.apache.org/jira/browse/NUTCH-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081686#comment-13081686 ] 

Lewis John McGibbney commented on NUTCH-839:
--------------------------------------------

A very similar ticket appears to exist: NUTCH-937.

The log output posted by Claudio on that ticket also shows the RuntimeException "x point org.apache.nutch.net.URLNormalizer not found", which is the same exception reported in this issue.

Since there is more correspondence on that ticket, is it fair to say that this issue has been superseded by it?

> nutch doesnt run under 0.20.2+228-1~karmic-cdh3b1 version of hadoop
> -------------------------------------------------------------------
>
>                 Key: NUTCH-839
>                 URL: https://issues.apache.org/jira/browse/NUTCH-839
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.1
>         Environment: ubuntu linux version 2.6.31-14-server, x86_64 GNU/Linux
>            Reporter: Robert Gonzalez
>
> New versions of Hadoop appear to reference jars in a different format: instead of file:/a/b/c/d/job.jar, it's now jar:file:/a/b/c/d/job.jar!, which breaks Nutch when it's trying to load its plugins.  Specifically, the stack trace looks like:
> Caused by: java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
> 	at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:124)
> 	at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:57)
> A simple test class was written that used the URLFilters class, and the following stack trace resulted:
> 10/07/01 14:25:25 INFO mapred.JobClient: Task Id : attempt_201006171624_46525_m_000000_1, Status : FAILED
> java.lang.RuntimeException: org.apache.nutch.net.URLFilter not found.
> 	at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:52)
> 	at com.maxpoint.crawl.BidSampler$BIdSMapper.setup(BidSampler.java:42)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Running this on an older version of hadoop works.
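
To illustrate the difference between the two location formats quoted in the report, here is a minimal Java sketch. It is not Nutch's plugin-loading code, and the class and method names (JarUrlDemo, archivePath) are hypothetical; it only shows, using java.net.URI, why code that expects a file: location can come up empty when handed a jar:file:...! location, and one possible way to recover the archive path from either form.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration only; not taken from Nutch or Hadoop.
public class JarUrlDemo {

    /**
     * Returns the filesystem path of the archive that a location string
     * points at, accepting both "file:/..." and "jar:file:/...!" forms.
     */
    public static String archivePath(String location) throws URISyntaxException {
        String s = location;
        if (s.startsWith("jar:")) {
            // Drop the "jar:" prefix and anything from the '!' separator onward.
            s = s.substring("jar:".length());
            int bang = s.indexOf('!');
            if (bang >= 0) {
                s = s.substring(0, bang);
            }
        }
        return new URI(s).getPath();
    }

    public static void main(String[] args) throws URISyntaxException {
        // A "jar:" URI is opaque, so it has no path component of its own.
        System.out.println(new URI("jar:file:/a/b/c/d/job.jar!").getPath());

        // After unwrapping, both formats resolve to the same archive path.
        System.out.println(archivePath("file:/a/b/c/d/job.jar"));
        System.out.println(archivePath("jar:file:/a/b/c/d/job.jar!"));
    }
}
```

Whether this unwrapping is the right fix for Nutch depends on how its plugin repository resolves plugin folders, which the report does not show.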
