Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 07:33:04 UTC
[jira] [Commented] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.4 stable
[ https://issues.apache.org/jira/browse/NUTCH-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641402#comment-14641402 ]
Lewis John McGibbney commented on NUTCH-2049:
---------------------------------------------
BTW, this is only for 2.4.0, for the same reason as explained in the last issue.
This is an upgrade of dependencies and API usage... NOT a mapred --> mapreduce API migration for each NutchJob.
[~markus.jelsma@openindex.io] had a great crack at trying to upgrade some... I would also join his ranks and make best efforts to move all jobs to the 2.X mapreduce API if it makes sense. It would be nice to have a Nutch roadmap TBH.
Team, how do we feel here?
Tests are broken as follows:
{code}
Testsuite: org.apache.nutch.crawl.TestCrawlDbFilter
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.986 sec
------------- Standard Output ---------------
2015-07-25 01:29:50,852 WARN util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-07-25 01:29:51,215 INFO compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.deflate]
2015-07-25 01:29:51,231 INFO compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.deflate]
2015-07-25 01:29:51,231 INFO crawl.CrawlDBTestUtil (CrawlDBTestUtil.java:createCrawlDb(67)) - adding:http://www.example.com
2015-07-25 01:29:51,232 INFO crawl.CrawlDBTestUtil (CrawlDBTestUtil.java:createCrawlDb(67)) - adding:http://www.example1.com
2015-07-25 01:29:51,235 INFO crawl.CrawlDBTestUtil (CrawlDBTestUtil.java:createCrawlDb(67)) - adding:http://www.example2.com
------------- ---------------- ---------------
------------- Standard Error -----------------
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/trunk_clean/build/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/trunk_clean/build/test/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
------------- ---------------- ---------------

Testcase: testUrl404Purging took 0.969 sec
	Caused an ERROR
Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
	at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
	at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
	at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:832)
	at org.apache.nutch.crawl.TestCrawlDbFilter.testUrl404Purging(TestCrawlDbFilter.java:107)
{code}
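For anyone reading the trace above: as far as I understand it, in Hadoop 2.x Cluster.initialize asks each client protocol provider found on the classpath whether it can serve the configured mapreduce.framework.name, and throws this IOException when none of them can. The sketch below is only a simplified, self-contained model of that lookup, not Hadoop's code. ClusterInitSketch, ProtocolProvider, LOCAL and YARN are hypothetical names made up for illustration. If the model is right, one possible cause is that the jar carrying the local provider (hadoop-mapreduce-client-common, I believe) is missing from the test classpath; that is an assumption and has not been verified against this build.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class ClusterInitSketch {

    /** Hypothetical stand-in for a provider that serves exactly one framework name. */
    interface ProtocolProvider {
        /** Returns a client description, or null if this provider does not handle the framework. */
        String create(String frameworkName);
    }

    // Illustrative providers; the returned strings are just labels, not real Hadoop objects.
    static final ProtocolProvider LOCAL = name -> "local".equals(name) ? "LocalJobRunner" : null;
    static final ProtocolProvider YARN = name -> "yarn".equals(name) ? "YarnRunner" : null;

    /** Tries every available provider in turn and fails if none accepts the framework name. */
    static String initialize(String frameworkName, List<ProtocolProvider> providers) throws IOException {
        for (ProtocolProvider provider : providers) {
            String client = provider.create(frameworkName);
            if (client != null) {
                return client;
            }
        }
        // Same wording as the message in the trace, including its original typo.
        throw new IOException("Cannot initialize Cluster. Please check your configuration for "
            + "mapreduce.framework.name and the correspond server addresses.");
    }

    public static void main(String[] args) throws IOException {
        // Both providers available: the "local" framework resolves.
        System.out.println(initialize("local", Arrays.asList(LOCAL, YARN)));

        // Local provider absent (modelling a missing jar): the lookup fails.
        try {
            initialize("local", Arrays.asList(YARN));
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model the configuration value itself can be perfectly fine and the call still fails, which is why checking the test classpath seems worth doing before touching mapreduce.framework.name.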
> Upgrade Trunk to Hadoop > 2.4 stable
> ------------------------------------
>
> Key: NUTCH-2049
> URL: https://issues.apache.org/jira/browse/NUTCH-2049
> Project: Nutch
> Issue Type: Improvement
> Components: build
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Fix For: 1.11
>
> Attachments: NUTCH-2049.patch
>
>
> Convo here - http://www.mail-archive.com/dev%40nutch.apache.org/msg18225.html
> I am +1 for taking trunk (or a branch of trunk) to explicit dependency on > Hadoop 2.6.
> We can run our tests, we can validate, we can fix.
> I will be doing validation on 2.X in parallel as this is what I use on my own projects.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)