Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/01/20 13:23:26 UTC

[jira] [Created] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.3

Markus Jelsma created NUTCH-2354:
------------------------------------

             Summary: Upgrade Hadoop dependencies to 2.7.3
                 Key: NUTCH-2354
                 URL: https://issues.apache.org/jira/browse/NUTCH-2354
             Project: Nutch
          Issue Type: Bug
          Components: injector
    Affects Versions: 1.12
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
            Priority: Blocker
             Fix For: 1.13


This Wednesday we experienced trouble running the 1.12 injector on Hadoop 2.7.3. We ran 2.7.2 before and had no trouble running jobs.

2017-01-18 15:36:53,005 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
	at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:216)
	at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:100)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
        at org.apache.nutch.crawl.Injector.inject(Injector.java:383)
        at org.apache.nutch.crawl.Injector.run(Injector.java:467)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.Injector.main(Injector.java:441)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Our processes kept retrying the injection for a few minutes until we manually shut them down. Meanwhile, on HDFS, our CrawlDB was gone. Thanks to snapshots and/or backups we could restore it, so enable those if you haven't done so yet.

These freak Hadoop errors can be notoriously difficult to debug, but it seems we are in luck: recompiling Nutch against Hadoop 2.7.3 instead of 2.4.0 resolves it. You are also in luck if your job file uses the old org.apache.hadoop.mapred.* API; only jobs using the org.apache.hadoop.mapreduce.* API seem to fail.
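
For anyone who wants to check what their classpath actually provides before recompiling, here is a minimal sketch. It only reports whether a type is loaded as a class or as an interface, and from which location; it does not by itself explain why the injector fails. The names ClassShapeCheck and describe are made up for this example and are not part of Nutch or Hadoop.

```java
import java.security.CodeSource;

public class ClassShapeCheck {

    /**
     * Reports whether the named type is visible as a class or an interface,
     * and the location (usually a jar) it was loaded from.
     * Hypothetical helper for illustration only.
     */
    static String describe(String className) {
        try {
            // Load without running static initializers; we only inspect the type.
            Class<?> type = Class.forName(className, false,
                    ClassShapeCheck.class.getClassLoader());
            String kind = type.isInterface() ? "interface" : "class";

            // JDK core classes have no code source, so guard against null.
            CodeSource source = type.getProtectionDomain().getCodeSource();
            String origin = (source == null || source.getLocation() == null)
                    ? "bootstrap or unknown location"
                    : source.getLocation().toString();

            return className + " -> " + kind + ", loaded from " + origin;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " could not be loaded: " + e;
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.Counter";
        System.out.println(describe(name));
    }
}
```

Running it with the cluster's jars on the classpath, for example java -cp "$(hadoop classpath):." ClassShapeCheck, should show which jar supplies Counter at run time. Comparing that with the Hadoop version Nutch was compiled against is one way to spot a mismatch.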



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

re: [jira] How to unsubscribe?

Posted by Trevor Oakley <tr...@merrows.co.uk>.
Please can I unsubscribe?
  
  
  
