Posted to user@avro.apache.org by EdwardKing <zh...@neusoft.com> on 2014/02/24 09:45:43 UTC

avro run error

I am running an Avro job on Hadoop 2.2.0. Hadoop ships with avro-1.7.4.jar, as shown here:
[hadoop@master lib]$ pwd
/home/software/hadoop-2.2.0/share/hadoop/common/lib

[hadoop@master lib]$ ls av*
avro-1.7.4.jar

Then I added avro-1.7.4.jar to the classpath:
$ export CLASSPATH=.:/home/software/hadoop-2.2.0/share/hadoop/common/lib/avro-1.7.4.jar:${CLASSPATH}

My code is as follows:

 public int run(String[] args) throws Exception {
    JobConf conf = new JobConf(getConf(), getClass());
    conf.setJobName("UFO count");
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: avro UFO counter <in> <out>");
      System.exit(2);
    }
    FileInputFormat.addInputPath(conf, new Path(otherArgs[0]));
    Path outputPath = new Path(otherArgs[1]);
    FileOutputFormat.setOutputPath(conf, outputPath);
    // Remove any previous output directory (recursive delete).
    outputPath.getFileSystem(conf).delete(outputPath, true);
    // Input schema is read from ufo.avsc, packaged next to this class.
    Schema input_schema = Schema.parse(getClass().getResourceAsStream("ufo.avsc"));
    AvroJob.setInputSchema(conf, input_schema);
    // Map output is a (string, long) pair.
    AvroJob.setMapOutputSchema(conf, Pair.getPairSchema(Schema.create(Schema.Type.STRING), Schema.create(Schema.Type.LONG)));
    AvroJob.setOutputSchema(conf, OUTPUT_SCHEMA);
    AvroJob.setMapperClass(conf, AvroRecordMapper.class);
    AvroJob.setReducerClass(conf, AvroRecordReducer.class);
    conf.setInputFormat(AvroInputFormat.class);
    JobClient.runJob(conf);           /*----------------AvroMR.java:41----------------*/
    return 0;
 }


Then I ran the job:

[hadoop@master ~]$ hadoop jar avroufo.jar AvroMR avroin avroout
14/02/24 00:06:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/02/24 00:06:56 INFO client.RMProxy: Connecting to ResourceManager at master/172.11.12.6:8993
14/02/24 00:06:56 INFO client.RMProxy: Connecting to ResourceManager at master/172.11.12.6:8993
14/02/24 00:07:00 INFO mapred.FileInputFormat: Total input paths to process : 1
14/02/24 00:07:00 INFO mapreduce.JobSubmitter: number of splits:2
14/02/24 00:07:00 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.output.key.comparator.class is deprecated. Instead, use mapreduce.job.output.key.comparator.class
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/02/24 00:07:00 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/02/24 00:07:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1393229044702_0001
14/02/24 00:07:04 INFO impl.YarnClientImpl: Submitted application application_1393229044702_0001 to ResourceManager at master/172.11.12.6:8993
14/02/24 00:07:04 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1393229044702_0001/
14/02/24 00:07:04 INFO mapreduce.Job: Running job: job_1393229044702_0001
14/02/24 00:07:59 INFO mapreduce.Job: Job job_1393229044702_0001 running in uber mode : false
14/02/24 00:07:59 INFO mapreduce.Job:  map 0% reduce 0%
14/02/24 00:12:12 INFO mapreduce.Job:  map 50% reduce 0%
14/02/24 00:12:13 INFO mapreduce.Job: Task Id : attempt_1393229044702_0001_m_000001_0, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
14/02/24 00:12:13 INFO mapreduce.Job: Task Id : attempt_1393229044702_0001_m_000000_0, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
14/02/24 00:12:14 INFO mapreduce.Job:  map 0% reduce 0%
14/02/24 00:12:30 INFO mapreduce.Job: Task Id : attempt_1393229044702_0001_m_000001_1, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
14/02/24 00:12:31 INFO mapreduce.Job: Task Id : attempt_1393229044702_0001_m_000000_1, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
14/02/24 00:12:43 INFO mapreduce.Job: Task Id : attempt_1393229044702_0001_m_000001_2, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
14/02/24 00:12:44 INFO mapreduce.Job: Task Id : attempt_1393229044702_0001_m_000000_2, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
14/02/24 00:12:56 INFO mapreduce.Job:  map 50% reduce 0%
14/02/24 00:12:57 INFO mapreduce.Job:  map 100% reduce 100%
14/02/24 00:12:57 INFO mapreduce.Job: Job job_1393229044702_0001 failed with state FAILED due to: Task failed task_1393229044702_0001_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

14/02/24 00:12:57 INFO mapreduce.Job: Counters: 10
        Job Counters 
                Failed map tasks=7
                Killed map tasks=1
                Launched map tasks=8
                Other local map tasks=6
                Data-local map tasks=2
                Total time spent by all maps in occupied slots (ms)=571664
                Total time spent by all reduces in occupied slots (ms)=0
        Map-Reduce Framework
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
        at AvroMR.run(AvroMR.java:41)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at AvroMR.main(AvroMR.java:67)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
[hadoop@master ~]$ 

In the job history on the tracking UI, I found the following:
2014-02-24 00:12:12,624 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
 at org.apache.avro.mapred.AvroSerialization.getSerializer(AvroSerialization.java:107)

As far as I know, avro-1.7.4.jar contains org.apache.avro.generic.GenericData.createDatumWriter, so why does the job still fail with "java.lang.NoSuchMethodError"? How can I fix it? Thanks.
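
A NoSuchMethodError generally means the class found at run time is a different version from the one the calling code was compiled against. One way to check this is a small reflection probe, sketched below. The class name ClasspathProbe and its helper methods are hypothetical and not part of the original post; the defaults are simply the class and method named in the stack trace. It only reports what the JVM it runs in can see, so it is meaningful only when started with the same classpath as the failing task.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

/** Hypothetical helper: reports where a class was loaded from and whether it has a method. */
final class ClasspathProbe {

    /** Returns the jar or directory a class came from, or "(bootstrap)" if the JVM gives none. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return (source == null || source.getLocation() == null)
                ? "(bootstrap)" : source.getLocation().toString();
    }

    /** True if the class exposes a public method with this name (any signature). */
    static boolean hasPublicMethod(Class<?> cls, String methodName) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults are taken from the stack trace; override them on the command line.
        String className = args.length > 0 ? args[0] : "org.apache.avro.generic.GenericData";
        String methodName = args.length > 1 ? args[1] : "createDatumWriter";
        Class<?> cls = Class.forName(className);
        System.out.println(className + " loaded from " + locationOf(cls));
        System.out.println(methodName + " present: " + hasPublicMethod(cls, methodName));
    }
}
```

If the reported jar is not the one you expected, or the method is reported as missing, the problem is likely a version conflict between jars rather than the job code itself.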



---------------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s) 
is intended only for the use of the intended recipient and may be confidential and/or privileged of 
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of this communication is 
not the intended recipient, unauthorized use, forwarding, printing,  storing, disclosure or copying 
is strictly prohibited, and may be unlawful.If you have received this communication in error,please 
immediately notify the sender by return e-mail, and delete the original message and all copies from 
your system. Thank you. 
---------------------------------------------------------------------------------------------------

Re: avro run error

Posted by Gary Steelman <ga...@gmail.com>.

It looks to me like your master node is called "master" so why not try
going to master:8088 instead? Also, I'd suggest checking out yarn-site.xml
and seeing if there is a URI in there which has the master node for your
cluster.
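
To tell a name or address problem apart from a ResourceManager that is not running at all, a plain TCP check against the web UI port can help. The sketch below is only an illustration: the class name PortCheck and the 3-second timeout are assumptions, while the defaults "master" and 8088 come from the tracking URL in the job log.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP port accepts connections. */
final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers unknown host, connection refused and timeout alike.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "master";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8088;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```

Running it once with the host name and once with the IP address shows whether both forms reach the same listener. If neither does, the ResourceManager logs are probably the next place to look.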


On Mon, Feb 24, 2014 at 7:55 PM, EdwardKing <zh...@neusoft.com> wrote:

> Thanks, Gary.
> I added mapreduce.job.user.classpath.first = true to the mapred-site.xml
> file, like this:
>
> $ vi /home/software/hadoop-2.2.0/etc/hadoop/mapred-site.xml
> <configuration>
> <property>
>  <name>mapreduce.framework.name</name>
>  <value>yarn</value>
> </property>
> <property>
>  <name>mapreduce.jobhistory.address</name>
>  <value>master:10020</value>
> </property>
> <property>
>  <name>mapreduce.jobhistory.webapp.address</name>
>  <value>master:19888</value>
> </property>
> <property>
>  <name>mapreduce.job.user.classpath.first</name>
>  <value>true</value>
> </property>
> </configuration>
>
> Then I ran start-dfs.sh and start-yarn.sh, but now I cannot reach
> http://172.11.12.6:8088/cluster. What is wrong? Which configuration
> file do I need to modify? Thanks.

Re: avro run error

Posted by EdwardKing <zh...@neusoft.com>.

Thanks, Gary.
I added mapreduce.job.user.classpath.first = true to the mapred-site.xml file, like this:

$ vi /home/software/hadoop-2.2.0/etc/hadoop/mapred-site.xml
<configuration>
<property> 
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
</property>
<property>
 <name>mapreduce.jobhistory.address</name>
 <value>master:10020</value>
</property>
<property>
 <name>mapreduce.jobhistory.webapp.address</name>
 <value>master:19888</value>
</property>
<property>
 <name>mapreduce.job.user.classpath.first</name>
 <value>true</value>  
</property>
</configuration>

Then I ran start-dfs.sh and start-yarn.sh, but now I cannot reach http://172.11.12.6:8088/cluster. What is wrong? Which configuration file do I need to modify? Thanks.


 




Re: avro run error

Posted by Gary Steelman <ga...@gmail.com>.

Hi EdwardKing,

I had a similar issue come up last week, though mine was because I was
trying to use Avro 1.7.5 instead of 1.7.4. After some Googling I found a
MapReduce job config parameter you can change which solved my problem.

It turns out that, by default, Hadoop places the jars it ships with BEFORE
user-defined jars on the classpath. This means that if both your job and
Hadoop provide Avro, Hadoop's version (1.7.4) is loaded first. I needed MY
jars to come first so I could use updated versions of Avro and other
libraries. The parameter I changed is this:

mapreduce.job.user.classpath.first = true

Please see these two links for more details:
http://www.kiji.org/2013/10/08/fixing-classpath-ordering-issues-in-hadoop/+
https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/kiWv2PFjT1s
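
A NoSuchMethodError is raised at run time when the class that actually got loaded lacks a method the calling code was compiled against, which usually points to a different version of the jar winning on the classpath. The following is a minimal sketch of a probe for checking that, using only the standard Java reflection API. The class name ClasspathProbe and its two helper methods are hypothetical names chosen for illustration; they are not part of Hadoop or Avro.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ClasspathProbe {

    /** Where a class was loaded from (jar or directory), or a placeholder if the JVM does not report it. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(unknown code source)" : String.valueOf(src.getLocation());
    }

    /** True if cls exposes a public method with this name and number of parameters. */
    public static boolean hasPublicMethod(Class<?> cls, String name, int paramCount) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == paramCount) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Defaults are the class and method named in the error from the original post.
        String className = args.length > 0 ? args[0] : "org.apache.avro.generic.GenericData";
        String methodName = args.length > 1 ? args[1] : "createDatumWriter";
        try {
            Class<?> cls = Class.forName(className);
            System.out.println(className + " loaded from " + locationOf(cls));
            System.out.println(methodName + " (1 parameter) present: "
                    + hasPublicMethod(cls, methodName, 1));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Run with the same classpath the job client uses (for example, java -cp "$(hadoop classpath):." ClasspathProbe), this should print which jar supplies GenericData and whether it has a one-parameter createDatumWriter. Note that this is the client-side view; the classpath inside a task container may differ.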

Good luck,
Gary


On Mon, Feb 24, 2014 at 2:45 AM, EdwardKing <zh...@neusoft.com> wrote:

>  I use hadoop 2.2.0 to run avro, I know hadoop contains
> avro-1.7.4.jar,like follows:
> [hadoop@master lib]$ pwd
> /home/software/hadoop-2.2.0/share/hadoop/common/lib
>
> [hadoop@master lib]$ ls av*
> avro-1.7.4.jar
>
> Then I put avro-1.7.4.jar into classpath
> $export
> CLASSPATH=.:/home/software/hadoop-2.2.0/share/hadoop/common/lib/avro-1.7.4.jar:${CLASSPATH}
>
> My code is follows:
>
>  public int run(String[] args) throws Exception{
>     JobConf conf=new JobConf(getConf(),getClass());
>     conf.setJobName("UFO count");
>     String[] otherArgs=new
> GenericOptionsParser(conf,args).getRemainingArgs();
>     if(otherArgs.length!=2){
>       System.err.println("Usage: avro UFO counter <in><out>");
>       System.exit(2);
>     }
>     FileInputFormat.addInputPath(conf,new Path(otherArgs[0]));
>     Path outputPath=new Path(otherArgs[1]);
>     FileOutputFormat.setOutputPath(conf,outputPath);
>     outputPath.getFileSystem(conf).delete(outputPath);
>     Schema
> input_schema=Schema.parse(getClass().getResourceAsStream("ufo.avsc"));
>     AvroJob.setInputSchema(conf,input_schema);
>
> AvroJob.setMapOutputSchema(conf,Pair.getPairSchema(Schema.create(Schema.Type.STRING),Schema.create(Schema.Type.LONG)));
>     AvroJob.setOutputSchema(conf,OUTPUT_SCHEMA);
>     AvroJob.setMapperClass(conf,AvroRecordMapper.class);
>     AvroJob.setReducerClass(conf,AvroRecordReducer.class);
>     conf.setInputFormat(AvroInputFormat.class);
>     JobClient.runJob(conf);
> /*----------------AvroMR.java:41----------------*/
>
>
> Then I run avro
>
> [hadoop@master ~]$ hadoop jar avroufo.jar AvroMR avroin avroout
> 14/02/24 00:06:54 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 14/02/24 00:06:56 INFO client.RMProxy: Connecting to ResourceManager at
> master/172.11.12.6:8993
> 14/02/24 00:06:56 INFO client.RMProxy: Connecting to ResourceManager at
> master/172.11.12.6:8993
> 14/02/24 00:07:00 INFO mapred.FileInputFormat: Total input paths to
> process : 1
> 14/02/24 00:07:00 INFO mapreduce.JobSubmitter: number of splits:2
> 14/02/24 00:07:00 INFO Configuration.deprecation: user.name is
> deprecated. Instead, use mapreduce.job.user.name
> 14/02/24 00:07:00 INFO Configuration.deprecation: mapred.jar is
> deprecated. Instead, use mapreduce.job.jar
> 14/02/24 00:07:00 INFO Configuration.deprecation:
> mapred.output.key.comparator.class is deprecated. Instead, use
> mapreduce.job.output.key.comparator.class
> 14/02/24 00:07:00 INFO Configuration.deprecation:
> mapred.mapoutput.value.class is deprecated. Instead, use
> mapreduce.map.output.value.class
> 14/02/24 00:07:00 INFO Configuration.deprecation:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
> 14/02/24 00:07:00 INFO Configuration.deprecation: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
> 14/02/24 00:07:00 INFO Configuration.deprecation: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 14/02/24 00:07:00 INFO Configuration.deprecation: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 14/02/24 00:07:00 INFO Configuration.deprecation: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
> 14/02/24 00:07:00 INFO Configuration.deprecation: mapred.output.key.class
> is deprecated. Instead, use mapreduce.job.output.key.class
> 14/02/24 00:07:00 INFO Configuration.deprecation:
> mapred.mapoutput.key.class is deprecated. Instead, use
> mapreduce.map.output.key.class
> 14/02/24 00:07:00 INFO Configuration.deprecation: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
> 14/02/24 00:07:01 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1393229044702_0001
> 14/02/24 00:07:04 INFO impl.YarnClientImpl: Submitted application
> application_1393229044702_0001 to ResourceManager at master/
> 172.11.12.6:8993
> 14/02/24 00:07:04 INFO mapreduce.Job: The url to track the job:
> http://master:8088/proxy/application_1393229044702_0001/
> 14/02/24 00:07:04 INFO mapreduce.Job: Running job: job_1393229044702_0001
> 14/02/24 00:07:59 INFO mapreduce.Job: Job job_1393229044702_0001 running
> in uber mode : false
> 14/02/24 00:07:59 INFO mapreduce.Job:  map 0% reduce 0%
> 14/02/24 00:12:12 INFO mapreduce.Job:  map 50% reduce 0%
> 14/02/24 00:12:13 INFO mapreduce.Job: Task Id :
> attempt_1393229044702_0001_m_000001_0, Status : FAILED
> Error:
> org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
> 14/02/24 00:12:13 INFO mapreduce.Job: Task Id :
> attempt_1393229044702_0001_m_000000_0, Status : FAILED
> Error:
> org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
> 14/02/24 00:12:14 INFO mapreduce.Job:  map 0% reduce 0%
> 14/02/24 00:12:30 INFO mapreduce.Job: Task Id :
> attempt_1393229044702_0001_m_000001_1, Status : FAILED
> Error:
> org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
> 14/02/24 00:12:31 INFO mapreduce.Job: Task Id :
> attempt_1393229044702_0001_m_000000_1, Status : FAILED
> Error:
> org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
> 14/02/24 00:12:43 INFO mapreduce.Job: Task Id :
> attempt_1393229044702_0001_m_000001_2, Status : FAILED
> Error:
> org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
> 14/02/24 00:12:44 INFO mapreduce.Job: Task Id :
> attempt_1393229044702_0001_m_000000_2, Status : FAILED
> Error:
> org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
> 14/02/24 00:12:56 INFO mapreduce.Job:  map 50% reduce 0%
> 14/02/24 00:12:57 INFO mapreduce.Job:  map 100% reduce 100%
> 14/02/24 00:12:57 INFO mapreduce.Job: Job job_1393229044702_0001 failed
> with state FAILED due to: Task failed task_1393229044702_0001_m_000001
> Job failed as tasks failed. failedMaps:1 failedReduces:0
>
> 14/02/24 00:12:57 INFO mapreduce.Job: Counters: 10
>         Job Counters
>                 Failed map tasks=7
>                 Killed map tasks=1
>                 Launched map tasks=8
>                 Other local map tasks=6
>                 Data-local map tasks=2
>                 Total time spent by all maps in occupied slots (ms)=571664
>                 Total time spent by all reduces in occupied slots (ms)=0
>         Map-Reduce Framework
>                 CPU time spent (ms)=0
>                 Physical memory (bytes) snapshot=0
>                 Virtual memory (bytes) snapshot=0
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
>         at AvroMR.run(AvroMR.java:41)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at AvroMR.main(AvroMR.java:67)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> [hadoop@master ~]$
>
> I view the history of Track UI,I find following information:
> 2014-02-24 00:12:12,624 FATAL [main] org.apache.hadoop.mapred.YarnChild:
> Error running child : java.lang.NoSuchMethodError:
> org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
>  at
> org.apache.avro.mapred.AvroSerialization.getSerializer(AvroSerialization.java:107)
>
> I know avro-1.7.4.jar contains
> org.apache.avro.generic.GenericData.createDatumWriter, so why does it
> still raise the error "java.lang.NoSuchMethodError"? How can I correct
> it? Thanks.