Posted to user@mahout.apache.org by "Zhang, Pengchu" <pz...@sandia.gov> on 2014/02/20 22:25:27 UTC

RE: [EXTERNAL] Re: Mapreduce job failed

Thanks, it has been executed successfully. Two more questions related to this:

1. Does this mean that I have to run Mahout in non-MR mode for any further analysis?

2. It is too bad that Hadoop 2.2 does not support the newer versions of Mahout. Are you aware of Hadoop 1.x working with Mahout 0.8 or 0.9 in MR mode? I do have a large dataset to be clustered.

Thanks.

Pengchu

-----Original Message-----
From: Suneel Marthi [mailto:suneel_marthi@yahoo.com] 
Sent: Thursday, February 20, 2014 1:17 PM
To: user@mahout.apache.org
Subject: [EXTERNAL] Re: Mapreduce job failed

... and the reason for this failing is that 'TaskAttemptContext', which was a class in Hadoop 1.x, has become an interface in Hadoop 2.2.

I suggest that you execute this job in non-MR mode with '-xm sequential'.
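For reference, a full sequential-mode invocation might look like the following sketch (the input/output paths are taken from the failing run reported below; adjust them to your own data):

```shell
# Run seqdirectory in local, sequential mode -- no MapReduce job is
# submitted, which sidesteps the Hadoop 1.x/2.x API incompatibility.
mahout seqdirectory \
  --input /shakespeare_text \
  --output /shakespeare-seqdir \
  --charset utf-8 \
  -xm sequential
```

Note that sequential mode runs on a single machine, so it is only practical while the input corpus fits comfortably on one node.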




On Thursday, February 20, 2014 2:26 PM, Suneel Marthi <su...@yahoo.com> wrote:
 
It seems like you are running this on Hadoop 2.2 (officially not supported for Mahout 0.8 or 0.9); the workaround is to run this in sequential mode with "-xm sequential".







On Thursday, February 20, 2014 1:36 PM, "Zhang, Pengchu" <pz...@sandia.gov> wrote:

Hello, I am trying to run "seqdirectory" with Mahout (0.8 and 0.9) on my Linux box with Hadoop (2.2.0), but it keeps failing consistently.

I tested Hadoop with the Hadoop example pi and wordcount, both worked well.

With a simple text file or a directory of multiple text files, e.g., Shakespeare_text, I get the same failure message:

>mahout seqdirectory --input /Shakespeare_text --output 
>/Shakespeare-seqdir --charset utf-8

$ mahout seqdirectory --input /shakespeare_text --output /shakespeare-seqdir --charset utf-8
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /home/pzhang/hadoop-2.2.0/bin/hadoop and HADOOP_CONF_DIR=/home/pzhang/hadoop-2.2.0/etc/hadoop
MAHOUT-JOB: /home/pzhang/MAHOUT_HOME/mahout-examples-0.8-job.jar
14/02/20 11:29:42 INFO common.AbstractJob: Command line arguments: {--charset=[utf-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/shakespeare_text], --keyPrefix=[], --method=[mapreduce], --output=[/shakespeare-seqdir], --startPhase=[0], --tempDir=[temp]}
14/02/20 11:29:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/02/20 11:29:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/02/20 11:29:44 INFO input.FileInputFormat: Total input paths to process : 43
14/02/20 11:29:44 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 5284832
14/02/20 11:29:44 INFO mapreduce.JobSubmitter: number of splits:1
14/02/20 11:29:44 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/02/20 11:29:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1392919123773_0004
14/02/20 11:29:45 INFO impl.YarnClientImpl: Submitted application application_1392919123773_0004 to ResourceManager at /0.0.0.0:8032
14/02/20 11:29:45 INFO mapreduce.Job: The url to track the job: http://savm0072lx.sandia.gov:8088/proxy/application_1392919123773_0004/
14/02/20 11:29:45 INFO mapreduce.Job: Running job: job_1392919123773_0004
14/02/20 11:29:53 INFO mapreduce.Job: Job job_1392919123773_0004 running in uber mode : false
14/02/20 11:29:53 INFO mapreduce.Job:  map 0% reduce 0%
14/02/20 11:29:58 INFO mapreduce.Job: Task Id : attempt_1392919123773_0004_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
    at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
    ... 10 more
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
    at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:52)
    ... 15 more

Any suggestion would be helpful.

Thanks.

Pengchu

Re: [EXTERNAL] Re: Mapreduce job failed

Posted by Suneel Marthi <su...@yahoo.com>.
You may not need to; if you build Mahout with the Hadoop 2 profile you should be good. Please look at Gokhan's email and try applying his patch.





On Thursday, February 20, 2014 5:25 PM, "Zhang, Pengchu" <pz...@sandia.gov> wrote:
 
Please provide the link for opening a JIRA issue.


-----Original Message-----
From: Suneel Marthi [mailto:suneel_marthi@yahoo.com] 
Sent: Thursday, February 20, 2014 2:35 PM
To: user@mahout.apache.org
Subject: Re: [EXTERNAL] Re: Mapreduce job failed






On Thursday, February 20, 2014 4:26 PM, "Zhang, Pengchu" <pz...@sandia.gov> wrote:

Thanks, it has been executed successfully. Two more questions related to this:

1. Does this mean that I have to run Mahout in non-MR mode for any further analysis?

No, that's not the case. There have been API changes between Hadoop 1.x and Hadoop 2.x, and Mahout is not certified for Hadoop 2.x.
But most of Mahout's jobs do work on Hadoop 2.x; seqdirectory is one job that fails due to the API incompatibility between Hadoop 1.x and Hadoop 2.x.


2. It is too bad that Hadoop 2.2 does not support the newer versions of Mahout. Are you aware of Hadoop 1.x working with Mahout 0.8 or 0.9 in MR mode? I do have a large dataset to be clustered.

As mentioned earlier, Mahout 0.8/0.9 are certified for Hadoop 1.x, so you shouldn't see any issues there.

Could you open a JIRA for this issue so that it's trackable? As we are now working towards Mahout 1.0 and Hadoop 2.x compatibility, it's good that you have reported this issue. Thanks.



Thanks.

Pengchu



RE: [EXTERNAL] Re: Mapreduce job failed

Posted by "Zhang, Pengchu" <pz...@sandia.gov>.
Please provide the link for open a JIRA Hadoop.


RE: [EXTERNAL] Re: Mapreduce job failed

Posted by "Zhang, Pengchu" <pz...@sandia.gov>.
Built and ran clustering (kmeans) successfully with svn trunk and Hadoop 2.2.0. Thanks for your nice work. -Pengchu

From: Gokhan Capan [mailto:gkhncpn@gmail.com]
Sent: Friday, February 21, 2014 12:21 PM
To: Zhang, Pengchu
Subject: Re: [EXTERNAL] Re: Mapreduce job failed

Great then, I'm gonna commit it on Monday

Sent from my iPhone

On Feb 21, 2014, at 17:46, "Zhang, Pengchu" <pz...@sandia.gov> wrote:
I saw that you have tested on clustering… successfully.

That is what I want to do.

From: Gokhan Capan [mailto:gkhncpn@gmail.com]
Sent: Friday, February 21, 2014 8:37 AM
To: Zhang, Pengchu
Subject: Re: [EXTERNAL] Re: Mapreduce job failed

By the way, you can see the progress in

https://issues.apache.org/jira/browse/MAHOUT-1329


Gokhan

On Fri, Feb 21, 2014 at 5:32 PM, Zhang, Pengchu <pz...@sandia.gov> wrote:
Thanks, I am looking forward to trying it. Pengchu

From: Gokhan Capan [mailto:gkhncpn@gmail.com]
Sent: Friday, February 21, 2014 8:30 AM

To: Zhang, Pengchu
Subject: Re: [EXTERNAL] Re: Mapreduce job failed

Hi, I tried it and it works.

I'll commit this in two days; then you can build Mahout as usual and work with it without a problem.

Best

Gokhan

On Fri, Feb 21, 2014 at 4:56 PM, Zhang, Pengchu <pz...@sandia.gov> wrote:
Eclipse on Ubuntu Linux, but I can build from the command line. I have installed and used Maven.
Thanks.

Pengchu

From: Gokhan Capan [mailto:gkhncpn@gmail.com]
Sent: Thursday, February 20, 2014 11:28 PM
To: Zhang, Pengchu

Subject: Re: [EXTERNAL] Re: Mapreduce job failed

Sure, what IDE are you using for Java development, IntelliJ?

Sent from my iPhone

On Feb 21, 2014, at 0:47, "Zhang, Pengchu" <pz...@sandia.gov> wrote:
Gokhan:

It is embarrassing (so not posted to the user group ☺) to ask about the procedure for building Mahout with the patch applied; I have never built a package with a patch before. Thanks.


1. Download the Mahout 0.9 source code
2. Replace core/pom.xml with the patch 1329-3
3. Build it with Maven?
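Assuming the JIRA attachment is saved as MAHOUT-1329.patch (the real attachment name on MAHOUT-1329 may differ), the steps above might be sketched as:

```shell
# Hypothetical sketch -- the file name and the patch strip level (-p0 vs -p1)
# depend on how the patch was generated; check the JIRA attachment.
cd mahout-0.9                       # top of the unpacked Mahout source tree
patch -p0 < MAHOUT-1329.patch       # applies the core/pom.xml change
mvn clean package -DskipTests=true  # build with Maven, skipping tests
```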

Pengchu


From: Gokhan Capan [mailto:gkhncpn@gmail.com]
Sent: Thursday, February 20, 2014 2:40 PM
To: user@mahout.apache.org; Zhang, Pengchu
Subject: Re: [EXTERNAL] Re: Mapreduce job failed

If you have a chance to build Mahout from source, could you check whether it works with the patch from MAHOUT-1329 applied? Packaging Mahout with mvn and "-DskipTests=true" is pretty fast.

Gokhan

On Thu, Feb 20, 2014 at 11:34 PM, Suneel Marthi <su...@yahoo.com>> wrote:





On Thursday, February 20, 2014 4:26 PM, "Zhang, Pengchu" <pz...@sandia.gov>> wrote:

Thanks, it has been executed successfully.  Two more questions related to this:

1. This means that I have to execute Mahout for further analysis with the non-MR mode?
No, that's not the case. There have been API changes between Hadoop 2.x and Hadoop 1.x.
Mahout is not certified for Hadoop 2.x.
But most of Mahout's jobs work on Hadoop 2.x, seqdirectory could be one job that's failing due to API incompatibility between Hadoop 1.x and Hadoop 2.x


2. It is too bad that Hadoop2.2. does not support for newer versions of Mahout. Are you aware of that Hadoop 1.x working with Mahout 0.8 0r 0.9 on MR? I do have a large dataset to be clustered.
As mentioned earlier, Mahout 0.8/0.9 are certified for Hadoop 1.x so u shouldn't be seeing any issues with that.

Could u open a JIRA for this issue so that its trackable?  As we r
 now working towards Mahout 1.0 and Hadoop 2.x compatibility its good that u have reported this issue. Thanks.



Thanks.

Pengchu


-----Original Message-----
From: Suneel Marthi [mailto:suneel_marthi@yahoo.com<ma...@yahoo.com>]
Sent: Thursday, February 20, 2014 1:17 PM
To: user@mahout.apache.org<ma...@mahout.apache.org>
Subject: [EXTERNAL] Re: Mapreduce job failed

... and the reason for this failing is that 'TaskAttemptContext' which was a Class in Hadoop 1.x has now become an interface in Hadoop 2.2.

Suggest that u execute this job in non-MR mode with '-xm
 sequential'.




On Thursday, February 20, 2014 2:26 PM, Suneel Marthi <su...@yahoo.com>> wrote:

Seems like u r running this on HAdoop 2.2 (officially not supported for Mahout 0.8 or 0.9), work around is to run this in sequential mode with "-xm sequential".







On Thursday, February 20, 2014 1:36 PM, "Zhang, Pengchu" <pz...@sandia.gov>> wrote:

Hello, I am trying to "seqdirirectory" with mahout (0.8 and 0.9) on my Linux box with Hadoop (2.2.0) but keeping failed
 consistently.

I tested Hadoop with the Hadoop example pi and wordcount, both worked well.

With a simple text file or directory with multiple text files, e.g., Shakespeare_text, I got the same message of failure.

>mahout seqdirectory --input /Shakespeare_txet --output
>/Shakespeare-seqdir --charset utf-8

$ mahout seqdirectory --input /shakespeare_text --output /shakespeare-seqdir --charset utf-8 MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /home/pzhang/hadoop-2.2.0/bin/hadoop and HADOOP_CONF_DIR=/home/pzhang/hadoop-2.2.0/etc/hadoop
MAHOUT-JOB: /home/pzhang/MAHOUT_HOME/mahout-examples-0.8-job.jar
14/02/20 11:29:42 INFO common.AbstractJob: Command line arguments: {--charset=[utf-8], --chunkSize=[64],
 --endPhase=[2147483647<tel:%5B2147483647>], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/shakespeare_text], --keyPrefix=[], --method=[mapreduce], --output=[/shakespeare-seqdir], --startPhase=[0], --tempDir=[temp]}
14/02/20 11:29:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/02/20 11:29:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032<http://0.0.0.0:8032>
14/02/20 11:29:44 INFO input.FileInputFormat: Total input paths to process : 43
14/02/20 11:29:44 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 5284832
14/02/20 11:29:44 INFO mapreduce.JobSubmitter: number of splits:1
14/02/20 11:29:44 INFO Configuration.deprecation: user.name<http://user.name> is deprecated. Instead, use mapreduce.job.user.name<http://mapreduce.job.user.name>
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.value.class is
 deprecated. Instead, use mapreduce.job.output.value.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/02/20 11:29:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1392919123773_0004
14/02/20 11:29:45 INFO impl.YarnClientImpl: Submitted application application_1392919123773_0004 to ResourceManager at /0.0.0.0:8032
14/02/20 11:29:45 INFO mapreduce.Job: The url to track the job: http://savm0072lx.sandia.gov:8088/proxy/application_1392919123773_0004/
14/02/20 11:29:45 INFO mapreduce.Job: Running job: job_1392919123773_0004
14/02/20 11:29:53 INFO mapreduce.Job: Job job_1392919123773_0004 running in uber mode : false
14/02/20 11:29:53 INFO mapreduce.Job:  map 0% reduce 0%
14/02/20 11:29:58 INFO mapreduce.Job: Task Id : attempt_1392919123773_0004_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
    at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at
 org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
    ... 10 more
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was
 expected
    at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:52)
    ... 15 more

Any suggestion would be helpful.

Thanks.

Pengchu




Re: [EXTERNAL] Re: Mapreduce job failed

Posted by Gokhan Capan <gk...@gmail.com>.
If you have a chance to build Mahout from source, could you check whether it
works with the patch from MAHOUT-1329 applied? Packaging Mahout with mvn and
"-DskipTests=true" is pretty fast.

Gokhan



Re: [EXTERNAL] Re: Mapreduce job failed

Posted by Suneel Marthi <su...@yahoo.com>.




On Thursday, February 20, 2014 4:26 PM, "Zhang, Pengchu" <pz...@sandia.gov> wrote:
 
Thanks, it has been executed successfully.  Two more questions related to this:

1. Does this mean that I have to run Mahout in non-MR mode for further analysis?

No, that's not the case. There have been API changes between Hadoop 1.x and Hadoop 2.x.
Mahout is not certified for Hadoop 2.x.
But most of Mahout's jobs work on Hadoop 2.x; seqdirectory could be one job that's failing due to the API incompatibility between Hadoop 1.x and Hadoop 2.x.


2. It is too bad that the newer versions of Mahout do not support Hadoop 2.2. Are you aware of Hadoop 1.x working with Mahout 0.8 or 0.9 in MR mode? I do have a large dataset to be clustered.

As mentioned earlier, Mahout 0.8/0.9 are certified for Hadoop 1.x, so you shouldn't be seeing any issues with that.

Could you open a JIRA for this issue so that it's trackable? As we are now working towards Mahout 1.0 and Hadoop 2.x compatibility, it's good that you have reported this issue. Thanks.



Thanks.

Pengchu
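
As noted earlier in the thread, the IncompatibleClassChangeError happens because Mahout 0.8/0.9 were compiled against Hadoop 1.x, where org.apache.hadoop.mapreduce.TaskAttemptContext was a class, while Hadoop 2.2 provides it as an interface at runtime. A minimal sketch of the kind of runtime probe a Hadoop 1.x/2.x compatibility shim can make is below. This is illustrative only, not Mahout's actual code; java.lang.Runnable and java.lang.Thread stand in for the Hadoop types, which are assumed not to be on the classpath here.

```java
// Illustrative sketch: detect at runtime whether a named type is an
// interface or a class, the kind of probe a Hadoop 1.x/2.x compatibility
// shim can use before choosing a code path.
public class CompatCheck {

    // Returns true if the named type is an interface in the running JVM.
    static boolean isInterface(String className) throws ClassNotFoundException {
        return Class.forName(className).isInterface();
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for org.apache.hadoop.mapreduce.TaskAttemptContext,
        // which is not assumed to be on the classpath here:
        System.out.println(isInterface("java.lang.Runnable")); // true  (interface)
        System.out.println(isInterface("java.lang.Thread"));   // false (class)
    }
}
```

On a Hadoop 1.x classpath the same probe on TaskAttemptContext would report a class; on Hadoop 2.2 it reports an interface, which is why code compiled against 1.x fails at runtime with IncompatibleClassChangeError rather than failing to compile.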



Re: [EXTERNAL] Re: Mapreduce job failed

Posted by Suneel Marthi <su...@yahoo.com>.
Mahout 0.8/0.9 are certified for Hadoop 1.2.1. 




