Posted to user@mahout.apache.org by Shengchao Ding <di...@gmail.com> on 2013/01/10 19:03:59 UTC

Class not found in mahout split -xm mapreduce

I'm running the 20 newsgroups example on a CDH4.1.2 virtual machine.
It runs smoothly, but it fails when I change the split command to

mahout split \
    -i newsgroup/vectors \
    --trainingOutput newsgroup/train-vectors \
    --testOutput newsgroup/test-vectors \
    --randomSelectionPct 40 --overwrite --sequenceFiles -xm mapreduce \
    -mro newsgroup/mro

The only difference from the original command is that the method is changed
to mapreduce, whereas the original example uses sequential.

I got the following exception.

Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.mahout.utils.SplitInputJob$SplitInputMapper not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1571)
        at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:685)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.lang.ClassNotFoundException: Class org.apache.mahout.utils.SplitInputJob$SplitInputMapper not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1477)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1569)
        ... 8 more


I checked the mahout package on the distribution as follows.

[cloudera@localhost ~]$ jar tf /usr/lib/mahout/mahout-examples-0.7-cdh4.1.2-job.jar | grep SplitInput
org/apache/mahout/utils/SplitInputJob$SplitInputReducer.class
org/apache/mahout/utils/SplitInputJob$SplitInputMapper.class
org/apache/mahout/utils/SplitInputJob$SplitInputComparator.class
org/apache/mahout/utils/SplitInputJob.class
org/apache/mahout/utils/SplitInput.class
org/apache/mahout/utils/SplitInput$SplitCallback.class
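
The same check can be done programmatically. The following is only a minimal sketch using the standard java.util.jar API; the class name JarClassFinder and its two helper methods are hypothetical names chosen for illustration and are not part of Mahout or Hadoop. It lists the classes in a jar whose names contain a fragment and prints them as binary class names, the form that appears in the exception message. Note that it only confirms what is inside the jar on the client. The trace above is raised inside the map task (YarnChild), so a class can be present in the jar and still be missing from the task's classpath.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class JarClassFinder {

    /** Converts a jar entry path such as "a/b/Outer$Inner.class" to the binary name "a.b.Outer$Inner". */
    static String toBinaryName(String entryName) {
        String withoutSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    /** Returns the sorted binary names of all classes in the jar whose entry path contains the fragment. */
    static List<String> findClasses(Path jarPath, String fragment) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.endsWith(".class") && name.contains(fragment))
                    .map(JarClassFinder::toBinaryName)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassFinder <jar> <name-fragment>");
            return;
        }
        // Prints one binary class name per line, similar to `jar tf <jar> | grep <fragment>`.
        for (String name : findClasses(Paths.get(args[0]), args[1])) {
            System.out.println(name);
        }
    }
}
```

For example, running it with /usr/lib/mahout/mahout-examples-0.7-cdh4.1.2-job.jar and SplitInput as arguments should list the same six classes as the jar tf output above, in dotted form.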

Can anyone help me out? Thanks.

-- 
Shengchao

Re: Class not found in mahout split -xm mapreduce

Posted by Shengchao Ding <di...@gmail.com>.
This is the same issue as https://issues.apache.org/jira/browse/MAHOUT-1061
After modifying SplitInputJob.java, it works well on CDH4.

Thank you all.
-Shengchao
-----Original Message----- 
From: Ted Dunning 
Sent: Thursday, January 10, 2013 8:15 PM 
To: user@mahout.apache.org 
Subject: Re: Class not found in mahout split -xm mapreduce 

Using the 0.23 profile might allow you to compile a version that works with
CDH4.  Until Cloudera cares enough to test and commit a patch, however, we
can't be sure.

Re: Class not found in mahout split -xm mapreduce

Posted by Ted Dunning <te...@gmail.com>.
Using the 0.23 profile might allow you to compile a version that works with
CDH4.  Until Cloudera cares enough to test and commit a patch, however, we
can't be sure.

On Thu, Jan 10, 2013 at 5:52 PM, Marty Kube <
martykube@beavercreekconsulting.com> wrote:

> Hi Shengchao,
> My understanding is that CDH4 is not supported.  Try CDH3.
> Marty

Re: Class not found in mahout split -xm mapreduce

Posted by Marty Kube <ma...@beavercreekconsulting.com>.
Hi Shengchao,
My understanding is that CDH4 is not supported.  Try CDH3.
Marty