Posted to dev@mahout.apache.org by "Peter Goldstein (JIRA)" <ji...@apache.org> on 2010/06/23 02:20:50 UTC

[jira] Created: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

$MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument
-----------------------------------------------------------------------------------------------------------------------------

                 Key: MAHOUT-427
                 URL: https://issues.apache.org/jira/browse/MAHOUT-427
             Project: Mahout
          Issue Type: Bug
    Affects Versions: 0.4
         Environment: Hadoop installation on EC2 as described here - https://cwiki.apache.org/MAHOUT/mahoutec2.html
            Reporter: Peter Goldstein


Once I resolved issue MAHOUT-426, I was still not able to run the ./examples/bin/build-reuters.sh script without errors. These were ClassNotFoundExceptions, indicating a problem with the classpath.

The issue appears to be related to the "-core" argument, which controls the classpath.  Placing '-core' as the first argument to the individual $MAHOUT_HOME/bin/mahout calls in the script solved the issue, but I'm not sure it was the correct solution.  It's unclear to me what the '-core' argument is supposed to signify.

Can someone shed some light on this, and tell me whether this is the correct solution to the problem?  And if not, what is the correct solution to this classpath issue?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885774#action_12885774 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

I wasn't able to reproduce this issue following the wiki instructions (up to step 9, running the hadoop cluster in single-node mode), leaving out the mvn install on the examples directory.

Is there any chance you could try a 'mvn clean install' (from the base directory) with your working copy to see if it still refuses to build an examples job? Otherwise, could you check out a fresh copy and give that a try?

Somewhat unrelated but worth noting: when I did run build-reuters.sh, k-means converged after 8 iterations instead of 10, and thus the clusterdump step did not complete successfully. When you finally managed to get the examples to run, what was the highest-numbered clusters-* directory in examples/bin/work/reuters-kmeans?
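
One way to answer the clusters-* question is a small helper along these lines. This is only a sketch: the function name highest_clusters_dir is made up for illustration, and it assumes the directories are named clusters-N with a numeric N.

```shell
# Print the highest-numbered clusters-N directory under the given directory.
# Hypothetical helper, not part of the Mahout scripts.
highest_clusters_dir() {
  work_dir=$1
  for d in "$work_dir"/clusters-*; do
    # Skip the unexpanded glob and anything that is not a directory.
    [ -d "$d" ] || continue
    # Emit "N path" so the entries can be sorted numerically on N.
    printf '%s %s\n' "${d##*clusters-}" "$d"
  done | sort -n | tail -n 1 | cut -d' ' -f2-
}
```

For example, `highest_clusters_dir examples/bin/work/reuters-kmeans` would print the last directory k-means wrote, which may be lower than the iteration count a later step expects.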



[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891774#action_12891774 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

bq. Based on some of the output it looks like this may be another missing job.setJarByClass(Class) issue

That appears to be exactly what the problem is.


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891757#action_12891757 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

Based on some of the output it looks like this may be another missing job.setJarByClass(Class) issue.  I'm in the process of rebuilding with modified example source where I've added those lines.  I'll let you know if that resolves the issue.


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891778#action_12891778 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

You don't happen to know which classes/jobs, do you?  I'm unable to track which jobs are actually getting run.


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891690#action_12891690 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

It took longer than a day, but I finally did try this again.  And now I can't get the examples to run at all.  With or without the mvn install inside the examples directory (and even after yet another mvn clean install from the root) I get page after page of exceptions, all ClassNotFoundExceptions at their base:

10/07/23 17:31:35 INFO mapred.JobClient: Task Id : attempt_201007231729_0001_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.utils.vectors.text.document.SequenceFileTokenizerMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:960)
...
10/07/23 17:31:35 INFO mapred.JobClient: Task Id : attempt_201007231729_0001_m_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.utils.vectors.text.document.SequenceFileTokenizerMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:960)
...
10/07/23 17:31:41 INFO mapred.JobClient: Task Id : attempt_201007231729_0001_m_000001_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.utils.vectors.text.document.SequenceFileTokenizerMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:960)
...
10/07/23 17:31:48 INFO mapred.JobClient: Task Id : attempt_201007231729_0001_m_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.utils.vectors.text.document.SequenceFileTokenizerMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:960)
...
10/07/23 17:32:08 INFO mapred.JobClient: Task Id : attempt_201007231729_0002_r_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.utils.vectors.text.term.TermCountReducer
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:960)
...
10/07/23 17:32:14 INFO mapred.JobClient: Task Id : attempt_201007231729_0002_r_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.utils.vectors.text.term.TermCountReducer
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:960)
...
10/07/23 17:32:20 INFO mapred.JobClient: Task Id : attempt_201007231729_0002_r_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.utils.vectors.text.term.TermCountReducer
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:960)
...
10/07/23 17:32:42 INFO mapred.JobClient: Task Id : attempt_201007231729_0003_r_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.common.StringTuple
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:992)
...
10/07/23 17:32:48 INFO mapred.JobClient: Task Id : attempt_201007231729_0003_r_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.common.StringTuple
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:992)

This is using hadoop 0.20.2+320 and the trunk Mahout (r967178), on a totally fresh AMI.

Any ideas?
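
With page after page of exceptions, it can help to reduce the output to the distinct class names that could not be loaded. The following is a rough sketch, assuming the job output was saved to a file without hard line wrapping; the function name missing_classes and the file name job.log are invented for illustration.

```shell
# Read job output on stdin and print each distinct class named in a
# ClassNotFoundException, one per line. Hypothetical helper.
missing_classes() {
  sed -n 's/.*ClassNotFoundException: \([A-Za-z0-9_.$]*\).*/\1/p' | sort -u
}

# Example use (job.log is an assumed capture of the console output):
#   missing_classes < job.log
```

A short list like this makes it easier to see which mapper, reducer, or writable classes the tasks failed to resolve.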





[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891808#action_12891808 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

I've been following along in parallel and have been able to get past the ClassNotFoundExceptions.  I'm synced up to r967257 but I have a couple of classes in my src that are modified to include a setJarByClass call that are not in the tree:

core/src/main/java/org/apache/mahout/ga/watchmaker/MahoutEvaluator.java
examples/src/main/java/org/apache/mahout/classifier/bayes/WikipediaDatasetCreatorDriver.java
examples/src/main/java/org/apache/mahout/ga/watchmaker/cd/hadoop/CDMahoutEvaluator.java
examples/src/main/java/org/apache/mahout/ga/watchmaker/cd/tool/CDInfosTool.java

Not sure if the setJarByClass is required in these classes, but they seem to fit the pattern.

Now I'm getting an NPE in ClusterDumper, which is presumably what you are running into.


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892043#action_12892043 ] 

Hudson commented on MAHOUT-427:
-------------------------------

Integrated in Mahout-Quality #155 (See [http://hudson.zones.apache.org/hudson/job/Mahout-Quality/155/])
    MAHOUT-167 MAHOUT-427: Fixed NPE in ClusterDumper due to missing call to init() in ClusterDumper.run(String[]);
MAHOUT-427 MAHOUT-167 more classes missing setJarByClass(Class)
Many cases where job.setJarByClass(Class) was not being called, possibly related to MAHOUT-167 MAHOUT-427
Many cases where job.setJarByClass(Class) was not being called, possibly related to MAHOUT-167 MAHOUT-427



[jira] Resolved: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-427.
------------------------------

    Fix Version/s: 0.4
       Resolution: Fixed


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885533#action_12885533 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

The -core argument is there to allow code to be run from the build tree without requiring that .job files be built. Otherwise, the mahout script obtains its classes from the following locations:

1) the mahout-*.jar files one would expect to find at the root of a binary release
2) the mahout-*.job files one would find in the target directories of the various subprojects
3) the release dependencies found in the lib directory at the root of a binary release.

With -core, target/classes from each of the subprojects is added to the classpath, and the dependencies are added from examples/target/dependency/*.jar.

./examples/bin/build-reuters.sh should run after a 'mvn clean install'. If it doesn't, that's an issue.
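
The two classpath modes described above can be pictured roughly as follows. This is an illustrative sketch only, not the actual bin/mahout logic: the function name build_classpath, its arguments, and the exact glob patterns are assumptions based on the description in this comment.

```shell
# Assemble a colon-separated classpath for a Mahout checkout or release.
# $1: stand-in for $MAHOUT_HOME; $2: "-core" selects the build-tree layout.
# Hypothetical sketch, not the real bin/mahout script.
build_classpath() {
  home=$1
  mode=$2
  cp=""
  if [ "$mode" = "-core" ]; then
    # Build tree: compiled classes of each subproject plus copied dependencies.
    for f in "$home"/*/target/classes "$home"/examples/target/dependency/*.jar; do
      if [ -e "$f" ]; then cp="$cp:$f"; fi
    done
  else
    # Release jars, subproject job files, and release dependencies.
    for f in "$home"/mahout-*.jar "$home"/*/target/mahout-*.job "$home"/lib/*.jar; do
      if [ -e "$f" ]; then cp="$cp:$f"; fi
    done
  fi
  # Drop the leading colon left by the first append.
  printf '%s\n' "${cp#:}"
}
```

Comparing `build_classpath "$MAHOUT_HOME" -core` with `build_classpath "$MAHOUT_HOME" ""` shows which entries each mode would pick up in a given tree, for instance whether any job file exists at all when -core is not used.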


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891742#action_12891742 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

I've managed to replicate this locally from the current trunk, with {{$HADOOP_CONF_DIR}} and {{$HADOOP_HOME}} set and pointing at my local hadoop install. 

both:

{quote}
export HADOOP_CONF_DIR=/opt/hadoop/conf/
export HADOOP_HOME=/opt/hadoop
./bin/mahout seq2sparse -i ./examples/bin/work/reuters-out-seqdir/ -o ./examples/bin/work/reuters-out-seqdir-sparse
{quote}

and: 

{quote}
HADOOP_CLASSPATH=/home/drew/mahout/trunk-2/conf /opt/hadoop-0.20.2/bin/hadoop jar /home/drew/mahout/trunk-2/examples/target/mahout-examples-0.4-SNAPSHOT.job org.apache.mahout.driver.MahoutDriver seq2sparse -i ./examples/bin/work/reuters-out-seqdir/ -o ./examples/bin/work/reuters-out-seqdir-sparse
{quote}

Both fail with the same ClassNotFoundExceptions described above. It seems that somehow hadoop is not finding the classes inside the job file. I've even tried rolling the jar by hand with no luck.
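
A quick way to rule out a packaging problem is to check whether the class file is present in the job archive at all. The sketch below assumes the JDK's jar tool is on the PATH; the helper name class_to_entry is made up for illustration. Note that a job file may also carry classes in nested jars under lib/, which a top-level listing will not show.

```shell
# Convert a fully qualified class name into the entry path it would have
# inside a jar or job archive. Hypothetical helper.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Example use against the job file named above:
#   jar tf examples/target/mahout-examples-0.4-SNAPSHOT.job \
#     | grep -F "$(class_to_entry org.apache.mahout.common.StringTuple)"
```

If the entry is listed and the tasks still fail, the archive contents are probably fine and the job submission itself is the more likely place to look.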



[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891895#action_12891895 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

Everything looks good now.  Thanks for the patches.


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885781#action_12885781 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

I should be able to try this tomorrow.  I'll get back to you with the results.


[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885757#action_12885757 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

bq. To get the script to run correctly I had to run an additional 'mvn install' from the examples subdirectory. Not sure why that was necessary, but the examples job file wasn't created until I ran the second command. So there may be a build issue with the current scripts.

Yes, examples should have been built alongside everything else, so strictly speaking the second 'mvn install' should not be necessary. Did the first 'mvn install', executed from the base directory, complete successfully?

I'll give it a go following the EC2 instructions in the wiki to the letter to see if I can reproduce.




[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891785#action_12891785 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

bq. You don't happen to know which classes/jobs, do you?

I'm working it out. I just committed a bunch of changes, but we're not there yet; there are a couple more left.




[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885577#action_12885577 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

Thanks for clarifying.

The ./examples/bin/build-reuters.sh script didn't run after a 'mvn install' from the root directory in a fresh install.  This was on an EC2 machine, following the EC2 deployment directions on the wiki (which I've since modified).

To get the script to run correctly I had to run an additional 'mvn install' from the examples subdirectory.  Not sure why that was necessary, but the examples job file wasn't created until I ran the second command.  So there may be a build issue with the current scripts.
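
The workaround described above can be sketched as a small check. This is only an illustration: the helper name has_examples_job is hypothetical, and the examples/target/*.job location is an assumption about where the build writes the examples job file, so adjust the pattern to match your checkout.

```shell
#!/bin/sh
# Hypothetical helper: succeed if at least one ".job" file exists under
# <dir>/examples/target. The location and the "*.job" pattern are
# assumptions, not something confirmed in this thread.
has_examples_job() {
  for f in "$1"/examples/target/*.job; do
    # If the glob matched nothing, $f is the literal pattern and -f fails.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Usage sketch (left as a comment so that sourcing this file runs nothing):
#   cd "$MAHOUT_HOME" && mvn install
#   has_examples_job "$MAHOUT_HOME" || (cd "$MAHOUT_HOME/examples" && mvn install)
```

Running the second 'mvn install' only when the job file is missing keeps the workaround harmless once the build itself is fixed.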




[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Peter Goldstein (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885760#action_12885760 ] 

Peter Goldstein commented on MAHOUT-427:
----------------------------------------

I didn't see any errors when I ran the initial 'mvn install' from the base directory, but the examples job files still weren't built.

Also note that I've already altered the wiki instructions to include a 'mvn install' from the examples subdirectory, so you'll need to leave that step out.




[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891805#action_12891805 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

Peter, could you give r967257 a try and report back how it works for you? I was able to get as far as the cluster dumping stage, and I'm still looking at an issue there.




[jira] Commented: (MAHOUT-427) $MAHOUT_HOME/examples/bin/build-reuters.sh doesn't run successfully because of classpath issues related to the -core argument

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891880#action_12891880 ] 

Drew Farris commented on MAHOUT-427:
------------------------------------

Thanks for pointing out the additional classes, Peter. Try r967309; that should clean up the NPE.

If you run into a case where the clusterdump portion of build-reuters.sh completes with no output, take a look at the {{examples/bin/work/reuters-kmeans}} directory. The script attempts to dump from {{clusters-10}}, but in my case kmeans converged earlier and the last output was {{clusters-9}}.
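
One way to avoid depending on a fixed iteration number is to pick the highest-numbered clusters-N directory that actually exists. The following is a minimal sketch, not what build-reuters.sh does; the function name latest_clusters_dir is hypothetical, and it assumes the output directories are named clusters- followed by digits only.

```shell
#!/bin/sh
# Hypothetical helper: print the highest-numbered "clusters-N" directory
# directly under the work directory given as $1. Prints nothing and
# returns non-zero if no such directory exists.
latest_clusters_dir() {
  work_dir=$1
  best=""
  best_n=-1
  for dir in "$work_dir"/clusters-*; do
    [ -d "$dir" ] || continue
    n=${dir##*clusters-}          # text after the last "clusters-"
    case $n in
      ''|*[!0-9]*) continue ;;    # skip empty or non-numeric suffixes
    esac
    if [ "$n" -gt "$best_n" ]; then
      best_n=$n
      best=$dir
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}

# Usage sketch:
#   latest_clusters_dir examples/bin/work/reuters-kmeans
```

The numeric comparison matters here: a plain lexical sort would rank clusters-9 above clusters-10.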



