Posted to user@mahout.apache.org by Darius Miliauskas <da...@gmail.com> on 2013/09/13 13:37:09 UTC

Reuters Example in Windows&Cygwin

Dear All,

I tried to run the Reuters example on my Windows machine (Windows 7) using
Cygwin, but got the following error:

DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
$ ./cluster-reuters.sh
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. fuzzykmeans clustering
3. dirichlet clustering
4. lda clustering
5. minhash clustering
Enter your choice : 2
ok. You chose 2 and we'll use fuzzykmeans Clustering
creating work directory at /tmp/mahout-work-DARIUS
Downloading Reuters-21578
./cluster-reuters.sh: line 80: curl: command not found
Failed to download reuters

How can I solve this problem?


Best,

Darius
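
A note on the error above: the line "curl: command not found" indicates that
the script calls curl to download the data set and that curl is not on the PATH
of this Cygwin installation. The following is a minimal sketch of a check to
run before the script, assuming curl can be added through the Cygwin setup
program. The helper name require_cmd is hypothetical and not part of the Mahout
scripts.

```shell
# Hypothetical helper: report whether a required command is available on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check for curl before starting the Reuters example.
require_cmd curl || echo "Install the curl package with the Cygwin setup program, then rerun ./cluster-reuters.sh"
```

If the check reports that curl is missing, installing the package and opening a
new Cygwin shell should be enough for the download step to get further.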

Re: Reuters Example in Windows&Cygwin

Posted by Kevin Blaisdell <bl...@gmail.com>.
Darius,

Have you considered trying the Hortonworks Windows distribution?  I
don't know if it will help you, but if you need to work on Windows it
removes the Cygwin requirement and might be a better experience.

Kevin


On Thu, Sep 19, 2013 at 7:55 AM, Darius Miliauskas <
dariui.miliauskui@gmail.com> wrote:

> To add, I tried the solution described at
> http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver
> The version of Mahout is 0.8. I tried it by adding the line below (adjust the
> paths to your own setup; $MAHOUT_HOME should be set as well, in my case it
> is "C:\cygwin64\usr\local\mahout"):
>
> CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar
>
> at the end of the section in the file "mahout" (), so that the part looks
> like this:
>
> # add release dependencies to CLASSPATH
>   for f in $MAHOUT_HOME/lib/*.jar; do
>     CLASSPATH=${CLASSPATH}:$f;
>   done
> else
>   CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/math/target/classes
>   CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/target/classes
>   CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/integration/target/classes
>   CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/examples/target/classes
>   #CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/src/main/resources
>   CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar
> fi
>
> However, I still get the same error.
>
>
> Ciao,
>
> Darius
>
>
> 2013/9/18 Darius Miliauskas <da...@gmail.com>
>
> > Thanks, Michael. I looked more closely at "cluster-reuters.sh" and tried
> > to play with the paths in the system variables. I set $HADOOP_HOME
> > to "C:\cygwin64\usr\local\hadoop", and I got:
> >
> > DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> > $ ./build-reuters.sh
> > Please call cluster-reuters.sh directly next time.  This file is going
> > away.
> > Please select a number to choose the corresponding clustering algorithm
> > 1. kmeans clustering
> > 2. fuzzykmeans clustering
> > 3. dirichlet clustering
> > 4. lda clustering
> > 5. minhash clustering
> > Enter your choice : 1
> > ok. You chose 1 and we'll use kmeans Clustering
> > creating work directory at /tmp/mahout-work-DARIUS
> > cygwin warning:
> >   MS-DOS style path detected: C:\cygwin64\usr\local\hadoop/bin/hadoop
> >   Preferred POSIX equivalent is: /usr/local/hadoop/bin/hadoop
> >   CYGWIN environment variable option "nodosfilewarning" turns off this
> > warning.
> >   Consult the user's guide for more details about POSIX paths:
> >     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> > Extracting Reuters
> > hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> > locally
> > SLF4J: Class path contains multiple SLF4J bindings.
> > SLF4J: Found binding in
> > [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: Found binding in
> > [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> > explanation.
> > SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/hadoop/util/ProgramDriver
> >         at
> > org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.hadoop.util.ProgramDriver
> >
> >         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> >         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> >         ... 1 more
> > Copying Reuters data to Hadoop
> > Warning: $HADOOP_HOME is deprecated.
> >
> > C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> > rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or directory.
> > Warning: $HADOOP_HOME is deprecated.
> >
> > C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> > rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or directory.
> > Warning: $HADOOP_HOME is deprecated.
> >
> > C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> > put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
> >
> > This is the piece of code in "cluster-reuters.sh" that uses that
> > value:
> >
> > if [ "$HADOOP_HOME" != "" ] && [ "$MAHOUT_LOCAL" == "" ] ; then
> >   HADOOP="$HADOOP_HOME/bin/hadoop"
> >   if [ ! -e $HADOOP ]; then
> >     echo "Can't find hadoop in $HADOOP, exiting"
> >     exit 1
> >   fi
> > fi
> >
> > So I reset $HADOOP_HOME to "C:/cygwin64/usr/local/hadoop", ran it
> > again, and got:
> >
> > DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> > $ ./build-reuters.sh
> > Please call cluster-reuters.sh directly next time.  This file is going
> > away.
> > Please select a number to choose the corresponding clustering algorithm
> > 1. kmeans clustering
> > 2. fuzzykmeans clustering
> > 3. dirichlet clustering
> > 4. lda clustering
> > 5. minhash clustering
> > Enter your choice : 1
> > ok. You chose 1 and we'll use kmeans Clustering
> > creating work directory at /tmp/mahout-work-DARIUS
> > Extracting Reuters
> > hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> > locally
> > SLF4J: Class path contains multiple SLF4J bindings.
> > SLF4J: Found binding in
> > [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: Found binding in
> > [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> > explanation.
> > SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/hadoop/util/ProgramDriver
> >         at
> > org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.hadoop.util.ProgramDriver
> >         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> >         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> >         ... 1 more
> > Copying Reuters data to Hadoop
> > Warning: $HADOOP_HOME is deprecated.
> >
> > C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> > rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or directory.
> > Warning: $HADOOP_HOME is deprecated.
> >
> > C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> > rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or directory.
> > Warning: $HADOOP_HOME is deprecated.
> >
> > C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> > put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
> >
> > A similar issue is described here:
> > http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver
> > It is odd that the hadoop binary is reported as not being in the path when
> > it should be there. The class "org/apache/hadoop/util/ProgramDriver" is
> > reported missing, but it is in
> > "C:\cygwin64\usr\local\hadoop\src\core\org\apache\hadoop\util".
> >
> >
> > Darius
> >
> >
> > 2013/9/17 Darius Miliauskas <da...@gmail.com>
> >
> >> I guess there are some problems with the paths in Cygwin, since I get this
> >> output:
> >>
> >> DARIUS@DARIUS-PC ~
> >> cd ..
> >>
> >> DARIUS@DARIUS-PC ~
> >> cd
> >>
> >> DARIUS@DARIUS-PC ~
> >> $ cd /usr/local/mahout/examples/bin
> >>
> >> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> >> $ ./build-reuters.sh
> >> Please call cluster-reuters.sh directly next time.  This file is going
> >> away.
> >> Please select a number to choose the corresponding clustering algorithm
> >> 1. kmeans clustering
> >> 2. fuzzykmeans clustering
> >> 3. dirichlet clustering
> >> 4. lda clustering
> >> 5. minhash clustering
> >> Enter your choice : 1
> >> ok. You chose 1 and we'll use kmeans Clustering
> >> creating work directory at /tmp/mahout-work-DARIUS
> >> Extracting Reuters
> >> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> >> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> >> cygwin warning:
> >>   MS-DOS style path detected: /usr/local/bin/C:\Program
> >>   Preferred POSIX equivalent is: /usr/local/bin/C:/Program
> >>   CYGWIN environment variable option "nodosfilewarning" turns off this
> >> warning.
> >>   Consult the user's guide for more details about POSIX paths:
> >>     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> >> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> >> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
> >> Converting to Sequence Files from Directory
> >> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> >> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> >> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> >> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
> >> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> >> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> >> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> >> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
> >>
> >> How should I run the clustering then?
> >>
> >>
> >> Thanks,
> >>
> >> Darius
> >>
> >>
> >> 2013/9/16 Michael Wechner <mi...@wyona.com>
> >>
> >>> Hi Darius
> >>>
> >>> I think you need to try to understand why in your case certain classes
> >>> are not being found.
> >>>
> >>> I would suggest that you have a look at the reuters script and try to
> >>> understand where exactly the problems
> >>> occur and then go deeper in order to find out the root of the problem.
> >>>
> >>> HTH
> >>>
> >>> Michael
> >>>
> >>> On 16.09.13 17:10, Darius Miliauskas wrote:
> >>>
> >>>> Caused by: java.lang.ClassNotFoundException:
> >>>> org.apache.hadoop.util.ProgramDriver
> >>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> >>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >>>>         at java.security.AccessController.doPrivileged(Native Method)
> >>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> >>>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> >>>>
> >>>
> >>>
> >>
> >
>

Re: Reuters Example in Windows&Cygwin

Posted by Pat Ferrel <pa...@occamsmachete.com>.
These look like Hadoop errors, probably setup errors. Have you followed the Windows Hadoop setup procedure and tested it separately from Mahout to verify that it works properly first? You may want to try the Hadoop mailing list and look for a Cygwin expert.
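
One way to test the Hadoop launcher on its own, before involving Mahout, is
sketched below. It only checks that the launcher can be found and that
"hadoop version" runs; it says nothing about whether HDFS or MapReduce are
configured correctly. The function name check_hadoop is hypothetical.

```shell
# Hypothetical pre-flight check: confirm that the hadoop launcher runs by itself.
check_hadoop() {
  hadoop_bin="${1:-hadoop}"
  # The launcher must be on PATH or be given as a path to an executable file.
  if ! command -v "$hadoop_bin" >/dev/null 2>&1; then
    echo "hadoop launcher not found: $hadoop_bin"
    return 1
  fi
  # "hadoop version" is a cheap command that does not need a running cluster.
  if "$hadoop_bin" version >/dev/null 2>&1; then
    echo "hadoop launcher works: $hadoop_bin"
  else
    echo "hadoop launcher found but failed to run: $hadoop_bin"
    return 1
  fi
}

# Use $HADOOP_HOME/bin/hadoop when HADOOP_HOME is set, otherwise search PATH.
check_hadoop "${HADOOP_HOME:+$HADOOP_HOME/bin/}hadoop" || true
```

If this already fails, the problem is in the Hadoop setup and not in the Mahout
example script.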

Trying to run this stack on Windows will make your life a little more difficult because Cygwin is not quite Unix. Can you create a virtual machine and install a Linux version in it? If so, at least the standard installs should work out of the box. Sorry, but Windows experts are getting harder to find on the mailing lists--I'm certainly not one.
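
The repeated "C:\Program: command not found" lines in the output quoted in this
thread look like a path beginning with "C:\Program Files" being split at the
space. Which variable holds that path is not shown in the thread; JAVA_HOME is
a plausible but unconfirmed candidate. The sketch below only demonstrates the
splitting itself, using a hypothetical value.

```shell
# Hypothetical value; the real setting on the poster's machine is not shown.
example_java_home='C:\Program Files\Java\jdk'

# first_word prints the first argument it receives.
first_word() {
  printf '%s\n' "$1"
}

# Unquoted expansion: the shell splits the value at the space,
# so the first word is only the part before "Files".
first_word $example_java_home

# Quoted expansion: the value is passed as a single argument.
first_word "$example_java_home"
```

If the variable really is JAVA_HOME, one possible workaround is to point it at
a path without spaces, for example the short DOS form that Cygwin's
"cygpath -d" prints for a directory. This is a suggestion under the assumption
above and was not tested against this setup.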

 


Re: Reuters Example in Windows&Cygwin

Posted by Darius Miliauskas <da...@gmail.com>.
To add, I tried the solution described at
http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver
The version of Mahout is 0.8. I tried it by adding the line below (adjust the
paths to your own setup; $MAHOUT_HOME should be set as well, in my case it
is "C:\cygwin64\usr\local\mahout"):

CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar

at the end of the section in the file "mahout" (), so that the part looks
like this:

# add release dependencies to CLASSPATH
  for f in $MAHOUT_HOME/lib/*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done
else
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/math/target/classes
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/target/classes
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/integration/target/classes
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/examples/target/classes
  #CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/src/main/resources
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar
fi

However, I still get the same error.


Ciao,

Darius
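
A note on the classpath change above: the JVM loads compiled classes from jars
or class directories, so a ProgramDriver file under a "src" directory does not
help. A quick way to see whether the jar named in the CLASSPATH line exists and
actually contains the class is sketched below. It assumes that unzip is
installed in Cygwin; the function name jar_has_class is hypothetical.

```shell
# Hypothetical helper: check whether a jar file contains a given class.
jar_has_class() {
  jar_file="$1"
  # Convert a dotted class name into the entry name used inside the jar.
  class_entry="$(printf '%s' "$2" | tr '.' '/').class"
  if [ ! -f "$jar_file" ]; then
    echo "missing jar: $jar_file"
    return 1
  fi
  # List the jar contents and look for the class entry as a fixed string.
  if unzip -l "$jar_file" | grep -qF "$class_entry"; then
    echo "found $class_entry in $jar_file"
  else
    echo "no $class_entry in $jar_file"
    return 1
  fi
}

# The jar path is the one from the CLASSPATH line in this message.
jar_has_class "${MAHOUT_HOME:-}/lib/hadoop/hadoop-core-1.1.2.jar" org.apache.hadoop.util.ProgramDriver || true
```

If the jar is reported missing, the added CLASSPATH entry points at a file that
is not there, which would explain why the error does not change.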


2013/9/18 Darius Miliauskas <da...@gmail.com>

> Thanks, Michael. I looked more closely at "cluster-reuters.sh" and tried
> to play with the paths in the system variables. I set $HADOOP_HOME
> to "C:\cygwin64\usr\local\hadoop", and I got:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./build-reuters.sh
> Please call cluster-reuters.sh directly next time.  This file is going
> away.
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> cygwin warning:
>   MS-DOS style path detected: C:\cygwin64\usr\local\hadoop/bin/hadoop
>   Preferred POSIX equivalent is: /usr/local/hadoop/bin/hadoop
>   CYGWIN environment variable option "nodosfilewarning" turns off this
> warning.
>   Consult the user's guide for more details about POSIX paths:
>     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>         at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>         ... 1 more
> Copying Reuters data to Hadoop
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
>
> This is the piece of code in "cluster-reuters.sh" that uses that
> value:
>
> if [ "$HADOOP_HOME" != "" ] && [ "$MAHOUT_LOCAL" == "" ] ; then
>   HADOOP="$HADOOP_HOME/bin/hadoop"
>   if [ ! -e $HADOOP ]; then
>     echo "Can't find hadoop in $HADOOP, exiting"
>     exit 1
>   fi
> fi
>
> So I reset $HADOOP_HOME to "C:/cygwin64/usr/local/hadoop", ran it
> again, and got:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./build-reuters.sh
> Please call cluster-reuters.sh directly next time.  This file is going
> away.
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>         at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>         ... 1 more
> Copying Reuters data to Hadoop
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
>
> A similar issue is described here:
> http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver
> It is odd that the hadoop binary is reported as not being in the path when
> it should be there. The class "org/apache/hadoop/util/ProgramDriver" is
> reported missing, but it is in
> "C:\cygwin64\usr\local\hadoop\src\core\org\apache\hadoop\util".
>
>
> Darius
>
>
> 2013/9/17 Darius Miliauskas <da...@gmail.com>
>
>> I guess there are some problems with the paths in Cygwin, since I get this
>> output:
>>
>> DARIUS@DARIUS-PC ~
>> cd ..
>>
>> DARIUS@DARIUS-PC ~
>> cd
>>
>> DARIUS@DARIUS-PC ~
>> $ cd /usr/local/mahout/examples/bin
>>
>> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
>> $ ./build-reuters.sh
>> Please call cluster-reuters.sh directly next time.  This file is going
>> away.
>> Please select a number to choose the corresponding clustering algorithm
>> 1. kmeans clustering
>> 2. fuzzykmeans clustering
>> 3. dirichlet clustering
>> 4. lda clustering
>> 5. minhash clustering
>> Enter your choice : 1
>> ok. You chose 1 and we'll use kmeans Clustering
>> creating work directory at /tmp/mahout-work-DARIUS
>> Extracting Reuters
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> cygwin warning:
>>   MS-DOS style path detected: /usr/local/bin/C:\Program
>>   Preferred POSIX equivalent is: /usr/local/bin/C:/Program
>>   CYGWIN environment variable option "nodosfilewarning" turns off this
>> warning.
>>   Consult the user's guide for more details about POSIX paths:
>>     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>> Converting to Sequence Files from Directory
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>>  Running on hadoop, using /usr/local/hadoop/bin/hadoop and
>> HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>>
>> How should I run the clustering then?
>>
>>
>> Thanks,
>>
>> Darius
>>
>>
>> 2013/9/16 Michael Wechner <mi...@wyona.com>
>>
>>> Hi Darius
>>>
>>> I think you need to try to understand why in your case certain classes
>>> are not being found.
>>>
>>> I would suggest that you have a look at the reuters script and try to
>>> understand where exactly the problems
>>> occur and then go deeper in order to find out the root of the problem.
>>>
>>> HTH
>>>
>>> Michael
>>>
>>> Am 16.09.13 17:10, schrieb Darius Miliauskas:
>>>
>>>> Caused by: java.lang.ClassNotFoundException:
>>>> > >org.apache.hadoop.util.ProgramDriver
>>>> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>> > >        at java.security.AccessController.doPrivileged(Native Method)
>>>> > >        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>> > >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>
>>>
>>>
>>
>

Re: Reuters Example in Windows&Cygwin

Posted by Darius Miliauskas <da...@gmail.com>.
Thanks, Michael. I looked deeper into "cluster-reuters.sh" and tried
adjusting the paths in the system environment variables. I set $HADOOP_HOME
to "C:\cygwin64\usr\local\hadoop", and I got:

DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
$ ./build-reuters.sh
Please call cluster-reuters.sh directly next time.  This file is going away.
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. fuzzykmeans clustering
3. dirichlet clustering
4. lda clustering
5. minhash clustering
Enter your choice : 1
ok. You chose 1 and we'll use kmeans Clustering
creating work directory at /tmp/mahout-work-DARIUS
cygwin warning:
  MS-DOS style path detected: C:\cygwin64\usr\local\hadoop/bin/hadoop
  Preferred POSIX equivalent is: /usr/local/hadoop/bin/hadoop
  CYGWIN environment variable option "nodosfilewarning" turns off this
warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
Extracting Reuters
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/hadoop/util/ProgramDriver
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.util.ProgramDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 1 more
Copying Reuters data to Hadoop
Warning: $HADOOP_HOME is deprecated.

C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
found
rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or
directory.
Warning: $HADOOP_HOME is deprecated.

C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
found
rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or
directory.
Warning: $HADOOP_HOME is deprecated.

C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
found
put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.

This is the piece of code in "cluster-reuters.sh" that uses that value:

if [ "$HADOOP_HOME" != "" ] && [ "$MAHOUT_LOCAL" == "" ] ; then
  HADOOP="$HADOOP_HOME/bin/hadoop"
  if [ ! -e $HADOOP ]; then
    echo "Can't find hadoop in $HADOOP, exiting"
    exit 1
  fi
fi
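
A hedged aside on the check above: the "-e" test and the later call to
$HADOOP_HOME/bin/hadoop run inside Cygwin's shell, which prefers POSIX-style
paths (the warning in the output names /usr/local/hadoop/bin/hadoop as the
preferred form). On a real Cygwin install, "cygpath -u" performs this
conversion. The sketch below only illustrates the mapping in plain shell; the
function name win_to_cygdrive is hypothetical, and it produces the generic
/cygdrive/<drive>/... form rather than consulting Cygwin's mount table.

```shell
# Hypothetical helper: map a Windows-style path to Cygwin's /cygdrive form.
# Not part of Mahout or Hadoop; on Cygwin itself, prefer `cygpath -u`.
win_to_cygdrive() {
  # Turn every backslash into a forward slash.
  p=$(printf '%s' "$1" | tr '\\' '/')
  case $p in
    [A-Za-z]:|[A-Za-z]:/*)
      # Lower-case the drive letter and drop the "X:" prefix.
      drive=$(printf '%s' "$p" | cut -c1 | tr 'A-Z' 'a-z')
      rest=${p#?:}
      p="/cygdrive/$drive$rest"
      ;;
  esac
  # Paths without a drive letter pass through unchanged.
  printf '%s\n' "$p"
}

win_to_cygdrive 'C:\cygwin64\usr\local\hadoop'
```

The value printed for the example path could then be exported as HADOOP_HOME
before running the script, assuming the Hadoop install really lives there.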

So, I reset my $HADOOP_HOME to "C:/cygwin64/usr/local/hadoop", ran the script
again, and got:

DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
$ ./build-reuters.sh
Please call cluster-reuters.sh directly next time.  This file is going away.
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. fuzzykmeans clustering
3. dirichlet clustering
4. lda clustering
5. minhash clustering
Enter your choice : 1
ok. You chose 1 and we'll use kmeans Clustering
creating work directory at /tmp/mahout-work-DARIUS
Extracting Reuters
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/hadoop/util/ProgramDriver
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.util.ProgramDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 1 more
Copying Reuters data to Hadoop
Warning: $HADOOP_HOME is deprecated.

C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
found
rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or
directory.
Warning: $HADOOP_HOME is deprecated.

C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
found
rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or
directory.
Warning: $HADOOP_HOME is deprecated.

C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
found
put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.

A similar issue is described here (
http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver).
So, it is odd that the hadoop binary is reported as not being in the path
when it should be there. The class "org/apache/hadoop/util/ProgramDriver" is
reported as missing, but it is in
"C:\cygwin64\usr\local\hadoop\src\core\org\apache\hadoop\util".
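
One possible reading of the paragraph above: a directory under src/core holds
Java source files, and the JVM only loads compiled classes from directories or
jars on its classpath, so finding ProgramDriver in the source tree would not by
itself prevent the ClassNotFoundException. The sketch below, with the
hypothetical function name append_jars, shows how a launcher script might build
a colon-separated classpath from a directory of jars. The directory is passed
in as an argument and no particular jar name is assumed.

```shell
# Hypothetical helper: append every *.jar found directly in a directory
# to an existing colon-separated classpath string and print the result.
append_jars() {
  classpath=$1
  for f in "$2"/*.jar; do
    # Skip the unexpanded glob when the directory contains no jars.
    [ -e "$f" ] || continue
    # Add a ":" separator only when the classpath is already non-empty.
    classpath=${classpath:+$classpath:}$f
  done
  printf '%s\n' "$classpath"
}
```

A call such as append_jars "$CLASSPATH" "$HADOOP_HOME" would add the jars in
the top level of the Hadoop directory, assuming that is where the compiled
Hadoop jar sits. A Windows JVM expects a ";"-separated list of Windows paths,
so under Cygwin the result would probably still need converting, for example
with "cygpath --path --windows".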


Darius


2013/9/17 Darius Miliauskas <da...@gmail.com>

> I guess there is some problems with the paths in Cygwin since I get that
> output:
>
> DARIUS@DARIUS-PC ~
> cd ..
>
> DARIUS@DARIUS-PC ~
> cd
>
> DARIUS@DARIUS-PC ~
> $ cd /usr/local/mahout/examples/bin
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./build-reuters.sh
> Please call cluster-reuters.sh directly next time.  This file is going
> away.
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> cygwin warning:
>   MS-DOS style path detected: /usr/local/bin/C:\Program
>   Preferred POSIX equivalent is: /usr/local/bin/C:/Program
>   CYGWIN environment variable option "nodosfilewarning" turns off this
> warning.
>   Consult the user's guide for more details about POSIX paths:
>     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
> Converting to Sequence Files from Directory
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>
> How should I run the clustering then?
>
>
> Thanks,
>
> Darius
>
>
> 2013/9/16 Michael Wechner <mi...@wyona.com>
>
>> Hi Darius
>>
>> I think you need to try to understand why in your case certain classes
>> are not being found.
>>
>> I would suggest that you have a look at the reuters script and try to
>> understand where exactly the problems
>> occur and then go deeper in order to find out the root of the problem.
>>
>> HTH
>>
>> Michael
>>
>> Am 16.09.13 17:10, schrieb Darius Miliauskas:
>>
>>> Caused by: java.lang.ClassNotFoundException:
>>> > >org.apache.hadoop.util.ProgramDriver
>>> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>> > >        at java.security.AccessController.doPrivileged(Native Method)
>>> > >        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>> > >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>
>>
>>
>

Re: Reuters Example in Windows&Cygwin

Posted by Darius Miliauskas <da...@gmail.com>.
I guess there are some problems with the paths in Cygwin, since I get this
output:

DARIUS@DARIUS-PC ~
cd ..

DARIUS@DARIUS-PC ~
cd

DARIUS@DARIUS-PC ~
$ cd /usr/local/mahout/examples/bin

DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
$ ./build-reuters.sh
Please call cluster-reuters.sh directly next time.  This file is going away.
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. fuzzykmeans clustering
3. dirichlet clustering
4. lda clustering
5. minhash clustering
Enter your choice : 1
ok. You chose 1 and we'll use kmeans Clustering
creating work directory at /tmp/mahout-work-DARIUS
Extracting Reuters
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
cygwin warning:
  MS-DOS style path detected: /usr/local/bin/C:\Program
  Preferred POSIX equivalent is: /usr/local/bin/C:/Program
  CYGWIN environment variable option "nodosfilewarning" turns off this
warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
/usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
Converting to Sequence Files from Directory
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
/usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
/usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
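
The repeated "C:\Program: command not found" looks like what a shell does when
a path containing a space, such as one under "C:\Program Files", is expanded
without quotes: the text before the space is taken as the command name. Which
variable the hadoop script expands on line 350 is not shown in the output, so
the path and function names below are made up purely to illustrate the effect.

```shell
# Hypothetical demonstration of word splitting on an unquoted expansion.
first_word_unquoted() {
  # Deliberately unquoted: the shell splits the value on whitespace.
  set -- $1
  printf '%s\n' "$1"
}

first_word_quoted() {
  # Quoted: the value stays a single word, spaces included.
  set -- "$1"
  printf '%s\n' "$1"
}

# Example value only; the real path on the affected machine may differ.
example='C:\Program Files\Java\jdk1.7.0_05/bin/java'
first_word_unquoted "$example"
first_word_quoted "$example"
```

If this is the cause, pointing the Java-related variable at a path without
spaces, or at a POSIX-style path, might avoid the split.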

How should I run the clustering then?


Thanks,

Darius


2013/9/16 Michael Wechner <mi...@wyona.com>

> Hi Darius
>
> I think you need to try to understand why in your case certain classes are
> not being found.
>
> I would suggest that you have a look at the reuters script and try to
> understand where exactly the problems
> occur and then go deeper in order to find out the root of the problem.
>
> HTH
>
> Michael
>
> Am 16.09.13 17:10, schrieb Darius Miliauskas:
>
>> Caused by: java.lang.ClassNotFoundException:
>> > >org.apache.hadoop.util.ProgramDriver
>> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> > >        at java.security.AccessController.doPrivileged(Native Method)
>> > >        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>> > >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>
>
>

Re: Reuters Example in Windows&Cygwin

Posted by Michael Wechner <mi...@wyona.com>.
Hi Darius

I think you need to try to understand why in your case certain classes 
are not being found.

I would suggest that you have a look at the reuters script and try to 
understand where exactly the problems
occur and then go deeper in order to find out the root of the problem.

HTH

Michael

Am 16.09.13 17:10, schrieb Darius Miliauskas:
> Caused by: java.lang.ClassNotFoundException:
> > >org.apache.hadoop.util.ProgramDriver
> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> > >        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > >        at java.security.AccessController.doPrivileged(Native Method)
> > >        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> > >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)


Re: Reuters Example in Windows&Cygwin

Posted by Darius Miliauskas <da...@gmail.com>.
I installed it, but clustering the Reuters data still does not produce any
results, as I described before.


2013/9/16 Gokhan Capan <gk...@gmail.com>

> I believe you can install it separately, without having reinstall Cygwin
>
> Sent from my iPhone
>
> On Sep 16, 2013, at 15:30, Darius Miliauskas
> <da...@gmail.com> wrote:
>
> > Thanks, Gokham, I needed to install "curl" additionally by running Cygwin
> > installer again (choosing not to skip "curl" which was skipped by
> default).
> >
> > 1.
> > I got:
> >
> > DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> > $ ./cluster-reuters.sh
> > Please select a number to choose the corresponding clustering algorithm
> > 1. kmeans clustering
> > 2. fuzzykmeans clustering
> > 3. dirichlet clustering
> > 4. lda clustering
> > 5. minhash clustering
> > Enter your choice : 1
> > ok. You chose 1 and we'll use kmeans Clustering
> > creating work directory at /tmp/mahout-work-DARIUS
> > Extracting Reuters
> > hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> > locally
> > SLF4J: Class path contains multiple SLF4J bindings.
> > SLF4J: Found binding in
> >
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: Found binding in
> >
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> > explanation.
> > SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/hadoop/util/ProgramDriver
> >        at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.hadoop.util.ProgramDriver
> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >        at java.security.AccessController.doPrivileged(Native Method)
> >        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> >        ... 1 more
> > Converting to Sequence Files from Directory
> > hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> > locally
> > SLF4J: Class path contains multiple SLF4J bindings.
> > SLF4J: Found binding in
> >
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: Found binding in
> >
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> > explanation.
> > SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/hadoop/util/ProgramDriver
> >        at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.hadoop.util.ProgramDriver
> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >        at java.security.AccessController.doPrivileged(Native Method)
> >        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> >        ... 1 more
> > hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> > locally
> > SLF4J: Class path contains multiple SLF4J bindings.
> > SLF4J: Found binding in
> >
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: Found binding in
> >
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> > explanation.
> > SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/hadoop/util/ProgramDriver
> >        at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.hadoop.util.ProgramDriver
> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >        at java.security.AccessController.doPrivileged(Native Method)
> >        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> >        ... 1 more
> >
> > 2. then I set path of hadoop using GUI of Windows:
> > "C:\cygwin64\usr\local\hadoop\bin". And got the following output:
> >
> > DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> > $ ./cluster-reuters.sh
> > Please select a number to choose the corresponding clustering algorithm
> > 1. kmeans clustering
> > 2. fuzzykmeans clustering
> > 3. dirichlet clustering
> > 4. lda clustering
> > 5. minhash clustering
> > Enter your choice : 1
> > ok. You chose 1 and we'll use kmeans Clustering
> > creating work directory at /tmp/mahout-work-DARIUS
> > Extracting Reuters
> > Running on hadoop, using /usr/local/hadoop/bin/hadoop and
> HADOOP_CONF_DIR=
> > MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> > cygwin warning:
> >  MS-DOS style path detected: /usr/local/bin/C:\Program
> >  Preferred POSIX equivalent is: /usr/local/bin/C:/Program
> >  CYGWIN environment variable option "nodosfilewarning" turns off this
> > warning.
> >  Consult the user's guide for more details about POSIX paths:
> >    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> > /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> > /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05;
> > C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> > /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05;
> > C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
> > Converting to Sequence Files from Directory
> > Running on hadoop, using /usr/local/hadoop/bin/hadoop and
> HADOOP_CONF_DIR=
> > MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> > /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> > /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05;
> > C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> > /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05;
> > C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
> > Running on hadoop, using /usr/local/hadoop/bin/hadoop and
> HADOOP_CONF_DIR=
> > MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> > /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> > /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05;
> > C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> > /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05;
> > C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
> >
> > Actually, the files (from reuters) are downloaded as you can see:
> >
> > DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> > $ pwd
> > /usr/local/mahout/examples/bin
> >
> > DARIUS@DARIUS-PC ~
> > $ cd /tmp
> >
> > DARIUS@DARIUS-PC /tmp
> > $ ls
> > hsperfdata_DARIUS  mahout-work-DARIUS
> >
> > DARIUS@DARIUS-PC /tmp
> > $ cd mahout-work-DARIUS/
> >
> > DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
> > $ ls
> > reuters21578.tar.gz  reuters-sgm
> >
> > DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
> > $ ls reuters-sgm/
> > all-exchanges-strings.lc.txt  all-topics-strings.lc.txt
> > README.txt     reut2-003.sgm  reut2-007.sgm  reut2-011.sgm  reut2-015.sgm
> > reut2-019.sgm
> > all-orgs-strings.lc.txt       cat-descriptions_120396.txt
> > reut2-000.sgm  reut2-004.sgm  reut2-008.sgm  reut2-012.sgm  reut2-016.sgm
> > reut2-020.sgm
> > all-people-strings.lc.txt     feldman-cia-worldfactbook-data.txt
> > reut2-001.sgm  reut2-005.sgm  reut2-009.sgm  reut2-013.sgm  reut2-017.sgm
> > reut2-021.sgm
> > all-places-strings.lc.txt     lewis.dtd
> > reut2-002.sgm  reut2-006.sgm  reut2-010.sgm  reut2-014.sgm  reut2-018.sgm
> >
> > Anyway, I do not get any clustering. So, where is the problem?
> >
> >
> > Best,
> >
> > Darius
> >
> >
> > 2013/9/13 Gokhan Capan <gk...@gmail.com>
> >
> >> You need to have 'curl' installed, as the error message tells.
> >>
> >> Gokhan
> >>
> >>
> >> On Fri, Sep 13, 2013 at 2:37 PM, Darius Miliauskas <
> >> dariui.miliauskui@gmail.com> wrote:
> >>
> >>> Dear All,
> >>>
> >>> I tried to run Reuters Example on my Windows machine (Windows 7), using
> >>> Cygwin, but got the following error:
> >>>
> >>> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> >>> $ ./cluster-reuters.sh
> >>> Please select a number to choose the corresponding clustering algorithm
> >>> 1. kmeans clustering
> >>> 2. fuzzykmeans clustering
> >>> 3. dirichlet clustering
> >>> 4. lda clustering
> >>> 5. minhash clustering
> >>> Enter your choice : 2
> >>> ok. You chose 2 and we'll use fuzzykmeans Clustering
> >>> creating work directory at /tmp/mahout-work-DARIUS
> >>> Downloading Reuters-21578
> >>> ./cluster-reuters.sh: line 80: curl: command not found
> >>> Failed to download reuters
> >>>
> >>> How can I solve this problem?
> >>>
> >>>
> >>> Best,
> >>>
> >>> Darius
> >>
>

Re: Reuters Example in Windows&Cygwin

Posted by Gokhan Capan <gk...@gmail.com>.
I believe you can install it separately, without having to reinstall Cygwin.

Sent from my iPhone

On Sep 16, 2013, at 15:30, Darius Miliauskas
<da...@gmail.com> wrote:

> Thanks, Gokham, I needed to install "curl" additionally by running Cygwin
> installer again (choosing not to skip "curl" which was skipped by default).
>
> 1.
> I got:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./cluster-reuters.sh
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
> Converting to Sequence Files from Directory
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
>
> 2. then I set path of hadoop using GUI of Windows:
> "C:\cygwin64\usr\local\hadoop\bin". And got the following output:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./cluster-reuters.sh
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> cygwin warning:
>  MS-DOS style path detected: /usr/local/bin/C:\Program
>  Preferred POSIX equivalent is: /usr/local/bin/C:/Program
>  CYGWIN environment variable option "nodosfilewarning" turns off this
> warning.
>  Consult the user's guide for more details about POSIX paths:
>    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
> Converting to Sequence Files from Directory
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
>
> Actually, the files (from reuters) are downloaded as you can see:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ pwd
> /usr/local/mahout/examples/bin
>
> DARIUS@DARIUS-PC ~
> $ cd /tmp
>
> DARIUS@DARIUS-PC /tmp
> $ ls
> hsperfdata_DARIUS  mahout-work-DARIUS
>
> DARIUS@DARIUS-PC /tmp
> $ cd mahout-work-DARIUS/
>
> DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
> $ ls
> reuters21578.tar.gz  reuters-sgm
>
> DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
> $ ls reuters-sgm/
> all-exchanges-strings.lc.txt  all-topics-strings.lc.txt
> README.txt     reut2-003.sgm  reut2-007.sgm  reut2-011.sgm  reut2-015.sgm
> reut2-019.sgm
> all-orgs-strings.lc.txt       cat-descriptions_120396.txt
> reut2-000.sgm  reut2-004.sgm  reut2-008.sgm  reut2-012.sgm  reut2-016.sgm
> reut2-020.sgm
> all-people-strings.lc.txt     feldman-cia-worldfactbook-data.txt
> reut2-001.sgm  reut2-005.sgm  reut2-009.sgm  reut2-013.sgm  reut2-017.sgm
> reut2-021.sgm
> all-places-strings.lc.txt     lewis.dtd
> reut2-002.sgm  reut2-006.sgm  reut2-010.sgm  reut2-014.sgm  reut2-018.sgm
>
> Anyway, I do not get any clustering. So, where is the problem?
>
>
> Best,
>
> Darius
>
>
> 2013/9/13 Gokhan Capan <gk...@gmail.com>
>
>> You need to have 'curl' installed, as the error message tells.
>>
>> Gokhan
>>
>>
>> On Fri, Sep 13, 2013 at 2:37 PM, Darius Miliauskas <
>> dariui.miliauskui@gmail.com> wrote:
>>
>>> Dear All,
>>>
>>> I tried to run Reuters Example on my Windows machine (Windows 7), using
>>> Cygwin, but got the following error:
>>>
>>> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
>>> $ ./cluster-reuters.sh
>>> Please select a number to choose the corresponding clustering algorithm
>>> 1. kmeans clustering
>>> 2. fuzzykmeans clustering
>>> 3. dirichlet clustering
>>> 4. lda clustering
>>> 5. minhash clustering
>>> Enter your choice : 2
>>> ok. You chose 2 and we'll use fuzzykmeans Clustering
>>> creating work directory at /tmp/mahout-work-DARIUS
>>> Downloading Reuters-21578
>>> ./cluster-reuters.sh: line 80: curl: command not found
>>> Failed to download reuters
>>>
>>> How can I solve this problem?
>>>
>>>
>>> Best,
>>>
>>> Darius
>>

Re: Reuters Example in Windows&Cygwin

Posted by Darius Miliauskas <da...@gmail.com>.
Thanks, Gokhan. I had to install "curl" separately by running the Cygwin
installer again (selecting "curl", which is skipped by default).

1. With curl installed, I got:

DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
$ ./cluster-reuters.sh
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. fuzzykmeans clustering
3. dirichlet clustering
4. lda clustering
5. minhash clustering
Enter your choice : 1
ok. You chose 1 and we'll use kmeans Clustering
creating work directory at /tmp/mahout-work-DARIUS
Extracting Reuters
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/hadoop/util/ProgramDriver
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.util.ProgramDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 1 more
Converting to Sequence Files from Directory
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/hadoop/util/ProgramDriver
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.util.ProgramDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 1 more
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/hadoop/util/ProgramDriver
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.util.ProgramDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 1 more
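
The "hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin" line
above names the places the mahout script checks for Hadoop. A minimal sketch
of setting these from the Cygwin shell rather than from Windows, assuming
Hadoop is unpacked under /usr/local/hadoop (that location is an assumption
and may differ on other machines):

```shell
# Assumption: Hadoop lives under /usr/local/hadoop unless HADOOP_HOME is already set.
export HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# Prepend $HADOOP_HOME/bin to PATH only if it is not already there.
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) ;;
  *) export PATH="$HADOOP_HOME/bin:$PATH" ;;
esac

# Report on stderr if the hadoop launcher still cannot be found.
command -v hadoop >/dev/null 2>&1 || echo "hadoop still not found on PATH" >&2
```

Setting the variables this way keeps the values in POSIX form, which is what
the shell scripts expect.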

2. Then I added Hadoop to the PATH using the Windows GUI:
"C:\cygwin64\usr\local\hadoop\bin", and got the following output:

DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
$ ./cluster-reuters.sh
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. fuzzykmeans clustering
3. dirichlet clustering
4. lda clustering
5. minhash clustering
Enter your choice : 1
ok. You chose 1 and we'll use kmeans Clustering
creating work directory at /tmp/mahout-work-DARIUS
Extracting Reuters
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
cygwin warning:
  MS-DOS style path detected: /usr/local/bin/C:\Program
  Preferred POSIX equivalent is: /usr/local/bin/C:/Program
  CYGWIN environment variable option "nodosfilewarning" turns off this
warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
/usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
/usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
/usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
Converting to Sequence Files from Directory
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
/usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
/usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
/usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
/usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
/usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
/usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or directory

The Reuters files are in fact downloaded and extracted, as you can see:

DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
$ pwd
/usr/local/mahout/examples/bin

DARIUS@DARIUS-PC ~
$ cd /tmp

DARIUS@DARIUS-PC /tmp
$ ls
hsperfdata_DARIUS  mahout-work-DARIUS

DARIUS@DARIUS-PC /tmp
$ cd mahout-work-DARIUS/

DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
$ ls
reuters21578.tar.gz  reuters-sgm

DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
$ ls reuters-sgm/
all-exchanges-strings.lc.txt  all-orgs-strings.lc.txt
all-people-strings.lc.txt  all-places-strings.lc.txt
all-topics-strings.lc.txt  cat-descriptions_120396.txt
feldman-cia-worldfactbook-data.txt  lewis.dtd  README.txt
reut2-000.sgm  reut2-001.sgm  reut2-002.sgm  reut2-003.sgm  reut2-004.sgm
reut2-005.sgm  reut2-006.sgm  reut2-007.sgm  reut2-008.sgm  reut2-009.sgm
reut2-010.sgm  reut2-011.sgm  reut2-012.sgm  reut2-013.sgm  reut2-014.sgm
reut2-015.sgm  reut2-016.sgm  reut2-017.sgm  reut2-018.sgm  reut2-019.sgm
reut2-020.sgm  reut2-021.sgm

Still, I do not get any clustering output. Where is the problem?
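
One reading of the hadoop output above, offered as a guess rather than a
confirmed diagnosis: the failing command ends in "\bin/bin/java" and contains
a ";", which suggests JAVA_HOME holds two Windows paths (the JDK directory
and its bin directory) instead of the single JDK directory that bin/hadoop
appends "/bin/java" to. A sketch of cleaning such a value in the Cygwin
shell; the function name first_java_home is made up for illustration, and
cygpath is the standard Cygwin path-conversion tool:

```shell
# Hypothetical helper: reduce a JAVA_HOME-like value to a single JDK directory.
# The body runs in a subshell so the temporary variable does not leak.
first_java_home() (
  v="$1"
  v="${v%%;*}"      # keep only the text before the first ';'
  v="${v%\\bin}"    # drop a trailing \bin, if present
  v="${v%/bin}"     # same for a trailing /bin
  printf '%s\n' "$v"
)

# Only attempt the conversion where cygpath exists (that is, inside Cygwin).
if [ -n "${JAVA_HOME:-}" ] && command -v cygpath >/dev/null 2>&1; then
  JAVA_HOME="$(cygpath -u "$(first_java_home "$JAVA_HOME")")"
  export JAVA_HOME
fi
```

Even after conversion, a JDK path containing a space (such as one under
"Program Files") may still trip unquoted uses inside the script, so a JDK
location without spaces could be the safer choice.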


Best,

Darius


2013/9/13 Gokhan Capan <gk...@gmail.com>

> You need to have 'curl' installed, as the error message tells.
>
> Gokhan
>
>
> On Fri, Sep 13, 2013 at 2:37 PM, Darius Miliauskas <
> dariui.miliauskui@gmail.com> wrote:
>
> > Dear All,
> >
> > I tried to run Reuters Example on my Windows machine (Windows 7), using
> > Cygwin, but got the following error:
> >
> > DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> > $ ./cluster-reuters.sh
> > Please select a number to choose the corresponding clustering algorithm
> > 1. kmeans clustering
> > 2. fuzzykmeans clustering
> > 3. dirichlet clustering
> > 4. lda clustering
> > 5. minhash clustering
> > Enter your choice : 2
> > ok. You chose 2 and we'll use fuzzykmeans Clustering
> > creating work directory at /tmp/mahout-work-DARIUS
> > Downloading Reuters-21578
> > ./cluster-reuters.sh: line 80: curl: command not found
> > Failed to download reuters
> >
> > How can I solve this problem?
> >
> >
> > Best,
> >
> > Darius
> >
>

Re: Reuters Example in Windows&Cygwin

Posted by Gokhan Capan <gk...@gmail.com>.
You need to have 'curl' installed, as the error message says.
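
A quick way to check this from the Cygwin shell before rerunning the script,
as a minimal sketch (the helper name have_cmd is made up for illustration):

```shell
# Returns success if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd curl; then
  echo "curl is available"
else
  echo "curl not found; rerun the Cygwin installer and select the curl package"
fi
```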

Gokhan


On Fri, Sep 13, 2013 at 2:37 PM, Darius Miliauskas <
dariui.miliauskui@gmail.com> wrote:

> Dear All,
>
> I tried to run Reuters Example on my Windows machine (Windows 7), using
> Cygwin, but got the following error:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./cluster-reuters.sh
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 2
> ok. You chose 2 and we'll use fuzzykmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Downloading Reuters-21578
> ./cluster-reuters.sh: line 80: curl: command not found
> Failed to download reuters
>
> How can I solve this problem?
>
>
> Best,
>
> Darius
>