Posted to user@mahout.apache.org by Tommy Chheng <to...@gmail.com> on 2010/05/31 18:28:30 UTC
mahout quickstart-kmeans script sequencefile parameter
Hi,
I'm using the quickstart-kmeans.sh script from
https://issues.apache.org/jira/browse/MAHOUT-390 to run the example
kmeans. I'm on mahout trunk.
It fails on the SequenceFile generation step:
$ ./bin/mahout seqdirectory -i ./work/reuters-out/ -o ./work/reuters-out-seqdir -c UTF-8
no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
Exception in thread "main" org.apache.commons.cli2.OptionException: Unexpected -i while processing Options
    at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:99)
    at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:205)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
Alternatively, I tried ./bin/mahout seqdirectory --input ./work/reuters-out/ -o ./work/reuters-out-seqdir -c UTF-8, but I get the same error, this time "Unexpected --input".
--
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com
Re: mahout quickstart-kmeans script sequencefile parameter
Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Good, and thank you for posting your findings. I've updated the wiki to
reflect the revised arguments for k-Means and will update the other
clustering pages shortly.
Jeff
On 6/3/10 5:15 PM, Tommy Chheng wrote:
> Yes, it had the help. I was just making a comment in case anyone else
> ran into the error.
>
> @tommychheng
> Programmer and UC Irvine Graduate Student
> Find a great grad school based on research interests:
> http://gradschoolnow.com
Re: mahout quickstart-kmeans script sequencefile parameter
Posted by Tommy Chheng <to...@gmail.com>.
Yes, it had the help. I was just making a comment in case anyone else
ran into the error.
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com
On 6/3/10 4:55 PM, Jeff Eastman wrote:
> Yes, the options have changed a bit recently and that script evidently
> did not get updated yet. We are working to make all the algorithm
> command lines more uniform and still have a ways to go to accomplish
> that goal.
>
> - -w should now be -ow, which causes the output directory to be overwritten
> - -x (--maxIter) is also required, though perhaps it should not be? Do
> you really want kmeans to run forever?
>
> If you run the driver with incorrect arguments, does it not print out
> the help information for you?
> Jeff
Re: mahout quickstart-kmeans script sequencefile parameter
Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Yes, the options have changed a bit recently and that script evidently
did not get updated yet. We are working to make all the algorithm
command lines more uniform and still have a ways to go to accomplish
that goal.
- -w should now be -ow, which causes the output directory to be overwritten
- -x (--maxIter) is also required, though perhaps it should not be? Do you
really want kmeans to run forever?
If you run the driver with incorrect arguments, does it not print out
the help information for you?
Jeff
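Putting the two option notes above together with the kmeans command from the quickstart script, a revised call might look like the sketch below. It only prints the command rather than running it. The paths, -k 20 and --maxIter 20 are taken from this thread; appending -ow is an assumption based on the note above and was not confirmed here against a specific revision, and the function name kmeans_command is made up for illustration.

```shell
#!/bin/sh
# Sketch only: prints a k-means command line using the revised options.
# kmeans_command is a hypothetical helper name, not part of Mahout.

kmeans_command() {
  # -k 20         the k in k-means, as in the quickstart script
  # --maxIter 20  iteration cap (short form -x), now required
  # -ow           overwrite the output directory (replaces the old -w);
  #               assumed from the note above, not verified in this thread
  printf '%s\n' "../bin/mahout kmeans -i ./work/reuters-out-seqdir-sparse/tfidf-vectors/ -c ./work/clusters -o ./work/reuters-kmeans -k 20 --maxIter 20 -ow"
}

kmeans_command
```

If the printed line looks right for your checkout, it can be pasted into the shell from the directory the script runs in.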
On 6/3/10 2:58 PM, Tommy Chheng wrote:
> Thanks Drew,
> I started a new EC2 instance with the mahout trunk and got it working.
> There is a problem with the last line though.
>
> The last line in the script gave an error:
> ../bin/mahout kmeans -i
> ./work/reuters-out-seqdir-sparse/tfidf/vectors/ -c ./work/clusters -o
> ./work/reuters-kmeans -k 20 -w
>
> org.apache.commons.cli2.OptionException: Unexpected -w while
> processing Options
>
> Removing the -w and adding --maxIter fixes it.
> ../bin/mahout kmeans -i
> ./work/reuters-out-seqdir-sparse/tfidf-vectors/ -c ./work/clusters -o
> ./work/reuters-kmeans -k 20 --maxIter 20
>
> I added a comment to
> https://issues.apache.org/jira/browse/MAHOUT-390
>
> @tommychheng
> Programmer and UC Irvine Graduate Student
> Find a great grad school based on research interests:
> http://gradschoolnow.com
Re: mahout quickstart-kmeans script sequencefile parameter
Posted by Tommy Chheng <to...@gmail.com>.
Thanks Drew,
I started a new EC2 instance with the mahout trunk and got it working.
There is a problem with the last line though.
The last line in the script gave an error:
../bin/mahout kmeans -i ./work/reuters-out-seqdir-sparse/tfidf/vectors/ -c ./work/clusters -o ./work/reuters-kmeans -k 20 -w
org.apache.commons.cli2.OptionException: Unexpected -w while processing Options
Removing the -w and adding --maxIter fixes it:
../bin/mahout kmeans -i ./work/reuters-out-seqdir-sparse/tfidf-vectors/ -c ./work/clusters -o ./work/reuters-kmeans -k 20 --maxIter 20
I added a comment to
https://issues.apache.org/jira/browse/MAHOUT-390
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com
On 6/2/10 8:27 PM, Drew Farris wrote:
> Very strange:
>
> drew@skirnir:~/mahout/svn-trunk$ svn info
> Path: .
> URL: https://svn.apache.org/repos/asf/mahout/trunk
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 950859
> [...]
> drew@skirnir:~/mahout/svn-trunk$ ./bin/mahout seqdirectory -i
> ./work/reuters-out -o ./work/reuters-out-seqdir -c UTF-8
> no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
> [..]
> drew@skirnir:~/mahout/svn-trunk$ ls ./work/reuters-out-seqdir
> chunk-0
>
> To be absolutely certain nothing old is lurking in your target directories,
> try 'mvn clean install' to rebuild and see if your results differ. If you
> prefer, you can skip test execution with 'mvn clean install -DskipTests=true'.
>
> If that doesn't work, run 'mvn -v' and post the results -- that might
> provide some clues.
>
> - Drew
Re: mahout quickstart-kmeans script sequencefile parameter
Posted by Drew Farris <dr...@gmail.com>.
Very strange:
drew@skirnir:~/mahout/svn-trunk$ svn info
Path: .
URL: https://svn.apache.org/repos/asf/mahout/trunk
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 950859
[...]
drew@skirnir:~/mahout/svn-trunk$ ./bin/mahout seqdirectory -i ./work/reuters-out -o ./work/reuters-out-seqdir -c UTF-8
no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
[..]
drew@skirnir:~/mahout/svn-trunk$ ls ./work/reuters-out-seqdir
chunk-0
To be absolutely certain nothing old is lurking in your target directories,
try 'mvn clean install' to rebuild and see if your results differ. If you
prefer, you can skip test execution with 'mvn clean install -DskipTests=true'.
If that doesn't work, run 'mvn -v' and post the results -- that might
provide some clues.
- Drew
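The rebuild-and-retry sequence suggested above could be scripted roughly as follows. This is a sketch under assumptions: DRY_RUN, run and rebuild_and_retry are illustrative names, not part of Mahout or Maven, and by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the rebuild-and-retry sequence from the message above.
# DRY_RUN, run() and rebuild_and_retry() are hypothetical helpers.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"        # only show what would be executed
  else
    "$@"               # actually execute the command
  fi
}

rebuild_and_retry() {
  # Rebuild from scratch so nothing stale in target/ can be picked up.
  run mvn clean install -DskipTests=true || return 1
  # Retry the failing step; if it still fails, print the Maven and Java
  # version details that were asked for.
  run ./bin/mahout seqdirectory -i ./work/reuters-out \
      -o ./work/reuters-out-seqdir -c UTF-8 || run mvn -v
}

rebuild_and_retry
```

Running it with DRY_RUN=0 from the top of a Mahout checkout would execute the commands for real; in the default dry-run mode the retry step always "succeeds", so 'mvn -v' is not printed.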
On Tue, Jun 1, 2010 at 9:39 PM, Tommy Chheng <to...@gmail.com> wrote:
> I updated from svn and ran mvn install, but I am still getting a
> command-line parsing error on the seqdirectory command.
> $svn info
> Path: .
> URL: http://svn.apache.org/repos/asf/mahout/trunk
> Repository Root: http://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 950329
> Node Kind: directory
> Schedule: normal
> Last Changed Author: srowen
> Last Changed Rev: 950049
> Last Changed Date: 2010-06-01 05:55:49 -0700 (Tue, 01 Jun 2010)
>
>
> $ ./bin/mahout seqdirectory -i ./work/reuters-out/ -o ./work/reuters-out-seqdir -c UTF-8
> no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
> Exception in thread "main" org.apache.commons.cli2.OptionException: Unexpected -i while processing Options
>     at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:99)
>     at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:205)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>
> @tommychheng
> Programmer and UC Irvine Graduate Student
> Find a great grad school based on research interests:
> http://gradschoolnow.com
Re: mahout quickstart-kmeans script sequencefile parameter
Posted by Tommy Chheng <to...@gmail.com>.
I updated from svn and ran mvn install, but I am still getting a
command-line parsing error on the seqdirectory command.
$svn info
Path: .
URL: http://svn.apache.org/repos/asf/mahout/trunk
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 950329
Node Kind: directory
Schedule: normal
Last Changed Author: srowen
Last Changed Rev: 950049
Last Changed Date: 2010-06-01 05:55:49 -0700 (Tue, 01 Jun 2010)
$ ./bin/mahout seqdirectory -i ./work/reuters-out/ -o ./work/reuters-out-seqdir -c UTF-8
no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
Exception in thread "main" org.apache.commons.cli2.OptionException: Unexpected -i while processing Options
    at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:99)
    at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:205)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com
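When comparing checkouts across messages like this, it can help to pull just the working-copy revision out of the 'svn info' output. The sketch below does that with awk; svn_revision is a hypothetical helper name, and the sample text fed to it is the output quoted in this message. Note that the thread does not establish that a revision difference caused the error.

```shell
#!/bin/sh
# Sketch: extract the working-copy revision from `svn info` output on stdin.
# svn_revision is an illustrative helper, not an svn subcommand.
svn_revision() {
  # Split each line on ": " and print the value of the "Revision" field.
  # "Last Changed Rev" has a different field name, so it is not matched.
  awk -F': ' '$1 == "Revision" { print $2 }'
}

# Example with the fields quoted above (normally: svn info | svn_revision)
printf '%s\n' 'Path: .' 'Revision: 950329' 'Last Changed Rev: 950049' | svn_revision
```

Two revisions obtained this way can then be compared with an ordinary numeric test such as [ "$a" -lt "$b" ].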
On 6/1/10 12:43 PM, Grant Ingersoll wrote:
> Can you try doing an SVN update and then "mvn install" and then run again?
Re: mahout quickstart-kmeans script sequencefile parameter
Posted by Grant Ingersoll <gs...@apache.org>.
Can you try doing an SVN update and then "mvn install" and then run again?
On May 31, 2010, at 12:28 PM, Tommy Chheng wrote:
> Hi,
> I'm using the quickstart-kmeans.sh script from https://issues.apache.org/jira/browse/MAHOUT-390 to run the example kmeans. I'm on mahout trunk.
>
> It fails on the SequenceFile generation step:
> $./bin/mahout seqdirectory -i ./work/reuters-out/ -o ./work/reuters-out-seqdir -c UTF-8
> no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
> Exception in thread "main" org.apache.commons.cli2.OptionException: Unexpected -i while processing Options
> at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:99)
> at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:205)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>
> Alternatively, I tried ./bin/mahout seqdirectory --input ./work/reuters-out/ -o ./work/reuters-out-seqdir -c UTF-8, but I get the same error, this time "Unexpected --input".
>
>
> --
>
> @tommychheng
> Programmer and UC Irvine Graduate Student
> Find a great grad school based on research interests: http://gradschoolnow.com
>