Posted to user@mahout.apache.org by Mishari Almishari <ma...@gmail.com> on 2010/07/11 02:22:39 UTC

question about twenty newsgroup example

Hi all,
I am new to Mahout. To familiarize myself with it, I started with the
quickstart using the 20 newsgroups example. I followed the steps at
https://cwiki.apache.org/confluence/display/MAHOUT/Twenty+Newsgroups and
everything was fine until I reached the testing step over Hadoop, for which
the page gives the following command:

$HADOOP_HOME/bin/hadoop \
    jar \
    $MAHOUT_HOME/examples/target/mahout-examples-0.1.job \
    org.apache.mahout.classifier.bayes.TestClassifier \
    -p newsmodel \
    -t work/20news-input \
    -ng 3 \
    -type bayes


First, I think there are mistakes in the command: "-p" should be "-m" and
"-t" should be "-d". I corrected those, but I still receive the following
message every time I execute the command:

Usage:

 [--defaultCat <defaultCat> --testDir <testDir> --encoding <encoding>
 --gramSize <gramSize> --model <model> --classifierType <classifierType>
 --dataSource <dataSource> --help --method <method> --verbose --alpha <a>]
Options

  --defaultCat (-default) defaultCat         The default category Default
                                             Value: unknown
  --testDir (-d) testDir                     The directory where test documents
                                             resides in
  --encoding (-e) encoding                   The file encoding.  Defaults to
                                             UTF-8
  --gramSize (-ng) gramSize                  Size of the n-gram. Default Value:
                                             1
  --model (-m) model                         The path on HDFS / Name of Hbase
                                             Table as defined by the -source
                                             parameter
  --classifierType (-type) classifierType    Type of classifier: bayes|cbayes.
                                             Default Value: bayes
  --dataSource (-source) dataSource          Location of model: hdfs|hbase
                                             Default Value: hdfs
  --help (-h)                                Print out help
  --method (-method) method                  Method of Classification:
                                             sequential|mapreduce. Default
                                             Value: sequential
  --verbose (-v)                             Output which values were correctly
                                             and incorrectly classified
  --alpha (-a) a                             Smoothing parameter Default Value:
                                             1.0


This happens even though my command structure appears to be correct, and I
ran the training command beforehand; it completed without problems and
created the newsmodel directory. Any clue why I am receiving this message
and am unable to run the test command?
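
For reference, the corrected invocation described above (with -p replaced by
-m and -t replaced by -d, as the usage text lists them) would look roughly
like the sketch below. The jar name, model name, and input path are copied
unchanged from the wiki command and are assumptions about the local setup.
The sketch only assembles and prints the command (a dry run); it does not
establish why the usage message appears.

```shell
# Corrected short flags per the usage text:
#   -m (model) instead of -p, -d (testDir) instead of -t.
# Jar name and paths are taken as-is from the wiki command (assumed to match
# the local build and HDFS layout).
set -- \
    "$HADOOP_HOME/bin/hadoop" jar \
    "$MAHOUT_HOME/examples/target/mahout-examples-0.1.job" \
    org.apache.mahout.classifier.bayes.TestClassifier \
    -m newsmodel \
    -d work/20news-input \
    -ng 3 \
    -type bayes

# Dry run: print the assembled command for inspection.
# To actually execute it, run: "$@"
printf '%s\n' "$*"
```

Printing the command first makes it easy to check that $HADOOP_HOME and
$MAHOUT_HOME expand to the intended locations and that the .job file named
there actually exists.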


Thanks in advance!

-mish