Posted to dev@mahout.apache.org by Dipti Mathur <di...@gmail.com> on 2011/05/11 15:42:18 UTC
Used my own data for the 20NewsGroup example. TestClassifier giving
incorrect output
Hi All,
I followed the 20NewsGroup example to train a model on my own data. However,
when I test the classifier (the test data is the same as the training data,
just for simplicity's sake for now), I get the following output. Any ideas?
dipti@dipti-laptop:~$ mahout/trunk/bin/mahout testclassifier -m
ruralsearch/bayes-model/ -d ruralsearch/test-input/ -type bayes -ng 1
-source hdfs -method sequential
Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20.2/
HADOOP_CONF_DIR=/usr/lib/hadoop-0.20.2/conf
11/05/11 19:02:35 INFO bayes.TestClassifier: Loading model from:
{basePath=ruralsearch/bayes-model/, classifierType=bayes, alpha_i=1.0,
dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8,
defaultCat=unknown, testDirPath=ruralsearch/test-input/}
11/05/11 19:02:35 INFO bayes.TestClassifier: Testing Bayes Classifier
11/05/11 19:02:36 INFO io.SequenceFileModelReader: 135467.11329474236
11/05/11 19:02:37 INFO datastore.InMemoryBayesDatastore: realestate
-103464.88819958708 168594.15797711344 -0.6136920130627087
11/05/11 19:02:37 INFO datastore.InMemoryBayesDatastore: automobiles
-168594.15797711344 168594.15797711344 -1.0
11/05/11 19:02:37 INFO bayes.TestClassifier:
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances : 0 NaN%
Incorrectly Classified Instances : 0 NaN%
Total Classified Instances : 0
=======================================================
Confusion Matrix
-------------------------------------------------------
a b c <--Classified as
0 0 0 | 0 a = realestate
0 0 0 | 0 b = automobiles
0 0 0 | 0 c = unknown
Default Category: unknown: 2
11/05/11 19:02:37 INFO driver.MahoutDriver: Program took 2309 ms
Regards,
Dipti Mathur
Re: Used my own data for the 20NewsGroup example. TestClassifier giving incorrect output
Posted by Grant Ingersoll <gs...@apache.org>.
What steps did you take before this?
(For future reference, this is a good question to ask on user@mahout.apache.org)
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search
Re: Used my own data for the 20NewsGroup example. TestClassifier giving incorrect output
Posted by Daniel McEnnis <dm...@gmail.com>.
Dipti,
Double-check that the data you are classifying is in
"category\ttokenized text" format, where \t is a tab character (i.e. the
output of the testclassifier data builder rather than the classifier
data builder).
Daniel.
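As a rough way to act on the suggestion above, the sketch below scans lines of test input and reports any that do not look like "category<TAB>tokenized text". It is not part of Mahout: the function name check_labeled_lines, the one-document-per-line layout, and the sample lines are assumptions made for illustration, and only the category names (realestate, automobiles) come from the log in this thread.

```python
from collections import Counter


def check_labeled_lines(lines, known_labels=None):
    """Return (per-category counts, list of (line_number, reason) problems).

    Hypothetical helper: assumes one document per line, with the category
    before the first tab and the tokenized text after it.
    """
    counts = Counter()
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        # Split on the first tab only; the text itself may contain tabs.
        label, sep, text = line.partition("\t")
        if not sep:
            problems.append((number, "no tab separator"))
        elif not label.strip():
            problems.append((number, "empty category"))
        elif not text.strip():
            problems.append((number, "empty text"))
        elif known_labels is not None and label not in known_labels:
            problems.append((number, "unknown category: " + label))
        else:
            counts[label] += 1
    return counts, problems


# Made-up sample lines; the third one has no category and no tab.
sample = [
    "realestate\tthree bedroom house for sale\n",
    "automobiles\tused car low mileage\n",
    "used car with no label\n",
]
counts, problems = check_labeled_lines(sample, {"realestate", "automobiles"})
print(dict(counts))
print(problems)
```

To check real data, the same function can be given an open file object instead of the sample list, since it only iterates over lines. If every line is reported as having no tab separator, the input was probably not prepared in the labeled format.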