Posted to user@mahout.apache.org by Ravi Sharma <rs...@manuhindia.com> on 2012/11/29 11:46:19 UTC

Naive Bayes - Help

Hi guys,
I am new to Mahout. Please help me work out how to process my data.

I have a CSV file of training data named Trngdata.csv (the first column
is the target value), and it holds values like:
2 /tab-delimited 3 5 6 7 8
4 /tab-delimited 1 3 5 2 5
2 /tab-delimited 5 1 5 7 6
4 /tab-delimited 10 2 5 7 6
4 /tab-delimited 8 5 7 5 6
2 /tab-delimited 7 5 6 8 6
..
..
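
To make the layout concrete, here is a small Python sketch of how I read
each row. It assumes the "/tab-delimited" shown above stands for a single
tab character in the real file, with the target value before it and the
space-separated attribute values after it. parse_row is just a
hypothetical helper name for illustration, not part of Mahout.

```python
# Sketch only: assumes each real row is "<label><TAB><space-separated values>",
# i.e. the "/tab-delimited" in the sample stands for one tab character.
def parse_row(line):
    """Split one row into (label, list of integer attribute values)."""
    label, _, rest = line.rstrip("\n").partition("\t")
    features = [int(value) for value in rest.split()]
    return label, features


# First two sample rows from Trngdata.csv, written with a real tab.
sample = "2\t3 5 6 7 8\n4\t1 3 5 2 5\n"
rows = [parse_row(line) for line in sample.splitlines() if line.strip()]
for label, features in rows:
    print(label, features)
```
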
I built the model with this command:
$MAHOUT_HOME/bin/mahout trainclassifier --input BC/Trngdata.csv --output
model --classifierType cbayes

The model was built successfully.

Now I want to classify my test data, which is in test.csv and holds
values like:

2 /tab-delimited 5 1 5 7 6
4 /tab-delimited 10 2 5 7 6
2 /tab-delimited 3 5 6 7 8
..
..
For classification I ran the following command:

$MAHOUT_HOME/bin/mahout testclassifier --model model -d BCtst/test.csv
--classifierType cbayes --method mapreduce


Now I get the error "Label Not found".

But when I use the same training data for classification instead of the
test data, a confusion matrix is generated:
Confusion Matrix
-------------------------------------------------------
a       b       c       <--Classified as
1       0       0        |  1           a     =
3       0       0        |  3           b     = 2
2       0       0        |  2           c     = 4

I can't figure out what is actually happening.
Please help me, guys.
One more thing: what input should I provide as test data? In theory we
should not provide the target value, because that is what is going to be
predicted. Please shed some light on this too.
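
To show what I mean by test data without the target value, here is a
small Python sketch that drops the first column from each row. It again
assumes a real tab character separates the target value from the
attribute values, and strip_label is a hypothetical helper name. I do not
know whether testclassifier accepts input in this form; that is exactly
what I am asking.

```python
# Sketch only: assumes rows look like "<label><TAB><space-separated values>".
def strip_label(line):
    """Return the row without its leading target value and tab."""
    _label, sep, rest = line.rstrip("\n").partition("\t")
    # If there is no tab, the row is assumed to be unlabeled already.
    return rest if sep else line.rstrip("\n")


# Sample rows from test.csv, written with a real tab.
labeled = ["2\t5 1 5 7 6", "4\t10 2 5 7 6", "2\t3 5 6 7 8"]
unlabeled = [strip_label(row) for row in labeled]
print("\n".join(unlabeled))
```
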


Thanks,
Ravi Sharma