Posted to dev@mahout.apache.org by Gouri Sankar Majumder <ma...@gmail.com> on 2013/12/30 08:22:49 UTC

Issue with running Complementary Naive Bayes classifier and changes to make it working

Dear All,

I would like to share an issue that I faced with the Complementary Naive Bayes
classifier while developing a text classification system using Mahout. I
was trying to compare the results of the Standard Naive Bayes classifier with
those of the Complementary Naive Bayes classifier, but strangely I was getting
the same accuracy for both classifiers. I tried several datasets, with the
same result.


So I looked into the source code of the two driver classes
*org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.java* and
*org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.java*. I
found the following two lines, because of which the Standard Naive Bayes
classifier was always being called. The "-c" option for running the
Complementary Naive Bayes classifier had no effect.

*TrainNaiveBayesJob.java : line 96*
    boolean trainComplementary = Boolean.parseBoolean(getOption(TRAIN_COMPLEMENTARY));
    // Always evaluates to false, because getOption(TRAIN_COMPLEMENTARY)
    // always returns null.

*TestNaiveBayesDriver.java : line 139*
    boolean complementary = parsedArgs.containsKey("testComplementary");
    // Always evaluates to false, because the key in the Map parsedArgs is
    // "--testComplementary", not "testComplementary".

Because of this, the Complementary Naive Bayes classifier was never called.

So I made the following changes, and that worked:

*TrainNaiveBayesJob.java :*
    boolean trainComplementary = hasOption(TRAIN_COMPLEMENTARY);

*TestNaiveBayesDriver.java :*
    boolean complementary = hasOption("testComplementary");
    // or: boolean complementary = parsedArgs.containsKey("--testComplementary");
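
The effect of both changes can be reproduced without Mahout. The following is
a minimal sketch, assuming that the parsed arguments are kept in a map whose
keys carry the "--" prefix and that a flag given without a value maps to an
empty list. The class FlagOptionSketch and its helpers parse, getOption and
hasOption are hypothetical stand-ins written for this illustration; they are
not Mahout's actual code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FlagOptionSketch {

    // Hypothetical stand-in for the parsed-argument map (assumption):
    // every key carries the "--" prefix, and a flag without a value
    // maps to an empty list.
    static Map<String, List<String>> parse(String... flags) {
        Map<String, List<String>> parsedArgs = new HashMap<>();
        for (String flag : flags) {
            parsedArgs.put("--" + flag, Collections.<String>emptyList());
        }
        return parsedArgs;
    }

    // Assumed behaviour: first value of the option, or null if it has none.
    static String getOption(Map<String, List<String>> parsedArgs, String name) {
        List<String> values = parsedArgs.get("--" + name);
        return (values == null || values.isEmpty()) ? null : values.get(0);
    }

    // Assumed behaviour: true if the flag is present, with or without a value.
    static boolean hasOption(Map<String, List<String>> parsedArgs, String name) {
        return parsedArgs.containsKey("--" + name);
    }

    public static void main(String[] args) {
        Map<String, List<String>> parsedArgs =
            parse("trainComplementary", "testComplementary");

        // Old training check: the flag has no value, so getOption returns null,
        // and Boolean.parseBoolean(null) is false.
        boolean oldTrain =
            Boolean.parseBoolean(getOption(parsedArgs, "trainComplementary"));

        // Old test check: the lookup key is missing the "--" prefix.
        boolean oldTest = parsedArgs.containsKey("testComplementary");

        // Changed checks: test only for the presence of the flag.
        boolean newTrain = hasOption(parsedArgs, "trainComplementary");
        boolean newTest = hasOption(parsedArgs, "testComplementary");

        System.out.println("old train check: " + oldTrain); // false
        System.out.println("old test check:  " + oldTest);  // false
        System.out.println("new train check: " + newTrain); // true
        System.out.println("new test check:  " + newTest);  // true
    }
}
```

Under these assumptions, a presence check is the right test for a flag that
takes no value, since there is no value to parse as a boolean.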

Please find the patch attached.

With Regards,
Gouri Sankar Majumder

Re: Issue with running Complementary Naive Bayes classifier and changes to make it working

Posted by Gouri Sankar Majumder <ma...@gmail.com>.
Created JIRA issue MAHOUT-1389.

URL: https://issues.apache.org/jira/browse/MAHOUT-1389

The patch has been committed to trunk by Suneel Marthi.

Thanks and Regards,
Gouri Sankar Majumder


On Mon, Dec 30, 2013 at 4:15 PM, Isabel Drost-Fromm <is...@apache.org> wrote:

> On Mon, 30 Dec 2013 09:00:58 +0100
> Sebastian Schelter <ss...@googlemail.com> wrote:
>
> > Very good! Can you file a Jira issue and attach the patch there?
>
> Jira URL for reference:
>
> https://issues.apache.org/jira/browse/MAHOUT
>
>
> Isabel
>

Re: Issue with running Complementary Naive Bayes classifier and changes to make it working

Posted by Isabel Drost-Fromm <is...@apache.org>.
On Mon, 30 Dec 2013 09:00:58 +0100
Sebastian Schelter <ss...@googlemail.com> wrote:

> Very good! Can you file a Jira issue and attach the patch there?

Jira URL for reference:

https://issues.apache.org/jira/browse/MAHOUT


Isabel

Re: Issue with running Complementary Naive Bayes classifier and changes to make it working

Posted by Sebastian Schelter <ss...@googlemail.com>.
Very good! Can you file a Jira issue and attach the patch there?