Posted to dev@mahout.apache.org by "Andrew Palumbo (JIRA)" <ji...@apache.org> on 2014/04/22 04:56:18 UTC
[jira] [Updated] (MAHOUT-1519) Remove StandardThetaTrainer
[ https://issues.apache.org/jira/browse/MAHOUT-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Palumbo updated MAHOUT-1519:
-----------------------------------
Attachment: MAHOUT-1519.patch
I thought it was going to be simpler to remove all references to the thetaNormalizer vector for Standard NB models without making too many changes, but it's in there pretty deep. To completely remove any thetaNormalizer references, I added a field to NaiveBayesModel to explicitly define it as complementary or standard (probably not a bad thing) in order to deal with all of the serialization and validation differences, etc. This way the constructor can take a null value for the thetaNormalizer vector.
I made a lot of changes, so I went back and wrote a simpler, hackier patch which doesn't touch NaiveBayesModel, but that one is much less stable and cannot accept a null thetaNormalizer.
Let me know if there are too many changes here, and I'll submit that one instead.
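For anyone skimming without opening the patch, here is a rough sketch of the idea of an explicit complementary/standard flag that lets the constructor accept a null thetaNormalizer. This is not the code in MAHOUT-1519.patch and not the real NaiveBayesModel API: the class name NaiveBayesModelSketch, its fields, the use of plain double[] arrays instead of Mahout vectors, and the validation rules are all assumptions made for illustration.

```java
// Hypothetical sketch only; names and validation rules are assumptions,
// not the actual org.apache.mahout NaiveBayesModel.
final class NaiveBayesModelSketch {
    private final double[] weightsPerLabel;
    private final double[] thetaNormalizer; // stays null for standard models
    private final boolean complementary;

    NaiveBayesModelSketch(double[] weightsPerLabel,
                          double[] thetaNormalizer,
                          boolean complementary) {
        if (weightsPerLabel == null || weightsPerLabel.length == 0) {
            throw new IllegalArgumentException("weightsPerLabel must be non-empty");
        }
        // Only complementary models are required to carry a theta normalizer.
        if (complementary
                && (thetaNormalizer == null
                    || thetaNormalizer.length != weightsPerLabel.length)) {
            throw new IllegalArgumentException(
                "complementary model needs one thetaNormalizer entry per label");
        }
        this.weightsPerLabel = weightsPerLabel.clone();
        // A standard model ignores whatever was passed in, including null.
        this.thetaNormalizer = complementary ? thetaNormalizer.clone() : null;
        this.complementary = complementary;
    }

    boolean isComplementary() {
        return complementary;
    }

    int numLabels() {
        return weightsPerLabel.length;
    }

    double thetaNormalizer(int label) {
        if (!complementary) {
            throw new IllegalStateException("standard model has no thetaNormalizer");
        }
        return thetaNormalizer[label];
    }
}
```

With a flag like this, serialization and validation code can branch on isComplementary() rather than checking whether the vector happens to be present.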
> Remove StandardThetaTrainer
> ---------------------------
>
> Key: MAHOUT-1519
> URL: https://issues.apache.org/jira/browse/MAHOUT-1519
> Project: Mahout
> Issue Type: Improvement
> Components: Classification
> Reporter: Sebastian Schelter
> Fix For: 1.0
>
> Attachments: MAHOUT-1519.patch
>
>
> [~Andrew_Palumbo] if I understand your work in MAHOUT-1504 correctly, the theta training is only necessary for complementary naive bayes, right?
> Then, we should remove the StandardThetaTrainer and make the TrainNaiveBayesJob only do the theta training in the complementary case.
> Correct me if I miss something here.
--
This message was sent by Atlassian JIRA
(v6.2#6252)