Posted to user@mahout.apache.org by Vijay Santhanam <vi...@gmail.com> on 2011/07/06 07:02:56 UTC

Weighted "features" using naive bayes classifier

Hi,

I've used Lucene a fair bit, and one useful feature it has is the ability to
boost fields to make them more relevant. E.g. a match in the title is more
important than a match in the description, so you can "boost" title fields to
ensure they weigh more in the final relevance calculation.

I expected there to be a similar concept of boosting or weighting with the
bayes classifier too, but I can't work out how to do it.

From the examples, Mahout's implementation has the target variable and the
predictor variable, and that's it.

I guess I could fake boosting by duplicating phrases inside the predictor
variable. E.g. when classifying "Electronic Games", there is a "platform"
feature that is super important and should weigh more heavily than the
"title" feature.
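
To make the duplication idea concrete, here is a rough sketch of what I have
in mind. It is plain Python for illustration only, not Mahout API; the field
names, the example item, the weights, and the build_predictor helper are all
made up.

```python
from collections import Counter

# Hypothetical per-field weights: how many times each field's tokens
# are repeated in the predictor text. These values are assumptions.
FIELD_WEIGHTS = {"platform": 3, "title": 1}


def build_predictor(fields, weights=FIELD_WEIGHTS):
    """Join field tokens into one string, repeating each field's tokens
    `weight` times so that "boosted" fields contribute more term counts."""
    tokens = []
    for name, text in fields.items():
        repeat = weights.get(name, 1)  # fields without a weight count once
        tokens.extend(text.lower().split() * repeat)
    return " ".join(tokens)


# Made-up example item from the "Electronic Games" category.
item = {"title": "Halo Reach", "platform": "Xbox360"}
predictor = build_predictor(item)
counts = Counter(predictor.split())
print(predictor)
print(counts)
```

In a multinomial naive bayes model a token's contribution to the class score
grows with its count, so repeating the platform tokens should increase their
influence. I'm not sure how much of that effect survives whatever term
weighting or normalization Mahout applies, which is partly why I'm asking.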

Can anyone suggest what Mahout's approach is to weighting features in
naive bayes classification?

Thanks,
V