Posted to user@mahout.apache.org by "Lancaster, Robert (Orbitz)" <RO...@orbitz.com> on 2011/06/02 16:40:25 UTC

NaiveBayes and Classification of non-documents

I'm looking at the Mahout implementation of NaiveBayes for a classification task, but the documentation around it appears to be document-centric.  Is it possible to use Mahout's NB for a classification task that doesn't involve documents?

I have about 80 million records with a small number of features.  The ARFF header looks like this (the numeric features could easily be nominalized if need be):

@RELATION   relation
@ATTRIBUTE  featurea  NUMERIC
@ATTRIBUTE  featureb  {1,2,3,4,5,6,7}
@ATTRIBUTE  featurec  {1,2,3,4,5,6,7}
@ATTRIBUTE  featured  NUMERIC
@ATTRIBUTE  featuree  NUMERIC
@ATTRIBUTE  featuref  {0,1}
@ATTRIBUTE  target    {0,1}
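
One common way to feed tabular records to a document-centric classifier is to nominalize the numeric attributes and write each record out as a short "document" of name=value tokens. The sketch below only illustrates that idea. The bucket boundaries, the helper names (nominalize, record_to_tokens), and the token format are all assumptions made for illustration, and nothing here is confirmed to be how Mahout expects its input.

```python
# Sketch only: the cut points and the "name=value" token format are
# assumptions for illustration, not anything Mahout prescribes.
from bisect import bisect_right

# Hypothetical bucket boundaries for two of the numeric attributes.
BUCKETS = {
    "featurea": [10.0, 100.0, 1000.0],
    "featured": [0.5, 1.0],
}


def nominalize(name, value, buckets=BUCKETS):
    """Map a numeric value to a bucket index; pass nominal values through."""
    if name in buckets:
        # bisect_right counts how many cut points are <= value.
        return str(bisect_right(buckets[name], float(value)))
    return str(value)


def record_to_tokens(record):
    """Turn one record (attribute -> value) into 'name=value' tokens."""
    return [f"{name}={nominalize(name, value)}" for name, value in record.items()]


# One example record; the target label would be kept separately.
record = {"featurea": 42.0, "featureb": 3, "featured": 0.7}
tokens = record_to_tokens(record)
print(" ".join(tokens))
```

Each record then looks like a tiny bag of words, with the target value serving as the document's category. Whether a given tokenizer keeps a token such as featureb=3 intact is something to check before relying on this format.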

Re: NaiveBayes and Classification of non-documents

Posted by Hector Yee <he...@gmail.com>.
You can try adaboost in patch 716 :)
