Posted to user@mahout.apache.org by Lithium Guava <li...@gmail.com> on 2012/04/12 14:28:55 UTC

CBayes Input

Hi,

I've played with the bayes 20newsgroups example, but I'd like to try
running the cbayes algorithm on it also. The example script doesn't seem to
offer this, so I dug into the code a bit and it looks like the input to
cbayes is sequence files rather than key/value text files.

Can anyone tell me how those sequence files should be formatted? I couldn't
find it documented anywhere. Also, I don't suppose there's a handy
data-preparation program to get it running on the 20newsgroups data easily?

Thanks,

Tom

Re: CBayes Input

Posted by Lithium Guava <li...@gmail.com>.
Thanks, I just realised I was getting the wrong end of the stick - looking
at the theta normalizer driver code thinking it was the classifier driver...

Cheers!


On 12 April 2012 13:53, Robin Anil <ro...@gmail.com> wrote:

> In the command line example replace "bayes" with "cbayes". That's all you
> need to do.

Re: CBayes Input

Posted by Robin Anil <ro...@gmail.com>.
In the command line example replace "bayes" with "cbayes". That's all you
need to do.
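
For anyone finding this thread later, here is a minimal sketch of the swap
Robin describes, written as a small Python helper that builds the train and
test command lines. The subcommands and flags (trainclassifier,
testclassifier, -type, -ng, -source, -method) are recalled from the example
script of that era and should be treated as assumptions; check them against
your own copy of the script. The directory names and the helper itself are
hypothetical.

```python
import shlex


def classifier_commands(work_dir, classifier_type="cbayes"):
    """Build train/test command lines for the 20newsgroups example.

    The flags are assumed from the era's example script, and the directory
    names under work_dir are hypothetical placeholders.
    """
    if classifier_type not in ("bayes", "cbayes"):
        raise ValueError("classifier_type must be 'bayes' or 'cbayes'")

    train_input = f"{work_dir}/bayes-train-input"  # hypothetical path
    test_input = f"{work_dir}/bayes-test-input"    # hypothetical path
    model = f"{work_dir}/model"                    # hypothetical path

    # The only value that changes between the two algorithms is -type.
    train = ["mahout", "trainclassifier",
             "-i", train_input, "-o", model,
             "-type", classifier_type, "-ng", "1", "-source", "hdfs"]
    test = ["mahout", "testclassifier",
            "-m", model, "-d", test_input,
            "-type", classifier_type, "-ng", "1", "-source", "hdfs",
            "-method", "sequential"]
    return train, test


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    for cmd in classifier_commands("/tmp/mahout-work/20news-bydate"):
        print(shlex.join(cmd))
```

The input preparation is the same for both algorithms in this sketch; only
the value passed to -type differs.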