Posted to user@mahout.apache.org by Xiaomeng Wan <sh...@gmail.com> on 2010/07/22 23:07:06 UTC

20news example

Hi,
I have just started using Mahout. The 20news example works well in
local mode, but when I run it on a Hadoop cluster, the confusion matrix
shows all zeros. It looks like the test treats each file as one
document instead of each line, so the results show 20 docs that all
fall into the unknown category. Does anyone have an idea? Thanks in
advance!
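
One quick way to check this suspicion is to count the documents under
both readings of the input. The sketch below is only an illustration in
Python, not Mahout code: the function names, the one-file-per-category
layout, and the tab-separated "label, text" line format are all
assumptions made for the example.

```python
import tempfile
from pathlib import Path


def count_documents(input_dir, per_line=True):
    """Count documents under one of two readings of the input format.

    per_line=True  : every non-empty line is one document.
    per_line=False : every file is one document.
    """
    total = 0
    for path in sorted(Path(input_dir).iterdir()):
        if not path.is_file():
            continue
        if per_line:
            with path.open(encoding="utf-8", errors="replace") as handle:
                total += sum(1 for line in handle if line.strip())
        else:
            total += 1
    return total


def make_sample_input(root, categories=3, docs_per_category=4):
    """Write hypothetical input: one file per category, one doc per line."""
    for c in range(categories):
        lines = [
            f"category{c}\tsome text for document {d}"
            for d in range(docs_per_category)
        ]
        target = Path(root) / f"category{c}.txt"
        target.write_text("\n".join(lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_sample_input(root)
        print("one document per line:", count_documents(root, per_line=True))
        print("one document per file:", count_documents(root, per_line=False))
```

If the number of test documents reported on the cluster equals the
per-file count (20 here, one per newsgroup) rather than the per-line
count, that would be consistent with the input being read file by file.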

Regards,
Xiaomeng Wan

Re: 20news example

Posted by Drew Farris <dr...@gmail.com>.
Hi,

I'm seeing this too with the latest from trunk. I haven't had a chance
to investigate why the input files aren't being read properly.

Drew
