Posted to user@mahout.apache.org by Sandra Clover <sc...@consultant.com> on 2009/09/29 14:47:09 UTC
Classify() method results anomaly - help!
Hi, I'm using Mahout 0.1 for document classification (using the
distributed Bayesian Network) and I'm getting some answers back. I
have noticed one thing that is really bugging me, and I'm wondering
whether you can help, please.

Problem: Concerning the Classify() method, there are 2 overloads in
the API. The first one returns just one answer (according to the API it
returns: "the single best category"). The second overload says that
it: "return the top numResults, ranked by score". My problem is that I
have compared the results of both. I have noticed that the single best
category does not appear at *all* in the range of categories given by
the second overload! Strange, no? I would have expected it to come top
of the list. I have gone as deep as numResults = 20 and have still not
seen the best category.

Has anyone encountered this before? I would appreciate any
comments/suggestions/user-experience that you may like to share.

Thanks,
Sandra.
Re: Classify() method results anomaly - help!
Posted by Grant Ingersoll <gs...@apache.org>.
On Sep 29, 2009, at 8:47 AM, Sandra Clover wrote:
> Hi, I'm using Mahout 0.1 for document classification (using the
> distributed Bayesian Network) and I'm getting some answers back. I
> have noticed one thing that is really bugging me, and I'm wondering
> whether you can help, please.
>
> Problem: Concerning the Classify() method, there are 2 overloads in
> the API. The first one returns just one answer (according to the API
> it returns: "the single best category"). The second overload says that
> it: "return the top numResults, ranked by score". My problem is that I
> have compared the results of both. I have noticed that the single best
> category does not appear at *all* in the range of categories given by
> the second overload! Strange, no? I would have expected it to come top
> of the list. I have gone as deep as numResults = 20 and have still not
> seen the best category.
>
> Has anyone encountered this before? I would appreciate any
> comments/suggestions/user-experience that you may like to share.
>
> Thanks,
> Sandra.
>
That sounds like a bug. Can you try out the trunk version of Mahout
and see if it is still there? A lot of the classification stuff has
been reworked recently (I'm not even sure at the moment that those two
classify methods are even still in the code!)
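
When retesting against trunk, it may help to state the expected behaviour as an explicit check: the single best category should equal the first entry of the ranked top-numResults list. The sketch below illustrates that invariant only. It does not call Mahout at all; the class ClassifyConsistencyCheck, its methods, and the sample scores are hypothetical stand-ins, and the real classify methods would be substituted for bestCategory and topCategories.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; none of these names are Mahout APIs.
class ClassifyConsistencyCheck {

    // Plays the role of the "single best category" call:
    // returns the category with the highest score.
    static String bestCategory(Map<String, Double> scores) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            if (e.getValue() > bestScore) {
                bestScore = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    // Plays the role of the "top numResults, ranked by score" call:
    // returns up to numResults categories, highest score first.
    static List<String> topCategories(Map<String, Double> scores, int numResults) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(numResults, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    // The invariant under discussion: the best category heads the ranked list.
    static boolean isConsistent(String best, List<String> ranked) {
        return !ranked.isEmpty() && ranked.get(0).equals(best);
    }

    public static void main(String[] args) {
        // Made-up scores for three made-up categories.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("sports", 0.10);
        scores.put("politics", 0.70);
        scores.put("science", 0.20);

        String best = bestCategory(scores);
        List<String> ranked = topCategories(scores, 20);
        System.out.println("best=" + best + " ranked=" + ranked
                + " consistent=" + isConsistent(best, ranked));
    }
}
```

If the same check fails when the two real classify calls are used on the same document and model, that is a small reproducible case to attach to a bug report.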