Posted to user@mahout.apache.org by Mat Kelcey <ma...@gmail.com> on 2011/05/22 23:53:41 UTC

AUC=1.00 when confusion matrix shows prediction not perfect

Hi,

I'm working through some examples from Mahout in Action and have got a
strange result.

mat@matpc:~/dev/mahout$ bin/mahout trainlogistic --input donut.csv
--output ./model --target color --categories 2 --predictors x y a b c
--types numeric --features 20 --passes 100 --rate 50
...
mat@matpc:~/dev/mahout$ bin/mahout runlogistic --input donut.csv
--model ./model --auc --confusion
...
AUC = 1.00
confusion: [[27.0, 1.0], [0.0, 12.0]]
entropy: [[-0.1, -1.5], [-4.0, -0.2]]
...

How can I have AUC=1.00 when there was a misprediction?

cheers,
mat

Re: AUC=1.00 when confusion matrix shows prediction not perfect

Posted by Ted Dunning <te...@gmail.com>.
AUC is independent of threshold.  The confusion matrix is not.

If all scores for the positive class are greater than all scores for the
negative class, you will have AUC = 1.00.  On the other hand, that doesn't
mean that all of the positive scores are > 0.5 and all the negative ones <
0.5.  It just says that there is *some* threshold that would give perfect
performance on the data set you used.  The confusion matrix, by contrast, is
computed at a fixed threshold, so a single example on the wrong side of that
threshold shows up as an error even though the ranking is perfect.  Note
also that this AUC is measured on the training set.
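
To make this concrete, here is a minimal self-contained Java sketch (plain
Java, not Mahout code; the scores are invented for illustration).  Every
positive score outranks every negative score, so the pairwise AUC estimate
is 1.00, yet one negative example sits above the fixed 0.5 threshold and
shows up in the confusion matrix as a false positive:

    // A minimal sketch showing how AUC can be 1.00 while a fixed 0.5
    // threshold still misclassifies one example.  Scores are invented.
    public class AucVsThreshold {
        public static void main(String[] args) {
            // Hypothetical classifier scores.  Every positive outranks
            // every negative, but one negative (0.55) sits above 0.5.
            double[] negScores = {0.10, 0.20, 0.30, 0.55}; // true class 0
            double[] posScores = {0.60, 0.70, 0.90};       // true class 1

            // AUC estimated as the fraction of (positive, negative) pairs
            // where the positive scores higher (ties count half).
            double wins = 0;
            int pairs = 0;
            for (double p : posScores) {
                for (double n : negScores) {
                    pairs++;
                    if (p > n) wins += 1.0;
                    else if (p == n) wins += 0.5;
                }
            }
            System.out.printf("AUC = %.2f%n", wins / pairs);

            // Confusion matrix at the fixed threshold 0.5.
            int tn = 0, fp = 0, fn = 0, tp = 0;
            for (double n : negScores) { if (n < 0.5) tn++; else fp++; }
            for (double p : posScores) { if (p >= 0.5) tp++; else fn++; }
            System.out.printf("confusion: [[%d, %d], [%d, %d]]%n",
                              tn, fp, fn, tp);
        }
    }

Running it prints AUC = 1.00 and confusion: [[3, 1], [0, 3]] -- the same
pattern as above: a perfect ranking, but one error at the 0.5 cutoff.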
