Posted to user@mahout.apache.org by Xiaobo Gu <gu...@gmail.com> on 2011/05/31 16:45:06 UTC

What does percentCorrect of CrossFoldLearner mean?

Does it mean the percentage of records for which the model correctly
predicted the target on the validation portion of the data set? If so,
it should be between 0 and 1, and the larger the value, the better the
model performs?

Regards,

Xiaobo Gu

Re: What does percentCorrect of CrossFoldLearner mean?

Posted by Ted Dunning <te...@gmail.com>.
Yes.

0 is perfect prediction.  It can only be achieved by a score of 1 for the
correct answer every time.

Note that average log-likelihood only works for probability scores.
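
As an illustration of the two measures discussed in this thread, here is a minimal Python sketch. It is not Mahout's implementation; the function names, the example scores, and the argmax rule used for "correct" are assumptions made for this example. It computes the average log-likelihood and the fraction correct from per-example probability scores, and shows that the log-likelihood reaches 0 only when the correct answer gets a score of 1 every time.

```python
import math

def average_log_likelihood(probs, targets):
    """Mean of log(probability assigned to the correct category)."""
    return sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

def percent_correct(probs, targets):
    """Fraction of examples whose highest-scoring category is the target."""
    hits = 0
    for p, t in zip(probs, targets):
        predicted = max(range(len(p)), key=p.__getitem__)  # argmax over categories
        if predicted == t:
            hits += 1
    return hits / len(targets)

# Hypothetical probability scores for three examples over three categories.
probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.40, 0.35, 0.25],
]
targets = [0, 1, 2]

print(average_log_likelihood(probs, targets))  # negative: predictions are imperfect
print(percent_correct(probs, targets))         # between 0 and 1, larger is better

# A score of 1 for the correct answer every time gives exactly 0.
perfect = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(average_log_likelihood(perfect, [0, 1]))
```

The inputs must be probabilities: each row sums to 1 and every entry lies in [0, 1]. In this sketch a score of exactly 0 for the correct category makes `math.log` raise `ValueError`, so a real implementation has to handle that case somehow.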

On Tue, May 31, 2011 at 6:38 PM, Xiaobo Gu <gu...@gmail.com> wrote:

> On Tue, May 31, 2011 at 11:54 PM, Ted Dunning <te...@gmail.com>
> wrote:
> > Argh....
> >
> > log-likelihood should approach the percentage of INcorrect answers
> > (negated).
>
> Then do we only need to check whether the average log-likelihood is
> close to 0 to judge the performance of the model, regardless of its
> relationship to the percentage of INcorrect or correct answers?

Re: What does percentCorrect of CrossFoldLearner mean?

Posted by Xiaobo Gu <gu...@gmail.com>.
On Tue, May 31, 2011 at 11:54 PM, Ted Dunning <te...@gmail.com> wrote:
> Argh....
>
> log-likelihood should approach the percentage of INcorrect answers
> (negated).

Then do we only need to check whether the average log-likelihood is
close to 0 to judge the performance of the model, regardless of its
relationship to the percentage of INcorrect or correct answers?



Re: What does percentCorrect of CrossFoldLearner mean?

Posted by Ted Dunning <te...@gmail.com>.
Argh....

log-likelihood should approach the percentage of INcorrect answers
(negated).
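
One possible reading of this, sketched in Python: for a probability p close to 1, log(p) is approximately -(1 - p), so the average log-likelihood is close to the negated average probability placed on wrong answers. Treating that average as the "percentage of incorrect answers" assumes the scores are well calibrated; that assumption, the variable names, and the example numbers are not from this thread.

```python
import math

# Hypothetical probabilities that a highly accurate classifier assigned
# to the correct category on five examples.
p_correct = [0.99, 0.98, 0.995, 0.97, 0.99]

# Average log-likelihood over the examples.
avg_ll = sum(math.log(p) for p in p_correct) / len(p_correct)

# Average probability mass on wrong answers; with calibrated scores this
# is the expected fraction of incorrect answers.
expected_error = sum(1.0 - p for p in p_correct) / len(p_correct)

# The two numbers are close, and avg_ll is the slightly more negative one
# because log(p) <= p - 1 for every p in (0, 1].
print(avg_ll, -expected_error)
```

The approximation degrades as the scores for the correct category move away from 1, which fits the restriction to highly accurate classifiers in the quoted passage.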

On Tue, May 31, 2011 at 7:49 AM, Xiaobo Gu <gu...@gmail.com> wrote:

> Page 228 of version 7 of Mahout in Action says:
>
> Log-likelihood has a maximum value of zero and no bound on how far
> negative it can go. For highly accurate classifiers, the value of
> average log-likelihood should be close to the average percent correct
> for the classifier times the number of target categories.
>
> Average percent correct times the number of target categories is
> greater than 0, while log-likelihood is never above 0. Is the above
> statement correct?

Re: What does percentCorrect of CrossFoldLearner mean?

Posted by Xiaobo Gu <gu...@gmail.com>.
Page 228 of version 7 of Mahout in Action says:

Log-likelihood has a maximum value of zero and no bound on how far
negative it can go. For highly accurate classifiers, the value of
average log-likelihood should be close to the average percent correct
for the classifier times the number of target categories.

Average percent correct times the number of target categories is
greater than 0, while log-likelihood is never above 0. Is the above
statement correct?




Re: What does percentCorrect of CrossFoldLearner mean?

Posted by Ted Dunning <te...@gmail.com>.
Yes.

On Tue, May 31, 2011 at 7:50 AM, Xiaobo Gu <gu...@gmail.com> wrote:

> Does percentCorrect of CrossFoldLearner support modeling targets with
> multiple values?
>

Re: What does percentCorrect of CrossFoldLearner mean?

Posted by Xiaobo Gu <gu...@gmail.com>.
Does percentCorrect of CrossFoldLearner support modeling targets with multiple values?


