Posted to user@mahout.apache.org by Gabor Bernat <be...@primeranks.net> on 2012/12/04 22:27:20 UTC

Very high average absolute difference score

Hello,

I have what may be a newbie question. I tried the average absolute difference
evaluator on a data set, and the result was 43, with a log-likelihood (and
Euclidean) item-based recommender. This is strange given that there are only 3
types of events (preference values defined in the input file), with
ratings 5, 8 and 10. How is this possible?
To add to my confusion, recall and precision in the same use case are fairly
high: 0.4 and 0.8. Any solid explanations for this?

Thanks,

Bernát GÁBOR

Re: Very high average absolute difference score

Posted by Gabor Bernat <be...@primeranks.net>.

Hello,

Nope, that's not the case. I managed to find the issue, and damn, I was using
the boolean recommender instead of the one that uses the ratings
(GenericBooleanPrefItemBasedRecommender rather than
GenericItemBasedRecommender), hence the precision and recall were fairly good.
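
To illustrate why a boolean recommender can give such a score, here is a
minimal sketch. It is not Mahout source code; it assumes that a
boolean-preference recommender scores an item by summing its similarities to
the items the user already has, so the "estimate" is not on the 5/8/10 rating
scale at all. The class name, method name and numbers are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch, not Mahout source code.
final class BooleanScoreSketch {

    // Assumption: a boolean-preference recommender scores an item by summing
    // its similarities to the items the user already has, ignoring ratings.
    static double similaritySumScore(double[] similarities) {
        double total = 0.0;
        for (double s : similarities) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up user: 50 items, each with similarity 0.5 to the target item.
        double[] similarities = new double[50];
        Arrays.fill(similarities, 0.5);

        double score = similaritySumScore(similarities);
        double actualRating = 8.0; // one of the ratings in the data set

        // The score grows with the number of items, so its distance from a
        // rating in the 5 to 10 range can be far larger than 5.
        System.out.println("score = " + score);
        System.out.println("absolute difference = " + Math.abs(actualRating - score));
    }
}
```

Under that assumption, comparing such a score against real ratings would give
a large average absolute difference, while precision and recall, which only
look at the ranking, would be unaffected.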

Anyways, thanks for the quick help,

Bernát GÁBOR


Re: Very high average absolute difference score

Posted by Ted Dunning <te...@gmail.com>.

Bernát

I am guessing from the fact that you have accents in your name that you may
be in Europe.

If so, it is possible that there is a confusion about the decimal point
that Mahout uses and the one that you use.  Is it possible that you have
decimal numbers like 3,1 instead of 3.1?
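
A minimal sketch of the kind of confusion meant here, using only the Java
standard library. The sample rows and the class and method names are made up
for illustration; how a particular loader reacts to the extra field is not
something this thread states.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper for illustrating decimal-comma problems in CSV input.
final class DecimalCommaSketch {

    // Number of comma-separated fields in one CSV row.
    static int fieldCount(String csvLine) {
        return csvLine.split(",").length;
    }

    // True if the text parses as a Java double, which expects '.' as the
    // decimal separator.
    static boolean parsesAsDouble(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Parses a decimal-comma number with a locale that writes decimals that way.
    static double parseDecimalComma(String text) {
        try {
            return NumberFormat.getInstance(Locale.GERMANY).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a decimal-comma number: " + text, e);
        }
    }

    public static void main(String[] args) {
        // userId,itemId,rating with a decimal point: three fields.
        System.out.println(fieldCount("1,2,3.1"));
        // The same row with a decimal comma: the rating is split in two.
        System.out.println(fieldCount("1,2,3,1"));
        // "3,1" is not a valid double for Double.parseDouble.
        System.out.println(parsesAsDouble("3,1"));
        // A locale-aware parser reads it as 3.1.
        System.out.println(parseDecimalComma("3,1"));
    }
}
```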

Re: Very high average absolute difference score

Posted by Gabor Bernat <be...@primeranks.net>.

Hmmm, I had the feeling that it was so. The input format is right, I think:
a CSV file with userId,itemId,rating rows. I'll try to create a minimal
working example tomorrow and get back to you on this.

Thanks,

Bernát GÁBOR


Re: Very high average absolute difference score

Posted by Sean Owen <sr...@gmail.com>.

That doesn't sound right. I can't immediately think of how that would
be possible because the result is a weighted average. Have you
double-checked all of this: that the data is in the right format, that the
values are what you think, etc.? Maybe you can show your code. Your best bet
is to look at the estimation function and try to isolate a case where
the return value is large, then see more about why.
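
The weighted-average argument can be sketched as follows. This is not the
Mahout estimation function; it is a simplified stand-in that assumes
non-negative similarity weights, and the class name, method names and numbers
are hypothetical. With ratings of 5, 8 and 10, such an estimate stays between
5 and 10, so no single absolute difference can exceed 5.

```java
// Hypothetical helper, not Mahout code.
final class WeightedAverageSketch {

    // Estimates a rating as a similarity-weighted average of the ratings the
    // user already gave. Weights are assumed to be non-negative.
    static double estimate(double[] ratings, double[] weights) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < ratings.length; i++) {
            weightedSum += weights[i] * ratings[i];
            totalWeight += weights[i];
        }
        return weightedSum / totalWeight;
    }

    // Average absolute difference between actual and estimated ratings.
    static double averageAbsoluteDifference(double[] actual, double[] estimated) {
        double total = 0.0;
        for (int i = 0; i < actual.length; i++) {
            total += Math.abs(actual[i] - estimated[i]);
        }
        return total / actual.length;
    }

    public static void main(String[] args) {
        double[] userRatings = {5, 8, 10};
        double[] similarities = {0.2, 0.9, 0.4}; // made-up weights

        // The estimate always lands between the smallest and largest rating.
        System.out.println(estimate(userRatings, similarities));

        // Worst case on this scale: every estimate is off by the full range.
        System.out.println(averageAbsoluteDifference(
                new double[] {5, 10}, new double[] {10, 5}));
    }
}
```

A score far above 5 therefore suggests that the estimates are not on the
rating scale in the first place.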
