Posted to user@mahout.apache.org by Mugoma Joseph Okomba <mu...@yengas.com> on 2012/05/09 03:00:51 UTC

Excluding certain ratings when running recommender

Hello,

I have a database with ratings: 1, 2, 3, 4

However, when running the recommender I would, in some cases, want to
exclude items with rating 4.

I considered IDRescorer but reckon that it only filters items after the
recommender has already made its recommendations. I would like items
filtered before recommendation, i.e. they should not be included when
calculating recommendations.

What's the best way of handling this in Mahout?

Thanks.

Mugoma.


Re: Excluding certain ratings when running recommender

Posted by Jens Grivolla <j+...@grivolla.net>.
As I understand it, what Mugoma is asking about has nothing to do with 
filtering or rescoring candidates (and apparently he is already using 
IDRescorer in other settings to do that).

He seems to want to exclude ratings when calculating the user- or 
item-similarity (or whatever approach he is using), which almost by 
definition does not include the candidates (at least in the case of 
user-similarity).

HTH,
Jens
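
To make the distinction concrete, here is a minimal sketch in plain Java (not
Mahout code) of a user-user similarity that skips certain rating values. The
class name SimilaritySketch, the sample ratings, and the choice of cosine
similarity over co-rated items are all assumptions made for illustration;
Mahout's own similarity classes are not used here. Note that no candidate
items appear anywhere: only ratings the two users have already given.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoublePredicate;

class SimilaritySketch {

    /**
     * Cosine similarity over the items both users rated, ignoring any pair
     * in which either rating value is rejected by the 'excluded' predicate.
     * Returns NaN when no usable co-rated item remains.
     */
    static double similarity(Map<Long, Double> a, Map<Long, Double> b,
                             DoublePredicate excluded) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) {
                continue; // item not rated by both users
            }
            double x = e.getValue();
            double y = other;
            if (excluded.test(x) || excluded.test(y)) {
                continue; // this rating is left out of the calculation
            }
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        if (normA == 0.0 || normB == 0.0) {
            return Double.NaN;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical ratings: item ID -> rating value.
        Map<Long, Double> userA = new HashMap<>();
        userA.put(1L, 3.0);
        userA.put(2L, 4.0);
        userA.put(3L, 1.0);
        Map<Long, Double> userB = new HashMap<>();
        userB.put(1L, 3.0);
        userB.put(2L, 1.0);
        userB.put(3L, 2.0);

        System.out.println("all ratings:      " + similarity(userA, userB, v -> false));
        System.out.println("without rating 4: " + similarity(userA, userB, v -> v == 4.0));
    }
}
```

With these sample values the two calls give different similarities, which is
the effect being asked for: the excluded ratings change how alike two users
look, before any candidate item is scored.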

On 05/09/2012 03:18 PM, Sean Owen wrote:
> In that case -- the rescoring never operates on "original items". It is
> rescoring only estimated ratings.
> If you supply no rescorer, no filtering or rescoring happens.
> You would never delete data to make an item unrecommendable for one query,
> because you would be totally deleting the data!
>
> I hope this finally clarifies -- something like this could happen:
>
> There are 9 items in the world: 1 2 3 4 5 6 7 8 9
> User A expresses a rating for 1 2 3, so only 4 5 6 7 8 9 are recommendable
> The framework further selects as possible candidates 6 7 8 9
> The filter removes 6 7, leaving 8 9 as possibilities  <-- THIS IS FILTERING
> The recommender algorithm predicts a rating of 3.5 for 8, and 3.2 for 9
> The rescorer changes the prediction for 9 from 3.2 to 4.1  <-- THIS IS RESCORING
> The final recommendations are 9 (score of 4.1), then 8 (score of 3.5)
>
> To say it a third time: IDRescorer already does *exactly what you are
> describing*!!!



Re: Excluding certain ratings when running recommender

Posted by Sean Owen <sr...@gmail.com>.
In that case -- the rescoring never operates on "original items". It is
rescoring only estimated ratings.
If you supply no rescorer, no filtering or rescoring happens.
You would never delete data to make an item unrecommendable for one query,
because you would be totally deleting the data!

I hope this finally clarifies -- something like this could happen:

There are 9 items in the world: 1 2 3 4 5 6 7 8 9
User A expresses a rating for 1 2 3, so only 4 5 6 7 8 9 are recommendable
The framework further selects as possible candidates 6 7 8 9
The filter removes 6 7, leaving 8 9 as possibilities   <-- THIS IS FILTERING
The recommender algorithm predicts a rating of 3.5 for 8, and 3.2 for 9
The rescorer changes the prediction for 9 from 3.2 to 4.1  <-- THIS IS RESCORING
The final recommendations are 9 (score of 4.1), then 8 (score of 3.5)

To say it a third time: IDRescorer already does *exactly what you are
describing*!!!
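
The walk-through above can be sketched in a few lines of plain Java. This is
only an illustration, not Mahout's source: the Rescorer interface is a
hypothetical stand-in shaped like IDRescorer (a filter check plus a rescore
function), the candidate list and predicted ratings are hard-coded from the
example, and the names PipelineSketch, Scored and recommend are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PipelineSketch {

    /** Hypothetical stand-in shaped like Mahout's IDRescorer. */
    interface Rescorer {
        boolean isFiltered(long itemID);
        double rescore(long itemID, double originalScore);
    }

    /** An item ID paired with its (possibly rescored) estimated rating. */
    static final class Scored {
        final long itemID;
        final double score;

        Scored(long itemID, double score) {
            this.itemID = itemID;
            this.score = score;
        }
    }

    /**
     * Filters candidates, looks up the estimate for each survivor, rescores it,
     * and returns the top 'howMany' by score. 'predicted' stands in for the
     * real estimation step and must hold a value for every unfiltered candidate.
     */
    static List<Scored> recommend(List<Long> candidates, Map<Long, Double> predicted,
                                  Rescorer rescorer, int howMany) {
        List<Scored> scored = new ArrayList<>();
        for (long itemID : candidates) {
            if (rescorer != null && rescorer.isFiltered(itemID)) {
                continue; // FILTERING: dropped before any score is looked up
            }
            double score = predicted.get(itemID);
            if (rescorer != null) {
                score = rescorer.rescore(itemID, score); // RESCORING
            }
            scored.add(new Scored(itemID, score));
        }
        scored.sort((a, b) -> Double.compare(b.score, a.score)); // highest first
        return new ArrayList<>(scored.subList(0, Math.min(howMany, scored.size())));
    }

    /** The numbers from the example: candidates 6 7 8 9, filter 6 7, boost 9. */
    static List<Scored> example() {
        List<Long> candidates = Arrays.asList(6L, 7L, 8L, 9L);
        Map<Long, Double> predicted = new HashMap<>();
        predicted.put(8L, 3.5);
        predicted.put(9L, 3.2);
        Rescorer rescorer = new Rescorer() {
            @Override
            public boolean isFiltered(long itemID) {
                return itemID == 6 || itemID == 7;
            }

            @Override
            public double rescore(long itemID, double originalScore) {
                return itemID == 9 ? 4.1 : originalScore;
            }
        };
        return recommend(candidates, predicted, rescorer, 2);
    }

    public static void main(String[] args) {
        // Prints item 9 (4.1) first, then item 8 (3.5).
        for (Scored s : example()) {
            System.out.println("item " + s.itemID + " score " + s.score);
        }
    }
}
```

Items 6 and 7 never reach the scoring step, which is the point being made:
the filter is consulted before an estimate is computed, not afterwards.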

On Wed, May 9, 2012 at 11:33 AM, Mugoma Joseph Okomba <mu...@yengas.com> wrote:

>
> By 'original items' I mean the items in the database (just the raw table
> rows). In the example I gave there are 10 original items. Actually, that
> should be 'original ratings', not 'original items'.
>
> If you look at CachingRecommender it has 2 recommend() methods: one with
> IDRescorer and the other without. My understanding of this is that the one
> with IDRescorer has *recommended* items altered by IDRescorer. The one
> without doesn't. So, the IDRescorer works on *recommended* items and not
> *original* ratings.
>
> The confusion is not about filtering per se but about what's being filtered.
>
> For the example I gave I could run an SQL statement to delete the
> 'unwanted' ratings and then have a clean set of ratings to feed into the
> recommender, so that the recommender sees 7 ratings instead of 10. But this
> doesn't look intuitive, so I thought there might be a better way of
> handling this within Mahout.
>
> Probably what I need is a new data model that overrides
> getPreferencesFromUser(long id) and getPreferencesForItem(long itemID)

Re: Excluding certain ratings when running recommender

Posted by Mugoma Joseph Okomba <mu...@yengas.com>.
By 'original items' I mean the items in the database (just the raw table
rows). In the example I gave there are 10 original items. Actually, that
should be 'original ratings', not 'original items'.

If you look at CachingRecommender it has 2 recommend() methods: one with
IDRescorer and the other without. My understanding of this is that the one
with IDRescorer has *recommended* items altered by IDRescorer. The one
without doesn't. So, the IDRescorer works on *recommended* items and not
*original* ratings.

The confusion is not about filtering per se but about what's being filtered.

For the example I gave I could run an SQL statement to delete the
'unwanted' ratings and then have a clean set of ratings to feed into the
recommender, so that the recommender sees 7 ratings instead of 10. But this
doesn't look intuitive, so I thought there might be a better way of
handling this within Mahout.

Probably what I need is a new data model that overrides
getPreferencesFromUser(long id) and getPreferencesForItem(long itemID)
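
A minimal sketch of that idea in plain Java, deliberately independent of
Mahout: RatingSource is a hypothetical one-method stand-in for the part of a
data model that hands out a user's ratings, and ExcludingRatingSource wraps
another source and hides ratings with an unwanted value. All names here are
made up, and the single user ID is an assumption. In Mahout itself the wrapper
would have to cover the whole DataModel interface (getPreferencesForItem
included), which is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoublePredicate;

class FilteredRatingsSketch {

    /** One rating: an item ID and the value the user gave it. */
    static final class Rating {
        final long itemID;
        final double value;

        Rating(long itemID, double value) {
            this.itemID = itemID;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for the rating-lookup part of a data model. */
    interface RatingSource {
        List<Rating> getPreferencesFromUser(long userID);
    }

    /** Wraps another source and hides every rating the predicate rejects. */
    static final class ExcludingRatingSource implements RatingSource {
        private final RatingSource delegate;
        private final DoublePredicate excluded;

        ExcludingRatingSource(RatingSource delegate, DoublePredicate excluded) {
            this.delegate = delegate;
            this.excluded = excluded;
        }

        @Override
        public List<Rating> getPreferencesFromUser(long userID) {
            List<Rating> kept = new ArrayList<>();
            for (Rating r : delegate.getPreferencesFromUser(userID)) {
                if (!excluded.test(r.value)) {
                    kept.add(r); // only ratings that pass are ever visible
                }
            }
            return kept;
        }
    }

    /** The ten ratings from the example, with rating 4 hidden. */
    static RatingSource example() {
        double[] values = {3, 2, 1, 3, 2, 4, 2, 3, 4, 4}; // items 1..10
        final List<Rating> all = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            all.add(new Rating(i + 1, values[i]));
        }
        RatingSource raw = userID -> all; // same ratings for any user ID
        return new ExcludingRatingSource(raw, value -> value == 4.0);
    }

    public static void main(String[] args) {
        // The recommender would see 7 ratings instead of 10.
        System.out.println(example().getPreferencesFromUser(1L).size());
    }
}
```

Because the underlying rows are untouched, the same database can serve both
the runs that exclude rating 4 and the runs that do not.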


On Wed, May 9, 2012 12:25 pm, Sean Owen wrote:
> What do you mean "original items"? The user's preferred items are already
> not candidates for recommendation, but that is nothing to do with the
> rescorer. It operates on all *candidate* items, *before* scoring.
>
> What is your distinction between filtering *recommended* items and
> *original* items? Either way it is filtered. I don't understand what you
> are getting at.



Re: Excluding certain ratings when running recommender

Posted by Sean Owen <sr...@gmail.com>.
What do you mean "original items"? The user's preferred items are already
not candidates for recommendation, but that is nothing to do with the
rescorer. It operates on all *candidate* items, *before* scoring.

What is your distinction between filtering *recommended* items and
*original* items? Either way it is filtered. I don't understand what you
are getting at.

On Wed, May 9, 2012 at 10:09 AM, Mugoma Joseph Okomba <mu...@yengas.com> wrote:

>
> If it's true that IDRescorer works on original items then that's both bad
> and good news for me.
>
> The bad news is that all the code I had previously written involving
> IDRescorer is buggy, since I had assumed that IDRescorer filters
> recommendations and not the original list.
>
> The bit of good news is that I don't have to do anything for the new task.
>
> But, if IDRescorer changes the original list, what can be used to change
> recommendations?

Re: Excluding certain ratings when running recommender

Posted by Mugoma Joseph Okomba <mu...@yengas.com>.
If it's true that IDRescorer works on original items then that's both bad
and good news for me.

The bad news is that all the code I had previously written involving
IDRescorer is buggy, since I had assumed that IDRescorer filters
recommendations and not the original list.

The bit of good news is that I don't have to do anything for the new task.

But, if IDRescorer changes the original list, what can be used to change
recommendations?

On Wed, May 9, 2012 11:14 am, Sean Owen wrote:
> Trust me, I'm telling you how it works since I wrote it. What is unclear
> about the steps I listed below?
>
> Filtering happens before even *scoring*. It works exactly how you want it
> to. If you're still confused please just read the source code.



Re: Excluding certain ratings when running recommender

Posted by Sean Owen <sr...@gmail.com>.
Trust me, I'm telling you how it works since I wrote it. What is unclear
about the steps I listed below?

Filtering happens before even *scoring*. It works exactly how you want it
to. If you're still confused please just read the source code.

On Wed, May 9, 2012 at 9:08 AM, Mugoma Joseph Okomba <mu...@yengas.com> wrote:

>
> From the Javadoc:
> https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/impl/recommender/GenericUserBasedRecommender.html#recommend(long,%20int,%20org.apache.mahout.cf.taste.recommender.IDRescorer)
>
> rescorer - rescoring function to apply before final list of
> recommendations is determined
>
> My take is that IDRescorer is used to determine the final list of
> recommendations (via filtering or re-scoring) but doesn't apply to the
> original items.
>
> To be clear, let me give a small demonstration:
> 1. Item 1 => rating 3
> 2. Item 2 => rating 2
> 3. Item 3 => rating 1
> 4. Item 4 => rating 3
> 5. Item 5 => rating 2
> 6. Item 6 => rating 4
> 7. Item 7 => rating 2
> 8. Item 8 => rating 3
> 9. Item 9 => rating 4
> 10. Item 10 => rating 4
>
> Original items => 10
>
> What IDRescorer does is compute recommendations and then exclude items
> 6, 9 and 10 from the final list via filtering.
>
> What I need is that, before recommendations are computed, only items
> 1, 2, 3, 4, 5, 7 and 8 are considered, i.e. the original items should be
> reduced from 10 to 7 and only those 7 items should be considered by the
> recommender.
>
>

Re: Excluding certain ratings when running recommender

Posted by Mugoma Joseph Okomba <mu...@yengas.com>.
From the Javadoc:
https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/impl/recommender/GenericUserBasedRecommender.html#recommend(long,%20int,%20org.apache.mahout.cf.taste.recommender.IDRescorer)

rescorer - rescoring function to apply before final list of
recommendations is determined

My take is that IDRescorer is used to determine the final list of
recommendations (via filtering or re-scoring) but doesn't apply to the
original items.

To be clear, let me give a small demonstration:
1. Item 1 => rating 3
2. Item 2 => rating 2
3. Item 3 => rating 1
4. Item 4 => rating 3
5. Item 5 => rating 2
6. Item 6 => rating 4
7. Item 7 => rating 2
8. Item 8 => rating 3
9. Item 9 => rating 4
10. Item 10 => rating 4

Original items => 10

What IDRescorer does is compute recommendations and then exclude items
6, 9 and 10 from the final list via filtering.

What I need is that, before recommendations are computed, only items
1, 2, 3, 4, 5, 7 and 8 are considered, i.e. the original items should be
reduced from 10 to 7 and only those 7 items should be considered by the
recommender.


On Wed, May 9, 2012 10:18 am, Sean Owen wrote:
> Filtering happens on "original items" if I understand you correctly.
> Conceptually, it goes like...
>
>
>    1. Candidate items are chosen
>    2. (Optionally) Items are filtered
>    3. Items are scored
>    4. (Optionally) Items are rescored
>    5. The top K are returned



Re: Excluding certain ratings when running recommender

Posted by Sean Owen <sr...@gmail.com>.
Filtering happens on "original items" if I understand you correctly.
Conceptually, it goes like...


   1. Candidate items are chosen
   2. (Optionally) Items are filtered
   3. Items are scored
   4. (Optionally) Items are rescored
   5. The top K are returned


On Wed, May 9, 2012 at 7:21 AM, Mugoma Joseph Okomba <mu...@yengas.com> wrote:

> On Wed, May 9, 2012 7:37 am, Sean Owen wrote:
> > Actually that's how IDRescorer already works. It will filter before
> > scoring.
> >
>
> Does 'before scoring' mean before the recommender extracts recommendations?
>
> The way I have used IDRescorer before is as a way of filtering out
> recommendations I didn't want to be seen. I assumed that the filtering
> happens on the recommended items, not on the original items. Am I wrong on
> this?
>
> Thanks.
>
> Mugoma.
>
>

Re: Excluding certain ratings when running recommender

Posted by Mugoma Joseph Okomba <mu...@yengas.com>.
On Wed, May 9, 2012 7:37 am, Sean Owen wrote:
> Actually that's how IDRescorer already works. It will filter before
> scoring.
>

Does 'before scoring' mean before the recommender extracts recommendations?

The way I have used IDRescorer before is as a way of filtering out
recommendations I didn't want to be seen. I assumed that the filtering
happens on the recommended items, not on the original items. Am I wrong on
this?

Thanks.

Mugoma.


Re: Excluding certain ratings when running recommender

Posted by Sean Owen <sr...@gmail.com>.
Actually that's how IDRescorer already works. It will filter before scoring.

On Wed, May 9, 2012 at 2:00 AM, Mugoma Joseph Okomba <mu...@yengas.com> wrote:

> Hello,
>
> I have a database with ratings: 1, 2, 3, 4
>
> However, when running the recommender I would, in some cases, want to
> exclude items with rating 4.
>
> I considered IDRescorer but reckon that it only filters items after the
> recommender has already made its recommendations. I would like items
> filtered before recommendation, i.e. they should not be included when
> calculating recommendations.
>
> What's the best way of handling this in Mahout?
>
> Thanks.
>
> Mugoma.
>
>