Posted to user@mahout.apache.org by Cassio Melo <me...@gmail.com> on 2013/11/06 17:32:35 UTC

Decaying score for old preferences when using the .refresh()

Assuming that most recent ratings or implicit preference data is more
important than the older ones, I wonder if there is a way to decrease the
importance (score) of old preference entries without having to update all
previous preferences.

Currently I'm fetching new preferences from time to time and using the
.refresh() method to update the data model with the new values.

Thanks

Re: Decaying score for old preferences when using the .refresh()

Posted by Gokhan Capan <gk...@gmail.com>.
That would be great:

Specifically if that is some kind of real usage data, and the results are
evaluated against a -without decay- baseline, via A/B tests measuring the
increase in conversion.

Best

Gokhan


On Wed, Nov 20, 2013 at 2:28 PM, Cassio Melo <me...@gmail.com> wrote:

> Hi guys, thanks for sharing your experiences on this subject, really
> appreciated. To summarize the discussion:
>
> - The decay of old preference values might lose important historical data
> in cases where the user has no recent activity (Gokhan)
> - When using decay (or truncating preferences), the precision of rating
> prediction may be lower (Pat, Gokhan, Ted) but it might increase conversion
> rates (Gokhan, Pat) since it reflects recent user intent.
> - Tweaking the score estimation may be a better approach (Gokhan)
>
> I'm doing some experiments with e-commerce data, I'll post the results
> later.
>
> Best regards,
> Cassio
>
>
> On Fri, Nov 8, 2013 at 5:08 PM, Pat Ferrel <pa...@gmail.com> wrote:
>
> > > I think the intuition here is, when making an item neighborhood based
> > > recommendation, to penalize the contribution of the items that the user
> > has
> > > rated a long time ago. I didn't test this in a production recommender
> > > system, but I believe this might result in recommendation lists with
> > better
> > > conversion rates in certain use cases.
> >
> > It’s only one data point but it was a real ecom recommender with real
> user
> > data. We did not come to the conclusion above, though there is some truth
> > in it.
> >
> > There are two phenomena at play, similarity of users and items, and
> recent
> > user intent. Similarity of users decays very slowly if at all. The fact
> > that you and I bought an iPhone 1 makes us similar even though the
> iPhone 1
> > is no longer for sale. However you don’t really want to rely on user
> > activity that old to judge recent shopping intent. Mahout conflates these
> > unfortunately.
> >
> > Back to the canonical R = [B’B]H; [B’B] is actually calculated using some
> > similarity metric like log-likelihood and RowSimilarityJob.
> > B = preference matrix; user = row, item = column, value = strength
> perhaps
> > 1 for a purchase.
> > H = user history of preferences in columns, rows = items
> >
> > If you did nothing to decay preferences B’=H
> >
> > If you truncate to use only recent preferences in H then B’ != H
> >
> > Out of the box Mahout requires B’=H, and we got significantly lower
> > precision scores by decaying BOTH B and H. Our conclusion was that this
> was
> > not really a good idea given our data.
> >
> > If you truncate user preferences to some number of the most recent in H
> > you probably get a lower precision score (as Ted mentions) but our
> > intuition was that the recommendations reflect the most recent user
> intent.
> > Unfortunately we haven’t A/B tested this conclusion but the candidate for
> > best recommender was using most recent prefs in H and all prefs in B.
> >
> > > On Nov 7, 2013, at 11:36 PM, Gokhan Capan <gk...@gmail.com> wrote:
> >
> > On Fri, Nov 8, 2013 at 6:24 AM, Ted Dunning <te...@gmail.com>
> wrote:
> >
> > > On Thu, Nov 7, 2013 at 12:50 AM, Gokhan Capan <gk...@gmail.com>
> wrote:
> > >
> > >> This particular approach is discussed, and proven to increase the
> > > accuracy
> > >> in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren.
> The
> > >> decay function is parameterized per user, keeping track of how
> > consistent
> > >> the user behavior is.
> > >>
> > >
> > > Note that user-level temporal dynamics does not actually improve the
> > > accuracy of ranking. It improves the accuracy of ratings.
> >
> >
> > Yes, the accuracy of rating prediction.
> >
> > Since
> > > recommendation quality is primarily a precision@20 sort of activity,
> > > improving ratings does no good at all.
> >
> >
> > > Item-level temporal dynamics is a different beast.
> > >
> >
> > I think the intuition here is, when making an item neighborhood based
> > recommendation, to penalize the contribution of the items that the user
> has
> > rated a long time ago. I didn't test this in a production recommender
> > system, but I believe this might result in recommendation lists with
> better
> > conversion rates in certain use cases.
> >
> > Best
> >
> >
>

Re: Decaying score for old preferences when using the .refresh()

Posted by Cassio Melo <me...@gmail.com>.
Hi guys, thanks for sharing your experiences on this subject, really
appreciated. To summarize the discussion:

- The decay of old preference values might lose important historical data
in cases where the user has no recent activity (Gokhan)
- When using decay (or truncating preferences), the precision of rating
prediction may be lower (Pat, Gokhan, Ted) but it might increase conversion
rates (Gokhan, Pat) since it reflects recent user intent.
- Tweaking the score estimation may be a better approach (Gokhan)

I'm doing some experiments with e-commerce data, I'll post the results
later.

Best regards,
Cassio


On Fri, Nov 8, 2013 at 5:08 PM, Pat Ferrel <pa...@gmail.com> wrote:

> > I think the intuition here is, when making an item neighborhood based
> > recommendation, to penalize the contribution of the items that the user
> has
> > rated a long time ago. I didn't test this in a production recommender
> > system, but I believe this might result in recommendation lists with
> better
> > conversion rates in certain use cases.
>
> It’s only one data point but it was a real ecom recommender with real user
> data. We did not come to the conclusion above, though there is some truth
> in it.
>
> There are two phenomena at play, similarity of users and items, and recent
> user intent. Similarity of users decays very slowly if at all. The fact
> that you and I bought an iPhone 1 makes us similar even though the iPhone 1
> is no longer for sale. However you don’t really want to rely on user
> activity that old to judge recent shopping intent. Mahout conflates these
> unfortunately.
>
> Back to the canonical R = [B’B]H; [B’B] is actually calculated using some
> similarity metric like log-likelihood and RowSimilarityJob.
> B = preference matrix; user = row, item = column, value = strength perhaps
> 1 for a purchase.
> H = user history of preferences in columns, rows = items
>
> If you did nothing to decay preferences B’=H
>
> If you truncate to use only recent preferences in H then B’ != H
>
> Out of the box Mahout requires B’=H, and we got significantly lower
> precision scores by decaying BOTH B and H. Our conclusion was that this was
> not really a good idea given our data.
>
> If you truncate user preferences to some number of the most recent in H
> you probably get a lower precision score (as Ted mentions) but our
> intuition was that the recommendations reflect the most recent user intent.
> Unfortunately we haven’t A/B tested this conclusion but the candidate for
> best recommender was using most recent prefs in H and all prefs in B.
>
> > On Nov 7, 2013, at 11:36 PM, Gokhan Capan <gk...@gmail.com> wrote:
>
> On Fri, Nov 8, 2013 at 6:24 AM, Ted Dunning <te...@gmail.com> wrote:
>
> > On Thu, Nov 7, 2013 at 12:50 AM, Gokhan Capan <gk...@gmail.com> wrote:
> >
> >> This particular approach is discussed, and proven to increase the
> > accuracy
> >> in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren. The
> >> decay function is parameterized per user, keeping track of how
> consistent
> >> the user behavior is.
> >>
> >
> > Note that user-level temporal dynamics does not actually improve the
> > accuracy of ranking. It improves the accuracy of ratings.
>
>
> Yes, the accuracy of rating prediction.
>
> Since
> > recommendation quality is primarily a precision@20 sort of activity,
> > improving ratings does no good at all.
>
>
> > Item-level temporal dynamics is a different beast.
> >
>
> I think the intuition here is, when making an item neighborhood based
> recommendation, to penalize the contribution of the items that the user has
> rated a long time ago. I didn't test this in a production recommender
> system, but I believe this might result in recommendation lists with better
> conversion rates in certain use cases.
>
> Best
>
>

Re: Decaying score for old preferences when using the .refresh()

Posted by Pat Ferrel <pa...@gmail.com>.
> I think the intuition here is, when making an item neighborhood based
> recommendation, to penalize the contribution of the items that the user has
> rated a long time ago. I didn't test this in a production recommender
> system, but I believe this might result in recommendation lists with better
> conversion rates in certain use cases.

It’s only one data point but it was a real ecom recommender with real user data. We did not come to the conclusion above, though there is some truth in it.

There are two phenomena at play, similarity of users and items, and recent user intent. Similarity of users decays very slowly if at all. The fact that you and I bought an iPhone 1 makes us similar even though the iPhone 1 is no longer for sale. However you don’t really want to rely on user activity that old to judge recent shopping intent. Mahout conflates these unfortunately. 

Back to the canonical R = [B’B]H; [B’B] is actually calculated using some similarity metric like log-likelihood and RowSimilarityJob.
B = preference matrix; user = row, item = column, value = strength perhaps 1 for a purchase.
H = user history of preferences in columns, rows = items

If you did nothing to decay preferences B’=H

If you truncate to use only recent preferences in H then B’ != H

Out of the box Mahout requires B’=H, and we got significantly lower precision scores by decaying BOTH B and H. Our conclusion was that this was not really a good idea given our data.

If you truncate user preferences to some number of the most recent in H you probably get a lower precision score (as Ted mentions) but our intuition was that the recommendations reflect the most recent user intent. Unfortunately we haven’t A/B tested this conclusion but the candidate for best recommender was using most recent prefs in H and all prefs in B.
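
For what it's worth, here is a toy in-memory sketch of that last idea, using plain dense arrays and raw co-occurrence counts in place of RowSimilarityJob and log-likelihood, just to make the shapes of B, [B'B], and H concrete. The class and method names are illustrative, not Mahout API:

// S = B'B is built from ALL preferences; the query vector h holds only the
// user's RECENT preferences, so the model keeps full history while the
// recommendation reflects recent intent.
public final class FullModelRecentQuery {

  // b: users x items preference matrix (e.g. 1.0 for a purchase, 0 otherwise).
  // Returns the items x items matrix S = B'B (raw co-occurrence, no LLR).
  static double[][] itemItem(double[][] b) {
    int items = b[0].length;
    double[][] s = new double[items][items];
    for (double[] userRow : b) {
      for (int i = 0; i < items; i++) {
        if (userRow[i] == 0.0) {
          continue;
        }
        for (int j = 0; j < items; j++) {
          s[i][j] += userRow[i] * userRow[j];
        }
      }
    }
    return s;
  }

  // r = S * h, where h is one user's history vector (items long) truncated to
  // recent preferences only; r[i] is the recommendation score for item i.
  static double[] score(double[][] s, double[] recentHistory) {
    double[] r = new double[s.length];
    for (int i = 0; i < s.length; i++) {
      for (int j = 0; j < s.length; j++) {
        r[i] += s[i][j] * recentHistory[j];
      }
    }
    return r;
  }
}

The point is just that the data used to build S and the vector handed to score() do not have to be the same slice of history.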

> On Nov 7, 2013, at 11:36 PM, Gokhan Capan <gk...@gmail.com> wrote:

On Fri, Nov 8, 2013 at 6:24 AM, Ted Dunning <te...@gmail.com> wrote:

> On Thu, Nov 7, 2013 at 12:50 AM, Gokhan Capan <gk...@gmail.com> wrote:
> 
>> This particular approach is discussed, and proven to increase the
> accuracy
>> in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren. The
>> decay function is parameterized per user, keeping track of how consistent
>> the user behavior is.
>> 
> 
> Note that user-level temporal dynamics does not actually improve the
> accuracy of ranking. It improves the accuracy of ratings.


Yes, the accuracy of rating prediction.

Since
> recommendation quality is primarily a precision@20 sort of activity,
> improving ratings does no good at all.


> Item-level temporal dynamics is a different beast.
> 

I think the intuition here is, when making an item neighborhood based
recommendation, to penalize the contribution of the items that the user has
rated a long time ago. I didn't test this in a production recommender
system, but I believe this might result in recommendation lists with better
conversion rates in certain use cases.

Best


Re: Decaying score for old preferences when using the .refresh()

Posted by Gokhan Capan <gk...@gmail.com>.
On Fri, Nov 8, 2013 at 6:24 AM, Ted Dunning <te...@gmail.com> wrote:

> On Thu, Nov 7, 2013 at 12:50 AM, Gokhan Capan <gk...@gmail.com> wrote:
>
> > This particular approach is discussed, and proven to increase the
> accuracy
> > in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren. The
> > decay function is parameterized per user, keeping track of how consistent
> > the user behavior is.
> >
>
> Note that user-level temporal dynamics does not actually improve the
> accuracy of ranking. It improves the accuracy of ratings.


Yes, the accuracy of rating prediction.

 Since
> recommendation quality is primarily a precision@20 sort of activity,
> improving ratings does no good at all.


> Item-level temporal dynamics is a different beast.
>

I think the intuition here is, when making an item neighborhood based
recommendation, to penalize the contribution of the items that the user has
rated a long time ago. I didn't test this in a production recommender
system, but I believe this might result in recommendation lists with better
conversion rates in certain use cases.

Best

Re: Decaying score for old preferences when using the .refresh()

Posted by Ted Dunning <te...@gmail.com>.
On Thu, Nov 7, 2013 at 12:50 AM, Gokhan Capan <gk...@gmail.com> wrote:

> This particular approach is discussed, and proven to increase the accuracy
> in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren. The
> decay function is parameterized per user, keeping track of how consistent
> the user behavior is.
>

Note that user-level temporal dynamics does not actually improve the
accuracy of ranking. It improves the accuracy of ratings.  Since
recommendation quality is primarily a precision@20 sort of activity,
improving ratings does no good at all.

Item-level temporal dynamics is a different beast.

Re: Decaying score for old preferences when using the .refresh()

Posted by Pat Ferrel <pa...@gmail.com>.
Not sure how you are going to decay in Mahout. Once ingested into Mahout there are no timestamps. So you’ll have to do that before ingesting.
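
For illustration, a minimal sketch of what "decay before ingesting" could look like, assuming the raw event log still carries timestamps; the class name and the 90-day half-life are made up for the example, not anything in Mahout:

import java.util.concurrent.TimeUnit;

// Down-weight each preference by its age before writing the Mahout input
// (user, item, value) triples. The 90-day half-life is an arbitrary choice.
public final class PreIngestDecay {

  private static final double HALF_LIFE_DAYS = 90.0;

  // Value recorded at eventTimeMs, decayed as of nowMs.
  public static float decayedValue(float value, long eventTimeMs, long nowMs) {
    double ageDays = (nowMs - eventTimeMs) / (double) TimeUnit.DAYS.toMillis(1);
    return (float) (value * Math.pow(0.5, ageDays / HALF_LIFE_DAYS));
  }
}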

Last year we set up an ecom department-store-type recommender with data from online user purchases, add-to-cart events, and views. The data was actual user behavior in the online store before any recommender was implemented, so it was very clean of external effects. We took varying time slices and measured the change in precision of Mahout’s item-based recommendations. We found that precision always increased with more data, up to our max of one year. Put another way, of the 3-month, 6-month, 9-month, and 12-month slices we tried, 12 months produced the best results. All we did was filter out items no longer in stock. We did nothing to decay preferences.

That said, you can still make a good case to limit or decay the user preferences used in the queries. The problem is you may not want to have the same limit on the data used to build the model. The model data represents users’ taste similarities, which change very slowly. I don’t know of a way to have a short-time-span user preference query against a long-time-span model in Mahout, as Gokhan says.

If you care to hack Mahout you can use different data in the recommendation pipeline. Mahout uses the user preference matrix to calculate item-item similarities and puts them in a DRM (distributed row matrix), then it uses the user’s preference data taken from the preference matrix as a sort of query against the item-item DRM. If you use your own truncated (or decayed) user preference vectors as the queries, instead of the ones that were used to train the item-item DRM, you would get the result you are trying for without throwing out potentially important training data.

By decaying the user preferences you may get a lower precision score, but that is only a crude measure of goodness. The recs for recent user activity will probably result in more sales since they indicate recent user intent. You can measure this later with A/B testing if you want.

On Nov 7, 2013, at 12:50 AM, Gokhan Capan <gk...@gmail.com> wrote:

Cassio,

I am not sure if there are direct/indirect ways to do this with existing
code.

Recall that an item neighborhood based score prediction, in simplest terms,
is a weighted average of the active user's ratings on other items, where
the weights are item-to-item similarities. Applying a decay function to
these item-to-item weights, where the decay is based on the rating time
of the active user on the "other item"s, can help achieve this.

One consideration is that, for users who do not change their rating
behavior much, this decay can mask valuable historical information.

This particular approach is discussed, and proven to increase the accuracy
in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren. The
decay function is parameterized per user, keeping track of how consistent
the user behavior is.

If you think it is not necessary to estimate those per-user parameters, in
Mahout's current neighborhood-based recommenders, you might apply that
decay to item-to-item similarities at "recommendation time". Note that
DataModel#getPreferenceTime is the method you require. If you're using a
GenericItemBasedRecommender directly,
GenericItemBasedRecommender#doEstimatePreference is where your edits would
go. The benefit here is not having to update item-to-item similarities, so
you can still cache them.


Gokhan


On Wed, Nov 6, 2013 at 6:32 PM, Cassio Melo <me...@gmail.com> wrote:

> Assuming that most recent ratings or implicit preference data is more
> important than the older ones, I wonder if there is a way to decrease the
> importance (score) of old preference entries without having to update all
> previous preferences.
> 
> Currently I'm fetching new preferences from time to time and using the
> .refresh() method to update the data model with the new values.
> 
> Thanks
> 


Re: Decaying score for old preferences when using the .refresh()

Posted by Gokhan Capan <gk...@gmail.com>.
Cassio,

I am not sure if there are direct/indirect ways to do this with existing
code.

Recall that an item neighborhood based score prediction, in simplest terms,
is a weighted average of the active user's ratings on other items, where
the weights are item-to-item similarities. Applying a decay function to
these item-to-item weights, where the decay is based on the rating time
of the active user on the "other item"s, can help achieve this.

One consideration is that, for users who do not change their rating
behavior much, this decay can mask valuable historical information.

This particular approach is discussed, and proven to increase the accuracy
in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren. The
decay function is parameterized per user, keeping track of how consistent
the user behavior is.

If you think it is not necessary to estimate those per-user parameters, in
Mahout's current neighborhood-based recommenders, you might apply that
decay to item-to-item similarities at "recommendation time". Note that
DataModel#getPreferenceTime is the method you require. If you're using a
GenericItemBasedRecommender directly,
GenericItemBasedRecommender#doEstimatePreference is where your edits would
go. The benefit here is not having to update item-to-item similarities, so
you can still cache them.
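
For anyone who wants to try this, a rough sketch of that recommendation-time decay, assuming the doEstimatePreference(long, PreferenceArray, long) signature of the current GenericItemBasedRecommender and a DataModel whose getPreferenceTime returns timestamps; the exponential decay and its 30-day half-life are illustrative choices, not anything built into Mahout:

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.model.PreferenceArray;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

// Weighted average of the user's ratings on other items, with each
// item-to-item similarity weight decayed by how long ago the user rated
// that other item. The cached item-item similarities themselves are untouched.
public class TimeDecayedItemBasedRecommender extends GenericItemBasedRecommender {

  private static final double HALF_LIFE_MS = 30.0 * 24 * 60 * 60 * 1000; // hypothetical 30-day half-life

  public TimeDecayedItemBasedRecommender(DataModel dataModel, ItemSimilarity similarity) {
    super(dataModel, similarity);
  }

  @Override
  protected float doEstimatePreference(long userID, PreferenceArray preferencesFromUser, long itemID)
      throws TasteException {
    long now = System.currentTimeMillis();
    double weightedSum = 0.0;
    double totalWeight = 0.0;
    for (Preference pref : preferencesFromUser) {
      double similarity = getSimilarity().itemSimilarity(itemID, pref.getItemID());
      if (Double.isNaN(similarity)) {
        continue; // no similarity known for this pair
      }
      Long ratedAt = getDataModel().getPreferenceTime(userID, pref.getItemID());
      double decay = ratedAt == null ? 1.0 : Math.pow(0.5, (now - ratedAt) / HALF_LIFE_MS);
      double weight = similarity * decay;
      weightedSum += weight * pref.getValue();
      totalWeight += Math.abs(weight);
    }
    return totalWeight == 0.0 ? Float.NaN : (float) (weightedSum / totalWeight);
  }
}

How to pick the half-life (Koren fits the decay per user) and whether to decay negative similarities as well is left open here.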


Gokhan


On Wed, Nov 6, 2013 at 6:32 PM, Cassio Melo <me...@gmail.com> wrote:

> Assuming that most recent ratings or implicit preference data is more
> important than the older ones, I wonder if there is a way to decrease the
> importance (score) of old preference entries without having to update all
> previous preferences.
>
> Currently I'm fetching new preferences from time to time and using the
> .refresh() method to update the data model with the new values.
>
> Thanks
>