Posted to user@mahout.apache.org by Frederik Kraus <fr...@gmail.com> on 2013/01/31 20:02:13 UTC

(near) real time recommender/predictor

Hi Guys,  

I'm rather new to the whole Mahout ecosystem, so please excuse if the questions I have are rather dumb ;)

Our "problem" basically boils down to this: we want to match users with either the content they are interested in and/or the content they could contribute to. To do this "matching" we have several dimensions of both users and content items (things like: contribution history, tags, browsing history, diggs, likes, ….).

As interest of users can change over time some kind of CF algorithm including temporal effects would obviously be best, but for the time being those effects could probably be neglected.

Now my questions:

- what algorithm from the mahout "toolkit" would best fit our case?
- How can we get this near realtime, i.e. not having to recalculate the entire model when user dimensions change and/or new content is being added to the system (or updated)
- how would we model the user and item vectors (especially things like "tags")?
- any hints on where to start? ;)

Thanks a lot!

Fred.  


Re: Using IDF in CF recommender

Posted by Paulo Villegas <pa...@gmail.com>.
 > The effect of downweighting the popular items is very similar to
 > removing them from recommendations so I still suspect precision will
 > go down using IDF. Obviously this can pretty easily be tested, I just
 > wondered if anyone had already done it.
 >
 > This brings up a problem with holdout based precision. It measures
 > the value of a model trained on a training set in predicting
 > something that is in the holdout set. This may or may not correlate
 > with affecting user behavior.

Indeed. The problem with holdout sets is that they only indicate what 
users did with certain items. There's no way to know what they would 
have done with items they were not exposed to.

 >
 > To use purchases as preference indicators, a precision metric would
 > measure how well purchases in the training set predicted purchases in
 > the test set. If IDF lowers precision, it may also affect user
 > behavior strongly by recommending non-obvious (non-inevitable)
 > items.

It's also a strategic decision: whether you want to use recommendations 
to reinforce the "long tail" of your catalog or go with the sure thing.


>
> This effect on user behavior AFAIK can't be measured from holdout
> tests. I worry that precision related measures may point us in the
> wrong direction. Are A/B tests our only reliable metric for questions
> like this?


I'm afraid I agree: A/B testing is the only truly valid proof that one
recommender config is better than another.

And even A/B testing may point us in the wrong direction. Say we find
one configuration for which we can measure better sales at a sufficient
significance level; that configuration is then the best one according to
an experimental A/B test, i.e. the Holy Grail of measures. But what if
our ultimate goal is customer retention? Maybe those short-term
recommendations of, say, very popular items (because we're not using the
IDF weights) are achieving sales we would have had anyway, but are not
helping client loyalty because there's no added value perceived. So in
the long term we'll increase churn because our recommendations do not
differentiate us.

Life and business are complicated :-)

As for offline metrics, I consider them a hint that can help in pruning
the space of possible recommender configurations. But discarding one
system in favour of another based only on precision is risky; the
difference would need to be more than merely significant.



Re: Using IDF in CF recommender

Posted by Pat Ferrel <pa...@gmail.com>.
The effect of downweighting the popular items is very similar to removing them from recommendations, so I still suspect precision will go down using IDF. Obviously this can pretty easily be tested; I just wondered if anyone had already done it.

This brings up a problem with holdout based precision. It measures the value of a model trained on a training set in predicting something that is in the holdout set. This may or may not correlate with affecting user behavior. 

To use purchases as preference indicators, a precision metric would measure how well purchases in the training set predicted purchases in the test set. If IDF lowers precision, it may also affect user behavior strongly by recommending non-obvious (non-inevitable) items. 

This effect on user behavior AFAIK can't be measured from holdout tests. I worry that precision related measures may point us in the wrong direction. Are A/B tests our only reliable metric for questions like this?

On Feb 6, 2013, at 9:04 AM, Paulo Villegas <pa...@gmail.com> wrote:

> 

> This results in no information for universally preferred items, which
> is indeed what I was looking for. It looks like this should also work
> for other values or explicit preferences--item prices, ratings,
> etc..
> 
> Intuition says this will result in a lower precision related cross
> validation measure since you are discounting the obvious
> recommendations. I have no experience with measuring something like
> this, any you have would be appreciated.
> 

(this is just guesswork, so I could be terribly wrong)

In a non-IDF-weighted recommender, if you take out the top N% of items
(the items with the most occurrences in the user-item matrix), precision
will suffer badly, since the recommender will miss opportunities to
recommend "easy targets" (items with a high probability of occurring in
the test set).

In an IDF-weighted recommender, it could improve precision instead,
since you take out items that are highly likely to be in the test set
but were not going to be recommended in top positions anyway, due to
their strong IDF down-weight. This would be a hint that the IDF weight
is working to suppress the "obvious" recommendations.

In this last case, precision would tend to go up as you keep removing a
bigger share of top items, until you reach a point of diminishing
returns, at which the growing loss of user-item data caused by removing
top-item interactions spoils any advantage of taking them out of the
picture. This might be the point at which you decide pruning top items
is best. So you could use that % of top items pruned as the place for
your "canonical" precision value.

Highly application- and domain-dependent, anyway.

Paulo






Re: Using IDF in CF recommender

Posted by Paulo Villegas <pa...@gmail.com>.
>

> This results in no information for universally preferred items, which
> is indeed what I was looking for. It looks like this should also work
> for other values or explicit preferences--item prices, ratings,
> etc..
>
> Intuition says this will result in a lower precision related cross
> validation measure since you are discounting the obvious
> recommendations. I have no experience with measuring something like
> this, any you have would be appreciated.
>

(this is just guesswork, so I could be terribly wrong)

In a non-IDF-weighted recommender, if you take out the top N% of items
(the items with the most occurrences in the user-item matrix), precision
will suffer badly, since the recommender will miss opportunities to
recommend "easy targets" (items with a high probability of occurring in
the test set).

In an IDF-weighted recommender, it could improve precision instead,
since you take out items that are highly likely to be in the test set
but were not going to be recommended in top positions anyway, due to
their strong IDF down-weight. This would be a hint that the IDF weight
is working to suppress the "obvious" recommendations.

In this last case, precision would tend to go up as you keep removing a
bigger share of top items, until you reach a point of diminishing
returns, at which the growing loss of user-item data caused by removing
top-item interactions spoils any advantage of taking them out of the
picture. This might be the point at which you decide pruning top items
is best. So you could use that % of top items pruned as the place for
your "canonical" precision value.

Highly application- and domain-dependent, anyway.

Paulo
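
What Paulo describes above can be run as a simple sweep: drop the top p% most popular items from the data, re-measure precision@k, and look for the knee. Below is a minimal sketch of that sweep using Mahout's in-memory "Taste" evaluator (not Paulo's code; prefs.csv, the sweep fractions, precision@10, the neighborhood size and the user-based log-likelihood recommender are all placeholder choices, and withoutTopItems is a helper written here, not a Mahout API):

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.common.FastIDSet;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.GenericDataModel;
import org.apache.mahout.cf.taste.impl.model.GenericPreference;
import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.model.PreferenceArray;

public class TopItemPruningSweep {

  public static void main(String[] args) throws Exception {
    DataModel full = new FileDataModel(new File("prefs.csv"));   // user,item,value
    RecommenderIRStatsEvaluator eval = new GenericRecommenderIRStatsEvaluator();
    for (double p : new double[] {0.0, 0.01, 0.02, 0.05, 0.10}) {
      DataModel pruned = withoutTopItems(full, p);
      IRStatistics stats = eval.evaluate(
          trainingModel -> {
            LogLikelihoodSimilarity sim = new LogLikelihoodSimilarity(trainingModel);
            return new GenericUserBasedRecommender(trainingModel,
                new NearestNUserNeighborhood(50, sim, trainingModel), sim);
          },
          null, pruned, null, 10,
          GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
      System.out.printf("top %.0f%% pruned -> precision@10 = %.4f%n",
          p * 100, stats.getPrecision());
    }
  }

  // Rebuild the data model with every preference for the top 'fraction' most
  // popular items removed (popularity = number of users with a preference).
  static DataModel withoutTopItems(DataModel model, double fraction) throws TasteException {
    List<long[]> countAndItem = new ArrayList<>();
    LongPrimitiveIterator items = model.getItemIDs();
    while (items.hasNext()) {
      long itemID = items.nextLong();
      countAndItem.add(new long[] {model.getNumUsersWithPreferenceFor(itemID), itemID});
    }
    countAndItem.sort((a, b) -> Long.compare(b[0], a[0]));   // most popular first
    FastIDSet banned = new FastIDSet();
    int numBanned = (int) (fraction * countAndItem.size());
    for (int i = 0; i < numBanned; i++) {
      banned.add(countAndItem.get(i)[1]);
    }
    FastByIDMap<PreferenceArray> kept = new FastByIDMap<>();
    LongPrimitiveIterator users = model.getUserIDs();
    while (users.hasNext()) {
      long userID = users.nextLong();
      List<Preference> prefs = new ArrayList<>();
      for (Preference pref : model.getPreferencesFromUser(userID)) {
        if (!banned.contains(pref.getItemID())) {
          prefs.add(new GenericPreference(userID, pref.getItemID(), pref.getValue()));
        }
      }
      if (!prefs.isEmpty()) {
        kept.put(userID, new GenericUserPreferenceArray(prefs));
      }
    }
    return new GenericDataModel(kept);
  }
}

Plotting precision@10 against the pruned fraction should show the point of diminishing returns described above, and the same loop can be rerun with an IDF-weighted variant for comparison.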





Re: Using IDF in CF recommender

Posted by Pat Ferrel <pa...@gmail.com>.
oops, forgot the log 

So...
idf weighted preference value = item preference value * log (number of all users/number of users with specific item pref)

                  items
              1     0     0
users         1     0     0
              1     1     0

freq          3     1     0
#users/freq  3/3   3/1    0

So the idf weighted values

       1*log(1)   0          0
       1*log(1)   0          0
       1*log(1)   1*log(3)   0
sum       0       log(3)     0

so the IDF weighted matrix is

                  items
              0     0      0
users         0     0      0
              0    0.48    0

This results in no information for universally preferred items, which is indeed what I was looking for. It looks like this should also work for other values or explicit preferences--item prices, ratings, etc.

Intuition says this will result in a lower precision related cross validation measure since you are discounting the obvious recommendations. I have no experience with measuring something like this; any you have would be appreciated. 
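
For concreteness, here is a small self-contained sketch of the weighting worked through above (not Pat's code). Base-10 log is assumed only because it reproduces the 0.48 figure, and items nobody has touched are given weight 0 rather than log(N/0):

public class IdfWeightExample {
  public static void main(String[] args) {
    // The 3x3 boolean user-item matrix from the example above.
    double[][] prefs = {
        {1, 0, 0},
        {1, 0, 0},
        {1, 1, 0},
    };
    int numUsers = prefs.length;
    int numItems = prefs[0].length;

    double[][] weighted = new double[numUsers][numItems];
    for (int item = 0; item < numItems; item++) {
      int usersWithItem = 0;
      for (double[] row : prefs) {
        if (row[item] > 0) usersWithItem++;
      }
      // weight(item) = log10(numUsers / numUsersWithItem); untouched items get 0.
      double idf = usersWithItem == 0 ? 0.0
          : Math.log10((double) numUsers / usersWithItem);
      for (int user = 0; user < numUsers; user++) {
        weighted[user][item] = prefs[user][item] * idf;
      }
    }

    for (double[] row : weighted) {
      System.out.printf("%.2f  %.2f  %.2f%n", row[0], row[1], row[2]);
    }
    // Prints a matrix whose only non-zero entry is ~0.48 (user 3, item 2),
    // i.e. the universally preferred item 1 carries no information.
  }
}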
  
On Feb 5, 2013, at 12:33 PM, Ted Dunning <te...@gmail.com> wrote:

On Tue, Feb 5, 2013 at 11:29 AM, Pat Ferrel <pa...@gmail.com> wrote:

> I think you meant: "Human relatedness decays much slower than item
> popularity."
> 

Yes.  Oops.


> So to make sure I understand the implications of using IDF…  For
> boolean/implicit preferences the sum of all prefs (after weighting) for a
> single item over all users will always be 1 or 0. This no matter whether
> the frequency is 1M or 1.
> 

I don't see this.

For things that occur once for N users, the sum is log N.  For items that
occur for every user, the sum will be 0.

Another approach would be to do some kind of outlier detection and remove
> those users.


Down-sampling and proper thresholding handles this.   Crazy users and
crawlers are relatively rare and each get only a single vote.  This makes
them immaterial.

Looking at some types of web data you will see crawlers as outliers mucking
> up impression or click-thru data.
> 

You will see them, but they shouldn't matter.


> 
> On Feb 2, 2013, at 1:25 PM, Ted Dunning <te...@gmail.com> wrote:
> 
> On Sat, Feb 2, 2013 at 1:03 PM, Pat Ferrel <pa...@gmail.com> wrote:
> 
>> Indeed, please elaborate. Not sure what you mean by "this is an important
>> effect"
>> 
>> Do you disagree with what I said re temporal decay?
>> 
> 
> No.  I agree with it.  Human relatedness decays much more quickly than item
> popularity.
> 
> I was extending this.  Down-sampling should make use of this observation to
> try to preserve time coincidence in the resulting dataset.
> 
> 
>> As to downsampling or rather reweighting outliers in popular items and/or
>> active users--It's another interesting question. Does the fact that we
> both
>> like puppies and motherhood make us in any real way similar? I'm quite
>> interested in ways to account for this. I've seen what is done to
> normalize
>> ratings from different users based on whether they tend to rate high or
>> low. I'm interested in any papers talking about the super active user or
>> super popular items.
>> 
> 
> I view downsampling as a necessary evil when using cooccurrence based
> algorithms.  This only applies to prolific users.
> 
> For items, I tend to use simple IDF weightings.  This gives very low
> weights to ubiquitous preferences.
> 
> 
> 
>> 
>> Another subject of interest is the question; is it possible to create a
>> blend of recommenders based on their performance on long tail items.
> 
> 
> Absolutely this is possible and it is a great thing to do.  Ensembles are
> all the fashion rage and for good reason.  See all the top players in the
> Netflix challenge.
> 
> 
>> For instance if the precision of a recommender (just considering the
>> item-item similarity for the present) as a function of item popularity
>> decreases towards the long tail, is it possible that one type of
>> recommender does better than another--do the distributions cross? This
>> would suggest a blending strategy based on how far out the long tail you
>> are when calculating similar items.
> 
> 
> Yeah... but you can't tell very well due to the low counts.
> 
> 


Re: Using IDF in CF recommender

Posted by Ted Dunning <te...@gmail.com>.
On Tue, Feb 5, 2013 at 11:29 AM, Pat Ferrel <pa...@gmail.com> wrote:

> I think you meant: "Human relatedness decays much slower than item
> popularity."
>

Yes.  Oops.


> So to make sure I understand the implications of using IDF…  For
> boolean/implicit preferences the sum of all prefs (after weighting) for a
> single item over all users will always be 1 or 0. This no matter whether
> the frequency is 1M or 1.
>

I don't see this.

For things that occur once for N users, the sum is log N.  For items that
occur for every user, the sum will be 0.
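
A worked restatement of those two cases, with N users, an item preferred by n of them, boolean preferences, and the weight log(N/n) used earlier in the thread:

  n = 1:  column sum = 1 * log(N/1) = log N
  n = N:  column sum = N * log(N/N) = 0

so a universally preferred item contributes nothing, while a rare item keeps its full weight.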

Another approach would be to do some kind of outlier detection and remove
> those users.


Down-sampling and proper thresholding handles this.   Crazy users and
crawlers are relatively rare and each get only a single vote.  This makes
them immaterial.

Looking at some types of web data you will see crawlers as outliers mucking
> up impression or click-thru data.
>

You will see them, but they shouldn't matter.


>
> On Feb 2, 2013, at 1:25 PM, Ted Dunning <te...@gmail.com> wrote:
>
> On Sat, Feb 2, 2013 at 1:03 PM, Pat Ferrel <pa...@gmail.com> wrote:
>
> > Indeed, please elaborate. Not sure what you mean by "this is an important
> > effect"
> >
> > Do you disagree with what I said re temporal decay?
> >
>
> No.  I agree with it.  Human relatedness decays much more quickly than item
> popularity.
>
> I was extending this.  Down-sampling should make use of this observation to
> try to preserve time coincidence in the resulting dataset.
>
>
> > As to downsampling or rather reweighting outliers in popular items and/or
> > active users--It's another interesting question. Does the fact that we
> both
> > like puppies and motherhood make us in any real way similar? I'm quite
> > interested in ways to account for this. I've seen what is done to
> normalize
> > ratings from different users based on whether they tend to rate high or
> > low. I'm interested in any papers talking about the super active user or
> > super popular items.
> >
>
> I view downsampling as a necessary evil when using cooccurrence based
> algorithms.  This only applies to prolific users.
>
> For items, I tend to use simple IDF weightings.  This gives very low
> weights to ubiquitous preferences.
>
>
>
> >
> > Another subject of interest is the question; is it possible to create a
> > blend of recommenders based on their performance on long tail items.
>
>
> Absolutely this is possible and it is a great thing to do.  Ensembles are
> all the fashion rage and for good reason.  See all the top players in the
> Netflix challenge.
>
>
> > For instance if the precision of a recommender (just considering the
> > item-item similarity for the present) as a function of item popularity
> > decreases towards the long tail, is it possible that one type of
> > recommender does better than another--do the distributions cross? This
> > would suggest a blending strategy based on how far out the long tail you
> > are when calculating similar items.
>
>
> Yeah... but you can't tell very well due to the low counts.
>
>

Using IDF in CF recommender

Posted by Pat Ferrel <pa...@gmail.com>.
I think you meant: "Human relatedness decays much slower than item
popularity."

So to make sure I understand the implications of using IDF…  For boolean/implicit preferences, the sum of all prefs (after weighting) for a single item over all users will always be 1 or 0. This holds no matter whether the frequency is 1M or 1.

Another approach would be to do some kind of outlier detection and remove those users. Looking at some types of web data you will see crawlers as outliers mucking up impression or click-thru data.

On Feb 2, 2013, at 1:25 PM, Ted Dunning <te...@gmail.com> wrote:

On Sat, Feb 2, 2013 at 1:03 PM, Pat Ferrel <pa...@gmail.com> wrote:

> Indeed, please elaborate. Not sure what you mean by "this is an important
> effect"
> 
> Do you disagree with what I said re temporal decay?
> 

No.  I agree with it.  Human relatedness decays much more quickly than item
popularity.

I was extending this.  Down-sampling should make use of this observation to
try to preserve time coincidence in the resulting dataset.


> As to downsampling or rather reweighting outliers in popular items and/or
> active users--It's another interesting question. Does the fact that we both
> like puppies and motherhood make us in any real way similar? I'm quite
> interested in ways to account for this. I've seen what is done to normalize
> ratings from different users based on whether they tend to rate high or
> low. I'm interested in any papers talking about the super active user or
> super popular items.
> 

I view downsampling as a necessary evil when using cooccurrence based
algorithms.  This only applies to prolific users.

For items, I tend to use simple IDF weightings.  This gives very low
weights to ubiquitous preferences.



> 
> Another subject of interest is the question; is it possible to create a
> blend of recommenders based on their performance on long tail items.


Absolutely this is possible and it is a great thing to do.  Ensembles are
all the fashion rage and for good reason.  See all the top players in the
Netflix challenge.


> For instance if the precision of a recommender (just considering the
> item-item similarity for the present) as a function of item popularity
> decreases towards the long tail, is it possible that one type of
> recommender does better than another--do the distributions cross? This
> would suggest a blending strategy based on how far out the long tail you
> are when calculating similar items.


Yeah... but you can't tell very well due to the low counts.


Re: (near) real time recommender/predictor

Posted by Ted Dunning <te...@gmail.com>.
On Sat, Feb 2, 2013 at 1:03 PM, Pat Ferrel <pa...@gmail.com> wrote:

> Indeed, please elaborate. Not sure what you mean by "this is an important
> effect"
>
> Do you disagree with what I said re temporal decay?
>

No.  I agree with it.  Human relatedness decays much more quickly than item
popularity.

I was extending this.  Down-sampling should make use of this observation to
try to preserve time coincidence in the resulting dataset.


> As to downsampling or rather reweighting outliers in popular items and/or
> active users--It's another interesting question. Does the fact that we both
> like puppies and motherhood make us in any real way similar? I'm quite
> interested in ways to account for this. I've seen what is done to normalize
> ratings from different users based on whether they tend to rate high or
> low. I'm interested in any papers talking about the super active user or
> super popular items.
>

I view downsampling as a necessary evil when using cooccurrence based
algorithms.  This only applies to prolific users.

For items, I tend to use simple IDF weightings.  This gives very low
weights to ubiquitous preferences.
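
As an illustration of that down-sampling (one possible implementation, not Ted's actual procedure): cap only the prolific users, keeping their most recent events so that, per the earlier point about time coincidence, what survives is still close together in time. The Interaction class and the per-user cap are assumptions:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DownsampleProlificUsers {

  public static final class Interaction {
    public final long userId;
    public final long itemId;
    public final long timestamp;
    public Interaction(long userId, long itemId, long timestamp) {
      this.userId = userId;
      this.itemId = itemId;
      this.timestamp = timestamp;
    }
  }

  // Cap every user at maxPerUser interactions, keeping the most recent ones.
  // Ordinary users (fewer than maxPerUser events) pass through untouched; only
  // the prolific users and crawlers actually lose anything.
  public static List<Interaction> downsample(List<Interaction> all, int maxPerUser) {
    Map<Long, List<Interaction>> byUser = new HashMap<>();
    for (Interaction interaction : all) {
      byUser.computeIfAbsent(interaction.userId, k -> new ArrayList<>()).add(interaction);
    }
    List<Interaction> kept = new ArrayList<>();
    for (List<Interaction> events : byUser.values()) {
      events.sort(Comparator.comparingLong((Interaction i) -> i.timestamp).reversed());
      kept.addAll(events.subList(0, Math.min(maxPerUser, events.size())));
    }
    return kept;
  }
}

Keeping the newest events is only one way to respect time coincidence; sampling a contiguous time window per heavy user would work as well.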



>
> Another subject of interest is the question; is it possible to create a
> blend of recommenders based on their performance on long tail items.


Absolutely this is possible and it is a great thing to do.  Ensembles are
all the rage and for good reason.  See all the top players in the
Netflix challenge.


> For instance if the precision of a recommender (just considering the
> item-item similarity for the present) as a function of item popularity
> decreases towards the long tail, is it possible that one type of
> recommender does better than another--do the distributions cross? This
> would suggest a blending strategy based on how far out the long tail you
> are when calculating similar items.


Yeah... but you can't tell very well due to the low counts.

Re: (near) real time recommender/predictor

Posted by Pat Ferrel <pa...@gmail.com>.
Indeed, please elaborate. Not sure what you mean by "this is an important effect"  

Do you disagree with what I said re temporal decay?

As to downsampling or rather reweighting outliers in popular items and/or active users--It's another interesting question. Does the fact that we both like puppies and motherhood make us in any real way similar? I'm quite interested in ways to account for this. I've seen what is done to normalize ratings from different users based on whether they tend to rate high or low. I'm interested in any papers talking about the super active user or super popular items.

Another subject of interest is the question: is it possible to create a blend of recommenders based on their performance on long tail items? For instance, if the precision of a recommender (just considering the item-item similarity for the present) as a function of item popularity decreases towards the long tail, is it possible that one type of recommender does better than another--do the distributions cross? This would suggest a blending strategy based on how far out the long tail you are when calculating similar items. I haven't found any papers on this but perhaps there are… I think I'll experiment with it if I don't find existing research.

On Feb 2, 2013, at 11:44 AM, Ted Dunning <te...@gmail.com> wrote:

Pat,

This is an important effect and it strongly informs how you should
down-sample heavy users as well as how you should handle temporal dynamics.

On Sat, Feb 2, 2013 at 9:54 AM, Pat Ferrel <pa...@gmail.com> wrote:

> RE: Temporal effects. In CF you are interested in similarities. For
> instance in a User-based CF recommender you want to detect users similar to
> a given user. The time decay of the similarities is likely to be very slow.
> In other words, if I bought an iPad 1 and you bought an iPad 1, the
> similarity in our taste may live on past the introduction of the iPad 3.
> 
> However the "recommendability" of the iPad 1 decays much quicker. I'd
> suggest looking at using the recommendability decay for rescoring the
> recommendations somehow. So if you get back an iPad 1 as a recommendation
> you might rescore it based on the mean time that you saw it in the
> preferences and decay the strength based on that.
> 
> Maybe someone can come up with a better way to rescore but the point is
> that you need to think of the two decays differently and decay preferences
> in the model building step at a different rate, if at all.
> 
> On Jan 31, 2013, at 11:57 AM, Sean Owen <sr...@gmail.com> wrote:
> 
> It's a good question. I think you can achieve a partial solution in Mahout.
> 
> "Real-time" suggests that you won't be able to make use of
> Hadoop-based implementations, since they are by nature big batch
> processes.
> 
> All of the implementations accept the same input -- user,item,value.
> That's OK; you can probably just reduce all of your user-thing
> interactions to tuples like this. Any reasonable mapping should be OK.
> Tags can be items too.
> 
> I don't think any of the implementations take advantage of time.
> 
> The non-Hadoop implementations are not-quite-realtime. The model is
> loading data into memory from backing store, computing and maybe
> caching partial results, and serving results as quickly as possible.
> New input can't be immediately used, no. It comes into play when the
> model is reloaded only.
> 
> I think you have very sparse input -- a high number of users and
> "items" (tags, likes), but relatively few interactions. Matrix
> factorization / latent factor models work well here. The ones in
> Mahout that are not Hadoop-based may work for you, like
> SVDRecommender. It's worth a try.
> 
> (Advertisement: the new recommender product I am commercializing,
> Myrrix, does the real-time and matrix factorization thing just fine.
> It's easy enough to start with that I would encourage you to
> experiment with the open source system also:
> http://myrrix.com/download/)
> 
> 
> 
> On Thu, Jan 31, 2013 at 7:02 PM, Frederik Kraus
> <fr...@gmail.com> wrote:
>> Hi Guys,
>> 
>> I'm rather new to the whole Mahout ecosystem, so please excuse if the
> questions I have are rather dumb ;)
>> 
>> Our "problem" basically boils down to this: we want to match users with
> either the content they are interested in and/or the content they could
> contribute to. To do this "matching" we have several dimensions both of
> users and content items (things like: contribution history, tags, browsing
> history, diggs, likes, ….).
>> 
>> As interest of users can change over time some kind of CF algorithm
> including temporal effects would obviously be best, but for the time being
> those effects could probably be neglected.
>> 
>> Now my questions:
>> 
>> - what algorithm from the mahout "toolkit" would best fit our case?
>> - How can we get this near realtime, i.e. not having to recalculate the
> entire model when user dimensions change and/or new content is being added
> to the system (or updated)
>> - how would we model the user and item vectors (especially things like
> "tags")?
>> - any hints on where to start? ;)
>> 
>> Thanks a lot!
>> 
>> Fred.
>> 
> 
> 


Re: (near) real time recommender/predictor

Posted by Ted Dunning <te...@gmail.com>.
Pat,

This is an important effect and it strongly informs how you should
down-sample heavy users as well as how you should handle temporal dynamics.

On Sat, Feb 2, 2013 at 9:54 AM, Pat Ferrel <pa...@gmail.com> wrote:

> RE: Temporal effects. In CF you are interested in similarities. For
> instance in a User-based CF recommender you want to detect users similar to
> a given user. The time decay of the similarities is likely to be very slow.
> In other words, if I bought an iPad 1 and you bought an iPad 1, the
> similarity in our taste may live on past the introduction of the iPad 3.
>
> However the "recommendability" of the iPad 1 decays much quicker. I'd
> suggest looking at using the recommendability decay for rescoring the
> recommendations somehow. So if you get back an iPad 1 as a recommendation
> you might rescore it based on the mean time that you saw it in the
> preferences and decay the strength based on that.
>
> Maybe someone can come up with a better way to rescore but the point is
> that you need to think of the two decays differently and decay preferences
> in the model building step at a different rate, if at all.
>
> On Jan 31, 2013, at 11:57 AM, Sean Owen <sr...@gmail.com> wrote:
>
> It's a good question. I think you can achieve a partial solution in Mahout.
>
> "Real-time" suggests that you won't be able to make use of
> Hadoop-based implementations, since they are by nature big batch
> processes.
>
> All of the implementations accept the same input -- user,item,value.
> That's OK; you can probably just reduce all of your user-thing
> interactions to tuples like this. Any reasonable mapping should be OK.
> Tags can be items too.
>
> I don't think any of the implementations take advantage of time.
>
> The non-Hadoop implementations are not-quite-realtime. The model is
> loading data into memory from backing store, computing and maybe
> caching partial results, and serving results as quickly as possible.
> New input can't be immediately used, no. It comes into play when the
> model is reloaded only.
>
> I think you have very sparse input -- a high number of users and
> "items" (tags, likes), but relatively few interactions. Matrix
> factorization / latent factor models work well here. The ones in
> Mahout that are not Hadoop-based may work for you, like
> SVDRecommender. It's worth a try.
>
> (Advertisement: the new recommender product I am commercializing,
> Myrrix, does the real-time and matrix factorization thing just fine.
> It's easy enough to start with that I would encourage you to
> experiment with the open source system also:
> http://myrrix.com/download/)
>
>
>
> On Thu, Jan 31, 2013 at 7:02 PM, Frederik Kraus
> <fr...@gmail.com> wrote:
> > Hi Guys,
> >
> > I'm rather new to the whole Mahout ecosystem, so please excuse if the
> questions I have are rather dumb ;)
> >
> > Our "problem" basically boils down to this: we want to match users with
> either the content they are interested in and/or the content they could
> contribute to. To do this "matching" we have several dimensions both of
> users and content items (things like: contribution history, tags, browsing
> history, diggs, likes, ….).
> >
> > As interest of users can change over time some kind of CF algorithm
> including temporal effects would obviously be best, but for the time being
> those effects could probably be neglected.
> >
> > Now my questions:
> >
> > - what algorithm from the mahout "toolkit" would best fit our case?
> > - How can we get this near realtime, i.e. not having to recalculate the
> entire model when user dimensions change and/or new content is being added
> to the system (or updated)
> > - how would we model the user and item vectors (especially things like
> "tags")?
> > - any hints on where to start? ;)
> >
> > Thanks a lot!
> >
> > Fred.
> >
>
>

Re: (near) real time recommender/predictor

Posted by Pat Ferrel <pa...@gmail.com>.
RE: Temporal effects. In CF you are interested in similarities. For instance, in a User-based CF recommender you want to detect users similar to a given user. The time decay of the similarities is likely to be very slow. In other words, if I bought an iPad 1 and you bought an iPad 1, the similarity in our taste may live on past the introduction of the iPad 3. 

However the "recommendability" of the iPad 1 decays much quicker. I'd suggest looking at using the recommendability decay for rescoring the recommendations somehow. So if you get back an iPad 1 as a recommendation you might rescore it based on the mean time that you saw it in the preferences and decay the strength based on that. 

Maybe someone can come up with a better way to rescore, but the point is that you need to think of the two decays differently and decay preferences in the model building step at a different rate, if at all.
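
One way to prototype that rescoring with Mahout's IDRescorer hook (a sketch, not Pat's implementation): similarities are still computed on undecayed data, and only the final ranking is decayed by how stale the item is. The exponential form, the 90-day half-life and the meanTimestampForItem map are assumptions for illustration:

import java.util.List;
import java.util.Map;
import org.apache.mahout.cf.taste.recommender.IDRescorer;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class RecencyRescorer implements IDRescorer {

  private static final double HALF_LIFE_MS = 90.0 * 24 * 60 * 60 * 1000;  // ~90 days

  private final Map<Long, Long> meanTimestampForItem;  // itemID -> mean time seen in prefs
  private final long now;

  public RecencyRescorer(Map<Long, Long> meanTimestampForItem, long now) {
    this.meanTimestampForItem = meanTimestampForItem;
    this.now = now;
  }

  @Override
  public double rescore(long itemID, double originalScore) {
    Long mean = meanTimestampForItem.get(itemID);
    if (mean == null) {
      return originalScore;                          // nothing known, leave it alone
    }
    double age = now - mean;
    double decay = Math.pow(0.5, age / HALF_LIFE_MS); // halve the score every half-life
    return originalScore * decay;
  }

  @Override
  public boolean isFiltered(long itemID) {
    return false;                                    // decay, don't drop outright
  }

  // Usage: the recommender's model is built on undecayed preferences; only the
  // final ranking is adjusted, keeping the two decays separate as argued above.
  public static List<RecommendedItem> recommend(Recommender recommender, long userID,
      Map<Long, Long> meanTimestamps) throws Exception {
    return recommender.recommend(userID, 10,
        new RecencyRescorer(meanTimestamps, System.currentTimeMillis()));
  }
}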

On Jan 31, 2013, at 11:57 AM, Sean Owen <sr...@gmail.com> wrote:

It's a good question. I think you can achieve a partial solution in Mahout.

"Real-time" suggests that you won't be able to make use of
Hadoop-based implementations, since they are by nature big batch
processes.

All of the implementations accept the same input -- user,item,value.
That's OK; you can probably just reduce all of your user-thing
interactions to tuples like this. Any reasonable mapping should be OK.
Tags can be items too.

I don't think any of the implementations take advantage of time.

The non-Hadoop implementations are not-quite-realtime. The model is
loading data into memory from backing store, computing and maybe
caching partial results, and serving results as quickly as possible.
New input can't be immediately used, no. It comes into play when the
model is reloaded only.

I think you have very sparse input -- a high number of users and
"items" (tags, likes), but relatively few interactions. Matrix
factorization / latent factor models work well here. The ones in
Mahout that are not Hadoop-based may work for you, like
SVDRecommender. It's worth a try.

(Advertisement: the new recommender product I am commercializing,
Myrrix, does the real-time and matrix factorization thing just fine.
It's easy enough to start with that I would encourage you to
experiment with the open source system also:
http://myrrix.com/download/)



On Thu, Jan 31, 2013 at 7:02 PM, Frederik Kraus
<fr...@gmail.com> wrote:
> Hi Guys,
> 
> I'm rather new to the whole Mahout ecosystem, so please excuse if the questions I have are rather dumb ;)
> 
> Our "problem" basically boils down to this: we want to match users with either the content they are interested in and/or the content they could contribute to. To do this "matching" we have several dimensions both of users and content items (things like: contribution history, tags, browsing history, diggs, likes, ….).
> 
> As interest of users can change over time some kind of CF algorithm including temporal effects would obviously be best, but for the time being those effects could probably be neglected.
> 
> Now my questions:
> 
> - what algorithm from the mahout "toolkit" would best fit our case?
> - How can we get this near realtime, i.e. not having to recalculate the entire model when user dimensions change and/or new content is being added to the system (or updated)
> - how would we model the user and item vectors (especially things like "tags")?
> - any hints on where to start? ;)
> 
> Thanks a lot!
> 
> Fred.
> 


Re: (near) real time recommender/predictor

Posted by Sean Owen <sr...@gmail.com>.
It's a good question. I think you can achieve a partial solution in Mahout.

"Real-time" suggests that you won't be able to make use of
Hadoop-based implementations, since they are by nature big batch
processes.

All of the implementations accept the same input -- user,item,value.
That's OK; you can probably just reduce all of your user-thing
interactions to tuples like this. Any reasonable mapping should be OK.
Tags can be items too.

I don't think any of the implementations take advantage of time.

The non-Hadoop implementations are not-quite-realtime. The model is
loading data into memory from backing store, computing and maybe
caching partial results, and serving results as quickly as possible.
New input can't be immediately used, no. It comes into play when the
model is reloaded only.

I think you have very sparse input -- a high number of users and
"items" (tags, likes), but relatively few interactions. Matrix
factorization / latent factor models work well here. The ones in
Mahout that are not Hadoop-based may work for you, like
SVDRecommender. It's worth a try.
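
A minimal sketch of that route (placeholder file name, factor count, lambda, iteration count and user ID; tags, likes, diggs etc. are assumed to have already been mapped to item IDs in the same user,item,value input):

import java.io.File;
import java.util.List;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;

public class SvdRecommenderSketch {
  public static void main(String[] args) throws Exception {
    // interactions.csv: userID,itemID,value  (tags and likes given their own item IDs)
    DataModel model = new FileDataModel(new File("interactions.csv"));

    // 20 latent factors, lambda 0.065, 15 ALS iterations -- tune for your data.
    SVDRecommender recommender =
        new SVDRecommender(model, new ALSWRFactorizer(model, 20, 0.065, 15));

    List<RecommendedItem> top = recommender.recommend(42L, 10);
    for (RecommendedItem item : top) {
      System.out.println(item.getItemID() + " : " + item.getValue());
    }
    // New input only shows up after the model is refreshed, e.g.:
    // recommender.refresh(null);   // reload the data model and refactorize
  }
}

As noted above, new data still only comes into play when the model is reloaded, so "near real time" here means refreshing on a short schedule rather than updating per event.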

(Advertisement: the new recommender product I am commercializing,
Myrrix, does the real-time and matrix factorization thing just fine.
It's easy enough to start with that I would encourage you to
experiment with the open source system also:
http://myrrix.com/download/)



On Thu, Jan 31, 2013 at 7:02 PM, Frederik Kraus
<fr...@gmail.com> wrote:
> Hi Guys,
>
> I'm rather new to the whole Mahout ecosystem, so please excuse if the questions I have are rather dumb ;)
>
> Our "problem" basically boils down to this: we want to match users with either the content they are interested in and/or the content they could contribute to. To do this "matching" we have several dimensions both of users and content items (things like: contribution history, tags, browsing history, diggs, likes, ….).
>
> As interest of users can change over time some kind of CF algorithm including temporal effects would obviously be best, but for the time being those effects could probably be neglected.
>
> Now my questions:
>
> - what algorithm from the mahout "toolkit" would best fit our case?
> - How can we get this near realtime, i.e. not having to recalculate the entire model when user dimensions change and/or new content is being added to the system (or updated)
> - how would we model the user and item vectors (especially things like "tags")?
> - any hints on where to start? ;)
>
> Thanks a lot!
>
> Fred.
>