Posted to user@mahout.apache.org by Pat Ferrel <pa...@occamsmachete.com> on 2014/02/05 02:27:53 UTC

Popularity of recommender items

Trying to come up with a relative measure of popularity for items in a recommender. Something that could be used to rank items.

The user-item preference matrix would be the obvious starting point: just add up the number of preferences per item. Maybe transpose the preference matrix (the temp DRM created by the recommender), then for each row vector (now that a row = item) count the non-zero preferences. That count is the number of preferences for the item and gives one measure of popularity. In the case where the preferences are not boolean you’d sum the weights instead.

However, it might be a better idea to look at the item-item similarity matrix. It doesn’t need to be transposed and it contains only the “important” similarities--as calculated by LLR, for example. Here similarity means similarity in terms of which users preferred an item. So summing the non-zero weights per row would give perhaps an even better relative “popularity” measure. For the same reason, clustering the similarity matrix would yield “important” clusters.
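
To make the two measures concrete, a toy NumPy sketch (dense arrays with made-up numbers, not a Mahout DRM) would be something like:

import numpy as np

# Toy user-item preference matrix: rows = users, columns = items.
# Non-zero entries are preference weights (1.0 for boolean preferences).
prefs = np.array([
    [1.0, 0.0, 2.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 3.0, 1.0],
])

# Popularity from the preference matrix: count (or sum) non-zeros per item.
# Working column-wise avoids the transpose.
pref_counts = (prefs != 0).sum(axis=0)   # number of preferences per item
pref_weights = prefs.sum(axis=0)         # summed weights, for non-boolean data

# Popularity-like score from an item-item similarity matrix (e.g. LLR-filtered):
# sum each item's non-zero similarity weights across its row.
sim = np.array([
    [0.0, 0.8, 0.3, 0.0],
    [0.8, 0.0, 0.0, 0.5],
    [0.3, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
])
sim_scores = sim.sum(axis=1)

print(pref_counts, pref_weights, sim_scores)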

Anyone have intuition about this?

I started to think about this because transposing the user-item matrix seems to yield a format that cannot be sent directly into clustering.

Re: Dithering and Thompson Sampling

Posted by Ted Dunning <te...@gmail.com>.
The adPredictor paper has a good example of how to do this.

See http://research.microsoft.com/apps/pubs/default.aspx?id=122779




On Sat, Feb 15, 2014 at 6:38 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> I’m still unclear about how to apply TS to dithering. TS is usually talked
> about in conjunction with a feedback loop, which make the examples seem
> more like Multi-armed Bandit examples.
>
> Are you suggesting some feedback in recommender ranking or just using the
> same distribution assumptions used in TS?
>
> On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com> wrote:
>
> Thompson sampling doesn't require time other than a sense of what do we now
> know.  It really is just a correct form for dithering that uses our current
> knowledge.
>
> For a worked out version of Thompson sampling with ranking, see this blog:
>
> http://tdunning.blogspot.com/2013/04/learning-to-rank-in-very-bayesian-way.html
>
> The reason that we aren't adding this like cross-rec and other things is
> that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
> the rest are not.  You seem more active than most.
>
>
>
>
>
> On Sat, Feb 8, 2014 at 8:50 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:
>
> > Didn’t mean to imply I had historical view data—yet.
> >
> > The Thompson sampling ‘trick’ looks useful for auto converging to the
> best
> > of A/B versions and a replacement for dithering. Below you are proposing
> > another case to replace dithering—this time on a list of popular items?
> > Dithering works on anything you can rank but Thompson Sampling usually
> > implies a time dimension. The initial guess, first Thompson sample, could
> > be thought of as a form of dithering I suppose? Haven’t looked at the
> math
> > but it wouldn’t surprise me to find they are very similar things.
> >
> > While we are talking about it, why aren’t we adding things like
> > cross-recommendations, dithering, popularity, and other generally useful
> > techniques into the Mahout recommenders? All the data is there to do
> these
> > things, and they could be packaged in the same Mahout Jobs. They seem to
> be
> > languishing a bit while technology and the art of recommendations moves
> on.
> >
> > If we add temporal data to preference data a bunch of new features come
> to
> > mind, like hot lists or asymmetric train/query preference history.
> >
> > On Feb 6, 2014, at 9:43 PM, Ted Dunning <te...@gmail.com> wrote:
> >
> > One way to deal with that is to build a model that predicts the ultimate
> > number of views/plays/purchases for the item based on history so far.
> >
> > If this model can be made Bayesian enough to sample from the posterior
> > distribution of total popularity, then you can use the Thompson sampling
> > trick and sort by sampled total views rather than estimated total views.
> > That will give uncertain items (typically new ones) a chance to be shown
> > in the ratings without flooding the list with newcomers.
> >
> > Sent from my iPhone
> >
> >> On Feb 7, 2014, at 3:38, Pat Ferrel <pa...@occamsmachete.com> wrote:
> >>
> >> The particular thing I’m looking at now is how to rank a list of items
> > by some measure of popularity when you don’t have a velocity. There is an
> > introduction date though so another way to look at popularity might be to
> > decay it with something like e^-t where t is it’s age. You can see the
> > decay in the views histogram
> >
> >
>
>

Dithering and Thompson Sampling

Posted by Pat Ferrel <pa...@occamsmachete.com>.
I’m still unclear about how to apply TS to dithering. TS is usually talked about in conjunction with a feedback loop, which makes the examples seem more like multi-armed bandit examples.

Are you suggesting some feedback in recommender ranking or just using the same distribution assumptions used in TS?

On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com> wrote:

Thompson sampling doesn't require time other than a sense of what we now
know.  It really is just a correct form for dithering that uses our current
knowledge.

For a worked out version of Thompson sampling with ranking, see this blog:
http://tdunning.blogspot.com/2013/04/learning-to-rank-in-very-bayesian-way.html

The reason that we aren't adding this like cross-rec and other things is
that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
the rest are not.  You seem more active than most.





On Sat, Feb 8, 2014 at 8:50 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Didn’t mean to imply I had historical view data—yet.
> 
> The Thompson sampling ‘trick’ looks useful for auto converging to the best
> of A/B versions and a replacement for dithering. Below you are proposing
> another case to replace dithering—this time on a list of popular items?
> Dithering works on anything you can rank but Thompson Sampling usually
> implies a time dimension. The initial guess, first Thompson sample, could
> be thought of as a form of dithering I suppose? Haven’t looked at the math
> but it wouldn’t surprise me to find they are very similar things.
> 
> While we are talking about it, why aren’t we adding things like
> cross-recommendations, dithering, popularity, and other generally useful
> techniques into the Mahout recommenders? All the data is there to do these
> things, and they could be packaged in the same Mahout Jobs. They seem to be
> languishing a bit while technology and the art of recommendations moves on.
> 
> If we add temporal data to preference data a bunch of new features come to
> mind, like hot lists or asymmetric train/query preference history.
> 
> On Feb 6, 2014, at 9:43 PM, Ted Dunning <te...@gmail.com> wrote:
> 
> One way to deal with that is to build a model that predicts the ultimate
> number of views/plays/purchases for the item based on history so far.
> 
> If this model can be made Bayesian enough to sample from the posterior
> distribution of total popularity, then you can use the Thompson sampling
> trick and sort by sampled total views rather than estimated total views.
> That will give uncertain items (typically new ones) a chance to be shown
> in the ratings without flooding the list with newcomers.
> 
> Sent from my iPhone
> 
>> On Feb 7, 2014, at 3:38, Pat Ferrel <pa...@occamsmachete.com> wrote:
>> 
>> The particular thing I’m looking at now is how to rank a list of items
> by some measure of popularity when you don’t have a velocity. There is an
> introduction date though so another way to look at popularity might be to
> decay it with something like e^-t where t is it’s age. You can see the
> decay in the views histogram
> 
> 


Re: Popularity of recommender items

Posted by Andrew Musselman <an...@gmail.com>.
Precisely; that's the one, thanks!


On Fri, Feb 14, 2014 at 12:45 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Not sure if this is what you are looking for. I assume you are talking
> about Ted's paper describing a Solr based recommender pipeline?
>
> Much of the paper was implemented in the solr-recommender referenced
> below, which has a fairly flexible parallel version of a logfile reader
> that uses Cascading for mapreduce. It picks out columns in delimited text
> files. You can choose a constant string for your action id, like "purchase"
> or "thumbs-up". Then specify the field index for user, item, and action. It
> assumes strings for all these inputs and creates
> string-id->Mahout-Integer-id->string-id bidirectional hashmaps as
> dictionary and reverse dictionary. Everything is scalable except the
> BiHashmaps, which are in-memory. They aren't usually too big for that.
> There is also a pattern for the input log file names and they are searched
> for recursively from some root directory.
>
> Caveat emptor: not all the options are implemented or tested. One person
> has already implemented a scaffolded option and their pull request was
> merged so feel free to contribute.
>
> It is an example of how to digest logfiles, build Mahout data, and run the
> recommender. It creates Solr indexing data too but the output of the
> recommender is up to you to implement. It is a Solr query or a lookup in
> the Mahout recommender DRM output.
>
> https://github.com/pferrel/solr-recommender
>
>
> On Feb 14, 2014, at 12:39 PM, Ted Dunning <te...@gmail.com> wrote:
>
> Yes!
>
> But it is very hard to find the time.
>
>
>
> On Fri, Feb 14, 2014 at 11:51 AM, Andrew Musselman <
> andrew.musselman@gmail.com> wrote:
>
> > I'd like to see cross-recommendations added too.
> >
> > But I also want some automation of the steps required to build a simple
> > recommender like the solr/mahout example Ted and Ellen have in their
> > pamphlet.
> >
> > Lowering the barrier to entry by providing a sample pipeline would help a
> > lot of folks get started and hopefully would keep them interested.
>  Perhaps
> > in examples/bin?
> >
> >
> > On Fri, Feb 14, 2014 at 10:56 AM, Pat Ferrel <pa...@occamsmachete.com>
> > wrote:
> >
> >> There's been work done on the cross-recommender. There is a Mahout-style
> >> XRecommenderJob that has two preference models for two actions or
> >> preference types. It uses matrix multiply to get a cooccurrence type
> >> similarity matrix. If we had a cross-row-similarity-job, it could pretty
> >> easily be integrated and I'd volunteer to integrate it. The XRSJ is
> >> probably beyond me right now so if we can scare up someone to do that
> > we'd
> >> be a long way down the road.
> >>
> >> I'll put a feature request into Jira and take this to the dev list
> >>
> >> BTW this is already integrated with the solr-recommender.
> >>
> >> On Feb 8, 2014, at 7:19 PM, Ted Dunning <te...@gmail.com> wrote:
> >>
> >> I have different opinions about each piece.
> >>
> >> I think that cross recommendation is as core as RowSimilarityJob and
> > should
> >> be a parallel implementation or integrated.  Parallel is probably
> easier.
> >> It is even plausible to have a version of RowSimilarityJob that doesn't
> >> support all the different distance measures but does support multiple
> > cross
> >> and direct processing using LLR or related cooccurrence based measures.
> > It
> >> would be very cool if a single pass over the data could do many kinds of
> > co
> >> or cross occurrence operations.
> >>
> >> For dithering, it really is post processing.  That said, it is also the
> >> single largest improvement that anybody typically gets when testing
> >> different options so it is a bit goofy to not have good support for some
> >> kinds of dithering.
> >>
> >> For Thompson sampled recommenders, I am not sure where to start hacking
> > on
> >> our current code.
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <pa...@occamsmachete.com>
> > wrote:
> >>
> >>> That was by no means to criticize effort level, which has been
> > impressive
> >>> especially during the release.
> >>>
> >>> It was more a question about the best place to add these things and
> >>> whether they are important. Whether people see these things as custom
> >> post
> >>> processing or core.
> >>>
> >>> On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com>
> > wrote:
> >>>
> >>> ...
> >>>
> >>> The reason that we aren't adding this like cross-rec and other things
> > is
> >>> that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout,
> >> but
> >>> the rest are not.  You seem more active than most.
> >>>
> >>>
> >>>
> >>
> >>
> >
>
>

Re: Popularity of recommender items

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Not sure if this is what you are looking for. I assume you are talking about Ted’s paper describing a Solr-based recommender pipeline?

Much of the paper was implemented in the solr-recommender referenced below, which has a fairly flexible parallel logfile reader that uses Cascading for mapreduce. It picks out columns in delimited text files. You can choose a constant string for your action id, like “purchase” or “thumbs-up”, then specify the field index for user, item, and action. It assumes strings for all these inputs and creates string-id -> Mahout-integer-id -> string-id bidirectional hashmaps as a dictionary and reverse dictionary. Everything is scalable except the BiHashMaps, which are in-memory; they aren’t usually too big for that to be a problem. There is also a pattern for the input logfile names, and they are searched for recursively from some root directory.
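
Roughly what those hashmaps do, sketched in Python rather than the Java in the project (the class and method names here are made up):

class IdDictionary:
    """Toy bidirectional map: external string id <-> internal integer id."""

    def __init__(self):
        self.to_int = {}   # string id -> int id (the dictionary)
        self.to_str = []   # int id -> string id (the reverse dictionary)

    def intern(self, string_id):
        # Assign the next integer id the first time a string id is seen.
        if string_id not in self.to_int:
            self.to_int[string_id] = len(self.to_str)
            self.to_str.append(string_id)
        return self.to_int[string_id]

    def external(self, int_id):
        return self.to_str[int_id]

users, items = IdDictionary(), IdDictionary()
u = users.intern("user-abc")    # -> 0
i = items.intern("sku-123")     # -> 0
print(users.external(u), items.external(i))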

Caveat emptor: not all the options are implemented or tested. One person has already implemented a scaffolded option and their pull request was merged so feel free to contribute.

It is an example of how to digest logfiles, build Mahout data, and run the recommender. It creates Solr indexing data too but the output of the recommender is up to you to implement. It is a Solr query or a lookup in the Mahout recommender DRM output.

https://github.com/pferrel/solr-recommender


On Feb 14, 2014, at 12:39 PM, Ted Dunning <te...@gmail.com> wrote:

Yes!

But it is very hard to find the time.



On Fri, Feb 14, 2014 at 11:51 AM, Andrew Musselman <
andrew.musselman@gmail.com> wrote:

> I'd like to see cross-recommendations added too.
> 
> But I also want some automation of the steps required to build a simple
> recommender like the solr/mahout example Ted and Ellen have in their
> pamphlet.
> 
> Lowering the barrier to entry by providing a sample pipeline would help a
> lot of folks get started and hopefully would keep them interested.  Perhaps
> in examples/bin?
> 
> 
> On Fri, Feb 14, 2014 at 10:56 AM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
> 
>> There's been work done on the cross-recommender. There is a Mahout-style
>> XRecommenderJob that has two preference models for two actions or
>> preference types. It uses matrix multiply to get a cooccurrence type
>> similarity matrix. If we had a cross-row-similarity-job, it could pretty
>> easily be integrated and I'd volunteer to integrate it. The XRSJ is
>> probably beyond me right now so if we can scare up someone to do that
> we'd
>> be a long way down the road.
>> 
>> I'll put a feature request into Jira and take this to the dev list
>> 
>> BTW this is already integrated with the solr-recommender.
>> 
>> On Feb 8, 2014, at 7:19 PM, Ted Dunning <te...@gmail.com> wrote:
>> 
>> I have different opinions about each piece.
>> 
>> I think that cross recommendation is as core as RowSimilarityJob and
> should
>> be a parallel implementation or integrated.  Parallel is probably easier.
>> It is even plausible to have a version of RowSimilarityJob that doesn't
>> support all the different distance measures but does support multiple
> cross
>> and direct processing using LLR or related cooccurrence based measures.
> It
>> would be very cool if a single pass over the data could do many kinds of
> co
>> or cross occurrence operations.
>> 
>> For dithering, it really is post processing.  That said, it is also the
>> single largest improvement that anybody typically gets when testing
>> different options so it is a bit goofy to not have good support for some
>> kinds of dithering.
>> 
>> For Thompson sampled recommenders, I am not sure where to start hacking
> on
>> our current code.
>> 
>> 
>> 
>> 
>> 
>> 
>> On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
>> 
>>> That was by no means to criticize effort level, which has been
> impressive
>>> especially during the release.
>>> 
>>> It was more a question about the best place to add these things and
>>> whether they are important. Whether people see these things as custom
>> post
>>> processing or core.
>>> 
>>> On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com>
> wrote:
>>> 
>>> ...
>>> 
>>> The reason that we aren't adding this like cross-rec and other things
> is
>>> that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout,
>> but
>>> the rest are not.  You seem more active than most.
>>> 
>>> 
>>> 
>> 
>> 
> 


Re: Popularity of recommender items

Posted by Andrew Musselman <an...@gmail.com>.
Oh yes.  I do have a small team I could enlist to do things like this; is
there a starting point somewhere on Github, Ted?


On Fri, Feb 14, 2014 at 12:39 PM, Ted Dunning <te...@gmail.com> wrote:

> Yes!
>
> But it is very hard to find the time.
>
>
>
> On Fri, Feb 14, 2014 at 11:51 AM, Andrew Musselman <
> andrew.musselman@gmail.com> wrote:
>
> > I'd like to see cross-recommendations added too.
> >
> > But I also want some automation of the steps required to build a simple
> > recommender like the solr/mahout example Ted and Ellen have in their
> > pamphlet.
> >
> > Lowering the barrier to entry by providing a sample pipeline would help a
> > lot of folks get started and hopefully would keep them interested.
>  Perhaps
> > in examples/bin?
> >
> >
> > On Fri, Feb 14, 2014 at 10:56 AM, Pat Ferrel <pa...@occamsmachete.com>
> > wrote:
> >
> > > There's been work done on the cross-recommender. There is a
> Mahout-style
> > > XRecommenderJob that has two preference models for two actions or
> > > preference types. It uses matrix multiply to get a cooccurrence type
> > > similarity matrix. If we had a cross-row-similarity-job, it could
> pretty
> > > easily be integrated and I'd volunteer to integrate it. The XRSJ is
> > > probably beyond me right now so if we can scare up someone to do that
> > we'd
> > > be a long way down the road.
> > >
> > > I'll put a feature request into Jira and take this to the dev list
> > >
> > > BTW this is already integrated with the solr-recommender.
> > >
> > > On Feb 8, 2014, at 7:19 PM, Ted Dunning <te...@gmail.com> wrote:
> > >
> > > I have different opinions about each piece.
> > >
> > > I think that cross recommendation is as core as RowSimilarityJob and
> > should
> > > be a parallel implementation or integrated.  Parallel is probably
> easier.
> > > It is even plausible to have a version of RowSimilarityJob that doesn't
> > > support all the different distance measures but does support multiple
> > cross
> > > and direct processing using LLR or related cooccurrence based measures.
> >  It
> > > would be very cool if a single pass over the data could do many kinds
> of
> > co
> > > or cross occurrence operations.
> > >
> > > For dithering, it really is post processing.  That said, it is also the
> > > single largest improvement that anybody typically gets when testing
> > > different options so it is a bit goofy to not have good support for
> some
> > > kinds of dithering.
> > >
> > > For Thompson sampled recommenders, I am not sure where to start hacking
> > on
> > > our current code.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <pa...@occamsmachete.com>
> > wrote:
> > >
> > > > That was by no means to criticize effort level, which has been
> > impressive
> > > > especially during the release.
> > > >
> > > > It was more a question about the best place to add these things and
> > > > whether they are important. Whether people see these things as custom
> > > post
> > > > processing or core.
> > > >
> > > > On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com>
> > wrote:
> > > >
> > > > ...
> > > >
> > > > The reason that we aren't adding this like cross-rec and other things
> > is
> > > > that "we" have full-time jobs, mostly.  Suneel is full-time on
> Mahout,
> > > but
> > > > the rest are not.  You seem more active than most.
> > > >
> > > >
> > > >
> > >
> > >
> >
>

Re: Popularity of recommender items

Posted by Ted Dunning <te...@gmail.com>.
Yes!

But it is very hard to find the time.



On Fri, Feb 14, 2014 at 11:51 AM, Andrew Musselman <
andrew.musselman@gmail.com> wrote:

> I'd like to see cross-recommendations added too.
>
> But I also want some automation of the steps required to build a simple
> recommender like the solr/mahout example Ted and Ellen have in their
> pamphlet.
>
> Lowering the barrier to entry by providing a sample pipeline would help a
> lot of folks get started and hopefully would keep them interested.  Perhaps
> in examples/bin?
>
>
> On Fri, Feb 14, 2014 at 10:56 AM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
>
> > There's been work done on the cross-recommender. There is a Mahout-style
> > XRecommenderJob that has two preference models for two actions or
> > preference types. It uses matrix multiply to get a cooccurrence type
> > similarity matrix. If we had a cross-row-similarity-job, it could pretty
> > easily be integrated and I'd volunteer to integrate it. The XRSJ is
> > probably beyond me right now so if we can scare up someone to do that
> we'd
> > be a long way down the road.
> >
> > I'll put a feature request into Jira and take this to the dev list
> >
> > BTW this is already integrated with the solr-recommender.
> >
> > On Feb 8, 2014, at 7:19 PM, Ted Dunning <te...@gmail.com> wrote:
> >
> > I have different opinions about each piece.
> >
> > I think that cross recommendation is as core as RowSimilarityJob and
> should
> > be a parallel implementation or integrated.  Parallel is probably easier.
> > It is even plausible to have a version of RowSimilarityJob that doesn't
> > support all the different distance measures but does support multiple
> cross
> > and direct processing using LLR or related cooccurrence based measures.
>  It
> > would be very cool if a single pass over the data could do many kinds of
> co
> > or cross occurrence operations.
> >
> > For dithering, it really is post processing.  That said, it is also the
> > single largest improvement that anybody typically gets when testing
> > different options so it is a bit goofy to not have good support for some
> > kinds of dithering.
> >
> > For Thompson sampled recommenders, I am not sure where to start hacking
> on
> > our current code.
> >
> >
> >
> >
> >
> >
> > On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
> >
> > > That was by no means to criticize effort level, which has been
> impressive
> > > especially during the release.
> > >
> > > It was more a question about the best place to add these things and
> > > whether they are important. Whether people see these things as custom
> > post
> > > processing or core.
> > >
> > > On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com>
> wrote:
> > >
> > > ...
> > >
> > > The reason that we aren't adding this like cross-rec and other things
> is
> > > that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout,
> > but
> > > the rest are not.  You seem more active than most.
> > >
> > >
> > >
> >
> >
>

Re: Popularity of recommender items

Posted by Andrew Musselman <an...@gmail.com>.
I'd like to see cross-recommendations added too.

But I also want some automation of the steps required to build a simple
recommender like the solr/mahout example Ted and Ellen have in their
pamphlet.

Lowering the barrier to entry by providing a sample pipeline would help a
lot of folks get started and hopefully would keep them interested.  Perhaps
in examples/bin?


On Fri, Feb 14, 2014 at 10:56 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> There's been work done on the cross-recommender. There is a Mahout-style
> XRecommenderJob that has two preference models for two actions or
> preference types. It uses matrix multiply to get a cooccurrence type
> similarity matrix. If we had a cross-row-similarity-job, it could pretty
> easily be integrated and I'd volunteer to integrate it. The XRSJ is
> probably beyond me right now so if we can scare up someone to do that we'd
> be a long way down the road.
>
> I'll put a feature request into Jira and take this to the dev list
>
> BTW this is already integrated with the solr-recommender.
>
> On Feb 8, 2014, at 7:19 PM, Ted Dunning <te...@gmail.com> wrote:
>
> I have different opinions about each piece.
>
> I think that cross recommendation is as core as RowSimilarityJob and should
> be a parallel implementation or integrated.  Parallel is probably easier.
> It is even plausible to have a version of RowSimilarityJob that doesn't
> support all the different distance measures but does support multiple cross
> and direct processing using LLR or related cooccurrence based measures.  It
> would be very cool if a single pass over the data could do many kinds of co
> or cross occurrence operations.
>
> For dithering, it really is post processing.  That said, it is also the
> single largest improvement that anybody typically gets when testing
> different options so it is a bit goofy to not have good support for some
> kinds of dithering.
>
> For Thompson sampled recommenders, I am not sure where to start hacking on
> our current code.
>
>
>
>
>
>
> On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:
>
> > That was by no means to criticize effort level, which has been impressive
> > especially during the release.
> >
> > It was more a question about the best place to add these things and
> > whether they are important. Whether people see these things as custom
> post
> > processing or core.
> >
> > On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com> wrote:
> >
> > ...
> >
> > The reason that we aren't adding this like cross-rec and other things is
> > that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout,
> but
> > the rest are not.  You seem more active than most.
> >
> >
> >
>
>

Re: Popularity of recommender items

Posted by Pat Ferrel <pa...@occamsmachete.com>.
There’s been work done on the cross-recommender. There is a Mahout-style XRecommenderJob that has two preference models for two actions or preference types. It uses matrix multiply to get a cooccurrence type similarity matrix. If we had a cross-row-similarity-job, it could pretty easily be integrated and I’d volunteer to integrate it. The XRSJ is probably beyond me right now so if we can scare up someone to do that we’d be a long way down the road.
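
The matrix multiply step is roughly this, in toy NumPy form (the real job works on DRMs and would filter the result, e.g. with LLR, rather than use raw counts):

import numpy as np

# Two user-item matrices over the same users:
# A = primary action (e.g. purchase), B = secondary action (e.g. view).
A = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]])
B = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]])

# Cross-occurrence: entry (i, j) of A'B counts the users who did the primary
# action on item i and the secondary action on item j.
cross = A.T @ B
print(cross)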

I’ll put a feature request into Jira and take this to the dev list

BTW this is already integrated with the solr-recommender.

On Feb 8, 2014, at 7:19 PM, Ted Dunning <te...@gmail.com> wrote:

I have different opinions about each piece.

I think that cross recommendation is as core as RowSimilarityJob and should
be a parallel implementation or integrated.  Parallel is probably easier.
It is even plausible to have a version of RowSimilarityJob that doesn't
support all the different distance measures but does support multiple cross
and direct processing using LLR or related cooccurrence based measures.  It
would be very cool if a single pass over the data could do many kinds of co
or cross occurrence operations.

For dithering, it really is post processing.  That said, it is also the
single largest improvement that anybody typically gets when testing
different options so it is a bit goofy to not have good support for some
kinds of dithering.

For Thompson sampled recommenders, I am not sure where to start hacking on
our current code.






On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> That was by no means to criticize effort level, which has been impressive
> especially during the release.
> 
> It was more a question about the best place to add these things and
> whether they are important. Whether people see these things as custom post
> processing or core.
> 
> On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com> wrote:
> 
> …
> 
> The reason that we aren't adding this like cross-rec and other things is
> that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
> the rest are not.  You seem more active than most.
> 
> 
> 


Re: Popularity of recommender items

Posted by Ted Dunning <te...@gmail.com>.
I have different opinions about each piece.

I think that cross recommendation is as core as RowSimilarityJob and should
be a parallel implementation or integrated.  Parallel is probably easier.
 It is even plausible to have a version of RowSimilarityJob that doesn't
support all the different distance measures but does support multiple cross
and direct processing using LLR or related cooccurrence based measures.  It
would be very cool if a single pass over the data could do many kinds of co
or cross occurrence operations.

For dithering, it really is post processing.  That said, it is also the
single largest improvement that anybody typically gets when testing
different options so it is a bit goofy to not have good support for some
kinds of dithering.
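
One common recipe is to perturb each item's log-rank with Gaussian noise and re-sort; a toy sketch (the epsilon knob here is just an example):

import math
import random

def dither(ranked_items, epsilon=2.0, rng=random):
    # Score each item by its log-rank plus Gaussian noise, then re-sort.
    # epsilon > 1; larger values pull deeper items toward the top more often.
    sd = math.log(epsilon)
    scored = [(math.log(rank + 1) + rng.gauss(0, sd), item)
              for rank, item in enumerate(ranked_items)]
    return [item for _, item in sorted(scored)]

print(dither(["a", "b", "c", "d", "e", "f"]))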

For Thompson sampled recommenders, I am not sure where to start hacking on
our current code.






On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> That was by no means to criticize effort level, which has been impressive
> especially during the release.
>
> It was more a question about the best place to add these things and
> whether they are important. Whether people see these things as custom post
> processing or core.
>
> On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com> wrote:
>
> …
>
> The reason that we aren't adding this like cross-rec and other things is
> that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
> the rest are not.  You seem more active than most.
>
>
>

Re: Popularity of recommender items

Posted by Pat Ferrel <pa...@occamsmachete.com>.
That was by no means to criticize effort level, which has been impressive especially during the release.

It was more a question about the best place to add these things and whether they are important. Whether people see these things as custom post processing or core.

On Feb 8, 2014, at 12:13 PM, Ted Dunning <te...@gmail.com> wrote:

…

The reason that we aren't adding this like cross-rec and other things is
that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
the rest are not.  You seem more active than most.



Re: Popularity of recommender items

Posted by Suneel Marthi <su...@yahoo.com>.
I am not full-time on Mahout either and have a full-time job which is unrelated to Mahout.
It's just that I have been sacrificing personal time to keep things moving on Mahout.





On Saturday, February 8, 2014 3:13 PM, Ted Dunning <te...@gmail.com> wrote:
 
Thompson sampling doesn't require time other than a sense of what we now
know.  It really is just a correct form for dithering that uses our current
knowledge.

For a worked out version of Thompson sampling with ranking, see this blog:
http://tdunning.blogspot.com/2013/04/learning-to-rank-in-very-bayesian-way.html

The reason that we aren't adding this like cross-rec and other things is
that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
the rest are not.  You seem more active than most.






On Sat, Feb 8, 2014 at 8:50 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Didn’t mean to imply I had historical view data—yet.
>
> The Thompson sampling ‘trick’ looks useful for auto converging to the best
> of A/B versions and a replacement for dithering. Below you are proposing
> another case to replace dithering—this time on a list of popular items?
> Dithering works on anything you can rank but Thompson Sampling usually
> implies a time dimension. The initial guess, first Thompson sample, could
> be thought of as a form of dithering I suppose? Haven’t looked at the math
> but it wouldn’t surprise me to find they are very similar things.
>
> While we are talking about it, why aren’t we adding things like
> cross-recommendations, dithering, popularity, and other generally useful
> techniques into the Mahout recommenders? All the data is there to do these
> things, and they could be packaged in the same Mahout Jobs. They seem to be
> languishing a bit while technology and the art of recommendations moves on.
>
> If we add temporal data to preference data a bunch of new features come to
> mind, like hot lists or asymmetric train/query preference history.
>
> On Feb 6, 2014, at 9:43 PM, Ted Dunning <te...@gmail.com> wrote:
>
> One way to deal with that is to build a model that predicts the ultimate
> number of views/plays/purchases for the item based on history so far.
>
> If this model can be made Bayesian enough to sample from the posterior
> distribution of total popularity, then you can use the Thompson sampling
> trick and sort by sampled total views rather than estimated total views.
>  That will give uncertain items (typically new ones) a chance to be shown
> in the ratings without flooding the list with newcomers.
>
> Sent from my iPhone
>
> > On Feb 7, 2014, at 3:38, Pat Ferrel <pa...@occamsmachete.com> wrote:
> >
> > The particular thing I’m looking at now is how to rank a list of items
> by some measure of popularity when you don’t have a velocity. There is an
> introduction date though so another way to look at popularity might be to
> decay it with something like e^-t where t is it’s age. You can see the
> decay in the views histogram
>
>

Re: Popularity of recommender items

Posted by Ted Dunning <te...@gmail.com>.
Thompson sampling doesn't require time other than a sense of what we now
know.  It really is just a correct form for dithering that uses our current
knowledge.

For a worked out version of Thompson sampling with ranking, see this blog:
http://tdunning.blogspot.com/2013/04/learning-to-rank-in-very-bayesian-way.html
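
In miniature, the idea is to rank by a draw from each item's posterior rather than by its point estimate; a toy Beta-Bernoulli sketch (not the exact model from the blog post):

import random

# Per-item evidence so far: (clicks, impressions).
items = {
    "new-item": (0, 0),          # no data yet: very uncertain, gets explored
    "steady-item": (40, 400),
    "popular-item": (120, 1000),
}

def thompson_rank(items, rng=random):
    # Rank by a sample from each item's Beta(1 + clicks, 1 + misses) posterior.
    sampled = {
        name: rng.betavariate(1 + clicks, 1 + (shown - clicks))
        for name, (clicks, shown) in items.items()
    }
    return sorted(sampled, key=sampled.get, reverse=True)

print(thompson_rank(items))   # the order varies run to run, which is the point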

The reason that we aren't adding this like cross-rec and other things is
that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
the rest are not.  You seem more active than most.





On Sat, Feb 8, 2014 at 8:50 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Didn’t mean to imply I had historical view data—yet.
>
> The Thompson sampling ‘trick’ looks useful for auto converging to the best
> of A/B versions and a replacement for dithering. Below you are proposing
> another case to replace dithering—this time on a list of popular items?
> Dithering works on anything you can rank but Thompson Sampling usually
> implies a time dimension. The initial guess, first Thompson sample, could
> be thought of as a form of dithering I suppose? Haven’t looked at the math
> but it wouldn’t surprise me to find they are very similar things.
>
> While we are talking about it, why aren’t we adding things like
> cross-recommendations, dithering, popularity, and other generally useful
> techniques into the Mahout recommenders? All the data is there to do these
> things, and they could be packaged in the same Mahout Jobs. They seem to be
> languishing a bit while technology and the art of recommendations moves on.
>
> If we add temporal data to preference data a bunch of new features come to
> mind, like hot lists or asymmetric train/query preference history.
>
> On Feb 6, 2014, at 9:43 PM, Ted Dunning <te...@gmail.com> wrote:
>
> One way to deal with that is to build a model that predicts the ultimate
> number of views/plays/purchases for the item based on history so far.
>
> If this model can be made Bayesian enough to sample from the posterior
> distribution of total popularity, then you can use the Thompson sampling
> trick and sort by sampled total views rather than estimated total views.
>  That will give uncertain items (typically new ones) a chance to be shown
> in the ratings without flooding the list with newcomers.
>
> Sent from my iPhone
>
> > On Feb 7, 2014, at 3:38, Pat Ferrel <pa...@occamsmachete.com> wrote:
> >
> > The particular thing I’m looking at now is how to rank a list of items
> by some measure of popularity when you don’t have a velocity. There is an
> introduction date though so another way to look at popularity might be to
> decay it with something like e^-t where t is it’s age. You can see the
> decay in the views histogram
>
>

Re: Popularity of recommender items

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Didn’t mean to imply I had historical view data—yet.

The Thompson sampling ‘trick’ looks useful for auto-converging to the best of A/B versions and as a replacement for dithering. Below you are proposing another case to replace dithering—this time on a list of popular items? Dithering works on anything you can rank, but Thompson sampling usually implies a time dimension. The initial guess, the first Thompson sample, could be thought of as a form of dithering, I suppose? I haven’t looked at the math, but it wouldn’t surprise me to find they are very similar things.

While we are talking about it, why aren’t we adding things like cross-recommendations, dithering, popularity, and other generally useful techniques into the Mahout recommenders? All the data is there to do these things, and they could be packaged in the same Mahout Jobs. They seem to be languishing a bit while technology and the art of recommendations move on.

If we add temporal data to preference data a bunch of new features come to mind, like hot lists or asymmetric train/query preference history.

On Feb 6, 2014, at 9:43 PM, Ted Dunning <te...@gmail.com> wrote:

One way to deal with that is to build a model that predicts the ultimate number of views/plays/purchases for the item based on history so far.  

If this model can be made Bayesian enough to sample from the posterior distribution of total popularity, then you can use the Thompson sampling trick and sort by sampled total views rather than estimated total views.  That will give uncertain items (typically new ones) a chance to be shown in the ratings without flooding the list with newcomers.  

Sent from my iPhone

> On Feb 7, 2014, at 3:38, Pat Ferrel <pa...@occamsmachete.com> wrote:
> 
> The particular thing I’m looking at now is how to rank a list of items by some measure of popularity when you don’t have a velocity. There is an introduction date though so another way to look at popularity might be to decay it with something like e^-t where t is it’s age. You can see the decay in the views histogram


Re: Popularity of recommender items

Posted by Ted Dunning <te...@gmail.com>.
One way to deal with that is to build a model that predicts the ultimate number of views/plays/purchases for the item based on history so far.  

If this model can be made Bayesian enough to sample from the posterior distribution of total popularity, then you can use the Thompson sampling trick and sort by sampled total views rather than estimated total views.  That will give uncertain items (typically new ones) a chance to be shown in the ratings without flooding the list with newcomers.  
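
As a toy illustration (a Gamma-Poisson sketch with made-up priors, not a prescription):

import random

# Toy evidence per item: (views so far, days the item has been live).
items = {
    "brand-new": (3, 1),
    "week-old": (50, 7),
    "veteran": (400, 60),
}

PRIOR_SHAPE, PRIOR_RATE = 1.0, 1.0
HORIZON_DAYS = 90   # rank by sampled total views over some fixed horizon

def sample_total_views(views, days, rng=random):
    # Gamma posterior over a per-day view rate, then scale to the horizon.
    rate = rng.gammavariate(PRIOR_SHAPE + views, 1.0 / (PRIOR_RATE + days))
    return rate * HORIZON_DAYS

ranking = sorted(items, key=lambda k: sample_total_views(*items[k]), reverse=True)
print(ranking)   # uncertain new items occasionally sample high and get shown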

Sent from my iPhone

> On Feb 7, 2014, at 3:38, Pat Ferrel <pa...@occamsmachete.com> wrote:
> 
> The particular thing I’m looking at now is how to rank a list of items by some measure of popularity when you don’t have a velocity. There is an introduction date though so another way to look at popularity might be to decay it with something like e^-t where t is it’s age. You can see the decay in the views histogram

Re: Popularity of recommender items

Posted by Pat Ferrel <pa...@occamsmachete.com>.
A velocity measure of sorts makes a lot of sense for a “what’s hot” list.

The particular thing I’m looking at now is how to rank a list of items by some measure of popularity when you don’t have a velocity. There is an introduction date, though, so another way to look at popularity might be to decay it with something like e^-t, where t is its age. You can see the decay in the views histogram.
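
For example, something like this, where the half-life is just a knob I’d have to pick:

import math

def decayed_popularity(view_count, age_days, half_life_days=30.0):
    # Down-weight the raw count by the item's age: count * e^(-lambda * t).
    lam = math.log(2) / half_life_days
    return view_count * math.exp(-lam * age_days)

print(decayed_popularity(1000, 90), decayed_popularity(200, 5))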

On Feb 6, 2014, at 4:35 PM, Ted Dunning <te...@gmail.com> wrote:

Rising popularity is often a better match to what people want to see on a
"most popular" page.

The best measure for that in my experience is log((new_count + offset) /
(old_count + offset)), where the new and old counts are the number of views
during the periods in question and offset is used partly to avoid log(0) or
x/0 problems, but also to give a Bayesian grounding to the measure.




On Thu, Feb 6, 2014 at 5:33 PM, Sean Owen <sr...@gmail.com> wrote:

> Agree - I thought by asking for most popular you meant to look for apple
> pie.
> 
> Agree with you and Ted that the sum of similarity says something
> interesting even if it is not popularity exactly.
> On Feb 6, 2014 11:16 AM, "Pat Ferrel" <pa...@occamsmachete.com> wrote:
> 
>> The problem with the usual preference count is that big hit items can be
>> overwhelmingly popular. If you want to know which ones the most people
> saw
>> and are likely to have an opinion about then this seems a good measure.
> But
>> these hugely popular items may not differentiate taste.
>> 
>> So we calculate the “important” taste indicators with LLR. The benefit of
>> the similarity matrix is that it attempts to model the “important”
>> cooccurrences.
>> 
>> There is an affect of hugely popular items where they really say nothing
>> about similarity of taste. Everyone likes motherhood and Apple pie so it
>> doesn’t say much about us if we both do to. This is usually accounted for
>> with something like TFIDF so I suppose another weighted popularity
> measure
>> would be to run the preference matrix through TFIDF to de-weight
>> non-differentiating preferences.
>> 
>> On Feb 6, 2014, at 7:14 AM, Ted Dunning <te...@gmail.com> wrote:
>> 
>> If you look at the indicator matrix (cooccurrence reduced by LLR), you
> will
>> usually have asymmetry due to limitations on the number of indicators per
>> row.
>> 
>> This will give you some interesting results when you look at the column
>> sums.  I wouldn't call it popularity, but it is an interesting measure.
>> 
>> 
>> 
>> On Thu, Feb 6, 2014 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
>> 
>>> I have always defined popularity as just the number of ratings/prefs,
>>> yes. You could rank on some kind of 'net promoter score' -- good
>>> ratings minus bad ratings -- though that becomes more like 'most
>>> liked'.
>>> 
>>> How do you get popularity from similarity -- similarity to what?
>>> Ranking by sum of similarities seems more like a measure of how much
>>> the item is the 'centroid' of all items. Not necessarily most popular
>>> but 'least eccentric'.
>>> 
>>> 
>>> On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <
> tevfik.aytekin@gmail.com
>>> 
>>> wrote:
>>>> Well, I think what you are suggesting is to define popularity as being
>>>> similar to other items. So in this way most popular items will be
>>>> those which are most similar to all other items, like the centroids in
>>>> K-means.
>>>> 
>>>> I would first check the correlation between this definition and the
>>>> standard one (that is, the definition of popularity as having the
>>>> highest number of ratings). But my intuition is that they are
>>>> different things. For example. an item might lie at the center in the
>>>> similarity space but it might not be a popular item. However, there
>>>> might still be some correlation, it would be interesting to check it.
>>>> 
>>>> hope it helps
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <pa...@occamsmachete.com>
>>> wrote:
>>>>> Trying to come up with a relative measure of popularity for items in
> a
>>> recommender. Something that could be used to rank items.
>>>>> 
>>>>> The user - item preference matrix would be the obvious thought. Just
>>> add the number of preferences per item. Maybe transpose the preference
>>> matrix (the temp DRM created by the recommender), then for each row
>> vector
>>> (now that a row = item) grab the number of non zero preferences. This
>>> corresponds to the number of preferences, and would give one measure of
>>> popularity. In the case where the items are not boolean you'd sum the
>>> weights.
>>>>> 
>>>>> However it might be a better idea to look at the item-item similarity
>>> matrix. It doesn't need to be transposed and contains the "important"
>>> similarities--as calculated by LLR for example. Here similarity means
>>> similarity in which users preferred an item. So summing the non-zero
>>> weights would give perhaps an even better relative "popularity"
> measure.
>>> For the same reason clustering the similarity matrix would yield
>>> "important" clusters.
>>>>> 
>>>>> Anyone have intuition about this?
>>>>> 
>>>>> I started to think about this because transposing the user-item
> matrix
>>> seems to yield a format that cannot be sent directly into clustering.
>>> 
>> 
>> 
> 


Re: Popularity of recommender items

Posted by Ted Dunning <te...@gmail.com>.
Rising popularity is often a better match to what people want to see on a
"most popular" page.

The best measure for that in my experience is log((new_count + offset) /
(old_count + offset)), where the new and old counts are the number of views
during the periods in question and offset is used partly to avoid log(0) or
x/0 problems, but also to give a Bayesian grounding to the measure.
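
As a one-liner (the offset value below is just an example):

import math

def rising_popularity(new_count, old_count, offset=10.0):
    # log((new + offset) / (old + offset)): > 0 means rising, < 0 means fading.
    return math.log((new_count + offset) / (old_count + offset))

# e.g. views this week vs. views the previous week
print(rising_popularity(new_count=500, old_count=100))   # strongly rising
print(rising_popularity(new_count=5, old_count=0))       # sparse data, tempered by the offset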




On Thu, Feb 6, 2014 at 5:33 PM, Sean Owen <sr...@gmail.com> wrote:

> Agree - I thought by asking for most popular you meant to look for apple
> pie.
>
> Agree with you and Ted that the sum of similarity says something
> interesting even if it is not popularity exactly.
> On Feb 6, 2014 11:16 AM, "Pat Ferrel" <pa...@occamsmachete.com> wrote:
>
> > The problem with the usual preference count is that big hit items can be
> > overwhelmingly popular. If you want to know which ones the most people
> saw
> > and are likely to have an opinion about then this seems a good measure.
> But
> > these hugely popular items may not differentiate taste.
> >
> > So we calculate the “important” taste indicators with LLR. The benefit of
> > the similarity matrix is that it attempts to model the “important”
> > cooccurrences.
> >
> > There is an affect of hugely popular items where they really say nothing
> > about similarity of taste. Everyone likes motherhood and Apple pie so it
> > doesn’t say much about us if we both do to. This is usually accounted for
> > with something like TFIDF so I suppose another weighted popularity
> measure
> > would be to run the preference matrix through TFIDF to de-weight
> > non-differentiating preferences.
> >
> > On Feb 6, 2014, at 7:14 AM, Ted Dunning <te...@gmail.com> wrote:
> >
> > If you look at the indicator matrix (cooccurrence reduced by LLR), you
> will
> > usually have asymmetry due to limitations on the number of indicators per
> > row.
> >
> > This will give you some interesting results when you look at the column
> > sums.  I wouldn't call it popularity, but it is an interesting measure.
> >
> >
> >
> > On Thu, Feb 6, 2014 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
> >
> > > I have always defined popularity as just the number of ratings/prefs,
> > > yes. You could rank on some kind of 'net promoter score' -- good
> > > ratings minus bad ratings -- though that becomes more like 'most
> > > liked'.
> > >
> > > How do you get popularity from similarity -- similarity to what?
> > > Ranking by sum of similarities seems more like a measure of how much
> > > the item is the 'centroid' of all items. Not necessarily most popular
> > > but 'least eccentric'.
> > >
> > >
> > > On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <
> tevfik.aytekin@gmail.com
> > >
> > > wrote:
> > >> Well, I think what you are suggesting is to define popularity as being
> > >> similar to other items. So in this way most popular items will be
> > >> those which are most similar to all other items, like the centroids in
> > >> K-means.
> > >>
> > >> I would first check the correlation between this definition and the
> > >> standard one (that is, the definition of popularity as having the
> > >> highest number of ratings). But my intuition is that they are
> > >> different things. For example. an item might lie at the center in the
> > >> similarity space but it might not be a popular item. However, there
> > >> might still be some correlation, it would be interesting to check it.
> > >>
> > >> hope it helps
> > >>
> > >>
> > >>
> > >>
> > >> On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <pa...@occamsmachete.com>
> > > wrote:
> > >>> Trying to come up with a relative measure of popularity for items in
> a
> > > recommender. Something that could be used to rank items.
> > >>>
> > >>> The user - item preference matrix would be the obvious thought. Just
> > > add the number of preferences per item. Maybe transpose the preference
> > > matrix (the temp DRM created by the recommender), then for each row
> > vector
> > > (now that a row = item) grab the number of non zero preferences. This
> > > corresponds to the number of preferences, and would give one measure of
> > > popularity. In the case where the items are not boolean you'd sum the
> > > weights.
> > >>>
> > >>> However it might be a better idea to look at the item-item similarity
> > > matrix. It doesn't need to be transposed and contains the "important"
> > > similarities--as calculated by LLR for example. Here similarity means
> > > similarity in which users preferred an item. So summing the non-zero
> > > weights would give perhaps an even better relative "popularity"
> measure.
> > > For the same reason clustering the similarity matrix would yield
> > > "important" clusters.
> > >>>
> > >>> Anyone have intuition about this?
> > >>>
> > >>> I started to think about this because transposing the user-item
> matrix
> > > seems to yield a format that cannot be sent directly into clustering.
> > >
> >
> >
>

Re: Popularity of recommender items

Posted by Sean Owen <sr...@gmail.com>.
Agree - I thought by asking for most popular you meant to look for apple
pie.

Agree with you and Ted that the sum of similarity says something
interesting even if it is not popularity exactly.
On Feb 6, 2014 11:16 AM, "Pat Ferrel" <pa...@occamsmachete.com> wrote:

> The problem with the usual preference count is that big hit items can be
> overwhelmingly popular. If you want to know which ones the most people saw
> and are likely to have an opinion about then this seems a good measure. But
> these hugely popular items may not differentiate taste.
>
> So we calculate the “important” taste indicators with LLR. The benefit of
> the similarity matrix is that it attempts to model the “important”
> cooccurrences.
>
> There is an affect of hugely popular items where they really say nothing
> about similarity of taste. Everyone likes motherhood and Apple pie so it
> doesn’t say much about us if we both do to. This is usually accounted for
> with something like TFIDF so I suppose another weighted popularity measure
> would be to run the preference matrix through TFIDF to de-weight
> non-differentiating preferences.
>
> On Feb 6, 2014, at 7:14 AM, Ted Dunning <te...@gmail.com> wrote:
>
> If you look at the indicator matrix (cooccurrence reduced by LLR), you will
> usually have asymmetry due to limitations on the number of indicators per
> row.
>
> This will give you some interesting results when you look at the column
> sums.  I wouldn't call it popularity, but it is an interesting measure.
>
>
>
> On Thu, Feb 6, 2014 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
>
> > I have always defined popularity as just the number of ratings/prefs,
> > yes. You could rank on some kind of 'net promoter score' -- good
> > ratings minus bad ratings -- though that becomes more like 'most
> > liked'.
> >
> > How do you get popularity from similarity -- similarity to what?
> > Ranking by sum of similarities seems more like a measure of how much
> > the item is the 'centroid' of all items. Not necessarily most popular
> > but 'least eccentric'.
> >
> >
> > On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <tevfik.aytekin@gmail.com
> >
> > wrote:
> >> Well, I think what you are suggesting is to define popularity as being
> >> similar to other items. So in this way most popular items will be
> >> those which are most similar to all other items, like the centroids in
> >> K-means.
> >>
> >> I would first check the correlation between this definition and the
> >> standard one (that is, the definition of popularity as having the
> >> highest number of ratings). But my intuition is that they are
> >> different things. For example. an item might lie at the center in the
> >> similarity space but it might not be a popular item. However, there
> >> might still be some correlation, it would be interesting to check it.
> >>
> >> hope it helps
> >>
> >>
> >>
> >>
> >> On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <pa...@occamsmachete.com>
> > wrote:
> >>> Trying to come up with a relative measure of popularity for items in a
> > recommender. Something that could be used to rank items.
> >>>
> >>> The user - item preference matrix would be the obvious thought. Just
> > add the number of preferences per item. Maybe transpose the preference
> > matrix (the temp DRM created by the recommender), then for each row
> vector
> > (now that a row = item) grab the number of non zero preferences. This
> > corresponds to the number of preferences, and would give one measure of
> > popularity. In the case where the items are not boolean you'd sum the
> > weights.
> >>>
> >>> However it might be a better idea to look at the item-item similarity
> > matrix. It doesn't need to be transposed and contains the "important"
> > similarities--as calculated by LLR for example. Here similarity means
> > similarity in which users preferred an item. So summing the non-zero
> > weights would give perhaps an even better relative "popularity" measure.
> > For the same reason clustering the similarity matrix would yield
> > "important" clusters.
> >>>
> >>> Anyone have intuition about this?
> >>>
> >>> I started to think about this because transposing the user-item matrix
> > seems to yield a format that cannot be sent directly into clustering.
> >
>
>

Re: Popularity of recommender items

Posted by Pat Ferrel <pa...@occamsmachete.com>.
The problem with the usual preference count is that big hit items can be overwhelmingly popular. If you want to know which ones the most people saw and are likely to have an opinion about then this seems a good measure. But these hugely popular items may not differentiate taste.

So we calculate the “important” taste indicators with LLR. The benefit of the similarity matrix is that it attempts to model the “important” cooccurrences. 

There is an effect where hugely popular items really say nothing about similarity of taste. Everyone likes motherhood and apple pie, so it doesn’t say much about us if we both do too. This is usually accounted for with something like TFIDF, so I suppose another weighted popularity measure would be to run the preference matrix through TFIDF to de-weight non-differentiating preferences.
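
In toy NumPy form the TFIDF idea on the preference matrix would be something like this (not a Mahout job, just the weighting):

import numpy as np

# Toy user-item preference matrix (rows = users, columns = items).
prefs = np.array([
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
])

n_users = prefs.shape[0]
item_df = (prefs != 0).sum(axis=0)   # how many users touched each item
idf = np.log(n_users / item_df)      # "apple pie" items (everyone has them) get ~0

weighted = prefs * idf               # TF (the preference weight) times IDF
popularity = weighted.sum(axis=0)    # de-weighted popularity per item
print(idf, popularity)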

On Feb 6, 2014, at 7:14 AM, Ted Dunning <te...@gmail.com> wrote:

If you look at the indicator matrix (cooccurrence reduced by LLR), you will
usually have asymmetry due to limitations on the number of indicators per
row.

This will give you some interesting results when you look at the column
sums.  I wouldn't call it popularity, but it is an interesting measure.



On Thu, Feb 6, 2014 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:

> I have always defined popularity as just the number of ratings/prefs,
> yes. You could rank on some kind of 'net promoter score' -- good
> ratings minus bad ratings -- though that becomes more like 'most
> liked'.
> 
> How do you get popularity from similarity -- similarity to what?
> Ranking by sum of similarities seems more like a measure of how much
> the item is the 'centroid' of all items. Not necessarily most popular
> but 'least eccentric'.
> 
> 
> On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <te...@gmail.com>
> wrote:
>> Well, I think what you are suggesting is to define popularity as being
>> similar to other items. So in this way most popular items will be
>> those which are most similar to all other items, like the centroids in
>> K-means.
>> 
>> I would first check the correlation between this definition and the
>> standard one (that is, the definition of popularity as having the
>> highest number of ratings). But my intuition is that they are
> > different things. For example, an item might lie at the center in the
>> similarity space but it might not be a popular item. However, there
>> might still be some correlation, it would be interesting to check it.
>> 
>> hope it helps
>> 
>> 
>> 
>> 
>> On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
>>> Trying to come up with a relative measure of popularity for items in a
> recommender. Something that could be used to rank items.
>>> 
>>> The user - item preference matrix would be the obvious thought. Just
> add the number of preferences per item. Maybe transpose the preference
> matrix (the temp DRM created by the recommender), then for each row vector
> (now that a row = item) grab the number of non zero preferences. This
> corresponds to the number of preferences, and would give one measure of
> popularity. In the case where the items are not boolean you'd sum the
> weights.
>>> 
>>> However it might be a better idea to look at the item-item similarity
> matrix. It doesn't need to be transposed and contains the "important"
> similarities--as calculated by LLR for example. Here similarity means
> similarity in which users preferred an item. So summing the non-zero
> weights would give perhaps an even better relative "popularity" measure.
> For the same reason clustering the similarity matrix would yield
> "important" clusters.
>>> 
>>> Anyone have intuition about this?
>>> 
>>> I started to think about this because transposing the user-item matrix
> seems to yield a format that cannot be sent directly into clustering.
> 


Re: Popularity of recommender items

Posted by Ted Dunning <te...@gmail.com>.
If you look at the indicator matrix (cooccurrence reduced by LLR), you will
usually have asymmetry due to limitations on the number of indicators per
row.

This will give you some interesting results when you look at the column
sums.  I wouldn't call it popularity, but it is an interesting measure.
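
A toy sketch of what that looks like (invented numbers, plain numpy rather than Mahout code):

import numpy as np

# Toy indicator matrix: entry [i, j] != 0 means item j survived the LLR cut
# as an indicator for item i. Each row is capped (e.g. top-k indicators),
# so the matrix is generally asymmetric.
indicators = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
])

# Column sums: how many items' indicator lists each item shows up in.
# Not quite popularity, more a measure of how "central" the item is.
column_sums = indicators.sum(axis=0)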



On Thu, Feb 6, 2014 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:

> I have always defined popularity as just the number of ratings/prefs,
> yes. You could rank on some kind of 'net promoter score' -- good
> ratings minus bad ratings -- though that becomes more like 'most
> liked'.
>
> How do you get popularity from similarity -- similarity to what?
> Ranking by sum of similarities seems more like a measure of how much
> the item is the 'centroid' of all items. Not necessarily most popular
> but 'least eccentric'.
>
>
> On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <te...@gmail.com>
> wrote:
> > Well, I think what you are suggesting is to define popularity as being
> > similar to other items. So in this way most popular items will be
> > those which are most similar to all other items, like the centroids in
> > K-means.
> >
> > I would first check the correlation between this definition and the
> > standard one (that is, the definition of popularity as having the
> > highest number of ratings). But my intuition is that they are
> > different things. For example, an item might lie at the center in the
> > similarity space but it might not be a popular item. However, there
> > might still be some correlation, it would be interesting to check it.
> >
> > hope it helps
> >
> >
> >
> >
> > On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
> >> Trying to come up with a relative measure of popularity for items in a
> recommender. Something that could be used to rank items.
> >>
> >> The user - item preference matrix would be the obvious thought. Just
> add the number of preferences per item. Maybe transpose the preference
> matrix (the temp DRM created by the recommender), then for each row vector
> (now that a row = item) grab the number of non zero preferences. This
> corresponds to the number of preferences, and would give one measure of
> popularity. In the case where the items are not boolean you'd sum the
> weights.
> >>
> >> However it might be a better idea to look at the item-item similarity
> matrix. It doesn't need to be transposed and contains the "important"
> similarities--as calculated by LLR for example. Here similarity means
> similarity in which users preferred an item. So summing the non-zero
> weights would give perhaps an even better relative "popularity" measure.
> For the same reason clustering the similarity matrix would yield
> "important" clusters.
> >>
> >> Anyone have intuition about this?
> >>
> >> I started to think about this because transposing the user-item matrix
> seems to yield a format that cannot be sent directly into clustering.
>

Re: Popularity of recommender items

Posted by Sean Owen <sr...@gmail.com>.
I have always defined popularity as just the number of ratings/prefs,
yes. You could rank on some kind of 'net promoter score' -- good
ratings minus bad ratings -- though that becomes more like 'most
liked'.
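
As a sketch with explicit 1-5 ratings and arbitrary good/bad thresholds (nothing here is Mahout code, just a toy illustration):

import numpy as np

# ratings: users x items, 0 = no rating, otherwise a 1-5 star rating (toy data)
ratings = np.array([
    [5, 0, 2, 0],
    [4, 1, 0, 3],
    [0, 2, 5, 4],
])

num_prefs = (ratings > 0).sum(axis=0)               # plain popularity: number of ratings
good = (ratings >= 4).sum(axis=0)                   # "promoters"
bad = ((ratings > 0) & (ratings <= 2)).sum(axis=0)  # "detractors"
net_score = good - bad                              # closer to "most liked" than "most popular"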

How do you get popularity from similarity -- similarity to what?
Ranking by sum of similarities seems more like a measure of how much
the item is the 'centroid' of all items. Not necessarily most popular
but 'least eccentric'.
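
And a sketch of that 'least eccentric' ranking on a toy item-item similarity matrix (weights invented):

import numpy as np

# Toy item-item similarity matrix (e.g. LLR-thresholded weights), diagonal zeroed.
sim = np.array([
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.4, 0.2],
    [0.1, 0.4, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0],
])

# High row sums mark items near the "centroid" of the similarity space:
# "least eccentric", not necessarily most popular.
centrality = sim.sum(axis=1)
least_eccentric_first = np.argsort(-centrality)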


On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <te...@gmail.com> wrote:
> Well, I think what you are suggesting is to define popularity as being
> similar to other items. So in this way most popular items will be
> those which are most similar to all other items, like the centroids in
> K-means.
>
> I would first check the correlation between this definition and the
> standard one (that is, the definition of popularity as having the
> highest number of ratings). But my intuition is that they are
> different things. For example, an item might lie at the center in the
> similarity space but it might not be a popular item. However, there
> might still be some correlation, it would be interesting to check it.
>
> hope it helps
>
>
>
>
> On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:
>> Trying to come up with a relative measure of popularity for items in a recommender. Something that could be used to rank items.
>>
>> The user - item preference matrix would be the obvious thought. Just add the number of preferences per item. Maybe transpose the preference matrix (the temp DRM created by the recommender), then for each row vector (now that a row = item) grab the number of non zero preferences. This corresponds to the number of preferences, and would give one measure of popularity. In the case where the items are not boolean you'd sum the weights.
>>
>> However it might be a better idea to look at the item-item similarity matrix. It doesn't need to be transposed and contains the "important" similarities--as calculated by LLR for example. Here similarity means similarity in which users preferred an item. So summing the non-zero weights would give perhaps an even better relative "popularity" measure. For the same reason clustering the similarity matrix would yield "important" clusters.
>>
>> Anyone have intuition about this?
>>
>> I started to think about this because transposing the user-item matrix seems to yield a format that cannot be sent directly into clustering.

Re: Popularity of recommender items

Posted by Tevfik Aytekin <te...@gmail.com>.
Well, I think what you are suggesting is to define popularity as being
similar to other items. So in this way most popular items will be
those which are most similar to all other items, like the centroids in
K-means.

I would first check the correlation between this definition and the
standard one (that is, the definition of popularity as having the
highest number of ratings). But my intuition is that they are
different things. For example, an item might lie at the center in the
similarity space but it might not be a popular item. However, there
might still be some correlation, it would be interesting to check it.
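
One way to run that check, sketched with made-up numbers and a Spearman rank correlation from scipy:

import numpy as np
from scipy.stats import spearmanr

# Two candidate popularity scores for the same five items (invented numbers):
rating_counts   = np.array([120, 45, 200, 10, 80])     # standard definition
similarity_sums = np.array([3.1, 2.4, 2.9, 0.7, 2.0])  # sum-of-similarities definition

# Rank correlation says how much the two definitions agree on the ordering.
rho, p_value = spearmanr(rating_counts, similarity_sums)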

hope it helps




On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:
> Trying to come up with a relative measure of popularity for items in a recommender. Something that could be used to rank items.
>
> The user - item preference matrix would be the obvious thought. Just add the number of preferences per item. Maybe transpose the preference matrix (the temp DRM created by the recommender), then for each row vector (now that a row = item) grab the number of non zero preferences. This corresponds to the number of preferences, and would give one measure of popularity. In the case where the items are not boolean you'd sum the weights.
>
> However it might be a better idea to look at the item-item similarity matrix. It doesn't need to be transposed and contains the "important" similarities--as calculated by LLR for example. Here similarity means similarity in which users preferred an item. So summing the non-zero weights would give perhaps an even better relative "popularity" measure. For the same reason clustering the similarity matrix would yield "important" clusters.
>
> Anyone have intuition about this?
>
> I started to think about this because transposing the user-item matrix seems to yield a format that cannot be sent directly into clustering.