Posted to user@mahout.apache.org by B Lyon <br...@gmail.com> on 2013/08/17 05:21:46 UTC

Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

As part of trying to get a better grip on recommenders, I have started a
simple interactive visualization that begins with the raw data of user-item
interactions and goes all the way to letting you twiddle the interactions
in a test user vector to see the impact on the recommended items.  This is
for the simple "user interacted with an item" case rather than numerical
preferences for items.  The goal is to show the intermediate pieces and how
they fit together, via popup text on mouseovers and dynamic highlighting of
the related pieces.  I am of course interested in feedback as I keep
tweaking it - I'm not sure I got all the terminology quite right yet, for
example, and I might have missed some other things I need to know about.
Note that this material is covered in Section 6.2 of MIA in the discussion
of distributed recommenders.
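
For concreteness, here is a minimal numpy sketch of the same pipeline,
using a tiny made-up binary user-item matrix rather than the demo's actual
data: the item-item co-occurrence matrix is A^T*A, and recommendation
scores come from multiplying it by the test user's interaction vector,
with already-seen items dropped.

import numpy as np

# Toy binary user-item interaction matrix A (rows = users, columns = items).
# Hypothetical data, just to mirror the steps shown in the visualization.
A = np.array([
    [1, 1, 0, 0],   # user 0 interacted with items 0 and 1
    [1, 0, 1, 0],   # user 1
    [0, 1, 1, 1],   # user 2
    [1, 1, 1, 0],   # user 3
])

# Item-item co-occurrence matrix: entry (i, j) counts the users who
# interacted with both item i and item j.
C = A.T @ A

# A test user vector to "twiddle": which items this user has interacted with.
h = np.array([1, 0, 1, 0])

# Raw recommendation scores: sum the co-occurrence columns of the user's items.
scores = C @ h

# Don't re-recommend items the user already has.
scores = np.where(h == 1, -np.inf, scores)

print("co-occurrence matrix:\n", C)
print("items in recommendation order:", np.argsort(-scores))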

It's on Google Drive here (very much a work in progress):

https://googledrive.com/host/0B2GQktu-wcTiWHRwZFJacjlqODA/

(apologies to those on small-resolution screens)

This is based only on the co-occurrence matrix, rather than including the
other similarity measures, although in working through this, it seems that
the other ones can just be interpreted as alternative definitions of what
"*" means in the matrix multiplication A^T*A, where A is the user-item
matrix.  As an aside, that raises the interesting question of [purely
hypothetical?] situations where LLR and co-occurrence are at odds with
each other in making recommendations, since co-occurrence seems to use
only the "k11" term that is part of the LLR calculation.

My goal (at the moment at least) is to eventually continue this for the
solr-recommender project that started a few weeks ago, where we have the
additional cross-matrix, as well as a kind of regrouping of the pieces for Solr.


-- 
BF Lyon
http://www.nowherenearithaca.com

Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by Rafal Lukawiecki <ra...@projectbotticelli.com>.
Dear Lyon,

This is a lovely visualisation.

Rafal



Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by Anuj Kumar <an...@gmail.com>.
Hi Lyon,

This is cool, specifically the lineage part where you can see how the
values were calculated.
Thanks for sharing.

- Anuj



Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by Ted Dunning <te...@gmail.com>.
Yes.

Correlation is a problem because tables like

1 0
0 10^6

and

10 0
0 10^6

produce the same correlation.  LLR correctly distinguishes these cases.
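
A quick sketch of that point, assuming "correlation" here means the phi
coefficient (the Pearson correlation of two 0/1 variables); the helper
functions and printed values are mine, only the two tables come from above.

import numpy as np

def phi(k11, k12, k21, k22):
    """Phi coefficient (Pearson correlation) of a 2x2 table."""
    num = k11 * k22 - k12 * k21
    den = np.sqrt((k11 + k12) * (k21 + k22) * (k11 + k21) * (k12 + k22))
    return num / den

def llr(k11, k12, k21, k22):
    """G^2 log-likelihood ratio of a 2x2 table."""
    def h(*xs):  # un-normalized entropy helper: sum*ln(sum) - sum of x*ln(x)
        xs = np.array(xs, dtype=float)
        total, nz = xs.sum(), xs[xs > 0]
        return total * np.log(total) - np.sum(nz * np.log(nz))
    row = h(k11 + k12, k21 + k22)
    col = h(k11 + k21, k12 + k22)
    mat = h(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))

for table in [(1, 0, 0, 10**6), (10, 0, 0, 10**6)]:
    print(table, "phi =", round(phi(*table), 3), "llr =", round(llr(*table), 1))
# phi is 1.0 for both tables, while llr differs by nearly a factor of ten
# (roughly 29.6 versus 250.3), so LLR separates the two cases.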



On Mon, Aug 19, 2013 at 7:16 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Which is why LLR would be really nice in the two-action cross-similarity case.
> The cross-correlation sparsification via co-occurrence is probably pretty
> weak, no?

Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Which is why LLR would be really nice in the two-action cross-similarity case. The cross-correlation sparsification via co-occurrence is probably pretty weak, no?




Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by Ted Dunning <te...@gmail.com>.
Outside of the context of your demo, suppose that you have events a, b, c
and d.  Event a is the one we are centered on and is relatively rare.
Event b is not so rare, but has weak correlation with a.  Event c is as
rare as a, but correlates strongly with it.  Event d is quite common, but
has no correlation with a.

The 2x2 matrices that you would get would look something like this.  In
each of these, a and NOT a are in rows while other and NOT other are in
columns.

versus b, llrRoot = 8.03

            b      NOT b
  a         10     10
  NOT a     1000   99000

versus c, llrRoot = 11.5

            c      NOT c
  a         10     10
  NOT a     30     99970

versus d, llrRoot = 0

            d      NOT d
  a         10     10
  NOT a     50000  50000

Note that what we are holding constant here is the prevalence of a (20
occurrences) and the distribution of a under the conditions of the other
symbol.  What is being varied is the distribution of the other symbol in
the "NOT a" case.





Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Very cool. 

There is an Excel spreadsheet in the solr-recommender project at src/test/resources/Recommender\ Math.xlsx that shows an example of the cross-recommender with data that trivially simulates purchases (B) and views (A), where everything purchased has been viewed but not the other way around.

The data shows cases where no recommendation could be made from [B'B] but can be made from [B'A], illustrating why the cross-recommender might be of use. It's not as flashy, but you can change the values and get the results recalculated.
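
Not the spreadsheet itself, but a minimal numpy sketch of the same idea
with made-up matrices (the data and the exact scoring step are
illustrative assumptions, not the spreadsheet's contents): B records
purchases, A records views, every purchase is also a view, and scoring
the cross matrix [B'A] against the user's view history can surface an
item that [B'B] alone cannot.

import numpy as np

# Made-up action matrices (rows = users, columns = items), purely illustrative.
# B records purchases, A records views; every purchase also appears as a view.
B = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
])
A = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
])

BtB = B.T @ B   # purchase co-occurrence: [B'B]
BtA = B.T @ A   # cross matrix [B'A]: purchases of item i versus views of item j

# Test user: purchased item 0, viewed items 0 and 1.
h_b = np.array([1, 0, 0, 0])
h_a = np.array([1, 1, 0, 0])

purchase_scores = BtB @ h_b
cross_scores = BtA @ h_a

# Mask out what the user has already purchased.
purchase_scores = np.where(h_b == 1, 0, purchase_scores)
cross_scores = np.where(h_b == 1, 0, cross_scores)

print("from [B'B]:", purchase_scores)   # all zeros: no recommendation possible
print("from [B'A]:", cross_scores)      # item 1 gets a score via the view data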




Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by B Lyon <br...@gmail.com>.
Thanks folks for taking a look.

I haven't sat down to try it yet, but wondering how hard it is to construct
(realizable and realistic) k11, k12, k21, k22 values for three binary
sequences X, Y, Z where (X,Y) and (Y,Z) have the same co-occurrence, but you
can tweak k12 and k21 so that the LLR values are extremely different in
both directions.  I assume that k22 doesn't matter much in practice since
things are sparse and k22 is huge.  Well, obviously, I guess you could
simply switch the k12/k21 values between the two sequence pairs to flip the
order at will... which is information that co-occurrence of course does not
"know about".





-- 
BF Lyon
http://www.nowherenearithaca.com

Re: Draft Interactive Viz for Exploring Co-occurrence, Recommender calculations

Posted by Ted Dunning <te...@gmail.com>.
This is nice.  As you say, k11 is the only part that is used in
co-occurrence, and it doesn't weight by prevalence, either.

At this size it is hard to demonstrate much of a difference, because it is
hard to show interesting values of LLR without absurdly strong coordination
between items.

