Posted to user@mahout.apache.org by Ahmed Abdeen Hamed <ah...@gmail.com> on 2012/03/06 23:09:02 UTC

Injecting content into item-item CF

Hello friends,

Is there an example of how you can inject item attributes into an item-item
similarity algorithm?

Thanks very much,

-Ahmed

Re: Injecting content into item-item CF

Posted by Sean Owen <sr...@gmail.com>.
If you have a "query" then this really sounds like a search problem,
not a recommender problem.

How do you want to use similarity? You already have some IR score
for each of these items based on the query. Are you trying to augment
that by multiplying by the user's predicted preference for the item?

If so, then that bit, the predicted preference, is not dependent on
the query and can be precomputed. But it is not a similarity
computation -- it's an estimate of a preference, which may use an
algorithm that uses a similarity.
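
A minimal sketch of the blending described above, assuming the IR scores arrive from a search engine at query time and the predicted preferences were computed offline by a recommender. The function name, the sample scores, and the `default_pref` fallback are all hypothetical, chosen only for illustration:

```python
def rerank(ir_scores, predicted_prefs, default_pref=1.0):
    """Multiply each item's query-time IR score by the user's precomputed
    predicted preference, then sort items by the blended score."""
    blended = {
        item: score * predicted_prefs.get(item, default_pref)
        for item, score in ir_scores.items()
    }
    return sorted(blended, key=blended.get, reverse=True)


# Hypothetical query-time IR scores for one query.
ir_scores = {"Matrix Method": 0.9, "The Matrix Reloaded": 0.5, "The Matrix": 0.5}

# Hypothetical predicted preferences for one user, computed offline.
predicted_prefs = {"Matrix Method": 0.2, "The Matrix Reloaded": 0.9, "The Matrix": 0.8}

ranking = rerank(ir_scores, predicted_prefs)
```

Only `ir_scores` depends on the query; `predicted_prefs` can be looked up from a precomputed table, which is the point made above.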

On Tue, Mar 13, 2012 at 2:28 PM, Ahmed Abdeen Hamed
<ah...@gmail.com> wrote:
> Hi Sean,
>
> I did some reading before writing so I can ask more specific questions. The
> MiA book has a couple of sections that cover content-based recommendation. The movie
> attribute examples make sense. However, it appears to me that the
> similarity cannot be computed offline. This is because the similarity
> depends on a user query that will be entered in real time. For instance,
> assume that we have the following movies in our database that we would like to
> recommend, among other movies, along with their genres:
>
> The Matrix, Action Adventure
> The Matrix of Power, Documentary
> Matrix Method, Sports and Fitness
> The Matrix Reloaded, Action Adventure
>
> Now if the user query is "matrix sports", the similarity will be higher for
> the Matrix Method movie than for The Matrix Reloaded. But these similarities will
> only be available after the user enters the query.
>
> My question now is: is there a way to compute these similarities offline?
>
> Thanks very much,
>
> -Ahmed
>
>
>
>
>
> On Tue, Mar 6, 2012 at 5:14 PM, Sean Owen <sr...@gmail.com> wrote:
>>
>> Sure, you just write your own ItemSimilarity implementation based on
>> the content, whatever that may be. What you do there is mostly up to
>> you; there's not a framework for this.
>>
>

Re: Injecting content into item-item CF

Posted by Sean Owen <sr...@gmail.com>.
Yes. You could define item similarity based on your movie category in
your data: 1 if they're in the same category, 0 if not. That's very
simplistic. And you'd have to write it yourself. But that's what it is
referring to.

However, you need more than an item similarity metric to make a
recommender. You need user-item preferences. Without that you have
nothing to make recommendations from. item-item similarity doesn't
somehow magically tell you which users like which items.
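
A rough sketch of both points, under stated assumptions: the category-match similarity (1 if same category, 0 if not) and a similarity-weighted estimate that has nothing to work with when a user has no preferences. This is plain Python, not Mahout code; the function names and the rating values are hypothetical:

```python
def category_similarity(categories, item_a, item_b):
    """Return 1.0 if both items are in the same category, else 0.0."""
    return 1.0 if categories[item_a] == categories[item_b] else 0.0


def estimate_preference(user_prefs, categories, candidate):
    """Similarity-weighted average of the user's known preferences.

    Returns None when no known preference has non-zero similarity to the
    candidate, including the case where the user has no preferences at all.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for item, pref in user_prefs.items():
        sim = category_similarity(categories, item, candidate)
        weighted_sum += sim * pref
        weight_total += sim
    return weighted_sum / weight_total if weight_total > 0 else None


categories = {
    "The Matrix": "Action Adventure",
    "The Matrix of Power": "Documentary",
    "Matrix Method": "Sports and Fitness",
    "The Matrix Reloaded": "Action Adventure",
}

# Hypothetical ratings for one user.
with_prefs = estimate_preference(
    {"The Matrix": 4.0, "The Matrix of Power": 2.0},
    categories,
    "The Matrix Reloaded",
)

# No user-item preferences: the similarity alone yields no estimate.
without_prefs = estimate_preference({}, categories, "The Matrix Reloaded")
```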

On Tue, Mar 13, 2012 at 7:57 PM, Ahmed Abdeen Hamed
<ah...@gmail.com> wrote:
> Sorry if my questions are hard to understand.
>
> Let's start all over...
>
> Do we have an example that explains the following paragraph in the MiA
> book?
>
> "Or recall that item-based recommenders require some notion of similarity
> between two given items. This similarity is encapsulated by an
> ItemSimilarity implementation.
> So far, implementations have derived similarity from user preferences
> only—this is classic collaborative filtering. But there’s no reason the
> implementation couldn’t be based on item attributes. For example, a movie
> recommender might define item (movie) similarity as a function of movie
> attributes like genre, director, actor, and year of release. Using such an
> implementation within a traditional item"
>
>
> This is the part that I am trying to understand and have a solution for.
>
> Thanks,
>
> -Ahmed
>
>
>
> On Tue, Mar 13, 2012 at 2:08 PM, Sean Owen <sr...@gmail.com> wrote:
>>
>> OK, you have some users. You have some items, and those items have
>> attributes.
>>
>> Nothing here connects users to items though, so how can any process
>> estimate any additional user-item connections?
>>
>> You could compute item-item similarities, but that doesn't resolve this.
>>
>> Sorry I am really confused -- you have been talking about queries but
>> saying you are not using any search. It's hard to help.
>>
>

Re: Injecting content into item-item CF

Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Sorry if my questions are hard to understand.

Let's start all over...

Do we have an example that explains the following paragraph in the MiA
book?

"Or recall that item-based recommenders require some notion of similarity
between two given items. This similarity is encapsulated by an
ItemSimilarity implementation.
So far, implementations have derived similarity from user preferences
only—this is classic collaborative filtering. But there’s no reason the
implementation couldn’t be based on item attributes. For example, a movie
recommender might define item (movie) similarity as a function of movie
attributes like genre, director, actor, and year of release. Using such an
implementation within a traditional item"


This is the part that I am trying to understand and have a solution for.
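
One possible reading of "similarity as a function of movie attributes" is a weighted mix of per-attribute similarities. The sketch below is an assumption, not the book's implementation: the `Movie` type, the weights, the 20-year span, and the placeholder films are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    title: str
    genres: frozenset
    director: str
    actors: frozenset
    year: int


def jaccard(a, b):
    """Size of the intersection over size of the union; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical weights; they sum to 1 so the result stays in [0, 1].
WEIGHTS = {"genre": 0.4, "director": 0.2, "actors": 0.3, "year": 0.1}


def attribute_similarity(m1, m2, year_span=20):
    """Weighted combination of genre, director, actor, and release-year similarity."""
    genre = jaccard(m1.genres, m2.genres)
    director = 1.0 if m1.director == m2.director else 0.0
    actors = jaccard(m1.actors, m2.actors)
    # Release years more than `year_span` apart contribute nothing.
    year = max(0.0, 1.0 - abs(m1.year - m2.year) / year_span)
    return (
        WEIGHTS["genre"] * genre
        + WEIGHTS["director"] * director
        + WEIGHTS["actors"] * actors
        + WEIGHTS["year"] * year
    )


# Placeholder films with made-up attributes.
film_a = Movie("Film A", frozenset({"action", "adventure"}), "Director X",
               frozenset({"Actor 1", "Actor 2"}), 1999)
film_b = Movie("Film B", frozenset({"action", "adventure"}), "Director X",
               frozenset({"Actor 1", "Actor 3"}), 2003)
film_c = Movie("Film C", frozenset({"documentary"}), "Director Y",
               frozenset({"Actor 4"}), 2010)

sim_ab = attribute_similarity(film_a, film_b)
sim_ac = attribute_similarity(film_a, film_c)
```

None of this depends on a user query, so values like these can be computed ahead of time for every pair of items.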

Thanks,

-Ahmed



On Tue, Mar 13, 2012 at 2:08 PM, Sean Owen <sr...@gmail.com> wrote:

> OK, you have some users. You have some items, and those items have
> attributes.
>
> Nothing here connects users to items though, so how can any process
> estimate any additional user-item connections?
>
> You could compute item-item similarities, but that doesn't resolve this.
>
> Sorry I am really confused -- you have been talking about queries but
> saying you are not using any search. It's hard to help.
>
>

Re: Injecting content into item-item CF

Posted by Sean Owen <sr...@gmail.com>.
OK, you have some users. You have some items, and those items have attributes.

Nothing here connects users to items though, so how can any process
estimate any additional user-item connections?

You could compute item-item similarities, but that doesn't resolve this.

Sorry I am really confused -- you have been talking about queries but
saying you are not using any search. It's hard to help.


On Tue, Mar 13, 2012 at 5:57 PM, Ahmed Abdeen Hamed
<ah...@gmail.com> wrote:
> Thanks again for the response!
>
> Perhaps this is a search problem; I will not disagree. However, I am not
> using search of any sort. I have a bunch of items, and I need to derive
> "the implicit preferences" among them using their attributes (genre,
> director, actor, etc). Right now, I don't have any IR scores, which is
> what I am trying to compute. I have just item-id and user-id. My final goal is to
> have the following fields in my file to compute similarities and make
> recommendations:
>
> item-id, user-id, score
> 100, 700, 0.787
> 100, 767, 0.653
> .
> .
> .
>
> That's all I have and I don't intend to use any search components.
>
> Sorry if I am making this hard for you to understand.
>
> -Ahmed
>

Re: Injecting content into item-item CF

Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Thanks again for the response!

Perhaps this is a search problem; I will not disagree. However, I am not
using search of any sort. I have a bunch of items, and I need to derive
"the implicit preferences" among them using their attributes (genre,
director, actor, etc). Right now, I don't have any IR scores, which is
what I am trying to compute. I have just item-id and user-id. My final goal is to
have the following fields in my file to compute similarities and make
recommendations:

item-id, user-id, score
100, 700, 0.787
100, 767, 0.653
.
.
.

That's all I have and I don't intend to use any search components.
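
A small sketch for the file layout above. As far as I know, Mahout's `FileDataModel` reads comma-separated lines in the order userID,itemID,preference, so rows written as item-id, user-id, score would need their first two columns swapped. The helper name and the in-memory sample are hypothetical, and the scores themselves still have to come from real user-item data:

```python
import csv
import io


def to_user_item_pref(rows_text):
    """Reorder 'item-id, user-id, score' rows into 'userID,itemID,preference' lines."""
    reader = csv.reader(io.StringIO(rows_text), skipinitialspace=True)
    lines = []
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed rows
        item_id, user_id, score = (field.strip() for field in row)
        if not item_id.isdigit():
            continue  # skip the header row
        lines.append(f"{user_id},{item_id},{float(score)}")
    return lines


sample = """item-id, user-id, score
100, 700, 0.787
100, 767, 0.653
"""

converted = to_user_item_pref(sample)
```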

Sorry if I am making this hard for you to understand.

-Ahmed




On Tue, Mar 13, 2012 at 1:43 PM, Sean Owen <sr...@gmail.com> wrote:

> Before I answer, I want to make sure we're on the same page. You are
> definitely describing a search problem. Was my guess at how you are
> also adding in something recommender-related accurate?
>
> Otherwise we may be talking past each other again.
>
> On Tue, Mar 13, 2012 at 5:35 PM, Ahmed Abdeen Hamed
> <ah...@gmail.com> wrote:
> > Thanks Sean and Ted!
> >
> > Let me explain how I got here in the first place. I have an interest in
> > content-based similarities. When I read the two sections in the MiA book
> > about that, I got some hints. I don't have user preferences and was trying
> > to use the content-based similarities in their place as the book explains.
> > Therefore, my question is really about computing this similarity from item
> > attributes. How can I do that without the use of search queries?
> >
> > Thanks,
> >
> > -Ahmed
>

Re: Injecting content into item-item CF

Posted by Sean Owen <sr...@gmail.com>.
Before I answer, I want to make sure we're on the same page. You are
definitely describing a search problem. Was my guess at how you are
also adding in something recommender-related accurate?

Otherwise we may be talking past each other again.

On Tue, Mar 13, 2012 at 5:35 PM, Ahmed Abdeen Hamed
<ah...@gmail.com> wrote:
> Thanks Sean and Ted!
>
> Let me explain how I got here in the first place. I have an interest in
> content-based similarities. When I read the two sections in the MiA book
> about that, I got some hints. I don't have user preferences and was trying
> to use the content-based similarities in their place as the book explains.
> Therefore, my question is really about computing this similarity from item
> attributes. How can I do that without the use of search queries?
>
> Thanks,
>
> -Ahmed

Re: Injecting content into item-item CF

Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Thanks Sean and Ted!

Let me explain how I got here in the first place. I have an interest in
content-based similarities. When I read the two sections in the MiA book
about that, I got some hints. I don't have user preferences and was trying
to use the content-based similarities in their place as the book explains.
Therefore, my question is really about computing this similarity from item
attributes. How can I do that without the use of search queries?

Thanks,

-Ahmed



On Tue, Mar 13, 2012 at 10:36 AM, Ted Dunning <te...@gmail.com> wrote:

> This is search, not recommendation.
>
> For search, you need to build an index (which can be built off-line).  In
> the process of building that index, you can propagate content terms across
> highly similar (behaviorally) items and you can include references to and
> from similar items.
>
> Content-based recommendation uses content attributes on items to refine
> the item-item similarities and uses content attributes on users to help
> access those similarities.  Often one uses a search engine such as Solr to
> augment the real-time side of the implementation.
>
>
> On Tue, Mar 13, 2012 at 9:28 AM, Ahmed Abdeen Hamed <ahmed.elmasri@gmail.com> wrote:
>
>> Hi Sean,
>>
>> I did some reading before writing so I can ask more specific questions. The
>> MiA book has a couple of sections that cover content-based recommendation. The movie
>> attribute examples make sense. However, it appears to me that the
>> similarity cannot be computed offline. This is because the similarity
>> depends on a user query that will be entered in real time. For instance,
>> assume that we have the following movies in our database that we would like to
>> recommend, among other movies, along with their genres:
>>
>> The Matrix, Action Adventure
>> The Matrix of Power, Documentary
>> Matrix Method, Sports and Fitness
>> The Matrix Reloaded, Action Adventure
>>
>> Now if the user query is "matrix sports", the similarity will be higher for
>> the Matrix Method movie than for The Matrix Reloaded. But these similarities will
>> only be available after the user enters the query.
>>
>> My question now is: is there a way to compute these similarities offline?
>>
>> Thanks very much,
>>
>> -Ahmed
>>
>>
>>
>>
>>
>> On Tue, Mar 6, 2012 at 5:14 PM, Sean Owen <sr...@gmail.com> wrote:
>>
>> > Sure, you just write your own ItemSimilarity implementation based on
>> > the content, whatever that may be. What you do there is mostly up to
>> > you; there's not a framework for this.
>> >
>> >
>>
>
>

Re: Injecting content into item-item CF

Posted by Ted Dunning <te...@gmail.com>.
This is search, not recommendation.

For search, you need to build an index (which can be built off-line).  In
the process of building that index, you can propagate content terms across
highly similar (behaviorally) items and you can include references to and
from similar items.

Content-based recommendation uses content attributes on items to refine the
item-item similarities and uses content attributes on users to help access
those similarities.  Often one uses a search engine such as Solr to augment
the real-time side of the implementation.
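
A toy sketch of the offline indexing idea above, assuming a plain inverted index and a hand-written table of behaviorally similar items. A real deployment would more likely use a search engine; the function names, the term sets, and the similarity table here are hypothetical:

```python
from collections import defaultdict


def build_index(item_terms, similar_items):
    """Build an inverted index (term -> set of items) offline.

    Each item is indexed under its own content terms plus the terms of
    the items listed as behaviorally similar to it.
    """
    index = defaultdict(set)
    for item, terms in item_terms.items():
        expanded = set(terms)
        for neighbour in similar_items.get(item, ()):
            expanded |= item_terms.get(neighbour, set())
        for term in expanded:
            index[term].add(item)
    return index


def search(index, query):
    """Rank items by the number of query terms they match (ties by title)."""
    hits = defaultdict(int)
    for term in query.lower().split():
        for item in index.get(term, ()):
            hits[item] += 1
    return sorted(hits, key=lambda item: (-hits[item], item))


item_terms = {
    "The Matrix": {"matrix", "action", "adventure"},
    "The Matrix Reloaded": {"matrix", "reloaded", "action", "adventure"},
    "Matrix Method": {"matrix", "method", "sports", "fitness"},
}

# Hypothetical behavioral similarity: users who watch one tend to watch the other.
similar_items = {
    "The Matrix": ["The Matrix Reloaded"],
    "The Matrix Reloaded": ["The Matrix"],
}

index = build_index(item_terms, similar_items)
results = search(index, "matrix sports")
```

After propagation, "The Matrix" is also findable under "reloaded", even though that term never appears in its own content.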

On Tue, Mar 13, 2012 at 9:28 AM, Ahmed Abdeen Hamed <ahmed.elmasri@gmail.com> wrote:

> Hi Sean,
>
> I did some reading before writing so I can ask more specific questions. The
> MiA book has a couple of sections that cover content-based recommendation. The movie
> attribute examples make sense. However, it appears to me that the
> similarity cannot be computed offline. This is because the similarity
> depends on a user query that will be entered in real time. For instance,
> assume that we have the following movies in our database that we would like to
> recommend, among other movies, along with their genres:
>
> The Matrix, Action Adventure
> The Matrix of Power, Documentary
> Matrix Method, Sports and Fitness
> The Matrix Reloaded, Action Adventure
>
> Now if the user query is "matrix sports", the similarity will be higher for
> the Matrix Method movie than for The Matrix Reloaded. But these similarities will
> only be available after the user enters the query.
>
> My question now is: is there a way to compute these similarities offline?
>
> Thanks very much,
>
> -Ahmed
>
>
>
>
>
> On Tue, Mar 6, 2012 at 5:14 PM, Sean Owen <sr...@gmail.com> wrote:
>
> > Sure, you just write your own ItemSimilarity implementation based on
> > the content, whatever that may be. What you do there is mostly up to
> > you; there's not a framework for this.
> >
> >
>

Re: Injecting content into item-item CF

Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Hi Sean,

I did some reading before writing so I can ask more specific questions. The
MiA book has a couple of sections that cover content-based recommendation. The movie
attribute examples make sense. However, it appears to me that the
similarity cannot be computed offline. This is because the similarity
depends on a user query that will be entered in real time. For instance,
assume that we have the following movies in our database that we would like to
recommend, among other movies, along with their genres:

The Matrix, Action Adventure
The Matrix of Power, Documentary
Matrix Method, Sports and Fitness
The Matrix Reloaded, Action Adventure

Now if the user query is "matrix sports", the similarity will be higher for
the Matrix Method movie than for The Matrix Reloaded. But these similarities will
only be available after the user enters the query.

My question now is: is there a way to compute these similarities offline?
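
To make the example concrete, here is a sketch that splits the work in two: item term vectors are built ahead of time from title and genre, and only the comparison against the query happens at request time. The tokenizer, the stop-word list, and the use of cosine similarity are assumptions made for illustration:

```python
import math
from collections import Counter

STOP_WORDS = {"the", "of", "and"}


def vectorize(text):
    """Lower-case bag-of-words vector, ignoring a few stop words."""
    tokens = text.lower().replace(",", " ").split()
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


CATALOG = {
    "The Matrix": "Action Adventure",
    "The Matrix of Power": "Documentary",
    "Matrix Method": "Sports and Fitness",
    "The Matrix Reloaded": "Action Adventure",
}

# Offline step: one vector per item, built before any query arrives.
ITEM_VECTORS = {
    title: vectorize(f"{title} {genre}") for title, genre in CATALOG.items()
}


def score_query(query):
    """Online step: compare the query vector against the precomputed item vectors."""
    query_vector = vectorize(query)
    return {title: cosine(query_vector, vec) for title, vec in ITEM_VECTORS.items()}


scores = score_query("matrix sports")
best = max(scores, key=scores.get)
```

The query-to-item scores exist only after the query is entered, but everything on the item side is ready beforehand.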

Thanks very much,

-Ahmed





On Tue, Mar 6, 2012 at 5:14 PM, Sean Owen <sr...@gmail.com> wrote:

> Sure, you just write your own ItemSimilarity implementation based on
> the content, whatever that may be. What you do there is mostly up to
> you; there's not a framework for this.
>
>

Re: Injecting content into item-item CF

Posted by Sean Owen <sr...@gmail.com>.
Sure, you just write your own ItemSimilarity implementation based on
the content, whatever that may be. What you do there is mostly up to
you; there's not a framework for this.
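
As a language-neutral sketch of what such a content-based similarity might look like, the class below precomputes a pairwise table from item attributes using Jaccard overlap. It only mirrors the general idea of an item-similarity component; the class name, method name, item IDs, and attribute sets are hypothetical and this is not Mahout's API:

```python
from itertools import combinations


class ContentItemSimilarity:
    """Item-item similarity derived only from item attributes."""

    def __init__(self, item_attributes):
        # Precompute every unordered pair once, keyed by (smaller_id, larger_id).
        self._table = {}
        for a, b in combinations(sorted(item_attributes), 2):
            attrs_a, attrs_b = item_attributes[a], item_attributes[b]
            union = attrs_a | attrs_b
            self._table[(a, b)] = len(attrs_a & attrs_b) / len(union) if union else 0.0

    def item_similarity(self, item_a, item_b):
        """Look up the precomputed Jaccard similarity for a pair of item IDs."""
        if item_a == item_b:
            return 1.0
        key = (item_a, item_b) if item_a < item_b else (item_b, item_a)
        return self._table[key]


# Hypothetical item IDs mapped to genre tokens.
item_attributes = {
    1: {"action", "adventure"},
    2: {"documentary"},
    3: {"sports", "fitness"},
    4: {"action", "adventure"},
}

similarity = ContentItemSimilarity(item_attributes)
```

Here `similarity.item_similarity(1, 4)` is 1.0 because the two items share every genre token, while `similarity.item_similarity(1, 2)` is 0.0.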

On Tue, Mar 6, 2012 at 10:09 PM, Ahmed Abdeen Hamed
<ah...@gmail.com> wrote:
> Hello friends,
>
> Is there an example of how you can inject item attributes into an item-item
> similarity algorithm?
>
> Thanks very much,
>
> -Ahmed