
Posted to user@mahout.apache.org by charlysf <ch...@gmail.com> on 2009/06/22 22:15:32 UTC

Would like some recommendation, need advice

Hello,

I would like some advice. I currently have these tables in MySQL:

User_subject
user_id, subject_id, relevance

Item_subject
item_id, subject_id

I would like some advice on how to produce recommendations from them.

To compute the user similarity, I made a JDBCDataModel for the table
User_subject.
To compute the item similarity, I did the same for the table Item_subject.

Now I have the similarity between users, and between items.
Do I need to make a table like this:

user_item
user_id, item_id, relevance

It would have millions of rows, and I think it could be very slow, no?

Thank you very much,
-- 
View this message in context: http://www.nabble.com/Would-like-some-recommendation%2C-need-advice-tp24154572p24154572.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Would like some recommendation, need advice

Posted by Sean Owen <sr...@gmail.com>.

On Mon, Jun 22, 2009 at 4:50 PM, charlysf<ch...@gmail.com> wrote:
>
> Thank you very much. In this case, I will return the most relevant articles,
> but not really any new recommendations. So maybe I should first recommend
> new subjects to a user with a user-based recommender, and then do the same
> to retrieve the linked articles.

Yes, you could use the framework to recommend subjects, rather than
items, to users. Or, to recommend items to subjects. Anything could be
construed as a recommendation problem. I think it's interesting to
experiment with these to see what kind of results you get.

> That's my first part, implicit recommendation. For explicit
> recommendation, I have a user_feedback table with: user_id, article_id,
> rating. That case is very common and not a problem.
>
> I'm wondering whether I should choose a user-based or an item-based engine
> for that. What is your advice?

If you have more users than items, I would try a user-based
recommender. And vice versa.

You should also try the SlopeOneRecommender.

I think it's worthwhile to try many variations and see which performs
best. You can use the "Evaluator" classes to decide how well the
recommenders are predicting ratings.
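
As a rough illustration of what such an evaluation measures, here is a minimal Python sketch. It is not Mahout's API: it only shows the idea of holding out some known ratings, predicting them, and averaging the absolute error. The `predict` callable, the user-mean baseline, and the toy ratings are all assumptions made up for this example.

```python
from typing import Callable, Dict, List, Tuple

Rating = Tuple[int, int, float]  # (user_id, article_id, rating)


def average_absolute_difference(
    predict: Callable[[int, int], float],
    held_out: List[Rating],
) -> float:
    """Mean absolute error between predicted and actual held-out ratings."""
    if not held_out:
        raise ValueError("need at least one held-out rating")
    total = sum(abs(predict(u, i) - r) for u, i, r in held_out)
    return total / len(held_out)


def make_user_mean_predictor(training: List[Rating]) -> Callable[[int, int], float]:
    """Hypothetical baseline: predict each user's mean training rating."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for u, _, r in training:
        sums[u] = sums.get(u, 0.0) + r
        counts[u] = counts.get(u, 0) + 1
    global_mean = sum(sums.values()) / sum(counts.values())

    def predict(user_id: int, article_id: int) -> float:
        # Fall back to the global mean for users unseen in training.
        if user_id in counts:
            return sums[user_id] / counts[user_id]
        return global_mean

    return predict


# Toy data in the shape of the user_feedback table (made up for illustration).
training = [(1, 10, 4.0), (1, 11, 2.0), (2, 10, 5.0)]
held_out = [(1, 12, 3.5), (2, 11, 4.0)]
score = average_absolute_difference(make_user_mean_predictor(training), held_out)
print(score)
```

A lower score means the predicted ratings are closer to the real ones, so the same held-out set can be used to compare a user-based, an item-based, and a slope-one style predictor.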

Re: Would like some recommendation, need advice

Posted by charlysf <ch...@gmail.com>.

Thank you very much. In this case, I will return the most relevant articles,
but not really any new recommendations. So maybe I should first recommend
new subjects to a user with a user-based recommender, and then do the same
to retrieve the linked articles.

Is that it?

That's my first part, implicit recommendation. For explicit
recommendation, I have a user_feedback table with: user_id, article_id,
rating. That case is very common and not a problem.

I'm wondering whether I should choose a user-based or an item-based engine
for that. What is your advice?

Thanks! 


Re: Would like some recommendation, need advice

Posted by Ted Dunning <te...@gmail.com>.

Indeed not.  But it *is* a case of the product architecture for
recommendations I was nattering about.

The problem here is how to compute the (user x topic) x (item x topic)'
product efficiently (the prime denotes the transpose).  This can be done
pretty well with either Hadoop or SQL.  In Pig or native map-reduce, the
trick is to group by the topic and then group by (user, item), summing the
results as you go.  If either user x topic or item x topic is small then a
map-side join is good for the first group-by operation.  If not, then doing
two full-scale map-reduce operations is no big deal.

You should consider how to weight different relevances, probably according
to overall frequency in the corpus.
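
To make the group-by idea concrete, here is a small in-memory Python sketch of the same computation. It is only an illustration under stated assumptions, not Hadoop, Pig, or Mahout code: the function name, the toy rows, and in particular the log-based frequency weight are assumptions chosen for the example, and the right weighting would need experimentation.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def user_item_scores(
    user_subject: List[Tuple[int, int, float]],  # (user_id, subject_id, relevance)
    item_subject: List[Tuple[int, int]],         # (item_id, subject_id)
    weighted: bool = True,
) -> Dict[Tuple[int, int], float]:
    """Sum relevance over shared subjects for every (user, item) pair."""
    # First group-by: collect users and items under each subject.
    users_by_subject = defaultdict(list)
    for user_id, subject_id, relevance in user_subject:
        users_by_subject[subject_id].append((user_id, relevance))
    items_by_subject = defaultdict(list)
    for item_id, subject_id in item_subject:
        items_by_subject[subject_id].append(item_id)

    n_items = len({item_id for item_id, _ in item_subject})
    scores: Dict[Tuple[int, int], float] = defaultdict(float)
    for subject_id, items in items_by_subject.items():
        # Assumed weighting: subjects attached to fewer items count more.
        weight = math.log(1 + n_items / len(items)) if weighted else 1.0
        # Second group-by: accumulate per (user, item) as we go.
        for user_id, relevance in users_by_subject.get(subject_id, []):
            for item_id in items:
                scores[(user_id, item_id)] += weight * relevance
    return dict(scores)


# Toy rows in the shape of User_subject and Item_subject (made up).
user_subject = [(1, 100, 0.9), (1, 101, 0.2), (2, 101, 0.7)]
item_subject = [(10, 100), (11, 100), (11, 101), (12, 101), (13, 101)]
scores = user_item_scores(user_subject, item_subject)

# Items for user 1, best first.
ranked = sorted(
    (item for (user, item) in scores if user == 1),
    key=lambda item: scores[(1, item)],
    reverse=True,
)
print(ranked)
```

Only pairs that share at least one subject get a score, so the result is usually much smaller than the full user x item table.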

On Mon, Jun 22, 2009 at 1:32 PM, Sean Owen <sr...@gmail.com> wrote:

> I see. This is almost not a 'classic' recommendation problem. If you
> have user-subject similarity, and subject-item similarity already,
> then user-item similarity is probably just the product of the two? Then
> you can recommend items by ordering by similarity.
>

Re: Would like some recommendation, need advice

Posted by Sean Owen <sr...@gmail.com>.

I see. This is almost not a 'classic' recommendation problem. If you
have user-subject similarity, and subject-item similarity already,
then user-item similarity is probably just the product of the two? Then
you can recommend items by ordering by similarity.

That is, in particular, it sounds like you don't really have a notion
of 'ratings' from users to items, which is what this library really is
all about.

Am I right, or do you have user ratings?

On Mon, Jun 22, 2009 at 4:24 PM, charlysf<ch...@gmail.com> wrote:
>
> Thank you,
>
> In fact, I need the similarities stored in the database so that I can return
> the neighborhood, similar users, and similar items.
>
> Right now, for my users, I have this table: user_a_id, user_b_id,
> similarity.
> And I have the same for items.
>
> I would like to do some implicit recommendation. I know that a user is
> linked to some topics, and so is an item. A topic is what I call my subject.
> That's why I have links between users and subjects, and between items and
> subjects.

Re: Would like some recommendation, need advice

Posted by charlysf <ch...@gmail.com>.

Thank you,

In fact, I need the similarities stored in the database so that I can return
the neighborhood, similar users, and similar items.

Right now, for my users, I have this table: user_a_id, user_b_id,
similarity.
And I have the same for items.

I would like to do some implicit recommendation. I know that a user is
linked to some topics, and so is an item. A topic is what I call my subject.
That's why I have links between users and subjects, and between items and
subjects.



Re: Would like some recommendation, need advice

Posted by Sean Owen <sr...@gmail.com>.

It sounds like you want to pre-compute, and then save, the similarity
between each pair of items, and each pair of users? Yes, you can do
that. You don't have to do that if you don't want to. Already you are
using things like TanimotoCoefficientSimilarity, which compute
similarity dynamically based on the data tables.
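
For readers unfamiliar with the measure, the Tanimoto coefficient of two sets is the size of their intersection divided by the size of their union. Below is a minimal Python sketch of that formula applied to the subjects of two items. It is not Mahout's implementation; the subject IDs are made up, and returning 0.0 when both sets are empty is an assumption of this sketch.

```python
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) coefficient: |a & b| / |a | b|."""
    union = len(a | b)
    if union == 0:
        # Assumed convention for two empty sets.
        return 0.0
    return len(a & b) / union


# Subjects attached to two items, as rows of Item_subject would give them.
subjects_of_item_10 = {100, 101, 102}
subjects_of_item_11 = {101, 102, 103}
print(tanimoto(subjects_of_item_10, subjects_of_item_11))
```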

If you did want to make your own table to store these things, you
would also have to write a custom UserSimilarity or ItemSimilarity
class to read from that table. That is fairly easy.

But I think your table would be more like this:

user_a_id, user_b_id, similarity

right?
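
As a sketch of what reading similarities from such a table could look like, here is a small Python example using an in-memory SQLite database. It is not a Mahout UserSimilarity implementation: the class name, the table name `user_similarity`, and the convention of storing each pair once with the smaller ID first are assumptions for illustration only.

```python
import sqlite3
from typing import Optional


class StoredUserSimilarity:
    """Looks up precomputed user-user similarities from a table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def similarity(self, user_a: int, user_b: int) -> Optional[float]:
        # Assumed convention: each pair is stored once, smaller id first,
        # so the lookup is symmetric in its arguments.
        lo, hi = sorted((user_a, user_b))
        row = self.conn.execute(
            "SELECT similarity FROM user_similarity "
            "WHERE user_a_id = ? AND user_b_id = ?",
            (lo, hi),
        ).fetchone()
        return row[0] if row else None


# Build a toy table with the layout from the message (values made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_similarity ("
    "user_a_id INTEGER, user_b_id INTEGER, similarity REAL, "
    "PRIMARY KEY (user_a_id, user_b_id))"
)
conn.executemany(
    "INSERT INTO user_similarity VALUES (?, ?, ?)",
    [(1, 2, 0.8), (1, 3, 0.1)],
)
sim = StoredUserSimilarity(conn)
print(sim.similarity(2, 1))
```

The primary key on the pair keeps each lookup a single index probe, which matters once the table holds millions of rows.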

I may be misunderstanding what you are trying to do, since it seems
like you are doing something a little non-standard. Normally you have
one data table, like:

user_id, item_id, preference

You have this extra notion of 'subject'. If you explain how this fits
in, maybe I can provide some better advice.

Sean
