Posted to user@mahout.apache.org by mrkahvi <mr...@yahoo.co.id> on 2011/10/26 07:38:45 UTC

cold-start and attribute based ItemSimilarity implementation

Dear Mahout Team,
I'm new to Mahout...
Most of the explanations I've found about using Mahout discuss how to
make recommendations using CF.

Here I wish to create a recommender system using Mahout that makes use of an
item ID to decide which user IDs would be relevant to the item. The item
would be recommended as soon as it is available in the database. But using
CF becomes a problem since, in this case, a new item has insufficient info,
such as ratings, purchases, and so on.
Sean Owen suggested that I construct an ItemSimilarity based on attributes, not
ratings. I see... But I'm still confused about how to do so in Mahout, since
an ItemSimilarity is usually constructed by passing a DataModel object that is
based on item ratings (user_id, item_id, rating, and timestamp).
He also suggested that I ask here, so I hope one of you can help me
solve this problem. Thanks in advance.

--
View this message in context: http://lucene.472066.n3.nabble.com/cold-start-and-attribute-based-ItemSimilarity-implementation-tp3453699p3453699.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: cold-start and attribute based ItemSimilarity implementation

Posted by Daniel Xiaodan Zhou <da...@gmail.com>.
I'm thinking of using Amazon Mechanical Turk (mturk.com) for the cold-start problem, that is, hiring some cheap MTurk workers to generate some initial ratings. Does anyone have experience with this?

Daniel


On Oct 26, 2011, at 5:15 AM, Sean Owen wrote:

> I suggested that you write your own ItemSimilarity implementation,
> that can be based on anything you want. That is the part that is
> mostly up to you.
> 
> You'd have to say what your items are, and what their attributes are,
> to get ideas about how to define a similarity metric based on
> attributes. Are there tags or categories for the items, for example?
> If so, you could write a similarity metric that uses overlap in
> category or tag.


Re: cold-start and attribute based ItemSimilarity implementation

Posted by Sean Owen <sr...@gmail.com>.
The simplest sort of metric might be to return 1 if two items are in
the same category, or 0 if not. Or if the items have tags, and share t
tags, return 1 - 1/(1+t) or something. There are much better ideas,
but these are the sorts of things you might start with to experiment.
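
The two starter metrics above could be sketched roughly as follows. This is only an illustration in plain Java, not Mahout API: the class name StarterSimilarity and both method names are made up for this example, and items are assumed to carry a category string and a set of tag strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Two starter similarity metrics. All names here are hypothetical, not Mahout API. */
final class StarterSimilarity {

    private StarterSimilarity() {}

    /** Returns 1.0 if both items are in the same category, 0.0 otherwise. */
    static double categoryMatch(String categoryA, String categoryB) {
        return categoryA != null && categoryA.equals(categoryB) ? 1.0 : 0.0;
    }

    /** Returns 1 - 1/(1+t), where t is the number of tags the two items share. */
    static double sharedTagScore(Set<String> tagsA, Set<String> tagsB) {
        Set<String> shared = new HashSet<>(tagsA);
        shared.retainAll(tagsB); // keep only the tags present in both sets
        int t = shared.size();
        return 1.0 - 1.0 / (1.0 + t);
    }

    public static void main(String[] args) {
        // Made-up event tags, purely for demonstration.
        Set<String> concert = new HashSet<>(Arrays.asList("music", "outdoor", "evening"));
        Set<String> festival = new HashSet<>(Arrays.asList("music", "outdoor", "food"));
        System.out.println(categoryMatch("music", "music"));
        System.out.println(sharedTagScore(concert, festival));
    }
}
```

With no shared tags the tag score is 0, one shared tag gives 0.5, and the score approaches 1 as the number of shared tags grows, so it never needs ratings data.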

On Thu, Oct 27, 2011 at 3:02 AM, kahfi Muhammad <mr...@yahoo.co.id> wrote:
> Hi Lee and Sean, can you provide me with a simple example? Say I want to recommend an event based on its
> category, location, etc.
> OK, I will try to implement it myself if I can understand how to use it from the example.

Re: cold-start and attribute based ItemSimilarity implementation

Posted by kahfi Muhammad <mr...@yahoo.co.id>.
Hi Lee and Sean, can you provide me with a simple example? Say I want to recommend an event based on its
category, location, etc.
OK, I will try to implement it myself if I can understand how to use it from the example.



Re: cold-start and attribute based ItemSimilarity implementation

Posted by lee carroll <le...@googlemail.com>.
If you are going to use product attributes, maybe take a look at Solr's
MoreLikeThis (MLT) request handler.

I know it's a completely new set of infrastructure :-) but it plays very nicely.

http://wiki.apache.org/solr/MoreLikeThis

As for creating a custom item similarity: I've literally just been
following an example of doing this in the Mahout in Action book. It
takes an item property and returns a similarity of -1 or 1 depending on
whether the two items share the same attribute value, or 0 if the
attribute is missing. It works for this made-up trivial example, but
the amount of domain logic you will need to encode into the custom class
will not be maintainable(??) against a real-world set of products. Who
comes up with the similarity heuristic? Not IT, for sure. Comparing
lightbulbs with table lamps might need a different set of rules from
comparing tables with curtains. Sometimes colour is key, sometimes size
is, etc.
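
For readers without the book, a single-attribute comparison of that kind might look roughly like the sketch below. It is not the book's code: the class AttributeMatch and its methods are hypothetical, the attribute is assumed to be a plain string per item ID, and the mapping (1 for equal values, -1 for different values, 0 when a value is missing) is one assumed reading of the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-attribute comparison; an illustration, not code from Mahout in Action. */
final class AttributeMatch {

    // Attribute value per item ID, e.g. an event's category (assumed data layout).
    private final Map<Long, String> attributeByItem = new HashMap<>();

    void put(long itemID, String value) {
        attributeByItem.put(itemID, value);
    }

    /** Returns 1.0 for equal values, -1.0 for different values, 0.0 if either value is missing. */
    double similarity(long itemID1, long itemID2) {
        String a = attributeByItem.get(itemID1);
        String b = attributeByItem.get(itemID2);
        if (a == null || b == null) {
            return 0.0; // nothing known about at least one item
        }
        return a.equals(b) ? 1.0 : -1.0;
    }

    public static void main(String[] args) {
        AttributeMatch byCategory = new AttributeMatch();
        byCategory.put(1L, "music");
        byCategory.put(2L, "music");
        byCategory.put(3L, "sports");
        System.out.println(byCategory.similarity(1L, 2L));
        System.out.println(byCategory.similarity(1L, 3L));
        System.out.println(byCategory.similarity(1L, 4L));
    }
}
```

The maintainability concern shows up as soon as more than one attribute matters: each product type would need its own choice of attributes and weights on top of a comparison like this.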

By using Solr with MLT, edismax, etc., you may stand a better chance
of making a more effective, more maintainable solution.

Get the book, though, as the custom item similarity is great stuff.

cheers lee c


Re: cold-start and attribute based ItemSimilarity implementation

Posted by Sean Owen <sr...@gmail.com>.
I suggested that you write your own ItemSimilarity implementation,
that can be based on anything you want. That is the part that is
mostly up to you.

You'd have to say what your items are, and what their attributes are,
to get ideas about how to define a similarity metric based on
attributes. Are there tags or categories for the items, for example?
If so, you could write a similarity metric that uses overlap in
category or tag.
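
As a rough illustration of a tag-overlap metric, here is a minimal sketch in plain Java. It does not depend on Mahout: the method name itemSimilarity(long, long) only mirrors the method of the same name on Mahout's ItemSimilarity interface, while the class name TagOverlapSimilarity, the setTags helper, and the in-memory tag map are assumptions made for this example. Jaccard overlap (shared tags divided by all distinct tags) is used here as one possible reading of "overlap"; nothing in this thread prescribes it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical tag-overlap similarity; it needs item attributes only, no ratings or DataModel. */
final class TagOverlapSimilarity {

    // Tags per item ID. In a real system these would be loaded from the item database.
    private final Map<Long, Set<String>> tagsByItem = new HashMap<>();

    void setTags(long itemID, String... tags) {
        tagsByItem.put(itemID, new HashSet<>(Arrays.asList(tags)));
    }

    /** Jaccard overlap of the two items' tag sets, in [0, 1]; 0.0 if either item has no tags. */
    double itemSimilarity(long itemID1, long itemID2) {
        Set<String> a = tagsByItem.getOrDefault(itemID1, Collections.emptySet());
        Set<String> b = tagsByItem.getOrDefault(itemID2, Collections.emptySet());
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b); // intersection of the two tag sets
        int unionSize = a.size() + b.size() - shared.size();
        return (double) shared.size() / unionSize;
    }

    public static void main(String[] args) {
        TagOverlapSimilarity similarity = new TagOverlapSimilarity();
        similarity.setTags(1L, "music", "outdoor");
        similarity.setTags(2L, "music", "indoor");
        System.out.println(similarity.itemSimilarity(1L, 2L));
    }
}
```

Because the score comes from attributes alone, a brand-new item with no ratings can still be compared with items that users have already rated. To use something like this inside Mahout, the class would have to implement the real ItemSimilarity interface, including its other methods, which this sketch leaves out.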
