Posted to user@mahout.apache.org by Agata Filiana <a....@gmail.com> on 2013/03/15 17:37:47 UTC

Boosting User-Based with the user's attributes

Hi,

I'm fairly new to Mahout. Right now I am experimenting with Mahout by trying
to build a simple recommendation system. What I have is just a boolean data
set, with only the userID and itemID. I understand that for this case I
have to use GenericBooleanPrefUserBasedRecommender - which I have done, and
it works fine.

Apart from the userID and itemID data, I also have the user's attributes
(their age, gender, list of interests). I would like to combine this into
the recommendation system to increase the performance of the recommender.
Is this possible to do or am I trying something that does not make sense?

It would be great if you could give me any input or ideas on this (or
point me to any good reading on the matter).

Thank you!

Regards,

*Agata Filiana*
Erasmus Mundus Student

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

I see, it makes more sense with the geometric mean. And with weights, if I
want to apply, say, 70% to Sim1 and 30% to Sim2, would it also make sense to
have it like this? The result should be around 0.194.
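
For comparison, here is the arithmetic for Sim1 = 0.9 and Sim2 = 0.2 under two readings of "weighted". The first line assumes the usual definition of a weighted geometric mean, with the weights used as exponents that sum to 1. The second line is only a guess at how a value of about 0.194 could arise: each similarity is multiplied by its weight and an unweighted geometric mean is taken.

```latex
\begin{align*}
\text{weights as exponents:}\quad
  & 0.9^{0.7} \cdot 0.2^{0.3} \approx 0.573 \\
\text{weights as multipliers:}\quad
  & \sqrt{(0.7 \cdot 0.9)\,(0.3 \cdot 0.2)} = \sqrt{0.0378} \approx 0.194
\end{align*}
```

With weights as exponents the result always stays between the two inputs (here between 0.2 and 0.9), which is not the case when the weights are used as multipliers.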

*

Agata Filiana
Erasmus Mundus DMKM Student 2011-2013 <http://www.em-dmkm.eu/>
*


On 17 April 2013 17:00, Sean Owen <sr...@gmail.com> wrote:

> If all of your similarities are a product like this, then they're all
> "low". In a relative sense this is fine.
> But this is also why I proposed a geometric mean instead. For example
> the geometric mean of these is about 0.424 and this notion can be
> extended to include weights as well, which is what may make it
> particularly interesting to you since you mentioned weighting.

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

If all of your similarities are a product like this, then they're all
"low". In a relative sense this is fine.
But this is also why I proposed a geometric mean instead. For example
the geometric mean of these is about 0.424 and this notion can be
extended to include weights as well, which is what may make it
particularly interesting to you since you mentioned weighting.
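
As a rough illustration of the numbers above, the following sketch computes the plain and the weighted geometric mean of a set of similarity values. The class and method names are hypothetical and not part of Mahout, and the 0.7/0.3 weights are only the example figures mentioned in this thread.

```java
import java.util.Arrays;

/** Hypothetical helper for combining similarity values; not part of Mahout. */
final class GeometricMeans {

    private GeometricMeans() {}

    /** Unweighted geometric mean: the n-th root of the product of n values. */
    static double geometricMean(double... values) {
        double[] equalWeights = new double[values.length];
        Arrays.fill(equalWeights, 1.0);
        return weightedGeometricMean(values, equalWeights);
    }

    /**
     * Weighted geometric mean: exp(sum(w_i * ln(x_i)) / sum(w_i)).
     * Values and weights must be non-negative; a zero value with a
     * positive weight makes the whole result zero.
     */
    static double weightedGeometricMean(double[] values, double[] weights) {
        if (values.length == 0 || values.length != weights.length) {
            throw new IllegalArgumentException("need one weight per value");
        }
        double logSum = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < values.length; i++) {
            if (!(values[i] >= 0.0) || !(weights[i] >= 0.0)) {
                throw new IllegalArgumentException("values and weights must be >= 0");
            }
            if (weights[i] == 0.0) {
                continue; // a zero weight means the value is ignored
            }
            if (values[i] == 0.0) {
                return 0.0;
            }
            logSum += weights[i] * Math.log(values[i]);
            weightSum += weights[i];
        }
        if (weightSum == 0.0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        return Math.exp(logSum / weightSum);
    }

    public static void main(String[] args) {
        // Plain geometric mean of the two example similarities.
        System.out.println(geometricMean(0.9, 0.2));
        // The same values with example weights 0.7 and 0.3.
        System.out.println(weightedGeometricMean(
                new double[] {0.9, 0.2}, new double[] {0.7, 0.3}));
    }
}
```

Working in log space avoids multiplying many small numbers directly, and dividing by the weight sum means the weights do not have to add up to 1.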

On Wed, Apr 17, 2013 at 3:56 PM, Agata Filiana <a....@gmail.com> wrote:
> Just a thought, when you say to combine the metrics by multiplying them,
> for example Sim1 = 0.9 and Sim2 = 0.2
> When they are multiplied it would give a result of 0.18 which is very low,
> remembering that they are pretty "similar" based on Sim1 - how can this
> problem be tackled?

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Just a thought: when you say to combine the metrics by multiplying them,
take for example Sim1 = 0.9 and Sim2 = 0.2.
When they are multiplied the result is 0.18, which is very low,
considering that the users are pretty "similar" based on Sim1 - how can this
problem be tackled?

*

Agata Filiana
Erasmus Mundus DMKM Student 2011-2013 <http://www.em-dmkm.eu/>
*


On 16 April 2013 16:41, Agata Filiana <a....@gmail.com> wrote:

> Thanks a lot for the insight,very useful!

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Thanks a lot for the insight, very useful!


*

Agata Filiana
Erasmus Mundus DMKM Student 2011-2013 <http://www.em-dmkm.eu/>
*


On 16 April 2013 16:40, Sean Owen <sr...@gmail.com> wrote:

> Of course it's not meaningless. They provide a basis for ranking
> items, so you can return top-K recommendations.
> If it's normally based on similarity and ratings -- and you have no
> ratings -- similarity is of course the only thing you can base the
> result on.

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

Of course it's not meaningless. They provide a basis for ranking
items, so you can return top-K recommendations.
If it's normally based on similarity and ratings -- and you have no
ratings -- similarity is of course the only thing you can base the
result on.

On Tue, Apr 16, 2013 at 3:36 PM, Agata Filiana <a....@gmail.com> wrote:
> Well right now, I am only using one boolean file - just from this
> history of reading.
> So you are saying the values generated in
> the GenericBooleanPrefUserBasedRecommender is actually useless in this case
> of no ratings and that it is merely based on the similarity only?

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Well, right now I am only using one boolean file - just from this
history of reading.
So you are saying the values generated by
the GenericBooleanPrefUserBasedRecommender are actually useless in this case
of no ratings, and that they are based merely on the similarity?

*

Agata Filiana
Erasmus Mundus DMKM Student 2011-2013 <http://www.em-dmkm.eu/>
*


On 16 April 2013 16:09, Sean Owen <sr...@gmail.com> wrote:

> In the usual recommender, the output is a weighted average of ratings.
> In a model where there are no ratings, this has no meaning --
> everything is "1" implicitly. So the output is something else, and
> here it's a sum of similarities actually.

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

In the usual recommender, the output is a weighted average of ratings.
In a model where there are no ratings, this has no meaning --
everything is "1" implicitly. So the output is something else, and
here it's a sum of similarities actually.
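
To make the difference concrete, here is a small sketch of the two scoring schemes described above. It is only an illustration with hypothetical names and made-up neighbor similarities; Mahout's actual implementation may differ in its details.

```java
/** Hypothetical helper that contrasts the two scoring schemes; not Mahout code. */
final class NeighborScores {

    private NeighborScores() {}

    /** Rating-based estimate: similarity-weighted average of the neighbors' ratings. */
    static double weightedAverage(double[] similarities, double[] ratings) {
        double weightedSum = 0.0;
        double similaritySum = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weightedSum += similarities[i] * ratings[i];
            similaritySum += similarities[i];
        }
        // Undefined when there are no neighbors or all similarities are zero.
        return similaritySum == 0.0 ? Double.NaN : weightedSum / similaritySum;
    }

    /** Boolean estimate: sum of the similarities of the neighbors that have the item. */
    static double similaritySum(double[] similarities, boolean[] hasItem) {
        double sum = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            if (hasItem[i]) {
                sum += similarities[i];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] similarities = {0.8, 0.5, 0.3};
        // With every rating implicitly 1, the weighted average carries no information.
        System.out.println(weightedAverage(similarities, new double[] {1.0, 1.0, 1.0}));
        // The sum of similarities still distinguishes items and can exceed 1.
        System.out.println(similaritySum(similarities, new boolean[] {true, true, false}));
    }
}
```

This is why the boolean recommender's values are not confined to [0,1] even when each individual similarity is.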

On Tue, Apr 16, 2013 at 3:05 PM, Agata Filiana <a....@gmail.com> wrote:
> Sorry my mistake!
> LogLikelihoodSimilarity is giving me [0,1], however when I
> call GenericBooleanPrefUserBasedRecommender for the recommendation it is
> not giving me values [0,1]. That's what I meant.
>

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Sorry, my mistake!
LogLikelihoodSimilarity is giving me [0,1], however when I
call GenericBooleanPrefUserBasedRecommender for the recommendation it is
not giving me values [0,1]. That's what I meant.


*

Agata Filiana
Erasmus Mundus DMKM Student 2011-2013 <http://www.em-dmkm.eu/>
*


On 16 April 2013 15:58, Sean Owen <sr...@gmail.com> wrote:

> That shouldn't be possible, are you sure? it's 1 - 1/(1+LLR) where LLR
> is nonnegative.
> Similarities are in [-1,1] and some are in [0,1].

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

That shouldn't be possible, are you sure? It's 1 - 1/(1+LLR), where LLR
is nonnegative, so the result lies in [0,1).
In general, similarities are in [-1,1], and some are in [0,1].
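
A minimal sketch of the formula above, together with one possible way to bring a [-1,1] similarity into [0,1] before multiplying. The linear rescaling is an assumption chosen for illustration, not something prescribed in this thread, and the class is hypothetical rather than part of Mahout.

```java
/** Hypothetical scaling helpers; not part of Mahout. */
final class SimilarityScaling {

    private SimilarityScaling() {}

    /** Maps a non-negative log-likelihood ratio to 1 - 1/(1 + LLR), which lies in [0,1). */
    static double llrToSimilarity(double llr) {
        if (!(llr >= 0.0)) {
            throw new IllegalArgumentException("LLR must be non-negative");
        }
        return 1.0 - 1.0 / (1.0 + llr);
    }

    /** Assumed normalization: linear map from [-1,1] onto [0,1]. */
    static double rescaleToUnitInterval(double similarity) {
        if (!(similarity >= -1.0 && similarity <= 1.0)) {
            throw new IllegalArgumentException("similarity must be in [-1,1]");
        }
        return (similarity + 1.0) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(llrToSimilarity(0.0));
        System.out.println(llrToSimilarity(3.0));
        System.out.println(rescaleToUnitInterval(-1.0));
        System.out.println(rescaleToUnitInterval(0.0));
    }
}
```

A metric that is already in [0,1], such as the log-likelihood one, needs no rescaling before it is combined with others.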

On Tue, Apr 16, 2013 at 2:51 PM, Agata Filiana <a....@gmail.com> wrote:
> Hi Sean,
>
> I see your point.
> I think I better experiment with those different options.
>
> I'd also like to ask if the result of LogLikelihoodSimilarity is between
> [0,1] ? It seems that I'm getting results higher than 1. So if like you
> said combining the different attributes can be done by multiplying them and
> normalizing them to [0,1] - what is the best method for normalization?
>

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Hi Sean,

I see your point.
I think I'd better experiment with those different options.

I'd also like to ask whether the result of LogLikelihoodSimilarity is within
[0,1]. It seems that I'm getting results higher than 1. So if, as you said,
the different attributes can be combined by multiplying them and
normalizing them to [0,1] - what is the best method for normalization?



*

Agata Filiana
Erasmus Mundus DMKM Student 2011-2013 <http://www.em-dmkm.eu/>
*


On 16 April 2013 12:30, Sean Owen <sr...@gmail.com> wrote:

> Broadly the idea makes sense, but I think this is getting into hacking
> heuristics together without a lot of principle. The result will
> probably work, and you can just proceed as you say -- make up some
> weights and use them to weight the various similarities. If you are
> using the product of similarity values, you can compute something like
> a weighted geometric mean.
> https://en.wikipedia.org/wiki/Geometric_mean
>
> A step in a more principled direction is to consider these various
> things as "items" -- things you read, hobbies you engage in, interests
> you have. Then create a recommender on top of all of these things,
> weighting the input differently. The often-mentioned ALS-WR is one of
> several processes that fits since it has an explicit notion of input
> weight.

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

Broadly the idea makes sense, but I think this is getting into hacking
heuristics together without a lot of principle. The result will
probably work, and you can just proceed as you say -- make up some
weights and use them to weight the various similarities. If you are
using the product of similarity values, you can compute something like
a weighted geometric mean.
https://en.wikipedia.org/wiki/Geometric_mean

A step in a more principled direction is to consider these various
things as "items" -- things you read, hobbies you engage in, interests
you have. Then create a recommender on top of all of these things,
weighting the input differently. The often-mentioned ALS-WR is one of
several processes that fits since it has an explicit notion of input
weight.
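
One way to picture the "attributes as items" idea is to give every attribute value its own pseudo-item ID and append those pairs to the boolean data. The sketch below is only an illustration: the ID offset, the attribute labels and the "userID,itemID" line format are all assumptions, and the encoder class is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical encoder that turns user attributes into pseudo-items. */
final class PseudoItemEncoder {

    // Assumed to be larger than any real item ID in the data set.
    private static final long PSEUDO_ITEM_OFFSET = 1_000_000_000L;

    private final Map<String, Long> idsByAttribute = new LinkedHashMap<>();

    /** Returns a stable pseudo-item ID for an attribute value such as "hobby:chess". */
    long pseudoItemId(String attribute) {
        Long id = idsByAttribute.get(attribute);
        if (id == null) {
            id = PSEUDO_ITEM_OFFSET + idsByAttribute.size();
            idsByAttribute.put(attribute, id);
        }
        return id;
    }

    /** Produces one "userID,itemID" line per attribute of the given user. */
    List<String> encode(long userId, List<String> attributes) {
        List<String> lines = new ArrayList<>();
        for (String attribute : attributes) {
            lines.add(userId + "," + pseudoItemId(attribute));
        }
        return lines;
    }
}
```

The resulting lines would sit alongside the real userID/itemID pairs. Pseudo-items would have to be filtered out of the final recommendations, and this sketch attaches no weight to them; weighting is left to a model that supports it.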

On Tue, Apr 16, 2013 at 11:24 AM, Agata Filiana <a....@gmail.com> wrote:
> Hi,
>
> Continuing this discussion - I have the implementation, but I'd like to
> know your opinion.
> As I said before, I am creating a new implementation of UserSimilarity as
> Sean pointed out.
> Does it make sense to put weights into these metrics? Say I combined 3
> similarity metrics: reading history, hobbies and interests.
> I would like my recommender to be "based" on history but boosted with
> weighted hobbies and interests with different weight, for example interests
> is more important than hobbies.
>
> Does that make sense? And how would you go about to implement it if it does
> make sense?
>
> Thank you again!

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Hi,

Continuing this discussion - I have the implementation, but I'd like to
know your opinion.
As I said before, I am creating a new implementation of UserSimilarity as
Sean pointed out.
Does it make sense to put weights into these metrics? Say I combined 3
similarity metrics: reading history, hobbies and interests.
I would like my recommender to be "based" on history but boosted with
hobbies and interests, each with a different weight; for example, interests
are more important than hobbies.

Does that make sense? And if it does, how would you go about implementing
it?

Thank you again!

*

Agata Filiana
Erasmus Mundus DMKM Student 2011-2013 <http://www.em-dmkm.eu/>
*

On 19 March 2013 12:03, Agata Filiana <a....@gmail.com> wrote:

> Ok, I will try that.
>
> Thanks for the help Sean!

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Ok, I will try that.

Thanks for the help Sean!

On 19 March 2013 12:02, Sean Owen <sr...@gmail.com> wrote:

> Write a new implementation of UserSimilarity that internally calls 2 other
> similarity metrics with the same arguments when asked for a similarity.
> Return their product.



-- 
*Agata Filiana
*

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

Write a new implementation of UserSimilarity that internally calls 2 other
similarity metrics with the same arguments when asked for a similarity.
Return their product.
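
A minimal sketch of that delegation pattern follows. To keep it self-contained it uses a hypothetical stand-in interface, PairSimilarity, instead of Mahout's UserSimilarity; a real implementation would implement UserSimilarity and its userSimilarity(long, long) method, along with whatever other methods that interface requires, which are omitted here.

```java
/** Stand-in for the one method of interest; Mahout's UserSimilarity is not used here. */
interface PairSimilarity {
    double similarity(long userId1, long userId2);
}

/** Hypothetical combined metric: asks two delegates and returns their product. */
final class ProductSimilarity implements PairSimilarity {

    private final PairSimilarity first;
    private final PairSimilarity second;

    ProductSimilarity(PairSimilarity first, PairSimilarity second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public double similarity(long userId1, long userId2) {
        // Both delegates receive the same pair of user IDs.
        double s1 = first.similarity(userId1, userId2);
        double s2 = second.similarity(userId1, userId2);
        // If either delegate returns NaN (no similarity known), the product is NaN too.
        return s1 * s2;
    }
}
```

For example, one delegate could be built on the item data and the other on the hobby data, with both normalized to [0,1] so that the product also stays in [0,1].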

On Tue, Mar 19, 2013 at 6:59 AM, Agata Filiana <a....@gmail.com> wrote:

> I understand that, I guess what I am confused is the implementation of
> merging the two similarity metrics in code. For example I apply
> LogLikelihoodSimilarity for both item and hobby, and I have 2
> UserSimilarity metrics. Then from there I am unsure of how to combine the
> two.
>
>

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

I understand that. I guess what I am confused about is how to implement
merging the two similarity metrics in code. For example, I apply
LogLikelihoodSimilarity to both the item data and the hobby data, so I have
2 UserSimilarity metrics. From there I am unsure how to combine the
two.

On 18 March 2013 21:49, Sean Owen <sr...@gmail.com> wrote:

> I'm not sure what you mean. The only thing I am suggesting to combine are
> two similarity metrics, not data or recommendations.
> You combine metrics by multiplying their values.



-- 
*Agata Filiana
*

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

I'm not sure what you mean. The only things I am suggesting you combine are
two similarity metrics, not data or recommendations.
You combine metrics by multiplying their values.


On Mon, Mar 18, 2013 at 12:54 PM, Agata Filiana <a....@gmail.com> wrote:

> In this case, would it be correct if I somehow "loop" through the item data
> and the hobby data and then combine the scores for a pair of users?
>
> I am having trouble working out how to combine both similarities into one
> metric. Could you possibly give me a clue?
>
> Thank you
>
> On 18 March 2013 14:54, Sean Owen <sr...@gmail.com> wrote:
>
> > There is a difference between the recommender and the similarity metric
> > it uses. My suggestion was to either use your item data with the
> > recommender and hobby data with the similarity metric, or use both in the
> > similarity metric by making a combined metric.
> >
> >
> > On Mon, Mar 18, 2013 at 9:44 AM, Agata Filiana <a.filiana87@gmail.com> wrote:
> >
> > > I understand how it works logically. However, I am having trouble
> > > understanding how to implement it and how to get the final outcome.
> > > Say the user's attribute is Hobbies: hobby1,hobby2,hobby3
> > > So I would make the similarity metric of the users and hobbies.
> > >
> > > Then for the CF, I would use Mahout's
> > > GenericBooleanPrefUserBasedRecommender with the boolean data set
> > > (userID and itemID).
> > >
> > > Then somehow combine the two?
> > >
> > > But in the end, my goal is to recommend the items in the second data
> > > set (the itemID, not the hobbies) - does this make sense? Or am I
> > > confusing myself?
> > >
> > > Agata
> > >
> > >
> > > On 18 March 2013 14:23, Sean Owen <sr...@gmail.com> wrote:
> > >
> > > > You would have to make up the similarity metric separately since it
> > > > depends entirely on how you want to define it.
> > > > The part of the book you are talking about concerns rescoring, which
> > > > is not the same thing.
> > > > Combine the similarity metrics, I mean, not make two recommenders.
> > > > Make a metric that is the product of two other metrics. Normalize
> > > > both of those metrics to the range [0,1].
> > > >
> > > > Sean
> > > >
> > > >
> > > > On Mon, Mar 18, 2013 at 6:51 AM, Agata Filiana <a.filiana87@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Thanks, Sean, for the response. I like the idea of multiplying the
> > > > > similarity metric based on user properties with the one based on CF
> > > > > data.
> > > > > I understand that I have to create a separate similarity metric - can
> > > > > I do this with the help of Mahout, or does this have to be done
> > > > > separately, as in I have to implement my own similarity measure? It
> > > > > would be great if there is some clue on how I get this started.
> > > > > Is this somehow similar to the subject of *Injecting domain-specific
> > > > > information* in the book Mahout in Action (with the example of the
> > > > > gender-based item similarity metric)?
> > > > >
> > > > > And also, how can I multiply the two results - will this affect the
> > > > > result of the evaluation of the recommender system? Or should it be
> > > > > normalized in some way?
> > > > >
> > > > > Thank you and sorry for the basic questions.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Agata Filiana
> > > > >
> > > > >
> > > > > On 16 March 2013 13:41, Sean Owen <sr...@gmail.com> wrote:
> > > > >
> > > > > > There are many ways to think about combining these two types of
> > > > > > data.
> > > > > >
> > > > > > If you can make some similarity metric based on age, gender and
> > > > > > interests, then you can use it as the similarity metric in
> > > > > > GenericBooleanPrefUserBasedRecommender. You would be using both
> > > > > > data sets in some way. Of course this means learning a whole
> > > > > > different similarity metric somehow. A variant on this is to make a
> > > > > > similarity metric based on user properties, and also use one based
> > > > > > on CF data, and multiply them together to make a new combined
> > > > > > similarity metric for this approach. This might work OK.
> > > > > >
> > > > > > It can also work to treat age and gender and other features as
> > > > > > categorical features, and then model them as 'items' that the user
> > > > > > interacts with. They would not have much of an effect here given how
> > > > > > many items there are. In other models like ALS-WR you can weight
> > > > > > these pseudo-items much more highly and get the desired effect to a
> > > > > > degree.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 15, 2013 at 4:37 PM, Agata Filiana <
> > > a.filiana87@gmail.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I'm fairly new to Mahout. Right now I am experimenting Mahout
> by
> > > > trying
> > > > > > to
> > > > > > > build a simple recommendation system. What I have is just a
> > boolean
> > > > > data
> > > > > > > set, with only the userID and itemID. I understand that for
> this
> > > > case I
> > > > > > > have to use GenericBooleanPrefUserBasedRecommender - which I
> have
> > > and
> > > > > > works
> > > > > > > fine.
> > > > > > >
> > > > > > > Apart from the userID and itemID data, I also have the user's
> > > > > attributes
> > > > > > > (their age, gender, list of interests). I would like to combine
> > > this
> > > > > into
> > > > > > > the recommendation system to increase the performance of the
> > > > > recommender.
> > > > > > > Is this possible to do or am I trying something that does not
> > make
> > > > > sense?
> > > > > > >
> > > > > > > It would be great if you can give me any inputs or ideas for
> > this.
> > > > (Or
> > > > > > any
> > > > > > > good read based on this matter)
> > > > > > >
> > > > > > > Thank you!
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > *Agata Filiana*
> > > > > > > Erasmus Mundus Student
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Agata Filiana
> > > > > *
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > *Agata Filiana
> > > *
> > >
> >
>
>
>
> --
> *Agata Filiana
> *
>

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

In this case, would it be correct to somehow "loop" through the item data
and the hobby data and then combine the scores for a pair of users?

I am having trouble working out how to combine both similarities into one
metric. Could you possibly give me a clue?

Thank you

On 18 March 2013 14:54, Sean Owen <sr...@gmail.com> wrote:

> There is a difference between the recommender and the similarity metric it
> uses. My suggestion was to either use your item data with the recommender
> and hobby data with the similarity metric, or, use both in the similarity
> metric by making a combined metric.



-- 
*Agata Filiana*

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

There is a difference between the recommender and the similarity metric it
uses. My suggestion was to either use your item data with the recommender
and hobby data with the similarity metric, or, use both in the similarity
metric by making a combined metric.
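
To make the "hobby data with the similarity metric" option concrete, here is a minimal sketch of one possible attribute-based similarity: the Jaccard (Tanimoto) coefficient over two users' hobby sets. The class name HobbySimilarity and the choice of Jaccard are assumptions made for illustration and are not taken from this thread. In Mahout, logic like this would sit inside your own implementation of the UserSimilarity interface (its userSimilarity(long, long) method), which is then passed to the neighborhood and to GenericBooleanPrefUserBasedRecommender, while the recommender itself keeps working on the userID/itemID data.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: similarity of two users based only on their hobby sets. */
final class HobbySimilarity {

    /**
     * Jaccard (Tanimoto) coefficient: size of the intersection divided by
     * size of the union. The result lies in [0,1]. Two empty sets are
     * treated as similarity 0 (an arbitrary choice for this sketch).
     */
    static double jaccard(Set<String> hobbiesA, Set<String> hobbiesB) {
        if (hobbiesA.isEmpty() && hobbiesB.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(hobbiesA);
        intersection.retainAll(hobbiesB);
        int unionSize = hobbiesA.size() + hobbiesB.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<String> userA = new HashSet<>(Arrays.asList("hobby1", "hobby2", "hobby3"));
        Set<String> userB = new HashSet<>(Arrays.asList("hobby2", "hobby3", "hobby4"));
        // Two shared hobbies out of four distinct ones.
        System.out.println(jaccard(userA, userB));
    }
}
```

Because the score is already in [0,1], it can be used directly or combined with a CF-based score without further rescaling.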


On Mon, Mar 18, 2013 at 9:44 AM, Agata Filiana <a....@gmail.com> wrote:

> I understand how it works logically. However, I am having trouble
> understanding how to implement it and how to get the final outcome.
> Say the user's attribute is Hobbies: hobby1,hobby2,hobby3
> So I would make a similarity metric from the users and their hobbies.
>
> Then for the CF, I would use Mahout's GenericBooleanPrefUserBasedRecommender
> with the boolean data set (userID and itemID).
>
> Then somehow combine the two?
>
> But in the end, my goal is to recommend the items in the second data set
> (the itemIDs, not the hobbies) - does this make sense? Or am I
> confusing myself?
>
> Agata

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

I understand how it works logically. However, I am having trouble
understanding how to implement it and how to get the final outcome.
Say the user's attribute is Hobbies: hobby1,hobby2,hobby3
So I would make a similarity metric from the users and their hobbies.

Then for the CF, I would use Mahout's GenericBooleanPrefUserBasedRecommender
with the boolean data set (userID and itemID).

Then somehow combine the two?

But in the end, my goal is to recommend the items in the second data set
(the itemIDs, not the hobbies) - does this make sense? Or am I
confusing myself?

Agata


On 18 March 2013 14:23, Sean Owen <sr...@gmail.com> wrote:

> You would have to make up the similarity metric separately since it depends
> entirely on how you want to define it.
> The part of the book you are talking about concerns rescoring, which is not
> the same thing.
> Combine the similarity metrics, I mean, not make two recommenders. Make a
> metric that is the product of two other metrics. Normalize both of those
> metrics to the range [0,1].
>
> Sean



-- 
*Agata Filiana*

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

You would have to make up the similarity metric separately since it depends
entirely on how you want to define it.
The part of the book you are talking about concerns rescoring, which is not
the same thing.
Combine the similarity metrics, I mean, not make two recommenders. Make a
metric that is the product of two other metrics. Normalize both of those
metrics to the range [0,1].

Sean
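
As an illustration of "a metric that is the product of two other metrics", here is a minimal sketch. It deliberately does not use Mahout's own types: PairSimilarity is a hypothetical stand-in interface, and ProductSimilarity, rescaleToUnitInterval and clamp01 are names invented for this example. The rescaling helper assumes a correlation-style score in [-1,1]; scores that already lie in [0,1] need no rescaling. In a real Mahout setup the same idea would be written as a class implementing UserSimilarity that delegates to two other similarities.

```java
/** Hypothetical stand-in for a user-user similarity; not Mahout's interface. */
interface PairSimilarity {
    double similarity(long userA, long userB);
}

/** Combines two similarities, each normalized to [0,1], by multiplying them. */
final class ProductSimilarity implements PairSimilarity {
    private final PairSimilarity cfSimilarity;        // e.g. based on shared items
    private final PairSimilarity attributeSimilarity; // e.g. based on hobbies

    ProductSimilarity(PairSimilarity cfSimilarity, PairSimilarity attributeSimilarity) {
        this.cfSimilarity = cfSimilarity;
        this.attributeSimilarity = attributeSimilarity;
    }

    /** Maps a score from [-1,1] (for example a correlation) onto [0,1]. */
    static double rescaleToUnitInterval(double score) {
        return (score + 1.0) / 2.0;
    }

    /** Guards against values that drift slightly outside [0,1]. */
    static double clamp01(double score) {
        return Math.max(0.0, Math.min(1.0, score));
    }

    @Override
    public double similarity(long userA, long userB) {
        double s1 = clamp01(cfSimilarity.similarity(userA, userB));
        double s2 = clamp01(attributeSimilarity.similarity(userA, userB));
        return s1 * s2;
    }

    public static void main(String[] args) {
        // Constant similarities, used only to show the arithmetic.
        PairSimilarity combined = new ProductSimilarity((a, b) -> 0.9, (a, b) -> 0.2);
        System.out.println(combined.similarity(1L, 2L)); // roughly 0.18
    }
}
```

Since both factors are at most 1, the product is never larger than the smaller of the two scores, so all combined similarities come out lower in absolute terms. For a neighborhood-based recommender what matters is how user pairs rank relative to each other.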


On Mon, Mar 18, 2013 at 6:51 AM, Agata Filiana <a....@gmail.com> wrote:

> Hi,
>
> Thanks, Sean, for the response. I like the idea of multiplying the
> similarity metric based on user properties with the one based on CF data.
> I understand that I have to create a separate similarity metric. Can I do
> this with the help of Mahout, or does this have to be done separately, as
> in I have to implement my own similarity measure? It would be great to
> have some clue on how to get started.
> Is this somehow similar to the subject of *Injecting domain-specific
> information* in the book Mahout in Action (with the example of the
> gender-based item similarity metric)?
>
> And also, how can I multiply the two results? Will this affect the result
> of the evaluation of the recommender system? Or should it be normalized in
> some way?
>
> Thank you and sorry for the basic questions.
>
> Regards,
>
> Agata Filiana

Re: Boosting User-Based with the user's attributes

Posted by Agata Filiana <a....@gmail.com>.

Hi,

Thanks, Sean, for the response. I like the idea of multiplying the
similarity metric based on user properties with the one based on CF data.
I understand that I have to create a separate similarity metric. Can I do
this with the help of Mahout, or does this have to be done separately, as
in I have to implement my own similarity measure? It would be great to
have some clue on how to get started.
Is this somehow similar to the subject of *Injecting domain-specific
information* in the book Mahout in Action (with the example of the
gender-based item similarity metric)?

And also, how can I multiply the two results? Will this affect the result
of the evaluation of the recommender system? Or should it be normalized in
some way?

Thank you and sorry for the basic questions.

Regards,

Agata Filiana


On 16 March 2013 13:41, Sean Owen <sr...@gmail.com> wrote:

> There are many ways to think about combining these two types of data.
>
> If you can make some similarity metric based on age, gender and interests,
> then you can use it as the similarity metric in
> GenericBooleanPrefUserBasedRecommender. You would be using both data sets
> in some way. Of course this means learning a whole different similarity
> metric somehow. A variant on this is to make a similarity metric based on
> user properties, and also use one based on CF data, and multiply them
> together to make a new combined similarity metric for this approach. This
> might work OK.
>
> It can also work to treat age and gender and other features as categorical
> features, and then model them as 'items' that the user interacts with. They
> would not have much of an effect here given how many items there are. In
> other models like ALS-WR you can weight these pseudo-items much more highly
> and get the desired effect to a degree.



-- 
*Agata Filiana*

Re: Boosting User-Based with the user's attributes

Posted by Sean Owen <sr...@gmail.com>.

There are many ways to think about combining these two types of data.

If you can make some similarity metric based on age, gender and interests,
then you can use it as the similarity metric in
GenericBooleanPrefUserBasedRecommender. You would be using both data sets
in some way. Of course this means learning a whole different similarity
metric somehow. A variant on this is to make a similarity metric based on
user properties, and also use one based on CF data, and multiply them
together to make a new combined similarity metric for this approach. This
might work OK.

It can also work to treat age and gender and other features as categorical
features, and then model them as 'items' that the user interacts with. They
would not have much of an effect here given how many items there are. In
other models like ALS-WR you can weight these pseudo-items much more highly
and get the desired effect to a degree.
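
The pseudo-item idea can be sketched as follows. This is only an illustration under stated assumptions: the class PseudoItems, the offset of 1,000,000,000 (assumed to be above every real item ID), the integer encoding of gender categories, and the ten-year age buckets are all invented for this example. The output lines follow the plain "userID,itemID" shape of a boolean data set, so they could be appended to the existing interaction data.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical encoder that turns user attributes into pseudo item IDs. */
final class PseudoItems {

    /** Assumption: every real item ID is smaller than this offset. */
    static final long PSEUDO_ITEM_OFFSET = 1_000_000_000L;

    /** One ID per (feature, category) pair, e.g. feature 0 = gender, 1 = age bucket. */
    static long categoricalItem(int featureIndex, int categoryIndex) {
        return PSEUDO_ITEM_OFFSET + featureIndex * 1000L + categoryIndex;
    }

    /** Groups ages into ten-year buckets: 20..29 -> 2, 30..39 -> 3, and so on. */
    static int ageBucket(int age) {
        return age / 10;
    }

    /** Builds "userID,itemID" lines for one user's gender and age. */
    static List<String> pseudoItemLines(long userId, int genderCategory, int age) {
        List<String> lines = new ArrayList<>();
        lines.add(userId + "," + categoricalItem(0, genderCategory));
        lines.add(userId + "," + categoricalItem(1, ageBucket(age)));
        return lines;
    }

    public static void main(String[] args) {
        // User 42, gender category 1, age 27.
        for (String line : pseudoItemLines(42L, 1, 27)) {
            System.out.println(line);
        }
    }
}
```

With only two or three pseudo-items per user next to many real items, their influence on a user-based similarity is small, which is the limitation described above. The pseudo-items would also need to be filtered out of the final recommendations, since the goal is to recommend real items.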

On Fri, Mar 15, 2013 at 4:37 PM, Agata Filiana <a....@gmail.com> wrote:

> Hi,
>
> I'm fairly new to Mahout. Right now I am experimenting Mahout by trying to
> build a simple recommendation system. What I have is just a boolean data
> set, with only the userID and itemID. I understand that for this case I
> have to use GenericBooleanPrefUserBasedRecommender - which I have and works
> fine.
>
> Apart from the userID and itemID data, I also have the user's attributes
> (their age, gender, list of interests). I would like to combine this into
> the recommendation system to increase the performance of the recommender.
> Is this possible to do or am I trying something that does not make sense?
>
> It would be great if you can give me any inputs or ideas for this. (Or any
> good read based on this matter)
>
> Thank you!
>
> Regards,
>
> *Agata Filiana*
> Erasmus Mundus Student
>