Posted to user@mahout.apache.org by Jamey Wood <ja...@gmail.com> on 2011/11/17 18:16:34 UTC

Weighting Preferences for Particular Items in Mahout?

Is there some way to weight particular preferences within Mahout?  For
example, suppose you were creating some kind of literature recommender that
uses a 5-star preference scale.  If you wanted to give double the weighting
to preferences for novels versus preferences for short stories, what would
be the best way to do it?

Thanks,
Jamey

Re: Weighting Preferences for Particular Items in Mahout?

Posted by Jamey Wood <ja...@gmail.com>.
Got it.  Thanks, Sean!

Re: Weighting Preferences for Particular Items in Mahout?

Posted by Sean Owen <sr...@gmail.com>.
Well, I think you could fit it inside some of the user-user similarities,
yes. For a Pearson correlation, you could count important items twice or
something like that. I wouldn't do it by literally adding more items to the
model, as that creates other problems. It's possible; it may or may not have
the type and magnitude of effect you're looking for, but it's easy enough to try.
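
Counting important items twice inside a Pearson correlation can be sketched
as a weighted correlation over the items both users have rated. This is a
minimal standalone illustration, not Mahout code: the class name
WeightedPearson, the method correlation, and the sample ratings are all
hypothetical, and wiring it into a custom UserSimilarity is left out.

```java
final class WeightedPearson {

    /**
     * Pearson correlation over co-rated items, where item i counts w[i] times.
     * x[i] and y[i] are the two users' ratings for the same item.
     * Returns NaN when either user's weighted variance is zero.
     */
    static double correlation(double[] x, double[] y, double[] w) {
        double sumW = 0, meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            sumW += w[i];
            meanX += w[i] * x[i];
            meanY += w[i] * y[i];
        }
        meanX /= sumW;
        meanY /= sumW;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += w[i] * dx * dy;
            varX += w[i] * dx * dx;
            varY += w[i] * dy * dy;
        }
        double denom = Math.sqrt(varX * varY);
        return denom == 0 ? Double.NaN : cov / denom;
    }

    public static void main(String[] args) {
        // Hypothetical ratings: index 0 is a novel, indexes 1 and 2 are short stories.
        double[] user1 = {3, 4, 5};
        double[] user2 = {1, 3, 2};
        double[] equalWeights = {1, 1, 1};
        double[] novelDoubled = {2, 1, 1};

        System.out.println(correlation(user1, user2, equalWeights));
        System.out.println(correlation(user1, user2, novelDoubled));
    }
}
```

With an integer weight of 2, the result matches what you would get by
duplicating the item's rating in both users' data, but the data model itself
stays untouched.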

Re: Weighting Preferences for Particular Items in Mahout?

Posted by Jamey Wood <ja...@gmail.com>.
I think that's certainly true for item-based recommenders (and item-item
similarity).  But isn't it a different story for user-user similarity?  In
the example below, "novel1" and "novel1-copy1" are indeed still separate
items--but won't they be separate items that produce duplicative forces
(and thus "weighting") in terms of the user-user similarity between user1
and user2?

I do realize that inflating the size of one's dataset in this way might
lead to other problems.  But setting that aside for now, I'd like to
understand whether or not it would produce this kind of weighting effect
for user-user similarities.

Thanks,
Jamey

Re: Weighting Preferences for Particular Items in Mahout?

Posted by Sean Owen <sr...@gmail.com>.
I don't think that would quite help, since novel1 and its copy are then
different items, and not somehow combining forces in the final calculation.

Re: Weighting Preferences for Particular Items in Mahout?

Posted by Jamey Wood <ja...@gmail.com>.
Thanks, Sean.  We'll look into that.

For user-based recommenders (or even just calculating UserSimilarity),
would it have the desired effect if we added multiple "virtual" preference
data points for the "real" items that we wished to more heavily weight?
For example, if our "real" preference data were:

  user1:novel1:3star
  user1:story1:4star
  user2:novel1:1star
  user2:story1:3star

Would transforming it into this have the desired weighting effect (as long
as we filtered out the "copy" items in any actual recommendations)?

  user1:novel1:3star
  user1:novel1-copy1:3star
  user1:story1:4star
  user2:novel1:1star
  user2:novel1-copy1:1star
  user2:story1:3star

The hope would be that "novel1" would now have twice the weight of
"story1" in determining the similarity of these two users.
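
To see what the duplication does for a measure that sums over co-rated
items, here is a small standalone sketch (not Mahout code; the class and
method names are hypothetical) that computes the sum of squared rating
differences between user1 and user2 for both datasets above. Note that on
this particular sample a Pearson correlation would be 1.0 either way, since
the two users only ever have two distinct data points in common.

```java
final class DuplicationEffect {

    /** Sum of squared differences between two users' ratings of the same items. */
    static double sumSquaredDiff(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    public static void main(String[] args) {
        // "Real" data, in the order novel1, story1.
        double[] user1 = {3, 4};
        double[] user2 = {1, 3};

        // Transformed data, in the order novel1, novel1-copy1, story1.
        double[] user1Dup = {3, 3, 4};
        double[] user2Dup = {1, 1, 3};

        // (3-1)^2 + (4-3)^2 = 5
        System.out.println(sumSquaredDiff(user1, user2));
        // (3-1)^2 + (3-1)^2 + (4-3)^2 = 9: the novel term is counted twice
        System.out.println(sumSquaredDiff(user1Dup, user2Dup));
    }
}
```

So for a distance-style user-user measure the copy does act like a weight of
2 on "novel1"; whether it changes a correlation-style measure depends on the
data.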

Thanks,
Jamey

Re: Weighting Preferences for Particular Items in Mahout?

Posted by Sean Owen <sr...@gmail.com>.
Not directly, but you could modify an item-based recommender to do so.
Where it uses an item-item similarity as a weight in a weighted average,
you could modify the weight however you like depending on the types of the
two items.
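
The weighted-average idea can be sketched roughly as follows. This is a
standalone illustration under stated assumptions, not Mahout's recommender
code: the class TypeWeightedEstimate, the typeWeight multiplier (novels
count double), and the sample numbers are all hypothetical, and for
simplicity the multiplier here depends only on the type of the item the
user has already rated, not on both items.

```java
final class TypeWeightedEstimate {

    /** Hypothetical multiplier: preferences for novels count double. */
    static double typeWeight(String itemType) {
        return "novel".equals(itemType) ? 2.0 : 1.0;
    }

    /**
     * Estimates a user's preference for a target item as a weighted average
     * of the user's existing ratings. ratings[i] is the rating of item i,
     * similarities[i] is the item-item similarity between item i and the
     * target, and types[i] is item i's type. Returns NaN if no weight is left.
     */
    static double estimate(double[] ratings, double[] similarities, String[] types) {
        double weightedSum = 0;
        double totalWeight = 0;
        for (int i = 0; i < ratings.length; i++) {
            // The plain weight would be the similarity alone; scale it by type.
            double weight = similarities[i] * typeWeight(types[i]);
            weightedSum += weight * ratings[i];
            totalWeight += Math.abs(weight);
        }
        return totalWeight == 0 ? Double.NaN : weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // The user rated a novel 3 stars and a short story 4 stars;
        // both are assumed equally similar (0.5) to the target item.
        double[] ratings = {3, 4};
        double[] similarities = {0.5, 0.5};
        String[] types = {"novel", "story"};
        System.out.println(estimate(ratings, similarities, types));
    }
}
```

Without the multiplier the estimate in this example would be the plain
average of 3 and 4; with it, the estimate is pulled toward the novel's
rating.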
