Posted to user@mahout.apache.org by Chris Schilling <ch...@cellixis.com> on 2011/02/19 06:36:30 UTC

Re: user-user recommendations (Implementation)

Hi again!

I was able to "hijack" the GenericUserBasedRecommender class in my code to produce recommendations based on the count rather than the weighted average. To do this I had to copy both the AbstractRecommender and the GUBR classes into my own code and then change the implementation of the Evaluator in GUBR. I am reasonably new to Java, so there may have been a better way, but this seemed to be the quickest solution. One reason I thought this was necessary is that the constructors in AbstractRecommender are protected, which makes it difficult (impossible?) to extend this class outside of the impl.recommender package within Mahout. I did not want to change the Mahout code and recompile.

Now, unless I am wrong, I can see two ways of improving (?) the implementation of the Generic*BasedRecommenders. One would be to add a constructor that accepts an Evaluator. The other would be to make the constructors/methods in AbstractRecommender public, so that I could extend this class in my own code. I understand that Mahout was built to be somewhat standalone. However, there are cases when it would be nice to personalize the evaluation outside of Mahout. So maybe the ability to pass an implementation of Evaluator to the constructor is the better option.
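To make the first option concrete, here is a minimal sketch of a recommender that takes its scoring logic through the constructor. It is not Mahout code: ItemEstimator, SimpleRecommender, and the item counts in main are hypothetical names and values made up for illustration, and the real Mahout classes may be structured differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class PluggableRecommenderSketch {

    /** Hypothetical scoring hook (an assumption, not Mahout's interface). */
    interface ItemEstimator {
        double estimate(long itemID);
    }

    /** Hypothetical recommender that ranks candidates with whatever estimator it is given. */
    static final class SimpleRecommender {
        private final ItemEstimator estimator;

        // The estimator is injected, so callers can swap the scoring rule
        // without copying or subclassing the recommender.
        SimpleRecommender(ItemEstimator estimator) {
            this.estimator = estimator;
        }

        /** Returns up to howMany candidate item IDs, highest estimate first. */
        List<Long> recommend(List<Long> candidates, int howMany) {
            List<Long> sorted = new ArrayList<>(candidates);
            sorted.sort(Comparator
                    .comparingDouble((Long id) -> estimator.estimate(id))
                    .reversed());
            return new ArrayList<>(sorted.subList(0, Math.min(howMany, sorted.size())));
        }
    }

    public static void main(String[] args) {
        // Made-up data: how many neighbors rated each item.
        Map<Long, Integer> ratingCounts = Map.of(1L, 1, 2L, 4, 3L, 2);

        // Count-based scoring, passed in from outside the recommender.
        SimpleRecommender byCount = new SimpleRecommender(
                id -> ratingCounts.getOrDefault(id, 0).doubleValue());

        System.out.println(byCount.recommend(List.of(1L, 2L, 3L), 2));
    }
}
```

With this shape, switching from a count to some other score only means passing a different lambda to the constructor.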

On a slightly different yet related note, I think a better metric than the count (for user-based) would be the sum of the similarities of the neighbors who have rated the item. Again, I think the weighted average, sum(rating*similarity)/sum(similarity), shows the problem noted in my previous posts.
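A small worked comparison of the three scores discussed in this thread may help. The similarities and ratings below are invented for illustration, and the class and method names are hypothetical; this is a sketch of the arithmetic, not Mahout's implementation.

```java
public class NeighborhoodScores {

    /** Weighted average: sum(rating * similarity) / sum(similarity). */
    static double weightedAverage(double[] sims, double[] ratings) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < sims.length; i++) {
            numerator += sims[i] * ratings[i];
            denominator += sims[i];
        }
        // No usable neighbors means no estimate.
        return denominator == 0.0 ? Double.NaN : numerator / denominator;
    }

    /** Count: number of neighbors who rated the item. */
    static int count(double[] sims) {
        return sims.length;
    }

    /** Sum of similarities of the neighbors who rated the item. */
    static double similaritySum(double[] sims) {
        double total = 0.0;
        for (double s : sims) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up "obscure" item: a single, very similar neighbor rated it 5.
        double[] obscureSims = {0.9};
        double[] obscureRatings = {5.0};

        // Made-up "popular" item: four neighbors rated it around 4.
        double[] popularSims = {0.9, 0.8, 0.7, 0.6};
        double[] popularRatings = {4.0, 4.0, 3.0, 4.0};

        System.out.printf("weighted average: obscure=%.3f popular=%.3f%n",
                weightedAverage(obscureSims, obscureRatings),
                weightedAverage(popularSims, popularRatings));
        System.out.printf("count:            obscure=%d popular=%d%n",
                count(obscureSims), count(popularSims));
        System.out.printf("similarity sum:   obscure=%.3f popular=%.3f%n",
                similaritySum(obscureSims), similaritySum(popularSims));
    }
}
```

In this toy data the weighted average puts the obscure item first because its only rater gave it the top rating, while both the count and the similarity sum favor the item that more neighbors rated.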

Again, any thoughts are appreciated :)

Thank you for this beautiful framework!  I have been really enjoying all the discussion and learning the intricacies of recommendation engines (and ML in general).
Chris

On Feb 18, 2011, at 5:29 PM, Chris Schilling wrote:

> So, I've been thinking about this a bit more.
> 
> Take an example: I have rated a very small number of items. I am able to extract a neighborhood of similar users. Now let's say there is a single user who has rated the same items with the same rating, but this user is the only rater in my neighborhood who has rated an obscure item very highly. In the case using a weighted average to predict my recommendations, this obscure item would rise to the top of the list. In this case, it seems like the items rated most often would be better recommendations.
> 
> I was able to hijack the GenericUserBasedRecommender and change the calculation of the preference to return the count rather than the weighted average. In my case, this seems to return more intuitive results.
> 
> Again this is related to the sparseness of the data, but I could see this type of thing occurring often. Any thoughts?
> 
> 
> On Feb 18, 2011, at 3:43 PM, Chris Schilling wrote:
> 
>> Hello again,
>> 
>> Very simple question here: I am also testing the user-user CF in Mahout. Once I define my user neighborhood, is it possible to select the recommendations from it based on the number of preferences per item rather than a weighted average? Basically, I'd like to recommend the items with the most preferences. It would be simple to implement, so I was curious whether this was already possible. I understand that in this case the counts become dependent on the size of the neighborhood. This is something I'd want to use for testing.
>> 
>> Thanks
>> Chris
> 


Re: user-user recommendations (Implementation)

Posted by Chris Schilling <ch...@cellixis.com>.
Okay, 

I was wrong about the protected constructors in AbstractRecommender. I am able to extend that class in my code without a problem. Sorry for the noise there. Perhaps a constructor that accepts a personalized Evaluator class would still add to the modularity.

