Posted to dev@mahout.apache.org by "Paulo Villegas (Commented) (JIRA)" <ji...@apache.org> on 2011/12/06 12:59:39 UTC

[jira] [Commented] (MAHOUT-898) Error in formula for preference estimation in GenericItemBasedRecommender

    [ https://issues.apache.org/jira/browse/MAHOUT-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163539#comment-13163539 ] 

Paulo Villegas commented on MAHOUT-898:
---------------------------------------

I sent the 'trivial' patch (taking absolute value) as an attachment above. I could do a similar quick fix for the GenericUserBasedRecommender.

The not-so-trivial patch (mean-centering the ratings before applying the formula) will take a little longer, since I'm still getting to grips with the code and working out where that change belongs.
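
For illustration only, here is a rough sketch of what mean centering could look like. It is not Mahout code: the class name, the method name and the choice of centering on a single mean rating passed in by the caller are all assumptions made for the example, and the numbers are made up.

```java
// Illustrative sketch only: names are hypothetical, not part of Mahout's API.
public class MeanCenteredSketch {

    // Estimate a preference from mean-centered ratings:
    //   mean + sum(sim_j * (pref_j - mean)) / sum(|sim_j|)
    // 'mean' is assumed to be supplied by the caller (e.g. the user's average rating).
    static double estimate(double[] sims, double[] prefs, double mean) {
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < sims.length; i++) {
            num += sims[i] * (prefs[i] - mean); // deviation from the mean, weighted by similarity
            den += Math.abs(sims[i]);           // normalize by absolute similarities
        }
        if (den == 0.0) {
            return mean; // no usable neighbours: fall back to the mean (an assumption)
        }
        return mean + num / den;
    }

    public static void main(String[] args) {
        double[] sims = {0.9, -0.8};  // made-up similarities in [-1, 1]
        double[] prefs = {4.0, 2.0};  // made-up ratings on a 1..5 scale
        double mean = 3.0;            // made-up mean rating
        System.out.println(estimate(sims, prefs, mean));
    }
}
```

With centering, a below-average rating on a negatively correlated item pushes the estimate above the mean instead of dragging it down, which is the point of doing this on top of the absolute-value fix.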

BTW, I hope to provide the promised precision and recall values for log-likelihood later today.
                
> Error in formula for preference estimation in GenericItemBasedRecommender
> -------------------------------------------------------------------------
>
>                 Key: MAHOUT-898
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-898
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>         Environment: mahout-core
>            Reporter: Paulo Villegas
>            Assignee: Sean Owen
>            Priority: Minor
>              Labels: patch
>         Attachments: GenericItemBasedRecommender.diff
>
>
> The formula that estimates the preference for an item in the Taste item-based recommender normalizes by the sum of the similarities of the items used in the estimation. The terms of that normalizing sum should be taken in absolute value, since similarities can be negative (e.g. with Pearson correlation, similarity is in [-1,1]). Currently they are not, so negative and positive similarities cancel out, giving a small denominator and incorrectly boosting the preference for the item (symptom: a predicted preference easily reaches the maximum value, since the quotient becomes large and is capped afterwards).
> The patch for src/main/java/org/apache/mahout/cf/taste/impl/recommender/GenericItemBasedRecommender.java is rather trivial (a one-liner, actually).
> Note: the same error occurs in GenericUserBasedRecommender, and the same fix applies there.
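
A minimal sketch of the effect described in the issue, for illustration only. It does not reproduce the Mahout source: the class and method names are hypothetical, and the similarities, ratings and the 1..5 rating scale are made-up values chosen so that the signed similarities nearly cancel.

```java
// Illustrative sketch only: names are hypothetical, not part of Mahout's API.
public class DenominatorSketch {

    // Weighted average normalized by the signed sum of similarities,
    // i.e. the behaviour the issue reports as incorrect.
    static double estimateSigned(double[] sims, double[] prefs) {
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < sims.length; i++) {
            num += sims[i] * prefs[i];
            den += sims[i]; // positive and negative terms can cancel out
        }
        return num / den;
    }

    // Same weighted average, normalized by the sum of absolute similarities.
    static double estimateAbs(double[] sims, double[] prefs) {
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < sims.length; i++) {
            num += sims[i] * prefs[i];
            den += Math.abs(sims[i]); // denominator can no longer shrink by cancellation
        }
        return num / den;
    }

    // Clamp an estimate to the rating scale, applied after the quotient is computed.
    static double cap(double value, double min, double max) {
        return Math.max(min, Math.min(max, value));
    }

    public static void main(String[] args) {
        double[] sims = {0.9, -0.8};  // made-up similarities in [-1, 1]
        double[] prefs = {4.0, 2.0};  // made-up ratings on a 1..5 scale

        // Signed denominator is about 0.1, so the quotient is roughly 20 and gets capped to the maximum.
        System.out.println(cap(estimateSigned(sims, prefs), 1.0, 5.0));
        // Absolute denominator is 1.7, so the quotient stays within the rating scale.
        System.out.println(cap(estimateAbs(sims, prefs), 1.0, 5.0));
    }
}
```

Neither method guards against a zero denominator; a real implementation would need to handle that case.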

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira