Posted to dev@mahout.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2013/04/02 08:57:15 UTC

[jira] [Resolved] (MAHOUT-1185) MemoryDiffStorage.class has a bug for slope one algorithm which could cause incorrect recommendation results

     [ https://issues.apache.org/jira/browse/MAHOUT-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-1185.
-------------------------------

       Resolution: Not A Problem
    Fix Version/s:     (was: 0.8)

Diffs are computed over all pairs of prefs in the list. The diff of a pref and itself is 0 and there is no point in computing this. Thus the outer loop is from i=0 to length-2, and the inner loop is from j=i+1 to length-1. It is correct as-is.

This patch would not only cause an exception (it dereferences the array at index length, which is out of bounds) but would also incorrectly add a non-zero diff consisting of just the item's pref value. There's no "diff" there; it's an absolute value.
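
For illustration only, the loop bounds described above can be sketched as follows. This is not the MemoryDiffStorage source: the class name PairwiseDiffSketch, the method pairwiseDiffs, and the use of a plain float[] in place of Mahout's preference array are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Mahout code: shows the i / j bounds only.
public class PairwiseDiffSketch {

    /** Returns prefs[j] - prefs[i] for every pair of indices with i < j. */
    static List<Float> pairwiseDiffs(float[] prefs) {
        List<Float> diffs = new ArrayList<>();
        int length = prefs.length;
        // Outer loop runs i = 0 .. length-2: the last element has no later
        // partner, so starting a pair from it would add nothing.
        for (int i = 0; i < length - 1; i++) {
            // Inner loop runs j = i+1 .. length-1: a pref is never paired
            // with itself, so the trivial zero diff is never computed.
            for (int j = i + 1; j < length; j++) {
                diffs.add(prefs[j] - prefs[i]);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        float[] prefs = {2.0f, 3.0f, 5.0f};
        // Three prefs give three pairs: (0,1), (0,2), (1,2).
        System.out.println(pairwiseDiffs(prefs));
    }
}
```

With n prefs the loops visit n*(n-1)/2 pairs, and the last element still takes part in a pair with every earlier element through the inner index j. Extending either loop to index length would read past the end of the array.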
                
> MemoryDiffStorage.class has a bug for slope one algorithm which could cause incorrect recommendation results
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1185
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1185
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.7
>         Environment: Ubuntu
>            Reporter: Cunlu Zou
>            Assignee: Sean Owen
>              Labels: patch
>         Attachments: MemoryDiffStorage.patch
>
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> The function processOneUser(long averageCount, long userID) in the MemoryDiffStorage.class file contains a bug in calculating the itemAverage. The function calculates both the average difference among items (in a nested loop) and the average individual item preference value in the same outer loop, which only runs from 0 to length-2 (*for (int i = 0; i < length - 1; i++)*). As a result, the itemAverage variable does not count the last item's preference value for any user, which could lead to incorrect recommendation results.
