Posted to dev@mahout.apache.org by "Sebastian Schelter (JIRA)" <ji...@apache.org> on 2010/08/15 20:26:17 UTC

[jira] Commented: (MAHOUT-460) Add "maxPreferencesPerItemConsidered" option to o.a.m.cf.taste.hadoop.similarity.item.ItemSimilarityJob

    [ https://issues.apache.org/jira/browse/MAHOUT-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898726#action_12898726 ] 

Sebastian Schelter commented on MAHOUT-460:
-------------------------------------------

The goal of this issue is to introduce a limit on the number of cooccurrences per item, so that the runtime of ItemSimilarityJob and RecommenderJob is linear in the size of the input and does not depend on the maximum number of preferences per item.

This should be OK to do because you don't really learn anything new about an item after seeing a certain number of preferences, so it should be sufficient to look at no more than a fixed number of them per item.
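
To make the idea concrete, here is a minimal sketch of such a cap. It is not the code from the attached patch: the class PreferenceCap and its methods are hypothetical names, and uniform random sampling is only one assumed way of choosing which preferences to keep.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Mahout: caps the preferences considered for one item. */
final class PreferenceCap {

  private PreferenceCap() {}

  /**
   * Returns at most maxPrefs of the given user IDs (the users who expressed a
   * preference for a single item), chosen uniformly at random.
   * Sampling is an assumption of this sketch; the input list is not modified.
   */
  static List<Long> cap(List<Long> userIds, int maxPrefs, Random random) {
    if (maxPrefs < 0) {
      throw new IllegalArgumentException("maxPrefs must be >= 0");
    }
    if (userIds.size() <= maxPrefs) {
      return new ArrayList<>(userIds); // already under the cap, keep everything
    }
    List<Long> shuffled = new ArrayList<>(userIds);
    Collections.shuffle(shuffled, random);
    return new ArrayList<>(shuffled.subList(0, maxPrefs));
  }

  /** Number of unordered pairs that can be formed among n entries: n * (n - 1) / 2. */
  static long pairCount(long n) {
    return n * (n - 1) / 2;
  }
}
```

For an item with 1,000 preferences and a cap of 100, pairCount drops from 499,500 to 4,950. More generally, with a cap of m each item contributes at most m * (m - 1) / 2 pairs, which is at most (m - 1) / 2 pairs per preference it has, so the total work grows linearly with the number of preferences in the input.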

> Add "maxPreferencesPerItemConsidered" option to o.a.m.cf.taste.hadoop.similarity.item.ItemSimilarityJob
> -------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-460
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-460
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>            Reporter: Sebastian Schelter
>         Attachments: MAHOUT-460.patch
>
>
> Because "coocurrence algorithms ... scale in the square of the number of occurrences most popular item" (Ted wrote that in a recent mail) we should offer a parameter to the ItemSimilarity job that makes it limit the number of considered preferences per item. RecommenderJob already has such an option.
