Posted to dev@mahout.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2010/06/01 15:23:37 UTC

[jira] Commented: (MAHOUT-407) Limit the number of similar items per item in the ItemSimilarityJob

    [ https://issues.apache.org/jira/browse/MAHOUT-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873988#action_12873988 ] 

Hudson commented on MAHOUT-407:
-------------------------------

Integrated in Mahout-Quality #42 (See [http://hudson.zones.apache.org/hudson/job/Mahout-Quality/42/])
    MAHOUT-407 also make similar options for item similarity configurable in recommender


> Limit the number of similar items per item in the ItemSimilarityJob
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-407
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-407
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Sebastian Schelter
>
> In order to keep the item-similarity-matrix sparse, it would be a useful improvement to add an option like "maxSimilaritiesPerItem" to o.a.m.cf.taste.hadoop.similarity.item.ItemSimilarityJob, which would make it try to cap the number of similar items per item.
> However, because we store each similarity pair only once, a single item can still end up with more than "maxSimilaritiesPerItem" similar items: we can't drop a pair if the other item in that pair would otherwise be left with too few similarities.
> A default value of 100 co-occurrences (similarities) will be used because this is already the default in the distributed recommender.
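
The capping rule described in the issue can be illustrated with a small in-memory sketch. This is not the ItemSimilarityJob code (which runs as Hadoop MapReduce); the class SimilarityCapSketch, its Pair type and the cap method are hypothetical names made up for this example. It assumes that a pair is kept whenever it ranks among the top "maxSimilaritiesPerItem" pairs of at least one of its two items, which is one way to read the behaviour described above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Mahout.
final class SimilarityCapSketch {

    /** One unordered item pair, stored only once. */
    static final class Pair {
        final long itemA;
        final long itemB;
        final double similarity;

        Pair(long itemA, long itemB, double similarity) {
            this.itemA = itemA;
            this.itemB = itemB;
            this.similarity = similarity;
        }

        @Override
        public String toString() {
            return "(" + itemA + "," + itemB + "," + similarity + ")";
        }
    }

    /** Keeps a pair if it is among the top-N pairs of at least one of its items (assumed rule). */
    static List<Pair> cap(List<Pair> pairs, int maxSimilaritiesPerItem) {
        // Collect, for each item, every pair it takes part in.
        Map<Long, List<Pair>> byItem = new HashMap<>();
        for (Pair p : pairs) {
            byItem.computeIfAbsent(p.itemA, k -> new ArrayList<>()).add(p);
            byItem.computeIfAbsent(p.itemB, k -> new ArrayList<>()).add(p);
        }

        // Mark the N most similar pairs of every item as kept.
        // Pair does not override equals, so the set compares by identity.
        Set<Pair> kept = new HashSet<>();
        for (List<Pair> candidates : byItem.values()) {
            candidates.sort(Comparator.comparingDouble((Pair p) -> p.similarity).reversed());
            int n = Math.min(maxSimilaritiesPerItem, candidates.size());
            kept.addAll(candidates.subList(0, n));
        }

        // Return the surviving pairs in their original order.
        List<Pair> result = new ArrayList<>();
        for (Pair p : pairs) {
            if (kept.contains(p)) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Pair> pairs = new ArrayList<>();
        pairs.add(new Pair(1, 2, 0.9));
        pairs.add(new Pair(1, 3, 0.8));
        pairs.add(new Pair(1, 4, 0.7));
        pairs.add(new Pair(2, 3, 0.95));
        System.out.println(cap(pairs, 1));
    }
}
```

With a cap of 1 on this sample data, the pair (1,3) is dropped because both of its items have a stronger pair elsewhere. The pair (1,4) survives because it is the only similarity item 4 has, so item 1 still ends up with two similar items (2 and 4), more than the cap allows.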

-- 
This message is automatically generated by JIRA.