Posted to dev@mahout.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2010/06/01 15:23:37 UTC
[jira] Commented: (MAHOUT-407) Limit the number of similar items per item in the ItemSimilarityJob
[ https://issues.apache.org/jira/browse/MAHOUT-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873988#action_12873988 ]
Hudson commented on MAHOUT-407:
-------------------------------
Integrated in Mahout-Quality #42 (See [http://hudson.zones.apache.org/hudson/job/Mahout-Quality/42/])
MAHOUT-407 also make similar options for item similarity configurable in recommender
> Limit the number of similar items per item in the ItemSimilarityJob
> -------------------------------------------------------------------
>
> Key: MAHOUT-407
> URL: https://issues.apache.org/jira/browse/MAHOUT-407
> Project: Mahout
> Issue Type: New Feature
> Components: Collaborative Filtering
> Reporter: Sebastian Schelter
>
> To keep the item-similarity matrix sparse, it would be useful to add an option like "maxSimilaritiesPerItem" to o.a.m.cf.taste.hadoop.similarity.item.ItemSimilarityJob that makes the job try to cap the number of similar items per item.
> However, because each similarity pair is stored only once, a single item can still end up with more than "maxSimilaritiesPerItem" similar items: a pair cannot be dropped if the other item in the pair would otherwise be left with too few similarities.
> A default value of 100 co-occurrences (similarities) will be used because this is already the default in the distributed recommender.
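The capping behaviour described in the issue can be illustrated with a small in-memory sketch. This is not the ItemSimilarityJob implementation (which runs as Hadoop MapReduce); it only assumes one plausible rule, namely that a pair is kept if it ranks among the top "maxSimilaritiesPerItem" pairs of at least one of its two items. The class name SimilarityCapSketch, the Pair type, and the cap method are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SimilarityCapSketch {

  /** One unordered item pair, stored only once (hypothetical type). */
  public static final class Pair {
    public final long itemA;
    public final long itemB;
    public final double similarity;

    public Pair(long itemA, long itemB, double similarity) {
      this.itemA = itemA;
      this.itemB = itemB;
      this.similarity = similarity;
    }
  }

  /**
   * Keeps a pair if it is among the top maxSimilaritiesPerItem pairs of
   * at least one of its two items (assumed rule, see lead-in).
   */
  public static List<Pair> cap(List<Pair> pairs, int maxSimilaritiesPerItem) {
    // Index every pair under both of its items.
    Map<Long, List<Pair>> byItem = new HashMap<>();
    for (Pair p : pairs) {
      byItem.computeIfAbsent(p.itemA, k -> new ArrayList<>()).add(p);
      byItem.computeIfAbsent(p.itemB, k -> new ArrayList<>()).add(p);
    }

    // For each item, mark its most similar pairs as kept.
    Set<Pair> kept = new HashSet<>();
    for (List<Pair> candidates : byItem.values()) {
      candidates.sort((x, y) -> Double.compare(y.similarity, x.similarity));
      int limit = Math.min(maxSimilaritiesPerItem, candidates.size());
      kept.addAll(candidates.subList(0, limit));
    }

    // Return the surviving pairs in their original order.
    List<Pair> result = new ArrayList<>();
    for (Pair p : pairs) {
      if (kept.contains(p)) {
        result.add(p);
      }
    }
    return result;
  }
}
```

With the pairs (1,2)=0.9, (1,3)=0.8, (1,4)=0.7, (2,3)=0.1 and a cap of 1, the pair (2,3) is dropped, but item 1 still keeps three similar items because items 3 and 4 would otherwise have none. This is the overshoot the description refers to.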
--
This message is automatically generated by JIRA.