Posted to dev@mahout.apache.org by "Jonathan Young (JIRA)" <ji...@apache.org> on 2010/06/22 16:30:55 UTC
[jira] Updated: (MAHOUT-423) Optimize getNumUsersWithPreferenceFor(long... itemIDs)
[ https://issues.apache.org/jira/browse/MAHOUT-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Young updated MAHOUT-423:
----------------------------------
Attachment: MAHOUT-423.patch
This patch is against trunk and optimizes two special cases: itemIDs.length == 1 (don't create the intersection set; just return the number of preferences) and itemIDs.length == 2 (don't create the intersection set; use the existing set and the fast intersectionSize() function on FastIDSet).
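For readers without the attachment at hand, the two special cases can be sketched roughly as follows. This is not the patch itself, only a minimal illustration under stated assumptions: the class BooleanPrefCounts, its usersByItem map, and the addPreference() helper are hypothetical stand-ins, and java.util.Set<Long> is used in place of FastIDSet so the example runs without Mahout on the classpath. The intersectionSize() helper below only mimics the idea of counting common IDs without building a new set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a boolean-preference data model; not Mahout code.
class BooleanPrefCounts {

    // Assumed index: item ID -> IDs of the users who expressed a preference for it.
    private final Map<Long, Set<Long>> usersByItem = new HashMap<>();

    void addPreference(long userID, long itemID) {
        usersByItem.computeIfAbsent(itemID, k -> new HashSet<>()).add(userID);
    }

    int getNumUsersWithPreferenceFor(long... itemIDs) {
        if (itemIDs.length == 0) {
            return 0;
        }
        Set<Long> first = usersByItem.get(itemIDs[0]);
        if (first == null) {
            return 0;
        }
        // Special case 1: a single item needs no intersection set at all.
        if (itemIDs.length == 1) {
            return first.size();
        }
        // Special case 2: two items; count common users without allocating a set.
        if (itemIDs.length == 2) {
            Set<Long> second = usersByItem.get(itemIDs[1]);
            return second == null ? 0 : intersectionSize(first, second);
        }
        // General case: build the intersection set, stopping early once it is empty.
        Set<Long> intersection = new HashSet<>(first);
        for (int i = 1; i < itemIDs.length && !intersection.isEmpty(); i++) {
            Set<Long> next = usersByItem.get(itemIDs[i]);
            if (next == null) {
                return 0;
            }
            intersection.retainAll(next);
        }
        return intersection.size();
    }

    // Counts IDs present in both sets by probing the larger set with each
    // element of the smaller one; no intermediate set is created.
    static int intersectionSize(Set<Long> a, Set<Long> b) {
        Set<Long> small = a.size() <= b.size() ? a : b;
        Set<Long> large = (small == a) ? b : a;
        int count = 0;
        for (Long id : small) {
            if (large.contains(id)) {
                count++;
            }
        }
        return count;
    }
}
```

The two-item branch is the one that should matter for pairwise item similarity, since it avoids allocating a temporary set on every call.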
> Optimize getNumUsersWithPreferenceFor(long... itemIDs)
> ------------------------------------------------------
>
> Key: MAHOUT-423
> URL: https://issues.apache.org/jira/browse/MAHOUT-423
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Affects Versions: 0.3
> Reporter: Jonathan Young
> Attachments: MAHOUT-423.patch
>
>
> I ran a simple collaborative filtering application using a GenericBooleanPrefDataModel built from (a subset of) the Netflix data, Tanimoto similarity, and the GenericItemBasedRecommender, and then called recommender.mostSimilarItems() (a lot).
> Profiling indicated that the majority of the time was spent in GenericBooleanPrefDataModel.getNumUsersWithPreferenceFor(long... itemIDs). The version in GenericDataModel is optimized for the cases of one and two itemIDs, but the version in GenericBooleanPrefDataModel always computes the intersection set.
> I can create a patch which optimizes the two cases of itemIDs.length == 1 and itemIDs.length == 2 (similar to the version in GenericDataModel), but perhaps the code should be refactored if these are really the most common cases.