Posted to dev@mahout.apache.org by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org> on 2011/11/20 11:53:53 UTC

[jira] [Commented] (MAHOUT-827) Another version of RecommenderJob that broadcasts the similarity matrix

    [ https://issues.apache.org/jira/browse/MAHOUT-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153762#comment-13153762 ] 

jiraposter@reviews.apache.org commented on MAHOUT-827:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2887/
-----------------------------------------------------------

Review request for mahout.


Summary
-------

RecommenderJob now supports an option called "broadcast" that determines whether recommendations are computed with a reduce-side join (the current approach) or a broadcast join (a new, faster approach that is applicable as long as the similarity matrix fits into the memory of a mapper instance).
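
To illustrate the idea behind the broadcast join, here is a minimal sketch in plain Java. It is not the code from the patch: the class BroadcastJoinSketch and its recommend method are hypothetical names, no Hadoop API is used, and the similarity-weighted average used for scoring is an assumption (the estimators in the patch may differ). In the actual job, the in-memory map would presumably be filled once per mapper from the broadcast similarity matrix, and each call below would correspond to processing a single user vector on the map side, with no shuffle or reducer involved.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a map-side (broadcast) join; not taken from the patch. */
public class BroadcastJoinSketch {

  // Stand-in for the similarity matrix that each mapper holds in memory:
  // itemID -> (similar itemID -> similarity value).
  private final Map<Long, Map<Long, Double>> similarities;

  public BroadcastJoinSketch(Map<Long, Map<Long, Double>> similarities) {
    this.similarities = similarities;
  }

  /**
   * Handles one user, as a map-only task would. Items the user has not rated
   * are scored with a similarity-weighted average of the user's preferences
   * (an assumed estimator), and the top howMany items are returned.
   */
  public List<Map.Entry<Long, Double>> recommend(Map<Long, Double> userPrefs, int howMany) {
    Map<Long, Double> numerators = new HashMap<>();
    Map<Long, Double> denominators = new HashMap<>();

    for (Map.Entry<Long, Double> pref : userPrefs.entrySet()) {
      // The "join": a local lookup instead of a shuffle to a reducer.
      Map<Long, Double> row = similarities.get(pref.getKey());
      if (row == null) {
        continue;
      }
      for (Map.Entry<Long, Double> sim : row.entrySet()) {
        long candidate = sim.getKey();
        if (userPrefs.containsKey(candidate)) {
          continue; // never recommend items the user already knows
        }
        numerators.merge(candidate, sim.getValue() * pref.getValue(), Double::sum);
        denominators.merge(candidate, Math.abs(sim.getValue()), Double::sum);
      }
    }

    List<Map.Entry<Long, Double>> scored = new ArrayList<>();
    for (Map.Entry<Long, Double> e : numerators.entrySet()) {
      double denominator = denominators.get(e.getKey());
      if (denominator > 0.0) {
        scored.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue() / denominator));
      }
    }
    // Highest estimated preference first.
    scored.sort(Map.Entry.<Long, Double>comparingByValue().reversed());
    return new ArrayList<>(scored.subList(0, Math.min(howMany, scored.size())));
  }
}
```

The sketch also shows where the memory constraint comes from: every mapper needs the whole similarity structure available for lookups, whereas a reduce-side join only ever brings together the rows that belong to one key.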


This addresses bug MAHOUT-827.
    https://issues.apache.org/jira/browse/MAHOUT-827


Diffs
-----

  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/TasteHadoopUtils.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/ItemIDIndexReducer.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/PartialMultiplyMapper.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/UserVectorSplitterMapper.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/broadcast/Estimators.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/broadcast/RecommendationsPerUserMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/broadcast/SimilarityMatrixIterator.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/preparation/PreparePreferenceMatrixJob.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/cf/taste/impl/similarity/GenericItemSimilarity.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/common/AbstractJob.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/common/iterator/sequencefile/SequenceFileDirIterator.java 1204135 
  trunk/core/src/main/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJob.java 1204135 
  trunk/core/src/test/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJobTest.java 1204135 
  trunk/core/src/test/java/org/apache/mahout/cf/taste/hadoop/item/broadcast/SimilarityMatrixIteratorTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/math/hadoop/MathHelper.java 1204135 

Diff: https://reviews.apache.org/r/2887/diff


Testing
-------


Thanks,

Sebastian


                
> Another version of RecommenderJob that broadcasts the similarity matrix
> -----------------------------------------------------------------------
>
>                 Key: MAHOUT-827
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-827
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>    Affects Versions: 0.6
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>         Attachments: MAHOUT-827-2.patch, MAHOUT-827-3.patch, MAHOUT-827.patch
>
>
> Add another version of RecommenderJob that computes the item similarities via RowSimilarityJob but assumes that the resulting similarity matrix fits into the memory of the mappers in the cluster. Once the item similarities have been computed, they are broadcast via Hadoop's distributed cache, and the recommendations are then computed in a map-only pass over the data.
