Posted to dev@mahout.apache.org by "Sebastian Schelter (JIRA)" <ji...@apache.org> on 2013/07/30 07:15:49 UTC

[jira] [Resolved] (MAHOUT-1289) Move downsampling code into RowSimilarityJob

     [ https://issues.apache.org/jira/browse/MAHOUT-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter resolved MAHOUT-1289.
----------------------------------------

    Resolution: Fixed
    
> Move downsampling code into RowSimilarityJob
> --------------------------------------------
>
>                 Key: MAHOUT-1289
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1289
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>             Fix For: 0.9
>
>
> When computing similarities with RowSimilarityJob, downsampling highly frequent things is crucial for performance. At the moment, this is done by the data preparation code for collaborative filtering.
> We should move the downsampling directly into RowSimilarityJob, as we've seen many cases where users want to use the job directly.
> Furthermore, it should be possible to fix the random seed used for the sampling, so that experiments are repeatable.
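
To illustrate the request for a fixed random seed, here is a minimal sketch of seeded downsampling. It is not Mahout's code: the class DownsampleSketch, the method downsample and the parameters maxObservations and seed are hypothetical names, and reservoir sampling is used here only as one simple way to cap the number of observations kept per row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Mahout or RowSimilarityJob.
final class DownsampleSketch {

    /**
     * Keeps at most maxObservations elements of a row, chosen uniformly at
     * random with reservoir sampling. The same seed and the same input
     * always produce the same sample.
     */
    static List<Integer> downsample(List<Integer> row, int maxObservations, long seed) {
        if (maxObservations < 0) {
            throw new IllegalArgumentException("maxObservations must be >= 0");
        }
        Random random = new Random(seed); // fixed seed makes the run repeatable
        List<Integer> sample = new ArrayList<>(Math.min(row.size(), maxObservations));
        for (int i = 0; i < row.size(); i++) {
            if (i < maxObservations) {
                // Fill the reservoir with the first maxObservations elements.
                sample.add(row.get(i));
            } else {
                // Replace a reservoir slot with probability maxObservations / (i + 1).
                int j = random.nextInt(i + 1);
                if (j < maxObservations) {
                    sample.set(j, row.get(i));
                }
            }
        }
        return sample;
    }

    public static void main(String[] args) {
        List<Integer> row = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            row.add(i);
        }
        List<Integer> first = downsample(row, 10, 42L);
        List<Integer> second = downsample(row, 10, 42L);
        System.out.println(first.equals(second)); // prints true: same seed, same sample
    }
}
```

Rows that are already shorter than the cap pass through unchanged, so only the highly frequent ones pay the sampling cost.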
