Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2014/06/12 03:18:02 UTC
[jira] [Closed] (SPARK-1672) Support separate partitioners (and numbers of partitions) for users and products
[ https://issues.apache.org/jira/browse/SPARK-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng closed SPARK-1672.
--------------------------------
Resolution: Implemented
Assignee: Tor Myklebust
PR: https://github.com/apache/spark/pull/1014
> Support separate partitioners (and numbers of partitions) for users and products
> --------------------------------------------------------------------------------
>
> Key: SPARK-1672
> URL: https://issues.apache.org/jira/browse/SPARK-1672
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Tor Myklebust
> Assignee: Tor Myklebust
> Priority: Minor
> Fix For: 1.1.0
>
>
> Users should be able to specify a partitioning of their data when they know a good one. Separate partitioners for users and products are convenient because no awkward intermediate mapping step is then needed.
> It may also be reasonable to partition users and products into different numbers of partitions (for instance, to balance memory requirements) when the dataset is tall, thin, and very sparse.
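
To illustrate the idea in the description, here is a minimal sketch in plain Python. It is not the code from the pull request and does not use Spark or MLlib; `ModPartitioner` and `block_ratings` are hypothetical names invented for this example. It only shows how ratings could be grouped into blocks when users and products each have their own partitioner with its own partition count.

```python
from collections import defaultdict


class ModPartitioner:
    """Hypothetical partitioner: maps an integer id to id % num_partitions."""

    def __init__(self, num_partitions):
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions

    def get_partition(self, key):
        return key % self.num_partitions


def block_ratings(ratings, user_partitioner, product_partitioner):
    """Group (user, product, rating) triples by (user block, product block).

    Each side uses its own partitioner, so the two block counts are
    independent of each other.
    """
    blocks = defaultdict(list)
    for user, product, rating in ratings:
        key = (
            user_partitioner.get_partition(user),
            product_partitioner.get_partition(product),
        )
        blocks[key].append((user, product, rating))
    return dict(blocks)


# Assumed toy data: a "tall, thin" set with 12 users and only 3 products.
ratings = [(u, u % 3, 1.0) for u in range(12)]

# More partitions on the user side than on the product side.
user_part = ModPartitioner(4)
product_part = ModPartitioner(2)

blocks = block_ratings(ratings, user_part, product_part)
for key in sorted(blocks):
    print(key, len(blocks[key]))
```

In this sketch the user side is split four ways and the product side two ways, which mirrors the case where the larger dimension is spread over more partitions to keep per-partition memory comparable.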