Posted to issues@systemml.apache.org by "Matthias Boehm (JIRA)" <ji...@apache.org> on 2016/09/23 04:34:20 UTC
[jira] [Assigned] (SYSTEMML-951) Efficient spark right indexing via lookup
[ https://issues.apache.org/jira/browse/SYSTEMML-951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias Boehm reassigned SYSTEMML-951:
---------------------------------------
Assignee: Matthias Boehm
> Efficient spark right indexing via lookup
> -----------------------------------------
>
> Key: SYSTEMML-951
> URL: https://issues.apache.org/jira/browse/SYSTEMML-951
> Project: SystemML
> Issue Type: Task
> Components: Runtime
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
>
> So far, all versions of the Spark right indexing instructions require a full scan over the data set. With existing partitioning (which happens anyway during any conversion from an external format to binary block), such a full scan is unnecessary if we are only interested in a small subset of the data. This task adds an efficient right indexing operation via RDD lookups, which access at most <num_lookup> partitions given existing hash partitioning.
> cc [~mwdusenb@us.ibm.com]
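The idea in the description can be illustrated with a minimal sketch in plain Python. It simulates hash partitioning and does not use Spark or SystemML code. All names here (`partition_of`, `build_partitions`, `block_keys_for_range`, `lookup_blocks`) are hypothetical, and the block size of 1000 with 1-based block indexes is an assumption made for illustration. In Spark itself, `lookup` on a pair RDD with a known partitioner scans only the partition the key maps to, which is the behavior this sketch imitates.

```python
BLOCK_SIZE = 1000  # assumed block size (rows and columns per block)


def partition_of(key, num_partitions):
    """Stand-in for a hash partitioner: map a block key to a partition id."""
    return hash(key) % num_partitions


def build_partitions(blocks, num_partitions):
    """Distribute {(row_block, col_block): block} over hash partitions."""
    parts = [dict() for _ in range(num_partitions)]
    for key, block in blocks.items():
        parts[partition_of(key, num_partitions)][key] = block
    return parts


def block_keys_for_range(rl, ru, cl, cu, block_size=BLOCK_SIZE):
    """Map a 1-based, inclusive cell range to the keys of all overlapping blocks."""
    row_blocks = range((rl - 1) // block_size + 1, (ru - 1) // block_size + 2)
    col_blocks = range((cl - 1) // block_size + 1, (cu - 1) // block_size + 2)
    return [(i, j) for i in row_blocks for j in col_blocks]


def lookup_blocks(parts, keys):
    """Fetch blocks by key, visiting only the partition each key hashes to.

    Returns the found blocks and the set of partition ids that were accessed,
    so the number of touched partitions is at most len(keys).
    """
    touched = set()
    result = {}
    for key in keys:
        pid = partition_of(key, len(parts))
        touched.add(pid)
        if key in parts[pid]:
            result[key] = parts[pid][key]
    return result, touched


# Usage: a 10 x 10 grid of blocks spread over 8 partitions.
blocks = {(i, j): f"block-{i}-{j}" for i in range(1, 11) for j in range(1, 11)}
parts = build_partitions(blocks, num_partitions=8)

# Right indexing X[1500:1600, 2500:2600] falls into a single block.
keys = block_keys_for_range(1500, 1600, 2500, 2600)
found, touched = lookup_blocks(parts, keys)
print(sorted(found), "partitions accessed:", len(touched))
```

The sketch stops at fetching the overlapping blocks. A real right indexing operation would still have to slice the requested cells out of those blocks, and a full scan presumably remains the better choice once the number of lookups approaches the number of partitions.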
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)