You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@systemml.apache.org by "LI Guobao (JIRA)" <ji...@apache.org> on 2018/04/04 12:41:00 UTC

[jira] [Assigned] (SYSTEMML-2197) Multi-threaded broadcast creation

     [ https://issues.apache.org/jira/browse/SYSTEMML-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LI Guobao reassigned SYSTEMML-2197:
-----------------------------------

    Assignee: LI Guobao

> Multi-threaded broadcast creation
> ---------------------------------
>
>                 Key: SYSTEMML-2197
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2197
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Matthias Boehm
>            Assignee: LI Guobao
>            Priority: Major
>
> All spark instructions that broadcast one of the input operands, rely on a shared primitive {{sec.getBroadcastForVariable(var)}} for creating partitioned broadcasts, which are wrapper objects around potentially many broadcast variables to overcome Spark 2GB limitation for compressed broadcasts. Each individual broadcast blocks the matrix into squared blocks for direct access without unnecessary copy per task. So far this broadcast creation is single-threaded. 
> This task aims to parallelize the blocking of the given in-memory matrix into squared blocks (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/instructions/spark/data/PartitionedBlock.java#L82) as well as the subsequent partition creation and actual broadcasting (https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L548). 
> For consistency and in order to avoid excessive over-provisioning, this multi-threading should use the common internal thread pool or parallel java streams, which similarly calls the shared {{ForkJoinPool.commonPool}}. An example is the multi-threaded parallelization of RDDs which similarly blocks a given matrix into its squared blocks (see https://github.com/apache/systemml/blob/master/src/main/java/org/apache/sysml/runtime/controlprogram/context/SparkExecutionContext.java#L679).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)