Posted to issues@systemml.apache.org by "Matthias Boehm (JIRA)" <ji...@apache.org> on 2017/03/10 23:53:04 UTC

[jira] [Created] (SYSTEMML-1392) Redundant parfor spark dpe result var export

Matthias Boehm created SYSTEMML-1392:
----------------------------------------

             Summary: Redundant parfor spark dpe result var export
                 Key: SYSTEMML-1392
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1392
             Project: SystemML
          Issue Type: Bug
            Reporter: Matthias Boehm


The parfor spark datapartition-execute job currently writes result variables once per parfor input partition. However, since a reduce task is likely to process multiple parfor partitions and outputs are guaranteed to have no conflicts, this leads to unnecessary write overhead.

To fix this issue, we should write result variables only once per physical partition. Similarly, since accumulators are only reported for finished tasks, we should also update these task/iteration accumulators just once per task.
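
The difference between the two export strategies can be sketched as follows. This is a minimal illustration in plain Python, not SystemML code: CountingSink, Accumulator, body, and both run_task_* functions are hypothetical stand-ins, and the sketch assumes that each export writes all result variables collected so far by the task.

```python
from typing import Dict, List


class CountingSink:
    """Hypothetical export target that records every write it receives."""

    def __init__(self) -> None:
        self.writes: List[Dict[str, float]] = []

    def export(self, result_vars: Dict[str, float]) -> None:
        # Store a copy so later updates do not alter recorded writes.
        self.writes.append(dict(result_vars))


class Accumulator:
    """Hypothetical counter standing in for a task/iteration accumulator."""

    def __init__(self) -> None:
        self.value = 0
        self.updates = 0

    def add(self, n: int) -> None:
        self.value += n
        self.updates += 1


def body(partition_id: int) -> Dict[str, float]:
    # Assumed parfor body: each logical partition produces a result under
    # its own key, so outputs of different partitions never conflict.
    return {f"R[{partition_id}]": float(partition_id * partition_id)}


def run_task_per_partition(partition_ids: List[int], sink: CountingSink,
                           iters: Accumulator) -> None:
    """Behaviour described in the report: export after every partition."""
    results: Dict[str, float] = {}
    for pid in partition_ids:
        results.update(body(pid))
        sink.export(results)   # one write per logical partition
        iters.add(1)           # one accumulator update per partition


def run_task_once(partition_ids: List[int], sink: CountingSink,
                  iters: Accumulator) -> None:
    """Proposed behaviour: export and update accumulators once per task."""
    results: Dict[str, float] = {}
    local_iters = 0
    for pid in partition_ids:
        results.update(body(pid))
        local_iters += 1
    sink.export(results)       # single write for the whole physical partition
    iters.add(local_iters)     # single accumulator update for the task


if __name__ == "__main__":
    sink, iters = CountingSink(), Accumulator()
    run_task_once([1, 2, 3], sink, iters)
    print(len(sink.writes), iters.value, iters.updates)
```

Because the outputs of different partitions do not conflict, the single export at the end of the task contains the same result variables as the last of the per-partition exports, while the number of writes drops from one per logical partition to one per task.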



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)