Posted to issues@systemml.apache.org by "Fei Hu (JIRA)" <ji...@apache.org> on 2017/08/08 17:42:00 UTC

[jira] [Created] (SYSTEMML-1830) Improve the data locality for the tasks in ParFor body

Fei Hu created SYSTEMML-1830:
--------------------------------

             Summary: Improve the data locality for the tasks in ParFor body
                 Key: SYSTEMML-1830
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1830
             Project: SystemML
          Issue Type: Improvement
    Affects Versions: SystemML 1.0
            Reporter: Fei Hu
            Assignee: Fei Hu


For {{RemoteParForSpark}}, the tasks are parallelized without considering the data locality of the input matrices. When the input data is large, this causes a lot of data shuffling.

We can predict the data locations of the input matrices and add this location information when parallelizing the ParFor program body.
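
A minimal sketch of the idea, not the SystemML implementation: given a map from input matrix blocks to the hosts that hold them, each task can be paired with a ranked list of preferred hosts. All names here ({{preferred_hosts}}, {{attach_locations}}, the block IDs, and the host names) are hypothetical and chosen for illustration only. In Spark's Scala API, pairs of this shape could be handed to {{SparkContext.makeRDD(seq: Seq[(T, Seq[String])])}}, which accepts location preferences per element; whether the fix for this issue takes that route is an assumption.

```python
from collections import Counter


def preferred_hosts(task_blocks, block_locations, top_k=2):
    """Rank hosts for one task by how many of its input blocks they hold."""
    counts = Counter()
    for block_id in task_blocks:
        # Blocks with an unknown location contribute no preference.
        for host in block_locations.get(block_id, ()):
            counts[host] += 1
    # Descending block count, then host name for a deterministic order.
    ranked = sorted(counts, key=lambda h: (-counts[h], h))
    return ranked[:top_k]


def attach_locations(tasks, block_locations, top_k=2):
    """Pair every task ID with its preferred hosts."""
    return [
        (task_id, preferred_hosts(blocks, block_locations, top_k))
        for task_id, blocks in tasks
    ]


# Hypothetical example data: block ID -> hosts holding a copy of that block.
block_locations = {
    "X_1_1": ["node1", "node2"],
    "X_1_2": ["node1"],
    "X_2_1": ["node3"],
}

# Hypothetical ParFor tasks: task ID -> input blocks read by that task.
tasks = [
    ("t0", ["X_1_1", "X_1_2"]),
    ("t1", ["X_2_1"]),
    ("t2", ["X_9_9"]),  # location unknown, so no preference is attached
]

task_prefs = attach_locations(tasks, block_locations)
```

A task whose blocks have no known location gets an empty preference list, so the scheduler remains free to place it anywhere.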



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)