Posted to issues@systemml.apache.org by "Matthias Boehm (JIRA)" <ji...@apache.org> on 2017/06/22 05:50:00 UTC

[jira] [Closed] (SYSTEMML-1727) Wrong mvvar instruction compilation for persistent writes

     [ https://issues.apache.org/jira/browse/SYSTEMML-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Boehm closed SYSTEMML-1727.
------------------------------------

> Wrong mvvar instruction compilation for persistent writes
> ---------------------------------------------------------
>
>                 Key: SYSTEMML-1727
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1727
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 1.0
>
>
> Currently, we compile persistent writes in binary format whose inputs are transient reads into mvvar instructions, which are supposed to be metadata operations on HDFS. However, this comes with two fundamental problems:
> * In case of different file URI schemes between scratch space and persistent write location, we cannot use a rename at all, requiring us to read and write the matrix explicitly. For large data this ultimately leads to OOMs.
> * For scripts where intermediates are fed into such persistent writes but subsequently used by other operations, this can lead to missing inputs because the intermediate no longer exists under the given temporary filename.
> An example where scripts fail for the second reason is given below:
> {code}
> PROGRAM
> --MAIN PROGRAM
> ----GENERIC (lines 1-1) [recompile=false]
> ------(8) dg(rand) [1000000,1000,1000,1000,1000000000] [0,0,7629 -> 7629MB], CP
> ------(9) TWrite X (8) [1000000,1000,1000,1000,1000000000] [7629,0,0 -> 7629MB], CP
> ----GENERIC (lines 5-5) [recompile=false]
> ----GENERIC (lines 9-9) [recompile=false]
> ------(17) TRead X [1000000,1000,1000,1000,1000000000] [0,0,7629 -> 7629MB], CP
> ------(20) PWrite X (17) [1000000,1000,1000,1000,1000000000] [7629,0,0 -> 7629MB], CP
> ----GENERIC (lines 13-13) [recompile=false]
> ------(24) TRead X [1000000,1000,1000,1000,1000000000] [0,0,7629 -> 7629MB], CP
> ------(26) b(+) (24) [1000000,1000,1000,1000,-1] [7629,0,7629 -> 15259MB], CP
> ------(27) ua(+RC) (26) [0,0,-1,-1,-1] [7629,0,0 -> 7629MB], CP
> ------(28) u(print) (27) [-1,-1,-1,-1,-1] [0,0,0 -> 0MB]
> {code}
> This task aims to fix both related issues by reworking the generation of mvvar instructions in favor of explicit write instructions.
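
The two problems described above can be illustrated with a small sketch. It is not SystemML code: local files stand in for HDFS, and the names same_filesystem, persistent_write, and run_script are hypothetical helpers introduced only for this illustration. The sketch assumes that the intermediate exists only as a file in scratch space when the persistent write runs.

```python
import os
import shutil
import tempfile
from urllib.parse import urlparse


def same_filesystem(src_uri, dst_uri):
    """Return True if both URIs share scheme and authority.

    Assumption for this sketch: a rename is only an option in that case.
    """
    s, d = urlparse(src_uri), urlparse(dst_uri)
    return (s.scheme, s.netloc) == (d.scheme, d.netloc)


def persistent_write(tmp_path, out_path, use_move):
    """Hypothetical stand-in for a persistent write of a scratch-space file."""
    if use_move:
        # mvvar-like behavior: rename the file, so the temporary name disappears.
        shutil.move(tmp_path, out_path)
    else:
        # Explicit write: the output is a separate file, the intermediate stays.
        shutil.copyfile(tmp_path, out_path)


def run_script(use_move):
    """Write an intermediate persistently, then check whether a later
    operation could still find it under its temporary filename."""
    with tempfile.TemporaryDirectory() as workdir:
        tmp_path = os.path.join(workdir, "scratch_X")
        out_path = os.path.join(workdir, "X.out")
        with open(tmp_path, "w") as f:
            f.write("1 2 3")
        persistent_write(tmp_path, out_path, use_move)
        # A subsequent consumer of X would need this file to exist.
        return os.path.exists(tmp_path)
```

With use_move=True the temporary file is gone after the write, which corresponds to the missing-input failure in the second bullet. With use_move=False the intermediate remains readable. The same_filesystem check reflects the first bullet: when scheme or authority differ, a rename is not available in the first place.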



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)