Posted to issues@systemml.apache.org by "Glenn Weidner (JIRA)" <ji...@apache.org> on 2017/09/09 01:18:00 UTC

[jira] [Updated] (SYSTEMML-1623) Memory efficiency JMLC matrix and frame conversions

     [ https://issues.apache.org/jira/browse/SYSTEMML-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Glenn Weidner updated SYSTEMML-1623:
------------------------------------
    Fix Version/s:     (was: SystemML 1.0)
                   SystemML 0.15

> Memory efficiency JMLC matrix and frame conversions
> ---------------------------------------------------
>
>                 Key: SYSTEMML-1623
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1623
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 0.15
>
>
> The current JMLC conversion functions result in a very inefficient, memory-intensive code path, which leads to unnecessary OOMs that can be easily avoided. This task aims to add and improve these primitives to allow convenient data conversions with much better memory efficiency.
> For example, consider a scenario with a 500k x 90 input model available as a CSV file in the classpath, whose string representation requires 1GB. The typical code path currently used looks as follows:
> {code}
> ResourceStream(model_file)
> -> prep
> ---> StringBuilder -> String [3GB tmp, 1GB]
> -> convertToDoubleMatrix
> ---> byte[] -> ByteInputStream [2GB]
> ---> MatrixBlock [360MB]
> ---> double[][] [400MB]
> -> setMatrix
> ---> MatrixBlock [360MB]
> {code} 
> which requires at least 4GB of memory due to strong references to all intermediates. The goal of this task is to reduce this to the following, which only requires 360MB of memory:
> {code}
> ResourceStream(model_file)
> -> convertToMatrix
> ---> MatrixBlock [360MB]
> -> setMatrix
> ---> by reference
> {code} 
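
The 360MB target above matches the dense payload of a 500k x 90 matrix of 8-byte doubles. The following is a minimal sketch of the general idea only, not SystemML code: the class `CsvStreamSketch` and its methods `denseBytes` and `parseDense` are hypothetical names introduced for illustration, and a plain `double[]` stands in for a MatrixBlock. It reads the CSV stream line by line into one preallocated array, so no full-size String or byte[] copy of the input is ever held.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the SystemML/JMLC API.
public class CsvStreamSketch {
    // Size of the dense payload only (8 bytes per double), ignoring object headers.
    public static long denseBytes(long rows, long cols) {
        return rows * cols * 8L;
    }

    // Parses CSV line by line into one preallocated row-major array,
    // so only a single line of text is held in memory at a time.
    public static double[] parseDense(InputStream in, int rows, int cols)
            throws IOException {
        double[] data = new double[rows * cols];
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            for (int r = 0; r < rows; r++) {
                String line = br.readLine();
                if (line == null)
                    throw new IOException("Expected " + rows + " rows, found " + r);
                String[] parts = line.split(",");
                if (parts.length != cols)
                    throw new IOException("Row " + r + " has " + parts.length + " columns");
                for (int c = 0; c < cols; c++)
                    data[r * cols + c] = Double.parseDouble(parts[c].trim());
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // 500k x 90 doubles: prints 360000000 (bytes), i.e. 360MB
        System.out.println(denseBytes(500_000, 90));

        // Tiny in-memory stand-in for the classpath resource stream
        InputStream in = new ByteArrayInputStream(
                "1,2,3\n4,5,6\n".getBytes(StandardCharsets.UTF_8));
        double[] m = parseDense(in, 2, 3);
        System.out.println(m[4]); // row 1, column 1: prints 5.0
    }
}
```

With this shape, peak memory is roughly the target array plus one line of text. Handing the resulting object to the consumer by reference, rather than copying it again, is what keeps the total at the size of a single matrix.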


