Posted to issues@systemml.apache.org by "Matthias Boehm (JIRA)" <ji...@apache.org> on 2016/05/10 02:11:12 UTC
[jira] [Created] (SYSTEMML-677) Random data generator for decision tree fails w/ data type mismatch
Matthias Boehm created SYSTEMML-677:
---------------------------------------
Summary: Random data generator for decision tree fails w/ data type mismatch
Key: SYSTEMML-677
URL: https://issues.apache.org/jira/browse/SYSTEMML-677
Project: SystemML
Issue Type: Bug
Reporter: Matthias Boehm
Fix For: SystemML 0.9
The data generator for decision tree consists of a shell script that calls two DML scripts in sequence, because the file-based transform applied in the second script requires its input file to exist during compilation. However, there is a data type mismatch: the first script outputs a matrix, while the second script expects a frame.
This task covers (1) a script-level change to output a frame from the first script, and (2) a fix for writing the frame metadata file with a value type accepted by the subsequent transform.
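To make the mismatch concrete, here is a rough Python sketch (not SystemML code) that models the metadata handed from the first script to the second. The field names, the example values, and the placeholder ACCEPTED_VALUE_TYPES are assumptions for illustration only; this issue does not state which value type the transform actually accepts.

```python
# Hypothetical metadata describing the first script's output. The field
# names and values are assumptions for illustration, not taken from this issue.
matrix_meta = {
    "data_type": "matrix",
    "value_type": "double",
    "rows": 100,
    "cols": 5,
    "format": "csv",
}

# Placeholder only: the value type(s) the transform accepts are not stated here.
ACCEPTED_VALUE_TYPES = ("string",)


def check_transform_input(meta, accepted_value_types):
    """Return the reasons a frame-based consumer would reject this input."""
    problems = []
    if meta.get("data_type") != "frame":
        problems.append(
            "data type mismatch: expected frame, got %s" % meta.get("data_type")
        )
    if meta.get("value_type") not in accepted_value_types:
        problems.append("unsupported value type: %s" % meta.get("value_type"))
    return problems


def as_frame_meta(meta, value_type):
    """Return a copy of the metadata that describes the same data as a frame."""
    fixed = dict(meta)  # leave the original description untouched
    fixed["data_type"] = "frame"
    fixed["value_type"] = value_type
    return fixed


if __name__ == "__main__":
    # Before the fix: both parts of the task show up as separate problems.
    print(check_transform_input(matrix_meta, ACCEPTED_VALUE_TYPES))
    # After the fix: the consumer has nothing left to complain about.
    frame_meta = as_frame_meta(matrix_meta, ACCEPTED_VALUE_TYPES[0])
    print(check_transform_input(frame_meta, ACCEPTED_VALUE_TYPES))
```

The two checks correspond to the two parts of the task: (1) the data type must be frame, and (2) the value type written to the metadata file must be one the transform accepts.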
Note that the script-level change already uses the matrix-frame casting introduced as part of SYSTEMML-554, but this builtin function is currently available only in CP. As a result, the data generator only works for small data that fits into the driver memory. Once the Spark/MR converters from SYSTEMML- are fully integrated, the script will run for large data too, without further script changes.