Posted to issues@systemml.apache.org by "Sourav Mazumder (JIRA)" <ji...@apache.org> on 2016/02/15 21:23:18 UTC

[jira] [Updated] (SYSTEMML-519) A Zeppelin Notebook showcasing how to use SystemML APIs and existing scripts on Spark

     [ https://issues.apache.org/jira/browse/SYSTEMML-519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sourav Mazumder updated SYSTEMML-519:
-------------------------------------
    Attachment: 2BCHR4T1Q.zip

Please keep the structure /2BCHR4T1Q.zip/notes.json as is when adding it to the relevant folder path.

> A Zeppelin Notebook showcasing how to use SystemML APIs and existing scripts on Spark
> -------------------------------------------------------------------------------------
>
>                 Key: SYSTEMML-519
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-519
>             Project: SystemML
>          Issue Type: Documentation
>          Components: Documentation
>    Affects Versions: SystemML 0.9
>            Reporter: Sourav Mazumder
>            Priority: Minor
>              Labels: documentation
>         Attachments: 2BCHR4T1Q.zip
>
>
> A sample Zeppelin notebook is needed that showcases the use of SystemML on Spark from Zeppelin. The sample notebook covers the following end-to-end aspects of creating a model using SystemML:
> 1. Ingestion of multiple datasets from HDFS.
> 2. Exploration of the data using Spark SQL.
> 3. Merging of the data from the various sources to prepare it for model building.
> 4. Building a model using GLM.dml of SystemML.
> 5. Using GLM-predict.dml for prediction on a larger population.
> 6. Relating the predictions back to the original dataset.
> 7. Visualization of the predictions using R libraries through SparkR.
> Please note that this notebook uses an R interpreter for Zeppelin (https://github.com/apache/incubator-zeppelin/pull/208/commits) that is not part of the main branch, so the SparkR paragraphs will not work on the Zeppelin main branch. Alternatively, one can use the branch related to this PR (PR#208). R and the other relevant R packages must also be installed separately on the same machine where the Zeppelin process is running. The R packages needed for the plots are googleVis, ggplot2, maptools, htmltools, knitr, and repr (from http://irkernel.github.io/).
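
For readers without the attachment, steps 4 and 5 above can be roughly sketched as two command-line runs of the SystemML scripts. This is not the notebook's code, which presumably calls the SystemML API from a Zeppelin paragraph. It is a minimal Python sketch that only composes the spark-submit argument lists. The jar name, all file paths, and the helper dml_command are hypothetical. The named arguments (X, Y, B, M, fmt) are as I understand them from the SystemML algorithms reference and should be checked against the headers of GLM.dml and GLM-predict.dml. The family and link arguments are omitted, so the scripts' defaults would apply.

```python
# Sketch only: builds (does not run) spark-submit commands for steps 4 and 5.
# Every path below is a hypothetical placeholder, not taken from the notebook.


def dml_command(script, nvargs, jar="SystemML.jar"):
    """Return a spark-submit argument list that runs a DML script
    with named arguments passed through -nvargs."""
    cmd = ["spark-submit", jar, "-f", script, "-nvargs"]
    cmd += ["{}={}".format(name, value) for name, value in nvargs]
    return cmd


# Step 4: fit the model. X is the feature matrix, Y the response,
# B the output location for the estimated coefficients.
train_cmd = dml_command("GLM.dml", [
    ("X", "/tmp/glm/X_train.csv"),
    ("Y", "/tmp/glm/y_train.csv"),
    ("B", "/tmp/glm/betas.csv"),
    ("fmt", "csv"),
])

# Step 5: score a larger population with the fitted coefficients.
# M is the output location for the predicted means.
predict_cmd = dml_command("GLM-predict.dml", [
    ("X", "/tmp/glm/X_population.csv"),
    ("B", "/tmp/glm/betas.csv"),
    ("M", "/tmp/glm/predicted_means.csv"),
    ("fmt", "csv"),
])

if __name__ == "__main__":
    print(" ".join(train_cmd))
    print(" ".join(predict_cmd))
```

The predictions written to M are row-aligned with the X passed to GLM-predict.dml, so relating them back to the original dataset (step 6) depends on preserving that row order or carrying a row identifier alongside the data.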



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)