Posted to issues@systemml.apache.org by "Niketan Pansare (JIRA)" <ji...@apache.org> on 2016/08/09 21:50:20 UTC
[jira] [Created] (SYSTEMML-855) Add a "Get Started" tutorial for Python users
Niketan Pansare created SYSTEMML-855:
----------------------------------------
Summary: Add a "Get Started" tutorial for Python users
Key: SYSTEMML-855
URL: https://issues.apache.org/jira/browse/SYSTEMML-855
Project: SystemML
Issue Type: Task
Reporter: Niketan Pansare
As an example, this tutorial could have the following sections:
1. Steps to start a Python shell (or a cloud service such as datascientistworkbench) with SystemML support:
wget https://raw.githubusercontent.com/apache/incubator-systemml/master/src/main/java/org/apache/sysml/api/python/SystemML.py
wget https://sparktc.ibmcloud.com/repo/latest/SystemML.jar
2. Give context for one of the algorithms, for example linear regression. We can borrow the technical details from http://apache.github.io/incubator-systemml/algorithms-regression.html#description
3. Explain how to download the data we will use and how to implement linear regression DS (direct solve) using the embedded Python DSL:
https://github.com/apache/incubator-systemml/pull/197
```
import numpy as np
from sklearn import datasets
# Load the diabetes dataset
diabetes = datasets.load_diabetes()
# Use only one feature
diabetes_X = diabetes.data[:, np.newaxis, 2]
# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]
# Split the targets into training/testing sets
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]
```
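For reference, the direct-solve idea behind step 3 can be sketched in plain NumPy. This is not the SystemML embedded Python DSL from the pull request above (its API is not shown in this issue); it is only a minimal illustration, under stated assumptions, of solving the normal equations. The function name linreg_direct_solve, the reg parameter, and the synthetic data are hypothetical and chosen for illustration.

```python
import numpy as np


def linreg_direct_solve(X, y, reg=0.0):
    """Fit linear regression by solving (X'X + reg*I) beta = X'y directly.

    Hypothetical helper for illustration only. A column of ones is appended
    to X, so the last entry of the returned vector is the intercept.
    Simplification: when reg > 0 the intercept is penalized as well.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # add intercept column
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])      # left-hand side of normal equations
    b = Xb.T @ y                                   # right-hand side
    return np.linalg.solve(A, b)


# Synthetic single-feature data (assumed, stands in for a real dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] + 1.5 + 0.1 * rng.normal(size=200)

# Same split convention as the snippet above: hold out the last 20 rows.
X_train, X_test = X[:-20], X[-20:]
y_train, y_test = y[:-20], y[-20:]

beta = linreg_direct_solve(X_train, y_train)
y_pred = X_test @ beta[:-1] + beta[-1]
mse = float(np.mean((y_pred - y_test) ** 2))
print("slope, intercept:", beta)
print("test MSE:", mse)
```

The same function could be called on diabetes_X_train and diabetes_y_train from the snippet above; synthetic data is used here only so the sketch runs without scikit-learn.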
4. Explain how to use our algorithm instead:
http://apache.github.io/incubator-systemml/algorithms-regression.html#examples
5. To explain the tradeoffs of using NumPy or scikit-learn vs. SystemML's embedded DSL or SystemML's mllearn, increase the data size, for example by using a Twitter feed.
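One possible way to set up the comparison in step 5 is to time the single-machine baseline as the number of rows grows. The sketch below covers only the NumPy side, on synthetic data. The SystemML side is omitted because its Python API is not shown in this issue. The helper name time_direct_solve, the row counts, and the column count are assumptions made for illustration.

```python
import time

import numpy as np


def time_direct_solve(n_rows, n_cols=10, seed=0):
    """Time a NumPy normal-equations solve on synthetic data.

    Hypothetical helper: returns (elapsed_seconds, fitted_coefficients).
    Only the solve itself is timed, not the data generation.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, n_cols))
    y = X @ np.ones(n_cols) + 0.1 * rng.normal(size=n_rows)
    start = time.perf_counter()
    coef = np.linalg.solve(X.T @ X, X.T @ y)
    elapsed = time.perf_counter() - start
    return elapsed, coef


# Assumed sizes; a real tutorial would push these until memory becomes a limit.
sizes = [1_000, 10_000, 100_000]
results = [(n, *time_direct_solve(n)) for n in sizes]
for n, seconds, _ in results:
    print(f"{n:>8} rows: {seconds:.4f} s")
```

Timings depend on the machine, so no expected numbers are given. The point of the exercise would be to find where the in-memory approach stops being practical.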