Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2020/07/22 18:01:57 UTC

[GitHub] [systemds] j143 opened a new pull request #999: Notebook for SystemDS on colab for developers

j143 opened a new pull request #999:
URL: https://github.com/apache/systemds/pull/999


   * Creates a workspace with all the dependencies for project build.
   * Helps prototype the DML code in browser.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] j143 edited a comment on pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 edited a comment on pull request #999:
URL: https://github.com/apache/systemds/pull/999#issuecomment-663703040


   1. Let us confine this notebook to engine developers only.
   
   2. About 
   > what does the results mean? can we use them for anything?
   
   Let us cover this in a separate, visualization-focused notebook for `dml` developers.
   
   Is this okay for you?
   
   3. If you still want Spark removed from this notebook, I am happy to do that.





[GitHub] [systemds] j143 commented on pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on pull request #999:
URL: https://github.com/apache/systemds/pull/999#issuecomment-663621291


   > as a final repeated note Install is way to large again, i think i have said this many times and In this case again, it does not make much sense to install spark for this execution.
   
   1. Hey @Baunsgaard, sorry that I am not complying there. This notebook is meant mainly for (engine) developers.
      It acts as a companion to our documentation, for example `systemds-distributed-guide` and `systemds-standalone`.
   
   2. We will have a different notebook, something like `systemds-user.ipynb`, in which all the setup is done in two lines:
     a. `pip install systemds`
     b. `pip install spark`, and that's it.
   
   (1st comment)





[GitHub] [systemds] j143 commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r460257483



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."

Review comment:
       About `JAVA_HOME`: `mvn` seems to use this variable.
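
A minimal sketch of why setting `JAVA_HOME` from Python matters here, assuming the JDK path used in this pull request's notebook. Child processes started from the notebook kernel inherit whatever is in `os.environ`, so a later `mvn` call would see the value. The helper `child_env_value` is hypothetical and only for illustration; it spawns a Python child instead of `mvn` so the check runs without Maven installed.

```python
import os
import subprocess
import sys

# Path taken from the notebook in this pull request; adjust it if the JDK
# is installed somewhere else.
JAVA_HOME = "/usr/lib/jvm/java-8-openjdk-amd64"


def child_env_value(name):
    """Return the value a freshly spawned child process sees for `name`."""
    probe = "import os; print(os.environ.get({!r}, ''))".format(name)
    result = subprocess.run(
        [sys.executable, "-c", probe],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


# Setting the variable on os.environ makes it part of the environment that
# child processes (for example a later `mvn` call) inherit.
os.environ["JAVA_HOME"] = JAVA_HOME
print(child_env_value("JAVA_HOME"))
```

The same inheritance applies to any tool launched with `!` or `subprocess` after this assignment.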







[GitHub] [systemds] asfgit closed pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #999:
URL: https://github.com/apache/systemds/pull/999


   





[GitHub] [systemds] j143 commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r460255500



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "8Xnb_ePUyQIL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 85
+        },
+        "outputId": "d5546e63-97df-42a9-ce53-f9f361eaff30"
+      },
+      "source": [
+        "import os\n",
+        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
+        "!update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
+        "!java -version\n",
+        "# os.environ[\"SPARK_HOME\"] = \"/content/spark-2.4.5-bin-hadoop2.7\""
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode\n",
+            "openjdk version \"1.8.0_252\"\n",
+            "OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)\n",
+            "OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "F9tp4EIpK_1o",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "zqLC_1yCdr57",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XGuWF9w4d-LW",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "PZkw_gPEQvId",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Run and print a shell command.\n",
+        "def run(cmd):\n",
+        "  print('>> {}'.format(cmd))\n",
+        "  !{cmd}\n",
+        "  print('')\n"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BhmBWf3u3Q0o",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Apache Maven"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I81zPDcblchL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 238
+        },
+        "outputId": "6261a902-52e5-4054-ebf6-43a50447171c"
+      },
+      "source": [
+        "import os\n",
+        "\n",
+        "# Download the maven source.\n",
+        "maven_version = 'apache-maven-3.6.3'\n",
+        "maven_path = f\"/opt/{maven_version}\"\n",
+        "if not os.path.exists(maven_path):\n",
+        "  run(f\"wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\")\n",
+        "  run('unzip -q -d /opt apache-maven.zip')\n",
+        "  run('rm -f apache-maven.zip')\n",
+        "\n",
+        "# Let's choose the absolute path instead of $PATH environment variable.\n",
+        "def maven(args):\n",
+        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
+        "\n",
+        "maven('-v')\n"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.zip\n",
+            "\n",
+            ">> unzip -q -d /opt apache-maven.zip\n",
+            "\n",
+            ">> rm -f apache-maven.zip\n",
+            "\n",
+            ">> /opt/apache-maven-3.6.3/bin/mvn -v\n",
+            "\u001b[1mApache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)\u001b[m\n",
+            "Maven home: /opt/apache-maven-3.6.3\n",
+            "Java version: 1.8.0_252, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre\n",
+            "Default locale: en_US, platform encoding: UTF-8\n",
+            "OS name: \"linux\", version: \"4.19.104+\", arch: \"amd64\", family: \"unix\"\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Xphbe3R43XLw",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Download Apache Spark"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_WgEa00pTs3w",
+        "colab_type": "text"
+      },
+      "source": [
+        "NOTE: If spark is not downloaded. Let us make sure the version we are trying to download is officially supported at\n",
+        "https://spark.apache.org/downloads.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "3zdtkFkLnskx",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 119
+        },
+        "outputId": "a5b8b7ef-a401-4549-e521-320f9f22ed41"
+      },
+      "source": [
+        "# Spark and Hadoop version\n",
+        "spark_version = 'spark-2.4.6'\n",
+        "hadoop_version = 'hadoop2.7'\n",
+        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
+        "if not os.path.exists(spark_path):\n",
+        "  run(f\"wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\")\n",
+        "  run('tar zxf apache-spark.tgz -C /opt')\n",
+        "  run('rm -f apache-spark.tgz')\n",
+        "\n",
+        "os.environ[\"SPARK_HOME\"] = spark_path\n",
+        "os.environ[\"PATH\"] += os.pathsep + os.path.join(spark_path, \"bin\")\n"
+      ],
+      "execution_count": 13,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz\n",
+            "\n",
+            ">> tar zxf apache-spark.tgz -C /opt\n",
+            "\n",
+            ">> rm -f apache-spark.tgz\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "91pJ5U8k3cjk",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Get Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "SaPIprmg3lKE",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 153
+        },
+        "outputId": "1eee45c1-cd92-480e-d27c-a2229086b5a0"
+      },
+      "source": [
+        "!git clone https://github.com/apache/systemds systemds\n",
+        "%cd systemds"
+      ],
+      "execution_count": 6,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "Cloning into 'systemds'...\n",
+            "remote: Enumerating objects: 23, done.\u001b[K\n",
+            "remote: Counting objects: 100% (23/23), done.\u001b[K\n",
+            "remote: Compressing objects: 100% (17/17), done.\u001b[K\n",
+            "remote: Total 152626 (delta 0), reused 8 (delta 0), pack-reused 152603\u001b[K\n",
+            "Receiving objects: 100% (152626/152626), 225.02 MiB | 13.23 MiB/s, done.\n",
+            "Resolving deltas: 100% (97811/97811), done.\n",
+            "/content/systemds\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "s0Iorb0ICgHa",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 51
+        },
+        "outputId": "0f40128b-ffa3-422c-abb4-e0ac9359e63f"
+      },
+      "source": [
+        "# Logging flags: -q shows only errors; -X enables debug output; -e prints error stack traces\n",
+        "maven('clean package -q')"
+      ],
+      "execution_count": 7,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> /opt/apache-maven-3.6.3/bin/mvn clean package -q\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "AqM2YNWLNhZm",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Example Classification task\n",
+        "# !$SPARK_HOME/bin/spark-submit ./target/SystemDS.jar -f ./scripts/nn/examples/fm-binclass-dummy-data.dml"
+      ],
+      "execution_count": 8,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "xhMXM8BPltGc",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Playground for DML\n",
+        "\n",
+        "The following code cell is for dml code."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "t59rTyNbOF5b",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 34
+        },
+        "outputId": "0afbd272-46cd-4f05-b500-c96b529a5309"
+      },
+      "source": [
+        "%%writefile /content/test.dml\n",
+        "\n",
+        "# This code acts as a playground for dml code\n",
+        "X = rand (rows = 20, cols = 10)\n",
+        "y = X %*% rand(rows = ncol(X), cols = 1)\n",
+        "lm(X = X, y = y)"
+      ],
+      "execution_count": 9,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "Writing /content/test.dml\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "OFtyIIJqmD_6",
+        "colab_type": "text"
+      },
+      "source": [
+        "Run `dml` with Spark backend"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "kYosPWguO7DO",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 394
+        },
+        "outputId": "ba8be00e-04be-48a4-c7ef-4056eabcd3d2"
+      },
+      "source": [
+        "!$SPARK_HOME/bin/spark-submit \\\n",
+        "    ./target/SystemDS.jar -f /content/test.dml \n"
+      ],
+      "execution_count": 14,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "20/07/18 05:06:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable\n",
+            "log4j:WARN No appenders could be found for logger (org.apache.sysds.api.DMLScript).\n",
+            "log4j:WARN Please initialize the log4j system properly.\n",
+            "log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.\n",
+            "ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.7ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.7Calling the Direct Solver...\n",
+            "Computing the statistics...\n",
+            "AVG_TOT_Y, 2.306291350486548\n",
+            "STDEV_TOT_Y, 0.41527871477713496\n",
+            "AVG_RES_Y, 5.730468777276343E-9\n",
+            "STDEV_RES_Y, 5.849014866902826E-8\n",
+            "DISPERSION, 3.144664287007204E-15\n",
+            "R2, 0.9999999999999905\n",
+            "ADJUSTED_R2, 0.9999999999999818\n",
+            "R2_NOBIAS, 0.9999999999999906\n",
+            "ADJUSTED_R2_NOBIAS, 0.9999999999999801\n",
+            "R2_VS_0, 0.9999999999999997\n",
+            "ADJUSTED_R2_VS_0, 0.9999999999999994\n",
+            "SystemDS Statistics:\n",
+            "Total execution time:\t\t0.102 sec.\n",
+            "Number of executed Spark inst:\t0.\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "SUGac5w9ZRBQ",
+        "colab_type": "text"
+      },
+      "source": [
+        "### Working with SystemDS **Standalone**\n",
+        "\n",
+        "(NOTE: Pay attention to *directories* and *relative paths*. :))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "g5Nk2Bb4UU2O",
+        "colab_type": "text"
+      },
+      "source": [
+        "##### 1. Set SystemDS environment variables\n",
+        "\n",
+        "These are useful for the `./bin/systemds` script."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "2ZnSzkq8UT32",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!export SYSTEMDS_ROOT=$(pwd)\n",
+        "!export PATH=$SYSTEMDS_ROOT/bin:$PATH"
+      ],
+      "execution_count": 15,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "Tcxkh8cdUy1V",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!echo 'export SYSTEMDS_ROOT='$(pwd) >> ~/.bashrc\n",
+        "!echo 'export PATH=$SYSTEMDS_ROOT/bin:$PATH' >> ~/.bashrc"
+      ],
+      "execution_count": 16,

Review comment:
       I ran `!cat ~/.bashrc` and could see `SYSTEMDS_ROOT`.
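
One caveat worth illustrating: a line in `~/.bashrc` only takes effect in shells that actually read that file, and an `export` issued through `!` runs in a short-lived shell. The sketch below approximates the `!export ...` cell with `subprocess.run(..., shell=True)`, which is an assumption about how the notebook executes shell lines, and compares it with setting `os.environ` directly. The helper `child_sees` is hypothetical.

```python
import os
import subprocess
import sys


def child_sees(name):
    """Return what a newly started child process sees for variable `name`."""
    probe = "import os; print(os.environ.get({!r}, '<unset>'))".format(name)
    out = subprocess.run(
        [sys.executable, "-c", probe],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return out.stdout.strip()


# Start from a clean slate for the demonstration.
os.environ.pop("SYSTEMDS_ROOT", None)

# Stand-in for `!export SYSTEMDS_ROOT=$(pwd)`: the export happens in a
# short-lived shell and is gone once that shell exits.
subprocess.run("export SYSTEMDS_ROOT=$(pwd)", shell=True, check=True)
after_export = child_sees("SYSTEMDS_ROOT")

# Setting the variable on os.environ persists for later child processes.
systemds_root = os.getcwd()
os.environ["SYSTEMDS_ROOT"] = systemds_root
os.environ["PATH"] = (
    os.path.join(systemds_root, "bin") + os.pathsep + os.environ["PATH"]
)
after_environ = child_sees("SYSTEMDS_ROOT")

print(after_export)
print(after_environ == systemds_root)
```

If the `./bin/systemds` script is meant to pick these variables up from later cells, assigning them through `os.environ` is likely the more dependable route.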







[GitHub] [systemds] j143 commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r460256044



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"

Review comment:
       done







[GitHub] [systemds] j143 commented on pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on pull request #999:
URL: https://github.com/apache/systemds/pull/999#issuecomment-663703040


   Let us confine this notebook to engine developers only.
   
   About 
   > what does the results mean? can we use them for anything?
   
   Let us cover this in a separate, visualization-focused notebook for `dml` developers.
   
   Is this okay for you?





[GitHub] [systemds] Baunsgaard commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r460748904



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "8Xnb_ePUyQIL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 85
+        },
+        "outputId": "d5546e63-97df-42a9-ce53-f9f361eaff30"
+      },
+      "source": [
+        "import os\n",
+        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
+        "!update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
+        "!java -version\n",
+        "# os.environ[\"SPARK_HOME\"] = \"/content/spark-2.4.5-bin-hadoop2.7\""
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode\n",
+            "openjdk version \"1.8.0_252\"\n",
+            "OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)\n",
+            "OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "F9tp4EIpK_1o",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "zqLC_1yCdr57",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XGuWF9w4d-LW",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "PZkw_gPEQvId",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Run and print a shell command.\n",
+        "def run(cmd):\n",
+        "  print('>> {}'.format(cmd))\n",
+        "  !{cmd}\n",
+        "  print('')\n"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BhmBWf3u3Q0o",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Apache Maven"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I81zPDcblchL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 238
+        },
+        "outputId": "6261a902-52e5-4054-ebf6-43a50447171c"
+      },
+      "source": [
+        "import os\n",
+        "\n",
+        "# Download the maven source.\n",
+        "maven_version = 'apache-maven-3.6.3'\n",
+        "maven_path = f\"/opt/{maven_version}\"\n",
+        "if not os.path.exists(maven_path):\n",
+        "  run(f\"wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\")\n",
+        "  run('unzip -q -d /opt apache-maven.zip')\n",
+        "  run('rm -f apache-maven.zip')\n",
+        "\n",
+        "# Let's choose the absolute path instead of $PATH environment variable.\n",
+        "def maven(args):\n",
+        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
+        "\n",
+        "maven('-v')\n"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.zip\n",
+            "\n",
+            ">> unzip -q -d /opt apache-maven.zip\n",
+            "\n",
+            ">> rm -f apache-maven.zip\n",
+            "\n",
+            ">> /opt/apache-maven-3.6.3/bin/mvn -v\n",
+            "\u001b[1mApache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)\u001b[m\n",
+            "Maven home: /opt/apache-maven-3.6.3\n",
+            "Java version: 1.8.0_252, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre\n",
+            "Default locale: en_US, platform encoding: UTF-8\n",
+            "OS name: \"linux\", version: \"4.19.104+\", arch: \"amd64\", family: \"unix\"\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Xphbe3R43XLw",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Download Apache Spark"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_WgEa00pTs3w",
+        "colab_type": "text"
+      },
+      "source": [
+        "NOTE: If spark is not downloaded. Let us make sure the version we are trying to download is officially supported at\n",
+        "https://spark.apache.org/downloads.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "3zdtkFkLnskx",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 119
+        },
+        "outputId": "a5b8b7ef-a401-4549-e521-320f9f22ed41"
+      },
+      "source": [
+        "# Spark and Hadoop version\n",
+        "spark_version = 'spark-2.4.6'\n",
+        "hadoop_version = 'hadoop2.7'\n",
+        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
+        "if not os.path.exists(spark_path):\n",
+        "  run(f\"wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\")\n",
+        "  run('tar zxf apache-spark.tgz -C /opt')\n",
+        "  run('rm -f apache-spark.tgz')\n",
+        "\n",
+        "os.environ[\"SPARK_HOME\"] = spark_path\n",
+        "os.environ[\"PATH\"] += \":$SPARK_HOME/bin\"\n"
+      ],
+      "execution_count": 13,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz\n",
+            "\n",
+            ">> tar zxf apache-spark.tgz -C /opt\n",
+            "\n",
+            ">> rm -f apache-spark.tgz\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "91pJ5U8k3cjk",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Get Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "SaPIprmg3lKE",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 153
+        },
+        "outputId": "1eee45c1-cd92-480e-d27c-a2229086b5a0"
+      },
+      "source": [
+        "!git clone https://github.com/apache/systemds systemds\n",
+        "%cd systemds"
+      ],
+      "execution_count": 6,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "Cloning into 'systemds'...\n",
+            "remote: Enumerating objects: 23, done.\u001b[K\n",
+            "remote: Counting objects: 100% (23/23), done.\u001b[K\n",
+            "remote: Compressing objects: 100% (17/17), done.\u001b[K\n",
+            "remote: Total 152626 (delta 0), reused 8 (delta 0), pack-reused 152603\u001b[K\n",
+            "Receiving objects: 100% (152626/152626), 225.02 MiB | 13.23 MiB/s, done.\n",
+            "Resolving deltas: 100% (97811/97811), done.\n",
+            "/content/systemds\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "s0Iorb0ICgHa",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 51
+        },
+        "outputId": "0f40128b-ffa3-422c-abb4-e0ac9359e63f"
+      },
+      "source": [
+        "# Logging flags: -q only for ERROR; -X for DEBUG; -e for ERROR\n",
+        "maven('clean package -q')"
+      ],
+      "execution_count": 7,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> /opt/apache-maven-3.6.3/bin/mvn clean package -q\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "AqM2YNWLNhZm",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Example Classification task\n",
+        "# !$SPARK_HOME/bin/spark-submit ./target/SystemDS.jar -f ./scripts/nn/examples/fm-binclass-dummy-data.dml"
+      ],
+      "execution_count": 8,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "xhMXM8BPltGc",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Playground for DML\n",
+        "\n",
+        "The following code cell is for dml code."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "t59rTyNbOF5b",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 34
+        },
+        "outputId": "0afbd272-46cd-4f05-b500-c96b529a5309"
+      },
+      "source": [
+        "%%writefile /content/test.dml\n",
+        "\n",
+        "# This code code acts as a playground for dml code\n",
+        "X = rand (rows = 20, cols = 10)\n",
+        "y = X %*% rand(rows = ncol(X), cols = 1)\n",
+        "lm(X = X, y = y)"
+      ],
+      "execution_count": 9,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "Writing /content/test.dml\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "OFtyIIJqmD_6",
+        "colab_type": "text"
+      },
+      "source": [
+        "Run `dml` with Spark backend"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "kYosPWguO7DO",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 394
+        },
+        "outputId": "ba8be00e-04be-48a4-c7ef-4056eabcd3d2"
+      },
+      "source": [
+        "!$SPARK_HOME/bin/spark-submit \\\n",
+        "    ./target/SystemDS.jar -f /content/test.dml \n"
+      ],
+      "execution_count": 14,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "20/07/18 05:06:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable\n",
+            "log4j:WARN No appenders could be found for logger (org.apache.sysds.api.DMLScript).\n",
+            "log4j:WARN Please initialize the log4j system properly.\n",
+            "log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.\n",
+            "ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.7ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.7Calling the Direct Solver...\n",
+            "Computing the statistics...\n",
+            "AVG_TOT_Y, 2.306291350486548\n",
+            "STDEV_TOT_Y, 0.41527871477713496\n",
+            "AVG_RES_Y, 5.730468777276343E-9\n",
+            "STDEV_RES_Y, 5.849014866902826E-8\n",
+            "DISPERSION, 3.144664287007204E-15\n",
+            "R2, 0.9999999999999905\n",
+            "ADJUSTED_R2, 0.9999999999999818\n",
+            "R2_NOBIAS, 0.9999999999999906\n",
+            "ADJUSTED_R2_NOBIAS, 0.9999999999999801\n",
+            "R2_VS_0, 0.9999999999999997\n",
+            "ADJUSTED_R2_VS_0, 0.9999999999999994\n",
+            "SystemDS Statistics:\n",
+            "Total execution time:\t\t0.102 sec.\n",
+            "Number of executed Spark inst:\t0.\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "SUGac5w9ZRBQ",
+        "colab_type": "text"
+      },
+      "source": [
+        "### Working with SystemDS **Standalone**\n",
+        "\n",
+        "(NOTE: Pay attention to *directories* and *relative paths*. :))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "g5Nk2Bb4UU2O",
+        "colab_type": "text"
+      },
+      "source": [
+        "##### 1. Set SystemDS environement variables\n",
+        "\n",
+        "These are useful for the `./bin/systemds` script."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "2ZnSzkq8UT32",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!export SYSTEMDS_ROOT=$(pwd)\n",
+        "!export PATH=$SYSTEMDS_ROOT/bin:$PATH"
+      ],
+      "execution_count": 15,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "Tcxkh8cdUy1V",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!echo 'export SYSTEMDS_ROOT='$(pwd) >> ~/.bashrc\n",
+        "!echo 'export PATH=$SYSTEMDS_ROOT/bin:$PATH' >> ~/.bashrc"
+      ],
+      "execution_count": 16,

Review comment:
       The point here is not that the lines fail to land in `.bashrc`; the point is that the following commands do not pick them up. If they did, you would be able to change `./bin/systemds` to `systemds`.
   I did not have success making this work with either of the two methods.
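
The behaviour described above can be reproduced outside the notebook. The following is only a minimal sketch, under the assumption that each `!` line in a notebook runs in its own short-lived shell, which is modelled here with `subprocess`. The variable name `DEMO_SYSTEMDS_ROOT` and the helper `sh` are hypothetical and exist only for this illustration; they are not part of the notebook or of SystemDS.

```python
import os
import subprocess

# Hypothetical variable name, used only for this demonstration.
VAR = "DEMO_SYSTEMDS_ROOT"
os.environ.pop(VAR, None)  # start from a clean state


def sh(cmd):
    """Run cmd in a fresh shell and return its trimmed stdout."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout.strip()


# An export made inside one shell disappears when that shell exits,
# so a later shell does not see it.
sh("export " + VAR + "=/content/systemds")
lost = sh("echo ${" + VAR + ":-unset}")

# A variable set on the Python process itself is inherited by every
# child shell started afterwards.
os.environ[VAR] = "/content/systemds"
kept = sh("echo ${" + VAR + ":-unset}")

print("after shell export:", lost)
print("after os.environ:  ", kept)
```

If this assumption holds for the notebook as well, setting `os.environ["SYSTEMDS_ROOT"]` and extending `os.environ["PATH"]` from Python may be the simpler route than `export` or `.bashrc`.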







[GitHub] [systemds] j143 commented on pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on pull request #999:
URL: https://github.com/apache/systemds/pull/999#issuecomment-667716032


   Thank you for the review. :tada:





[GitHub] [systemds] Baunsgaard commented on pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on pull request #999:
URL: https://github.com/apache/systemds/pull/999#issuecomment-664232435


   >     1. Let us confine this notebook to engine developers only.
   
   I think that engine developers should not be the audience such a notebook is directed at, but again that might just be my opinion.
   Again, it is a matter of priority: if you think this notebook is needed for engine developers, then fine, but in general the 'beginner' user should be taken more into account, so addressing such a guide to them would be great.
   
   >     2. About
   > > what does the results mean? can we use them for anything?
   >
   > Let us take this in a visualization, `dml` developer focused notebook.
   
   I think that would be unnecessarily confusing and would duplicate work done elsewhere. Discussing the results should be a necessary extension of executing anything. That said, I might be the only one with that opinion, so if you think that a discussion of results is not needed, then fine.
   
   > Is this okay for you?
   > 
   >     3. If you still want this notebook to be removed of spark, happy to do that.
   
   Yes, I think Spark is an unnecessary addition in such a guide.
   Our system is able to leverage Spark to make operations faster, but it is not limited to running only where Spark is installed. In this specific case, where you want to guide people into using a system, it should be easy to get started and jump into the interesting parts as early as possible, not fight with setup, which in many cases deters users.





[GitHub] [systemds] j143 commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r460158055



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "8Xnb_ePUyQIL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 85
+        },
+        "outputId": "d5546e63-97df-42a9-ce53-f9f361eaff30"
+      },
+      "source": [
+        "import os\n",
+        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
+        "!update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
+        "!java -version\n",
+        "# os.environ[\"SPARK_HOME\"] = \"/content/spark-2.4.5-bin-hadoop2.7\""
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode\n",
+            "openjdk version \"1.8.0_252\"\n",
+            "OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)\n",
+            "OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "F9tp4EIpK_1o",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "zqLC_1yCdr57",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XGuWF9w4d-LW",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "PZkw_gPEQvId",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Run and print a shell command.\n",
+        "def run(cmd):\n",
+        "  print('>> {}'.format(cmd))\n",
+        "  !{cmd}\n",
+        "  print('')\n"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BhmBWf3u3Q0o",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Apache Maven"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I81zPDcblchL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 238
+        },
+        "outputId": "6261a902-52e5-4054-ebf6-43a50447171c"
+      },
+      "source": [
+        "import os\n",
+        "\n",
+        "# Download the maven source.\n",
+        "maven_version = 'apache-maven-3.6.3'\n",
+        "maven_path = f\"/opt/{maven_version}\"\n",
+        "if not os.path.exists(maven_path):\n",
+        "  run(f\"wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\")\n",
+        "  run('unzip -q -d /opt apache-maven.zip')\n",
+        "  run('rm -f apache-maven.zip')\n",
+        "\n",
+        "# Let's choose the absolute path instead of $PATH environment variable.\n",
+        "def maven(args):\n",
+        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
+        "\n",
+        "maven('-v')\n"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.zip\n",
+            "\n",
+            ">> unzip -q -d /opt apache-maven.zip\n",
+            "\n",
+            ">> rm -f apache-maven.zip\n",
+            "\n",
+            ">> /opt/apache-maven-3.6.3/bin/mvn -v\n",
+            "\u001b[1mApache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)\u001b[m\n",
+            "Maven home: /opt/apache-maven-3.6.3\n",
+            "Java version: 1.8.0_252, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre\n",
+            "Default locale: en_US, platform encoding: UTF-8\n",
+            "OS name: \"linux\", version: \"4.19.104+\", arch: \"amd64\", family: \"unix\"\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Xphbe3R43XLw",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Download Apache Spark"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_WgEa00pTs3w",
+        "colab_type": "text"
+      },
+      "source": [
+        "NOTE: If spark is not downloaded. Let us make sure the version we are trying to download is officially supported at\n",
+        "https://spark.apache.org/downloads.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "3zdtkFkLnskx",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 119
+        },
+        "outputId": "a5b8b7ef-a401-4549-e521-320f9f22ed41"
+      },
+      "source": [
+        "# Spark and Hadoop version\n",
+        "spark_version = 'spark-2.4.6'\n",
+        "hadoop_version = 'hadoop2.7'\n",
+        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
+        "if not os.path.exists(spark_path):\n",
+        "  run(f\"wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\")\n",
+        "  run('tar zxf apache-spark.tgz -C /opt')\n",
+        "  run('rm -f apache-spark.tgz')\n",
+        "\n",
+        "os.environ[\"SPARK_HOME\"] = spark_path\n",
+        "os.environ[\"PATH\"] += \":$SPARK_HOME/bin\"\n"
+      ],
+      "execution_count": 13,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz\n",
+            "\n",
+            ">> tar zxf apache-spark.tgz -C /opt\n",
+            "\n",
+            ">> rm -f apache-spark.tgz\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "91pJ5U8k3cjk",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Get Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "SaPIprmg3lKE",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 153
+        },
+        "outputId": "1eee45c1-cd92-480e-d27c-a2229086b5a0"
+      },
+      "source": [
+        "!git clone https://github.com/apache/systemds systemds\n",

Review comment:
       Yes, will do.







[GitHub] [systemds] j143 commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r460256905



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },

Review comment:
       I have reorganized the sections with more description!







[GitHub] [systemds] Baunsgaard commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r459935249



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."

Review comment:
       After installation, `java` should already be on the PATH, so you should not need to set that environment variable.
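
One way to check this before setting anything by hand is to ask Python where a tool resolves. This is only a sketch: the helper name `on_path` is hypothetical, and whether `java` resolves at all depends on the runtime image being used.

```python
import os
import shutil


def on_path(tool):
    """Return the location of `tool` as found via PATH, or None if it is absent."""
    return shutil.which(tool)


java = on_path("java")
if java is None:
    print("java not found on PATH; JAVA_HOME or an explicit path may be needed")
else:
    # realpath follows symlinks such as those managed by update-alternatives.
    print("java resolves to", os.path.realpath(java))
```

If `java` resolves here, the explicit `JAVA_HOME` assignment in the cell is probably only needed by tools that read that variable directly.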

##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },

Review comment:
       The code cell above contains nothing. Maybe add some introduction text saying ...
   
   ```code
   In this notebook we show ... x ... because ... y etc.
   it is separated into 3 parts...
   1. install [link]
   2. code [link]
   3. results [link]
   ```

##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "8Xnb_ePUyQIL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 85
+        },
+        "outputId": "d5546e63-97df-42a9-ce53-f9f361eaff30"
+      },
+      "source": [
+        "import os\n",
+        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
+        "!update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
+        "!java -version\n",
+        "# os.environ[\"SPARK_HOME\"] = \"/content/spark-2.4.5-bin-hadoop2.7\""
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode\n",
+            "openjdk version \"1.8.0_252\"\n",
+            "OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)\n",
+            "OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "F9tp4EIpK_1o",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "zqLC_1yCdr57",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XGuWF9w4d-LW",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "PZkw_gPEQvId",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Run and print a shell command.\n",
+        "def run(cmd):\n",
+        "  print('>> {}'.format(cmd))\n",
+        "  !{cmd}\n",
+        "  print('')\n"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BhmBWf3u3Q0o",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Apache Maven"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I81zPDcblchL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 238
+        },
+        "outputId": "6261a902-52e5-4054-ebf6-43a50447171c"
+      },
+      "source": [
+        "import os\n",
+        "\n",
+        "# Download the maven source.\n",
+        "maven_version = 'apache-maven-3.6.3'\n",
+        "maven_path = f\"/opt/{maven_version}\"\n",
+        "if not os.path.exists(maven_path):\n",
+        "  run(f\"wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\")\n",
+        "  run('unzip -q -d /opt apache-maven.zip')\n",
+        "  run('rm -f apache-maven.zip')\n",
+        "\n",
+        "# Let's choose the absolute path instead of $PATH environment variable.\n",
+        "def maven(args):\n",
+        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
+        "\n",
+        "maven('-v')\n"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.zip\n",
+            "\n",
+            ">> unzip -q -d /opt apache-maven.zip\n",
+            "\n",
+            ">> rm -f apache-maven.zip\n",
+            "\n",
+            ">> /opt/apache-maven-3.6.3/bin/mvn -v\n",
+            "\u001b[1mApache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)\u001b[m\n",
+            "Maven home: /opt/apache-maven-3.6.3\n",
+            "Java version: 1.8.0_252, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre\n",
+            "Default locale: en_US, platform encoding: UTF-8\n",
+            "OS name: \"linux\", version: \"4.19.104+\", arch: \"amd64\", family: \"unix\"\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Xphbe3R43XLw",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Download Apache Spark"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_WgEa00pTs3w",
+        "colab_type": "text"
+      },
+      "source": [
+        "NOTE: If spark is not downloaded. Let us make sure the version we are trying to download is officially supported at\n",
+        "https://spark.apache.org/downloads.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "3zdtkFkLnskx",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 119
+        },
+        "outputId": "a5b8b7ef-a401-4549-e521-320f9f22ed41"
+      },
+      "source": [
+        "# Spark and Hadoop version\n",
+        "spark_version = 'spark-2.4.6'\n",
+        "hadoop_version = 'hadoop2.7'\n",
+        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
+        "if not os.path.exists(spark_path):\n",
+        "  run(f\"wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\")\n",
+        "  run('tar zxf apache-spark.tgz -C /opt')\n",
+        "  run('rm -f apache-spark.tgz')\n",
+        "\n",
+        "os.environ[\"SPARK_HOME\"] = spark_path\n",
+        "os.environ[\"PATH\"] += \":$SPARK_HOME/bin\"\n"
+      ],
+      "execution_count": 13,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz\n",
+            "\n",
+            ">> tar zxf apache-spark.tgz -C /opt\n",
+            "\n",
+            ">> rm -f apache-spark.tgz\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "91pJ5U8k3cjk",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Get Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "SaPIprmg3lKE",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 153
+        },
+        "outputId": "1eee45c1-cd92-480e-d27c-a2229086b5a0"
+      },
+      "source": [
+        "!git clone https://github.com/apache/systemds systemds\n",
+        "%cd systemds"
+      ],
+      "execution_count": 6,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "Cloning into 'systemds'...\n",
+            "remote: Enumerating objects: 23, done.\u001b[K\n",
+            "remote: Counting objects: 100% (23/23), done.\u001b[K\n",
+            "remote: Compressing objects: 100% (17/17), done.\u001b[K\n",
+            "remote: Total 152626 (delta 0), reused 8 (delta 0), pack-reused 152603\u001b[K\n",
+            "Receiving objects: 100% (152626/152626), 225.02 MiB | 13.23 MiB/s, done.\n",
+            "Resolving deltas: 100% (97811/97811), done.\n",
+            "/content/systemds\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "s0Iorb0ICgHa",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 51
+        },
+        "outputId": "0f40128b-ffa3-422c-abb4-e0ac9359e63f"
+      },
+      "source": [
+        "# Logging flags: -q only for ERROR; -X for DEBUG; -e for ERROR\n",
+        "maven('clean package -q')"
+      ],
+      "execution_count": 7,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> /opt/apache-maven-3.6.3/bin/mvn clean package -q\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "AqM2YNWLNhZm",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Example Classification task\n",
+        "# !$SPARK_HOME/bin/spark-submit ./target/SystemDS.jar -f ./scripts/nn/examples/fm-binclass-dummy-data.dml"
+      ],
+      "execution_count": 8,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "xhMXM8BPltGc",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Playground for DML\n",
+        "\n",
+        "The following code cell is for dml code."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "t59rTyNbOF5b",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 34
+        },
+        "outputId": "0afbd272-46cd-4f05-b500-c96b529a5309"
+      },
+      "source": [
+        "%%writefile /content/test.dml\n",
+        "\n",
+        "# This code acts as a playground for dml code\n",
+        "X = rand (rows = 20, cols = 10)\n",
+        "y = X %*% rand(rows = ncol(X), cols = 1)\n",
+        "lm(X = X, y = y)"
+      ],
+      "execution_count": 9,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "Writing /content/test.dml\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "OFtyIIJqmD_6",
+        "colab_type": "text"
+      },
+      "source": [
+        "Run `dml` with Spark backend"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "kYosPWguO7DO",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 394
+        },
+        "outputId": "ba8be00e-04be-48a4-c7ef-4056eabcd3d2"
+      },
+      "source": [
+        "!$SPARK_HOME/bin/spark-submit \\\n",
+        "    ./target/SystemDS.jar -f /content/test.dml \n"
+      ],
+      "execution_count": 14,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "20/07/18 05:06:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable\n",
+            "log4j:WARN No appenders could be found for logger (org.apache.sysds.api.DMLScript).\n",
+            "log4j:WARN Please initialize the log4j system properly.\n",
+            "log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.\n",
+            "ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.7ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.7Calling the Direct Solver...\n",
+            "Computing the statistics...\n",
+            "AVG_TOT_Y, 2.306291350486548\n",
+            "STDEV_TOT_Y, 0.41527871477713496\n",
+            "AVG_RES_Y, 5.730468777276343E-9\n",
+            "STDEV_RES_Y, 5.849014866902826E-8\n",
+            "DISPERSION, 3.144664287007204E-15\n",
+            "R2, 0.9999999999999905\n",
+            "ADJUSTED_R2, 0.9999999999999818\n",
+            "R2_NOBIAS, 0.9999999999999906\n",
+            "ADJUSTED_R2_NOBIAS, 0.9999999999999801\n",
+            "R2_VS_0, 0.9999999999999997\n",
+            "ADJUSTED_R2_VS_0, 0.9999999999999994\n",
+            "SystemDS Statistics:\n",
+            "Total execution time:\t\t0.102 sec.\n",
+            "Number of executed Spark inst:\t0.\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "SUGac5w9ZRBQ",
+        "colab_type": "text"
+      },
+      "source": [
+        "### Working with SystemDS **Standalone**\n",
+        "\n",
+        "(NOTE: Pay attention to *directories* and *relative paths*. :))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "g5Nk2Bb4UU2O",
+        "colab_type": "text"
+      },
+      "source": [
+        "##### 1. Set SystemDS environment variables\n",
+        "\n",
+        "These are useful for the `./bin/systemds` script."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "2ZnSzkq8UT32",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!export SYSTEMDS_ROOT=$(pwd)\n",
+        "!export PATH=$SYSTEMDS_ROOT/bin:$PATH"
+      ],
+      "execution_count": 15,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "Tcxkh8cdUy1V",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!echo 'export SYSTEMDS_ROOT='$(pwd) >> ~/.bashrc\n",
+        "!echo 'export PATH=$SYSTEMDS_ROOT/bin:$PATH' >> ~/.bashrc"
+      ],
+      "execution_count": 16,

Review comment:
       After some debugging on my side, it seems that none of the exports work inside this notebook. This includes the echo to ~/.bashrc.
   Also, fixing block `16` by adding a `!source ~/.bashrc` does not work.
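
A likely explanation, offered as general IPython behavior rather than something confirmed in this thread: each `!` line runs in its own child shell, so an `export` (or a `source ~/.bashrc`) only changes that short-lived process. Setting the variable through `os.environ` changes the kernel's own environment, which later `!` commands inherit. The sketch below reproduces both cases with `subprocess`; `SYSTEMDS_DEMO_VAR` is a made-up name used only for the demonstration.

```python
import os
import subprocess

# Case 1: export inside one child shell, then read it from another child shell.
# The first shell exits right away, taking its exported variable with it.
subprocess.run("export SYSTEMDS_DEMO_VAR=1", shell=True, check=True)
lost = subprocess.run(
    'echo "${SYSTEMDS_DEMO_VAR:-unset}"',
    shell=True, capture_output=True, text=True, check=True,
).stdout.strip()

# Case 2: set the variables in the parent (kernel) process instead.
# Every child process started afterwards inherits them.
os.environ["SYSTEMDS_ROOT"] = os.getcwd()
os.environ["PATH"] = (
    os.path.join(os.environ["SYSTEMDS_ROOT"], "bin") + os.pathsep + os.environ["PATH"]
)
kept = subprocess.run(
    'echo "$SYSTEMDS_ROOT"',
    shell=True, capture_output=True, text=True, check=True,
).stdout.strip()

print("after export in a child shell:", lost)
print("after os.environ assignment:  ", kept)
```

In a notebook cell, the second case corresponds to replacing the two `!export` lines with plain `os.environ[...]` assignments, run from the `systemds` directory.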

##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"

Review comment:
       Could we not move most of the install commands into this block? Like:
   
   ```
   sudo apt install openjdk-8-jdk-headless
   sudo apt install maven
   ```
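
A sketch of how a single install cell could look in the notebook's Python style. It assumes the distribution's `maven` package is recent enough for the SystemDS build, which has not been checked here. `apt_install_command` and `apt_install` are hypothetical helpers, and nothing is installed unless `execute=True` is passed. The notebook's own cell calls `apt-get` without `sudo`, so the sketch does the same.

```python
import shlex
import subprocess

def apt_install_command(packages):
    """Build one quiet, non-interactive apt-get call covering all packages."""
    return ["apt-get", "install", "-y", "-qq", *packages]

def apt_install(packages, execute=False):
    """Print the command and run it only when execute=True."""
    cmd = apt_install_command(packages)
    print(">> " + " ".join(shlex.quote(part) for part in cmd))
    if execute:
        subprocess.run(cmd, check=True)
    return cmd

# Dry run: shows the command without touching the system.
cmd = apt_install(["openjdk-8-jdk-headless", "maven"])
```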

##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "8Xnb_ePUyQIL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 85
+        },
+        "outputId": "d5546e63-97df-42a9-ce53-f9f361eaff30"
+      },
+      "source": [
+        "import os\n",
+        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
+        "!update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
+        "!java -version\n",
+        "# os.environ[\"SPARK_HOME\"] = \"/content/spark-2.4.5-bin-hadoop2.7\""
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode\n",
+            "openjdk version \"1.8.0_252\"\n",
+            "OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)\n",
+            "OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "F9tp4EIpK_1o",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "zqLC_1yCdr57",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XGuWF9w4d-LW",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "PZkw_gPEQvId",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Run and print a shell command.\n",
+        "def run(cmd):\n",
+        "  print('>> {}'.format(cmd))\n",
+        "  !{cmd}\n",
+        "  print('')\n"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BhmBWf3u3Q0o",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Apache Maven"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I81zPDcblchL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 238
+        },
+        "outputId": "6261a902-52e5-4054-ebf6-43a50447171c"
+      },
+      "source": [
+        "import os\n",
+        "\n",
+        "# Download the maven source.\n",
+        "maven_version = 'apache-maven-3.6.3'\n",
+        "maven_path = f\"/opt/{maven_version}\"\n",
+        "if not os.path.exists(maven_path):\n",
+        "  run(f\"wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\")\n",
+        "  run('unzip -q -d /opt apache-maven.zip')\n",
+        "  run('rm -f apache-maven.zip')\n",
+        "\n",
+        "# Let's choose the absolute path instead of $PATH environment variable.\n",
+        "def maven(args):\n",
+        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
+        "\n",
+        "maven('-v')\n"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.zip\n",
+            "\n",
+            ">> unzip -q -d /opt apache-maven.zip\n",
+            "\n",
+            ">> rm -f apache-maven.zip\n",
+            "\n",
+            ">> /opt/apache-maven-3.6.3/bin/mvn -v\n",
+            "\u001b[1mApache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)\u001b[m\n",
+            "Maven home: /opt/apache-maven-3.6.3\n",
+            "Java version: 1.8.0_252, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre\n",
+            "Default locale: en_US, platform encoding: UTF-8\n",
+            "OS name: \"linux\", version: \"4.19.104+\", arch: \"amd64\", family: \"unix\"\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Xphbe3R43XLw",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Download Apache Spark"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_WgEa00pTs3w",
+        "colab_type": "text"
+      },
+      "source": [
+        "NOTE: If spark is not downloaded. Let us make sure the version we are trying to download is officially supported at\n",
+        "https://spark.apache.org/downloads.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "3zdtkFkLnskx",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 119
+        },
+        "outputId": "a5b8b7ef-a401-4549-e521-320f9f22ed41"
+      },
+      "source": [
+        "# Spark and Hadoop version\n",
+        "spark_version = 'spark-2.4.6'\n",
+        "hadoop_version = 'hadoop2.7'\n",
+        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
+        "if not os.path.exists(spark_path):\n",
+        "  run(f\"wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\")\n",
+        "  run('tar zxf apache-spark.tgz -C /opt')\n",
+        "  run('rm -f apache-spark.tgz')\n",
+        "\n",
+        "os.environ[\"SPARK_HOME\"] = spark_path\n",
+        "os.environ[\"PATH\"] += \":$SPARK_HOME/bin\"\n"
+      ],
+      "execution_count": 13,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz\n",
+            "\n",
+            ">> tar zxf apache-spark.tgz -C /opt\n",
+            "\n",
+            ">> rm -f apache-spark.tgz\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "91pJ5U8k3cjk",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Get Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "SaPIprmg3lKE",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 153
+        },
+        "outputId": "1eee45c1-cd92-480e-d27c-a2229086b5a0"
+      },
+      "source": [
+        "!git clone https://github.com/apache/systemds systemds\n",

Review comment:
       Could we use the trick of downloading only the last commit, with `--depth=1`?
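
For illustration, a shallow clone could be wired in roughly as below. The clone output recorded in this notebook shows about 225 MiB received for the full history, which is what `--depth=1` is meant to avoid. `shallow_clone_command` and `shallow_clone` are hypothetical helpers, and the clone runs only when `execute=True` is passed.

```python
import subprocess

def shallow_clone_command(url, dest, depth=1):
    """Build a git clone call that fetches only the most recent commits."""
    # --depth also implies --single-branch, so only the default branch is fetched.
    return ["git", "clone", f"--depth={depth}", url, dest]

def shallow_clone(url, dest, depth=1, execute=False):
    """Print the command and run it only when execute=True."""
    cmd = shallow_clone_command(url, dest, depth)
    print(">> " + " ".join(cmd))
    if execute:
        subprocess.run(cmd, check=True)
    return cmd

# Dry run with the repository URL used in the notebook.
clone_cmd = shallow_clone("https://github.com/apache/systemds", "systemds")
```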
   

##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "8Xnb_ePUyQIL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 85
+        },
+        "outputId": "d5546e63-97df-42a9-ce53-f9f361eaff30"
+      },
+      "source": [
+        "import os\n",
+        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
+        "!update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
+        "!java -version\n",
+        "# os.environ[\"SPARK_HOME\"] = \"/content/spark-2.4.5-bin-hadoop2.7\""
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode\n",
+            "openjdk version \"1.8.0_252\"\n",
+            "OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)\n",
+            "OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "F9tp4EIpK_1o",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "zqLC_1yCdr57",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XGuWF9w4d-LW",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "PZkw_gPEQvId",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Run and print a shell command.\n",
+        "def run(cmd):\n",
+        "  print('>> {}'.format(cmd))\n",
+        "  !{cmd}\n",
+        "  print('')\n"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BhmBWf3u3Q0o",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Apache Maven"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I81zPDcblchL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 238
+        },
+        "outputId": "6261a902-52e5-4054-ebf6-43a50447171c"
+      },
+      "source": [
+        "import os\n",
+        "\n",
+        "# Download the maven source.\n",
+        "maven_version = 'apache-maven-3.6.3'\n",
+        "maven_path = f\"/opt/{maven_version}\"\n",
+        "if not os.path.exists(maven_path):\n",
+        "  run(f\"wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\")\n",
+        "  run('unzip -q -d /opt apache-maven.zip')\n",
+        "  run('rm -f apache-maven.zip')\n",
+        "\n",
+        "# Let's choose the absolute path instead of $PATH environment variable.\n",
+        "def maven(args):\n",
+        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
+        "\n",
+        "maven('-v')\n"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.zip\n",
+            "\n",
+            ">> unzip -q -d /opt apache-maven.zip\n",
+            "\n",
+            ">> rm -f apache-maven.zip\n",
+            "\n",
+            ">> /opt/apache-maven-3.6.3/bin/mvn -v\n",
+            "\u001b[1mApache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)\u001b[m\n",
+            "Maven home: /opt/apache-maven-3.6.3\n",
+            "Java version: 1.8.0_252, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre\n",
+            "Default locale: en_US, platform encoding: UTF-8\n",
+            "OS name: \"linux\", version: \"4.19.104+\", arch: \"amd64\", family: \"unix\"\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Xphbe3R43XLw",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Download Apache Spark"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_WgEa00pTs3w",
+        "colab_type": "text"
+      },
+      "source": [
+        "NOTE: If spark is not downloaded. Let us make sure the version we are trying to download is officially supported at\n",
+        "https://spark.apache.org/downloads.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "3zdtkFkLnskx",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 119
+        },
+        "outputId": "a5b8b7ef-a401-4549-e521-320f9f22ed41"
+      },
+      "source": [
+        "# Spark and Hadoop version\n",
+        "spark_version = 'spark-2.4.6'\n",
+        "hadoop_version = 'hadoop2.7'\n",
+        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
+        "if not os.path.exists(spark_path):\n",
+        "  run(f\"wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\")\n",
+        "  run('tar zxf apache-spark.tgz -C /opt')\n",
+        "  run('rm -f apache-spark.tgz')\n",
+        "\n",
+        "os.environ[\"SPARK_HOME\"] = spark_path\n",
+        "os.environ[\"PATH\"] += \":$SPARK_HOME/bin\"\n"

Review comment:
       I think two versions would be better: one using Spark and one without, since adding the Spark install unnecessarily complicates this notebook. Also, using Spark inside this notebook should not improve performance and only adds overhead, since everything is executed on a single node.
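
A minimal sketch of how the Spark step could be gated, so that the default run skips it. The `USE_SPARK` flag and the `spark_setup_commands` helper are hypothetical names used only for illustration; the versions and download URL are copied from the quoted cell, and `subprocess.run` stands in for the notebook's `run` helper.

```python
import os
import subprocess

def spark_setup_commands(spark_version="spark-2.4.6",
                         hadoop_version="hadoop2.7",
                         install_root="/opt"):
    """Return (spark_path, commands); commands is empty if Spark is already unpacked."""
    spark_path = f"{install_root}/{spark_version}-bin-{hadoop_version}"
    if os.path.exists(spark_path):
        return spark_path, []
    url = (f"https://downloads.apache.org/spark/{spark_version}/"
           f"{spark_version}-bin-{hadoop_version}.tgz")
    commands = [
        f"wget -q -nc -O apache-spark.tgz {url}",
        f"tar zxf apache-spark.tgz -C {install_root}",
        "rm -f apache-spark.tgz",
    ]
    return spark_path, commands

# Hypothetical switch: leave it False for the standalone (no Spark) variant.
USE_SPARK = False

if USE_SPARK:
    spark_path, commands = spark_setup_commands()
    for cmd in commands:
        print(f">> {cmd}")
        subprocess.run(cmd, shell=True, check=True)
    os.environ["SPARK_HOME"] = spark_path
```

With the flag off, nothing is downloaded and the rest of the notebook is unaffected; whether a single gated notebook or two separate notebooks is preferable is left to the discussion in this thread.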







[GitHub] [systemds] j143 commented on a change in pull request #999: Notebook for SystemDS on colab for developers

Posted by GitBox <gi...@apache.org>.
j143 commented on a change in pull request #999:
URL: https://github.com/apache/systemds/pull/999#discussion_r460255867



##########
File path: notebooks/systemds_dev.ipynb
##########
@@ -0,0 +1,882 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "SystemDS on Colaboratory.ipynb",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github.com/apache/systemds/blob/master/notebooks/colab/systemds_dev_standalone.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_BbCdLjRoy2A",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Developer notebook for Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I76oWp7foiyF",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "kvD4HBMi0ohY",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Java\n",
+        "This installs Java 8 of Open JDK"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "fUhBhrGmyAvs",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!apt-get install openjdk-8-jdk-headless -qq > /dev/null"
+      ],
+      "execution_count": 1,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4Kjvk_h1AHl",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Set Environment Variables\n",
+        "Set the locations where Spark and Java are installed."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "8Xnb_ePUyQIL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 85
+        },
+        "outputId": "d5546e63-97df-42a9-ce53-f9f361eaff30"
+      },
+      "source": [
+        "import os\n",
+        "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\"\n",
+        "!update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java\n",
+        "!java -version\n",
+        "# os.environ[\"SPARK_HOME\"] = \"/content/spark-2.4.5-bin-hadoop2.7\""
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode\n",
+            "openjdk version \"1.8.0_252\"\n",
+            "OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)\n",
+            "OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "F9tp4EIpK_1o",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "zqLC_1yCdr57",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Apache SystemDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XGuWF9w4d-LW",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "PZkw_gPEQvId",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Run and print a shell command.\n",
+        "def run(cmd):\n",
+        "  print('>> {}'.format(cmd))\n",
+        "  !{cmd}\n",
+        "  print('')\n"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BhmBWf3u3Q0o",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Install Apache Maven"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "I81zPDcblchL",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 238
+        },
+        "outputId": "6261a902-52e5-4054-ebf6-43a50447171c"
+      },
+      "source": [
+        "import os\n",
+        "\n",
+        "# Download the maven source.\n",
+        "maven_version = 'apache-maven-3.6.3'\n",
+        "maven_path = f\"/opt/{maven_version}\"\n",
+        "if not os.path.exists(maven_path):\n",
+        "  run(f\"wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/{maven_version}-bin.zip\")\n",
+        "  run('unzip -q -d /opt apache-maven.zip')\n",
+        "  run('rm -f apache-maven.zip')\n",
+        "\n",
+        "# Let's choose the absolute path instead of $PATH environment variable.\n",
+        "def maven(args):\n",
+        "  run(f\"{maven_path}/bin/mvn {args}\")\n",
+        "\n",
+        "maven('-v')\n"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            ">> wget -q -nc -O apache-maven.zip https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.zip\n",
+            "\n",
+            ">> unzip -q -d /opt apache-maven.zip\n",
+            "\n",
+            ">> rm -f apache-maven.zip\n",
+            "\n",
+            ">> /opt/apache-maven-3.6.3/bin/mvn -v\n",
+            "\u001b[1mApache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)\u001b[m\n",
+            "Maven home: /opt/apache-maven-3.6.3\n",
+            "Java version: 1.8.0_252, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre\n",
+            "Default locale: en_US, platform encoding: UTF-8\n",
+            "OS name: \"linux\", version: \"4.19.104+\", arch: \"amd64\", family: \"unix\"\n",
+            "\n"
+          ],
+          "name": "stdout"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Xphbe3R43XLw",
+        "colab_type": "text"
+      },
+      "source": [
+        "#### Download Apache Spark"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_WgEa00pTs3w",
+        "colab_type": "text"
+      },
+      "source": [
+        "NOTE: If spark is not downloaded. Let us make sure the version we are trying to download is officially supported at\n",
+        "https://spark.apache.org/downloads.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "3zdtkFkLnskx",
+        "colab_type": "code",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 119
+        },
+        "outputId": "a5b8b7ef-a401-4549-e521-320f9f22ed41"
+      },
+      "source": [
+        "# Spark and Hadoop version\n",
+        "spark_version = 'spark-2.4.6'\n",
+        "hadoop_version = 'hadoop2.7'\n",
+        "spark_path = f\"/opt/{spark_version}-bin-{hadoop_version}\"\n",
+        "if not os.path.exists(spark_path):\n",
+        "  run(f\"wget -q -nc -O apache-spark.tgz https://downloads.apache.org/spark/{spark_version}/{spark_version}-bin-{hadoop_version}.tgz\")\n",
+        "  run('tar zxf apache-spark.tgz -C /opt')\n",
+        "  run('rm -f apache-spark.tgz')\n",
+        "\n",
+        "os.environ[\"SPARK_HOME\"] = spark_path\n",
+        "os.environ[\"PATH\"] += \":$SPARK_HOME/bin\"\n"

Review comment:
       Yes, for now I am adding this as an optional step.
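
If the Spark cell stays as an optional step, one detail in the quoted cell may be worth a second look: Python does not expand `$SPARK_HOME` inside a string, and the shell does not expand variable references stored inside `PATH` entries, so the appended entry is likely a literal `$SPARK_HOME/bin`. The sketch below illustrates the difference; `append_to_path` is a hypothetical helper, and `base_path` is a stand-in for `os.environ["PATH"]` so that nothing in the real environment is modified.

```python
import os

def append_to_path(path_value, directory):
    """Return path_value with directory appended, using the platform separator."""
    return path_value + os.pathsep + directory

spark_path = "/opt/spark-2.4.6-bin-hadoop2.7"  # value built in the quoted cell
base_path = "/usr/bin"                          # stand-in for os.environ["PATH"]

# What the quoted cell appends: a literal string, with no $VAR expansion.
literal = base_path + ":$SPARK_HOME/bin"

# Alternative built from the Python variable, so the entry is already resolved.
resolved = append_to_path(base_path, os.path.join(spark_path, "bin"))

print(literal)
print(resolved)
```

In the notebook itself, the same idea would be applied to `os.environ["PATH"]` directly.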



