Posted to commits@systemds.apache.org by ja...@apache.org on 2021/05/14 03:05:07 UTC

[systemds] branch master updated: [SYSTEMDS-2970] Initial linux instructions for GPU (#1274)

This is an automated email from the ASF dual-hosted git repository.

janardhan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemds.git


The following commit(s) were added to refs/heads/master by this push:
     new 3eb8c5a  [SYSTEMDS-2970] Initial linux instructions for GPU (#1274)
3eb8c5a is described below

commit 3eb8c5ae5d7612fbc92cc0baca183e5b33ada940
Author: j143 <j1...@protonmail.com>
AuthorDate: Fri May 14 08:35:00 2021 +0530

    [SYSTEMDS-2970] Initial linux instructions for GPU (#1274)
    
    * Add commercial version of NVIDIA hardware supported.
    * CUDA 10.2 and CuDNN 7.6.5 install instructions
    * Verified the instructions work on NVIDIA Tesla K80 with 30GB memory
    
    * add gpu pagelink to header
---
 docs/_includes/header.html |   7 +--
 docs/site/gpu.md           | 132 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 135 insertions(+), 4 deletions(-)

diff --git a/docs/_includes/header.html b/docs/_includes/header.html
index 82506e1..fcf7cec 100644
--- a/docs/_includes/header.html
+++ b/docs/_includes/header.html
@@ -43,14 +43,15 @@ limitations under the License.
                         <li class="divider"></li>
                         <li><b>Running SystemDS:</b></li>
                         <li><a href=".{% if page.path contains 'site' %}/..{% endif %}/site/run">Standalone Guide</a></li>
+                        <li><a href=".{% if page.path contains 'site' %}/..{% endif %}/site/gpu">GPU Guide</a></li>
                         <li class="divider"></li>
                         <li><b>Language Guides:</b></li>
                         <li><a href=".{% if page.path contains 'site' %}/..{% endif %}/site/dml-language-reference.html">DML Language Reference</a></li>
                         <li><a href=".{% if page.path contains 'site' %}/..{% endif %}/site/builtins-reference.html">Built-in Functions Reference</a></li>
                         <li><a href=".{% if page.path contains 'site' %}/..{% endif %}/site/dml-vs-r-guide.html">DML vs R guide</a></li>
                         <li class="divider"></li>
-                        <li><b>ML Algorithms:</b></li>
-                        <li><a href=".{% if page.path contains 'site' %}/..{% endif %}/site/algorithms-reference.html">Algorithms Reference</a></li>
+                        <li><b>Algorithms:</b></li>
+                        <li><a href=".{% if page.path contains 'site' %}/..{% endif %}/site/algorithms-reference.html">ML Algorithms Reference</a></li>
                         <li class="divider"></li>
                         <li><b>Other:</b></li>
                         <li><a href="https://github.com/apache/systemds/blob/master/CONTRIBUTING.md">Contributing to SystemDS 🡕</a></li>
@@ -68,4 +69,4 @@ limitations under the License.
             </ul>
         </nav>
     </div>
-</header>
\ No newline at end of file
+</header>
diff --git a/docs/site/gpu.md b/docs/site/gpu.md
index 80a3e5e..0734171 100644
--- a/docs/site/gpu.md
+++ b/docs/site/gpu.md
@@ -24,6 +24,7 @@ limitations under the License.
 This guide covers the GPU hardware and software setup for using SystemDS `gpu` mode.
 
 - [Requirements](#requirements)
+- [Linux](#linux)
 - [Windows](#windows)
 - [Command-line users](#command-line-users)
 - [Scala Users](#scala-users)
@@ -54,6 +55,20 @@ architecture specific PTX is not available enable JIT PTX with instructions comp
   > nvcc SystemDS.cu --gpu-architecture=compute_50 --gpu-code=sm_50,sm_52
   > ```
 
+Note: At least 30 GB of disk space is recommended.
+
+
+CUDA toolkit version 10.2 or later is recommended for the following GPUs.
+
+| GPU type | Status | 
+| --- | --- |
+| NVIDIA T4 | Experimental |
+| NVIDIA V100 | Experimental |
+| NVIDIA P100 | Experimental |
+| NVIDIA P4 | Experimental |
+| NVIDIA K80 | Tested |
+| NVIDIA A100 | Not supported |
+
 ### Software
 
 The following NVIDIA software is required to be installed in your system:
@@ -65,13 +80,128 @@ CUDA toolkit
   3. [CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive)
   4. [CUDNN 7.x](https://developer.nvidia.com/cudnn)
 
+## Linux
+
+The easiest way to install the NVIDIA software is with `apt` on Ubuntu. For other distributions,
+refer to the [CUDA installation guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).
+
+Note: Not all Linux distributions support this. You might encounter problems with driver
+installation.
+
+To check the CUDA compatible driver version:
+
+Install [CUPTI](http://docs.nvidia.com/cuda/cupti/), which ships with the CUDA toolkit, for profiling.
+
+```sh
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
+```
+
+### Install CUDA with apt
+
+The following instructions are for installing CUDA 10.2 on Ubuntu 18.04. These instructions
+might work for other Debian-based distros.
+
+Note: [Secure Boot](https://wiki.ubuntu.com/UEFI/SecureBoot) tends to complicate installation.
+These instructions may not address this.
+
+#### Ubuntu 18.04 (CUDA 10.2)
+
+```sh
+
+# Add NVIDIA package repositories
+# 1. Download the apt pin file for the Ubuntu 18.04 CUDA repository
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
+# 2. Move the pin file into the apt preferences directory
+sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
+# 3. Fetch keys
+sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
+# 4. Add the repository
+sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
+# 5. Update package lists
+sudo apt-get update
+
+# ---
+# 6. Get the machine-learning repo
+# This downloads the repository package, not the actual installation package
+wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
+
+sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
+sudo apt-get update
+
+wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
+sudo apt install ./libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
+sudo apt-get update
+
+wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
+sudo apt install ./libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
+sudo apt-get update
+
+# ---
+
+# 7. Install development and runtime libraries (~4GB)
+sudo apt-get install --no-install-recommends \
+    cuda-10-2 \
+    libcudnn7=7.6.5.32-1+cuda10.2 \
+    libcudnn7-dev=7.6.5.32-1+cuda10.2
+    
+# Reboot the system, then run `nvidia-smi` to check the GPU.
+```
+
+#### Installation check
+
+```sh
+$ nvidia-smi
+Thu May 13 04:19:11 2021
++-----------------------------------------------------------------------------+
+| NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |
+|-------------------------------+----------------------+----------------------+
+| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
+| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
+|                               |                      |               MIG M. |
+|===============================+======================+======================|
+|   0  NVIDIA Tesla K80    Off  | 00000000:00:1E.0 Off |                    0 |
+| N/A   38C    P0    58W / 149W |      0MiB / 11441MiB |     98%      Default |
+|                               |                      |                  N/A |
++-------------------------------+----------------------+----------------------+
+
++-----------------------------------------------------------------------------+
+| Processes:                                                                  |
+|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
+|        ID   ID                                                   Usage      |
+|=============================================================================|
+|  No running processes found                                                 |
++-----------------------------------------------------------------------------+
+```
+
+#### To run SystemDS with CUDA
+
+Pass the `.dml` file with the `-f` flag:
+
+```sh
+java -Xmx4g -Xms4g -Xmn400m -cp target/SystemDS.jar:target/lib/*:target/SystemDS-*.jar org.apache.sysds.api.DMLScript -f ../main.dml -exec singlenode -gpu
+```
+
+```output
+[ INFO] BEGIN DML run 05/14/2021 02:37:26
+[ INFO] Initializing CUDA
+[ INFO] GPU memory - Total: 11996.954624 MB, Available: 11750.539264 MB on GPUContext{deviceNum=0}
+[ INFO] Total number of GPUs on the machine: 1
+[ INFO] GPUs being used: -1
+[ INFO] Initial GPU memory: 10575485337
+
+This is SystemDS!
+
+SystemDS Statistics:
+Total execution time:           0.020 sec.
+```
+
 ## Windows
 
 Install the hardware and software requirements.
 
 Add CUDA, CUPTI, and cuDNN installation directories to `%PATH%` environmental
 variable. Neural networks won't run without cuDNN `cuDNN64_7*.dll`.
-See [Windows install from source guide](./windows-source-installation.md).
+See [Windows install from source guide](windows-source-installation).
 
 ```sh
 SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin;%PATH%