Posted to commits@systemml.apache.org by de...@apache.org on 2017/04/07 18:58:37 UTC

[33/50] [abbrv] incubator-systemml git commit: Upgraded to use jcuda8 (from the maven repo)

Upgraded to use jcuda8 (from the maven repo)

Closes #291


Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/be4eaaf2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/be4eaaf2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/be4eaaf2

Branch: refs/heads/gh-pages
Commit: be4eaaf2a9b27d0a611cedb8b1d53e9a0a6a9296
Parents: fd96a3e
Author: Nakul Jindal <na...@gmail.com>
Authored: Fri Mar 3 18:11:45 2017 -0800
Committer: Nakul Jindal <na...@gmail.com>
Committed: Fri Mar 3 18:11:46 2017 -0800

----------------------------------------------------------------------
 devdocs/gpu-backend.md | 61 +++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 35 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/be4eaaf2/devdocs/gpu-backend.md
----------------------------------------------------------------------
diff --git a/devdocs/gpu-backend.md b/devdocs/gpu-backend.md
index c6f66d6..40311c7 100644
--- a/devdocs/gpu-backend.md
+++ b/devdocs/gpu-backend.md
@@ -19,52 +19,43 @@ limitations under the License.
 
 # Initial prototype for GPU backend
 
-A GPU backend implements two important abstract classes:
+The GPU backend implements two important abstract classes:
 1. `org.apache.sysml.runtime.controlprogram.context.GPUContext`
 2. `org.apache.sysml.runtime.controlprogram.context.GPUObject`
 
-The GPUContext is responsible for GPU memory management and initialization/destruction of Cuda handles.
+The `GPUContext` is responsible for GPU memory management and initialization/destruction of Cuda handles.
+Currently, an active instance of the `GPUContext` class is made available globally and is used to store handles
+of the allocated blocks on the GPU. A count is kept per block for the number of instructions that need it.
+When the count is 0, the block may be evicted on a call to `GPUObject.evict()`.
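The per-block reference counting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not SystemML's actual implementation; the class and method names (`MiniGpuContext`, `acquire`, `release`, `canEvict`) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-block reference counting: each GPU block
// carries a count of the instructions that still need it, and a block
// is evictable only once that count drops to zero.
public class MiniGpuContext {
    private final Map<String, Integer> refCounts = new HashMap<>();

    // An instruction declares that it needs a block: bump its count.
    public void acquire(String blockId) {
        refCounts.merge(blockId, 1, Integer::sum);
    }

    // An instruction is done with a block: decrement its count.
    public void release(String blockId) {
        refCounts.merge(blockId, -1, Integer::sum);
    }

    // Eviction is only legal when no instruction holds the block.
    public boolean canEvict(String blockId) {
        return refCounts.getOrDefault(blockId, 0) == 0;
    }

    public static void main(String[] args) {
        MiniGpuContext ctx = new MiniGpuContext();
        ctx.acquire("A");
        System.out.println(ctx.canEvict("A")); // false: one instruction still needs "A"
        ctx.release("A");
        System.out.println(ctx.canEvict("A")); // true: count is back to 0
    }
}
```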
 
-A GPUObject (like RDDObject and BroadcastObject) is stored in CacheableData object. It gets call-backs from SystemML's bufferpool on following methods
+A `GPUObject` (like `RDDObject` and `BroadcastObject`) is stored in a `CacheableData` object. It receives call-backs from SystemML's buffer pool on the following methods:
 1. void acquireDeviceRead()
-2. void acquireDenseDeviceModify(int numElemsToAllocate)
-3. void acquireHostRead()
-4. void acquireHostModify()
-5. void release(boolean isGPUCopyModified)
+2. void acquireDeviceModifyDense()
+3. void acquireDeviceModifySparse()
+4. void acquireHostRead()
+5. void acquireHostModify()
+6. void releaseInput()
+7. void releaseOutput()
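A typical GPU instruction pairs the call-backs listed above: inputs are acquired for reading, the output for modification, and everything is released after the kernel launch. The sketch below records that protocol; `RecordingGpuObject` and `GpuInstructionSketch` are hypothetical stand-ins, not SystemML's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that logs each call-back it receives,
// so the acquire/release pairing is visible.
class RecordingGpuObject {
    final String name;
    final List<String> log;
    RecordingGpuObject(String name, List<String> log) { this.name = name; this.log = log; }
    void acquireDeviceRead()        { log.add(name + ".acquireDeviceRead"); }
    void acquireDeviceModifyDense() { log.add(name + ".acquireDeviceModifyDense"); }
    void releaseInput()             { log.add(name + ".releaseInput"); }
    void releaseOutput()            { log.add(name + ".releaseOutput"); }
}

public class GpuInstructionSketch {
    // Rough shape of a binary GPU instruction: acquire both inputs for
    // reading, the output for dense modification, then release all three.
    static void execute(RecordingGpuObject in1, RecordingGpuObject in2, RecordingGpuObject out) {
        in1.acquireDeviceRead();
        in2.acquireDeviceRead();
        out.acquireDeviceModifyDense();
        // ... launch the CUDA kernel here ...
        in1.releaseInput();
        in2.releaseInput();
        out.releaseOutput();
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        execute(new RecordingGpuObject("A", log),
                new RecordingGpuObject("B", log),
                new RecordingGpuObject("C", log));
        System.out.println(log);
    }
}
```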
 
-## JCudaContext:
-The current prototype supports Nvidia's CUDA libraries using JCuda wrapper. The implementation for the above classes can be found in:
-1. `org.apache.sysml.runtime.controlprogram.context.JCudaContext`
-2. `org.apache.sysml.runtime.controlprogram.context.JCudaObject`
+Sparse matrices on the GPU are represented in `CSR` format. In the SystemML runtime, they are represented in `MCSR` (modified `CSR`) format.
+A conversion cost is incurred when sparse matrices are sent back and forth between host and device memory.
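For reference, CSR stores a sparse matrix in three flat arrays: row pointers, column indices, and values. The minimal example below illustrates that layout only; it uses plain arrays, not SystemML's sparse block classes.

```java
// Minimal illustration of the CSR layout mentioned above, for the
// 2x3 matrix [[10, 0, 20],
//             [ 0, 30, 0]].
public class CsrExample {
    public static void main(String[] args) {
        int[] rowPtr = {0, 2, 3};     // row i occupies entries rowPtr[i] .. rowPtr[i+1]-1
        int[] colIdx = {0, 2, 1};     // column index of each stored non-zero
        double[] vals = {10, 20, 30}; // the non-zero values themselves
        // Number of non-zeros in row 0:
        System.out.println(rowPtr[1] - rowPtr[0]); // prints 2
    }
}
```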
 
-### Setup instructions for JCudaContext:
+Concrete classes `JCudaContext` and `JCudaObject` (which extend `GPUContext` and `GPUObject`, respectively) contain references to `org.jcuda.*`.
 
-1. Follow the instructions from `https://developer.nvidia.com/cuda-downloads` and install CUDA 7.5.
-2. Follow the instructions from `https://developer.nvidia.com/cudnn` and install CuDNN v4.
-3. Download install JCuda binaries version 0.7.5b and JCudnn version 0.7.5. Easiest option would be to use mavenized jcuda: 
-```python
-git clone https://github.com/MysterionRise/mavenized-jcuda.git
-mvn -Djcuda.version=0.7.5b -Djcudnn.version=0.7.5 clean package
-CURR_DIR=`pwd`
-JCUDA_PATH=$CURR_DIR"/target/lib/"
-JAR_PATH="."
-for j in `ls $JCUDA_PATH/*.jar`
-do
-        JAR_PATH=$JAR_PATH":"$j
-done
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JCUDA_PATH
-```
+The `LibMatrixCUDA` class contains methods to invoke CUDA libraries (where available) and custom kernels.
+Runtime classes (that extend `GPUInstruction`) redirect calls to functions in this class.
+Some functions in `LibMatrixCUDA` need finer control over GPU memory management primitives. These are provided by `JCudaObject`.
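The dispatch pattern described above can be sketched roughly as follows: the instruction class holds no GPU logic and simply redirects to a static library method, which picks between a CUDA library call and a custom kernel. The names here (`LibMatrixSketch`, `MatMultGpuInstruction`, the boolean flag) are hypothetical placeholders, not the real SystemML API.

```java
// Hypothetical sketch of the instruction-to-library dispatch pattern.
class LibMatrixSketch {
    // In the real backend this choice would be between e.g. a cuBLAS
    // call and a hand-written CUDA kernel; here we just return a label.
    static String matmult(boolean cudaLibraryAvailable) {
        return cudaLibraryAvailable ? "library-call" : "custom-kernel";
    }
}

public class MatMultGpuInstruction {
    // The instruction itself carries no GPU logic; it redirects.
    public String processInstruction(boolean cudaLibraryAvailable) {
        return LibMatrixSketch.matmult(cudaLibraryAvailable);
    }

    public static void main(String[] args) {
        MatMultGpuInstruction inst = new MatMultGpuInstruction();
        System.out.println(inst.processInstruction(true));  // prints library-call
        System.out.println(inst.processInstruction(false)); // prints custom-kernel
    }
}
```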
+
+### Setup instructions:
 
-Note for Windows users:
-* CuDNN v4 is available to download: `http://developer.download.nvidia.com/compute/redist/cudnn/v4/cudnn-7.0-win-x64-v4.0-prod.zip`
-* If above steps doesn't work for JCuda, copy the DLLs into C:\lib (or /lib) directory.
+1. Follow the instructions from `https://developer.nvidia.com/cuda-downloads` and install CUDA 8.0.
+2. Follow the instructions from `https://developer.nvidia.com/cudnn` and install CuDNN v5.1.
 
-To use SystemML's GPU backend, 
+To use SystemML's GPU backend when using the jar or uber-jar:
 1. Add JCuda's jar into the classpath.
-2. Include CUDA, CuDNN and JCuda's libraries in LD_LIBRARY_PATH (or using -Djava.library.path).
-3. Use `-gpu` flag.
+2. Use `-gpu` flag.
 
 For example: to use GPU backend in standalone mode:
-```python
-java -classpath $JAR_PATH:systemml-0.10.0-incubating-SNAPSHOT-standalone.jar org.apache.sysml.api.DMLScript -f MyDML.dml -gpu -exec singlenode ... 
+```bash
+java -classpath $JAR_PATH:systemml-0.14.0-incubating-SNAPSHOT-standalone.jar org.apache.sysml.api.DMLScript -f MyDML.dml -gpu -exec singlenode ... 
 ```