Posted to commits@systemml.apache.org by de...@apache.org on 2017/04/07 18:58:14 UTC

[10/50] [abbrv] incubator-systemml git commit: [SYSTEMML-1190] Allow Scala UDF to be passed to SystemML via external UDF mechanism

[SYSTEMML-1190] Allow Scala UDF to be passed to SystemML via external UDF mechanism

The registration mechanism is inspired by the UDF support in Spark's
SQLContext. The key construct is ml.udf.register("fn to be used in DML", scala UDF).

The restrictions for Scala UDF are as follows:
- Only the types defined by the DML language are supported for parameters and return values (i.e. Int, Double, Boolean, String, double[][], the last being Array[Array[Double]] in Scala).
- The function must have at least 1 argument and 1 return value.
- The function can have at most 10 arguments and 10 return values.
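
As a rough illustration of the register-by-name pattern described above, the following self-contained Scala 2 sketch stores functions under a name through one `register` overload per arity, in the style of Spark's UDF registration. It is not SystemML's implementation: `UdfRegistrySketch`, `invoke`, and `UdfRegistryDemo` are hypothetical names invented for this example, only arities 1 and 2 are shown, and no DML type checking is performed.

```scala
import scala.collection.mutable

// Hypothetical sketch, NOT SystemML code: a name-keyed function registry.
object UdfRegistrySketch {
  // Registered functions are stored untyped, keyed by the name a script would use.
  private val fns = mutable.Map.empty[String, Seq[Any] => Any]

  // One overload per arity (only 1 and 2 shown; the commit allows up to 10).
  def register[A, R](name: String, f: A => R): Unit =
    fns(name) = args => f(args(0).asInstanceOf[A])

  def register[A, B, R](name: String, f: (A, B) => R): Unit =
    fns(name) = args => f(args(0).asInstanceOf[A], args(1).asInstanceOf[B])

  // Looks up a function by name and applies it; throws if the name is unknown.
  def invoke(name: String, args: Any*): Any = fns(name)(args)
}

object UdfRegistryDemo {
  def addOne(x: Double): Double = x + 1

  // Multiple return values are modelled as a Scala tuple.
  def multiArgReturnFn(x: Double, y: Int): (Double, Int) = (x + 1, (x * y).toInt)

  def main(args: Array[String]): Unit = {
    UdfRegistrySketch.register("addOne", addOne _)
    UdfRegistrySketch.register("multiArgReturnFn", multiArgReturnFn _)
    println(UdfRegistrySketch.invoke("addOne", 2.0))              // prints 3.0
    println(UdfRegistrySketch.invoke("multiArgReturnFn", 2.0, 1)) // prints (3.0,2)
  }
}
```

The per-arity overloads are what let a caller write `register("addOne", addOne _)` without spelling out the function type; a real implementation would additionally need to validate argument and return types against those DML supports.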

Closes #349.


Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/45fab153
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/45fab153
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/45fab153

Branch: refs/heads/gh-pages
Commit: 45fab15340234a74b799b1c488e93f8037a59307
Parents: 5b21588
Author: Niketan Pansare <np...@us.ibm.com>
Authored: Mon Jan 23 13:27:35 2017 -0800
Committer: Niketan Pansare <np...@us.ibm.com>
Committed: Mon Jan 23 13:31:07 2017 -0800

----------------------------------------------------------------------
 spark-mlcontext-programming-guide.md | 39 +++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/45fab153/spark-mlcontext-programming-guide.md
----------------------------------------------------------------------
diff --git a/spark-mlcontext-programming-guide.md b/spark-mlcontext-programming-guide.md
index dcaa125..759d392 100644
--- a/spark-mlcontext-programming-guide.md
+++ b/spark-mlcontext-programming-guide.md
@@ -1636,6 +1636,45 @@ scala> for (i <- 1 to 5) {
 
 </div>
 
+## Passing Scala UDF to SystemML
+
+SystemML allows users to pass a Scala UDF (with input/output types supported by SystemML)
+to the DML script via MLContext. The restrictions for the supported Scala UDFs are as follows:
+
+1. Only the types defined by the DML language are supported for parameters and return values (i.e. Int, Double, Boolean, String, double[][]).
+2. The function must have at least 1 argument and 1 return value.
+3. The function can have at most 10 arguments and 10 return values.
+
+{% highlight scala %}
+import org.apache.sysml.api.mlcontext._
+import org.apache.sysml.api.mlcontext.ScriptFactory._
+val ml = new MLContext(sc)
+
+// Demonstrates how to pass a simple Scala UDF to SystemML
+def addOne(x:Double):Double = x + 1
+ml.udf.register("addOne", addOne _)
+val script1 = dml("v = addOne(2.0); print(v)")
+ml.execute(script1)
+
+// Demonstrates an operation on a local matrix (double[][]); assumes a square matrix
+def addOneToDiagonal(x:Array[Array[Double]]):Array[Array[Double]] = { for (i <- x.indices) x(i)(i) += 1; x }
+ml.udf.register("addOneToDiagonal", addOneToDiagonal _)
+val script2 = dml("m1 = matrix(0, rows=3, cols=3); m2 = addOneToDiagonal(m1); print(toString(m2));")
+ml.execute(script2)
+
+// Demonstrates a multi-return function (return values are given as a tuple)
+def multiReturnFn(x:Double):(Double, Int) = (x + 1, (x * 2).toInt)
+ml.udf.register("multiReturnFn", multiReturnFn _)
+val script3 = dml("[v1, v2] = multiReturnFn(2.0); print(v1)")
+ml.execute(script3)
+
+// Demonstrates a multi-argument, multi-return function
+def multiArgReturnFn(x:Double, y:Int):(Double, Int) = (x + 1, (x * y).toInt)
+ml.udf.register("multiArgReturnFn", multiArgReturnFn _)
+val script4 = dml("[v1, v2] = multiArgReturnFn(2.0, 1); print(v2)")
+ml.execute(script4)
+{% endhighlight %}
+
 ---
 
 # Jupyter (PySpark) Notebook Example - Poisson Nonnegative Matrix Factorization