You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2021/02/27 09:39:52 UTC

[GitHub] [systemds] Baunsgaard commented on a change in pull request #1191: Cspline Builtin

Baunsgaard commented on a change in pull request #1191:
URL: https://github.com/apache/systemds/pull/1191#discussion_r584095759



##########
File path: scripts/builtin/cspline.dml
##########
@@ -0,0 +1,63 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum

Review comment:
       please write Integer

##########
File path: scripts/builtin/cspline.dml
##########
@@ -0,0 +1,63 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes

Review comment:
       It is not called LOG in the function description but mode

##########
File path: scripts/builtin/csplineCG.dml
##########
@@ -0,0 +1,275 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE CONJUGATE GRADIENT ALGORITHM
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_Y Matrix[Double] ---      Predicted value
+# K      Matrix[Double] ---      Matrix of k parameters
+#
+# The Log file, when requested, contains the following per-iteration variables in CSV
+# format, each line containing triple (NAME, ITERATION, VALUE) with ITERATION = 0 for
+# initial values:
+#
+# NAME                  MEANING
+# -------------------------------------------------------------------------------------
+# CG_RESIDUAL_NORM      L2-norm of Conj.Grad.residual, which is A %*% beta - t(X) %*% y
+#                           where A = t(X) %*% X + diag (lambda), or a similar quantity
+# CG_RESIDUAL_RATIO     Ratio of current L2-norm of Conj.Grad.residual over the initial
+# -------------------------------------------------------------------------------------
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_csplineCG =function (Matrix[Double] X, Matrix[Double] Y, Double inp_x, 
+  Double tol = 0.000001, Integer maxi = 10
+  ) 
+  return (Matrix[Double] pred_Y, Matrix[Double] K) 
+{
+  xs = X;
+  ys = Y;
+  inp_x = inp_x;
+  fileLog = " ";     

Review comment:
       There is no way of setting this file Log in parameters.
   I would suggest either removing the feature, or replacing it with print statements. (surrounded by verbose flags.)

##########
File path: src/test/java/org/apache/sysds/test/functions/builtin/BuiltinCsplineTest.java
##########
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+ 
+package org.apache.sysds.test.applications;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import org.junit.runners.Parameterized.Parameters;
+
+import org.apache.sysds.test.AutomatedTestBase;
+import org.apache.sysds.runtime.matrix.data.MatrixValue;
+import org.apache.sysds.test.TestUtils;
+
+@RunWith(value = Parameterized.class)
+@net.jcip.annotations.NotThreadSafe
+public class BuiltinCsplineTest extends AutomatedTestBase {
+	protected final static String TEST_NAME = "cspline";
+	protected final static String TEST_DIR = "functions/builtin/";
+	protected String TEST_CLASS_DIR = TEST_DIR + BuiltinCsplineTest.class.getSimpleName() + "/";
+	
+	protected int numRecords;
+	private final static int numDim = 1;
+    
+    public BuiltinCsplineTest(int rows, int cols) {
+		numRecords = rows;
+	}
+
+	@Parameters
+	public static Collection<Object[]> data() {
+		Object[][] data = new Object[][] {{10, 1}, {100, 1}, {1000, 1}};
+		return Arrays.asList(data);
+	}
+
+	@Override
+	public void setUp() {
+		addTestConfiguration(TEST_CLASS_DIR, TEST_NAME);
+	}
+
+	@Test
+	public void testCsplineDS() {	
+		System.out.println(
+			"------------ BEGIN " + TEST_NAME + " TEST WITH {" + numRecords + ", " + numDim + "} ------------");

Review comment:
       remove this print

##########
File path: scripts/builtin/cspline.dml
##########
@@ -0,0 +1,63 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_Y Matrix[Double] ---      Predicted value
+# K      Matrix[Double] ---      Matrix of k parameters
+#
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_cspline = function(Matrix[Double] X, Matrix[Double] Y, Double inp_x, 
+String mode = "DS", Double tol = -1.0, Integer maxi = -1)
+  return (Matrix[Double] pred_Y, Matrix[Double] K)
+{
+  if( mode == "CG" & maxi != -1 & tol != -1.0)
+  {
+    [pred_Y, K] = csplineCG(X=X, Y=Y, inp_x=inp_x, tol=tol, maxi=maxi);
+  }
+  else 
+  {    
+    [pred_Y, K] = csplineDS(X=X, Y=Y, inp_x=inp_x);
+  }
+}

Review comment:
       add newline in the end of the script

##########
File path: scripts/builtin/csplineCG.dml
##########
@@ -0,0 +1,275 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE CONJUGATE GRADIENT ALGORITHM
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------

Review comment:
       Documentation does not align to function 

##########
File path: scripts/builtin/csplineCG.dml
##########
@@ -0,0 +1,275 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE CONJUGATE GRADIENT ALGORITHM
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_Y Matrix[Double] ---      Predicted value
+# K      Matrix[Double] ---      Matrix of k parameters
+#
+# The Log file, when requested, contains the following per-iteration variables in CSV
+# format, each line containing triple (NAME, ITERATION, VALUE) with ITERATION = 0 for
+# initial values:
+#
+# NAME                  MEANING
+# -------------------------------------------------------------------------------------
+# CG_RESIDUAL_NORM      L2-norm of Conj.Grad.residual, which is A %*% beta - t(X) %*% y
+#                           where A = t(X) %*% X + diag (lambda), or a similar quantity
+# CG_RESIDUAL_RATIO     Ratio of current L2-norm of Conj.Grad.residual over the initial
+# -------------------------------------------------------------------------------------
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_csplineCG =function (Matrix[Double] X, Matrix[Double] Y, Double inp_x, 
+  Double tol = 0.000001, Integer maxi = 10
+  ) 
+  return (Matrix[Double] pred_Y, Matrix[Double] K) 
+{
+  xs = X;
+  ys = Y;
+  inp_x = inp_x;
+  fileLog = " ";     
+
+  print ("BEGIN CUBIC SPLINE SCRIPT");
+
+  print("Calculating Ks ...")
+
+  ks = xs
+  ks=calcKnotsDerivKsCG(xs, ys,
+                        maxi, tol, fileLog)
+
+  print("Interpolating ...")
+
+  x = inp_x
+  y = interpSplineCG(x, xs, ys, ks)
+
+  print("For inp_x = " + inp_x + " Calculated y = " + y)
+
+  pred_Y = matrix(y, 1, 1)
+  K = ks
+  
+  print ("END CUBIC SPLINE REGRESSION SCRIPT");
+}
+
+
+
+#given X<nx1> and corresponding Y<nx1> values for the function. where each (xi,yi) represents a knot.
+#it calculates the first derivates i.e k of the interpolated polynomial
+calcKnotsDerivKsCG = function (
+  matrix[double] X, matrix[double] Y,
+  int max_iteration,
+  double tolerance,
+  String fileLog
+) return (matrix[double] K) {
+
+  nx = nrow(X)
+  ny = nrow(Y)
+  if (nx != ny) {
+      stop("X and Y vectors are of different size")

Review comment:
       inconsistent indentation

##########
File path: scripts/builtin/csplineCG.dml
##########
@@ -0,0 +1,275 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE CONJUGATE GRADIENT ALGORITHM
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_Y Matrix[Double] ---      Predicted value
+# K      Matrix[Double] ---      Matrix of k parameters
+#
+# The Log file, when requested, contains the following per-iteration variables in CSV
+# format, each line containing triple (NAME, ITERATION, VALUE) with ITERATION = 0 for
+# initial values:
+#
+# NAME                  MEANING
+# -------------------------------------------------------------------------------------
+# CG_RESIDUAL_NORM      L2-norm of Conj.Grad.residual, which is A %*% beta - t(X) %*% y
+#                           where A = t(X) %*% X + diag (lambda), or a similar quantity
+# CG_RESIDUAL_RATIO     Ratio of current L2-norm of Conj.Grad.residual over the initial
+# -------------------------------------------------------------------------------------
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_csplineCG =function (Matrix[Double] X, Matrix[Double] Y, Double inp_x, 
+  Double tol = 0.000001, Integer maxi = 10
+  ) 
+  return (Matrix[Double] pred_Y, Matrix[Double] K) 
+{
+  xs = X;
+  ys = Y;
+  inp_x = inp_x;
+  fileLog = " ";     
+
+  print ("BEGIN CUBIC SPLINE SCRIPT");
+
+  print("Calculating Ks ...")

Review comment:
       please add a verbose flag if you want to print. and then encapsulate all the printing in such.
   If in doubt, take a look at one of the other builtin functions.

##########
File path: scripts/builtin/csplineDS.dml
##########
@@ -0,0 +1,183 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE DIRECT SOLVER
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME   TYPE            DEFAULT  MEANING
+# --------------------------------------------------------------------------------------------
+# X      Matrix[Double]    ---     1-column matrix of x values knots
+# Y      Matrix[Double]    ---     1-column matrix of corresponding y values knots
+# inp_x  Double            ---     the given input x, for which the cspline will find predicted y.
+#
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_y Matrix[Double]    ---     Predicted value
+# K      Matrix[Double]    ---     Matrix of k parameters 
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x

Review comment:
       again assumptions in docs above, not here. We use the above table to generate code and comments elsewhere, therefore this would be ignored.

##########
File path: scripts/builtin/cspline.dml
##########
@@ -0,0 +1,63 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_Y Matrix[Double] ---      Predicted value
+# K      Matrix[Double] ---      Matrix of k parameters
+#
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x

Review comment:
       add these assumptions to the descriptions of the inputs. these should be at X.

##########
File path: scripts/builtin/csplineDS.dml
##########
@@ -0,0 +1,183 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE DIRECT SOLVER
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME   TYPE            DEFAULT  MEANING
+# --------------------------------------------------------------------------------------------
+# X      Matrix[Double]    ---     1-column matrix of x values knots
+# Y      Matrix[Double]    ---     1-column matrix of corresponding y values knots
+# inp_x  Double            ---     the given input x, for which the cspline will find predicted y.
+#
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_y Matrix[Double]    ---     Predicted value
+# K      Matrix[Double]    ---     Matrix of k parameters 
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_csplineDS = function (Matrix[Double] X, Matrix[Double] Y, Double inp_x) 
+  return (Matrix[Double] pred_Y, Matrix[Double] K) 
+{
+  xs = X;
+  ys = Y;
+
+  print ("BEGIN CUBIC SPLINE SCRIPT");
+
+  print("Calculating Ks ...")
+  ks = xs
+  ks = calcKnotsDerivKs(xs, ys)
+
+
+  print("Interpolating ...")
+  x = inp_x
+  y = interpSpline(x, xs, ys, ks)
+
+  print("For inp_x = " + x + " Calculated y = " + y)
+
+  pred_Y = matrix(y, 1, 1)
+  K = ks
+  print ("END CUBIC SPLINE REGRESSION SCRIPT");
+}
+
+
+
+#given X<nx1> and corresponding Y<nx1> values for the function. where each (xi,yi) represents a knot.
+#it calculates the first derivates i.e k of the interpolated polynomial
+calcKnotsDerivKs = function (
+  matrix[double] X, matrix[double] Y
+) return (matrix[double] K) {
+
+
+  nx = nrow(X)
+  ny = nrow(Y)
+  if (nx != ny) {
+      stop("X and Y vectors are of different size")

Review comment:
       indentation

##########
File path: src/main/java/org/apache/sysds/common/Builtins.java
##########
@@ -38,6 +38,9 @@
  */
 public enum Builtins {
 	//builtin functions
+    CSPLINE("cspline", true),
+	CSPLINE_CG("csplineCG", true),
+	CSPLINE_DS("csplineDS", true),

Review comment:
       we try to keep these in alphabetical order, so input these slightly longer down.

##########
File path: scripts/builtin/csplineDS.dml
##########
@@ -0,0 +1,183 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE DIRECT SOLVER
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME   TYPE            DEFAULT  MEANING
+# --------------------------------------------------------------------------------------------
+# X      Matrix[Double]    ---     1-column matrix of x values knots
+# Y      Matrix[Double]    ---     1-column matrix of corresponding y values knots
+# inp_x  Double            ---     the given input x, for which the cspline will find predicted y.
+#
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_y Matrix[Double]    ---     Predicted value
+# K      Matrix[Double]    ---     Matrix of k parameters 
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_csplineDS = function (Matrix[Double] X, Matrix[Double] Y, Double inp_x) 
+  return (Matrix[Double] pred_Y, Matrix[Double] K) 
+{
+  xs = X;
+  ys = Y;
+
+  print ("BEGIN CUBIC SPLINE SCRIPT");

Review comment:
       verbose flag

##########
File path: scripts/builtin/cspline.dml
##########
@@ -0,0 +1,63 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_Y Matrix[Double] ---      Predicted value
+# K      Matrix[Double] ---      Matrix of k parameters

Review comment:
       Does K contain all information to make predictions on new data? if so maybe if you have time make a predict builtin

##########
File path: src/test/java/org/apache/sysds/test/functions/builtin/BuiltinCsplineTest.java
##########
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+ 
+package org.apache.sysds.test.applications;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import org.junit.runners.Parameterized.Parameters;
+
+import org.apache.sysds.test.AutomatedTestBase;
+import org.apache.sysds.runtime.matrix.data.MatrixValue;
+import org.apache.sysds.test.TestUtils;
+
+@RunWith(value = Parameterized.class)
+@net.jcip.annotations.NotThreadSafe
+public class BuiltinCsplineTest extends AutomatedTestBase {
+	protected final static String TEST_NAME = "cspline";
+	protected final static String TEST_DIR = "functions/builtin/";
+	protected String TEST_CLASS_DIR = TEST_DIR + BuiltinCsplineTest.class.getSimpleName() + "/";
+	
+	protected int numRecords;
+	private final static int numDim = 1;
+    
+    public BuiltinCsplineTest(int rows, int cols) {
+		numRecords = rows;
+	}
+
+	@Parameters
+	public static Collection<Object[]> data() {
+		Object[][] data = new Object[][] {{10, 1}, {100, 1}, {1000, 1}};
+		return Arrays.asList(data);
+	}
+
+	@Override
+	public void setUp() {
+		addTestConfiguration(TEST_CLASS_DIR, TEST_NAME);
+	}
+
+	@Test
+	public void testCsplineDS() {	
+		System.out.println(
+			"------------ BEGIN " + TEST_NAME + " TEST WITH {" + numRecords + ", " + numDim + "} ------------");
+
+		int rows = numRecords;
+		int cols = numDim;
+
+		getAndLoadTestConfiguration(TEST_NAME);
+
+		List<String> proArgs = new ArrayList<>();
+		proArgs.add("-args");
+		proArgs.add(input("X"));
+		proArgs.add(input("Y"));
+		proArgs.add(Double.toString(4.5));
+		proArgs.add(output("pred_y"));
+		proArgs.add("DS");
+		programArgs = proArgs.toArray(new String[proArgs.size()]);
+		
+		fullDMLScriptName = getScript();
+		
+		rCmd = getRCmd(input("X.mtx"), input("Y.mtx"), Double.toString(4.5), expected("pred_y"));
+
+		double[][] X = new double[rows][cols];
+
+		// X axis is given in the increasing order
+		for (int rid = 0; rid < rows; rid++) {
+			for (int cid = 0; cid < cols; cid++) {
+				X[rid][cid] = rid+1;
+			}
+		}
+
+		double[][] Y = getRandomMatrix(rows, cols, 0, 5, 1.0, -1);
+
+		writeInputMatrixWithMTD("X", X, true);
+		writeInputMatrixWithMTD("Y", Y, true);
+
+		runTest(true, EXCEPTION_NOT_EXPECTED, null, -1);
+		runRScript(true);
+
+		HashMap<MatrixValue.CellIndex, Double> priorR = readRMatrixFromExpectedDir("pred_y");
+		HashMap<MatrixValue.CellIndex, Double> priorSYSTEMDS= readDMLMatrixFromOutputDir("pred_y");
+		
+		System.out.println("--->"+priorR);
+		System.out.println("--->"+priorSYSTEMDS);
+
+		double[][] from_R = TestUtils.convertHashMapToDoubleArray(priorR);
+		double[][] from_DML = TestUtils.convertHashMapToDoubleArray(priorSYSTEMDS);
+
+		System.out.println("..."+from_R[0][0]);
+		System.out.println("..."+from_DML[0][0]);
+
+		TestUtils.compareMatrices(from_R, from_DML, Math.pow(10, -12));
+	}
+
+	@Test
+	public void testCsplineCG() {
+		System.out.println(
+			"------------ BEGIN CG" + TEST_NAME + " TEST WITH {" + numRecords + ", " + numDim + "} ------------");
+
+		int rows = numRecords;
+		int cols = numDim;
+		int numIter = rows;
+
+		getAndLoadTestConfiguration(TEST_NAME);
+
+		List<String> proArgs = new ArrayList<>();
+		proArgs.add("-args");
+		proArgs.add(input("X"));
+		proArgs.add(input("Y"));
+		proArgs.add(Double.toString(4.5));
+		proArgs.add(output("pred_y"));
+		proArgs.add("CG");
+		proArgs.add(Double.toString(0.000001));
+		proArgs.add(Double.toString(numIter));
+		programArgs = proArgs.toArray(new String[proArgs.size()]);
+		
+		fullDMLScriptName = getScript();
+		
+		rCmd = getRCmd(input("X.mtx"), input("Y.mtx"), Double.toString(4.5), expected("pred_y"));
+
+		double[][] X = new double[rows][cols];
+
+		// X axis is given in the increasing order
+		for (int rid = 0; rid < rows; rid++) {
+			for (int cid = 0; cid < cols; cid++) {
+				X[rid][cid] = rid+1;
+			}
+		}
+
+		double[][] Y = getRandomMatrix(rows, cols, 0, 5, 1.0, -1);
+
+		writeInputMatrixWithMTD("X", X, true);
+		writeInputMatrixWithMTD("Y", Y, true);
+
+		runTest(true, EXCEPTION_NOT_EXPECTED, null, -1);
+		runRScript(true);
+
+		HashMap<MatrixValue.CellIndex, Double> priorR = readRMatrixFromExpectedDir("pred_y");
+		HashMap<MatrixValue.CellIndex, Double> priorSYSTEMDS= readDMLMatrixFromOutputDir("pred_y");
+		
+		System.out.println("--->"+priorR);
+		System.out.println("--->"+priorSYSTEMDS);
+
+		double[][] from_R = TestUtils.convertHashMapToDoubleArray(priorR);
+		double[][] from_DML = TestUtils.convertHashMapToDoubleArray(priorSYSTEMDS);
+
+		System.out.println("..."+from_R[0][0]);
+		System.out.println("..."+from_DML[0][0]);

Review comment:
       remove print

##########
File path: scripts/builtin/csplineCG.dml
##########
@@ -0,0 +1,275 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE CONJUGATE GRADIENT ALGORITHM
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME  TYPE           DEFAULT   MEANING
+# --------------------------------------------------------------------------------------------
+# X     Matrix[Double]  ---      1-column matrix of x values knots
+# Y     Matrix[Double]  ---      1-column matrix of corresponding y values knots
+# inp_x Double          ---      the given input x, for which the cspline will find predicted y.
+# Log   String          " "      Location to store iteration-specific variables for monitoring and debugging purposes
+#
+# tol   Double          0.000001 Tolerance (epsilon); conjugate graduent procedure terminates early if
+#                                L2 norm of the beta-residual is less than tolerance * its initial norm
+# maxi  Int             0        Maximum number of conjugate gradient iterations, 0 = no maximum
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_Y Matrix[Double] ---      Predicted value
+# K      Matrix[Double] ---      Matrix of k parameters
+#
+# The Log file, when requested, contains the following per-iteration variables in CSV
+# format, each line containing triple (NAME, ITERATION, VALUE) with ITERATION = 0 for
+# initial values:
+#
+# NAME                  MEANING
+# -------------------------------------------------------------------------------------
+# CG_RESIDUAL_NORM      L2-norm of Conj.Grad.residual, which is A %*% beta - t(X) %*% y
+#                           where A = t(X) %*% X + diag (lambda), or a similar quantity
+# CG_RESIDUAL_RATIO     Ratio of current L2-norm of Conj.Grad.residual over the initial
+# -------------------------------------------------------------------------------------
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_csplineCG =function (Matrix[Double] X, Matrix[Double] Y, Double inp_x, 
+  Double tol = 0.000001, Integer maxi = 10
+  ) 
+  return (Matrix[Double] pred_Y, Matrix[Double] K) 
+{
+  xs = X;
+  ys = Y;
+  inp_x = inp_x;
+  fileLog = " ";     
+
+  print ("BEGIN CUBIC SPLINE SCRIPT");
+
+  print("Calculating Ks ...")
+
+  ks = xs
+  ks=calcKnotsDerivKsCG(xs, ys,
+                        maxi, tol, fileLog)
+
+  print("Interpolating ...")
+
+  x = inp_x
+  y = interpSplineCG(x, xs, ys, ks)
+
+  print("For inp_x = " + inp_x + " Calculated y = " + y)
+
+  pred_Y = matrix(y, 1, 1)
+  K = ks
+  
+  print ("END CUBIC SPLINE REGRESSION SCRIPT");
+}
+
+
+
+#given X<nx1> and corresponding Y<nx1> values for the function. where each (xi,yi) represents a knot.
+#it calculates the first derivates i.e k of the interpolated polynomial
+calcKnotsDerivKsCG = function (
+  matrix[double] X, matrix[double] Y,
+  int max_iteration,
+  double tolerance,
+  String fileLog
+) return (matrix[double] K) {
+
+  nx = nrow(X)
+  ny = nrow(Y)
+  if (nx != ny) {
+      stop("X and Y vectors are of different size")
+  }
+
+  Xu = truncCG(X, 1, "up") # Xu is (X where X[0] is removed)
+  Xd = truncCG(X, 1, "down") # Xd is (X where X[n] is removed)
+
+  Bx=1/(Xu-Xd) # The expr => 1/Delta(X(i) = 1/(X(i)-X(i-1))
+
+
+  Bxd = resizeCG(Bx, 1, 0, "tr") # Bxd is (0, Bx) vector
+  Bxu = resizeCG(Bx, 1, 0, "br") # Bxu is (Bx, 0) vector
+  Dx = 2*(Bxd + Bxu) # the major diagonal entry 2(1/Delta(X(i) + 1/Delta(X(i+1)))
+
+  MDx = diag(Dx) # convert vector to diagonal matrix
+
+  MBx = diag(Bx) # this is the temp diagonal matrix, which will form the bands of the tri-diagonal matrix
+  MUx = resizeCG(MBx, 1, 1, "bl") # the upper diagonal matrix of the band
+  MLx = resizeCG(MBx, 1, 1, "tr") # the lower diagonal matrix of the band
+
+  A=MUx+MDx+MLx # create the complete tri-diagonal matrix
+
+  #calculate b matrix
+  Yu = truncCG(Y, 1, "up") # Yu is (Y where Y[0] is removed)
+  Yd = truncCG(Y, 1, "down") # Yd is (Y where Y[n] is removed)
+  By=(Yu-Yd)/(Bx*Bx) # the expr => Delta(Y(i))/Delta(X(i))*Delta(X(i))
+
+  By1=resizeCG(By, 1, 0, "tr") # By1 is (0, By) vector
+  By2=resizeCG(By, 1, 0, "br") # By2 is (By, 0) vector
+  b=3*(By1+By2) # the b entries 3*(Delta(Y(i))/Delta(X(i))*Delta(X(i)) + Delta(Y(i+1))/Delta(X(i+1))*Delta(X(i+1)))
+
+
+  # solve Ax = b for x vector and assign it to K
+  K = CGSolver(A, b,
+                   max_iteration, tolerance, fileLog)
+
+}
+
+#given the X<nx1> and Y<nx1> n sample points and K (the first derivative of the interp polynomial), it calculate the
+#  y for the given x using the cubic spline interpolation
+interpSplineCG = function(
+  double x, matrix[double] X, matrix[double] Y, matrix[double] K
+) return (double q) {
+
+  #first find the right knots for interpolation
+  i = as.integer(nrow(X) - sum(X >= x) + 1)
+
+  #calc the y as per the algo docs
+  t = (x - X[i-1,1]) / ( X[i,1] - X[i-1,1])
+
+  a =  K[i-1,1]*(X[i,1]-X[i-1,1]) - (Y[i,1]-Y[i-1,1])
+  b = -K[i,1]*(X[i,1]-X[i-1,1]) + (Y[i,1]-Y[i-1,1])
+
+  qm = (1-t)*Y[i-1,1] + t*Y[i,1] + t*(1-t)*(a*(1-t)+b*t)
+
+  q = as.scalar(qm)
+
+}
+
+
+#solve Ax = b
+#   for CG our formulation is
+#      t(A)Ax = t(A)b // where t is transpose
+CGSolver = function (
+ matrix[double] A, matrix[double] b,
+ int max_iteration,
+ double tolerance,
+ String fileLog
+) return (matrix[double] x) {
+
+  n = nrow(A)
+  if (max_iteration < 0) {
+    max_iteration = n
+  }
+  if (tolerance < 0) {
+    tolerance = 0.000001
+  }
+
+  x = matrix (0, rows = n, cols = 1); #/* solution vector x<nx1>*/
+
+  # BEGIN THE CONJUGATE GRADIENT ALGORITHM
+  print ("Running the CG algorithm...");
+
+  i = 0
+  r = -t(A) %*% b # initial guess x0 = t(0.0)
+  p = -r
+  norm_r2 = sum (r ^ 2)
+  norm_r2_initial = norm_r2
+  norm_r2_target = norm_r2_initial * tolerance ^ 2
+  print ("||r|| initial value = " + sqrt (norm_r2_initial) + ",  target value = " + sqrt (norm_r2_target));
+  log_str = "CG_RESIDUAL_NORM,0," + sqrt (norm_r2_initial)
+  log_str = append (log_str, "CG_RESIDUAL_RATIO,0,1.0")
+
+  while (i < max_iteration & norm_r2 > norm_r2_target) {
+    q = t(A) %*% (A %*% p)
+    a = norm_r2 / sum (p*q)
+    x = x + a*p
+    r = r + a*q
+    old_norm_r2 = norm_r2
+    norm_r2 = sum(r^2)
+    p = -r + (norm_r2/ old_norm_r2)*p
+    i = i + 1
+    print ("Iteration " + i + ":  ||r|| / ||r init|| = " + sqrt (norm_r2 / norm_r2_initial))
+    log_str = append (log_str, "CG_RESIDUAL_NORM,"  + i + "," + sqrt (norm_r2))
+    log_str = append (log_str, "CG_RESIDUAL_RATIO," + i + "," + sqrt (norm_r2 / norm_r2_initial))
+  }
+  if (i >= max_iteration) {
+    print ("Warning: the maximum number of iterations has been reached.")
+  }
+
+  print ("The CG algorithm is done.")
+  # END THE CONJUGATE GRADIENT ALGORITHM
+
+  if (fileLog != " ") {
+    write (log_str, fileLog);
+  } else {
+    print("log_str:" + log_str)
+  }
+}
+
+
+#
+# truncCG the matrix by the specified amount in the specified direction.
+# The shifted cells are discarded, the matrix is smaller in size
+#
+truncCG = function (
+  matrix[double] X, # nxm matrix
+  int by, # shift amount
+  String dir # currently only 'up' is supported. but should handle 'up', 'down', 'left', 'right'
+) return (matrix[double] Y) # Y is
+{
+    r = nrow(X); c = ncol(X);
+    if (by > r) {
+      stop("can't pop matrix more than number of rows")
+    }
+    Y = matrix(0.0, r-by, c)
+
+    if (r != by ) {
+      if (dir == "up") {
+        Y[1:r-by,] = X[1+by:r,]
+      } else if (dir == "down") {
+        Y[1:r-by,] = X[1:r-by,]
+      } else {
+        stop("truncCG unsupported direction " + dir)
+      }
+    }
+}
+
+# resizeCG (only grow and not truncCGate) the matrix by the specified amount in the specified direction
+resizeCG = function(
+  matrix[double] X, #nxm matrix
+  int rby, # row resizeCG count
+  int cby,  # col resizeCG count
+  String dir
+) return (matrix[double] Y) # Y is
+{
+   r = nrow(X); c = ncol(X);
+   rn = r + rby; cn = c + cby;
+   Y = matrix(0.0, rn, cn)
+   if (dir == "tr") { # top right
+     Y[1+rby:rn, 1:c] = X
+   } else if (dir == "bl") { # bottom left
+     Y[1:r, 1+cby:cn] = X
+   } else if (dir == "tl") { # top left
+     Y[1+rby:rn, 1+cby:cn ] = X
+   } else if (dir == "br") { # bottom right
+     Y[1:r, 1:c] = X
+   } else {
+     stop("Unknown direction dir => " + dir)
+   }
+}

Review comment:
       add newline in the end

##########
File path: src/test/java/org/apache/sysds/test/functions/builtin/BuiltinCsplineTest.java
##########
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+ 
+package org.apache.sysds.test.applications;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import org.junit.runners.Parameterized.Parameters;
+
+import org.apache.sysds.test.AutomatedTestBase;
+import org.apache.sysds.runtime.matrix.data.MatrixValue;
+import org.apache.sysds.test.TestUtils;
+
+@RunWith(value = Parameterized.class)
+@net.jcip.annotations.NotThreadSafe
+public class BuiltinCsplineTest extends AutomatedTestBase {
+	protected final static String TEST_NAME = "cspline";
+	protected final static String TEST_DIR = "functions/builtin/";
+	protected String TEST_CLASS_DIR = TEST_DIR + BuiltinCsplineTest.class.getSimpleName() + "/";
+	
+	protected int numRecords;
+	private final static int numDim = 1;
+    
+    public BuiltinCsplineTest(int rows, int cols) {
+		numRecords = rows;
+	}
+
+	@Parameters
+	public static Collection<Object[]> data() {
+		Object[][] data = new Object[][] {{10, 1}, {100, 1}, {1000, 1}};
+		return Arrays.asList(data);
+	}
+
+	@Override
+	public void setUp() {
+		addTestConfiguration(TEST_CLASS_DIR, TEST_NAME);
+	}
+
+	@Test
+	public void testCsplineDS() {	
+		System.out.println(
+			"------------ BEGIN " + TEST_NAME + " TEST WITH {" + numRecords + ", " + numDim + "} ------------");
+
+		int rows = numRecords;
+		int cols = numDim;
+
+		getAndLoadTestConfiguration(TEST_NAME);
+
+		List<String> proArgs = new ArrayList<>();
+		proArgs.add("-args");
+		proArgs.add(input("X"));
+		proArgs.add(input("Y"));
+		proArgs.add(Double.toString(4.5));
+		proArgs.add(output("pred_y"));
+		proArgs.add("DS");
+		programArgs = proArgs.toArray(new String[proArgs.size()]);
+		
+		fullDMLScriptName = getScript();
+		
+		rCmd = getRCmd(input("X.mtx"), input("Y.mtx"), Double.toString(4.5), expected("pred_y"));
+
+		double[][] X = new double[rows][cols];
+
+		// X axis is given in the increasing order
+		for (int rid = 0; rid < rows; rid++) {
+			for (int cid = 0; cid < cols; cid++) {
+				X[rid][cid] = rid+1;
+			}
+		}
+
+		double[][] Y = getRandomMatrix(rows, cols, 0, 5, 1.0, -1);
+
+		writeInputMatrixWithMTD("X", X, true);
+		writeInputMatrixWithMTD("Y", Y, true);
+
+		runTest(true, EXCEPTION_NOT_EXPECTED, null, -1);
+		runRScript(true);
+
+		HashMap<MatrixValue.CellIndex, Double> priorR = readRMatrixFromExpectedDir("pred_y");
+		HashMap<MatrixValue.CellIndex, Double> priorSYSTEMDS= readDMLMatrixFromOutputDir("pred_y");
+		
+		System.out.println("--->"+priorR);
+		System.out.println("--->"+priorSYSTEMDS);
+
+		double[][] from_R = TestUtils.convertHashMapToDoubleArray(priorR);
+		double[][] from_DML = TestUtils.convertHashMapToDoubleArray(priorSYSTEMDS);
+
+		System.out.println("..."+from_R[0][0]);
+		System.out.println("..."+from_DML[0][0]);
+
+		TestUtils.compareMatrices(from_R, from_DML, Math.pow(10, -12));
+	}
+
+	@Test
+	public void testCsplineCG() {
+		System.out.println(
+			"------------ BEGIN CG" + TEST_NAME + " TEST WITH {" + numRecords + ", " + numDim + "} ------------");

Review comment:
       remove print

##########
File path: scripts/builtin/csplineDS.dml
##########
@@ -0,0 +1,183 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+#
+# THIS SCRIPT SOLVES CUBIC SPLINE INTERPOLATION USING THE DIRECT SOLVER
+#
+# INPUT PARAMETERS:
+# --------------------------------------------------------------------------------------------
+# NAME   TYPE            DEFAULT  MEANING
+# --------------------------------------------------------------------------------------------
+# X      Matrix[Double]    ---     1-column matrix of x values knots
+# Y      Matrix[Double]    ---     1-column matrix of corresponding y values knots
+# inp_x  Double            ---     the given input x, for which the cspline will find predicted y.
+#
+# --------------------------------------------------------------------------------------------
+# OUTPUT: 
+# pred_y Matrix[Double]    ---     Predicted value
+# K      Matrix[Double]    ---     Matrix of k parameters 
+#
+
+#Assumptions:
+# - The inputs xs are monotonically increasing,
+# - there is no duplicates points in x
+
+#Algorithms: It implement the https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
+#it usages natural spline with q1''(x0) == qn''(xn) == 0.0
+
+
+m_csplineDS = function (Matrix[Double] X, Matrix[Double] Y, Double inp_x) 
+  return (Matrix[Double] pred_Y, Matrix[Double] K) 
+{
+  xs = X;
+  ys = Y;
+
+  print ("BEGIN CUBIC SPLINE SCRIPT");
+
+  print("Calculating Ks ...")
+  ks = xs
+  ks = calcKnotsDerivKs(xs, ys)
+
+
+  print("Interpolating ...")
+  x = inp_x
+  y = interpSpline(x, xs, ys, ks)
+
+  print("For inp_x = " + x + " Calculated y = " + y)
+
+  pred_Y = matrix(y, 1, 1)
+  K = ks
+  print ("END CUBIC SPLINE REGRESSION SCRIPT");
+}
+
+
+
+#given X<nx1> and corresponding Y<nx1> values for the function. where each (xi,yi) represents a knot.
+#it calculates the first derivates i.e k of the interpolated polynomial
+calcKnotsDerivKs = function (
+  matrix[double] X, matrix[double] Y
+) return (matrix[double] K) {
+
+
+  nx = nrow(X)
+  ny = nrow(Y)
+  if (nx != ny) {
+      stop("X and Y vectors are of different size")
+  }
+
+  Xu = trunc(X, 1, "up") # Xu is (X where X[0] is removed)
+  Xd = trunc(X, 1, "down") # Xd is (X where X[n] is removed)
+
+  Bx=1/(Xu-Xd) # The expr => 1/Delta(X(i) = 1/(X(i)-X(i-1))
+
+  Bxd = resize(Bx, 1, 0, "tr") # Bxd is (0, Bx) vector
+  Bxu = resize(Bx, 1, 0, "br") # Bxu is (Bx, 0) vector
+  Dx = 2*(Bxd + Bxu) # the major diagonal entry 2(1/Delta(X(i) + 1/Delta(X(i+1)))
+
+  MDx = diag(Dx) # convert vector to diagonal matrix
+  MBx = diag(Bx) # this is the temp diagonal matrix, which will form the bands of the tri-diagonal matrix
+  MUx = resize(MBx, 1, 1, "bl") # the upper diagonal matrix of the band
+  MLx = resize(MBx, 1, 1, "tr") # the lower diagonal matrix of the band
+
+  A=MUx+MDx+MLx # create the complete tri-diagonal matrix
+  #calculate b matrix
+  Yu = trunc(Y, 1, "up") # Yu is (Y where Y[0] is removed)
+  Yd = trunc(Y, 1, "down") # Yd is (Y where Y[n] is removed)
+  By=(Yu-Yd)/(Bx*Bx) # the expr => Delta(Y(i))/Delta(X(i))*Delta(X(i))
+
+  By1=resize(By, 1, 0, "tr") # By1 is (0, By) vector
+  By2=resize(By, 1, 0, "br") # By2 is (By, 0) vector
+  b=3*(By1+By2) # the b entries 3*(Delta(Y(i))/Delta(X(i))*Delta(X(i)) + Delta(Y(i+1))/Delta(X(i+1))*Delta(X(i+1)))
+
+  K = solve(A, b)  /* solve Ax = b for x vector and assign it to K*/
+
+}
+
+#given the X<nx1> and Y<nx1> n sample points and K (the first derivative of the interp polynomial), it calculate the
+#  y for the given x using the cubic spline interpolation
+interpSpline = function(
+  double x, matrix[double] X, matrix[double] Y, matrix[double] K
+) return (double q) {
+
+  #first find the right knots for interpolation
+  i = as.integer(nrow(X) - sum(X >= x) + 1)
+
+  #calc the y as per the algo docs
+  t = (x - X[i-1,1]) / ( X[i,1] - X[i-1,1])
+
+  a =  K[i-1,1]*(X[i,1]-X[i-1,1]) - (Y[i,1]-Y[i-1,1])
+  b = -K[i,1]*(X[i,1]-X[i-1,1]) + (Y[i,1]-Y[i-1,1])
+
+  qm = (1-t)*Y[i-1,1] + t*Y[i,1] + t*(1-t)*(a*(1-t)+b*t)
+
+  q = as.scalar(qm)
+
+}
+
+#
+# trunc the matrix by the specified amount in the specified direction.
+# The shifted cells are discarded, the matrix is smaller in size
+#
+trunc = function (
+  matrix[double] X, # nxm matrix
+  int by, # shift amount
+  String dir # currently only 'up' is supported. but should handle 'up', 'down', 'left', 'right'
+) return (matrix[double] Y) # Y is
+{
+    r = nrow(X); c = ncol(X);

Review comment:
       indentation

##########
File path: src/test/java/org/apache/sysds/test/functions/builtin/BuiltinCsplineTest.java
##########
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+ 
+package org.apache.sysds.test.applications;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import org.junit.runners.Parameterized.Parameters;
+
+import org.apache.sysds.test.AutomatedTestBase;
+import org.apache.sysds.runtime.matrix.data.MatrixValue;
+import org.apache.sysds.test.TestUtils;
+
+@RunWith(value = Parameterized.class)
+@net.jcip.annotations.NotThreadSafe
+public class BuiltinCsplineTest extends AutomatedTestBase {
+	protected final static String TEST_NAME = "cspline";
+	protected final static String TEST_DIR = "functions/builtin/";
+	protected String TEST_CLASS_DIR = TEST_DIR + BuiltinCsplineTest.class.getSimpleName() + "/";
+	
+	protected int numRecords;
+	private final static int numDim = 1;
+    
+    public BuiltinCsplineTest(int rows, int cols) {
+		numRecords = rows;
+	}
+
+	@Parameters
+	public static Collection<Object[]> data() {
+		Object[][] data = new Object[][] {{10, 1}, {100, 1}, {1000, 1}};
+		return Arrays.asList(data);
+	}
+
+	@Override
+	public void setUp() {
+		addTestConfiguration(TEST_CLASS_DIR, TEST_NAME);
+	}
+
+	@Test
+	public void testCsplineDS() {	
+		System.out.println(
+			"------------ BEGIN " + TEST_NAME + " TEST WITH {" + numRecords + ", " + numDim + "} ------------");

Review comment:
       If you want to print we have Log4J, where you can add a LOG.debug("print something"). This makes it possible to enable and disable the printing without having to change the code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org