Posted to commits@systemml.apache.org by du...@apache.org on 2017/07/03 20:23:14 UTC

systemml git commit: [SYSTEMML-1677] Add a new 2D cross-entropy loss layer to the `nn` lib

Repository: systemml
Updated Branches:
  refs/heads/master e7cfcadc9 -> d56c05ece


[SYSTEMML-1677] Add a new 2D cross-entropy loss layer to the `nn` lib

Computes the forward pass for a 2D cross-entropy loss function. The
inputs consist of N examples, each of shape (C, Hin, Win), where each
pixel has C dimensions corresponding to normalized probabilities of C
classes. The loss is applied to each pixel location, and then averaged
over all pixels and all examples.
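
For a sense of the API, a minimal usage sketch (hypothetical sizes and
random stand-in data; mirrors the grad check added below):

  source("nn/layers/softmax2d.dml") as softmax2d
  source("nn/layers/cross_entropy_loss2d.dml") as cross_entropy_loss2d

  N = 4  # num examples
  C = 3  # num classes per pixel
  Hin = 8  # example height
  Win = 8  # example width
  scores = rand(rows=N, cols=C*Hin*Win, min=-1, max=1, pdf="uniform")
  y = rand(rows=N, cols=C*Hin*Win, min=0, max=1, pdf="uniform")
  y = softmax2d::forward(y, C)  # stand-in normalized targets

  probs = softmax2d::forward(scores, C)  # per-pixel class probabilities
  loss = cross_entropy_loss2d::forward(probs, y, C)
  dprobs = cross_entropy_loss2d::backward(probs, y, C)  # gradient wrt probs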

Closes #556.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/d56c05ec
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/d56c05ec
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/d56c05ec

Branch: refs/heads/master
Commit: d56c05ecefe588050099a0219c04a21cd1359e85
Parents: e7cfcad
Author: Fei Hu <hu...@gmail.com>
Authored: Mon Jul 3 13:21:51 2017 -0700
Committer: Mike Dusenberry <mw...@us.ibm.com>
Committed: Mon Jul 3 13:21:51 2017 -0700

----------------------------------------------------------------------
 scripts/nn/layers/cross_entropy_loss2d.dml | 105 ++++++++++++++++++++++++
 scripts/nn/layers/softmax2d.dml            |   4 +-
 scripts/nn/test/grad_check.dml             |  48 ++++++++++-
 scripts/nn/test/run_tests.dml              |   2 +
 scripts/nn/test/test.dml                   |  58 +++++++++++++
 5 files changed, 210 insertions(+), 7 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/systemml/blob/d56c05ec/scripts/nn/layers/cross_entropy_loss2d.dml
----------------------------------------------------------------------
diff --git a/scripts/nn/layers/cross_entropy_loss2d.dml b/scripts/nn/layers/cross_entropy_loss2d.dml
new file mode 100644
index 0000000..a76d5b0
--- /dev/null
+++ b/scripts/nn/layers/cross_entropy_loss2d.dml
@@ -0,0 +1,105 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+
+/*
+ * 2D Cross-Entropy loss function.
+ */
+ source("nn/util.dml") as util
+ source("nn/layers/cross_entropy_loss.dml") as cross_entropy_loss
+
+forward = function(matrix[double] pred, matrix[double] y, int C)
+    return (double loss) {
+  /*
+   * Computes the forward pass for a 2D cross-entropy loss function. The
+   * inputs consist of N examples, each of shape (C, Hin, Win), where
+   * each pixel has C dimensions corresponding to normalized
+   * probabilities of C classes. The loss is applied to each pixel
+   * location, and then averaged over all pixels and all examples.
+   *
+   *   ```
+   *   L_ijk = -y_ijk^T * log(pred_ijk)
+   *   L = (1/(N*H*W)) sum(L_ijk) for i=1 to N, j=1 to H, k=1 to W.
+   *   ```
+   *
+   * In these equations, `L` is the total loss, `L_ijk` is the loss for
+   * pixel `(j, k)` in example `i`, `y_ijk` is the C-dimensional vector
+   * of target class probabilities, `pred_ijk` is the C-dimensional
+   * vector of predicted class probabilities, and `N` is the number of
+   * examples.
+   *
+   * For each pixel location, this can be interpreted as the negative
+   * log-likelihood assuming a Bernoulli distribution generalized to C
+   * dimensions, or a Multinomial with one observation.
+   *
+   * Inputs:
+   *  - pred: Predictions, of shape (N, C*Hin*Win).
+   *  - y: Targets, of shape (N, C*Hin*Win).
+   *  - C: Number of input channels (dimensionality of input depth).
+   *
+   * Outputs:
+   *  - loss: Average loss.
+   */
+  N = nrow(y)
+
+  # Transpose the matrix from (N, C*H*W) to (N*H*W, C)
+  pred_C_NHW = util::transpose_NCHW_to_CNHW(pred, C)
+  pred_NHW_C = t(pred_C_NHW)
+
+  # Transpose the matrix from (N, C*H*W) to (N*H*W, C)
+  y_C_NHW = util::transpose_NCHW_to_CNHW(y, C)
+  y_NHW_C = t(y_C_NHW)
+
+  loss = cross_entropy_loss::forward(pred_NHW_C, y_NHW_C)
+}
+
+backward = function(matrix[double] pred, matrix[double] y, int C)
+    return (matrix[double] dpred) {
+  /*
+   * Computes the backward pass for a 2D cross-entropy loss function.
+   * The inputs consist of N examples, each of shape (C, Hin, Win),
+   * where each pixel has C dimensions corresponding to normalized
+   * probabilities of C classes.
+   *
+   * Inputs:
+   *  - pred: Predictions, of shape (N, C*Hin*Win).
+   *  - y: Targets, of shape (N, C*Hin*Win).
+   *  - C: Number of input channels (dimensionality of input depth).
+   *
+   * Outputs:
+   *  - dpred: Gradient wrt `pred`, of shape (N, C*Hin*Win).
+   */
+  N = nrow(y)
+
+  # Transpose the matrix from (N, C*H*W) to (N*H*W, C)
+  pred_C_NHW = util::transpose_NCHW_to_CNHW(pred, C)
+  pred_NHW_C = t(pred_C_NHW)
+
+  # Transpose the matrix from (N, C*H*W) to (N*H*W, C)
+  y_C_NHW = util::transpose_NCHW_to_CNHW(y, C)
+  y_NHW_C = t(y_C_NHW)
+
+  dpred_NHW_C = cross_entropy_loss::backward(pred_NHW_C, y_NHW_C)
+
+  # Transpose the matrix back from (N*H*W, C) to (N, C*H*W); passing N to
+  # transpose_NCHW_to_CNHW inverts the earlier regrouping.
+  dpred_C_NHW = t(dpred_NHW_C)
+  dpred = util::transpose_NCHW_to_CNHW(dpred_C_NHW, N)
+}
+
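
Note: since each row of the transposed (N*H*W, C) matrices holds a single
pixel's class distribution, the forward pass is equivalent to a direct,
reshape-free form. A sketch (assuming, per its docs, that the 1D layer
returns the mean over rows of the per-row loss; N, Hin, Win as in the
docstring above):

  losses = -y * log(pred)           # elementwise, shape (N, C*Hin*Win)
  loss = sum(losses) / (N*Hin*Win)  # mean over all pixels and examples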

http://git-wip-us.apache.org/repos/asf/systemml/blob/d56c05ec/scripts/nn/layers/softmax2d.dml
----------------------------------------------------------------------
diff --git a/scripts/nn/layers/softmax2d.dml b/scripts/nn/layers/softmax2d.dml
index aad587d..0207ac4 100644
--- a/scripts/nn/layers/softmax2d.dml
+++ b/scripts/nn/layers/softmax2d.dml
@@ -22,7 +22,6 @@
 /*
  * 2D Softmax classifier layer.
  */
-
  source("nn/util.dml") as util
  source("nn/layers/softmax.dml") as softmax
 
@@ -52,7 +51,6 @@
    * Outputs:
    *  - probs: Outputs, of shape (N, C*Hin*Win).
    */
-
   # For numerical stability, we subtract the max score of an example from all scores for that
   # example.  This is equivalent to the original formulation:
   # e^scores_ijk / sum(e^scores_ijk) == C*e^scores_ijk / C*sum(e^scores_ijk)
@@ -97,7 +95,6 @@ backward = function(matrix[double] dprobs, matrix[double] scores, int C)
    * Outputs:
    *  - dscores: Gradient wrt `scores`, of shape (N, C*Win*Hin).
    */
-
   N = nrow(scores)
 
   #Transpose the matrix from (N, C*H*W) to (N*H*W, C)
@@ -114,3 +111,4 @@ backward = function(matrix[double] dprobs, matrix[double] scores, int C)
   dscores_C_NHW = t(dscores_NHW_C)
   dscores = util::transpose_NCHW_to_CNHW(dscores_C_NHW, N)
 }
+
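
Both 2D layers lean on the same layout trick: `util::transpose_NCHW_to_CNHW`
regroups a flattened (N, C*H*W) matrix into (C, N*H*W), and a plain
transpose then yields one row per pixel. A sketch of the round trip used
above, for any (N, C*H*W) matrix `X`:

  X_C_NHW = util::transpose_NCHW_to_CNHW(X, C)  # (N, C*H*W) -> (C, N*H*W)
  X_NHW_C = t(X_C_NHW)                          # -> (N*H*W, C), one pixel per row
  # Passing N instead of C inverts the regrouping:
  X_back = util::transpose_NCHW_to_CNHW(t(X_NHW_C), N)  # back to (N, C*H*W)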

http://git-wip-us.apache.org/repos/asf/systemml/blob/d56c05ec/scripts/nn/test/grad_check.dml
----------------------------------------------------------------------
diff --git a/scripts/nn/test/grad_check.dml b/scripts/nn/test/grad_check.dml
index 6844f40..c969a98 100644
--- a/scripts/nn/test/grad_check.dml
+++ b/scripts/nn/test/grad_check.dml
@@ -31,6 +31,7 @@ source("nn/layers/conv2d_depthwise.dml") as conv2d_depthwise
 source("nn/layers/conv2d_transpose.dml") as conv2d_transpose
 source("nn/layers/conv2d_transpose_depthwise.dml") as conv2d_transpose_depthwise
 source("nn/layers/cross_entropy_loss.dml") as cross_entropy_loss
+source("nn/layers/cross_entropy_loss2d.dml") as cross_entropy_loss2d
 source("nn/layers/dropout.dml") as dropout
 source("nn/layers/l1_loss.dml") as l1_loss
 source("nn/layers/l1_reg.dml") as l1_reg
@@ -922,12 +923,12 @@ cross_entropy_loss = function() {
   print("Grad checking the cross-entropy loss function.")
 
   # Generate data
-  N = 3 # num examples
-  K = 10 # num targets
+  N = 3  # num examples
+  K = 10  # num targets
   pred = rand(rows=N, cols=K, min=0, max=1, pdf="uniform")
-  pred = pred / rowSums(pred)  # normalized probs
+  pred = softmax::forward(pred)  # normalized probs
   y = rand(rows=N, cols=K, min=0, max=1, pdf="uniform")
-  y = y / rowSums(y)  # normalized probs
+  y = softmax::forward(y)  # normalized probs
 
   # Compute analytical gradient
   dpred = cross_entropy_loss::backward(pred, y)
@@ -951,6 +952,45 @@ cross_entropy_loss = function() {
   }
 }
 
+cross_entropy_loss2d = function() {
+  /*
+   * Gradient check for the 2D cross-entropy loss function.
+   */
+  print("Grad checking the 2D cross-entropy loss function.")
+
+  # Generate data
+  N = 3  # num examples
+  C = 10  # num targets
+  Hin = 5  # example height
+  Win = 5  # example width
+  pred = rand(rows=N, cols=C*Hin*Win, min=0, max=1, pdf="uniform")
+  pred = softmax2d::forward(pred, C)  # normalized probs
+
+  y = rand(rows=N, cols=C*Hin*Win, min=0, max=1, pdf="uniform")
+  y = softmax2d::forward(y, C)  # normalized probs
+
+  # Compute analytical gradient
+  dpred = cross_entropy_loss2d::backward(pred, y, C)
+
+  # Grad check
+  h = 1e-6
+  for (i in 1:nrow(pred)) {
+    for (j in 1:ncol(pred)) {
+      # Compute numerical derivative
+      old = as.scalar(pred[i,j])
+      pred[i,j] = old - h
+      lossmh = cross_entropy_loss2d::forward(pred, y, C)
+      pred[i,j] = old + h
+      lossph = cross_entropy_loss2d::forward(pred, y, C)
+      pred[i,j] = old  # reset pred[i,j]
+      dpred_num = (lossph-lossmh) / (2*h)  # numerical derivative
+
+      # Check error
+      rel_error = test_util::check_rel_grad_error(as.scalar(dpred[i,j]), dpred_num, lossph, lossmh)
+    }
+  }
+}
+
 dropout = function() {
   /*
    * Gradient check for the (inverted) dropout layer.
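
As with the library's other loss layers, the new check uses a centered
difference: perturb each entry of `pred` by +/-h, re-evaluate the loss, and
compare the numerical slope to the analytical gradient. In sketch form (the
exact relative-error guards live in `check_rel_grad_error` in
nn/test/test_util.dml; the max-normalized form below is the typical
definition):

  dpred_num = (lossph - lossmh) / (2*h)
  rel_error = abs(dpred_ij - dpred_num) / max(abs(dpred_ij), abs(dpred_num))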

http://git-wip-us.apache.org/repos/asf/systemml/blob/d56c05ec/scripts/nn/test/run_tests.dml
----------------------------------------------------------------------
diff --git a/scripts/nn/test/run_tests.dml b/scripts/nn/test/run_tests.dml
index 0662ffa..0f42816 100644
--- a/scripts/nn/test/run_tests.dml
+++ b/scripts/nn/test/run_tests.dml
@@ -31,6 +31,7 @@ print("---")
 
 # Loss & loss-related functions
 grad_check::cross_entropy_loss()
+grad_check::cross_entropy_loss2d()
 grad_check::l1_loss()
 grad_check::l1_reg()
 grad_check::l2_loss()
@@ -93,6 +94,7 @@ test::conv2d_depthwise()
 test::conv2d_transpose()
 test::conv2d_transpose_depthwise()
 test::cross_entropy_loss()
+test::cross_entropy_loss2d()
 test::im2col()
 test::max_pool2d()
 test::padding()

http://git-wip-us.apache.org/repos/asf/systemml/blob/d56c05ec/scripts/nn/test/test.dml
----------------------------------------------------------------------
diff --git a/scripts/nn/test/test.dml b/scripts/nn/test/test.dml
index adaef5c..e63639c 100644
--- a/scripts/nn/test/test.dml
+++ b/scripts/nn/test/test.dml
@@ -30,6 +30,7 @@ source("nn/layers/conv2d_depthwise.dml") as conv2d_depthwise
 source("nn/layers/conv2d_transpose.dml") as conv2d_transpose
 source("nn/layers/conv2d_transpose_depthwise.dml") as conv2d_transpose_depthwise
 source("nn/layers/cross_entropy_loss.dml") as cross_entropy_loss
+source("nn/layers/cross_entropy_loss2d.dml") as cross_entropy_loss2d
 source("nn/layers/max_pool2d.dml") as max_pool2d
 source("nn/layers/max_pool2d_builtin.dml") as max_pool2d_builtin
 source("nn/layers/tanh.dml") as tanh
@@ -411,6 +412,63 @@ cross_entropy_loss = function() {
   }
 }
 
+cross_entropy_loss2d = function() {
+  /*
+   * Test for the 2D cross-entropy loss function.
+   */
+  print("Testing the 2D cross-entropy loss function.")
+
+  # Generate data
+  N = 2  # num examples
+  C = 3  # num targets
+  Hin = 3  # example height
+  Win = 3  # example width
+  loss_expected = 0.0770996
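+  # Hand check of the expected value: in each example, 8 of the 9 pixels
+  # predict the correct class with prob ~0.999909 (-log ~= 9.08e-5 each),
+  # and 1 pixel predicts it with prob ~0.5 (-log(0.5) ~= 0.6931), so
+  # loss ~= (0.6931 + 8*9.08e-5) / 9 ~= 0.0771.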
+
+  # pred data after the softmax
+  pred = matrix("9.99909163e-01   4.99988675e-01   4.53958055e-05
+                 9.99909163e-01   4.53958055e-05   4.53958055e-05
+                 9.99909163e-01   4.53958055e-05   4.53958055e-05
+                 4.53958055e-05   4.99988675e-01   4.53958055e-05
+                 4.53958055e-05   9.99909163e-01   4.53958055e-05
+                 4.53958055e-05   9.99909163e-01   4.53958055e-05
+                 4.53958055e-05   2.26994507e-05   9.99909163e-01
+                 4.53958055e-05   4.53958055e-05   9.99909163e-01
+                 4.53958055e-05   4.53958055e-05   9.99909163e-01
+                 9.99909163e-01   4.99988675e-01   4.53958055e-05
+                 9.99909163e-01   4.53958055e-05   4.53958055e-05
+                 9.99909163e-01   4.53958055e-05   4.53958055e-05
+                 4.53958055e-05   4.99988675e-01   4.53958055e-05
+                 4.53958055e-05   9.99909163e-01   4.53958055e-05
+                 4.53958055e-05   9.99909163e-01   4.53958055e-05
+                 4.53958055e-05   2.26994507e-05   9.99909163e-01
+                 4.53958055e-05   4.53958055e-05   9.99909163e-01
+                 4.53958055e-05   4.53958055e-05   9.99909163e-01", rows=N, cols=C*Hin*Win)
+  y = matrix("1 0 0
+              1 0 0
+              1 0 0
+              0 1 0
+              0 1 0
+              0 1 0
+              0 0 1
+              0 0 1
+              0 0 1
+              1 0 0
+              1 0 0
+              1 0 0
+              0 1 0
+              0 1 0
+              0 1 0
+              0 0 1
+              0 0 1
+              0 0 1", rows=N, cols=C*Hin*Win)
+
+  loss = cross_entropy_loss2d::forward(pred, y, C)
+
+  # Equivalency check
+  rel_error = test_util::check_rel_error(loss, loss_expected, 1e-3, 1e-4)
+}
+
 im2col = function() {
   /*
    * Test for the `im2col` and `col2im` functions.