Posted to commits@mxnet.apache.org by qk...@apache.org on 2017/08/03 22:22:11 UTC
[incubator-mxnet] branch master updated: [R] update docs from
mx.symbol.MakeLoss. close #2922 (#7325)
This is an automated email from the ASF dual-hosted git repository.
qkou pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git
The following commit(s) were added to refs/heads/master by this push:
new dd4512f [R] update docs from mx.symbol.MakeLoss. close #2922 (#7325)
dd4512f is described below
commit dd4512f82051711240adc301033e52bec7998abf
Author: Qiang Kou (KK) <qk...@qkou.info>
AuthorDate: Thu Aug 3 22:22:09 2017 +0000
[R] update docs from mx.symbol.MakeLoss. close #2922 (#7325)
---
R-package/vignettes/CustomLossFunction.Rmd | 159 +++++++++++++++++++++
docs/tutorials/r/CustomLossFunction.md | 220 ++++++++++++++++++++++++-----
2 files changed, 341 insertions(+), 38 deletions(-)
diff --git a/R-package/vignettes/CustomLossFunction.Rmd b/R-package/vignettes/CustomLossFunction.Rmd
new file mode 100644
index 0000000..1817109
--- /dev/null
+++ b/R-package/vignettes/CustomLossFunction.Rmd
@@ -0,0 +1,159 @@
+---
+title: "Customized loss function"
+output:
+ md_document:
+ variant: markdown_github
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+```
+
+This tutorial provides guidelines for using a customized loss function in network construction.
+
+Model Training Example
+----------
+
+Let's begin with a small regression example. We can build and train a regression model with the following code:
+
+```{r}
+data(BostonHousing, package = "mlbench")
+BostonHousing[, sapply(BostonHousing, is.factor)] <-
+ as.numeric(as.character(BostonHousing[, sapply(BostonHousing, is.factor)]))
+BostonHousing <- data.frame(scale(BostonHousing))
+
+test.ind = seq(1, 506, 5) # 1 pt in 5 used for testing
+train.x = data.matrix(BostonHousing[-test.ind,-14])
+train.y = BostonHousing[-test.ind, 14]
+test.x = data.matrix(BostonHousing[test.ind, -14])
+test.y = BostonHousing[test.ind, 14]
+
+require(mxnet)
+
+data <- mx.symbol.Variable("data")
+label <- mx.symbol.Variable("label")
+fc1 <- mx.symbol.FullyConnected(data, num_hidden = 14, name = "fc1")
+tanh1 <- mx.symbol.Activation(fc1, act_type = "tanh", name = "tanh1")
+fc2 <- mx.symbol.FullyConnected(tanh1, num_hidden = 1, name = "fc2")
+lro <- mx.symbol.LinearRegressionOutput(fc2, name = "lro")
+
+mx.set.seed(0)
+model <- mx.model.FeedForward.create(lro, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 5,
+ array.batch.size = 60,
+ optimizer = "rmsprop",
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+
+pred <- predict(model, test.x)
+sum((test.y - pred[1,])^2) / length(test.y)
+```
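The last line above is the mean squared error on the held-out rows. As a quick base-R sanity check of that metric (the vectors below are made-up toy values for illustration; no mxnet required):

```r
# Toy label/prediction vectors (hypothetical values, for illustration only)
label <- c(1.0, 2.0, 3.0, 4.0)
pred  <- c(1.1, 1.9, 3.2, 3.8)

# Mean squared error, computed exactly as in the chunk above
mse <- sum((label - pred)^2) / length(label)
print(mse)  # 0.025
```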
+
+Besides `LinearRegressionOutput`, MXNet also provides `LogisticRegressionOutput` and `MAERegressionOutput`.
+However, these predefined losses might not be enough for real-world models. You can provide your own loss function
+by using `mx.symbol.MakeLoss` when constructing the network.
+
+How to Use Your Own Loss Function
+---------
+
+We still use our previous example, but this time we use `mx.symbol.MakeLoss` to minimize `(pred - label)^2` directly.
+
+```{r}
+data <- mx.symbol.Variable("data")
+label <- mx.symbol.Variable("label")
+fc1 <- mx.symbol.FullyConnected(data, num_hidden = 14, name = "fc1")
+tanh1 <- mx.symbol.Activation(fc1, act_type = "tanh", name = "tanh1")
+fc2 <- mx.symbol.FullyConnected(tanh1, num_hidden = 1, name = "fc2")
+lro2 <- mx.symbol.MakeLoss(mx.symbol.square(mx.symbol.Reshape(fc2, shape = 0) - label), name="lro2")
+```
+
+Then we can train the network just as usual.
+
+```{r}
+mx.set.seed(0)
+model2 <- mx.model.FeedForward.create(lro2, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 5,
+ array.batch.size = 60,
+ optimizer = "rmsprop",
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+```
+
+We would expect very similar results, because we are minimizing the same loss function.
+However, the reported value is quite different.
+
+```{r}
+pred2 <- predict(model2, test.x)
+sum((test.y - pred2)^2) / length(test.y)
+```
+
+This is because the output of `mx.symbol.MakeLoss` is the gradient of the loss with respect to the input data,
+not the prediction itself. We can recover the real prediction as shown below.
+
+```{r}
+internals = internals(model2$symbol)
+fc_symbol = internals[[match("fc2_output", outputs(internals))]]
+
+model3 <- list(symbol = fc_symbol,
+ arg.params = model2$arg.params,
+ aux.params = model2$aux.params)
+
+class(model3) <- "MXFeedForwardModel"
+
+pred3 <- predict(model3, test.x)
+sum((test.y - pred3[1,])^2) / length(test.y)
+```
+
+Many operations are provided on symbols, so other losses are easy to express. An example using the absolute error `|pred - label|` is shown below.
+
+```{r}
+lro_abs <- mx.symbol.MakeLoss(mx.symbol.abs(mx.symbol.Reshape(fc2, shape = 0) - label))
+mx.set.seed(0)
+model4 <- mx.model.FeedForward.create(lro_abs, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 20,
+ array.batch.size = 60,
+ optimizer = "sgd",
+ learning.rate = 0.001,
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+
+internals = internals(model4$symbol)
+fc_symbol = internals[[match("fc2_output", outputs(internals))]]
+
+model5 <- list(symbol = fc_symbol,
+ arg.params = model4$arg.params,
+ aux.params = model4$aux.params)
+
+class(model5) <- "MXFeedForwardModel"
+
+pred5 <- predict(model5, test.x)
+sum(abs(test.y - pred5[1,])) / length(test.y)
+```
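The absolute-error loss optimizes MAE rather than MSE. A small base-R illustration (made-up toy numbers, no mxnet needed) shows why that choice matters: a single large residual inflates MSE quadratically but MAE only linearly:

```r
label <- c(1, 2, 3, 4, 50)   # last point is an outlier (made-up data)
pred  <- c(1, 2, 3, 4, 5)    # a fit that ignores the outlier

mae <- sum(abs(label - pred)) / length(label)
mse <- sum((label - pred)^2) / length(label)

print(mae)  # 9   : the outlier contributes linearly
print(mse)  # 405 : the same outlier dominates quadratically
```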
+
+
+For comparison, the built-in `MAERegressionOutput` trains the same absolute-error model:
+
+```{r}
+lro_mae <- mx.symbol.MAERegressionOutput(fc2, name = "lro")
+mx.set.seed(0)
+model6 <- mx.model.FeedForward.create(lro_mae, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 20,
+ array.batch.size = 60,
+ optimizer = "sgd",
+ learning.rate = 0.001,
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+pred6 <- predict(model6, test.x)
+sum(abs(test.y - pred6[1,])) / length(test.y)
+```
+
diff --git a/docs/tutorials/r/CustomLossFunction.md b/docs/tutorials/r/CustomLossFunction.md
index a710480..afb9951 100644
--- a/docs/tutorials/r/CustomLossFunction.md
+++ b/docs/tutorials/r/CustomLossFunction.md
@@ -3,57 +3,201 @@ Customized loss function
This tutorial provides guidelines for using customized loss function in network construction.
-
Model Training Example
-----------
+----------------------
Let's begin with a small regression example. We can build and train a regression model with the following code:
+``` r
+data(BostonHousing, package = "mlbench")
+BostonHousing[, sapply(BostonHousing, is.factor)] <-
+ as.numeric(as.character(BostonHousing[, sapply(BostonHousing, is.factor)]))
+BostonHousing <- data.frame(scale(BostonHousing))
+
+test.ind = seq(1, 506, 5) # 1 pt in 5 used for testing
+train.x = data.matrix(BostonHousing[-test.ind,-14])
+train.y = BostonHousing[-test.ind, 14]
+test.x = data.matrix(BostonHousing[test.ind, -14])
+test.y = BostonHousing[test.ind, 14]
+
+require(mxnet)
+```
+
+ ## Loading required package: mxnet
+
+``` r
+data <- mx.symbol.Variable("data")
+label <- mx.symbol.Variable("label")
+fc1 <- mx.symbol.FullyConnected(data, num_hidden = 14, name = "fc1")
+tanh1 <- mx.symbol.Activation(fc1, act_type = "tanh", name = "tanh1")
+fc2 <- mx.symbol.FullyConnected(tanh1, num_hidden = 1, name = "fc2")
+lro <- mx.symbol.LinearRegressionOutput(fc2, name = "lro")
+
+mx.set.seed(0)
+model <- mx.model.FeedForward.create(lro, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 5,
+ array.batch.size = 60,
+ optimizer = "rmsprop",
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+```
+
+ ## Start training with 1 devices
+
+``` r
+pred <- predict(model, test.x)
+```
+
+ ## Warning in mx.model.select.layout.predict(X, model): Auto detect layout of input matrix, use rowmajor..
+
+``` r
+sum((test.y - pred[1,])^2) / length(test.y)
+```
- ```r
- library(mxnet)
- data(BostonHousing, package="mlbench")
- train.ind = seq(1, 506, 3)
- train.x = data.matrix(BostonHousing[train.ind, -14])
- train.y = BostonHousing[train.ind, 14]
- test.x = data.matrix(BostonHousing[-train.ind, -14])
- test.y = BostonHousing[-train.ind, 14]
- data <- mx.symbol.Variable("data")
- fc1 <- mx.symbol.FullyConnected(data, num_hidden=1)
- lro <- mx.symbol.LinearRegressionOutput(fc1)
- mx.set.seed(0)
- model <- mx.model.FeedForward.create(
- lro, X=train.x, y=train.y,
- eval.data=list(data=test.x, label=test.y),
- ctx=mx.cpu(), num.round=10, array.batch.size=20,
- learning.rate=2e-6, momentum=0.9, eval.metric=mx.metric.rmse)
- ```
-
-Besides the `LinearRegressionOutput`, we also provide `LogisticRegressionOutput` and `MAERegressionOutput`.
-However, this might not be enough for real-world models. You can provide your own loss function
-by using `mx.symbol.MakeLoss` when constructing the network.
+ ## [1] 0.2485236
+Besides `LinearRegressionOutput`, MXNet also provides `LogisticRegressionOutput` and `MAERegressionOutput`. However, these predefined losses might not be enough for real-world models. You can provide your own loss function by using `mx.symbol.MakeLoss` when constructing the network.
How to Use Your Own Loss Function
----------
+---------------------------------
+
+We still use our previous example, but this time we use `mx.symbol.MakeLoss` to minimize `(pred - label)^2` directly.
+
+``` r
+data <- mx.symbol.Variable("data")
+label <- mx.symbol.Variable("label")
+fc1 <- mx.symbol.FullyConnected(data, num_hidden = 14, name = "fc1")
+tanh1 <- mx.symbol.Activation(fc1, act_type = "tanh", name = "tanh1")
+fc2 <- mx.symbol.FullyConnected(tanh1, num_hidden = 1, name = "fc2")
+lro2 <- mx.symbol.MakeLoss(mx.symbol.square(mx.symbol.Reshape(fc2, shape = 0) - label), name="lro2")
+```
+
+Then we can train the network just as usual.
+
+``` r
+mx.set.seed(0)
+model2 <- mx.model.FeedForward.create(lro2, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 5,
+ array.batch.size = 60,
+ optimizer = "rmsprop",
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+```
+
+ ## Start training with 1 devices
+
+We would expect very similar results, because we are minimizing the same loss function. However, the reported value is quite different.
+
+``` r
+pred2 <- predict(model2, test.x)
+```
+
+ ## Warning in mx.model.select.layout.predict(X, model): Auto detect layout of input matrix, use rowmajor..
+
+``` r
+sum((test.y - pred2)^2) / length(test.y)
+```
+
+ ## [1] 1.234584
+
+This is because the output of `mx.symbol.MakeLoss` is the gradient of the loss with respect to the input data, not the prediction itself. We can recover the real prediction as shown below.
+
+``` r
+internals = internals(model2$symbol)
+fc_symbol = internals[[match("fc2_output", outputs(internals))]]
+
+model3 <- list(symbol = fc_symbol,
+ arg.params = model2$arg.params,
+ aux.params = model2$aux.params)
+
+class(model3) <- "MXFeedForwardModel"
+
+pred3 <- predict(model3, test.x)
+```
+
+ ## Warning in mx.model.select.layout.predict(X, model): Auto detect layout of input matrix, use rowmajor..
+
+``` r
+sum((test.y - pred3[1,])^2) / length(test.y)
+```
+
+ ## [1] 0.248294
+
+Many operations are provided on symbols, so other losses are easy to express. An example using the absolute error `|pred - label|` is shown below.
+
+``` r
+lro_abs <- mx.symbol.MakeLoss(mx.symbol.abs(mx.symbol.Reshape(fc2, shape = 0) - label))
+mx.set.seed(0)
+model4 <- mx.model.FeedForward.create(lro_abs, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 20,
+ array.batch.size = 60,
+ optimizer = "sgd",
+ learning.rate = 0.001,
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+```
+
+ ## Start training with 1 devices
+
+``` r
+internals = internals(model4$symbol)
+fc_symbol = internals[[match("fc2_output", outputs(internals))]]
+
+model5 <- list(symbol = fc_symbol,
+ arg.params = model4$arg.params,
+ aux.params = model4$aux.params)
+
+class(model5) <- "MXFeedForwardModel"
+
+pred5 <- predict(model5, test.x)
+```
+
+ ## Warning in mx.model.select.layout.predict(X, model): Auto detect layout of input matrix, use rowmajor..
+
+``` r
+sum(abs(test.y - pred5[1,])) / length(test.y)
+```
+
+ ## [1] 0.7056902
+
+For comparison, the built-in `MAERegressionOutput` trains the same absolute-error model:
+
+``` r
+lro_mae <- mx.symbol.MAERegressionOutput(fc2, name = "lro")
+mx.set.seed(0)
+model6 <- mx.model.FeedForward.create(lro_mae, X = train.x, y = train.y,
+ ctx = mx.cpu(),
+ num.round = 20,
+ array.batch.size = 60,
+ optimizer = "sgd",
+ learning.rate = 0.001,
+ verbose = TRUE,
+ array.layout = "rowmajor",
+ batch.end.callback = NULL,
+ epoch.end.callback = NULL)
+```
+
+ ## Start training with 1 devices
-We still use our previous example.
+``` r
+pred6 <- predict(model6, test.x)
+```
- ```r
- library(mxnet)
- data <- mx.symbol.Variable("data")
- fc1 <- mx.symbol.FullyConnected(data, num_hidden=1)
- lro <- mx.symbol.MakeLoss(mx.symbol.square(mx.symbol.Reshape(fc1, shape = 0) - label))
- ```
+ ## Warning in mx.model.select.layout.predict(X, model): Auto detect layout of input matrix, use rowmajor..
-In the last line of network definition, we do not use the predefined loss function. We define the loss
-by ourselves, which is `(pred-label)^2`.
+``` r
+sum(abs(test.y - pred6[1,])) / length(test.y)
+```
-We have provided many operations on the symbols, so you can also define `|pred-label|` using the line below.
+ ## [1] 0.7056902
- ```r
- lro <- mx.symbol.MakeLoss(mx.symbol.abs(mx.symbol.Reshape(fc1, shape = 0) - label))
- ```
## Next Steps
* [Neural Networks with MXNet in Five Minutes](http://mxnet.io/tutorials/r/fiveMinutesNeuralNetwork.html)