Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/01/09 10:43:59 UTC
[GitHub] alexmosc opened a new issue #9358: Why does running 1 round of MXNet model training produce Train-mse=NaN?
URL: https://github.com/apache/incubator-mxnet/issues/9358
If I run just 1 round of MXNet model training with `mx.model.FeedForward.create`, I get NaN as the training error. Is this intentional?
```
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mxnet_0.10.1 pryr_0.1.3 quantregForest_1.3-6 RColorBrewer_1.1-2 randomForest_4.6-12 ggjoy_0.4.0 ggridges_0.4.1
[8] DT_0.2 caret_6.0-77 lattice_0.20-35 FSelector_0.21 scales_0.5.0 nnet_7.3-12 infotheo_1.2.0
[15] cluster_2.0.6 forecast_8.2 gridExtra_2.3 kableExtra_0.6.1 knitr_1.17 rmarkdown_1.8 markdown_0.8
[22] TTR_0.23-2 tseries_0.10-42 ggplot2_2.2.1 magrittr_1.5 data.table_1.10.4-3
loaded via a namespace (and not attached):
[1] colorspace_1.3-2 class_7.3-14 rprojroot_1.2 rstudioapi_0.7 DRR_0.0.2 prodlim_1.6.1 lubridate_1.7.1 xml2_1.1.1
[9] codetools_0.2-15 splines_3.4.0 mnormt_1.5-5 robustbase_0.92-8 RcppRoll_0.2.2 jsonlite_1.5 entropy_1.2.1 rJava_0.9-9
[17] broom_0.4.3 ddalpha_1.3.1 kernlab_0.9-25 sfsmisc_1.1-1 DiagrammeR_0.9.2 readr_1.1.1 compiler_3.4.0 httr_1.3.1
[25] backports_1.1.1 assertthat_0.2.0 Matrix_1.2-9 lazyeval_0.2.1 visNetwork_2.0.1 htmltools_0.3.6 tools_3.4.0 bindrcpp_0.2
[33] igraph_1.1.2 gtable_0.2.0 glue_1.2.0 reshape2_1.4.2 dplyr_0.7.4 Rcpp_0.12.14 rgexf_0.15.3 fracdiff_1.4-2
[41] nlme_3.1-131 iterators_1.0.8 psych_1.7.8 lmtest_0.9-35 timeDate_3042.101 gower_0.1.2 stringr_1.2.0 rvest_0.3.2
[49] RWekajars_3.9.1-5 XML_3.98-1.9 DEoptimR_1.0-8 MASS_7.3-47 zoo_1.8-0 ipred_0.9-6 hms_0.4.0 parallel_3.4.0
[57] quantmod_0.4-11 curl_3.0 downloader_0.4 rpart_4.1-11 stringi_1.1.6 Rook_1.1-1 foreach_1.4.3 RWeka_0.4-36
[65] lava_1.5.1 rlang_0.1.4 pkgconfig_2.0.1 evaluate_0.10.1 purrr_0.2.4 bindr_0.1 recipes_0.1.1 htmlwidgets_0.9
[73] CVST_0.2-1 tidyselect_0.2.3 plyr_1.8.4 R6_2.2.2 dimRed_0.1.0 foreign_0.8-67 withr_2.1.0 xts_0.10-0
[81] survival_2.41-3 tibble_1.3.4 viridis_0.4.0 grid_3.4.0 influenceR_0.1.0 ModelMetrics_1.1.0 digest_0.6.12 tidyr_0.7.2
[89] brew_1.0-6 stats4_3.4.0 munsell_0.4.3 viridisLite_0.2.0 quadprog_1.5-5
```
Console:
```
Start training with 1 devices
[1] Train-mse=NaN
```
```
library(mxnet)
hidden_u_1 <- 10
activ_hidden_1 <- 'tanh'
hidden_u_2 <- 1
learn_rate <- 0.001
initializer <- mx.init.uniform(1)
optimizer <- 'rmsprop' #sgd
loss <- mx.metric.mse
device.cpu <- mx.cpu()
mini_batch <- 64 #8
rounds <- 1 #2
## data symbols
nn_data <- mx.symbol.Variable('data')
nn_label <- mx.symbol.Variable('label')
## first fully connected layer
fc1 <- mx.symbol.FullyConnected(data = nn_data, num_hidden = hidden_u_1)
activ1 <- mx.symbol.Activation(data = fc1, act.type = activ_hidden_1)
## second fully connected layer
fc2 <- mx.symbol.FullyConnected(data = activ1, num_hidden = hidden_u_2)
q_func <- mx.symbol.LinearRegressionOutput(data = fc2, label = nn_label, name = 'regr')
## synthetic training data: exactly one mini-batch (64 rows, 10 features)
train.x <- matrix(rnorm(mini_batch * 10, 0, 1), ncol = 10)
train.y <- rnorm(mini_batch, 0, 1)
## initialize and train the NN
nn_model <- mx.model.FeedForward.create(
  symbol = q_func,
  X = train.x,
  y = train.y,
  ctx = device.cpu,
  num.round = rounds,
  array.batch.size = mini_batch, #60
  optimizer = optimizer,
  eval.metric = loss,
  learning.rate = learn_rate,
  initializer = initializer
)
```
## What have you tried to solve it?
If I use 2 or more rounds, or a mini-batch size smaller than the number of samples in my dataset, I get a numeric training error instead of NaN.
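
The report does not establish where the NaN comes from. One hypothesis that may be worth checking, stated here as an assumption and not as a confirmed cause, is that the training metric is reported as a running average over batches and is read before any batch has been counted. The sketch below uses base R only; `metric_init`, `metric_update` and `metric_get` are hypothetical helpers written for illustration and are not the mxnet implementation. It shows that such an average evaluates to NaN with zero recorded batches and to an ordinary number after one batch.

```r
# Hypothetical (count, total) accumulator for a per-batch MSE.
# Illustration only: this is NOT taken from the mxnet source.
metric_init <- function() c(count = 0, total = 0)

metric_update <- function(state, label, pred) {
  # Record one more batch and add its mean squared error to the total.
  state[["count"]] <- state[["count"]] + 1
  state[["total"]] <- state[["total"]] + mean((label - pred)^2)
  state
}

# Average MSE over the batches recorded so far.
metric_get <- function(state) state[["total"]] / state[["count"]]

set.seed(1)
label <- rnorm(64)
pred <- rnorm(64)

# Metric read before any batch was recorded: 0 / 0 is NaN in R.
empty_value <- metric_get(metric_init())

# Metric read after one recorded batch: a finite number.
one_batch_value <- metric_get(metric_update(metric_init(), label, pred))

print(is.nan(empty_value))
print(is.finite(one_batch_value))
```

If this hypothesis holds, the NaN would be a reporting artifact of the single-batch, single-round setup and would not mean the loss itself diverged. Confirming or ruling it out requires inspecting the training loop in the installed mxnet version.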