
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/08/02 15:53:16 UTC

[GitHub] alexmosc opened a new issue #12002: how I properly dimensionalize my array and tune `rnn.graph.unroll` to make the LSTM work for this multidimensional sequence

URL: https://github.com/apache/incubator-mxnet/issues/12002
 
 
   This is essentially a call for help rather than a code-related issue.
   
   Assume a matrix with 5 rows and 20 columns. Each column is one sample (time step) of a multivariate time series, and each row is one dimension of that series.
   
   I also have a vector of 20 output values.
   
   I am trying to build an LSTM model with sequence length = 20 that iterates over samples 1 to 20 and regresses the associated output values.
   
   I get all sorts of "shape mismatch" and "You are trying to split the 0-th axis of input tensor with shape" error messages. 
   
   The question is how to shape my input array correctly and how to configure `rnn.graph.unroll` so that the LSTM works for this multidimensional sequence.
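   
   For concreteness, the data described above can be laid out in base R as in the sketch below. It only illustrates the shapes (5 dimensions by 20 time steps, plus 20 targets). The variable names are hypothetical, and the trailing sample axis is an assumption about what a sequence model might expect, not something confirmed for `rnn.graph.unroll`.
   
   ```r
   set.seed(1)                      # reproducible random data

   n_features <- 5                  # rows: dimensions of the series
   seq_len    <- 20                 # columns: time steps

   # 5 x 20 matrix: one column per time step, one row per dimension
   x <- matrix(rnorm(n_features * seq_len), nrow = n_features, ncol = seq_len)

   # one target value per time step
   y <- rnorm(seq_len)

   # Assumption (not confirmed for mxnet): add a trailing sample axis,
   # giving an array of shape (features, seq_len, n_samples)
   x3 <- array(x, dim = c(n_features, seq_len, 1))

   print(dim(x))
   print(length(y))
   print(dim(x3))
   ```
   
   Printing the dimensions before passing data to the model makes it easier to see which axis a "shape mismatch" message refers to.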
   
   
   ```
   library(mxnet)
   
   rm(symbol)
   
   symbol <- rnn.graph.unroll(seq_len = 20, 
                              num_rnn_layer =  1, 
                              num_hidden = 50,
                              input_size = NULL,
                              num_embed = NULL, 
                              num_decode = 1,
                              masking = FALSE, 
                              loss_output = "linear",
                              dropout = 0.2, 
                              ignore_label = -1,
                              cell_type = "lstm",
                              output_last_state = FALSE,
                              config = "seq-to-one")
   
   #graph.viz(symbol, type = "graph", direction = "LR", graph.height.px = 600, graph.width.px = 800)
   
   # train.data <- mx.io.arrayiter(
   #           data = matrix(rnorm(100, 0, 1), ncol = 20)
   #           , label = rnorm(20, 0, 1)
   #           , batch.size = 20
   #           , shuffle = F
   #                  )
   
   train.x <- array(
                  t(matrix(rnorm(100, 0, 1), nrow = 1))
                  , dim = c(5, 20)
   )
   
   train.y <- matrix(rnorm(20, 0, 1), nrow = 1)
   
   nn_model <- mx.model.FeedForward.create(
        symbol,
        X = train.x,
        y = train.y,
        ctx = mx.cpu(),
        begin.round = 1,
        num.round = 1000,
        optimizer = "sgd",
        learning.rate = 0.01,
        initializer = mx.init.uniform(0.01),
        eval.metric = mx.metric.mse,
        array.batch.size = 1,
        array.layout = 'colmajor'
   )
   ```
   
   Alexey
