You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/01/12 20:23:00 UTC

[GitHub] opringle opened a new issue #9405: BucketingModule causes error if symbol name not specified

opringle opened a new issue #9405: BucketingModule causes error if symbol name not specified
URL: https://github.com/apache/incubator-mxnet/issues/9405
 
 
   ## Description
   
   When using the *BucketingModule* together with *mx.rnn.BuckSentenceIter*, I hit an uncaught exception if I do not specify the symbol name in either *mx.sym.Embedding* or *mx.sym.FullyConnected* symbols, despite each symbol in the network having a unique name.
   
   ## Environment info (Required)
   
   ```
   ----------Python Info----------
   Version      : 3.6.3
   Compiler     : GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)
   Build        : ('default', 'Oct  4 2017 06:09:15')
   Arch         : ('64bit', '')
   ------------Pip Info-----------
   Version      : 9.0.1
   Directory    : /Users/opringle/Envs/finn-dl/lib/python3.6/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.0.0
   Directory    : /Users/opringle/Envs/finn-dl/lib/python3.6/site-packages/mxnet
   Commit Hash   : 2b67436802b750e15b9fbfdf275958c1000be6a8
   ----------System Info----------
   Platform     : Darwin-17.3.0-x86_64-i386-64bit
   system       : Darwin
   node         : MacBook-Pro-2.local
   release      : 17.3.0
   version      : Darwin Kernel Version 17.3.0: Thu Nov  9 18:09:22 PST 2017; root:xnu-4570.31.3~1/RELEASE_X86_64
   ----------Hardware Info----------
   machine      : x86_64
   processor    : i386
   b'machdep.cpu.brand_string: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz'
   b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
   b'machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM SMAP RDSEED ADX IPT SGX FPU_CSDS MPX CLFSOPT'
   b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI'
   ----------Network Test----------
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0077 sec, LOAD: 0.4726 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0263 sec, LOAD: 0.0633 sec.
   Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0897 sec, LOAD: 0.0599 sec.
   Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0259 sec, LOAD: 0.1184 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0139 sec, LOAD: 0.2371 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0305 sec, LOAD: 0.2018 sec.
   ```
   
   I'm using python
   
   ## Error Message:
   
   ```
   libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: unordered_map::at: key not found
   Abort trap: 6
   ```
   
   ## Minimum reproducible example
   
   ```python
   import mxnet as mx
   import random
   
   #synthetically create 1000 encoded sentences
   encoded_sentences = []
   for i in list(range(1000)):
       sentence_length = random.randint(1,20)
       sentence = random.sample(range(20), sentence_length)
       encoded_sentences.append(sentence)
   
   #define hyperparameters/info
   vocab = list(set(index for list in encoded_sentences for index in list))
   batch_size = 18
   num_embed = 10
   num_hidden = 3
   buckets = [5,10,15,20]
   epochs = 15
   
   #use mxnet bucketing iterator, label is next value in sentence
   train_iter = mx.rnn.BucketSentenceIter(sentences = encoded_sentences,
                                          batch_size = batch_size,
                                          buckets = buckets,
                                          data_name = 'data',
                                          label_name = 'softmax_label')
   
   #define recurrent cell outside of network TODO: why outside?
   r_cell = mx.rnn.LSTMCell(num_hidden=num_hidden)
   
   #define a network symbol based on sentence length
   def sym_gen1(seq_len):
   
       print("-" * 50)
   
       #define input shape so we can infer layer shapes as we go
       data_shape = (batch_size, seq_len)
       label_shape = (batch_size, seq_len)
   
       data = mx.sym.Variable('data')
       print("\ndata shape: ", data.infer_shape(data=data_shape)[1][0], "\n")
   
       label = mx.sym.Variable('softmax_label')
       print("\nlabel shape: ", label.infer_shape(softmax_label=label_shape)[1][0], "\n")
   
       embed = mx.sym.Embedding(data=data, input_dim=len(vocab), output_dim=num_embed,name='embed')
       print("\nembed layer shape: ", embed.infer_shape(data=data_shape)[1][0], "\n")
   
       output, shapes = r_cell.unroll(seq_len, inputs=embed, merge_outputs=True)
       print("\nconcatenated recurrent layer shape: ", output.infer_shape(data=data_shape)[1][0], "after ", seq_len, " unrolls\n")
   
       reshape = mx.sym.Reshape(output, shape=(-1, num_hidden))
       print("\nafter reshaping: ", reshape.infer_shape(data=data_shape)[1][0], "\n")
   
       fc = mx.sym.FullyConnected(data=reshape, num_hidden=len(vocab), name='pred')
       print("\nfully connected layer shape: ", fc.infer_shape(data=data_shape)[1][0], "\n")
   
       label = mx.sym.Reshape(label, shape=(-1,))
       print("\nlabel shape after reshaping: ", label.infer_shape(softmax_label=label_shape)[1][0], "\n")
   
       pred = mx.sym.SoftmaxOutput(data=fc, label=label, name='softmax')
       print("\nsoftmax shape: ", pred.infer_shape(data=data_shape, softmax_label = label_shape)[1][0], "\n")
   
       return pred, ('data',), ('softmax_label',)
   
   def sym_gen2(seq_len):
   
       print("-" * 50)
   
       #define input shape so we can infer layer shapes as we go
       data_shape = (batch_size, seq_len)
       label_shape = (batch_size, seq_len)
   
       data = mx.sym.Variable('data')
       print("\ndata shape: ", data.infer_shape(data=data_shape)[1][0], "\n")
   
       label = mx.sym.Variable('softmax_label')
       print("\nlabel shape: ", label.infer_shape(softmax_label=label_shape)[1][0], "\n")
   
       embed = mx.sym.Embedding(data=data, input_dim=len(vocab), output_dim=num_embed,name='embed')
       print("\nembed layer shape: ", embed.infer_shape(data=data_shape)[1][0], "\n")
   
       output, shapes = r_cell.unroll(seq_len, inputs=embed, merge_outputs=True)
       print("\nconcatenated recurrent layer shape: ", output.infer_shape(data=data_shape)[1][0], "after ", seq_len, " unrolls\n")
   
       reshape = mx.sym.Reshape(output, shape=(-1, num_hidden))
       print("\nafter reshaping: ", reshape.infer_shape(data=data_shape)[1][0], "\n")
   
       fc = mx.sym.FullyConnected(data=reshape, num_hidden=len(vocab))
       print("\nfully connected layer shape: ", fc.infer_shape(data=data_shape)[1][0], "\n")
   
       label = mx.sym.Reshape(label, shape=(-1,))
       print("\nlabel shape after reshaping: ", label.infer_shape(softmax_label=label_shape)[1][0], "\n")
   
       pred = mx.sym.SoftmaxOutput(data=fc, label=label, name='softmax')
       print("\nsoftmax shape: ", pred.infer_shape(data=data_shape, softmax_label = label_shape)[1][0], "\n")
   
       return pred, ('data',), ('softmax_label',)
   
   
   #create a trainable bucketing module on cpu
   model = mx.mod.BucketingModule(sym_gen=sym_gen1, default_bucket_key=train_iter.default_bucket_key, context=mx.cpu())
   model_not_working = mx.mod.BucketingModule(sym_gen=sym_gen2, default_bucket_key=train_iter.default_bucket_key, context=mx.cpu())
   
   #fit the first module
   print("\n\n\nFITTING FIRST MODULE...\n\n\n")
   metric = mx.metric.create('loss')
   model.bind(data_shapes=train_iter.provide_data,
              label_shapes=train_iter.provide_label)
   model.init_params()
   model.init_optimizer(optimizer='Adam')
   for epoch in range(epochs):
       train_iter.reset()
       metric.reset()
       for batch in train_iter:
           model.forward(batch, is_train=True)             # compute predictions
           model.backward()                                # compute gradients
           model.update()                                  # update parameters
           model.update_metric(metric, batch.label)        # update metric
       print('\n', 'Epoch %d, Training %s' % (epoch, metric.get()))
   
   #fit the second module
   print("\n\n\nFITTING SECOND MODULE...\n\n\n")
   metric = mx.metric.create('loss')
   model_not_working.bind(data_shapes=train_iter.provide_data,
              label_shapes=train_iter.provide_label)
   model_not_working.init_params()
   model_not_working.init_optimizer(optimizer='Adam')
   for epoch in range(epochs):
       train_iter.reset()
       metric.reset()
       for batch in train_iter:
           model_not_working.forward(batch, is_train=True)             # compute predictions
           model_not_working.backward()                                # compute gradients
           model_not_working.update()                                  # update parameters
           model_not_working.update_metric(metric, batch.label)        # update metric
       print('\n', 'Epoch %d, Training %s' % (epoch, metric.get()))
   ```
   
   ## Steps to reproduce
   
   1. Run the script
   2. You will see the module fit correctly with *gen_sym1()* but fail with *gen_sym2()*.  The only difference between the two is the name arguement has been removed from *fc = mx.sym.FullyConnected(data=reshape, num_hidden=len(vocab), name='pred')*.
   
   ## What have you tried to solve it?
   
   1. I have solved the problem by including an input to the name parameter when creating the symbol. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services