Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/04/10 21:35:17 UTC

[GitHub] [incubator-mxnet] szhengac opened a new issue #18024: Transformer Model Segfault

szhengac opened a new issue #18024: Transformer Model Segfault
URL: https://github.com/apache/incubator-mxnet/issues/18024
 
 
   Training the transformer in [gluonnlp](https://github.com/dmlc/gluon-nlp/tree/master/scripts/machine_translation) with the master branch leads to a segmentation fault. The training passed with the nightly build from March 12th. I used an AWS Linux AMI with cuda100.
   
   
   Command: 
   `
   python train_transformer.py --dataset WMT2014BPE --src_lang en --tgt_lang de --batch_size 2700 --optimizer adam --num_accumulated 16 --lr 3.0 --warmup_steps 4000 --save_dir transformer_en_de_u512 --epochs 30 --gpus 0,1,2,3,4,5,6,7 --scaled --average_start 5 --num_buckets 20 --bucket_scheme exp --bleu 13a --log_interval 10
   `

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] szhengac commented on issue #18024: Transformer Model Segfault

Posted by GitBox <gi...@apache.org>.
szhengac commented on issue #18024: Transformer Model Segfault
URL: https://github.com/apache/incubator-mxnet/issues/18024#issuecomment-612329865
 
 
   @leezu There is no message other than the segmentation fault.


[GitHub] [incubator-mxnet] leezu commented on issue #18024: Transformer Model Segfault

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18024: Transformer Model Segfault
URL: https://github.com/apache/incubator-mxnet/issues/18024#issuecomment-612329359
 
 
   Can you include the backtrace?
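
   One possible way to collect at least a Python-level traceback when the process dies with only "Segmentation fault" is the standard-library `faulthandler` module. The sketch below is an illustration, not part of the original report: the helper names `build_command` and `run_and_capture` are hypothetical, and it assumes the training script is launched as a child process. `faulthandler` only shows Python frames; native frames would need a debugger such as gdb (`gdb --args python train_transformer.py ...`, then `run` and `bt`).

```python
import subprocess
import sys


def build_command(script_args):
    """Prefix an invocation with -X faulthandler (hypothetical helper).

    With this interpreter option, a fatal signal such as SIGSEGV makes
    CPython dump the Python traceback of every thread to stderr.
    """
    return [sys.executable, "-X", "faulthandler", *script_args]


def run_and_capture(script_args):
    """Run the command and return (return code, captured stderr)."""
    proc = subprocess.run(
        build_command(script_args),
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stderr


# Example use (script name and flags taken from the report above):
#   rc, err = run_and_capture(["train_transformer.py", "--dataset", "WMT2014BPE"])
#   print(err)  # would contain the dumped traceback if the run crashed
```

   Setting the environment variable `PYTHONFAULTHANDLER=1` before running the original command should have the same effect without a wrapper.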
