Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/02/17 01:27:39 UTC

[GitHub] Ishitori commented on issue #13644: aws distributed training struck !

URL: https://github.com/apache/incubator-mxnet/issues/13644#issuecomment-464405780
 
 
   @Davdi, did you manage to find the issue? To me it looks like a connectivity problem between the instances, because when you remove `--kvstore dist_sync` the default value is used, which for this tutorial is `device`, and `device` does not communicate across machines.
   
   I also noticed that you set n = 1 but mention "2 instances". Which configuration are you trying to achieve? And what does your hosts file contain?
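
In case it helps narrow this down, here is a minimal sketch of the two checks I have in mind: whether n matches the number of entries in the hosts file, and whether each host accepts a TCP connection. It assumes the hosts file lists one hostname or IP per line and that you launch over SSH, so port 22 is my guess at the relevant port. The function names and the `hosts` path are hypothetical, not part of MXNet.

```python
import socket


def parse_hosts(text):
    """Return the non-empty, non-comment lines of a hosts file."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.append(line)
    return hosts


def worker_count_matches(hosts, n):
    """True when n equals the number of listed hosts (one worker per host)."""
    return len(hosts) == n


def reachable(host, port=22, timeout=3.0):
    """Try to open a TCP connection. Port 22 is an assumption (SSH launcher)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check(path, n):
    """Print a short report for the hosts file at `path` and worker count `n`."""
    with open(path) as f:
        hosts = parse_hosts(f.read())
    print("hosts:", hosts)
    print("n matches number of hosts:", worker_count_matches(hosts, n))
    for host in hosts:
        print(host, "reachable:", reachable(host))
```

Calling `check("hosts", 1)` from the machine you launch from prints the report. A passing SSH check only shows that the launcher can reach the hosts; the workers may still be blocked on other ports, for example by security group rules.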
