Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/01/25 02:52:41 UTC

[GitHub] yuewu001 opened a new issue #9557: update_on_kvstore error setting with multiple machines

URL: https://github.com/apache/incubator-mxnet/issues/9557
 
 
   When training with multiple machines, I found that the [model.py:_create_kvstore](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/model.py) function sets update_on_kvstore to True. In the gluon interface (trainer.py), I found the following code:
   ```python
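   # Gluon forces update_on_kvstore off for distributed kvstores.
   if 'dist' in kvstore.type:
       update_on_kvstore = False
   # Every parameter is initialized and pulled, whatever update_on_kvstore is.
   for i, param in enumerate(self._params):
       param_arrays = param.list_data()
       kvstore.init(i, param_arrays[0])
       kvstore.pull(i, param_arrays, priority=-i)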
   ```
   In module.py, however, update_on_kvstore is not set to False.
   
   Is this a bug?
   
   Besides, the gluon interface pulls all param_arrays regardless of the value of update_on_kvstore. In the Python interface (model.py), however, the params are pulled only when update_on_kvstore is True. Is there a reason for this difference?
   
   ```python
   def _initialize_kvstore(kvstore, param_arrays, arg_params, param_names, update_on_kvstore):
       """Initialize kvstore"""
       for idx, param_on_devs in enumerate(param_arrays):
           name = param_names[idx]
           kvstore.init(name, arg_params[name])
   
           if update_on_kvstore:
               kvstore.pull(name, param_on_devs, priority=-idx)
   ```
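
The difference between the two initialization paths can be reproduced without MXNet. The following is only a minimal sketch: `RecordingKVStore`, `gluon_style_init` and `model_style_init` are hypothetical names introduced for illustration, and the stand-in store just records which calls it receives. The two functions mirror the snippets quoted above, not the full library code.

```python
class RecordingKVStore:
    """Hypothetical stand-in for a kvstore; it only records the calls made."""

    def __init__(self, kv_type):
        self.type = kv_type
        self.calls = []

    def init(self, key, value):
        self.calls.append(("init", key))

    def pull(self, key, out, priority=0):
        self.calls.append(("pull", key))


def gluon_style_init(kvstore, params, update_on_kvstore):
    """Mirrors the trainer.py snippet: every parameter is pulled unconditionally."""
    if 'dist' in kvstore.type:
        update_on_kvstore = False
    for i, param_arrays in enumerate(params):
        kvstore.init(i, param_arrays[0])
        kvstore.pull(i, param_arrays, priority=-i)
    return update_on_kvstore


def model_style_init(kvstore, param_arrays, arg_params, param_names,
                     update_on_kvstore):
    """Mirrors the model.py snippet: pull only when update_on_kvstore is True."""
    for idx, param_on_devs in enumerate(param_arrays):
        name = param_names[idx]
        kvstore.init(name, arg_params[name])
        if update_on_kvstore:
            kvstore.pull(name, param_on_devs, priority=-idx)
    return update_on_kvstore


# One parameter replicated on two devices (plain lists stand in for NDArrays).
params = [[[0.0], [0.0]]]

kv_gluon = RecordingKVStore("dist_sync")
gluon_flag = gluon_style_init(kv_gluon, params, update_on_kvstore=True)

kv_model = RecordingKVStore("dist_sync")
model_flag = model_style_init(kv_model, params, {"w": [0.0]}, ["w"],
                              update_on_kvstore=False)
```

With a `dist` store, the gluon-style path records both an `init` and a `pull` and returns the flag as False. The model-style path with `update_on_kvstore=False` records only the `init`, so the device copies are never refreshed from the store at initialization.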
