Posted to commits@mxnet.apache.org by gi...@git.apache.org on 2017/08/10 05:43:24 UTC
[GitHub] idealboy commented on issue #7412: About van when using distribute training
URL: https://github.com/apache/incubator-mxnet/issues/7412#issuecomment-321456391
My environment settings for MXNet dist_sync training on two machines are below. The two machines can reach each other over SSH:
(1)10.15.240.189:
export DMLC_NUM_WORKER=1
export MXNET_GPU_WORKER_NTHREADS=8
export MXNET_CPU_WORKER_NTHREADS=8
export MXNET_CPU_PRIORITY_NTHREADS=8
export MXNET_GPU_COPY_NTHREADS=2
export DMLC_NUM_SERVER=1
export DMLC_PS_ROOT_URI=10.15.240.189
export DMLC_PS_ROOT_PORT=3000
export DMLC_ROLE=scheduler
export DMLC_INTERFACE="eth0"
(2)10.155.133.82:
export DMLC_ROLE=worker
export DMLC_WORKER_NUM=1
export DMLC_SERVER_NUM=1
export DMLC_PS_ROOT_URI=10.15.240.189
export DMLC_PS_ROOT_PORT=8000
export DMLC_ROLE=worker
export DMLC_INTERFACE="eth0"
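The two blocks of settings above can be compared mechanically. The following is a minimal sketch, not part of MXNet: the helper names `parse_exports` and `check_nodes` are hypothetical, and the list of recognized `DMLC_*` names is a short, non-exhaustive assumption based on the variables ps-lite is known to read (`DMLC_NUM_WORKER`, `DMLC_NUM_SERVER`, `DMLC_PS_ROOT_URI`, `DMLC_PS_ROOT_PORT`, `DMLC_ROLE`, `DMLC_INTERFACE`). It flags names outside that list and shared settings that differ between the nodes.

```python
import re

# Assumption: a short, non-exhaustive list of DMLC_* names read by ps-lite.
KNOWN_DMLC_VARS = {
    "DMLC_NUM_WORKER", "DMLC_NUM_SERVER", "DMLC_PS_ROOT_URI",
    "DMLC_PS_ROOT_PORT", "DMLC_ROLE", "DMLC_INTERFACE",
}
# Settings that every node is expected to agree on.
SHARED_VARS = ["DMLC_NUM_WORKER", "DMLC_NUM_SERVER",
               "DMLC_PS_ROOT_URI", "DMLC_PS_ROOT_PORT"]

EXPORT_RE = re.compile(r"^\s*export\s+(\w+)=(.*)$")


def parse_exports(text):
    """Turn lines of the form 'export NAME=value' into a dict (last one wins)."""
    env = {}
    for line in text.splitlines():
        match = EXPORT_RE.match(line)
        if match:
            env[match.group(1)] = match.group(2).strip().strip('"')
    return env


def check_nodes(first, second):
    """Return a list of human-readable problems found in two node settings."""
    problems = []
    for label, env in (("node 1", first), ("node 2", second)):
        for key in sorted(env):
            if key.startswith("DMLC_") and key not in KNOWN_DMLC_VARS:
                problems.append(f"{label}: {key} is not in the known list")
    for key in SHARED_VARS:
        if first.get(key) != second.get(key):
            problems.append(
                f"{key} differs: {first.get(key)!r} vs {second.get(key)!r}")
    return problems


# The DMLC_* settings as posted for the two machines.
NODE1 = """
export DMLC_NUM_WORKER=1
export DMLC_NUM_SERVER=1
export DMLC_PS_ROOT_URI=10.15.240.189
export DMLC_PS_ROOT_PORT=3000
export DMLC_ROLE=scheduler
export DMLC_INTERFACE="eth0"
"""
NODE2 = """
export DMLC_ROLE=worker
export DMLC_WORKER_NUM=1
export DMLC_SERVER_NUM=1
export DMLC_PS_ROOT_URI=10.15.240.189
export DMLC_PS_ROOT_PORT=8000
export DMLC_ROLE=worker
export DMLC_INTERFACE="eth0"
"""

if __name__ == "__main__":
    for problem in check_nodes(parse_exports(NODE1), parse_exports(NODE2)):
        print(problem)
```

Run against the settings as posted, this sketch reports that the second machine uses `DMLC_WORKER_NUM` and `DMLC_SERVER_NUM` rather than `DMLC_NUM_WORKER` and `DMLC_NUM_SERVER`, and that `DMLC_PS_ROOT_PORT` is 3000 on one machine and 8000 on the other. Whether these differences explain the behaviour in this issue is not established here.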
The command I run is:
python ../../tools/launch.py -n 2 -H hosts --sync-dst-dir /tmp/mxnet python train_mnist.py --network lenet --gpus 0 --kv-store dist_sync
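For reference, the command above can be assembled from its parts. This is a minimal sketch with hypothetical helper names (`write_hosts`, `build_launch_cmd`); it assumes the file passed to `-H` lists one host per line and uses only the flags that appear in the command itself.

```python
import shlex
import tempfile
from pathlib import Path


def write_hosts(path, hosts):
    """Write a host file, one host per line (assumed format for -H)."""
    Path(path).write_text("".join(host + "\n" for host in hosts))


def build_launch_cmd(num_workers, hostfile, sync_dir, train_args):
    """Build the launch.py argument list using the flags from the post."""
    return [
        "python", "../../tools/launch.py",
        "-n", str(num_workers),        # number of workers to launch
        "-H", hostfile,                # host file
        "--sync-dst-dir", sync_dir,    # directory synced to each host
        "python", *train_args,         # the training command itself
    ]


TRAIN_ARGS = ["train_mnist.py", "--network", "lenet",
              "--gpus", "0", "--kv-store", "dist_sync"]

if __name__ == "__main__":
    # Write the host file into a temporary directory for illustration only.
    with tempfile.TemporaryDirectory() as tmp:
        hostfile = str(Path(tmp) / "hosts")
        write_hosts(hostfile, ["10.15.240.189", "10.155.133.82"])
        print(Path(hostfile).read_text(), end="")
    print(shlex.join(build_launch_cmd(2, "hosts", "/tmp/mxnet", TRAIN_ARGS)))
```

Note that `-n 2` asks for two workers, while the exported settings above set the worker count to 1.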