Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/31 21:32:05 UTC

[GitHub] [incubator-mxnet] szhengac edited a comment on issue #17444: [Large Tensor] Add LT support for NN optimizers and 1 activation function

URL: https://github.com/apache/incubator-mxnet/pull/17444#issuecomment-580919548
 
 
   > So I tested MXNet (built from source using this branch)
   > with these flags:
   > 
   > ```
   > python -c "from mxnet.runtime import feature_list; print(feature_list())"
   > [✔ CUDA, ✔ CUDNN, ✖ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✔ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK, ✔ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✔ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]
   > ```
   > 
   > Results for training 10 epochs on 8 GPUs:
   > 
   > ```
   > INFO:root:[Epoch 0] train=0.120292 val=0.158000 loss=6.658037 time: 109.734473
   > INFO:root:[Epoch 1] train=0.167548 val=0.179600 loss=2.297145 time: 92.212359
   > INFO:root:[Epoch 2] train=0.210777 val=0.237700 loss=2.109626 time: 92.110430
   > INFO:root:[Epoch 3] train=0.240705 val=0.255700 loss=2.032153 time: 92.476469
   > INFO:root:[Epoch 4] train=0.262039 val=0.273600 loss=1.976788 time: 94.570572
   > INFO:root:[Epoch 5] train=0.279728 val=0.302300 loss=1.915808 time: 91.655044
   > INFO:root:[Epoch 6] train=0.295393 val=0.309900 loss=1.868357 time: 94.903087
   > INFO:root:[Epoch 7] train=0.312901 val=0.331600 loss=1.825083 time: 94.501921
   > INFO:root:[Epoch 8] train=0.330889 val=0.334100 loss=1.788333 time: 95.653459
   > INFO:root:[Epoch 9] train=0.344211 val=0.349900 loss=1.757741 time: 94.065917
   > ```
   > 
   > Is this fine?
   
   Can you also test the optimizer ops with a large sparse tensor? Currently, SGD, Adagrad, Adam, and FTRL support `row_sparse` weights and gradients.
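   
   A minimal sketch (not from this PR) of the kind of test I have in mind, using `sgd_update` as the example op and assuming a build with `INT64_TENSOR_SIZE` enabled; the sizes and variable names are just illustrative:
   
   ```python
   import numpy as np
   import mxnet as mx
   
   # Leading dimension beyond the int32 range; only two rows are materialized,
   # so row_sparse storage keeps memory usage small.
   LARGE_DIM = 2**32 + 1
   indices = np.array([0, LARGE_DIM - 1], dtype='int64')
   data = mx.nd.ones((2, 1))
   
   weight = mx.nd.sparse.row_sparse_array((data, indices), shape=(LARGE_DIM, 1))
   grad = mx.nd.sparse.row_sparse_array((data, indices), shape=(LARGE_DIM, 1))
   
   # In-place SGD step; with row_sparse weight/grad only the stored rows are touched.
   mx.nd.sgd_update(weight, grad, lr=0.1, out=weight)
   print(weight.indices.asnumpy(), weight.data.asnumpy())
   ```
   
   The same pattern should carry over to `adam_update`, `ftrl_update`, etc., with their extra state arrays also created as `row_sparse`.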

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services