Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/01/29 15:19:36 UTC

[GitHub] PistonY opened a new issue #13709: Why is FP16 training so slow on Tesla T4 in Gluon?
URL: https://github.com/apache/incubator-mxnet/issues/13709
 
 
   Hi, I tried training with FP16 on a Tesla T4, but it is slower than a GTX 1070 running FP32.
   Could you please give me some suggestions for solving this?
   The T4 runs mxnet-cu100mkl and the GTX 1070 runs mxnet-cu90mkl.
   Here are my script and logs:
   code: https://gist.github.com/PistonY/8dfcefdc46b747afd4d18b37f9a18665
   logs:
   T4 log:
   ```
   INFO:root:Iter 390. Loss: 2.14372, Train RMSE 0.23653.Time 00:05:47.lr 0.019948717948717953
   INFO:root:Test Loss: 1.935017, Test acc 0.327200.
   INFO:root:Iter 780. Loss: 1.89404, Train RMSE 0.22111.Time 00:05:52.lr 0.03994871794871795
   INFO:root:Test Loss: 1.460350, Test acc 0.473100.
   INFO:root:Iter 1170. Loss: 1.72982, Train RMSE 0.20837.Time 00:05:49.lr 0.05994871794871795
   INFO:root:Test Loss: 1.288763, Test acc 0.559500.
   INFO:root:Iter 1560. Loss: 1.57620, Train RMSE 0.19388.Time 00:05:48.lr 0.07994871794871795
   INFO:root:Test Loss: 1.856537, Test acc 0.530100.
   ```
   GTX 1070 log:
   ```
   INFO:root:Epoch 0, Iter 390. Loss: 2.12699, Train RMSE 0.23722.Time 00:03:00.lr 0.019948717948717953
   INFO:root:Test Loss: 1.746372, Test acc 0.361800.
   ```
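
   For context, FP16 training in Gluon is typically enabled by casting the network with `net.cast('float16')` and keeping FP32 master weights via the SGD optimizer's `multi_precision` option; without master copies, small weight updates can round away entirely in FP16. A minimal sketch of that rounding effect, using NumPy as a stand-in for the MXNet arrays in the script above:
   ```
   import numpy as np

   # FP16 has ~10 mantissa bits, so the spacing between representable
   # values near 1.0 is about 9.8e-4. An update smaller than that is lost.
   w16 = np.float16(1.0)
   update = np.float16(1e-4)
   lost = w16 + update          # rounds back to exactly 1.0 in FP16

   # With an FP32 master weight, the same update is retained.
   w32 = np.float32(1.0) + np.float32(1e-4)

   print(float(lost), float(w32))
   ```
   This is a numerics illustration only; it does not by itself explain the T4-vs-1070 speed gap, which also depends on whether the kernel shapes in the script can use the T4's Tensor Cores.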

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services