Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/01/30 02:55:44 UTC
[GitHub] PistonY edited a comment on issue #13709: Why FP16 training speed
is too slow on Tesla T4 in Gluon?
URL: https://github.com/apache/incubator-mxnet/issues/13709#issuecomment-458792488
I tried using a fixed input: FP32 works well, but FP16 runs out of memory.
This is my script:
```python
from mxnet import nd, autograd
from mxnet import gluon
from mxnet.gluon import loss as gloss
from gluoncv.model_zoo import *
import mxnet as mx
import time

ctx = mx.gpu(0)
data = nd.random.normal(shape=(64, 3, 224, 224), ctx=ctx)
label = nd.random.randint(low=0, high=1, shape=(64, 1), ctx=ctx)
net = resnet101_v2()
net.hybridize()
net.initialize(ctx=ctx)
net(data)  # warm-up forward pass

test_num = 500
dtype = 'float16'  # float32 or float16
if dtype != 'float32':
    net.cast(dtype)

Loss = gloss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(),
                        'nag', {'learning_rate': 0.1, 'momentum': 0.9,
                                'multi_precision': True  # when fp16 is enabled
                                })

sta = time.time()
for _ in range(test_num):
    with autograd.record():
        output = net(data.astype(dtype, copy=False))
        loss = Loss(output, label.astype(dtype, copy=False))
    loss.backward()
    trainer.step(64)  # batch size is 64
nd.waitall()  # block until all queued ops finish before stopping the timer
end = time.time()
print(end - sta)
```
My MXNet version is 1.5.0 (installed with --pre).
Training with FP32 costs 9921 MB of memory and takes 75 s.
But when I tested with FP16, memory usage started around 7000 MB and kept growing until it ran out of memory.
I don't know why; it looks like the memory is never freed.
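One thing worth checking: MXNet's engine is asynchronous, so the loop above only *enqueues* work, and `time.time()` can return before any computation has finished; operations (and their memory) can pile up faster than they are executed. A hypothetical pure-Python sketch of the same pitfall, using `concurrent.futures` as a stand-in for the async engine (`slow_op` and the timings are illustrative, not MXNet APIs):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_op(x):
    # Stand-in for a GPU kernel that takes real time to execute.
    time.sleep(0.01)
    return x * 2

executor = ThreadPoolExecutor(max_workers=1)

start = time.time()
# Submitting work is nearly instant, like queueing ops on MXNet's engine.
futures = [executor.submit(slow_op, i) for i in range(20)]
queued = time.time() - start

# Blocking on the results is the analogue of nd.waitall():
# only now has the work actually been done.
results = [f.result() for f in futures]
finished = time.time() - start

print(f"enqueue: {queued:.3f}s, finish: {finished:.3f}s")
```

Timing the submission loop alone reports a near-zero "enqueue" time, while the real work takes ~0.2 s; that is why a benchmark without a synchronization point can show misleading times and unbounded queue growth.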
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services