Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/12/01 07:30:22 UTC

[GitHub] RogerChern commented on issue #8884: forward can't run parallelly using multi-gpus when custom operator using numpy

URL: https://github.com/apache/incubator-mxnet/issues/8884#issuecomment-348422498
 
 
   We at TuSimple have also observed this phenomenon; it is the bottleneck for large-scale training of our DET (detection) models.
   We also found that changing the CPU_WORKER number does not alleviate it.
   A viable workaround is to rewrite the memory-oriented operators in pure C++, copying in_data in from the GPU and copying out_data back to the GPU.
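   For context, the serialization comes from the fact that each GPU worker thread must re-enter the Python interpreter to run the custom operator's numpy forward, and CPython's GIL admits only one such thread at a time. A minimal sketch of the mechanism (hypothetical names, no MXNet dependency; the real CustomOp API dispatches callbacks similarly):

```python
# Sketch: one worker thread per GPU invoking a Python/numpy forward.
# The Python glue around each numpy call holds the GIL, so the
# per-GPU forwards interleave rather than run in parallel.
import threading
import numpy as np

results = {}

def numpy_forward(gpu_id, in_data):
    # Stand-in for a CustomOp.forward implemented with numpy.
    results[gpu_id] = np.tanh(in_data) + 1.0

data = np.ones((4, 4))
threads = []
for gpu in range(2):  # pretend one worker thread per GPU
    t = threading.Thread(target=numpy_forward, args=(gpu, data))
    t.start()
    threads.append(t)
for t in threads:
    t.join()

print(sorted(results))  # each "GPU" got a result, but not concurrently
```

   A C++ operator avoids this entirely because its forward never needs the interpreter, at the cost of the explicit device-to-host and host-to-device copies mentioned above.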
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services