Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/12/04 15:51:58 UTC

[GitHub] maxkohlbrenner opened a new issue #8937: mxnet cpu broadcasting >5x slower than numpy

maxkohlbrenner opened a new issue #8937: mxnet cpu broadcasting >5x slower than numpy
URL: https://github.com/apache/incubator-mxnet/issues/8937
 
 
   ## Description
   When running on CPU only, broadcasting over a large array is considerably slower in mxnet than in numpy (more than 5x in the test below). Is there a way to speed this up?
   
   ## Environment info (Required)
   Running mxnet with Python 2.7 on Ubuntu 16.04 in CPU mode on an Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz × 20
   Config entries for matrix computation libraries are:
   USE_CUDA = 1
   USE_CUDNN = 1
   USE_NVRTC = 0
   USE_OPENCV = 1
   USE_OPENMP = 1
   MKLML_ROOT=/usr/local
   USE_MKL2017 = 1
   USE_MKL2017_EXPERIMENTAL = 1
   USE_NNPACK = 0
   USE_BLAS = atlas
   USE_LAPACK = 1
   USE_INTEL_PATH = NONE
   
   MXNet commit hash:
   a5edbf94094581ee27157eae4f2113115a3994e7
   
   A simple test script and its output are shown below:
   
   ```
   import mxnet as mx
   from   mxnet import nd
   import time
   import numpy as np
   
   num_trials = 100
   shape_a = (1000, 10000)
   shape_b = (1000,     1)
   
   strt = time.time()
   
   for i in range(num_trials):
       nd.waitall()
   
       a = nd.empty(shape_a)
       b = nd.empty(shape_b)
       c = a + b
       
        c = c.asnumpy() # implicitly waits for the computation to finish
   stp = time.time()
   
   print c.shape
   print 'avg time over {} trials: {}s'.format(num_trials, (stp - strt) / float(num_trials))
   
   strt = time.time()
   for i in range(num_trials):
       a = np.empty(shape_a)
       b = np.empty(shape_b)
       c = a + b
   
   stp = time.time()
   
   print c.shape
   print 'avg time over {} trials: {}s'.format(num_trials, (stp - strt) / float(num_trials))
   ```
   
   Output (mxnet timing first, then numpy):

   (1000, 10000)
   avg time over 100 trials: 0.0974750995636s
   (1000, 10000)
   avg time over 100 trials: 0.0150070381165s
   
   When broadcasting larger arrays the performance difference is clearly noticeable. Is there anything I can do to speed it up?
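   
   A variant of the benchmark I have not verified yet (it assumes nd.broadcast_add and nd.broadcast_to behave like the implicit + operator, and that OMP_NUM_THREADS=20 is a reasonable setting for the 20-core machine above) would time only the broadcast add itself, excluding the asnumpy() copy, and compare explicit and pre-expanded forms:
   
    ```
    # Sketch of a follow-up benchmark. Assumptions: nd.broadcast_add and
    # nd.broadcast_to give the same result as the implicit '+' broadcast,
    # and OMP_NUM_THREADS=20 matches the 20-core machine described above.
    import os
    os.environ.setdefault('OMP_NUM_THREADS', '20')  # must be set before importing mxnet

    import time
    from mxnet import nd

    shape_a = (1000, 10000)
    shape_b = (1000, 1)
    a = nd.ones(shape_a)
    b = nd.ones(shape_b)
    nd.waitall()  # exclude allocation/engine start-up from the timings

    def avg_time(fn, trials=100):
        strt = time.time()
        for _ in range(trials):
            fn()
            nd.waitall()  # block on the async engine, but skip the asnumpy() copy
        return (time.time() - strt) / trials

    print('implicit broadcast (+): {:.5f}s'.format(avg_time(lambda: a + b)))
    print('explicit broadcast_add: {:.5f}s'.format(avg_time(lambda: nd.broadcast_add(a, b))))

    b_full = nd.broadcast_to(b, shape=shape_a)  # pre-expand b to a's shape
    print('pre-expanded add:       {:.5f}s'.format(avg_time(lambda: a + b_full)))
    ```
   
   If the explicit or pre-expanded forms turn out to be much faster, that would help narrow down where the overhead comes from.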

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services