Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/11/18 18:00:37 UTC

[GitHub] marfago opened a new issue #8708: NDArray stack function is slow

URL: https://github.com/apache/incubator-mxnet/issues/8708
 
 
   Hello,
   
   I am working on a very small NN, specifically the example in mlp-dropout-scratch.ipynb. I am using only the CPU since my GPU is not yet supported. I am on OSX with mxnet 0.12.1.
   
   While running the experiments I noticed that changing the layer size barely affects the execution time. After profiling the code, I found that the function nd.stack takes the largest share of the execution time.
   
   In particular, the function below takes about 9 seconds to run on my machine:
   ```
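   %load_ext line_profiler
   import time
   from mxnet.gluon.data import dataloader

   # train_data is the DataLoader from the notebook
   def loop():
       s = time.time()
       # Iterate over the whole dataset without doing any work per batch
       for i, (data, label) in enumerate(train_data):
           pass
       print("Execution time =", time.time() - s)

   # Profile only dataloader._batchify while running loop()
   %lprun -f dataloader._batchify loop()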
   ```
   
   and the timing is:
   ```
   Execution time = 9.579260110855103
   
   Timer unit: 1e-06 s
   
   Total time: 0.173336 s
   File: /usr/local/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py
   Function: _batchify at line 29
   
   Line #      Hits         Time  Per Hit   % Time  Line Contents
   ==============================================================
       29                                           def _batchify(data):
       30                                               """Collate data into batch."""
       31       354          724      2.0      0.4      if isinstance(data[0], nd.NDArray):
       32       118        91534    775.7     52.8          return nd.stack(*data)
       33       236          225      1.0      0.1      elif isinstance(data[0], tuple):
       34       118        34821    295.1     20.1          data = zip(*data)
       35       118          646      5.5      0.4          return [_batchify(i) for i in data]
       36                                               else:
       37       118         5568     47.2      3.2          data = np.asarray(data)
       38       118        39818    337.4     23.0          return nd.array(data, dtype=data.dtype)
   ```
   
   This overhead does not matter much for long runs, but it is noticeable for short runs or quick tests. Is there an alternative approach?
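
In the output above, line_profiler reports 0.173336 s in total inside `_batchify`, while the loop itself reports about 9.58 s of wall-clock time. One rough way to see how those two numbers relate is to wrap the function of interest in a timer and compare its cumulative time with the time of the whole loop. The sketch below uses only the standard library; `timed`, `stats`, and `fake_batchify` are hypothetical names made up for illustration and are not part of mxnet.

```python
import time
from functools import wraps


def timed(fn, stats):
    """Wrap fn so its call count and cumulative wall time are stored in stats."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start
    return wrapper


def fake_batchify(samples):
    # Hypothetical stand-in for the real collate function.
    return list(samples)


stats = {"calls": 0, "seconds": 0.0}
batchify = timed(fake_batchify, stats)

loop_start = time.perf_counter()
for i in range(100):
    batch = batchify(range(32))
    time.sleep(0.001)  # stand-in for everything else the loop does
loop_seconds = time.perf_counter() - loop_start

# Fraction of the loop's wall time spent inside the wrapped function.
share = stats["seconds"] / loop_seconds
print("calls =", stats["calls"])
print("share of loop time = {:.1%}".format(share))
```

In the notebook, the same wrapper could in principle be assigned over `dataloader._batchify` before calling `loop()`. That assumes the loader looks the function up by its module-level name at call time, which I have not verified for every version.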
