Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/04/06 14:52:33 UTC

[GitHub] [incubator-mxnet] kpuatamazon commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL

URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-609843360
 
 
   Here are some benchmarks on 2fff11d4233814aa4ad07858779338090ec2132d (the current tip of master), run on the same Skylake c5.9xlarge.  DNNL is substantially slower than MKL, taking roughly 1.9x to 3.5x as long per call in the results below.
   
   (Which DNNL is in master?)
   
   ```python
   #!/usr/bin/env python3
   import mxnet as mx
   import time
   
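   def time_procedure(shape, count, proc):
     """Return the mean wall-clock seconds per call of proc over count runs."""
     rows, inner, cols = shape
     # b is stored as (cols, inner), so both operators multiply a by b transposed.
     a = mx.nd.random_uniform(shape=(rows, inner), low=-1.0, high=1.0)
     b = mx.nd.random_uniform(shape=(cols, inner), low=-1.0, high=1.0)
     # Burn in: one untimed call so one-time setup is not measured.
     proc(a, b, cols)
     mx.nd.waitall()
     begin = time.time()
     for _ in range(count):
       proc(a, b, cols)
       # MXNet executes asynchronously; block until the result is ready.
       mx.nd.waitall()
     return (time.time() - begin) / count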
   
   shapes = [(5, 512, 512), (5,512,1536), (5,512,2048), (5,2048,512), (4,512,512)]
   count = 1000
   
   procedures = {
     "fullyconnected (DNNL)" : (lambda a, b, cols : mx.nd.FullyConnected(a, b, no_bias=True, num_hidden=cols)),
     "dot (MKL)" : (lambda a, b, cols : mx.nd.dot(a, b, transpose_b = True))
   }
   for s in shapes:
     print("Shape " + str(s))
     stats = {}
     for name, proc in procedures.items():
       stats[name] = time_procedure(s, count, proc)
       print("{:.7f} seconds for {}".format(stats[name], name))
   ```
   Run as `OMP_NUM_THREADS=4 ./mult_bench.py`:
   ```
   Shape (5, 512, 512)
   0.0000961 seconds for fullyconnected (DNNL)
   0.0000509 seconds for dot (MKL)
   Shape (5, 512, 1536)
   0.0002011 seconds for fullyconnected (DNNL)
   0.0000735 seconds for dot (MKL)
   Shape (5, 512, 2048)
   0.0002521 seconds for fullyconnected (DNNL)
   0.0001027 seconds for dot (MKL)
   Shape (5, 2048, 512)
   0.0003569 seconds for fullyconnected (DNNL)
   0.0001018 seconds for dot (MKL)
   Shape (4, 512, 512)
   0.0000946 seconds for fullyconnected (DNNL)
   0.0000496 seconds for dot (MKL)
   ```
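
To make the gap easier to read, here is a small sketch that turns the timings above into per-shape ratios. The numbers are copied from the output above; the `timings` dictionary and the `slowdown` helper are names made up for this illustration and are not part of the benchmark script or of MXNet.

```python
# Mean seconds per call, copied from the benchmark output above.
# Keys are (rows, inner, cols); values are (fullyconnected/DNNL, dot/MKL).
timings = {
    (5, 512, 512): (0.0000961, 0.0000509),
    (5, 512, 1536): (0.0002011, 0.0000735),
    (5, 512, 2048): (0.0002521, 0.0001027),
    (5, 2048, 512): (0.0003569, 0.0001018),
    (4, 512, 512): (0.0000946, 0.0000496),
}


def slowdown(dnnl_seconds, mkl_seconds):
    """Return how many times longer the DNNL path takes than the MKL path."""
    return dnnl_seconds / mkl_seconds


# Hypothetical helper output: one ratio per benchmarked shape.
ratios = {shape: slowdown(*pair) for shape, pair in timings.items()}

for shape, ratio in ratios.items():
    print("Shape {}: DNNL takes {:.2f}x as long as MKL".format(shape, ratio))
```

On these numbers the ratio is smallest for the 512x512 weight shapes and largest for (5, 2048, 512), the shape with the longest inner dimension.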
   
   I don't really mind what the default BLAS implementation is, but choosing MKL should not require undocumented compile options.
