Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/12/14 10:29:21 UTC

[GitHub] [incubator-mxnet] kpuatamazon commented on pull request #19562: [WIP]Integrate oneDNN layer normalization implementation

kpuatamazon commented on pull request #19562:
URL: https://github.com/apache/incubator-mxnet/pull/19562#issuecomment-744344661


   I've been testing on a c5.12xlarge (`Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz`). I assume the reported numbers are in some unit of seconds?
   
   We should at least build with `-march=native` to see whether this is just a matter of CPU support; MXNet doesn't seem to enable AVX512 by default, and one could add CPUID dispatch.
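   
   As a rough sketch of the CPUID-dispatch idea (not MXNet's or oneDNN's actual dispatch code, just an illustration using the GCC/Clang builtin), something like:
   
   ```cpp
   #include <cstdio>
   
   // Minimal sketch: choose a code path at runtime instead of relying on
   // -march=native at build time. __builtin_cpu_supports is a GCC/Clang builtin.
   int main() {
   #if defined(__GNUC__) || defined(__clang__)
     if (__builtin_cpu_supports("avx512f")) {
       std::printf("AVX512F available: dispatch to an AVX512 layer-norm kernel\n");
     } else {
       std::printf("No AVX512F: fall back to the generic kernel\n");
     }
   #else
     std::printf("No CPU feature detection with this compiler\n");
   #endif
     return 0;
   }
   ```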
   
   Might as well reshape the input to two dimensions, preserving the normalization axis and multiplying all the other dimensions together, right? A sketch of what I mean is below.
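   
   A minimal sketch of that collapse (hypothetical helper, assuming the normalized axis is the last one, as in the default axis=-1 case):
   
   ```cpp
   #include <cstddef>
   #include <cstdint>
   #include <iostream>
   #include <utility>
   #include <vector>
   
   // Fold an N-d shape into 2-d: keep the normalized (last) axis as the inner
   // dimension and multiply every other dimension into the outer one.
   std::pair<int64_t, int64_t> CollapseForLayerNorm(const std::vector<int64_t>& shape) {
     int64_t inner = shape.back();
     int64_t outer = 1;
     for (std::size_t i = 0; i + 1 < shape.size(); ++i) outer *= shape[i];
     return {outer, inner};
   }
   
   int main() {
     auto [outer, inner] = CollapseForLayerNorm({8, 16, 512, 1024});
     std::cout << outer << " x " << inner << "\n";  // prints: 65536 x 1024
     return 0;
   }
   ```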
   
   Also, I feel like an optimal assembly implementation would benefit from a different ordering of the input tensor that allows pure vertical adds, whereas layer normalization is currently set up for horizontal adds.
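   
   To illustrate the difference (purely a sketch of the two reduction patterns, not the PR's code):
   
   ```cpp
   // Current layout: each instance's `width` elements are contiguous, so summing
   // one instance reduces across SIMD lanes (horizontal adds at the end).
   float SumOneInstanceHorizontal(const float* row, int width) {
     float sum = 0.f;
     for (int i = 0; i < width; ++i) sum += row[i];
     return sum;
   }
   
   // Transposed layout: element j of every instance is contiguous, so a block of
   // partial sums advances with pure vertical adds and no cross-lane reduction.
   void SumManyInstancesVertical(const float* data, int num_instances, int width,
                                 float* sums) {
     for (int i = 0; i < num_instances; ++i) sums[i] = 0.f;
     for (int j = 0; j < width; ++j)
       for (int i = 0; i < num_instances; ++i)
         sums[i] += data[j * num_instances + i];  // instances are adjacent in memory
   }
   ```
   
   The same pattern would apply to the mean/variance accumulation: with the transposed layout, one SIMD register's worth of instances has its statistics accumulated simultaneously.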

