Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/09/27 22:22:35 UTC

[GitHub] [incubator-mxnet] mk-61 opened a new pull request #20615: Fast cuDNN NHWC kernels support

mk-61 opened a new pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615


   ## Description ##
   This PR makes the cuDNN-backed BatchNorm operator use the newer API calls (`cudnnBatchNormalizationForwardTrainingEx` / `cudnnBatchNormalizationBackwardEx`), which bring a significant speedup in some cases (fp16 with NHWC / NDHWC layouts).
   
   I also refactored and simplified the code a bit.
   
   I tested the fp16 NHWC speedup with a ResNet50 model on my Layout Management feature branch (not upstreamed yet).
   The correctness should be covered by existing tests.
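
As an illustration of why layout-independent tests can cover correctness here: batch normalization computes its statistics per channel, so the memory layout (NCHW or NHWC) only changes where each element lives, not which elements are reduced together. The following is a minimal pure-Python sketch of that idea, not the cuDNN or MXNet implementation. The helper names `batchnorm_flat` and `nchw_to_nhwc` are hypothetical, and the scale and shift parameters are omitted (treated as 1 and 0).

```python
import math


def batchnorm_flat(data, shape, layout, eps=1e-5):
    """Normalize a flat buffer per channel (illustrative only).

    `shape` is always given as (N, C, H, W); `layout` ("NCHW" or "NHWC")
    says how the flat buffer is ordered in memory. Scale and shift are
    assumed to be 1 and 0 to keep the sketch short.
    """
    n, c, h, w = shape

    def offset(ni, ci, hi, wi):
        if layout == "NCHW":
            return ((ni * c + ci) * h + hi) * w + wi
        return ((ni * h + hi) * w + wi) * c + ci  # NHWC

    out = [0.0] * len(data)
    count = n * h * w
    for ci in range(c):
        # All positions belonging to channel ci, in a layout-independent order.
        idx = [offset(ni, ci, hi, wi)
               for ni in range(n) for hi in range(h) for wi in range(w)]
        mean = sum(data[i] for i in idx) / count
        var = sum((data[i] - mean) ** 2 for i in idx) / count
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in idx:
            out[i] = (data[i] - mean) * inv_std
    return out


def nchw_to_nhwc(data, shape):
    """Reorder a flat NCHW buffer into NHWC order."""
    n, c, h, w = shape
    out = [0.0] * len(data)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi
                    dst = ((ni * h + hi) * w + wi) * c + ci
                    out[dst] = data[src]
    return out
```

Normalizing in NCHW and then converting to NHWC gives the same values as converting first and normalizing in NHWC, which is the property a layout-aware test would check.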
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] Code is well-documented
   
   ### Changes ###
   - [x] Make use of newer cuDNN API calls / new kernels
   - [x] Refactoring
   
   @DickJC123 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@mxnet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] ptrendx commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
ptrendx commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929676444


   @mxnet-bot run ci [unix-cpu]


[GitHub] [incubator-mxnet] mk-61 commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mk-61 commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929393692


   @mxnet-bot run ci [centos-gpu, unix-cpu, website]


[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929393767


   Jenkins CI successfully triggered : [centos-gpu, website, unix-cpu]


[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929560340


   Jenkins CI successfully triggered : [unix-cpu]


[GitHub] [incubator-mxnet] mk-61 commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mk-61 commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929671534


   > Yeah, it would be good to check that NCHW does not regress.
   
   Verified on RN50 / Volta: no regressions, and the same kernels are used, as far as the nsys stats show.


[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929676479


   Jenkins CI successfully triggered : [unix-cpu]


[GitHub] [incubator-mxnet] mk-61 commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mk-61 commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929560270


   @mxnet-bot run ci [unix-cpu]


[GitHub] [incubator-mxnet] mk-61 commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mk-61 commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929601447


   > LGTM, did you also check the performance of NCHW case?
   
   You mean compared to the functions without the "Ex" suffix? No, I haven't, but I can do so if you'd like me to. That said, I think the logic behind the "Ex" functions is "make it faster in some cases and fall back to the previous implementations otherwise". Specifically, I expected (and verified) a speedup in FP16/NHWC and assumed it shouldn't regress in other cases unless there's a bug, which cuDNN would need to fix.


[GitHub] [incubator-mxnet] ptrendx commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
ptrendx commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-929604317


   Yeah, it would be good to check that NCHW does not regress.


[GitHub] [incubator-mxnet] ptrendx commented on pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
ptrendx commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-931508553


   Thanks for the contribution!


[GitHub] [incubator-mxnet] ptrendx merged pull request #20615: Fast cuDNN BatchNorm NHWC kernels support

Posted by GitBox <gi...@apache.org>.
ptrendx merged pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615


   


[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20615: Fast cuDNN NHWC kernels support

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20615:
URL: https://github.com/apache/incubator-mxnet/pull/20615#issuecomment-928361618


   Hey @mk-61, thanks for submitting the PR.
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
   - To trigger all jobs: @mxnet-bot run ci [all]
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2]
   ***
   **CI supported jobs**: [sanity, windows-cpu, miscellaneous, website, windows-gpu, unix-gpu, centos-cpu, unix-cpu, clang, edge, centos-gpu]
   ***
   _Note_:
   Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
   All CI tests must pass before the PR can be merged.
   

