You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/23 14:34:40 UTC

[GitHub] [incubator-mxnet] bgawrych opened a new pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

bgawrych opened a new pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777


   ## Description ##
   This PR fixes bug which occurs when training gluonCV deeplab with oneDNN support. 
   Original issue: https://github.com/dmlc/gluon-cv/issues/1368
    
   To reproduce:
   ```
   python3 train.py --dataset pascal_aug --model-zoo deeplab_resnet101_coco --aux --lr 0.001 --checkname res101 --no-cuda
   ```
   
   where train.py comes from https://gluon-cv.mxnet.io/_downloads/1f67ebbf3e0adc5c6fa863a3bc7672a6/train.py (from  https://gluon-cv.mxnet.io/build/examples_segmentation/train_fcn.html )
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ## Comments ##
   Author of the fix is @anko-intel with my small help
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] xinyu-intel commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
xinyu-intel commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-665012689


   @bgawrych thx, can you please add a simple python test case for this fix? I think it should be more clear.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-663042745


   Hey @bgawrych , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [clang, unix-cpu, windows-cpu, sanity, edge, miscellaneous, windows-gpu, website, centos-cpu, unix-gpu, centos-gpu]
   *** 
   _Note_: 
    Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-695716628


   @pengzhao-intel @bgawrych Missing in v1.x Was it intentional?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] roy6324 commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
roy6324 commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-664340360


   Thanks for the fix , problem is solved.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] pengzhao-intel commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
pengzhao-intel commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-695749563


   Yes, it's better to include this PR. @anko-intel 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] bgawrych commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
bgawrych commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-664800680






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] pengzhao-intel merged pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
pengzhao-intel merged pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] wuxun-zhang commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
wuxun-zhang commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-663926758


   Thanks for the fix. Looks that master branch also needs this fix.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] anko-intel commented on a change in pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
anko-intel commented on a change in pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#discussion_r459504958



##########
File path: src/operator/tensor/elemwise_sum.cc
##########
@@ -118,11 +118,26 @@ void ElementWiseSumComputeExCPU(const nnvm::NodeAttrs& attrs,
        inputs[1].storage_type() == kCSRStorage && inputs[2].storage_type() == kDefaultStorage) ||
       (inputs.size() > 4U && common::ContainsStorageType(inputs, kDefaultStorage) &&
        outputs[0].storage_type() == kDefaultStorage)) {
+    
     mshadow::Stream<cpu>* s = ctx.get_stream<cpu>();
     Resource rsc = ResourceManager::Get()->Request(ctx.run_ctx.get_ctx(),
         ResourceRequest(ResourceRequest::kTempSpace));
     NDArray out_nd = outputs[0];
+#if MXNET_USE_MKLDNN == 1
+      std::vector<NDArray> in_data;
+      for (const NDArray& in : inputs) {
+        // if ndarray is in default storage and MKLDNN is available,
+        // need to make sure cpu layout data is used, instead of MKL layout
+        if (in.storage_type() == kDefaultStorage) {
+          in_data.push_back(in.Reorder2Default());

Review comment:
       what about emplace_back




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] anko-intel commented on a change in pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
anko-intel commented on a change in pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#discussion_r459502487



##########
File path: src/operator/tensor/elemwise_sum.cc
##########
@@ -118,11 +118,26 @@ void ElementWiseSumComputeExCPU(const nnvm::NodeAttrs& attrs,
        inputs[1].storage_type() == kCSRStorage && inputs[2].storage_type() == kDefaultStorage) ||
       (inputs.size() > 4U && common::ContainsStorageType(inputs, kDefaultStorage) &&
        outputs[0].storage_type() == kDefaultStorage)) {
+    
     mshadow::Stream<cpu>* s = ctx.get_stream<cpu>();
     Resource rsc = ResourceManager::Get()->Request(ctx.run_ctx.get_ctx(),
         ResourceRequest(ResourceRequest::kTempSpace));
     NDArray out_nd = outputs[0];
+#if MXNET_USE_MKLDNN == 1
+      std::vector<NDArray> in_data;
+      for (const NDArray& in : inputs) {
+        // if ndarray is in default storage and MKLDNN is available,
+        // need to make sure cpu layout data is used, instead of MKL layout
+        if (in.storage_type() == kDefaultStorage) {

Review comment:
        && in.IsMKLDNNData()




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-664800718


   Jenkins CI successfully triggered : [unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] bgawrych commented on pull request #18777: [v1.7.x] ElementWiseSum fix for oneDNN

Posted by GitBox <gi...@apache.org>.
bgawrych commented on pull request #18777:
URL: https://github.com/apache/incubator-mxnet/pull/18777#issuecomment-666974179


   > LGTM
   > please add a test case too
   
   @pengzhao-intel Done :)
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org