Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/07/26 11:48:53 UTC

[GitHub] [incubator-mxnet] anko-intel commented on a change in pull request #19896: [FEATURE] OneDNN adaptive pooling support was added to mxnet.

anko-intel commented on a change in pull request #19896:
URL: https://github.com/apache/incubator-mxnet/pull/19896#discussion_r676528368



##########
File path: src/operator/contrib/adaptive_avg_pooling.cc
##########
@@ -197,6 +198,90 @@ num_threads(engine::OpenMP::Get()->GetRecommendedOMPThreadCount())
   }
 }
 
+#if MXNET_USE_MKLDNN == 1
+bool SupportMKLDNNAveragePooling(const NDArray &in_data,
+                                    const NDArray &out_data) {
+  for (int64_t idx = 2; idx < in_data.shape().ndim(); ++idx) {
+    const int s1 = in_data.shape()[idx];
+    const int s2 = out_data.shape()[idx];
+    if (s2 == 0) {
+      return false;
+    }
+    if (s1 % s2 != 0) {
+      return false;
+    }
+  }
+  const int IH = in_data.shape()[2];
+  const int IW = in_data.shape()[3];
+  const int OH = out_data.shape()[2];
+  const int OW = out_data.shape()[3];
+
+  const int strides_H = floor((IH << 1) / OH) - floor(IH / OH);
+  const int strides_W = floor((IW << 1) / OW) - floor(IW / OW);
+  const int kernel_H = ceil((IH << 1) / OH) - floor(IH / OH);
+  const int kernel_W = ceil((IW << 1) / OW) - floor(IW / OW);
+  const int pad_l_top = (strides_H * (OH - 1) + kernel_H - IH) / 2;
+  const int pad_l_left = (strides_W * (OW - 1) + kernel_W - IW) / 2;
+
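
For reference, because the loop above requires every input spatial extent to be divisible by the corresponding output extent, these formulas collapse to an ordinary non-overlapping average pooling. A small worked example with the hypothetical sizes IH = 32, OH = 8 (integer division throughout, as in the C++ code):

  strides_H = (2*32)/8 - 32/8 = 8 - 4 = 4        // equals IH / OH
  kernel_H  = (2*32)/8 - 32/8 = 8 - 4 = 4        // ceil == floor here, since OH divides 2*IH
  pad_l_top = (4*(8-1) + 4 - 32) / 2 = 0

So kernel == stride == IH / OH with zero padding, which is what allows the adaptive pooling to be expressed as a regular OneDNN average pooling.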

Review comment:
       Could you also return false when the original implementation performs better than OneDNN? I guess you can easily find a simple condition once you sort the table of results by improvement percentage. Please also double-check that you collected your results on a server machine.
    Such a boolean value should be commented as being there for a performance (not support) reason, for example:
    bool better_performance = shape()[a] * shape()[b] > DDDD;  // For smaller tensors the native implementation is faster.
    
    Please also consider checking a whole network in a test, as microbenchmark results may not be in line with whole-model results, for example due to memory layout conversions.
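
    A minimal sketch of such a guard (the cut-off value and the choice of dimensions are hypothetical placeholders, to be tuned from the sorted benchmark table; the surrounding checks are the ones already present in the diff):

    #if MXNET_USE_MKLDNN == 1
    bool SupportMKLDNNAveragePooling(const NDArray &in_data,
                                     const NDArray &out_data) {
      // ... existing divisibility and shape checks from the diff above ...

      // Hypothetical performance cut-off, to be tuned from the benchmark
      // results; not a measured number.
      constexpr int64_t kMinSpatialSizeForOneDNN = 64 * 64;

      const int64_t spatial_size =
          static_cast<int64_t>(in_data.shape()[2]) * in_data.shape()[3];
      // For smaller tensors the native implementation is faster, so reject
      // OneDNN for a performance (not support) reason.
      const bool better_performance = spatial_size >= kMinSpatialSizeForOneDNN;
      return better_performance;
    }
    #endif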




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@mxnet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org