Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/12/23 07:18:15 UTC

[GitHub] [incubator-mxnet] xinyu-intel opened a new pull request #17147: [WIP] Quantized Elemwise Mul Operator

xinyu-intel opened a new pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147
 
 
   ## Description ##
   Add a quantized elemwise mul operator for further NCF optimization.
   - supports s8s8 inputs
   - supports s32/s8/f32 output
   - may switch to the DNNL primitive once DNNL v1.2 is ready.
   
   @pengzhao-intel @TaoLv @ciyongch 
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be made.
   - Interesting edge cases to note here
   


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r360801936
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul.cc
 ##########
 @@ -0,0 +1,265 @@
+inline bool QuantizedElemwiseMulOpShape(const nnvm::NodeAttrs& attrs,
+                                        mxnet::ShapeVector *in_attrs,
+                                        mxnet::ShapeVector *out_attrs) {
+  using namespace mshadow;
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  const mxnet::TShape &lshape = (*in_attrs)[quantized_elemwise_mul::kLhs];
+  const mxnet::TShape &rshape = (*in_attrs)[quantized_elemwise_mul::kRhs];
+  if (!ndim_is_known(lshape) || !ndim_is_known(rshape)) return false;
+  CHECK_EQ(lshape.ndim(), rshape.ndim()) << "Currently, quantized elemwise multiply doesn't support broadcast.";
+  for (int i = 0; i < lshape.ndim(); ++i) {
+    CHECK_EQ(lshape[i], rshape[i]);
+  }
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kLhsMin, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kLhsMax, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kRhsMin, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kRhsMax, mxnet::TShape(1, 1));
+  out_attrs->clear();
 
 Review comment:
   Why not use `SHAPE_ASSIGN_CHECK` for the `out_attrs`?
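
   For illustration, the suggestion might look like the sketch below (assuming `*out_attrs` is pre-sized by the framework to the declared number of outputs; this is the reviewer's idea sketched out, not the author's actual fix):

   ```cpp
   // Sketch: validate-and-assign the output shapes with SHAPE_ASSIGN_CHECK
   // instead of discarding existing entries via clear()/push_back.
   SHAPE_ASSIGN_CHECK(*out_attrs, 0, lshape);                 // "output"
   if (!params.enable_float_output) {
     SHAPE_ASSIGN_CHECK(*out_attrs, 1, mxnet::TShape(1, 1));  // "min_output"
     SHAPE_ASSIGN_CHECK(*out_attrs, 2, mxnet::TShape(1, 1));  // "max_output"
   }
   return shape_is_known((*out_attrs)[0]);
   ```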


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r360803585
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul.cc
 ##########
 @@ -0,0 +1,265 @@
+void QuantizedElemwiseMulOpForward(const nnvm::NodeAttrs &attrs,
+                                   const OpContext &ctx,
+                                   const std::vector<TBlob> &inputs,
+                                   const std::vector<OpReqType> &req,
+                                   const std::vector<TBlob> &outputs) {
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  using namespace mxnet_op;
+
+  float lhs_min = inputs[quantized_elemwise_mul::kLhsMin].dptr<float>()[0];
+  float lhs_max = inputs[quantized_elemwise_mul::kLhsMax].dptr<float>()[0];
+  float rhs_min = inputs[quantized_elemwise_mul::kRhsMin].dptr<float>()[0];
+  float rhs_max = inputs[quantized_elemwise_mul::kRhsMax].dptr<float>()[0];
+
+  float cached_output_min_ = 0.f;
+  float cached_output_max_ = 0.f;
+  float out_data_scale = 1.f;
+  float out_scale = 1.f;
+  // output default set as int32
+  float output_data_range = kInt32Range;
 
 Review comment:
   Move this variable into the `if (!params.enable_float_output)` block, since it's only used there.
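
   As a sketch, the suggested scoping could look like this (surrounding code otherwise unchanged):

   ```cpp
   if (!params.enable_float_output) {
     // Only the quantized-output path needs the range: int8 when the output
     // dtype is int8 (calibrated), int32 otherwise.
     const float output_data_range =
         outputs[quantized_elemwise_mul::kOut].type_flag_ == mshadow::kInt8
             ? kInt8Range
             : kInt32Range;
     // ... the existing calibrated / non-calibrated branches follow here.
   }
   ```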


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r360804532
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul.cc
 ##########
 @@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ *  Copyright (c) 2016 by Contributors
+ * \file quantized_elemwise_mul.cc
+ * \brief CPU Implementation of basic elementwise binary broadcast operators
+ */
+#include <mxnet/op_attr_types.h>
+#include "../tensor/elemwise_binary_op-inl.h"
+#include "./quantized_elemwise_mul-inl.h"
+#include "./quantization_utils.h"
+
+namespace mxnet {
+namespace op {
+
+DMLC_REGISTER_PARAMETER(QuantizeElemwiseMulParam);
+
+static std::vector<std::string> QuantizedElemwiseMulOutputNames(const NodeAttrs &attrs) {
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  if (params.enable_float_output)
+    return std::vector<std::string>{"output"};
+  else
+    return std::vector<std::string>{"output", "min_output", "max_output"};
+}
+
+inline bool QuantizedElemwiseMulOpShape(const nnvm::NodeAttrs& attrs,
+                                        mxnet::ShapeVector *in_attrs,
+                                        mxnet::ShapeVector *out_attrs) {
+  using namespace mshadow;
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  const mxnet::TShape &lshape = (*in_attrs)[quantized_elemwise_mul::kLhs];
+  const mxnet::TShape &rshape = (*in_attrs)[quantized_elemwise_mul::kRhs];
+  if (!ndim_is_known(lshape) || !ndim_is_known(rshape)) return false;
+  CHECK_EQ(lshape.ndim(), rshape.ndim()) << "Currently, quantized elemwise multiply doesn't support broadcast.";
+  for (int i = 0; i < lshape.ndim(); ++i) {
+    CHECK_EQ(lshape[i], rshape[i]);
+  }
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kLhsMin, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kLhsMax, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kRhsMin, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kRhsMax, mxnet::TShape(1, 1));
+  out_attrs->clear();
+
+  mxnet::TShape oshape(lshape);
+  out_attrs->push_back(oshape);
+  if (!params.enable_float_output) {
+    out_attrs->push_back(mxnet::TShape(1, 1));
+    out_attrs->push_back(mxnet::TShape(1, 1));
+  }
+  return shape_is_known(oshape);
+}
+
+inline bool QuantizedElemwiseMulOpType(const nnvm::NodeAttrs& attrs,
+                                       std::vector<int> *in_type,
+                                       std::vector<int> *out_type) {
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  for (int i = 0; i < 2; ++i) {
+    if (in_type->at(i) == mshadow::kInt8) {
+      TYPE_ASSIGN_CHECK(*in_type, i, mshadow::kInt8);
+    } else {
+      LOG(ERROR) << "Currently, quantized elemwise mul only supports int8 inputs.";
+    }
+  }
+  TYPE_ASSIGN_CHECK(*in_type, 2, mshadow::kFloat32);
+  TYPE_ASSIGN_CHECK(*in_type, 3, mshadow::kFloat32);
+  TYPE_ASSIGN_CHECK(*in_type, 4, mshadow::kFloat32);
+  TYPE_ASSIGN_CHECK(*in_type, 5, mshadow::kFloat32);
+
+  int dtype = mshadow::kInt32;
+  if (params.max_calib_range.has_value() && params.min_calib_range.has_value()) {
+    dtype = mshadow::kInt8;
+  }
+  if (!params.enable_float_output) {
+    TYPE_ASSIGN_CHECK(*out_type, 0, dtype);
+    TYPE_ASSIGN_CHECK(*out_type, 1, mshadow::kFloat32);
+    TYPE_ASSIGN_CHECK(*out_type, 2, mshadow::kFloat32);
+  } else {
+    TYPE_ASSIGN_CHECK(*out_type, 0, mshadow::kFloat32);
+  }
+  return true;
+}
+
+inline bool QuantizedElemwiseMulOpStorageType(const nnvm::NodeAttrs& attrs,
+                                              int dev_mask,
+                                              DispatchMode* dispatch_mode,
+                                              std::vector<int> *in_attrs,
+                                              std::vector<int> *out_attrs) {
+  using namespace common;
+  *dispatch_mode = DispatchMode::kFCompute;
+
+  for (auto &v : *out_attrs) {
+    v = kDefaultStorage;
+    if (common::stype_string(v).compare("unknown") == 0) {
+      return false;
+    }
+  }
+
+  for (auto &v : *in_attrs) {
+    v = kDefaultStorage;
+    if (common::stype_string(v).compare("unknown") == 0) {
+      return false;
+    }
+  }
+  return true;
+}
+
+void QuantizedElemwiseMulOpForward(const nnvm::NodeAttrs &attrs,
+                                   const OpContext &ctx,
+                                   const std::vector<TBlob> &inputs,
+                                   const std::vector<OpReqType> &req,
+                                   const std::vector<TBlob> &outputs) {
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  using namespace mxnet_op;
+
+  float lhs_min = inputs[quantized_elemwise_mul::kLhsMin].dptr<float>()[0];
+  float lhs_max = inputs[quantized_elemwise_mul::kLhsMax].dptr<float>()[0];
+  float rhs_min = inputs[quantized_elemwise_mul::kRhsMin].dptr<float>()[0];
+  float rhs_max = inputs[quantized_elemwise_mul::kRhsMax].dptr<float>()[0];
+
+  float cached_output_min_ = 0.f;
+  float cached_output_max_ = 0.f;
+  float out_data_scale = 1.f;
+  float out_scale = 1.f;
+  // output default set as int32
+  float output_data_range = kInt32Range;
+  // narrow to the int8 range when the output dtype is int8
+  if (outputs[quantized_elemwise_mul::kOut].type_flag_ == mshadow::kInt8) {
+    output_data_range = kInt8Range;
+  } else {
+    output_data_range = kInt32Range;
+  }
+  if (!params.enable_float_output) {
+    if (params.max_calib_range.has_value() && params.min_calib_range.has_value()) {
+      cached_output_min_ = params.min_calib_range.value();
+      cached_output_max_ = params.max_calib_range.value();
+      out_data_scale = output_data_range / MaxAbs(cached_output_min_, cached_output_max_);
+      auto lhs_scale = kInt8Range / MaxAbs(lhs_min, lhs_max);
+      auto rhs_scale = kInt8Range / MaxAbs(rhs_min, rhs_max);
+      out_scale = out_data_scale / lhs_scale / rhs_scale;
+    } else {
+      Stream<cpu> *s = ctx.get_stream<cpu>();
+      if (inputs[quantized_elemwise_mul::kLhs].type_flag_ == mshadow::kInt8 &&
+          inputs[quantized_elemwise_mul::kRhs].type_flag_ == mshadow::kInt8) {
+        mxnet_op::Kernel<QuantizationRangeForS8S8MultiplicationStruct, cpu>::Launch(
+            s, 1, &cached_output_min_, &cached_output_max_, &lhs_min, &lhs_max, &rhs_min, &rhs_max);
+      } else {
+        LOG(ERROR) << "lhs and rhs only support iny8 dtype.";
+      }
+    }
+  } else {
+    auto lhs_scale = kInt8Range / MaxAbs(lhs_min, lhs_max);
+    auto rhs_scale = kInt8Range / MaxAbs(rhs_min, rhs_max);
+    out_scale = 1.0 / lhs_scale / rhs_scale;
+  }
+
+  size_t out_size = outputs[quantized_elemwise_mul::kOut].Size();
+  auto *input_l = inputs[quantized_elemwise_mul::kLhs].dptr<int8_t>();
+  auto *input_r = inputs[quantized_elemwise_mul::kRhs].dptr<int8_t>();
+  // TODO(Xinyu): a temp solution to enable Elemwise INT8 computation,
+  // will be refactored after the DNNL primitive is done.
+  if (!params.enable_float_output) {
+    if (params.max_calib_range.has_value() && params.min_calib_range.has_value()) {
+      typedef int8_t out_type;
+      auto *out_data = outputs[quantized_elemwise_mul::kOut].dptr<out_type>();
+  #pragma omp simd
+      for (size_t i = 0; i < out_size; ++i) {
+        const int8_t a = input_l[i];
+        const int8_t b = input_r[i];
+        out_data[i] = static_cast<out_type>(a * b * out_scale);
+      }
+    } else {
+      typedef int32_t out_type;
+      auto *out_data = outputs[quantized_elemwise_mul::kOut].dptr<out_type>();
+  #pragma omp simd
+      for (size_t i = 0; i < out_size; ++i) {
+        const int8_t a = input_l[i];
+        const int8_t b = input_r[i];
+        out_data[i] = static_cast<out_type>(a * b * out_scale);
+      }
+    }
+  } else {
+    typedef float_t out_type;
+    auto *out_data = outputs[quantized_elemwise_mul::kOut].dptr<out_type>();
+#pragma omp simd
+    for (size_t i = 0; i < out_size; ++i) {
+      const int8_t a = input_l[i];
+      const int8_t b = input_r[i];
+      out_data[i] = static_cast<out_type>(a * b * out_scale);
+    }
+  }
+
+  if (!params.enable_float_output) {
+    outputs[quantized_elemwise_mul::kOutMin].dptr<float>()[0] = cached_output_min_;
+    outputs[quantized_elemwise_mul::kOutMax].dptr<float>()[0] = cached_output_max_;
+  }
+}
+
+NNVM_REGISTER_OP(_contrib_quantized_elemwise_mul)
+.describe(R"code(Multiplies int8 arguments element-wise.
+)code" ADD_FILELINE)
+.set_num_inputs(6)
+.set_num_outputs([](const NodeAttrs& attrs) {
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  return (!params.enable_float_output) ? 3 : 1;
+})
+.set_attr<nnvm::FListInputNames>("FListInputNames",
+  [](const NodeAttrs& attrs) {
+    return std::vector<std::string>{"lhs", "rhs", "lhs_min", "lhs_max", "rhs_min", "rhs_max"};
+  })
+.set_attr<nnvm::FListOutputNames>("FListOutputNames", QuantizedElemwiseMulOutputNames)
+.set_attr<mxnet::FInferShape>("FInferShape", QuantizedElemwiseMulOpShape)
+.set_attr<nnvm::FInferType>("FInferType", QuantizedElemwiseMulOpType)
+.set_attr<FInferStorageType>("FInferStorageType", QuantizedElemwiseMulOpStorageType)
+.set_attr<FResourceRequest>("FResourceRequest",
+  [](const NodeAttrs& attrs) {
+    return std::vector<ResourceRequest>{ResourceRequest::kTempSpace};
+  })
+.set_attr<FCompute>("FCompute<cpu>", QuantizedElemwiseMulOpForward)
+// TODO(Xinyu): a temp solution to enable GluonCV INT8 flow,
+// will be reverted after the improvement of CachedOP is done.
+.set_attr<nnvm::FGradient>("FGradient", MakeZeroGradNodes)
+.set_attr<FNeedRequantize>("FNeedRequantize", [](const NodeAttrs& attrs) { return true; })
+.add_argument("lhs", "NDArray-or-Symbol", "first input")
+.add_argument("rhs", "NDArray-or-Symbol", "second input")
+.add_argument("lhs_min", "NDArray-or-Symbol", "Minimum value of first input.")
+.add_argument("lhs_max", "NDArray-or-Symbol", "Maximum value of first input.")
+.add_argument("rhs_min", "NDArray-or-Symbol", "Minimum value of second input.")
+.add_argument("rhs_max", "NDArray-or-Symbol", "Maximum value of second input.")
+.set_attr_parser(ParamParser<QuantizeElemwiseMulParam>)
+.add_arguments(QuantizeElemwiseMulParam::__FIELDS__());
+
+NNVM_REGISTER_OP(elemwise_mul)
+.set_attr<FQuantizable>("FQuantizable", [](const NodeAttrs& attrs) {
+    return QuantizeType::kMust;
 
 Review comment:
   I suppose this op should be `kSupport` instead of `kMust`, right?
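
   For reference, the reviewer's suggestion as a sketch (the discussion below resolves this by dropping the attribute altogether, since `kSupport` is the default):

   ```cpp
   // elemwise_mul can be quantized but does not have to be, hence kSupport.
   NNVM_REGISTER_OP(elemwise_mul)
   .set_attr<FQuantizable>("FQuantizable", [](const NodeAttrs& attrs) {
       return QuantizeType::kSupport;
   });
   ```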


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r360799854
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul.cc
 ##########
 @@ -0,0 +1,265 @@
+ *  Copyright (c) 2016 by Contributors
+ * \file quantized_elemwise_mul.cc
+ * \brief CPU Implementation of basic elementwise binary broadcast operators
 
 Review comment:
   `mul` instead of `broadcast`.


[GitHub] [incubator-mxnet] xinyu-intel commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
xinyu-intel commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r361239795
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul.cc
 ##########
 @@ -247,9 +246,6 @@ NNVM_REGISTER_OP(_contrib_quantized_elemwise_mul)
 .add_arguments(QuantizeElemwiseMulParam::__FIELDS__());
 
 NNVM_REGISTER_OP(elemwise_mul)
-.set_attr<FQuantizable>("FQuantizable", [](const NodeAttrs& attrs) {
-    return QuantizeType::kMust;
 
 Review comment:
   The default value is `kSupport`, so there's no need to set this attribute.


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r360799517
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul-inl.h
 ##########
 @@ -0,0 +1,64 @@
+/*!
+ *  Copyright (c) 2016 by Contributors
 
 Review comment:
   2016 -> 2019, for the header of all new files.


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r361237265
 
 

 ##########
 File path: tests/python/quantization/test_quantization.py
 ##########
 @@ -341,6 +341,68 @@ def check_quantized_elemwise_add(data_shape, qtype):
         check_quantized_elemwise_add((3, 4, 56, 56), qtype)
         check_quantized_elemwise_add((32, 56, 64, 11), qtype)
 
+@with_seed()
+def test_quantized_elemwise_mul():
+    def check_quantized_elemwise_mul(data_shape, qtype):
+        if is_test_for_native_cpu():
+            print('skipped testing quantized_elemwise_mul for native cpu since it is not supported yet')
+            return
+        elif qtype != 'int8':
+            print('skipped testing quantized_elemwise_mul for unsupported data type')
+            return
+        elif is_test_for_gpu():
+            print('skipped testing quantized_elemwise_mul for gpu since it is not supported yet')
+            return
+
+        dataA = mx.sym.Variable(name='dataA', shape=data_shape, dtype='float32')
+        dataB = mx.sym.Variable(name='dataB', shape=data_shape, dtype='float32')
+        elemwise_mul_fp32 = mx.sym.elemwise_mul(dataA, dataB)
+        arg_names = elemwise_mul_fp32.list_arguments()
+        elemwise_mul_fp32_exe = elemwise_mul_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
+        if qtype == 'uint8':
+            data_low = 0.0
+            data_high = 255.0
+        else:
+            data_low = -127.0
+            data_high = 127.0
+
+        dataA_val = mx.nd.random.uniform(low=data_low, high=data_high, shape=data_shape).astype('int32')
+        dataB_val = mx.nd.random.uniform(low=data_low, high=data_high, shape=data_shape).astype('int32')
+        elemwise_mul_fp32_exe.arg_dict[arg_names[0]][:] = dataA_val
+
+        elemwise_mul_fp32_exe.arg_dict[arg_names[1]][:] = dataB_val
+
+        output = elemwise_mul_fp32_exe.forward()[0]
+
+        qdataA = mx.sym.Variable(name='qdataA', shape=data_shape, dtype=qtype)
+        qdataB = mx.sym.Variable(name='qdataB', shape=data_shape, dtype=qtype)
+        min_dataA = mx.sym.Variable(name='min_dataA')
+        max_dataA = mx.sym.Variable(name='max_dataA')
+        min_dataB = mx.sym.Variable(name='min_dataB')
+        max_dataB = mx.sym.Variable(name='max_dataB')
+        quantized_elemwise_mul = mx.sym.contrib.quantized_elemwise_mul(qdataA, qdataB, min_dataA, max_dataA, min_dataB, max_dataB)
+        elemwise_mul_int8_exe = quantized_elemwise_mul.simple_bind(ctx=mx.current_context(), grad_req='null')
+        qarg_names = quantized_elemwise_mul.list_arguments()
+        elemwise_mul_int8_exe.arg_dict[qarg_names[0]][:] = elemwise_mul_fp32_exe.arg_dict[arg_names[0]].astype(qtype)
+        elemwise_mul_int8_exe.arg_dict[qarg_names[1]][:] = elemwise_mul_fp32_exe.arg_dict[arg_names[1]].astype(qtype)
+        quantized_range = 127.0
+        elemwise_mul_int8_exe.arg_dict[qarg_names[2]][:] = data_low
+        elemwise_mul_int8_exe.arg_dict[qarg_names[3]][:] = data_high
+        elemwise_mul_int8_exe.arg_dict[qarg_names[4]][:] = data_low
+        elemwise_mul_int8_exe.arg_dict[qarg_names[5]][:] = data_high
+        qoutput, min_range, max_range = elemwise_mul_int8_exe.forward()
+        min_val = min_range.asnumpy().tolist()[0]
+        max_val = max_range.asnumpy().tolist()[0]
+
+        fp32_rslt = output.asnumpy()
+        int8_rslt = qoutput.asnumpy()*max_val/0x7fffffff
 
 Review comment:
   The `min_val` and `max_val` are not the scale range for `qoutput` when its dtype is `int32`; in that case the `int8_rslt` will be incorrect.
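
   To make the concern concrete: each s8 input carries a scale of `kInt8Range / MaxAbs(min, max)` (see the operator code above), so the raw int32 product sits on a `127 * 127` scale rather than `0x7fffffff`. A sketch of the implied dequantization, assuming the s8s8 range kernel reports `max_output = MaxAbs(lhs_min, lhs_max) * MaxAbs(rhs_min, rhs_max)` (an assumption about `QuantizationRangeForS8S8MultiplicationStruct`, not verified here):

   ```cpp
   #include <cstdint>

   // Hypothetical dequantization of one int32 output element:
   //   r_lhs = q_lhs * MaxAbs(lhs) / 127,  r_rhs = q_rhs * MaxAbs(rhs) / 127
   //   => r_out = q_out * max_output / (127 * 127)
   float DequantizeS8S8Product(int32_t q_out, float max_output) {
     const float kInt8Range = 127.0f;
     return q_out * max_output / (kInt8Range * kInt8Range);
   }
   ```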


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r361237272
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul.cc
 ##########
 @@ -247,9 +246,6 @@ NNVM_REGISTER_OP(_contrib_quantized_elemwise_mul)
 .add_arguments(QuantizeElemwiseMulParam::__FIELDS__());
 
 NNVM_REGISTER_OP(elemwise_mul)
-.set_attr<FQuantizable>("FQuantizable", [](const NodeAttrs& attrs) {
-    return QuantizeType::kMust;
 
 Review comment:
   Keep this attribute as `kSupport`?


[GitHub] [incubator-mxnet] ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
ciyongch commented on a change in pull request #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#discussion_r360802245
 
 

 ##########
 File path: src/operator/quantization/quantized_elemwise_mul.cc
 ##########
 @@ -0,0 +1,265 @@
+inline bool QuantizedElemwiseMulOpShape(const nnvm::NodeAttrs& attrs,
+                                        mxnet::ShapeVector *in_attrs,
+                                        mxnet::ShapeVector *out_attrs) {
+  using namespace mshadow;
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  const mxnet::TShape &lshape = (*in_attrs)[quantized_elemwise_mul::kLhs];
+  const mxnet::TShape &rshape = (*in_attrs)[quantized_elemwise_mul::kRhs];
+  if (!ndim_is_known(lshape) || !ndim_is_known(rshape)) return false;
+  CHECK_EQ(lshape.ndim(), rshape.ndim()) << "Currently, quantized elemwise multiply doesn't support broadcast.";
+  for (int i = 0; i < lshape.ndim(); ++i) {
+    CHECK_EQ(lshape[i], rshape[i]);
+  }
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kLhsMin, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kLhsMax, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kRhsMin, mxnet::TShape(1, 1));
+  SHAPE_ASSIGN_CHECK(*in_attrs, quantized_elemwise_mul::kRhsMax, mxnet::TShape(1, 1));
+  out_attrs->clear();
+
+  mxnet::TShape oshape(lshape);
+  out_attrs->push_back(oshape);
+  if (!params.enable_float_output) {
+    out_attrs->push_back(mxnet::TShape(1, 1));
+    out_attrs->push_back(mxnet::TShape(1, 1));
+  }
+  return shape_is_known(oshape);
+}
+
+inline bool QuantizedElemwiseMulOpType(const nnvm::NodeAttrs& attrs,
+                                       std::vector<int> *in_type,
+                                       std::vector<int> *out_type) {
+  const QuantizeElemwiseMulParam& params = nnvm::get<QuantizeElemwiseMulParam>(attrs.parsed);
+  for (int i = 0; i < 2; ++i) {
+    if (in_type->at(i) == mshadow::kInt8) {
+      TYPE_ASSIGN_CHECK(*in_type, i, mshadow::kInt8);
+    } else {
+      LOG(ERROR) << "Currently, quantized elemwise mul only supports int8 inputs.";
+    }
+  }
+  TYPE_ASSIGN_CHECK(*in_type, 2, mshadow::kFloat32);
 
 Review comment:
   Suggest using the same enum indices as above instead of the magic numbers.
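
   A sketch of the suggestion, reusing the `quantized_elemwise_mul` input indices already used in the shape function above:

   ```cpp
   // Named enum indices instead of the magic numbers 2..5.
   TYPE_ASSIGN_CHECK(*in_type, quantized_elemwise_mul::kLhsMin, mshadow::kFloat32);
   TYPE_ASSIGN_CHECK(*in_type, quantized_elemwise_mul::kLhsMax, mshadow::kFloat32);
   TYPE_ASSIGN_CHECK(*in_type, quantized_elemwise_mul::kRhsMin, mshadow::kFloat32);
   TYPE_ASSIGN_CHECK(*in_type, quantized_elemwise_mul::kRhsMax, mshadow::kFloat32);
   ```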


[GitHub] [incubator-mxnet] pengzhao-intel merged pull request #17147: Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
pengzhao-intel merged pull request #17147: Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147
 
 
   


[GitHub] [incubator-mxnet] xinyu-intel commented on issue #17147: [WIP] Quantized Elemwise Mul Operator

Posted by GitBox <gi...@apache.org>.
xinyu-intel commented on issue #17147: [WIP] Quantized Elemwise Mul Operator
URL: https://github.com/apache/incubator-mxnet/pull/17147#issuecomment-568635184
 
 
   @ciyongch thanks, please review again :)
