Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/07/11 21:53:41 UTC

[GitHub] [incubator-tvm] anijain2305 opened a new pull request #6039: MXNet pre-quantized BERT

anijain2305 opened a new pull request #6039:
URL: https://github.com/apache/incubator-tvm/pull/6039


   MXNet pre-quantized BERT model - https://gluon-nlp.mxnet.io/examples/sentence_embedding/bert.html#Quantize-the-model
   
   Features added in this PR
   
   * Support for per-tensor quantization for the MXNet Dense operator
   * Support for per-channel quantization for the MXNet Dense operator
   * Channel-wise support for dequantization
   * Support for softmax with use_length=True when axis=-1
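
To illustrate the difference between the tensor and channel quantization modes listed above, here is a minimal pure-Python sketch. It is not the code from this PR: the function names, the affine formula `(q - zero_point) * scale`, and the choice of one scale per output row are assumptions made for illustration only.

```python
# Hypothetical helpers, not part of TVM or MXNet.

def dequantize_per_tensor(q, scale, zero_point=0):
    """Dequantize a 2-D integer weight with one scale for the whole tensor."""
    return [[(v - zero_point) * scale for v in row] for row in q]


def dequantize_per_channel(q, scales, zero_points=None):
    """Dequantize a 2-D integer weight with one scale per output row (channel).

    Assumption: channels run along axis 0, i.e. q[i] is the weight row for
    output unit i, as in a Dense weight of shape (units, in_features).
    """
    if zero_points is None:
        zero_points = [0] * len(scales)
    assert len(q) == len(scales) == len(zero_points)
    return [
        [(v - zp) * s for v in row]
        for row, s, zp in zip(q, scales, zero_points)
    ]


weights = [[10, -20], [4, 8]]
per_tensor = dequantize_per_tensor(weights, scale=0.5)
per_channel = dequantize_per_channel(weights, scales=[0.5, 0.25])
```

With a single scale every row is scaled identically; with per-channel scales each output row gets its own factor, which is why dequantization needs to know the channel axis.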
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] icemelon9 commented on a change in pull request #6039: MXNet pre-quantized BERT

Posted by GitBox <gi...@apache.org>.

icemelon9 commented on a change in pull request #6039:
URL: https://github.com/apache/incubator-tvm/pull/6039#discussion_r453364084



##########
File path: python/tvm/relay/frontend/nnvm_common.py
##########
@@ -57,9 +60,53 @@ def _impl(inputs, attrs):
 def _softmax_op(new_op):
     """softmax/log_softmax"""
     def _impl(inputs, attrs, _dtype='float32'):
-        # TODO(@icemelon9): currently ignore the 2nd input to softmax for mxnet 1.6
-        # assert len(inputs) == 1
         axis = attrs.get_int("axis", -1)
+        use_length = attrs.get_bool("use_length", False)
+        if use_length:
+            # The second arg is valid_length. We can use sequence mask to mask the input before
+            # computing softmax
+            assert len(inputs) == 2
+
+            data = inputs[0]
+            length = inputs[1]
+            data_shape = _infer_shape(data)
+            length_shape = _infer_shape(length)
+
+            if axis < 0:
+                axis = len(data_shape) + axis
+
+            data_ndims = len(data_shape)
+            length_ndims = len(length_shape)
+
+            # Sequence_mask supports axis = 0 and 1 and requires data to be in specific format.
+            if axis == data_ndims - 1 and data_ndims > 2 and length_ndims == 2:
+                new_batch_size = 1
+                for dim in range(length_ndims):
+                    assert data_shape[dim] == length_shape[dim]
+                    new_batch_size *= data_shape[dim]
+
+                # Reshape the data and length to satisfy sequence mask
+                data = _op.reshape(data, newshape=(new_batch_size, -1))
+                length = _op.reshape(length, newshape=(new_batch_size))
+
+                # Input data is now 2D, we can set the axis = 1
+                axis = 1
+            elif data_ndims > 2:
+                raise error.OpNotImplemented(\
+                        "Operator softmax with use_length=True is supported only for axis -1")
+
+            res = _op.sequence_mask(data=data,
+                                    valid_length=length,
+                                    mask_value=float(min_value("float").value),

Review comment:
       It'll be better to use the dtype of data instead of "float".
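
The masking approach in the diff above, together with the suggestion to pick the mask value from the data's dtype, can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the PR's implementation: the `LOWEST` table and all function names are hypothetical, and nested lists stand in for tensors of shape (d0, d1, seq) with lengths of shape (d0, d1).

```python
import math
import sys

# Lowest finite value per float width (IEEE 754). Hypothetical lookup table
# standing in for "the minimum value of the data's dtype".
LOWEST = {
    "float16": -65504.0,
    "float32": -(2.0 - 2.0 ** -23) * 2.0 ** 127,
    "float64": -sys.float_info.max,
}


def sequence_mask_2d(rows, lengths, mask_value):
    """Replace positions at or beyond each row's valid length with mask_value."""
    return [
        [v if i < n else mask_value for i, v in enumerate(row)]
        for row, n in zip(rows, lengths)
    ]


def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax_last_axis(data, lengths, dtype="float32"):
    """Softmax over the last axis, ignoring positions past the valid length."""
    d0, d1 = len(data), len(data[0])
    assert len(lengths) == d0 and all(len(r) == d1 for r in lengths)
    # Flatten the two leading dimensions so masking works on a 2-D layout,
    # mirroring the reshape to (new_batch_size, -1) in the diff.
    flat_rows = [row for block in data for row in block]
    flat_lengths = [n for r in lengths for n in r]
    masked = sequence_mask_2d(flat_rows, flat_lengths, LOWEST[dtype])
    probs = [softmax(r) for r in masked]
    # Restore the original (d0, d1, seq) layout.
    return [probs[i * d1:(i + 1) * d1] for i in range(d0)]


data = [[[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]]
lengths = [[2, 3]]
out = masked_softmax_last_axis(data, lengths)
```

Masked positions receive the lowest finite value of the chosen dtype, so after subtracting the row maximum their exponentials underflow to zero and they contribute nothing to the normalization.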





[GitHub] [incubator-tvm] anijain2305 commented on pull request #6039: MXNet pre-quantized BERT

Posted by GitBox <gi...@apache.org>.

anijain2305 commented on pull request #6039:
URL: https://github.com/apache/incubator-tvm/pull/6039#issuecomment-660275132


   @icemelon9 Can you please manage this PR?



[GitHub] [incubator-tvm] anijain2305 commented on a change in pull request #6039: MXNet pre-quantized BERT

Posted by GitBox <gi...@apache.org>.

anijain2305 commented on a change in pull request #6039:
URL: https://github.com/apache/incubator-tvm/pull/6039#discussion_r453445910



##########
File path: python/tvm/relay/frontend/nnvm_common.py
##########
@@ -57,9 +60,53 @@ def _impl(inputs, attrs):
 def _softmax_op(new_op):
     """softmax/log_softmax"""
     def _impl(inputs, attrs, _dtype='float32'):
-        # TODO(@icemelon9): currently ignore the 2nd input to softmax for mxnet 1.6
-        # assert len(inputs) == 1
         axis = attrs.get_int("axis", -1)
+        use_length = attrs.get_bool("use_length", False)
+        if use_length:
+            # The second arg is valid_length. We can use sequence mask to mask the input before
+            # computing softmax
+            assert len(inputs) == 2
+
+            data = inputs[0]
+            length = inputs[1]
+            data_shape = _infer_shape(data)
+            length_shape = _infer_shape(length)
+
+            if axis < 0:
+                axis = len(data_shape) + axis
+
+            data_ndims = len(data_shape)
+            length_ndims = len(length_shape)
+
+            # Sequence_mask supports axis = 0 and 1 and requires data to be in specific format.
+            if axis == data_ndims - 1 and data_ndims > 2 and length_ndims == 2:
+                new_batch_size = 1
+                for dim in range(length_ndims):
+                    assert data_shape[dim] == length_shape[dim]
+                    new_batch_size *= data_shape[dim]
+
+                # Reshape the data and length to satisfy sequence mask
+                data = _op.reshape(data, newshape=(new_batch_size, -1))
+                length = _op.reshape(length, newshape=(new_batch_size))
+
+                # Input data is now 2D, we can set the axis = 1
+                axis = 1
+            elif data_ndims > 2:
+                raise error.OpNotImplemented(\
+                        "Operator softmax with use_length=True is supported only for axis -1")
+
+            res = _op.sequence_mask(data=data,
+                                    valid_length=length,
+                                    mask_value=float(min_value("float").value),

Review comment:
       Done





[GitHub] [incubator-tvm] icemelon9 merged pull request #6039: MXNet pre-quantized BERT

Posted by GitBox <gi...@apache.org>.

icemelon9 merged pull request #6039:
URL: https://github.com/apache/incubator-tvm/pull/6039


   



[GitHub] [incubator-tvm] anijain2305 commented on pull request #6039: MXNet pre-quantized BERT

Posted by GitBox <gi...@apache.org>.

anijain2305 commented on pull request #6039:
URL: https://github.com/apache/incubator-tvm/pull/6039#issuecomment-657168012


   @icemelon9 @eric-haibin-lin @shoubhik 
   
   Please review



[GitHub] [incubator-tvm] icemelon9 commented on pull request #6039: MXNet pre-quantized BERT

Posted by GitBox <gi...@apache.org>.

icemelon9 commented on pull request #6039:
URL: https://github.com/apache/incubator-tvm/pull/6039#issuecomment-661412899


   Thanks @anijain2305 

