Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/08/18 08:35:11 UTC

[GitHub] [tvm] juda opened a new pull request, #12485: [Relay][Op] embedding_bag operator implementation

juda opened a new pull request, #12485:
URL: https://github.com/apache/tvm/pull/12485

   This PR adds a new `embedding_bag` operator, similar to `torch.embedding_bag`.
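
   For reference, a minimal sketch of the PyTorch semantics being matched (toy values; `mode="sum"` chosen for illustration):
   ```
   import torch
   import torch.nn.functional as F

   weight = torch.arange(12, dtype=torch.float32).reshape(4, 3)  # 4-row lookup table
   indices = torch.tensor([0, 2, 1, 3])   # flat 1-D indices for all bags
   offsets = torch.tensor([0, 2])         # bag 0 = indices[0:2], bag 1 = indices[2:4]

   # Each bag is reduced (here summed) into one output row.
   out = F.embedding_bag(indices, weight, offsets, mode="sum")
   print(out.shape)  # torch.Size([2, 3])
   ```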


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r959842525


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2247,6 +2247,47 @@ def embedding(self, inputs, input_types):
 
         return _op.take(weight, indices.astype("int32"), axis=0)
 
+    def embedding_bag(self, inputs, _):
+        assert len(inputs) == 9, "embedding_bag needs 9 arguments"
+        (
+            weights,
+            indices,
+            offsets_1d,
+            scale_grad_by_freq,
+            mode,
+            sparse,
+            per_sample_weights,
+            include_last_offset,
+            padding_idx,
+        ) = inputs
+
+        assert len(_infer_shape(indices)) == 1, "Expects 1D indices for aten::embedding_bag."
+
+        assert (

Review Comment:
   Actually, if the `scale_grad_by_freq` and `sparse` arguments won't affect the results of our embedding_bag impl here, the expected behavior I have in mind is that we add some tests confirming this fact and raise a warning to the user, instead of an assertion here, if either of those two arguments is passed with a non-default value. Does that make sense?
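
   A minimal sketch of the warning-instead-of-assert idea (the logger name and message wording are assumptions):
   ```
   import logging

   logger = logging.getLogger(__name__)  # hypothetical frontend logger

   # scale_grad_by_freq only affects gradient computation, so it is safe
   # to ignore for inference; warn instead of asserting.
   if scale_grad_by_freq:
       logger.warning("aten::embedding_bag: scale_grad_by_freq is ignored for inference.")
   if sparse:
       logger.warning("aten::embedding_bag: sparse=True is ignored; a dense lookup is used.")
   ```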



[GitHub] [tvm] shingjan commented on pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on PR #12485:
URL: https://github.com/apache/tvm/pull/12485#issuecomment-1302774613

   @areusch this should be closed by #12993 


[GitHub] [tvm] juda commented on pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
juda commented on PR #12485:
URL: https://github.com/apache/tvm/pull/12485#issuecomment-1232424032

   @tvm-bot rerun


[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r951887727


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2485,6 +2509,12 @@ def interpolate(self, inputs, input_types):
         )
 
     def numel(self, inputs, input_types):
+        shape = self.infer_shape(inputs[0])

Review Comment:
   We may want to check whether we still need this change here. It fixes a legacy bug in the old Python implementation.
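
   A hypothetical sketch of the direction this fix points in, computing `numel` from the statically inferred shape (the return construction and dtype are assumptions):
   ```
   import numpy as np

   def numel(self, inputs, input_types):
       # Hypothetical: the element count is the product of the
       # statically inferred shape dimensions.
       shape = self.infer_shape(inputs[0])
       return _expr.const(int(np.prod(shape)), dtype="int64")
   ```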



[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r951886852


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2242,6 +2243,29 @@ def embedding(self, inputs, input_types):
 
         return _op.take(weight, indices.astype("int32"), axis=0)
 
+    def embedding_bag(self, inputs, _):
+        assert len(inputs) == 9, "embedding_bag needs 9 arguments"
+        (
+            weights,
+            indices,
+            offsets_1d,
+            scale_grad_by_freq,
+            mode,
+            sparse,
+            per_sample_weights,
+            include_last_offset,

Review Comment:
   Is it possible that we can support `sparse`, `per_sample_weights` and `include_last_offset` as well? I think at least for the `dlrm` model, `sparse` is needed. If we can't really figure out a way to support `scale_grad_by_freq`, for now my take is that we can explicitly put a check here and fail the compilation if this argument is passed from PyTorch, e.g. something like the sketch below.
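
   A minimal sketch of such a check (placement and message wording are assumptions):
   ```
   # Fail conversion early rather than silently mis-compiling when an
   # unsupported argument is set.
   if scale_grad_by_freq:
       raise NotImplementedError(
           "aten::embedding_bag with scale_grad_by_freq=True is not supported."
       )
   ```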



[GitHub] [tvm] juda commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
juda commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r952026615


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2242,6 +2243,29 @@ def embedding(self, inputs, input_types):
 
         return _op.take(weight, indices.astype("int32"), axis=0)
 
+    def embedding_bag(self, inputs, _):
+        assert len(inputs) == 9, "embedding_bag needs 9 arguments"
+        (
+            weights,
+            indices,
+            offsets_1d,
+            scale_grad_by_freq,
+            mode,
+            sparse,
+            per_sample_weights,
+            include_last_offset,

Review Comment:
   Setting `scale_grad_by_freq` didn't change the result, and from reading PyTorch's source code it seems to be used only for computing gradients?



[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r951888906


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -3919,7 +3950,8 @@ def _handel_nested_input(inputs):
                     out_names = _get_output_names(op_node)
                     outputs.update(zip(out_names, relay_out))
                 else:
-                    assert op_node.outputsSize() == 1
+                    # The node_name of embedding_bag is like "22_23_24_25"

Review Comment:
   This change looks very clean to me. We may want to check whether this node is `embedding_bag` before we do the split. My concern is that this change may not work for other ops that return more than one output.



[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r953194878


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -3919,7 +3950,8 @@ def _handel_nested_input(inputs):
                     out_names = _get_output_names(op_node)
                     outputs.update(zip(out_names, relay_out))
                 else:
-                    assert op_node.outputsSize() == 1
+                    # The node_name of embedding_bag is like "22_23_24_25"

Review Comment:
   Yeah, but my point is that:
   
   1) Other ops which have one output will enter this branch, and this line essentially does nothing for those ops, which could confuse readers who don't have the context for why we add this split here.
   2) If a future op also has more than one output concatenated into one name, like what we previously did with `embedding_bag`, this behavior may not be ideal/expected.
   
   Hence my take on this is that we add a check like
   ```
   if op_node.kind() == "aten::embedding_bag":
       # embedding_bag concatenates its output names into one string,
       # e.g. "22_23_24_25", so only this op needs the split
       out_names = node_name.split("_")
   ```
   which could make the codebase much cleaner.



[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r953195691


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2242,6 +2243,29 @@ def embedding(self, inputs, input_types):
 
         return _op.take(weight, indices.astype("int32"), axis=0)
 
+    def embedding_bag(self, inputs, _):
+        assert len(inputs) == 9, "embedding_bag needs 9 arguments"
+        (
+            weights,
+            indices,
+            offsets_1d,
+            scale_grad_by_freq,
+            mode,
+            sparse,
+            per_sample_weights,
+            include_last_offset,

Review Comment:
   Per offline discussion, we think that we can leave `scale_grad_by_freq` and `sparse` unused. It would be great if we could ignore those arguments here on the Python end, instead of passing them to `_make.embedding_bag` only to ignore them there, which makes the logic clearer.
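
   A minimal sketch of dropping the unused arguments on the Python side (the exact `_make.embedding_bag` signature is an assumption):
   ```
   (
       weights,
       indices,
       offsets_1d,
       _,  # scale_grad_by_freq: gradient-only, ignored for inference
       mode,
       _,  # sparse: ignored, a dense lookup is used
       per_sample_weights,
       include_last_offset,
       padding_idx,
   ) = inputs
   return _make.embedding_bag(
       weights, indices, offsets_1d, mode,
       per_sample_weights, include_last_offset, padding_idx
   )
   ```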



[GitHub] [tvm] shingjan commented on pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on PR #12485:
URL: https://github.com/apache/tvm/pull/12485#issuecomment-1226898205

   I took a quick look and this PR should be in pretty good shape after CI is fixed. 
   
   cc: @masahi Can you also take a look at this? Would really appreciate your review on this. Thanks!


[GitHub] [tvm] juda commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
juda commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r953460729


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2485,6 +2509,12 @@ def interpolate(self, inputs, input_types):
         )
 
     def numel(self, inputs, input_types):
+        shape = self.infer_shape(inputs[0])

Review Comment:
   I moved the code and everything looks good.



[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r959839869


##########
include/tvm/topi/transform.h:
##########
@@ -2016,6 +2016,70 @@ inline Tensor adv_index(const Tensor& data, const Array<Tensor>& indices,
       name, tag);
 }
 
+/*!
+ * \brief Returns the sums, means or maxes of bags of embeddings
+ * \param input the 1-D tensor of indices.
+ * \param weight the 2-D lookup table.
+ * \param offset the starting index position of each bag.
+ * \param mode the way (`sum`, `mean` or `max`) to reduce each bag.
+ * \param per_sample_weights the weights for every input.
+ * \param padding_idx entries equal to padding_idx do not contribute to the result.
+ * \param include_last_offset whether the last element of the offset array is included.
+ * \param dtype the type of the result.
+ * \param name output tensor name.
+ * \param tag output tensor tag.
+ * \return The embedded tensor.
+ */
+inline Tensor embedding_bag(const Tensor& input, const Tensor& weight, const Tensor& offset,
+                            int mode, const Tensor& per_sample_weights, Integer padding_idx,
+                            bool include_last_offset, DataType dtype,
+                            const std::string name = "T_embedding_bag",
+                            const std::string tag = kInjective) {
+  auto row = offset->shape[0];
+  auto column = weight->shape[1];
+
+  int N = GetConstInt(input->shape[0]);
+  if (include_last_offset) {
+    row = row - 1;
+  }
+
+  Array<PrimExpr> oshape{row, column};
+
+  auto func = [&](tvm::tir::Var i, tvm::tir::Var j) {
+    auto ret = make_zero(dtype);
+    auto count = make_zero(dtype);  // count how many elements are used
+
+    auto st = offset(i);  // start point
+    auto ed = tvm::tir::Select(row == i + 1, PrimExpr(N),
+                               cast(DataType::Int(32), offset(i + 1)));  // end point
+
+    // can't find `fold` function, so use a stupid method to iterate from st to ed

Review Comment:
   Let's make this comment more formal. Something like:
   ```
   Use loop iteration here for the lack of fold in Relay
   ```
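
   For reference, a minimal NumPy sketch of the bag reduction the quoted kernel performs (assuming `include_last_offset=False` and non-empty bags; the helper name and toy values are hypothetical):
   ```
   import numpy as np

   def embedding_bag_ref(indices, weight, offsets, mode="sum"):
       # Append the end-of-input sentinel so every bag has an end point,
       # mirroring the Select on `row == i + 1` in the TOPI lambda.
       ends = np.append(offsets[1:], len(indices))
       bags = [weight[indices[st:ed]] for st, ed in zip(offsets, ends)]
       reduce_fn = {"sum": np.sum, "mean": np.mean, "max": np.max}[mode]
       return np.stack([reduce_fn(bag, axis=0) for bag in bags])

   weight = np.arange(12, dtype="float32").reshape(4, 3)
   out = embedding_bag_ref(np.array([0, 2, 1, 3]), weight, np.array([0, 2]))
   print(out)  # two bags, each reduced to one 3-element row
   ```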



[GitHub] [tvm] juda commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
juda commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r960550601


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2247,6 +2247,47 @@ def embedding(self, inputs, input_types):
 
         return _op.take(weight, indices.astype("int32"), axis=0)
 
+    def embedding_bag(self, inputs, _):
+        assert len(inputs) == 9, "embedding_bag needs 9 arguments"
+        (
+            weights,
+            indices,
+            offsets_1d,
+            scale_grad_by_freq,
+            mode,
+            sparse,
+            per_sample_weights,
+            include_last_offset,
+            padding_idx,
+        ) = inputs
+
+        assert len(_infer_shape(indices)) == 1, "Expects 1D indices for aten::embedding_bag."
+
+        assert (

Review Comment:
   Is the current change OK?



[GitHub] [tvm] shingjan commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
shingjan commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r951889739


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -2242,6 +2243,29 @@ def embedding(self, inputs, input_types):
 
         return _op.take(weight, indices.astype("int32"), axis=0)
 
+    def embedding_bag(self, inputs, _):
+        assert len(inputs) == 9, "embedding_bag needs 9 arguments"
+        (
+            weights,
+            indices,
+            offsets_1d,
+            scale_grad_by_freq,
+            mode,
+            sparse,
+            per_sample_weights,
+            include_last_offset,

Review Comment:
   And after we add these arguments with tests, this PR should be in pretty good shape.



[GitHub] [tvm] juda commented on a diff in pull request #12485: [Relay][Op] embedding_bag operator implementation

Posted by GitBox <gi...@apache.org>.
juda commented on code in PR #12485:
URL: https://github.com/apache/tvm/pull/12485#discussion_r952026083


##########
python/tvm/relay/frontend/pytorch.py:
##########
@@ -3919,7 +3950,8 @@ def _handel_nested_input(inputs):
                     out_names = _get_output_names(op_node)
                     outputs.update(zip(out_names, relay_out))
                 else:
-                    assert op_node.outputsSize() == 1
+                    # The node_name of embedding_bag is like "22_23_24_25"

Review Comment:
   I guess this branch won't be entered if there are multiple outputs.


