You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/05/06 01:46:11 UTC

[GitHub] [incubator-mxnet] Zha0q1 opened a new pull request #20249: [v1.x] ONNX Add an option to un-interleave BERT

Zha0q1 opened a new pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249


   Add model (BERT) specific logic to un-interleave the self attention mat mul. This can potentially speed up inference with trt 8.0 whose compiler can recognize the new pattern.
   
   usage `model_specific_logics='gluonnlp_bert'`
   ```
   converted_model_path = mx.onnx.export_model(sym_file, params_file, input_shapes,
                                                               input_types, onnx_file, model_specific_logics='gluonnlp_bert')
   ```
   The first screenshot is the old graph, the second is the new graph. Note that use of `onnx-sim` is required 
   ![image](https://user-images.githubusercontent.com/16669457/117230463-01363800-add2-11eb-94bf-357595d0c70d.png)
   ![image](https://user-images.githubusercontent.com/16669457/117230472-05625580-add2-11eb-8c18-cdcf2e86de6e.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] waytrue17 commented on a change in pull request #20249: [v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
waytrue17 commented on a change in pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#discussion_r627671641



##########
File path: python/mxnet/onnx/mx2onnx/_op_translations/__init__.py
##########
@@ -18,5 +18,6 @@
 # coding: utf-8
 """ONNX export op translation"""
 
+from . import _gluonnlp_bert

Review comment:
       Would this logic benefits specifically for TRT? If so, shall we consider to name it like '_gluonnlp_bert_trt'?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #20249: [poc][v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#issuecomment-833965720


   @szha @ptrendx for insights


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] ptrendx commented on pull request #20249: [poc][v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
ptrendx commented on pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#issuecomment-840153792


   @Zha0q1 What is the difficulty in doing this transformation after the export? You should be able to modify the ONNX graph itself, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] szha commented on pull request #20249: [v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
szha commented on pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#issuecomment-833962325


   It probably makes more sense to have gluonnlp export a BERT graph that doesn't involve interleaving than doing it in mxnet. framework shouldn't have knowledge about or rely on the implementation of its ecosystem packages.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 edited a comment on pull request #20249: [v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
Zha0q1 edited a comment on pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#issuecomment-833965061


   > It probably makes more sense to have gluonnlp export a BERT graph that doesn't involve interleaving than doing it in mxnet. framework shouldn't have knowledge about or rely on the implementation of its ecosystem packages.
   
   I agree. Alternatively we might be able to make mx2onnx support loading custom conversion functions dynamically and put these functions to gluonnlp? Or if even that is too hacky this pr can serve as a poc to evaluate the performance benefit with trt 8.0 by un-interleaving the matrix multiplication


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] szha commented on a change in pull request #20249: [v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
szha commented on a change in pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#discussion_r627842716



##########
File path: python/mxnet/onnx/mx2onnx/_export_model.py
##########
@@ -83,6 +83,11 @@ def export_model(sym, params, in_shapes=None, in_types=np.float32,
         This is the old name of in_types. We keep this parameter name for backward compatibility
     in_shapes : List of tuple
         This is the old name of in_shapes. We keep this parameter name for backward compatibility
+    model_specific_logics : str
+        Specifies if model-specific conversion logic should be used. Refer to ./_op_translations/
+    cheat_sheet : dict of str to str
+        This is a dict that stors some hyperparameters values or additional info about the model that
+        would be used in model-specific conversion functions

Review comment:
       these options are semantically unclear and hard to maintain




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20249: [v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#issuecomment-833164329


   Hey @Zha0q1 , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [clang, miscellaneous, unix-gpu, edge, centos-cpu, unix-cpu, centos-gpu, windows-cpu, website, windows-gpu, sanity]
   *** 
   _Note_: 
    Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #20249: [v1.x] ONNX Add an option to un-interleave BERT

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #20249:
URL: https://github.com/apache/incubator-mxnet/pull/20249#issuecomment-833965061


   > It probably makes more sense to have gluonnlp export a BERT graph that doesn't involve interleaving than doing it in mxnet. framework shouldn't have knowledge about or rely on the implementation of its ecosystem packages.
   
   I agree. Alternatively we might be able to make mx2onnx support loading custom conversion functions dynamically and put these functions to gluonnlp? Or if even that is too hacky this pr can server as a poc to evaluate the performance benefit with trt 8.0 by un-interleaving the matrix multiplication


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org