Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/06/09 17:46:36 UTC

[GitHub] [incubator-tvm] mbrookhart opened a new pull request #5755: Edit onnx parser to infer values in post order

mbrookhart opened a new pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755


   This makes `infer_value` and `infer_value_simulated` run one op at a time to prevent recompiling and rerunning the same graph over and over.
   
   If the old version ran in `t = A * N * M`, where `N` is the number of nodes in the graph and `M` is the number of times `infer_value` is called, this version runs in `t = B * N`, with `B >> A`. This means some models slow down during import, but models that previously took a very long time to import speed up.
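   
   As a toy illustration of the post-order idea (this is not the actual onnx.py code; the graph format and ops below are made up purely to show the shape of the approach): walk the nodes in topological order and, whenever every input of a node is already a concrete constant, evaluate just that one op, so downstream nodes see constants instead of deep call chains that would need recompiling.
   
   ```python
   import numpy as np
   
   # Toy sketch, not TVM code: fold each node as soon as all of its inputs are
   # known constants, so no subgraph ever has to be re-evaluated.
   OPS = {"add": np.add, "mul": np.multiply}
   
   def import_graph(nodes, constants):
       values = dict(constants)                  # name -> np.ndarray for known constants
       for name, op, inputs in nodes:            # nodes arrive in topological (post) order
           if all(i in values for i in inputs):  # every argument is constant: fold this op
               values[name] = OPS[op](*(values[i] for i in inputs))
       return values
   
   nodes = [("c", "add", ("a", "b")), ("d", "mul", ("c", "b"))]
   consts = {"a": np.array(2.0), "b": np.array(3.0)}
   print(import_graph(nodes, consts))  # c -> 5.0, d -> 15.0
   ```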
   
   Import performance testing on my laptop with models from the ONNX Model Zoo and HuggingFace:
   
   RN50 (https://s3.amazonaws.com/onnx-model-zoo/resnet/resnet50v1/resnet50v1.tar.gz):
     master: 5.05 s
     this PR: 6.05 s
   BERT-Squad (https://github.com/onnx/models/raw/master/text/machine_comprehension/bert-squad/model/bertsquad-8.tar.gz):
     master: 68.8 s
     this PR: 158.5 s
   HuggingFace BERT (`transformers.TFBertForSequenceClassification.from_pretrained('bert-base-cased')`):
     master: 567.1 s
     this PR: 122.5 s
   
   @jwfromm @masahi Do you think this is worth including?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] masahi edited a comment on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-642983173


   Ok, I now understand why it makes importing some models slower. Previously there were fewer but heavier `infer_value` calls; now there are many, but each is a single-node compile, so LLVM overhead dominates.
   
   I wish there were a "fast-path" compile option for LLVM for use cases like this. Turning on all optimizations to compile a single node and get a compile-time constant is clearly overkill. At the very least we could lower the opt level (always 3 in our backend) for const-eval mode @tqchen 
   
   Maybe we can add an option here to tell LLVM that we want a fast-path compile:
   https://github.com/apache/incubator-tvm/blob/master/python/tvm/relay/frontend/common.py#L509 
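   
   As a rough sketch of what I mean (this is not the current common.py implementation, and the helper below is hypothetical): wrap the constant evaluation in a `tvm.transform.PassContext` with `opt_level=0` so the backend does as little work as possible for a single-node graph.
   
   ```python
   import numpy as np
   import tvm
   from tvm import relay
   
   # Sketch only: evaluate one small constant expression at opt_level=0 so LLVM
   # skips the heavy optimization passes that are overkill for a single node.
   def eval_const_expr(expr):
       mod = tvm.IRModule.from_expr(relay.Function([], expr))
       with tvm.transform.PassContext(opt_level=0):
           return relay.create_executor("graph", mod=mod, target="llvm").evaluate()()
   
   a = relay.const(np.array([1.0, 2.0], dtype="float32"))
   b = relay.const(np.array([3.0, 4.0], dtype="float32"))
   print(eval_const_expr(relay.add(a, b)))  # [4. 6.], folded at import time
   ```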


[GitHub] [incubator-tvm] mbrookhart commented on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
mbrookhart commented on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-642711335


   Yes, that's the idea, and why the HuggingFace model import speeds up so much. It seems like there is significant overhead each time we call MCJIT, though, which is why BERT-Squad, with many ops but fewer `infer_value` calls, slows down. 
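   
   A rough back-of-the-envelope using the `t = A * N * M` / `t = B * N` model from the description (the constants here are made up for illustration, not measurements) shows the crossover: the per-op change only pays off once `infer_value` is called many times.
   
   ```python
   # Illustrative constants only, not measured values.
   A, B = 0.001, 0.05       # per-op cost inside one big compile vs per-op JIT cost
   N = 1000                 # nodes in the graph
   for M in (1, 10, 100):   # number of infer_value calls during import
       old, new = A * N * M, B * N
       print(f"M={M:3d}  old={old:6.1f}s  new={new:6.1f}s")
   ```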


[GitHub] [incubator-tvm] masahi edited a comment on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-642300115


   Correct me if I'm wrong: does this change make `infer_value` on downstream nodes faster because the arguments are constants (computed bottom up) rather than other call nodes that require recompilation?


[GitHub] [incubator-tvm] jwfromm commented on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
jwfromm commented on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-642273871


   I'm definitely in favor of this change. Some of the models that take a long time to import are very unpleasant to deal with, and taking a small hit on models that already import quickly is worth it in my opinion. Would love to hear your thoughts @masahi and @jonso4 


[GitHub] [incubator-tvm] masahi commented on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-642983173


   Ok, I now understand why it makes importing some models slower. Previously there were fewer but heavier `infer_value` calls; now there are many, but each is a single-node compile, so LLVM overhead dominates.
   
   I wish there were a "fast-path" compile option for LLVM for use cases like this. Turning on all optimizations to compile a single node and get a compile-time constant is clearly overkill. At the very least we could lower the opt level (always 3 in our backend) for const-eval mode @tqchen 


[GitHub] [incubator-tvm] masahi commented on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-642984901


   I think it's a very good idea, and we can probably mitigate the slowdown by tweaking LLVM.
   
   Will leave it open for another day for others to comment 
   @soiferj @siju-samuel @icemelon9 @kevinthesun 


[GitHub] [incubator-tvm] masahi merged pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
masahi merged pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755


   


[GitHub] [incubator-tvm] masahi commented on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-642300115


   Correct me if I'm wrong: does this change mean `infer_value` on downstream nodes should be faster because the arguments are constants (computed bottom up) rather than other call nodes that require recompilation?


[GitHub] [incubator-tvm] masahi commented on pull request #5755: Edit onnx parser to infer values in post order

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #5755:
URL: https://github.com/apache/incubator-tvm/pull/5755#issuecomment-643503058


   Thanks @mbrookhart @jwfromm @siju-samuel 

