Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/10/22 00:36:46 UTC

[GitHub] [tvm] masahi opened a new issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

masahi opened a new issue #9349:
URL: https://github.com/apache/tvm/issues/9349


   PyTorch 1.10 has just been released. We are currently on PT 1.7, which we upgraded to a year ago. I think it is a good time for another update.
   https://pytorch.org/blog/pytorch-1.10-released/
   
   Recently I've been sensing growing interest in tighter integration with PyTorch, for example using TVM as a backend in a PyTorch-based application (https://github.com/apache/tvm/pull/8777) or using TVM for training acceleration. I'm looking forward to seeing more development in this space, in addition to continuing to support the traditional usage of TVM as an e2e inference solution for PT models. From this point of view, I believe having our PT support up-to-date and actively maintained is increasingly important.
   
   - [ ] Figure out what's broken, any API change
   - [ ] Send out necessary fix
   - [ ] Upgrade the CI GPU image


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] masahi closed issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi closed issue #9349:
URL: https://github.com/apache/tvm/issues/9349


   





[GitHub] [tvm] masahi edited a comment on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950738503


   @lhutton1 Unfortunately, I don't have a solution for this problem. The error I got with PT 1.10 + TVM seemed related to LLVM (it says `Unable to find target for this triple (no targets are registered)` and dies during tracing). I also got an `Aborted (core dumped)` error upon process exit.
   
   Also, I've never hit this problem with older PyTorch versions + TVM. But ONNX + PT did cause me some pain in the past, which was also fixed by swapping the import order: https://github.com/onnx/onnx/issues/2394#issuecomment-581638840. I wonder if this is the same dynamic-loader-related issue. I'd love to dig deeper if this is something we can fix from our side.





[GitHub] [tvm] masahi edited a comment on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950738503


   @lhutton1 Unfortunately, I don't have a solution for this problem. The error I got with PT 1.10 + TVM seemed related to LLVM (it says `Unable to find target for this triple (no targets are registered)` and dies during tracing). I also got an `Aborted (core dumped)` error upon process exit.
   
   Also, I've never hit this problem with older PyTorch versions + TVM. But ONNX + PT did cause me some pain in the past, which was also fixed by swapping the import order: https://github.com/onnx/onnx/issues/2394#issuecomment-581638840. I wonder if this is the same dynamic-loader-related issue. I'd love to dig deeper if this is something we can fix from our side.
   
   Related issues from the PT repo:
   https://github.com/pytorch/pytorch/issues/2507 
   https://github.com/pytorch/pytorch/issues/19739





[GitHub] [tvm] lhutton1 commented on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
lhutton1 commented on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-951242475


   Thanks @masahi for looking into this, it's interesting to see many other repos having similar issues. Apologies for checking late, but I can confirm the change you suggested above fixes the issue :) - _Just to add for completeness, the PyTorch version I'm using currently is `1.8.1`_ .





[GitHub] [tvm] masahi commented on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950044958


   Tracing MaskRCNN is broken with this release https://github.com/pytorch/vision/issues/4158#issuecomment-950043269
   
   cc @hgt312 





[GitHub] [tvm] lhutton1 commented on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
lhutton1 commented on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950685224


   > ~Tracing MaskRCNN is broken with this release [pytorch/vision#4158 (comment)](https://github.com/pytorch/vision/issues/4158#issuecomment-950043269)~
   > 
   > cc @hgt312
   > 
   > UPDATE: If I import PyTorch before TVM, the error I got during tracing is gone. MaskRCNN still traces correctly.
   
   Thanks for looking into the upgrade @masahi, this will be very useful! I've also experienced the import order issue while attempting to trace a PyTorch model. For example,
   ```python
   import tvm.driver.tvmc as tvmc
   import torch
   
   traced_model = tvmc.load("mobilenetv2_quantized.pth", shape_dict={"input0": [1, 3, 224, 224]})
   ```
   This results in the following:
   ```
   munmap_chunk(): invalid pointer
   Aborted (core dumped)
   ```
   Yet after changing the order of imports, it works perfectly fine. The reason I bring this up is that sometimes it's not possible to simply swap the order of imports to fix the issue, for example when someone is using tvmc directly from the command line. Are you aware of any other ways this could be fixed?
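
Where the import order cannot easily be controlled, it may at least be possible to detect the problematic ordering at runtime. The sketch below relies on `sys.modules` being a regular dict, which keeps insertion (first-import) order on Python 3.7+. It is only an illustration: `imported_before` is a hypothetical helper, not part of TVM or PyTorch, and it diagnoses the ordering rather than fixing the underlying loader issue.

```python
import sys


def imported_before(first, second, modules=None):
    """Return True if `first` was imported before `second`.

    Returns None when either module has not been imported at all.
    `modules` defaults to sys.modules; passing a dict is useful for testing.
    """
    names = list(sys.modules if modules is None else modules)
    if first not in names or second not in names:
        return None
    return names.index(first) < names.index(second)


if __name__ == "__main__":
    # Hypothetical guard for a script that needs PyTorch loaded before TVM.
    if imported_before("tvm", "torch"):
        print("warning: tvm was imported before torch")
```

A check like this could be placed at the top of an entry point so that the crash is at least preceded by a readable warning.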





[GitHub] [tvm] masahi edited a comment on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950044958


   ~~Tracing MaskRCNN is broken with this release https://github.com/pytorch/vision/issues/4158#issuecomment-950043269~~
   
   cc @hgt312 
   
   UPDATE: If I import PyTorch before TVM, the error I got during tracing is gone. MaskRCNN still traces correctly, but we need to add some converters for annoying ops coming from scripting.
   ```
   NotImplementedError: The following operators are not implemented: ['prim::Uninitialized', 'aten::format', 'aten::dim', 'aten::__contains__', 'aten::__isnot__', 'prim::unchecked_cast']
   ```
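
The error above is the kind of report a frontend can produce by comparing the operator names found in a scripted graph against its table of converters. The following is a minimal sketch of that idea only: `find_unsupported_ops`, `check_all_supported`, and the toy converter table are hypothetical and are not TVM's actual implementation.

```python
def find_unsupported_ops(graph_ops, convert_map):
    """Return operator names without a converter, in first-seen order."""
    missing = []
    for op in graph_ops:
        if op not in convert_map and op not in missing:
            missing.append(op)
    return missing


def check_all_supported(graph_ops, convert_map):
    """Raise NotImplementedError listing every operator that lacks a converter."""
    missing = find_unsupported_ops(graph_ops, convert_map)
    if missing:
        raise NotImplementedError(
            "The following operators are not implemented: {}".format(missing)
        )


if __name__ == "__main__":
    # Toy converter table (hypothetical): only one operator is supported.
    convert_map = {"aten::add": lambda inputs: ("add", inputs)}
    graph_ops = ["aten::add", "aten::dim", "aten::format", "aten::dim"]
    print(find_unsupported_ops(graph_ops, convert_map))
```

Under this picture, "adding a converter" amounts to adding an entry to the table for each name in the reported list.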









[GitHub] [tvm] masahi edited a comment on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950044958


   ~~Tracing MaskRCNN is broken with this release https://github.com/pytorch/vision/issues/4158#issuecomment-950043269~~
   
   cc @hgt312 
   
   UPDATE: If I import PyTorch before TVM, the error I got during tracing is gone. MaskRCNN still traces correctly.





[GitHub] [tvm] masahi commented on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950738503


   @lhutton1 Unfortunately, I don't have a solution for this problem. The error I got with PT 1.10 + TVM seemed related to LLVM (it says `Unable to find target for this triple (no targets are registered)` and dies during tracing). But your errors seem much worse :)
   
   Also, I've never hit this problem with older PyTorch versions + TVM. But ONNX + PT did cause me some pain in the past, which was also fixed by swapping the import order: https://github.com/onnx/onnx/issues/2394#issuecomment-581638840. I wonder if this is the same dynamic-loader-related issue. I'd love to dig deeper if this is something we can fix from our side.





[GitHub] [tvm] masahi commented on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950078067


   With a few simple fixes, MaskRCNN can be correctly converted to Relay and passes the test. Other tests are also looking good; fewer than five of them are broken, and they seem easy to fix:
   
   ```
   FAILED test_forward.py::test_forward_deform_conv - tvm._ffi.base.TVMError: Traceback (most recent call last):
   FAILED test_forward.py::test_forward_linspace - tvm._ffi.base.TVMError: Traceback (most recent call last):
   FAILED test_forward.py::test_forward_nll_loss - NotImplementedError: The following operators are not implemented: ['aten::nll_loss_nd']
   ```
   
   So it looks like it's going to be an easy upgrade. 





[GitHub] [tvm] masahi edited a comment on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950765995


   It might be due to our use of `RTLD_GLOBAL`, e.g. https://github.com/apache/tvm/blob/dfe4cebbdadab3d4e6e6ba3951276a51a4ffeaf6/python/tvm/_ffi/base.py#L57
   
   @lhutton1 Try replacing the line above with `ctypes.CDLL(lib_path[0])`; it fixes the issue on my end (the Mask R-CNN test works without swapping the import order).
   
   See related issues in other repos
   https://github.com/dmlc/dgl/issues/2255
   https://github.com/pytorch/pytorch/pull/28536
   https://github.com/pytorch/pytorch/issues/3059
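
For readers unfamiliar with the flag, the difference between the two loading modes can be sketched with `ctypes` alone. This is an illustration, not the TVM source: `load_library` is a hypothetical helper, and the C math library is used only as a stand-in for a real shared library on Linux or macOS. With `RTLD_GLOBAL`, the library's symbols become available for resolving symbols in libraries loaded afterwards, so two libraries that export the same symbol names can end up interfering with each other. Without an explicit mode, `ctypes.CDLL` uses `RTLD_LOCAL` on most platforms.

```python
import ctypes
import ctypes.util


def load_library(lib_path, global_symbols=False):
    """Load a shared library, optionally exposing its symbols globally."""
    mode = ctypes.RTLD_GLOBAL if global_symbols else ctypes.RTLD_LOCAL
    return ctypes.CDLL(lib_path, mode)


if __name__ == "__main__":
    # Stand-in library: the C math library. On POSIX, find_library may
    # return None, in which case CDLL opens the running process itself.
    libm = load_library(ctypes.util.find_library("m"))
    libm.cos.restype = ctypes.c_double
    libm.cos.argtypes = [ctypes.c_double]
    print(libm.cos(0.0))
```

Loading locally keeps each library's symbols private to its own handle, which is consistent with the observation that dropping `RTLD_GLOBAL` made the import order irrelevant.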





[GitHub] [tvm] masahi commented on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-1007864029


   Upgrade to PT v1.10 is now complete.





[GitHub] [tvm] masahi edited a comment on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950765995


   It might be due to our use of `RTLD_GLOBAL`, e.g. https://github.com/apache/tvm/blob/dfe4cebbdadab3d4e6e6ba3951276a51a4ffeaf6/python/tvm/_ffi/base.py#L57
   
   @lhutton1 Try replacing the line above with `ctypes.CDLL(lib_path[0])`; it fixes the issue on my end (the Mask R-CNN test works).
   
   See related issues in other repos
   https://github.com/dmlc/dgl/issues/2255
   https://github.com/pytorch/pytorch/pull/28536
   https://github.com/pytorch/pytorch/issues/3059





[GitHub] [tvm] masahi edited a comment on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950738503


   @lhutton1 Unfortunately, I don't have a solution for this problem. The error I got with PT 1.10 + TVM seemed related to LLVM (it says `Unable to find target for this triple (no targets are registered)` and dies during tracing). I also got an `Aborted (core dumped)` error upon process exit.
   
   Also, I've never hit this problem with older PyTorch versions + TVM. But ONNX + PT did cause me some pain in the past, which was also fixed by swapping the import order: https://github.com/onnx/onnx/issues/2394#issuecomment-581638840. I wonder if this is the same dynamic-loader-related issue. I'd love to dig deeper if this is something we can fix from our side.
   
   Related issues from the PT repo:
   https://github.com/pytorch/pytorch/issues/2507 





[GitHub] [tvm] masahi commented on issue #9349: [Torch, CI] Upgrade to PyTorch 1.10

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #9349:
URL: https://github.com/apache/tvm/issues/9349#issuecomment-950765995


   It might be due to our use of `RTLD_GLOBAL`, e.g. https://github.com/apache/tvm/blob/dfe4cebbdadab3d4e6e6ba3951276a51a4ffeaf6/python/tvm/_ffi/base.py#L57
   
   See related issues in other repos
   https://github.com/dmlc/dgl/issues/2255
   https://github.com/pytorch/pytorch/pull/28536
   https://github.com/pytorch/pytorch/issues/3059






