Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/04/13 23:27:21 UTC
[GitHub] [tvm] haojin2 opened a new pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
haojin2 opened a new pull request #7843:
URL: https://github.com/apache/tvm/pull/7843
This PR fixes a small bug in the PyTorch converter.
Bug reproduction script:
```Python
import torch
from torch import nn
from torch.nn import Linear


class SimpleModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(SimpleModel, self).__init__()
        self.fc = Linear(input_size, output_size)

    def forward(self, x):
        return self.fc(x)


batch_size = 128
dim = 64
T = 50
x = torch.randn((batch_size, T, dim))
model = SimpleModel(dim, 1)
model.eval()
scripted_model = torch.jit.trace(model, x).eval()

import tvm
from tvm import relay

mod, params = relay.frontend.from_pytorch(scripted_model, [("data", [batch_size, T, dim])])
target = tvm.target.Target('cuda -libs=cublas')
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target, params=params)

tvm_ctx = tvm.gpu(0)
rt = tvm.contrib.graph_executor.GraphModule(lib['default'](tvm_ctx))
ndarray_inputs = {
    "data": x.numpy()
}
rt.set_input(**ndarray_inputs)
rt.run()
print(rt.get_output(0).asnumpy())
```
Without this fix (current `main`):
```
python repro.py
Cannot find config for target=cuda -keys=cuda,gpu -libs=cublas -max_num_threads=1024 -thread_warp_size=32, workload=('batch_matmul_cublas.cuda', ('TENSOR', (128, 50, 64), 'float32'), ('TENSOR', (1, 1, 64), 'float32'), (128, 50, 1)). A fallback configuration is used, which may bring great performance regression.
Traceback (most recent call last):
File "repro.py", line 41, in <module>
rt.run()
File "/home/ubuntu/.local/lib/python3.6/site-packages/tvm-0.8.dev846+g81afb14c4-py3.6-linux-x86_64.egg/tvm/contrib/graph_executor.py", line 206, in run
self._run()
File "/home/ubuntu/.local/lib/python3.6/site-packages/tvm-0.8.dev846+g81afb14c4-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
3: TVMFuncCall
2: tvm::runtime::GraphExecutor::Run()
1: std::_Function_handler<void (), tvm::runtime::GraphExecutor::CreateTVMOp(tvm::runtime::TVMOpParam const&, std::vector<DLTensor, std::allocator<DLTensor> > const&, unsigned long)::{lambda()#3}>::_M_invoke(std::_Any_data const&)
0: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::WrapPackedFunc(int (*)(TVMValue*, int*, int, TVMValue*, int*, void*), tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
2: TVMFuncCall
1: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::contrib::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
0: void tvm::contrib::CallBatchGemm<tvm::contrib::CublasSgemmBatchOp>(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, tvm::contrib::CublasSgemmBatchOp)
File "/home/ubuntu/tvm/src/runtime/contrib/cublas/../cblas/gemm_common.h", line 189
File "/home/ubuntu/tvm/src/runtime/library_module.cc", line 78
TVMError:
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
Check failed: ret == 0 (-1 vs. 0) : TVMError:
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
Check failed: BatchCount3D(B) == batch_size (1 vs. 128) :
terminate called after throwing an instance of 'tvm::runtime::InternalError'
what(): [23:18:59] /home/ubuntu/tvm/src/runtime/workspace_pool.cc:118:
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
Check failed: allocated_.size() == 1 (2 vs. 1) :
Stack trace:
0: tvm::runtime::WorkspacePool::~WorkspacePool()
1: __call_tls_dtors
2: 0x00007fdd60f44236
3: exit
4: __libc_start_main
5: _start
6: 0xffffffffffffffff
Aborted (core dumped)
```
With this fix:
```
python repro.py
Cannot find config for target=cuda -keys=cuda,gpu -libs=cublas -max_num_threads=1024 -thread_warp_size=32, workload=('batch_matmul_cublas.cuda', ('TENSOR', (128, 50, 64), 'float32'), ('TENSOR', (128, 1, 64), 'float32'), (128, 50, 1)). A fallback configuration is used, which may bring great performance regression.
[[[-0.21468109]
[ 0.3858583 ]
[ 0.16572809]
...
[-0.03322682]
[ 0.33868816]
[ 0.3021463 ]]
[[ 1.052577 ]
[ 0.26492748]
[ 0.37078723]
...
[-0.0752994 ]
[-0.66205776]
[-0.19348428]]
[[ 0.6743065 ]
[ 0.02969196]
[-0.03708391]
...
[ 0.16056934]
[ 0.41362724]
[ 0.629748 ]]
...
[[-0.05230951]
[-0.3116043 ]
[-0.07618818]
...
[-0.7429178 ]
[ 0.34146884]
[-0.46452078]]
[[ 0.6838716 ]
[-0.0820943 ]
[ 0.01337433]
...
[ 0.6866671 ]
[-0.4317361 ]
[ 0.16978306]]
[[ 0.7288995 ]
[ 0.57882047]
[ 0.40440276]
...
[ 0.36602104]
[ 0.6143365 ]
[ 0.5057366 ]]]
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] masahi commented on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-820737424
Given that #7845 has been merged, I'll close this. Thanks @haojin2
[GitHub] [tvm] jcf94 commented on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
jcf94 commented on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-819176454
@haojin2 Thanks! Nice catch!
Your solution for the [2 dim matrix * 3 dim matrix] case seems to repeat the [2 dim matrix] batch_size times and then apply [batch_matmul]. I'm wondering whether it would be better to merge the first 2 dims of the [3 dim matrix] and run a simple [matmul].
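To make the two strategies concrete, here is a minimal NumPy sketch. It is not the converter code from either PR; the shapes mirror the repro script, and the variable names are illustrative assumptions. It compares repeating the 2-dim weight across the batch with flattening the first two dims of the 3-dim input and running one plain matmul.

```python
import numpy as np

# Shapes taken from the repro script; out_dim = 1 matches Linear(dim, 1).
batch_size, T, dim, out_dim = 128, 50, 64, 1
rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, T, dim)).astype("float32")  # 3-dim input
w = rng.standard_normal((dim, out_dim)).astype("float32")        # 2-dim weight

# Strategy A: repeat the 2-dim matrix batch_size times, then do a batched matmul.
w_repeated = np.tile(w, (batch_size, 1, 1))   # shape (batch_size, dim, out_dim)
out_repeat = np.matmul(x, w_repeated)         # shape (batch_size, T, out_dim)

# Strategy B: merge the first two dims, do a single matmul, restore the shape.
x_flat = x.reshape(-1, dim)                   # shape (batch_size * T, dim)
out_merge = (x_flat @ w).reshape(batch_size, T, out_dim)

print(out_repeat.shape, out_merge.shape)
print(np.allclose(out_repeat, out_merge, atol=1e-4))
```

In this sketch, Strategy B never materializes batch_size copies of the weight.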
[GitHub] [tvm] haojin2 commented on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
haojin2 commented on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-819191793
@jcf94 I think what you say makes sense, I'll make that change.
[GitHub] [tvm] jcf94 commented on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
jcf94 commented on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-819199604
... By the way, there seems to be another PR, #7845, related to this problem.
[GitHub] [tvm] masahi closed pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
masahi closed pull request #7843:
URL: https://github.com/apache/tvm/pull/7843
[GitHub] [tvm] wweic commented on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
wweic commented on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-819206202
Thanks @jcf94 @comaniac for the prompt review.
@haojin2 Looks like #7845 implements the idea @jcf94 suggested. Should we help review #7845 and merge that instead?
[GitHub] [tvm] jcf94 commented on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
jcf94 commented on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-819196060
> Is this related to #7730? The current CuBLAS support for batch_matmul doesn't support implicit broadcasting, but the TE compute does. It would be better to support it on the CuBLAS side without introducing a new op.
I guess this is a different problem, just a bug in the PyTorch frontend.
[GitHub] [tvm] jcf94 removed a comment on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
jcf94 removed a comment on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-819176454
[GitHub] [tvm] comaniac commented on pull request #7843: Fix PyTorch batch_matmul conversion when given (3-dim, 2-dim) input pair
Posted by GitBox <gi...@apache.org>.
comaniac commented on pull request #7843:
URL: https://github.com/apache/tvm/pull/7843#issuecomment-819194151
Is this related to #7730? The current CuBLAS support for batch_matmul doesn't support implicit broadcasting, but the TE compute does. It would be better to support it on the CuBLAS side without introducing a new op.
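As a rough illustration of the difference described here, the sketch below uses NumPy only and does not call TVM or cuBLAS. `strict_batch_matmul` is a hypothetical helper that mimics a backend requiring equal batch counts (similar in spirit to the `BatchCount3D(B) == batch_size` check in the error log earlier in this thread), and `broadcasting_batch_matmul` is a hypothetical helper that lets a batch dimension of 1 broadcast. Both assume the `x @ y^T` layout suggested by the workload shapes in that log.

```python
import numpy as np


def strict_batch_matmul(x, y):
    """Hypothetical backend: requires x and y to have the same batch count."""
    if x.shape[0] != y.shape[0]:
        raise ValueError(f"batch mismatch: {y.shape[0]} vs. {x.shape[0]}")
    # out[b] = x[b] @ y[b].T
    return np.einsum("bmk,bnk->bmn", x, y)


def broadcasting_batch_matmul(x, y):
    """Hypothetical backend: a batch dimension of 1 broadcasts to the other one."""
    batch = max(x.shape[0], y.shape[0])
    x = np.broadcast_to(x, (batch,) + x.shape[1:])
    y = np.broadcast_to(y, (batch,) + y.shape[1:])
    return np.einsum("bmk,bnk->bmn", x, y)


# Shapes from the failing workload: batch count 128 on one side, 1 on the other.
x = np.ones((128, 50, 64), dtype="float32")
y = np.ones((1, 1, 64), dtype="float32")

print(broadcasting_batch_matmul(x, y).shape)
try:
    strict_batch_matmul(x, y)
except ValueError as err:
    print("strict backend rejected the input:", err)
```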