Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/10/07 21:20:18 UTC
[GitHub] [incubator-tvm] qiangxu1996 opened a new issue #6649: Some models not working on Android GPU
qiangxu1996 opened a new issue #6649:
URL: https://github.com/apache/incubator-tvm/issues/6649
I have recently been trying different models and testing them on android_rpc. However, I found that many of them fail on OpenCL, Vulkan, or both. The outcomes for some of the models are listed below.
| | `deploy_model_on_android.py` | PyTorch Pretrained Resnet18 | Fast-depth | SC-SfMLearner |
| ------ | ---------------------------- | --------------------------- | ---------- | ------------- |
| OpenCL | Fail | Success | Fail | Success |
| Vulkan | Success | Success | Fail | Fail |
All models work on the CPU without problems. For fast-depth running on Vulkan, the mobile side hangs until the watchdog wakes up. For both OpenCL failures, the crash messages from logcat look like this:
```
2020-10-07 16:18:25.702 30873-30900/org.apache.tvm.tvmrpc W/System.err: Load module from /data/user/0/org.apache.tvm.tvmrpc/cache/tvm4j_rpc_7263567248671157602/net.so
2020-10-07 16:18:28.912 30873-30900/org.apache.tvm.tvmrpc E/libc++abi: terminating with uncaught exception of type std::bad_cast: std::bad_cast
2020-10-07 16:18:28.912 30873-30900/org.apache.tvm.tvmrpc A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 30900 (Thread-2), pid 30873 (mrpc:RPCProcess)
2020-10-07 16:18:28.936 30961-30961/? W/crash_dump64: type=1400 audit(0.0:815): avc: denied { search } for name="org.apache.tvm.tvmrpc" dev="sda45" ino=6684883 scontext=u:r:crash_dump:s0:c512,c768 tcontext=u:object_r:app_data_file:s0:c512,c768 tclass=dir permissive=0
2020-10-07 16:18:28.952 30961-30961/? I/crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
2020-10-07 16:18:28.953 859-859/? I//system/bin/tombstoned: received crash request for pid 30873
2020-10-07 16:18:28.955 30961-30961/? I/crash_dump64: performing dump of process 30873 (target tid = 30900)
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: Build fingerprint: 'google/walleye/walleye:8.1.0/OPM2.171026.006.G1/4820017:user/release-keys'
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: Revision: 'MP1'
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: ABI: 'arm64'
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: pid: 30873, tid: 30900, name: Thread-2 >>> org.apache.tvm.tvmrpc:RPCProcess <<<
2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: Abort message: 'terminating with uncaught exception of type std::bad_cast: std::bad_cast'
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: x0 0000000000000000 x1 00000000000078b4 x2 0000000000000006 x3 0000000000000008
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: x4 fefeff75ad4fe127 x5 fefeff75ad4fe127 x6 fefeff75ad4fe127 x7 7f7f7f7fff7fff7f
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: x8 0000000000000083 x9 0000000010000000 x10 00000076b128cd00 x11 0000000000000001
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x12 0000000000000018 x13 0000000000000000 x14 0000000000000000 x15 003668d56858db21
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x16 00000057b320afa8 x17 0000007748dc352c x18 00000076a1bad640 x19 0000000000007899
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x20 00000000000078b4 x21 0000000000000083 x22 ffffff80ffffffc8 x23 00000076b128cef0
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x24 00000076b128cdd0 x25 00000076b128ce10 x26 00000076a8b460c0 x27 00000076a1bad538
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: x28 00000076a5059f00 x29 00000076b128cd40 x30 0000007748d78760
2020-10-07 16:18:28.958 30961-30961/? A/DEBUG: sp 00000076b128cd00 pc 0000007748d78788 pstate 0000000060000000
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: backtrace:
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #00 pc 000000000001d788 /system/lib64/libc.so (abort+120)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #01 pc 000000000009ce88 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #02 pc 000000000009d07c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #03 pc 00000000000aead0 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #04 pc 00000000000ae0fc /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #05 pc 00000000000ae058 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so (__cxa_throw+112)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #06 pc 00000000000abe5c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++.so (std::__1::locale::use_facet(std::__1::locale::id&) const+216)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #07 pc 00000000001d8ed8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (_ZNSt3__124__put_character_sequenceIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_PKS4_m+160)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #08 pc 0000000001329348 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeLoad(llvm::Instruction const*)+648)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #09 pc 00000000013235e8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeInstruction(llvm::Instruction const*)+68)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #10 pc 0000000001322710 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeModule(llvm::Module&)+804)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #11 pc 0000000001333800 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::runOnModule(llvm::Module&)+8)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #12 pc 0000000000315768 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::MPPassManager::runOnModule(llvm::Module&)+464)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #13 pc 00000000003163a8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::PassManagerImpl::run(llvm::Module&)+400)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #14 pc 00000000009bd17c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::llclib::Compile(llvm::Module*, void* (*)(unsigned int), char**, unsigned int&, llvm::Module*, llvm::CLPrintfInterpreter const*)+5504)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #15 pc 000000000156bf9c /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (clang::clanglib::Codegen(llvm::MemoryBuffer*, cl_compiler_target, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, llvm::OwningArrayPtr<char>&, unsigned int&, cl_rs_compiler_info*)+1164)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #16 pc 0000000001583268 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so ((anonymous namespace)::CompilationModel::link()+6848)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #17 pc 0000000001579950 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (cl_compiler_link_program+364)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #18 pc 0000000000053ae8 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libCB.so (cl_program_link_immediate+832)
2020-10-07 16:18:29.504 859-859/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_07
```
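Since the backtrace is long, a small script can reduce it to the abort message and the library each frame belongs to. The following is only a minimal sketch and is not part of the attached test script: `summarize_tombstone` and `SAMPLE` are hypothetical names, and the regular expressions assume the debuggerd line format shown in the log above.

```python
import re

# Matches debuggerd backtrace lines such as:
#   "... A/DEBUG: #00 pc 000000000001d788 /system/lib64/libc.so (abort+120)"
# The trailing "(symbol+offset)" part is optional, since some frames are unsymbolized.
FRAME_RE = re.compile(
    r"#(?P<idx>\d+)\s+pc\s+(?P<pc>[0-9a-f]+)\s+(?P<lib>\S+)"
    r"(?:\s+\((?P<sym>.*)\))?\s*$"
)
ABORT_RE = re.compile(r"Abort message: '(?P<msg>.*)'")

# Three lines copied from the log above, used as a small sample input.
SAMPLE = """\
2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: Abort message: 'terminating with uncaught exception of type std::bad_cast: std::bad_cast'
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #00 pc 000000000001d788 /system/lib64/libc.so (abort+120)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #01 pc 000000000009ce88 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: #08 pc 0000000001329348 /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeLoad(llvm::Instruction const*)+648)
"""


def summarize_tombstone(log_text):
    """Return (abort_message, frames) parsed from logcat text.

    Each frame is a tuple (index, library_basename, symbol_or_None).
    """
    abort = None
    frames = []
    for line in log_text.splitlines():
        m = ABORT_RE.search(line)
        if m:
            abort = m.group("msg")
            continue
        m = FRAME_RE.search(line)
        if m:
            # Keep only the file name of the shared library, not its full path.
            lib = m.group("lib").rsplit("/", 1)[-1]
            frames.append((int(m.group("idx")), lib, m.group("sym")))
    return abort, frames


if __name__ == "__main__":
    abort, frames = summarize_tombstone(SAMPLE)
    print("abort:", abort)
    for idx, lib, sym in frames:
        print(f"#{idx:02d} {lib} {sym or '<no symbol>'}")
```

Applied to the full log, this makes it easier to see that frames #07 through #17 sit in `libllvm-qcom.so`, inside `QGPUScalarizationPass` and below `cl_compiler_link_program`, which suggests (but does not prove) that the abort happens while the OpenCL kernels are being compiled on the device.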
I believe this is caused by bugs in TVM's mobile GPU support. Attached are a TorchScript model of fast-depth and the corresponding test script for reproducing the bug. The test was performed on a Pixel 2 running Android 8.1, while the server side used the demo_android Docker image with PyTorch 1.4 installed.
[fast-depth.zip](https://github.com/apache/incubator-tvm/files/5343898/fast-depth.zip)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org