Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/10/07 21:20:18 UTC

[GitHub] [incubator-tvm] qiangxu1996 opened a new issue #6649: Some models not working on Android GPU

qiangxu1996 opened a new issue #6649:
URL: https://github.com/apache/incubator-tvm/issues/6649


   I have recently been testing several models through android_rpc and found that many of them fail on at least one of the OpenCL and Vulkan backends. The results for some of the models are listed below.
   
   |        | `deploy_model_on_android.py` | PyTorch Pretrained Resnet18 | Fast-depth | SC-SfMLearner |
   | ------ | ---------------------------- | --------------------------- | ---------- | ------------- |
   | OpenCL | Fail                         | Success                     | Fail       | Success       |
   | Vulkan | Success                      | Success                     | Fail       | Fail          |
   
   All models run on the CPU without problems. For fast-depth on Vulkan, the mobile side hangs until the watchdog kicks in. For both OpenCL failures, the app crashes with logcat output like the following:
   
   ```
   2020-10-07 16:18:25.702 30873-30900/org.apache.tvm.tvmrpc W/System.err: Load module from /data/user/0/org.apache.tvm.tvmrpc/cache/tvm4j_rpc_7263567248671157602/net.so
   2020-10-07 16:18:28.912 30873-30900/org.apache.tvm.tvmrpc E/libc++abi: terminating with uncaught exception of type std::bad_cast: std::bad_cast
   2020-10-07 16:18:28.912 30873-30900/org.apache.tvm.tvmrpc A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 30900 (Thread-2), pid 30873 (mrpc:RPCProcess)
   2020-10-07 16:18:28.936 30961-30961/? W/crash_dump64: type=1400 audit(0.0:815): avc: denied { search } for name="org.apache.tvm.tvmrpc" dev="sda45" ino=6684883 scontext=u:r:crash_dump:s0:c512,c768 tcontext=u:object_r:app_data_file:s0:c512,c768 tclass=dir permissive=0
   2020-10-07 16:18:28.952 30961-30961/? I/crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
   2020-10-07 16:18:28.953 859-859/? I//system/bin/tombstoned: received crash request for pid 30873
   2020-10-07 16:18:28.955 30961-30961/? I/crash_dump64: performing dump of process 30873 (target tid = 30900)
   2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
   2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: Build fingerprint: 'google/walleye/walleye:8.1.0/OPM2.171026.006.G1/4820017:user/release-keys'
   2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: Revision: 'MP1'
   2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: ABI: 'arm64'
   2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: pid: 30873, tid: 30900, name: Thread-2  >>> org.apache.tvm.tvmrpc:RPCProcess <<<
   2020-10-07 16:18:28.955 30961-30961/? A/DEBUG: signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
   2020-10-07 16:18:28.957 30961-30961/? A/DEBUG: Abort message: 'terminating with uncaught exception of type std::bad_cast: std::bad_cast'
   2020-10-07 16:18:28.957 30961-30961/? A/DEBUG:     x0   0000000000000000  x1   00000000000078b4  x2   0000000000000006  x3   0000000000000008
   2020-10-07 16:18:28.957 30961-30961/? A/DEBUG:     x4   fefeff75ad4fe127  x5   fefeff75ad4fe127  x6   fefeff75ad4fe127  x7   7f7f7f7fff7fff7f
   2020-10-07 16:18:28.957 30961-30961/? A/DEBUG:     x8   0000000000000083  x9   0000000010000000  x10  00000076b128cd00  x11  0000000000000001
   2020-10-07 16:18:28.958 30961-30961/? A/DEBUG:     x12  0000000000000018  x13  0000000000000000  x14  0000000000000000  x15  003668d56858db21
   2020-10-07 16:18:28.958 30961-30961/? A/DEBUG:     x16  00000057b320afa8  x17  0000007748dc352c  x18  00000076a1bad640  x19  0000000000007899
   2020-10-07 16:18:28.958 30961-30961/? A/DEBUG:     x20  00000000000078b4  x21  0000000000000083  x22  ffffff80ffffffc8  x23  00000076b128cef0
   2020-10-07 16:18:28.958 30961-30961/? A/DEBUG:     x24  00000076b128cdd0  x25  00000076b128ce10  x26  00000076a8b460c0  x27  00000076a1bad538
   2020-10-07 16:18:28.958 30961-30961/? A/DEBUG:     x28  00000076a5059f00  x29  00000076b128cd40  x30  0000007748d78760
   2020-10-07 16:18:28.958 30961-30961/? A/DEBUG:     sp   00000076b128cd00  pc   0000007748d78788  pstate 0000000060000000
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG: backtrace:
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #00 pc 000000000001d788  /system/lib64/libc.so (abort+120)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #01 pc 000000000009ce88  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #02 pc 000000000009d07c  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #03 pc 00000000000aead0  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #04 pc 00000000000ae0fc  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #05 pc 00000000000ae058  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so (__cxa_throw+112)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #06 pc 00000000000abe5c  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++.so (std::__1::locale::use_facet(std::__1::locale::id&) const+216)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #07 pc 00000000001d8ed8  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (_ZNSt3__124__put_character_sequenceIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_PKS4_m+160)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #08 pc 0000000001329348  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeLoad(llvm::Instruction const*)+648)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #09 pc 00000000013235e8  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeInstruction(llvm::Instruction const*)+68)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #10 pc 0000000001322710  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeModule(llvm::Module&)+804)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #11 pc 0000000001333800  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::runOnModule(llvm::Module&)+8)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #12 pc 0000000000315768  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::MPPassManager::runOnModule(llvm::Module&)+464)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #13 pc 00000000003163a8  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::PassManagerImpl::run(llvm::Module&)+400)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #14 pc 00000000009bd17c  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::llclib::Compile(llvm::Module*, void* (*)(unsigned int), char**, unsigned int&, llvm::Module*, llvm::CLPrintfInterpreter const*)+5504)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #15 pc 000000000156bf9c  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (clang::clanglib::Codegen(llvm::MemoryBuffer*, cl_compiler_target, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, llvm::OwningArrayPtr<char>&, unsigned int&, cl_rs_compiler_info*)+1164)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #16 pc 0000000001583268  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so ((anonymous namespace)::CompilationModel::link()+6848)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #17 pc 0000000001579950  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (cl_compiler_link_program+364)
   2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #18 pc 0000000000053ae8  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libCB.so (cl_program_link_immediate+832)
   2020-10-07 16:18:29.504 859-859/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_07
   ```
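
   To make the backtrace easier to read, the frames can be grouped by the shared library they belong to. The following is only a rough sketch, not part of TVM: `parse_frames`, `frames_per_library`, and `SAMPLE` are hypothetical names, and the regular expression assumes the debuggerd frame layout seen in the log above (`#NN pc <address>  <path> (<symbol>)`). `SAMPLE` holds three frames copied from that log.

```python
import re
from collections import Counter

# Assumed frame layout, taken from the log above:
#   "#08 pc 0000000001329348  /path/to/lib.so (symbol+offset)"
# The trailing "(symbol+offset)" part is optional.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+pc\s+(?P<pc>[0-9a-fA-F]+)\s+(?P<path>\S+)"
    r"(?:\s+\((?P<symbol>.*)\))?\s*$"
)

# Three frames copied verbatim from the logcat output above.
SAMPLE = """\
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #00 pc 000000000001d788  /system/lib64/libc.so (abort+120)
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #01 pc 000000000009ce88  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libc++_shared.so
2020-10-07 16:18:29.034 30961-30961/? A/DEBUG:     #08 pc 0000000001329348  /data/app/org.apache.tvm.tvmrpc-_gWNahSFGoEEFkEpu901fQ==/lib/arm64/libllvm-qcom.so (llvm::QGPUScalarizationPass::scalarizeLoad(llvm::Instruction const*)+648)
"""


def parse_frames(log_text):
    """Return one dict per backtrace frame found in the logcat text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a backtrace frame (registers, headers, ...)
        frames.append({
            "index": int(match["index"]),
            # Keep only the file name so frames group by library.
            "library": match["path"].rsplit("/", 1)[-1],
            # None when the frame carries no symbol information.
            "symbol": match["symbol"],
        })
    return frames


def frames_per_library(frames):
    """Count how many frames fall into each shared library."""
    return Counter(frame["library"] for frame in frames)


if __name__ == "__main__":
    for library, count in frames_per_library(parse_frames(SAMPLE)).most_common():
        print(f"{count:3d}  {library}")
```

   Running the same functions over the full log should show which library most frames belong to. In the trace above, frames #07 through #17 are all in `libllvm-qcom.so`.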
   
   I believe this is caused by bugs in TVM's mobile GPU support. Attached are a TorchScript model of fast-depth and the corresponding test script for reproducing the bug. The tests were run on a Pixel 2 with Android 8.1; the server side uses the demo_android Docker image with PyTorch 1.4 installed.
   
   [fast-depth.zip](https://github.com/apache/incubator-tvm/files/5343898/fast-depth.zip)
   

