Posted to discuss-archive@tvm.apache.org by Sergio via TVM Discuss <no...@discuss.tvm.ai> on 2020/04/16 19:54:53 UTC
[TVM Discuss] [Questions] ROCm 'segmentation fault' error when auto-tuning
When I run a modified version of the tutorial file "tune_relay_cuda.py" (using target = "rocm"), I get the following error some time after auto-tuning starts:
Tuning...
Task(func_name=topi_nn_conv2d, args=(('TENSOR', (1, 512, 14, 14), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'float32'), kwargs={}, workload=('conv2d', (1, 512, 14, 14, 'float32'), (512, 512, 3, 3, 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'float32'))
rocm
[Task 1/ 9] Current/Best: 21.86/3183.80 GFLOPS | Progress: (60/100) | 174.55 s
Segmentation fault (core dumped)
I am using a Vega 20 AMD GPU. Should I add the `-model xx` definition to the target to avoid this?
Has anybody experienced the same issue in the past? Any information on this issue would be greatly appreciated.
---
[Visit Topic](https://discuss.tvm.ai/t/rocm-segmentation-fault-error-when-auto-tuning/6402/1) to respond.
You are receiving this because you enabled mailing list mode.
To unsubscribe from these emails, [click here](https://discuss.tvm.ai/email/unsubscribe/34b673dccab46cb884eb65902e6915fafa25aa8f2764d7f7e8cb45576a0a4c47).
[TVM Discuss] [Questions] ROCm 'segmentation fault' error when
auto-tuning
Posted by Sergio via TVM Discuss <no...@discuss.tvm.ai>.
Downgrading to xgboost 0.90 fixed the segmentation fault issue!
Thanks a lot @t-vi
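For readers applying the same workaround, here is a minimal sketch of how one might confirm which xgboost version is actually installed before re-running the tuner. The helper names `installed_version` and `matches_pin` are hypothetical, and the sketch assumes Python 3.8+ (for `importlib.metadata`) and that the pin is the 0.90 release mentioned above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed: Optional[str], pinned: str = "0.90") -> bool:
    """True when the installed version string equals the pinned one."""
    return installed is not None and installed.strip() == pinned


if __name__ == "__main__":
    found = installed_version("xgboost")
    if matches_pin(found):
        print(f"xgboost {found} matches the pinned version")
    else:
        # The pin itself is applied outside Python, e.g.:
        #   pip install xgboost==0.90
        print(f"xgboost is {found!r}, which differs from the pinned version")
```

Running the check in the same environment as the tuning script makes sure the downgrade applies to the interpreter that TVM uses.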
---
Posted by Thomas V via TVM Discuss <no...@discuss.tvm.ai>.
Currently, we use the CUDA schedule (and op) on ROCm:
https://github.com/apache/incubator-tvm/blob/2cd987d92724be0f859bfb624ce797f9c70167bb/python/tvm/relay/op/strategy/rocm.py#L47-L50
---
Posted by Sergio via TVM Discuss <no...@discuss.tvm.ai>.
Hi @t-vi,
I have one follow-up question. Do you know the location of the file defining the schedule for the ROCm backend conv2d? So far I have checked the file in the link below, but I haven't been able to find the schedule template. I would appreciate any information in this regard.
https://github.com/apache/incubator-tvm/blob/2cd987d92724be0f859bfb624ce797f9c70167bb/topi/python/topi/rocm/conv2d.py
---
Posted by Thomas V via TVM Discuss <no...@discuss.tvm.ai>.
Given that it happens after 60 steps, this might not be ROCm but rather the xgboost module. In that case, upgrading to the pre-release or downgrading helps.
https://github.com/apache/incubator-tvm/issues/4953#issuecomment-619255802
That said, we also fixed a potential segfault in the AMDGPU LLVM codegen last week, so upgrading to the latest TVM master might be a good idea.
Best regards
Thomas
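One way to test this hypothesis, sketched under assumptions: rerun the same task with a tuner that does not use xgboost. If the crash disappears, the cost model is the likelier culprit; if it persists, the ROCm side deserves a closer look. The helpers `tuner_class_name` and `make_tuner` below are hypothetical. The `"xgb"` and `"random"` keys mirror the `tuner` option in the tune_relay_cuda.py tutorial, and the classes are looked up in `tvm.autotvm.tuner`.

```python
# Map the tutorial's `tuner` option to AutoTVM tuner class names.
TUNER_CLASSES = {
    "xgb": "XGBTuner",       # model-based, trains an xgboost cost model
    "random": "RandomTuner",  # no cost model, so xgboost is not exercised
}


def tuner_class_name(kind: str) -> str:
    """Return the AutoTVM tuner class name for a tutorial-style option."""
    try:
        return TUNER_CLASSES[kind]
    except KeyError:
        raise ValueError(f"unknown tuner kind: {kind!r}") from None


def make_tuner(task, kind: str = "random"):
    """Build a tuner for `task`; TVM is imported lazily, only when called."""
    from tvm import autotvm

    cls = getattr(autotvm.tuner, tuner_class_name(kind))
    if kind == "xgb":
        # The tutorial constructs XGBTuner with a rank loss.
        return cls(task, loss_type="rank")
    return cls(task)
```

In the tuning loop, replacing the tuner construction with `make_tuner(task, "random")` keeps the measurement path unchanged while leaving xgboost out of the run.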
---
Posted by tqchen via TVM Discuss <no...@discuss.tvm.ai>.
You will need to compile with the MIOpen header in your include path. Alternatively, you can remove miopen.cc; this won’t affect the autotvm part.
---
Posted by Sergio via TVM Discuss <no...@discuss.tvm.ai>.
Hi @tqchen
Thank you for your prompt reply. I am following the instructions in the link you sent but, when executing the Makefile, I get
rocm_runtime_pack.cc:33:52: fatal error: ../../src/contrib/miopen/conv_forward.cc: No such file or directory
I noticed that the directory
../../src/contrib/miopen
does not exist. I could find the missing file in
../../src/runtime/contrib/miopen/
Maybe I should just modify the path accordingly and run make?
---
Posted by tqchen via TVM Discuss <no...@discuss.tvm.ai>.
You will need to set up an RPC server explicitly, as per https://github.com/apache/incubator-tvm/tree/master/apps/rocm_rpc, due to a limitation of the ROCm driver.