Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/11/02 11:49:50 UTC
[GitHub] [tvm] leandron opened a new pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
leandron opened a new pull request #9425:
URL: https://github.com/apache/tvm/pull/9425
Fix repository URL in ubuntu_install_rocm.sh:
* The ROCm dependency installation process was following outdated procedures.
This PR makes the installation script point to the correct repository.
* Installation documentation is at:
https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html
* Fixes #9413
cc @tqchen @jtuyls @Mousius
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] leandron commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
leandron commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r743550958
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
Interesting, because I checked the version currently in use, and it looks like it is ROCm 4.3.0. That is why I proposed this version specifically:
```
$ docker run -it --rm tlcpack/ci-gpu:v0.78 bash
root@488adea49541:/# dpkg -l | grep rocm
ii rocm-clang-ocl 0.5.0.40300-52 amd64 OpenCL compilation with clang compiler.
ii rocm-cmake 0.5.0.40300-52 amd64 rocm-cmake built using CMake
ii rocm-dbgapi 0.48.0.40300-52 amd64 Library to provide AMD GPU debugger API
ii rocm-debug-agent 2.0.1.40300-52 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent)
ii rocm-dev 4.3.0.40300-52 amd64 Radeon Open Compute (ROCm) Runtime software stack
ii rocm-device-libs 1.0.0.40300-52 amd64 Radeon Open Compute - device libraries
ii rocm-gdb 10.2.40300-52 amd64 ROCgdb
ii rocm-opencl 2.0.0.40300-52 amd64 OpenCL: Open Computing Language on ROCclr
ii rocm-opencl-dev 2.0.0.40300-52 amd64 OpenCL: Open Computing Language on ROCclr
ii rocm-smi-lib 4.0.0.40300-52 amd64 AMD System Management libraries
ii rocm-utils 4.3.0.40300-52 amd64 Radeon Open Compute (ROCm) Runtime software stack
ii rocminfo 1.0.0.40300-52 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool
root@488adea49541:/#
```
I'm not very familiar with ROCm in general, so can you have a look and see what's best for us to do?
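As a rough sketch of the check above (not part of the PR), the ROCm release can be read from the `rocm-dev` package version, assuming the first two dot-separated fields of a version such as `4.3.0.40300-52` are the release. The function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: reduce a ROCm package version such as
# "4.3.0.40300-52" to its "major.minor" release.
# Assumption: the first two dot-separated fields are the release.
rocm_release_from_pkg_version() {
    printf '%s\n' "$1" | cut -d. -f1-2
}

# Hypothetical helper: print the release of the installed rocm-dev package.
# Requires a Debian/Ubuntu system with rocm-dev installed.
installed_rocm_release() {
    rocm_release_from_pkg_version "$(dpkg-query -W -f='${Version}' rocm-dev)"
}
```

Given the `rocm-dev` version in the listing above, `rocm_release_from_pkg_version 4.3.0.40300-52` prints `4.3`.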
[GitHub] [tvm] jtuyls commented on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
jtuyls commented on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-957399046
I was just looking at this too. However, I also had to update lld to lld-12; otherwise, I see the following issue when building the demo_rocm Docker image:
```
The following packages have unmet dependencies:
lld : Depends: lld-14 (>= 14~) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
The command '/bin/sh -c bash /install/ubuntu_install_rocm.sh' returned a non-zero code: 100
ERROR: docker build failed.
```
[GitHub] [tvm] jtuyls commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
jtuyls commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744108401
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
Then the TVM build is fine, but I see the following issue when running an example, because we have LLVM 12 in TVM:
```
E TVMError: Fail to load bitcode file /opt/rocm/amdgcn/bitcode/hc.bc
E line -1:Invalid record (Producer: 'LLVM13.0.0git' Reader: 'LLVM 12.0.1')
```
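The version mismatch can be read directly from the message. A minimal sketch, assuming the error text always has the `Producer: 'LLVM<version>' Reader: 'LLVM <version>'` shape shown above; the function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: extract the LLVM major version that follows a
# given label ("Producer" or "Reader") in a bitcode load error message.
llvm_major_from_error() {
    # $1 = label, $2 = error message
    printf '%s\n' "$2" | sed -n "s/.*$1: 'LLVM *\([0-9][0-9]*\).*/\1/p"
}

# Succeeds (exit status 0) when the bitcode was produced by a newer LLVM
# major version than the one reading it.
bitcode_newer_than_reader() {
    producer=$(llvm_major_from_error Producer "$1")
    reader=$(llvm_major_from_error Reader "$1")
    [ -n "$producer" ] && [ -n "$reader" ] && [ "$producer" -gt "$reader" ]
}

msg="line -1:Invalid record (Producer: 'LLVM13.0.0git' Reader: 'LLVM 12.0.1')"
if bitcode_newer_than_reader "$msg"; then
    echo "bitcode is newer than the LLVM that TVM was built with"
fi
```

For the message above the producer is 13 and the reader is 12, so the check succeeds and the warning is printed.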
I went back and tried a bunch of versions of ROCM and LLVM (upstream, not the one included in ROCM), and this is what I got when running an example with each combination:
```
ROCM 4.3
+ lld-9 + llvm-config-9 -> LLVM ERROR: Unknown specifier in datalayout string
+ lld-10 + llvm-config-10 -> LLVM ERROR: Unknown specifier in datalayout string
+ lld-11 + llvm-config-11 -> LLVM ERROR: Unknown specifier in datalayout string
+ lld-12 + llvm-config-12 -> TVMError: Fail to load bitcode file /opt/rocm/amdgcn/bitcode/hc.bc line -1:Invalid record (Producer: 'LLVM13.0.0git' Reader: 'LLVM 12.0.1')
ROCM 4.2
+ lld-9 + llvm-config-9 -> LLVM ERROR: Unknown specifier in datalayout string
+ lld-10 + llvm-config-10 -> LLVM ERROR: Unknown specifier in datalayout string
+ lld-11 + llvm-config-11 -> LLVM ERROR: Unknown specifier in datalayout string
+ lld-12 + llvm-config-12 -> Check failed: ret == 0 (-1 vs. 0) : TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: hipErrorSharedObjectInitFailed
ROCM 4.1
+ lld-9 + llvm-config-9 -> Works
+ lld-10 + llvm-config-10 -> Works
+ lld-11 + llvm-config-11 -> Works
+ lld-12 + llvm-config-12 -> Check failed: ret == 0 (-1 vs. 0) : TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: hipErrorSharedObjectInitFailed
ROCM 4.0
+ lld-9 + llvm-config-9 -> Works
+ lld-10 + llvm-config-10 -> Works
+ lld-11 + llvm-config-11 -> Works
+ lld-12 + llvm-config-12 -> Works
```
It looks like the last version of ROCM that works across the board is v4.0.
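For reference, the results above can be written as a small lookup. This is only a sketch that encodes the combinations listed in this comment; `rocm_llvm_status` is a hypothetical name, and any combination not listed is reported as untested rather than guessed:

```shell
#!/bin/sh
# Encodes the results reported above; other combinations were not tested.
# $1 = ROCm release (e.g. 4.0), $2 = upstream LLVM/lld major version.
rocm_llvm_status() {
    case "$1:$2" in
        4.0:9|4.0:10|4.0:11|4.0:12) echo works ;;
        4.1:9|4.1:10|4.1:11)        echo works ;;
        4.1:12)                     echo fails ;;
        4.2:9|4.2:10|4.2:11|4.2:12) echo fails ;;
        4.3:9|4.3:10|4.3:11|4.3:12) echo fails ;;
        *)                          echo untested ;;
    esac
}

rocm_llvm_status 4.0 12
rocm_llvm_status 4.3 12
rocm_llvm_status 4.3 13
```

The three calls print `works`, `fails`, and `untested`; LLVM 13 was not part of the runs listed here.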
[GitHub] [tvm] leandron commented on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
leandron commented on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-961360765
I think we fixed it now, can you have a look @jtuyls?
[GitHub] [tvm] leandron commented on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
leandron commented on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-966947788
So it seems the installation is fixed now: https://ci.tlcpack.ai/blue/organizations/jenkins/docker-images-ci%2Fdaily-docker-image-rebuild/detail/daily-docker-image-rebuild/109/pipeline
[GitHub] [tvm] masahi edited a comment on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
masahi edited a comment on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-964755102
> per discussion with @masahi , waiting for @leandron to post a patch to fix LLVM 13 builds.
I should clarify that the patch to fix LLVM 13 is not required. As @jtuyls suggested, we should install llvm 13 in https://github.com/apache/tvm/blob/main/docker/install/ubuntu1804_install_llvm.sh to avoid problems when TVM is built with rocm 4.3.
@leandron So can you update `ubuntu1804_install_llvm.sh` as well? Or update to rocm 4.2 instead, which doesn't require llvm 13.
[GitHub] [tvm] masahi commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
masahi commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744600701
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
OK, I tried building with llvm-13 and actually didn't hit an error. The discussions in https://discuss.tvm.apache.org/t/rocm-target-fails-with-llvm-error/11208/8 and https://github.com/apache/tvm/issues/9319 confused me, but they were probably using an older TVM.
[GitHub] [tvm] areusch commented on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
areusch commented on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-964750728
per discussion with @masahi , waiting for @leandron to post a patch to fix LLVM 13 builds.
[GitHub] [tvm] jtuyls commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
jtuyls commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r743490081
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
I think we should use ROCM version 4.2 here. ROCM version 4.3 includes LLVM 13, which doesn't build with TVM, see https://discuss.tvm.apache.org/t/rocm-target-fails-with-llvm-error/11208/2. ROCM version 4.2 includes LLVM 12, which works.
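To make switching releases a one-line change, the release could be factored out of the repository line. A sketch only: `ROCM_VERSION` and `rocm_apt_source` are hypothetical names not used in the PR, and it assumes the `apt/<release>/` layout from the diff also holds for 4.2:

```shell
#!/bin/sh
# Hypothetical: ROCm release to install; override via the environment.
ROCM_VERSION="${ROCM_VERSION:-4.2}"

# Print the apt source line for a given ROCm release, following the
# pattern used in the diff above.
rocm_apt_source() {
    echo "deb [arch=amd64] https://repo.radeon.com/rocm/apt/$1/ ubuntu main"
}

rocm_apt_source "$ROCM_VERSION"
# In the install script the output would be piped to:
#   sudo tee /etc/apt/sources.list.d/rocm.list
```

Calling `rocm_apt_source 4.3` reproduces the line added in the diff.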
[GitHub] [tvm] jtuyls commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
jtuyls commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744613143
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
If we want to use ROCM 4.3, I think we should add llvm-13 to the install script: https://github.com/apache/tvm/blob/main/docker/install/ubuntu1804_install_llvm.sh before merging. Otherwise, running TVM with ROCM inside the docker images won't work because the last version there is llvm-12.
[GitHub] [tvm] masahi merged pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
masahi merged pull request #9425:
URL: https://github.com/apache/tvm/pull/9425
[GitHub] [tvm] masahi commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
masahi commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r745065510
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
I've never used `ci-gpu` locally, so I'm not sure. But I wouldn't be surprised if rocm tests are not exercised at all.
Yes, if rocm 4.3 is intended to be used with TVM in a Docker image, llvm should also be upgraded to 13.
[GitHub] [tvm] leandron commented on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
leandron commented on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-961085436
> I was just looking at this too. However, I also had to update lld to lld-12; otherwise, I see the following issue when building the demo_rocm Docker image:
>
> ```
> The following packages have unmet dependencies:
> lld : Depends: lld-14 (>= 14~) but it is not going to be installed
> E: Unable to correct problems, you have held broken packages.
> The command '/bin/sh -c bash /install/ubuntu_install_rocm.sh' returned a non-zero code: 100
> ERROR: docker build failed.
> ```
I see. So what do we need to do? Migrate the whole CI to LLVM-12? Can you help me understand what needs to be done, perhaps in separate patches, so that we can safely update ROCm and unblock the update of the Docker images?
[GitHub] [tvm] jtuyls commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
jtuyls commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744615448
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
Btw, there are some rocm tests here: https://github.com/apache/tvm/blob/main/tests/python/unittest/test_target_codegen_rocm.py but I don't think they are actually being run in the ci-gpu regressions. Otherwise, we would have encountered these version issues earlier.
[GitHub] [tvm] leandron commented on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
leandron commented on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-966492561
> > per discussion with @masahi , waiting for @leandron to post a patch to fix LLVM 13 builds.
>
> I should clarify that the patch to fix LLVM 13 is not required. As @jtuyls suggested, we should install llvm 13 in https://github.com/apache/tvm/blob/main/docker/install/ubuntu1804_install_llvm.sh to avoid problems when TVM is built with rocm 4.3.
>
> @leandron So can you update `ubuntu1804_install_llvm.sh` as well? Or update to rocm 4.2 instead, which doesn't require llvm 13.
Yes, I've done this in #9498, as I consider it a separate change.
[GitHub] [tvm] masahi commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
masahi commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744017263
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
We don't have to use rocm's fork of llvm 13. Rocm 4.3 works fine.
[GitHub] [tvm] Mousius commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
Mousius commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r742928940
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,8 +21,8 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
apt-get update && apt-get install -y \
rocm-dev \
lld && \
Review comment:
I think it's just this @leandron:
```suggestion
lld-12 && \
```
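For context, the install step with this suggestion applied can be sketched as a dry run that only prints the commands instead of executing them. The names `ROCM_VERSION`, `LLD_PACKAGE`, `rocm_apt_line` and `print_install_commands` are hypothetical and not part of the PR; the repository URL and package names are taken from the diff above, and any packages that follow `lld-12` in the real script are omitted.

```shell
set -eu

# Hypothetical variables; the values come from the diff and the suggestion above.
ROCM_VERSION="4.3"
LLD_PACKAGE="lld-12"

# Build the apt source line for a given ROCm release.
rocm_apt_line() {
    printf 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/%s/ ubuntu main\n' "$1"
}

# Dry run: print the commands rather than running them, so the sketch
# needs neither root access nor network access.
print_install_commands() {
    echo "echo '$(rocm_apt_line "$ROCM_VERSION")' | sudo tee /etc/apt/sources.list.d/rocm.list"
    echo "apt-get update && apt-get install -y rocm-dev ${LLD_PACKAGE}"
}

print_install_commands
```

Keeping the ROCm release and the lld package in variables would make it easier to change both together if the pinned versions need to move in step.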
[GitHub] [tvm] areusch commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
areusch commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r746216146
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
From tests/scripts/task_python_unittest_gpuonly.sh:
```
export TVM_TEST_TARGETS="cuda;opencl;metal;rocm;nvptx;opencl -device=mali,aocl_sw_emu"
```
These are the targets exercised on ci-gpu, so it looks like rocm should be among them.
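A minimal sketch of checking that list for a given target, assuming entries are separated by `;` and matched as whole entries. The helper `has_target` is hypothetical and is not part of the TVM scripts.

```shell
set -eu

# Targets string copied from the snippet above.
TVM_TEST_TARGETS="cuda;opencl;metal;rocm;nvptx;opencl -device=mali,aocl_sw_emu"

# Return 0 if the target named in $1 is one of the ';'-separated entries of $2.
has_target() {
    case ";$2;" in
        *";$1;"*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_target rocm "$TVM_TEST_TARGETS"; then
    echo "rocm is listed"
else
    echo "rocm is not listed"
fi
```

Wrapping the list in leading and trailing `;` lets the first and last entries match the same pattern as the ones in the middle.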
[GitHub] [tvm] masahi commented on pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#issuecomment-964755102
> per discussion with @masahi , waiting for @leandron to post a patch to fix LLVM 13 builds.
I should clarify that the patch to fix LLVM 13 is not required. As @jtuyls suggested, we should install LLVM 13 in https://github.com/apache/tvm/blob/main/docker/install/ubuntu1804_install_llvm.sh to avoid problems when TVM is built with ROCm 4.3.
@leandron So can you update `ubuntu1804_install_llvm.sh` as well? Or update to ROCm 4.2 instead.
[GitHub] [tvm] leandron commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
leandron commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744543873
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
> @leandron can you also add the llvm 13 build fix in my discuss post above?
I think these are quite separate changes. Would you mind submitting your LLVM 13 fix in a separate PR? We can review and merge it, then rebase and merge this one.
[GitHub] [tvm] masahi commented on a change in pull request #9425: Fix repository URL in ubuntu_install_rocm.sh
Posted by GitBox <gi...@apache.org>.
masahi commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744166777
##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
set -o pipefail
# Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
Review comment:
Yes, ROCm 4.3 apparently requires LLVM 13. I confirmed that ROCm 4.3 + the upstream LLVM 13 works, but to build TVM with the upstream LLVM 13, we need to fix one line in `codegen_llvm.cc`:
https://discuss.tvm.apache.org/t/rocm-target-fails-with-llvm-error/11208/6
There is also an open issue for building with LLVM 13: https://github.com/apache/tvm/issues/9319
@leandron can you also add the llvm 13 build fix in my discuss post above?
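A small sketch of how an install script could guard against an LLVM that is too old for this combination. The helpers `llvm_major` and `llvm_at_least` are hypothetical, and the required major version of 13 is an assumption taken from the comment above rather than from ROCm documentation.

```shell
set -eu

# Extract the major version from a string such as "13.0.1".
llvm_major() {
    echo "${1%%.*}"
}

# Return 0 if the LLVM version in $1 is at least the major version in $2.
llvm_at_least() {
    [ "$(llvm_major "$1")" -ge "$2" ]
}

# In a real script the version could come from `llvm-config --version`;
# a literal is used here to keep the sketch self-contained.
LLVM_VERSION="12.0.1"
REQUIRED_MAJOR=13   # assumption: rocm 4.3 needs llvm 13, per the comment above

if llvm_at_least "$LLVM_VERSION" "$REQUIRED_MAJOR"; then
    echo "LLVM $LLVM_VERSION is new enough"
else
    echo "LLVM $LLVM_VERSION is older than $REQUIRED_MAJOR"
fi
```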