Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/29 07:52:35 UTC

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #18574: Update the onnx-tensorrt submodule

leezu commented on a change in pull request #18574:
URL: https://github.com/apache/incubator-mxnet/pull/18574#discussion_r461244621



##########
File path: ci/docker/docker-compose.yml
##########
@@ -108,6 +108,16 @@ services:
         BASE_IMAGE: nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
       cache_from:
         - ${DOCKER_CACHE_REGISTRY}/build.ubuntu_gpu_cu101:latest
+  ubuntu_gpu_cu102:

Review comment:
       Let's delete the ubuntu_gpu_cu101 then?

##########
File path: ci/docker/docker-compose.yml
##########
@@ -108,6 +108,16 @@ services:
         BASE_IMAGE: nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
       cache_from:
         - ${DOCKER_CACHE_REGISTRY}/build.ubuntu_gpu_cu101:latest
+  ubuntu_gpu_cu102:

Review comment:
       Generally we can't test all possible CUDA versions, so our strategy so far has been to test the oldest and newest supported versions. For simplicity, I'd recommend sticking with the existing strategy and updating all tests to 10.2.
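
       For illustration, updating the service in place (rather than adding a cu102 service alongside cu101) might look like the sketch below; only the keys quoted in the diff above are shown, and the `10.2-cudnn7` base-image tag is an assumption:

       ```yaml
         # Hypothetical sketch: replace the cu101 service instead of adding cu102 next to it.
         ubuntu_gpu_cu102:
           build:
             args:
               BASE_IMAGE: nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
             cache_from:
               - ${DOCKER_CACHE_REGISTRY}/build.ubuntu_gpu_cu102:latest
       ```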

##########
File path: ci/docker/Dockerfile.build.ubuntu
##########
@@ -110,6 +108,15 @@ COPY install/requirements /work/
 RUN python3 -m pip install cmake==3.16.6 && \
     python3 -m pip install -r /work/requirements
 
+RUN git clone --recursive -b 3.5.1.1 https://github.com/google/protobuf.git && \
+    cd protobuf && \
+    ./autogen.sh && \
+    ./configure && \
+    make -j$(nproc) install && \
+    cd .. && \
+    rm -rf protobuf && \
+    ldconfig

Review comment:
       Is the version provided by Ubuntu too old?
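
       As a quick sanity check, one could compare the distro package against the tag built from source. The `3.0.0` figure below is an assumption about bionic's `libprotobuf-dev` (verify with `apt-cache policy libprotobuf-dev`):

       ```shell
       # Hypothetical check: is Ubuntu 18.04's packaged protobuf (assumed 3.0.0)
       # new enough, or must 3.5.1.1 be built from source as in the diff above?
       distro_ver="3.0.0"
       needed_ver="3.5.1"
       oldest=$(printf '%s\n%s\n' "$distro_ver" "$needed_ver" | sort -V | head -n1)
       if [ "$oldest" = "$needed_ver" ]; then
           echo "distro protobuf is new enough"
       else
           echo "distro protobuf too old; build 3.5.1.1 from source"
       fi
       ```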

##########
File path: ci/docker/Dockerfile.build.ubuntu
##########
@@ -110,6 +108,15 @@ COPY install/requirements /work/
 RUN python3 -m pip install cmake==3.16.6 && \
     python3 -m pip install -r /work/requirements
 
+RUN git clone --recursive -b 3.5.1.1 https://github.com/google/protobuf.git && \
+    cd protobuf && \
+    ./autogen.sh && \
+    ./configure && \
+    make -j$(nproc) install && \
+    cd .. && \
+    rm -rf protobuf && \
+    ldconfig

Review comment:
       Did you know that you can reproduce the CI build via `python ci/build.py -R --platform ubuntu_gpu_cu102 /work/runtime_functions.sh build_ubuntu_gpu_tensorrt`?

##########
File path: ci/docker/Dockerfile.build.ubuntu
##########
@@ -110,6 +108,15 @@ COPY install/requirements /work/
 RUN python3 -m pip install cmake==3.16.6 && \
     python3 -m pip install -r /work/requirements
 
+RUN git clone --recursive -b 3.5.1.1 https://github.com/google/protobuf.git && \
+    cd protobuf && \
+    ./autogen.sh && \
+    ./configure && \
+    make -j$(nproc) install && \
+    cd .. && \
+    rm -rf protobuf && \
+    ldconfig

Review comment:
       Did you know that you can reproduce the CI build via `python ci/build.py --platform ubuntu_gpu_cu102 /work/runtime_functions.sh build_ubuntu_gpu_tensorrt`?

##########
File path: ci/docker/Dockerfile.build.ubuntu
##########
@@ -110,6 +108,15 @@ COPY install/requirements /work/
 RUN python3 -m pip install cmake==3.16.6 && \
     python3 -m pip install -r /work/requirements
 
+RUN git clone --recursive -b 3.5.1.1 https://github.com/google/protobuf.git && \
+    cd protobuf && \
+    ./autogen.sh && \
+    ./configure && \
+    make -j$(nproc) install && \
+    cd .. && \
+    rm -rf protobuf && \
+    ldconfig

Review comment:
       Did you know that you can reproduce the CI build via `python ci/build.py --platform ubuntu_gpu_cu102 /work/runtime_functions.sh build_ubuntu_gpu_tensorrt`? If the newer version is really needed, we should also update the documentation.

##########
File path: ci/docker/Dockerfile.build.ubuntu
##########
@@ -137,17 +137,27 @@ RUN cd /usr/local && \
 # https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
 ARG BASE_IMAGE
 RUN export SHORT_CUDA_VERSION=${CUDA_VERSION%.*} && \
+    wget -O nvidia-ml.deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb && \
+    dpkg -i nvidia-ml.deb && \

Review comment:
       Currently the container already ships the machine-learning repo in `/etc/apt/sources.list.d/nvidia-ml.list`. It seems unlikely the container team would break their users by removing it, so is there another reason for re-installing a repository that already exists?
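
       If the re-install is kept as a safety net, it could at least be guarded on the repo list already being present. A dry-run sketch (the `echo` lines stand in for the `wget`/`dpkg -i` calls a real `RUN` step would make):

       ```shell
       # Hypothetical guard: only fetch nvidia-ml.deb when the base image does
       # not already provide the machine-learning apt repository.
       REPO_LIST=/etc/apt/sources.list.d/nvidia-ml.list
       if [ -f "$REPO_LIST" ]; then
           echo "machine-learning repo already present"
       else
           echo "machine-learning repo missing; would install nvidia-ml.deb"
       fi
       ```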

##########
File path: ci/docker/Dockerfile.build.ubuntu
##########
@@ -137,17 +137,27 @@ RUN cd /usr/local && \
 # https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
 ARG BASE_IMAGE
 RUN export SHORT_CUDA_VERSION=${CUDA_VERSION%.*} && \
+    wget -O nvidia-ml.deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb && \
+    dpkg -i nvidia-ml.deb && \

Review comment:
       I noticed the following issue with the container, though I'm not sure about its impact: https://gitlab.com/nvidia/container-images/cuda/-/issues/82




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org