Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/08/25 17:59:14 UTC

[GitHub] [incubator-mxnet] Kh4L opened a new pull request #19011: [WIP] TensorRT: add INT8 with calibration

Kh4L opened a new pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011


   ## Description ##
   This PR adds INT8 with calibration support to MXNet-TensorRT.
   It enables TensorRT's internal optimization to build an INT8 engine (one that will contain INT8 kernels where they are faster than the FP16 or FP32 ones).
   In this first version, the quantization and de-quantization values are computed during the calibration phase. During this phase, which lasts for the number of iterations set by `calibration_iters`, the user is expected to provide samples representative of the inference data; these are used to calibrate the engine. Inference is slower during this phase.
   Once the calibration is done, the MXNet-TensorRT inference model is ready for fast inference with INT8.
   
   Saving and loading of the calibration tables will be added in a later PR.
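
To make the idea of "computing quantization values during calibration" concrete, here is a minimal sketch in plain Python. It is not the code in this PR and not what TensorRT does internally (TensorRT's calibrators compute ranges themselves and may use a different rule, for example an entropy-based one). It assumes a simple symmetric max-abs scheme, and every name in it (`calibrate_scale`, `quantize`, `dequantize`, `INT8_MAX`) is illustrative only.

```python
INT8_MAX = 127  # largest magnitude used by symmetric INT8 quantization


def calibrate_scale(batches):
    """Derive one scale from the largest absolute value seen in the calibration batches."""
    max_abs = max((abs(v) for batch in batches for v in batch), default=0.0)
    # Fall back to 1.0 so that an all-zero calibration set does not divide by zero later.
    return max_abs / INT8_MAX if max_abs > 0 else 1.0


def quantize(values, scale):
    """Map floats to integers in [-127, 127], clipping anything outside the calibrated range."""
    return [max(-INT8_MAX, min(INT8_MAX, round(v / scale))) for v in values]


def dequantize(quantized, scale):
    """Map INT8 values back to approximate floats."""
    return [q * scale for q in quantized]
```

The point of the sketch is that the scale depends entirely on the samples fed in during calibration, which is why those samples should resemble the real inference data: values outside the calibrated range are clipped.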
   
   ## Usage ##
   WIP
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486608559



##########
File path: src/operator/subgraph/tensorrt/onnx_to_tensorrt.cc
##########
@@ -18,7 +18,7 @@
  */
 
 /*!
- * Copyright (c) 2019 by Contributors
+ * Copyright (c) 2020 by Contributors
  * \file onnx_to_tensorrt.cc
  * \brief TensorRT integration with the MXNet executor
  * \author Marek Kolodziej, Clement Fuji Tsang

Review comment:
       If you are changing the copyright years then add yourself to the authors too ;-)







[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486616070



##########
File path: src/operator/subgraph/tensorrt/tensorrt_int8_calibrator.cc
##########
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * Copyright (c) 2020 by Contributors
+ * \file tensorrt-inl.h
+ * \brief TensorRT operation registration
+ * \author Serge Panev
+*/
+
+#if MXNET_USE_TENSORRT
+
+#include "./tensorrt_int8_calibrator.h"
+
+#include <atomic>
+#include <unordered_map>
+
+namespace onnx_to_tensorrt {
+
+// set the batch size before constructing the thread to execute engine
+int TRTInt8Calibrator::getBatchSize() const { return batch_size_; }
+
+TRTInt8Calibrator::TRTInt8Calibrator(
+    std::unordered_map<std::string, mxnet::NDArray> params_map,
+    std::unordered_map<std::string, std::pair<void*, size_t>> input_buffers,
+    int batch_size, int n_iter)
+    : batch_size_(batch_size),
+      done_(false),
+      params_map_(params_map),
+      input_buffers_(std::move(input_buffers)),
+      // Make sure setBatch() waits until getBatch() is called (the first time).
+      calib_running_(true),
+      batch_is_set_(false),
+      n_iter_(n_iter) {}
+
+bool TRTInt8Calibrator::setBatch(const std::unordered_map<std::string, void*>& data,
+                                 const cudaStream_t stream) {
+  std::unique_lock<std::mutex> lk(mutex_);
+  // Wait while the queue is full or calibration is running.
+  cv_.wait(lk, [&]{ return (!calib_running_ && !batch_is_set_) || done_; });
+  if (done_)
+    return false;
+  n_iter_--;
+
+  for (const auto& it : data) {
+    auto in_it = input_buffers_.find(it.first);
+    if (in_it == input_buffers_.end()) {
+      LOG(FATAL) << "TensorRT op input name '" << it.first
+                 << "' does not match with the buffer names";
+    }
+    const auto& buff_and_size = in_it->second;
+    auto status = cudaMemcpyAsync(buff_and_size.first, it.second, buff_and_size.second,
+                                  cudaMemcpyDeviceToDevice, stream);
+    if (status != cudaSuccess) {
+      LOG(FATAL) << "cudaMemcpy in  TensorRT op for '" << it.first
+                 << "' failed with " << status;
+    }
+  }
+  // TODO(spanev): see if we can use something like cudaStreamAddCallback here
+  cudaStreamSynchronize(stream);

Review comment:
       This also could return an error.
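
For readers following the synchronization in the hunk above, the wait predicate in `setBatch` can be modelled in Python with `threading.Condition`. This is only a sketch: the quoted diff shows `setBatch` alone, so the consumer side (`get_batch`) and the shutdown step (`wait_and_set_done`) are assumptions about how such a handshake is usually completed, and the class name `CalibratorHandshake` is hypothetical. No CUDA copies or error handling are modelled.

```python
import threading


class CalibratorHandshake:
    """Toy model of the producer/consumer handshake; not the MXNet class."""

    def __init__(self):
        self._cv = threading.Condition()
        # Start as "running" so set_batch() blocks until get_batch() is first called.
        self._calib_running = True
        self._batch_is_set = False
        self._done = False
        self._batch = None

    def _slot_is_free(self):
        # Same shape as the predicate in the quoted setBatch().
        return (not self._calib_running and not self._batch_is_set) or self._done

    def set_batch(self, data):
        """Producer side: wait for a free slot, then publish one batch."""
        with self._cv:
            self._cv.wait_for(self._slot_is_free)
            if self._done:
                return False
            self._batch = data
            self._batch_is_set = True
            self._cv.notify_all()
            return True

    def get_batch(self):
        """Consumer side (assumed counterpart): returns None once calibration is done."""
        with self._cv:
            # Signal that the previous batch has been fully processed.
            self._calib_running = False
            self._cv.notify_all()
            self._cv.wait_for(lambda: self._batch_is_set or self._done)
            if self._done:
                return None
            batch = self._batch
            self._batch_is_set = False
            self._calib_running = True
            return batch

    def wait_and_set_done(self):
        """Producer side (assumed): let the last batch be consumed, then stop."""
        with self._cv:
            self._cv.wait_for(self._slot_is_free)
            self._done = True
            self._cv.notify_all()


def run_calibration(batches):
    """Feed batches through the handshake and return what the consumer saw."""
    handshake = CalibratorHandshake()
    consumed = []

    def consumer():
        while True:
            batch = handshake.get_batch()
            if batch is None:
                break
            consumed.append(batch)

    worker = threading.Thread(target=consumer)
    worker.start()
    for batch in batches:
        handshake.set_batch(batch)
    handshake.wait_and_set_done()
    worker.join()
    return consumed
```

Calling `run_calibration` with a list of batches returns them in the order they were fed, because the producer cannot publish a new batch until the consumer has asked for the next one.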







[GitHub] [incubator-mxnet] szha commented on a change in pull request #19011: TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
szha commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r482673794



##########
File path: ci/docker/runtime_functions.sh
##########
@@ -1069,6 +1069,22 @@ unittest_ubuntu_python3_gpu_nocudnn() {
     nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
 }
 
+unittest_ubuntu_tensorrt_gpu() {
+    set -ex
+    export PYTHONPATH=./python/
+    export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
+    export MXNET_SUBGRAPH_VERBOSE=0
+    export LD_LIBRARY_PATH=/work/mxnet/lib:$LD_LIBRARY_PATH
+    export CUDNN_VERSION=${CUDNN_VERSION:-7.0.3}
+    export MXNET_ENABLE_CYTHON=0
+    export DMLC_LOG_STACK_TRACE_DEPTH=10
+    pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100==0.24

Review comment:
       this address returns 404 for me: https://developer.download.nvidia.com/compute/redist







[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486614971



##########
File path: src/operator/subgraph/tensorrt/tensorrt.cc
##########
@@ -289,6 +293,21 @@ OpStatePtr TRTCreateState(const nnvm::NodeAttrs& attrs, Context ctx,
     } else if (it_inputs != inputs_to_idx.end()) {
       shape_inputs[i] = in_shape[it_inputs->second];
       dtype_inputs[i] = in_type[it_inputs->second];
+      if (tensorrt_int8) {
+        int dtype_size;
+        if (dtype_inputs[i] == mshadow::kFloat32) {
+          dtype_size = 4;
+        } else if (dtype_inputs[i] == mshadow::kFloat16) {
+          dtype_size = 2;
+        } else {
+          LOG(FATAL) << "TensorRT op supports only float32 and float16 inputs.";
+        }
+        size_t buffer_size = shape_inputs[i].Size() * dtype_size;
+        void *ptr;
+        cudaMalloc(&ptr, buffer_size);

Review comment:
       Can we either use the storage manager or at least check for errors here?







[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486610844



##########
File path: src/operator/subgraph/tensorrt/onnx_to_tensorrt.h
##########
@@ -20,7 +20,7 @@
  */
 
 /*!
- * Copyright (c) 2019 by Contributors
+ * Copyright (c) 2020 by Contributors

Review comment:
       2019 - 2020







[GitHub] [incubator-mxnet] Kh4L commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
Kh4L commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r487700435



##########
File path: ci/docker/runtime_functions.sh
##########
@@ -1069,6 +1069,22 @@ unittest_ubuntu_python3_gpu_nocudnn() {
     nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
 }
 
+unittest_ubuntu_tensorrt_gpu() {
+    set -ex
+    export PYTHONPATH=./python/
+    export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
+    export MXNET_SUBGRAPH_VERBOSE=0
+    export LD_LIBRARY_PATH=/work/mxnet/lib:$LD_LIBRARY_PATH
+    export CUDNN_VERSION=${CUDNN_VERSION:-7.0.3}
+    export MXNET_ENABLE_CYTHON=0
+    export DMLC_LOG_STACK_TRACE_DEPTH=10
+    pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100==0.24
+    wget -nc http://data.mxnet.io/data/val_256_q90.rec
+    python3.6 tests/python/tensorrt/rec2idx.py val_256_q90.rec val_256_q90.idx
+    nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_trt_gpu.xml --verbose --nocapture tests/python/tensorrt/

Review comment:
       @szha is it fine to leave this as nosetests, since this is the 1.8 branch?







[GitHub] [incubator-mxnet] Kh4L commented on pull request #19011: [WIP] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
Kh4L commented on pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#issuecomment-685011377


   @mxnet-bot run ci [sanity]





[GitHub] [incubator-mxnet] marcoabreu commented on a change in pull request #19011: [WIP] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
marcoabreu commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r476750212



##########
File path: ci/docker/runtime_functions.sh
##########
@@ -1069,6 +1069,22 @@ unittest_ubuntu_python3_gpu_nocudnn() {
     nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
 }
 
+unittest_ubuntu_tensorrt_gpu() {
+    set -ex
+    export PYTHONPATH=./python/
+    export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
+    export MXNET_SUBGRAPH_VERBOSE=0
+    export LD_LIBRARY_PATH=/work/mxnet/lib:$LD_LIBRARY_PATH
+    export CUDNN_VERSION=${CUDNN_VERSION:-7.0.3}
+    export MXNET_ENABLE_CYTHON=0
+    export DMLC_LOG_STACK_TRACE_DEPTH=10
+    pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100==0.24

Review comment:
       @leezu could you assist in reviewing this PR? I think the structure with respect to installing packages and creating a new job might not be in line.







[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486607836



##########
File path: tests/python/tensorrt/rec2idx.py
##########
@@ -0,0 +1,107 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       DALI already has this script; why include it here too?







[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486613800



##########
File path: src/operator/subgraph/tensorrt/tensorrt-inl.h
##########
@@ -56,17 +58,29 @@ struct TRTParam {
   std::unordered_map<std::string, uint32_t> inputs_to_idx;
   std::unordered_map<std::string, uint32_t> outputs_to_idx;
   std::unordered_map<std::string, NDArray> params_map;
+  bool fp16_mode;
+  bool int8_mode;

Review comment:
       Maybe `TRTPrecision` enum?







[GitHub] [incubator-mxnet] ptrendx merged pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx merged pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011


   





[GitHub] [incubator-mxnet] samskalicky commented on a change in pull request #19011: TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
samskalicky commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r482684492



##########
File path: ci/docker/runtime_functions.sh
##########
@@ -1069,6 +1069,22 @@ unittest_ubuntu_python3_gpu_nocudnn() {
     nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
 }
 
+unittest_ubuntu_tensorrt_gpu() {
+    set -ex
+    export PYTHONPATH=./python/
+    export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
+    export MXNET_SUBGRAPH_VERBOSE=0
+    export LD_LIBRARY_PATH=/work/mxnet/lib:$LD_LIBRARY_PATH
+    export CUDNN_VERSION=${CUDNN_VERSION:-7.0.3}
+    export MXNET_ENABLE_CYTHON=0
+    export DMLC_LOG_STACK_TRACE_DEPTH=10
+    pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100==0.24
+    wget -nc http://data.mxnet.io/data/val_256_q90.rec
+    python3.6 tests/python/tensorrt/rec2idx.py val_256_q90.rec val_256_q90.idx
+    nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_trt_gpu.xml --verbose --nocapture tests/python/tensorrt/

Review comment:
       I missed that discussion, can you point me to the RFC so I can catch up?







[GitHub] [incubator-mxnet] marcoabreu commented on pull request #19011: [WIP] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
marcoabreu commented on pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#issuecomment-680278552


   @KellenSunderland





[GitHub] [incubator-mxnet] KellenSunderland commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
KellenSunderland commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486688214



##########
File path: src/operator/subgraph/tensorrt/onnx_to_tensorrt.cc
##########
@@ -18,7 +18,7 @@
  */
 
 /*!
- * Copyright (c) 2019 by Contributors
+ * Copyright (c) 2020 by Contributors
  * \file onnx_to_tensorrt.cc
  * \brief TensorRT integration with the MXNet executor
  * \author Marek Kolodziej, Clement Fuji Tsang

Review comment:
       +1







[GitHub] [incubator-mxnet] Kh4L commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
Kh4L commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r483783893



##########
File path: ci/docker/runtime_functions.sh
##########
@@ -1069,6 +1069,22 @@ unittest_ubuntu_python3_gpu_nocudnn() {
     nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
 }
 
+unittest_ubuntu_tensorrt_gpu() {
+    set -ex
+    export PYTHONPATH=./python/
+    export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
+    export MXNET_SUBGRAPH_VERBOSE=0
+    export LD_LIBRARY_PATH=/work/mxnet/lib:$LD_LIBRARY_PATH
+    export CUDNN_VERSION=${CUDNN_VERSION:-7.0.3}
+    export MXNET_ENABLE_CYTHON=0
+    export DMLC_LOG_STACK_TRACE_DEPTH=10
+    pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100==0.24
+    wget -nc http://data.mxnet.io/data/val_256_q90.rec
+    python3.6 tests/python/tensorrt/rec2idx.py val_256_q90.rec val_256_q90.idx
+    nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_trt_gpu.xml --verbose --nocapture tests/python/tensorrt/

Review comment:
       I forgot to add the [1.x] tag; this is a 1.x PR, where we still use nosetests.







[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19011: [WIP] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#issuecomment-680181380


   Hey @Kh4L , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [website, centos-cpu, unix-gpu, miscellaneous, windows-gpu, centos-gpu, clang, unix-cpu, windows-cpu, edge, sanity]
   *** 
   _Note_: 
    Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
   All CI tests must pass before the PR can be merged. 
   





[GitHub] [incubator-mxnet] Kh4L commented on a change in pull request #19011: TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
Kh4L commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r483783097



##########
File path: ci/docker/runtime_functions.sh
##########
@@ -1069,6 +1069,22 @@ unittest_ubuntu_python3_gpu_nocudnn() {
     nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
 }
 
+unittest_ubuntu_tensorrt_gpu() {
+    set -ex
+    export PYTHONPATH=./python/
+    export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
+    export MXNET_SUBGRAPH_VERBOSE=0
+    export LD_LIBRARY_PATH=/work/mxnet/lib:$LD_LIBRARY_PATH
+    export CUDNN_VERSION=${CUDNN_VERSION:-7.0.3}
+    export MXNET_ENABLE_CYTHON=0
+    export DMLC_LOG_STACK_TRACE_DEPTH=10
+    pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100==0.24

Review comment:
       https://developer.download.nvidia.com/compute/redist isn't supposed to be accessed by itself
   The pip install line is the standard way of installing DALI:
   https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html







[GitHub] [incubator-mxnet] Kh4L commented on pull request #19011: TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
Kh4L commented on pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#issuecomment-685407395


   @mxnet-bot run ci [unix-gpu]





[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486611111



##########
File path: src/operator/subgraph/tensorrt/tensorrt-inl.h
##########
@@ -20,7 +20,7 @@
  */
 
 /*!
- * Copyright (c) 2019 by Contributors
+ * Copyright (c) 2020 by Contributors

Review comment:
       Same







[GitHub] [incubator-mxnet] szha commented on a change in pull request #19011: TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
szha commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r482673562



##########
File path: ci/docker/runtime_functions.sh
##########
@@ -1069,6 +1069,22 @@ unittest_ubuntu_python3_gpu_nocudnn() {
     nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_gpu.xml --verbose tests/python/gpu
 }
 
+unittest_ubuntu_tensorrt_gpu() {
+    set -ex
+    export PYTHONPATH=./python/
+    export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
+    export MXNET_SUBGRAPH_VERBOSE=0
+    export LD_LIBRARY_PATH=/work/mxnet/lib:$LD_LIBRARY_PATH
+    export CUDNN_VERSION=${CUDNN_VERSION:-7.0.3}
+    export MXNET_ENABLE_CYTHON=0
+    export DMLC_LOG_STACK_TRACE_DEPTH=10
+    pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100==0.24
+    wget -nc http://data.mxnet.io/data/val_256_q90.rec
+    python3.6 tests/python/tensorrt/rec2idx.py val_256_q90.rec val_256_q90.idx
+    nosetests-3.4 $NOSE_COVERAGE_ARGUMENTS $NOSE_TIMER_ARGUMENTS --with-xunit --xunit-file nosetests_trt_gpu.xml --verbose --nocapture tests/python/tensorrt/

Review comment:
       In MXNet we switched to pytest and are no longer using nose.







[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486609772



##########
File path: src/operator/subgraph/tensorrt/onnx_to_tensorrt.cc
##########
@@ -112,18 +117,81 @@ std::tuple<unique_ptr<nvinfer1::ICudaEngine>,
       }
       throw dmlc::Error("Cannot parse ONNX into TensorRT Engine");
   }
-  if (dmlc::GetEnv("MXNET_TENSORRT_USE_FP16", true)) {
+  trt_builder->setMaxBatchSize(max_batch_size);
+  std::future<onnx_to_tensorrt::unique_ptr<nvinfer1::ICudaEngine>> future_int8_engine;
+#if NV_TENSORRT_MAJOR > 6
+  auto builder_config = InferObject(trt_builder->createBuilderConfig());
+
+  if (fp16_mode) {
+    if (trt_builder->platformHasFastFp16()) {
+      builder_config->setFlag(nvinfer1::BuilderFlag::kFP16);
+    } else {
+      LOG(WARNING) << "TensorRT can't use fp16 on this platform";
+    }
+  }
+
+  builder_config->setMaxWorkspaceSize(max_workspace_size);
+  builder_config->setFlag(nvinfer1::BuilderFlag::kDEBUG);

Review comment:
       Is this intentional?







[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19011: TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#issuecomment-685407426


   Jenkins CI successfully triggered : [unix-gpu]





[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486613508



##########
File path: tests/python/tensorrt/test_tensorrt.py
##########
@@ -0,0 +1,202 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+import ctypes
+import mxnet as mx
+from mxnet.base import SymbolHandle, check_call, _LIB, mx_uint, c_str_array, c_str, mx_real_t
+from mxnet.symbol import Symbol
+import numpy as np
+from mxnet.test_utils import assert_almost_equal
+from mxnet import gluon
+from mxnet.gluon import nn
+from mxnet import nd
+from mxnet.gluon.model_zoo import vision
+
+####################################
+######### FP32/FP16 tests ##########
+####################################
+
+# Using RN50 to test TRT integration
+def get_model(batch_shape, gluon_model=False):
+    if not gluon_model:
+        path = 'resnet50_v2'
+        if not os.path.exists(path):
+            model = vision.resnet50_v2(pretrained=True)
+            model.hybridize()
+            model.forward(mx.nd.zeros(batch_shape))
+            model.export(path)
+        sym, arg_params, aux_params = mx.model.load_checkpoint(path, 0)
+        return sym, arg_params, aux_params
+    else:
+        model = vision.resnet50_v2(pretrained=True)
+        model.hybridize()
+        return model
+
+
+def get_default_executor(input_data):
+     sym, arg_params, aux_params = get_model(batch_shape=input_data.shape)
+     executor = sym.simple_bind(ctx=mx.gpu(0), data=input_data.shape, grad_req='null', force_rebind=True)
+     executor.copy_params_from(arg_params, aux_params)
+     return executor    
+
+def get_baseline(input_data):
+    executor = get_default_executor(input_data) 
+    output = executor.forward(is_train=False, data=input_data)
+    return output
+
+
+def check_tensorrt_symbol(baseline, input_data, fp16_mode, tol):
+    sym, arg_params, aux_params = get_model(batch_shape=input_data.shape)
+    trt_sym = sym.optimize_for('TensorRT', args=arg_params, aux=aux_params, ctx=mx.gpu(0),
+                               fp16_mode=fp16_mode)

Review comment:
       :unamused: 
   Maybe `precision=<some enum>` instead of the separate `fp16_mode` and `int8_mode` flags?
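
A rough sketch of what a single precision option could look like on the Python side is given below. It is hypothetical: the names `TRTPrecision`, `precision_from_flags` and `parse_precision` are not part of MXNet or of this PR, and the rule that INT8 takes priority when both legacy flags are set is an assumption made only for the example.

```python
from enum import Enum


class TRTPrecision(Enum):
    """Hypothetical replacement for the separate fp16_mode / int8_mode booleans."""
    FP32 = "fp32"
    FP16 = "fp16"
    INT8 = "int8"


def precision_from_flags(fp16_mode=False, int8_mode=False):
    """Translate the two legacy flags into one value.

    Assumption: INT8 wins when both flags are set.
    """
    if int8_mode:
        return TRTPrecision.INT8
    if fp16_mode:
        return TRTPrecision.FP16
    return TRTPrecision.FP32


def parse_precision(value):
    """Accept an enum member or a case-insensitive string such as 'int8'."""
    if isinstance(value, TRTPrecision):
        return value
    try:
        return TRTPrecision(str(value).lower())
    except ValueError:
        allowed = ", ".join(p.value for p in TRTPrecision)
        raise ValueError(f"unknown precision {value!r}; expected one of: {allowed}") from None
```

One design caveat: a plain `Enum` cannot express "INT8 with FP16 also enabled". If both flags are meant to be combinable, a flag-style enum (`enum.Flag`) would be a closer fit.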







[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19011: [WIP] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#issuecomment-685011405


   Jenkins CI successfully triggered : [sanity]





[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #19011: [1.x] TensorRT: add INT8 with calibration

Posted by GitBox <gi...@apache.org>.
ptrendx commented on a change in pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011#discussion_r486608296



##########
File path: src/operator/subgraph/tensorrt/onnx_to_tensorrt.cc
##########
@@ -18,7 +18,7 @@
  */
 
 /*!
- * Copyright (c) 2019 by Contributors
+ * Copyright (c) 2020 by Contributors

Review comment:
       2019 - 2020?



