Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/09/06 01:51:35 UTC

[GitHub] [tvm] jcf94 commented on a change in pull request #8808: [BYOC][TensorRT] Add TensorRT own int8 calibration support to TensorRT BYOC integration

jcf94 commented on a change in pull request #8808:
URL: https://github.com/apache/tvm/pull/8808#discussion_r702516070



##########
File path: src/runtime/contrib/tensorrt/tensorrt_runtime.cc
##########
@@ -225,55 +267,75 @@ class TensorRTRuntime : public JSONRuntimeBase {
   TensorRTEngineAndContext& GetOrBuildEngine() {
     int batch_size = GetBatchSize();
     int compatible_engine_batch_size = -1;
-    if (FindCompatibleEngine(batch_size, &compatible_engine_batch_size)) {
+    bool find_engine_flag = FindCompatibleEngine(batch_size, &compatible_engine_batch_size);
+    const bool use_int8 = (dmlc::GetEnv("TVM_TENSORRT_USE_INT8", 0) != 0);
+    const bool int8_calibration_not_used_or_not_complete = (calibrator_ != nullptr && num_calibration_batches_remaining_ != 0); 
+    if (find_engine_flag && (!use_int8 || calibrator_ == nullptr || int8_calibration_not_used_or_not_complete)) {
       // A compatible engine already exists.
       return trt_engine_cache_.at(std::make_pair(symbol_name_, compatible_engine_batch_size));
     }
+
     // For single engine mode, remove previous engine and update max_batch_size.
     if (!multi_engine_mode_) {
       DestroyEngines();
       max_batch_size_ = batch_size;
     }
     DLOG(INFO) << "Building new TensorRT engine for subgraph " << symbol_name_
                << " with batch size " << batch_size;
-    const bool use_fp16 = dmlc::GetEnv("TVM_TENSORRT_USE_FP16", false);
-    TensorRTBuilder builder(&logger_, data_entry_, max_workspace_size_, use_implicit_batch_,
-                            use_fp16, batch_size);
 
-    // Add inputs and constants.
-    for (size_t i = 0; i < input_nodes_.size(); ++i) {
-      auto nid = input_nodes_[i];
-      const auto& node = nodes_[nid];
-      std::string name = node.GetOpName();
-      if (node.GetOpType() == "input") {
-        builder.AddInput(nid, EntryID(nid, 0), node);
-      } else {
-        ICHECK_EQ(node.GetOpType(), "const");
-        uint32_t eid = EntryID(nid, 0);
-        builder.AddConstant(nid, data_entry_[eid]);
+    // Build engine.
+    if (calibrator_ != nullptr && num_calibration_batches_remaining_ == 0) {
+      // Calibration complete and build int8 engine
+      BuildEngineFromJson(batch_size);

Review comment:
       Overall this looks good to me; I just still have the same suggestion as before: there's no need to rebuild the engine every time (what if I have many calibration data sets?).
   When `num_calibration_batches_remaining_` is still non-zero, keeping only the calibrator object around is enough.
   
   Or we can merge this first for basic support, and I can help fix it later when I have time. 😃
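
A minimal sketch of the suggested flow (the names `EngineCache`, `Calibrator`, `Engine`, and `BuildInt8Engine` are hypothetical stand-ins, not the actual TVM TensorRT runtime types): while calibration batches remain, the existing engine is reused and only the calibrator accumulates data; the int8 engine is built exactly once after the final batch has been consumed.

```cpp
// Sketch only: models the control flow suggested in the review, not the
// real TensorRTRuntime implementation.
#include <iostream>
#include <memory>

struct Calibrator {};                    // stands in for the TensorRT int8 calibrator
struct Engine { bool int8 = false; };    // stands in for TensorRTEngineAndContext

class EngineCache {
 public:
  // Called once per inference; decides whether an engine rebuild is needed.
  Engine* GetOrBuildEngine() {
    if (calibrator_ && num_calibration_batches_remaining_ > 0) {
      // Still calibrating: reuse the existing engine; do not rebuild per batch.
      --num_calibration_batches_remaining_;
      return engine_.get();
    }
    if (calibrator_ && num_calibration_batches_remaining_ == 0 && !engine_->int8) {
      // Calibration finished: build the int8 engine exactly once, then drop
      // the calibrator so later calls take the fast path.
      engine_ = BuildInt8Engine();
      calibrator_.reset();
    }
    return engine_.get();
  }

 private:
  std::unique_ptr<Engine> BuildInt8Engine() {
    auto e = std::make_unique<Engine>();
    e->int8 = true;
    return e;
  }

  std::unique_ptr<Calibrator> calibrator_ = std::make_unique<Calibrator>();
  std::unique_ptr<Engine> engine_ = std::make_unique<Engine>();
  int num_calibration_batches_remaining_ = 10;  // e.g. 10 calibration data sets
};

int main() {
  EngineCache cache;
  for (int i = 0; i < 12; ++i) {
    Engine* e = cache.GetOrBuildEngine();
    std::cout << "run " << i << " int8=" << e->int8 << "\n";
  }
}
```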




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org