You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/04/24 17:24:50 UTC

[GitHub] [tvm] tqchen opened a new pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

tqchen opened a new pull request #7918:
URL: https://github.com/apache/tvm/pull/7918


   Previouslyw we set the context of the DSO module to be the root module.
   This can cause problem when the root module is not the dso module and
   get destructued early (but we still need the dso module).
   
   This PR makes the following change to fix the problem.
   
   - Merge multiple DSO modules to one during export.
   - Set the context to be the (only one) dso module.
   - Updated testcase to cover the problem.
   
   The enhancement creates some restrictions on the dso import hierachy
   (all dso modules needs to be merged without a cycle). This is the case
   for our current use scenario. The merged logic is also more consistent
   as the library itself is merged.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] areusch merged pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
areusch merged pull request #7918:
URL: https://github.com/apache/tvm/pull/7918


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on a change in pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#discussion_r620538434



##########
File path: src/target/codegen.cc
##########
@@ -110,32 +115,100 @@ class ModuleSerializer {
 
   // invariance: root module is always at location 0.
   // The module order is collected via DFS
+  // This function merges all the DSO exportable module into
+  // a single one as this is also what happens in the final hierachy
   void CreateModuleIndex() {
     std::unordered_set<const runtime::ModuleNode*> visited{mod_.operator->()};
     std::vector<runtime::ModuleNode*> stack{mod_.operator->()};
     uint64_t module_index = 0;
 
-    while (!stack.empty()) {
-      runtime::ModuleNode* n = stack.back();
-      stack.pop_back();
-      mod2index_[n] = module_index++;
-      mod_vec_.emplace_back(n);
-      for (runtime::Module m : n->imports()) {
+    auto fpush_imports_to_stack = [&](runtime::ModuleNode* node) {
+      for (runtime::Module m : node->imports()) {
         runtime::ModuleNode* next = m.operator->();
         if (visited.count(next) == 0) {
           visited.insert(next);
           stack.push_back(next);
         }
       }
+    };
+
+    std::vector<runtime::ModuleNode*> dso_exportable_boundary;
+
+    // Create module index that merges all dso module into a single group.
+    //
+    // Do a two phase visit, to ensure dso module's index
+    // is always bigger than a parent of any dso module
+    // and smaller than children of any dso module.
+    //
+    // Error will be raised in CreateImportTree
+    // if merging dso module causes a cycle in the import tree
+
+    // Phase 0: only expand non-dso-module and record the boundary.
+    while (!stack.empty()) {
+      runtime::ModuleNode* n = stack.back();
+      stack.pop_back();
+      if (DSOExportable(n)) {
+        // do not recursively expand dso modules
+        // we will expand in phase 1
+        dso_exportable_boundary.emplace_back(n);
+      } else {
+        // expand the non-dso modules
+        mod2index_[n] = module_index++;
+        mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>({n}));
+        fpush_imports_to_stack(n);
+      }
+    }
+    if (dso_exportable_boundary.size() == 0) return;
+
+    // create the slot for dso exportable modules
+    uint64_t dso_module_index = module_index++;
+    mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>());
+
+    stack = std::move(dso_exportable_boundary);

Review comment:
       I still think this is better given we are still manipulating the stack(while dso_exportable boundary is not the stack)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen edited a comment on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827051612


   Thanks @areusch, there are indeed no cycles in the import relation before we export, and a cycles will only be created when we try to merge all dso modules into one. The load process always assumes a DAG(the merging happens during exportation). Please take another look


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on a change in pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#discussion_r620534314



##########
File path: src/runtime/library_module.cc
##########
@@ -125,10 +125,11 @@ Module LoadModuleFromBinary(const std::string& type_key, dmlc::Stream* stream) {
  * \brief Load and append module blob to module list
  * \param mblob The module blob.
  * \param lib The library.
- *
- * \return Root Module.
+ * \param root_module the output root module
+ * \param dso_ctx_addr the output dso module
  */
-runtime::Module ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib) {
+void ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib, runtime::Module* root_module,
+                       runtime::ModuleNode** dso_ctx_addr = nullptr) {

Review comment:
       good catch




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-828014886


   ping @areusch for another look


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen edited a comment on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827051612


   Thanks @areusch, there is indeed no cycles in the import tree before we export, and the cycle will only be created when we try to merge  dso modules. The load process always assumes a DAG(the merging happens during exportation). Please take another look


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827141931


   First of all, I do not think such a hierachy is useful. In terms of the naming overlap, while it might be possible to do so in the compiler for artifacts checks to the extent when possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] areusch commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
areusch commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827067378


   @tqchen if the import tree has no cycles to begin with, could you give an example where merging could create such a cycle? I think one of the children of a DSO-exportable module would need to import a non-child of that DSO-exportable module?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-826151957


   cc @jwfromm @mbrookhart should fix the loading problem. cc @FrozenGene @junrushao1994 @areusch 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen edited a comment on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827091366


   This is only hypothetical, but we can create a module that looks like this in the import chain:`dso0 <- cuda <-dso1`. Of course right now I cannot think of a practical setting where such import chain is needed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-826151957


   cc @jwfromm @mbrookhart should fix the loading problem. cc @FrozenGene @junrushao1994 @areusch 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on a change in pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#discussion_r620534032



##########
File path: src/runtime/library_module.cc
##########
@@ -143,14 +144,20 @@ runtime::Module ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib) {
   std::vector<Module> modules;
   std::vector<uint64_t> import_tree_row_ptr;
   std::vector<uint64_t> import_tree_child_indices;
+  int num_dso_module = 0;
+
   for (uint64_t i = 0; i < size; ++i) {
     std::string tkey;
     ICHECK(stream->Read(&tkey));
     // Currently, _lib is for DSOModule, but we
-    // don't have loadbinary function for it currently
+    // need to be handled specially
     if (tkey == "_lib") {
       auto dso_module = Module(make_object<LibraryModuleNode>(lib));
+      *dso_ctx_addr = dso_module.operator->();
+      ++num_dso_module;
       modules.emplace_back(dso_module);
+      ICHECK_EQ(num_dso_module, 1U) << "Multiple dso module detected, please upgrade tvm "

Review comment:
       The main problem of warning is that the old behavior have a potential bug of memory leak, thus encourage re-exporting is better. Most of the old modules do not have multi-dso module behavior




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] areusch commented on a change in pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
areusch commented on a change in pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#discussion_r620461701



##########
File path: src/runtime/library_module.cc
##########
@@ -143,14 +144,20 @@ runtime::Module ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib) {
   std::vector<Module> modules;
   std::vector<uint64_t> import_tree_row_ptr;
   std::vector<uint64_t> import_tree_child_indices;
+  int num_dso_module = 0;
+
   for (uint64_t i = 0; i < size; ++i) {
     std::string tkey;
     ICHECK(stream->Read(&tkey));
     // Currently, _lib is for DSOModule, but we
-    // don't have loadbinary function for it currently
+    // need to be handled specially
     if (tkey == "_lib") {
       auto dso_module = Module(make_object<LibraryModuleNode>(lib));
+      *dso_ctx_addr = dso_module.operator->();
+      ++num_dso_module;
       modules.emplace_back(dso_module);
+      ICHECK_EQ(num_dso_module, 1U) << "Multiple dso module detected, please upgrade tvm "

Review comment:
       how come you do signed-unsigned comparison here? also, maybe a slightly better error message: "Multiple DSO-exportable modules detected, please re-export your module with the latest TVM.
   
   does it makes sense to make this a warning rather than a hard fail? it means we will break old modules, and we took some steps to work around that with GraphExecutor change.

##########
File path: src/runtime/library_module.cc
##########
@@ -143,14 +144,20 @@ runtime::Module ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib) {
   std::vector<Module> modules;
   std::vector<uint64_t> import_tree_row_ptr;
   std::vector<uint64_t> import_tree_child_indices;
+  int num_dso_module = 0;
+
   for (uint64_t i = 0; i < size; ++i) {
     std::string tkey;
     ICHECK(stream->Read(&tkey));
     // Currently, _lib is for DSOModule, but we
-    // don't have loadbinary function for it currently
+    // need to be handled specially

Review comment:
       it would be great to explain "specially:" DSOModule's implementation is provided by the wrapping LibraryModule (supplied as `lib` here). "_lib" serves as a placeholder in the module import tree to indicate where to place the DSOModule.

##########
File path: src/target/codegen.cc
##########
@@ -110,32 +115,100 @@ class ModuleSerializer {
 
   // invariance: root module is always at location 0.
   // The module order is collected via DFS
+  // This function merges all the DSO exportable module into
+  // a single one as this is also what happens in the final hierachy
   void CreateModuleIndex() {
     std::unordered_set<const runtime::ModuleNode*> visited{mod_.operator->()};
     std::vector<runtime::ModuleNode*> stack{mod_.operator->()};
     uint64_t module_index = 0;
 
-    while (!stack.empty()) {
-      runtime::ModuleNode* n = stack.back();
-      stack.pop_back();
-      mod2index_[n] = module_index++;
-      mod_vec_.emplace_back(n);
-      for (runtime::Module m : n->imports()) {
+    auto fpush_imports_to_stack = [&](runtime::ModuleNode* node) {
+      for (runtime::Module m : node->imports()) {
         runtime::ModuleNode* next = m.operator->();
         if (visited.count(next) == 0) {
           visited.insert(next);
           stack.push_back(next);
         }
       }
+    };
+
+    std::vector<runtime::ModuleNode*> dso_exportable_boundary;
+
+    // Create module index that merges all dso module into a single group.
+    //
+    // Do a two phase visit, to ensure dso module's index
+    // is always bigger than a parent of any dso module
+    // and smaller than children of any dso module.
+    //
+    // Error will be raised in CreateImportTree
+    // if merging dso module causes a cycle in the import tree
+
+    // Phase 0: only expand non-dso-module and record the boundary.
+    while (!stack.empty()) {
+      runtime::ModuleNode* n = stack.back();
+      stack.pop_back();
+      if (DSOExportable(n)) {
+        // do not recursively expand dso modules
+        // we will expand in phase 1
+        dso_exportable_boundary.emplace_back(n);
+      } else {
+        // expand the non-dso modules
+        mod2index_[n] = module_index++;
+        mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>({n}));
+        fpush_imports_to_stack(n);
+      }
+    }
+    if (dso_exportable_boundary.size() == 0) return;
+
+    // create the slot for dso exportable modules
+    uint64_t dso_module_index = module_index++;

Review comment:
       it would be great to comment why this particular slot is chosen

##########
File path: src/runtime/library_module.cc
##########
@@ -125,10 +125,11 @@ Module LoadModuleFromBinary(const std::string& type_key, dmlc::Stream* stream) {
  * \brief Load and append module blob to module list
  * \param mblob The module blob.
  * \param lib The library.
- *
- * \return Root Module.
+ * \param root_module the output root module
+ * \param dso_ctx_addr the output dso module
  */
-runtime::Module ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib) {
+void ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib, runtime::Module* root_module,
+                       runtime::ModuleNode** dso_ctx_addr = nullptr) {

Review comment:
       why is this nullptr by default? seems like it should never be nullptr..?

##########
File path: src/target/codegen.cc
##########
@@ -110,32 +115,100 @@ class ModuleSerializer {
 
   // invariance: root module is always at location 0.
   // The module order is collected via DFS
+  // This function merges all the DSO exportable module into
+  // a single one as this is also what happens in the final hierachy
   void CreateModuleIndex() {
     std::unordered_set<const runtime::ModuleNode*> visited{mod_.operator->()};
     std::vector<runtime::ModuleNode*> stack{mod_.operator->()};
     uint64_t module_index = 0;
 
-    while (!stack.empty()) {
-      runtime::ModuleNode* n = stack.back();
-      stack.pop_back();
-      mod2index_[n] = module_index++;
-      mod_vec_.emplace_back(n);
-      for (runtime::Module m : n->imports()) {
+    auto fpush_imports_to_stack = [&](runtime::ModuleNode* node) {
+      for (runtime::Module m : node->imports()) {
         runtime::ModuleNode* next = m.operator->();
         if (visited.count(next) == 0) {
           visited.insert(next);
           stack.push_back(next);
         }
       }
+    };
+
+    std::vector<runtime::ModuleNode*> dso_exportable_boundary;
+
+    // Create module index that merges all dso module into a single group.
+    //
+    // Do a two phase visit, to ensure dso module's index
+    // is always bigger than a parent of any dso module
+    // and smaller than children of any dso module.
+    //
+    // Error will be raised in CreateImportTree
+    // if merging dso module causes a cycle in the import tree
+
+    // Phase 0: only expand non-dso-module and record the boundary.
+    while (!stack.empty()) {
+      runtime::ModuleNode* n = stack.back();
+      stack.pop_back();
+      if (DSOExportable(n)) {
+        // do not recursively expand dso modules
+        // we will expand in phase 1
+        dso_exportable_boundary.emplace_back(n);
+      } else {
+        // expand the non-dso modules
+        mod2index_[n] = module_index++;
+        mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>({n}));
+        fpush_imports_to_stack(n);
+      }
+    }
+    if (dso_exportable_boundary.size() == 0) return;
+
+    // create the slot for dso exportable modules
+    uint64_t dso_module_index = module_index++;
+    mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>());
+
+    stack = std::move(dso_exportable_boundary);

Review comment:
       rather than do this, would prefer to avoid the closure and make `fpush_imports_to_stack` take an explicit argument which is the stack to push to. it will make the code more readable.

##########
File path: src/runtime/library_module.cc
##########
@@ -177,11 +186,11 @@ runtime::Module ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib) {
         module_import_addr->emplace_back(modules[child_index]);
       }
     }
+    ICHECK(!modules.empty());

Review comment:
       add an error message

##########
File path: src/target/codegen.cc
##########
@@ -110,32 +115,100 @@ class ModuleSerializer {
 
   // invariance: root module is always at location 0.
   // The module order is collected via DFS
+  // This function merges all the DSO exportable module into
+  // a single one as this is also what happens in the final hierachy
   void CreateModuleIndex() {
     std::unordered_set<const runtime::ModuleNode*> visited{mod_.operator->()};
     std::vector<runtime::ModuleNode*> stack{mod_.operator->()};
     uint64_t module_index = 0;
 
-    while (!stack.empty()) {
-      runtime::ModuleNode* n = stack.back();
-      stack.pop_back();
-      mod2index_[n] = module_index++;
-      mod_vec_.emplace_back(n);
-      for (runtime::Module m : n->imports()) {
+    auto fpush_imports_to_stack = [&](runtime::ModuleNode* node) {
+      for (runtime::Module m : node->imports()) {
         runtime::ModuleNode* next = m.operator->();
         if (visited.count(next) == 0) {
           visited.insert(next);
           stack.push_back(next);
         }
       }
+    };
+
+    std::vector<runtime::ModuleNode*> dso_exportable_boundary;
+
+    // Create module index that merges all dso module into a single group.
+    //
+    // Do a two phase visit, to ensure dso module's index
+    // is always bigger than a parent of any dso module
+    // and smaller than children of any dso module.
+    //
+    // Error will be raised in CreateImportTree
+    // if merging dso module causes a cycle in the import tree
+
+    // Phase 0: only expand non-dso-module and record the boundary.
+    while (!stack.empty()) {
+      runtime::ModuleNode* n = stack.back();
+      stack.pop_back();
+      if (DSOExportable(n)) {
+        // do not recursively expand dso modules
+        // we will expand in phase 1
+        dso_exportable_boundary.emplace_back(n);
+      } else {
+        // expand the non-dso modules
+        mod2index_[n] = module_index++;
+        mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>({n}));
+        fpush_imports_to_stack(n);
+      }
+    }
+    if (dso_exportable_boundary.size() == 0) return;
+
+    // create the slot for dso exportable modules
+    uint64_t dso_module_index = module_index++;
+    mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>());
+
+    stack = std::move(dso_exportable_boundary);
+
+    // Phase 1: expand the children of dso modules.
+    while (!stack.empty()) {
+      runtime::ModuleNode* n = stack.back();
+      stack.pop_back();
+
+      if (DSOExportable(n)) {
+        mod_group_vec_[dso_module_index].emplace_back(n);
+        mod2index_[n] = dso_module_index;
+      } else {
+        mod2index_[n] = module_index++;
+        mod_group_vec_.emplace_back(std::vector<runtime::ModuleNode*>({n}));
+      }
+      fpush_imports_to_stack(n);
     }
   }
 
   void CreateImportTree() {
-    for (auto m : mod_vec_) {
-      for (runtime::Module im : m->imports()) {
-        uint64_t mod_index = mod2index_[im.operator->()];
-        import_tree_child_indices_.push_back(mod_index);
+    std::vector<int64_t> child_indices;
+
+    for (size_t parent_index = 0; parent_index < mod_group_vec_.size(); ++parent_index) {
+      child_indices.clear();
+      for (const auto* m : mod_group_vec_[parent_index]) {
+        for (runtime::Module im : m->imports()) {
+          uint64_t mod_index = mod2index_.at(im.operator->());
+          // skip cycle when dso modules are merged together
+          if (mod_index != parent_index) {
+            child_indices.emplace_back(mod_index);
+          }
+        }
+      }
+      // sort and unique the merged indices
+      std::sort(child_indices.begin(), child_indices.end());
+      auto unique_end = std::unique(child_indices.begin(), child_indices.end());
+
+      // Check cycles due to merging dso exportable modules.
+      if (child_indices.size() != 0) {
+        CHECK_LT(parent_index, child_indices[0])

Review comment:
       it's clear if you read the code carefully, but perhaps to help people, would be great if you could:
   1) add a comment here explaining explicitly why lower indices are "higher" in the tree (just explain the traversal order)
   2) include the offending module instances in the error message, at least for now this will allow their type_key to be shown to the user and it may help them understand the problem in this case or us understand on forum when there are bug reports or questions.

##########
File path: src/target/codegen.cc
##########
@@ -75,21 +75,26 @@ class ModuleSerializer {
     if (has_import_tree) {
       // we will append one key for _import_tree
       // The layout is the same as before: binary_size, key, logic, key, logic...
-      sz = mod_vec_.size() + 1;
+      sz = mod_group_vec_.size() + 1;
     } else {
       // Keep the old behaviour
       sz = mod_->imports().size();
     }
     stream->Write(sz);
 
-    for (auto m : mod_vec_) {
-      std::string mod_type_key = m->type_key();
-      if (!DSOExportable(m)) {
-        stream->Write(mod_type_key);
-        m->SaveToBinary(stream);
-      } else if (has_import_tree) {
-        mod_type_key = "_lib";
+    for (const auto& group : mod_group_vec_) {
+      CHECK_NE(group.size(), 0);

Review comment:
       add an error message, here and below




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827091366


   This is only hypothetical, but we can create a module that looks like this in the import chain:`dso0 <- cuda <-dso1`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827051612


   Thanks @areusch, there is indeed no cycles in the import tree before we export, and the cycle will only be created when we try to merge  dso modules


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] areusch commented on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
areusch commented on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827106711


   @tqchen the practical ramifications of this are that if somehow CUDA module had code to `GetFunction("foo", query_imports=true)` and `dso0` defined that function, the module might behave differently after export -> load, correct? it's not really valid to write code that expects the import tree to be any particular way, since export/load will change it. so, it seems like we would also like some more things (perhaps not in this PR):
   1. assert at compile time we don't have two functions of the same name
   2. ensure we have a single convention by which generated Module use GetFunction or TVMBackendGetFuncFromEnv.
   
   does that seem reasonable to you? Just wondering what use cases might crop up in the future that may trip this, and whether it's worth adding asserts in Import to fix them before export is involved.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen edited a comment on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827091366


   This is only hypothetical, but we can create a module that looks like this in the import chain:`dso0 <- cuda <-dso1`. Of course right now I cannot think of a practical setting where such import chain is needed. But the detection code should be created to guard potential future mis-use


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] jwfromm commented on a change in pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
jwfromm commented on a change in pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#discussion_r621684392



##########
File path: src/runtime/library_module.cc
##########
@@ -143,14 +144,20 @@ runtime::Module ProcessModuleBlob(const char* mblob, ObjectPtr<Library> lib) {
   std::vector<Module> modules;
   std::vector<uint64_t> import_tree_row_ptr;
   std::vector<uint64_t> import_tree_child_indices;
+  int num_dso_module = 0;
+
   for (uint64_t i = 0; i < size; ++i) {
     std::string tkey;
     ICHECK(stream->Read(&tkey));
     // Currently, _lib is for DSOModule, but we
-    // don't have loadbinary function for it currently
+    // need to be handled specially
     if (tkey == "_lib") {
       auto dso_module = Module(make_object<LibraryModuleNode>(lib));
+      *dso_ctx_addr = dso_module.operator->();
+      ++num_dso_module;
       modules.emplace_back(dso_module);
+      ICHECK_EQ(num_dso_module, 1U) << "Multiple dso module detected, please upgrade tvm "

Review comment:
       I do think this message could be more clear that a re-export is needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen edited a comment on pull request #7918: [RUNTIME][BUGFIX] Fix DSO module problem when its parent get destructed.

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on pull request #7918:
URL: https://github.com/apache/tvm/pull/7918#issuecomment-827051612


   Thanks @areusch, there is indeed no cycles in the import tree before we export, and the cycle will only be created when we try to merge  dso modules. The load process always assumes a DAG(the merging happens during exportation).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org