You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/08/28 00:44:57 UTC

[GitHub] [tvm] junrushao opened a new pull request, #12628: [MetaSchedule] Introduce `MergedDatabase`

junrushao opened a new pull request, #12628:
URL: https://github.com/apache/tvm/pull/12628

   Following up #12520 and #12626, this PR introduces `MergedDatabase`,
   which allow users to compose multiple databases so that the high-level
   IR could select the best tuning records among them.
   
   The `MergedDatabase` also comes with an extra field `preferred` to allow
   users to override tuning records from other databases. A classic usecase
   of the `preferred` parameter is through handcrafted schedule functions:
   
   ```python
   def schedule_fn(sch: tir.Schedule) -> bool:
     if "nn_conv2d" in sch.mod.attrs["task_name"]:
       handcrafted_scheduling(sch)
       return True
     return False
   
   with ms.database.MergedDatabase(
     databases=[database],
     preferred=ms.database.ScheduleFn(schedule_fn),
   ):
     lib = relay.build(...)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] MasterJH5574 commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
MasterJH5574 commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958501461


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.
+   * \param databases The databases to be merged.
+   * \param fallback The fallback databases. If all the databases didn't answer a query,
+   * the fallback databases will be used.

Review Comment:
   Sorry I guess I now understand the logic - we won’t merge databases in either `preferred` or `fallback`. Thank you! Is it okay to add a short note saying that `preferred` and `fallback` won’t be merged?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
junrushao commented on PR #12628:
URL: https://github.com/apache/tvm/pull/12628#issuecomment-1232537711

   Significantly amended the APIs and docs according to feedbacks from @sunggg @MasterJH5574 @spectrometerHBH @Kathryn-cat @tqchen. Please re-review :-) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao merged pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
junrushao merged PR #12628:
URL: https://github.com/apache/tvm/pull/12628


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] zxybazh commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
zxybazh commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r957951349


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.

Review Comment:
   Nit, maybe better note that `preferred` is an array of preferred databases and the preferred databases are given in specific order so that any database in the are preferred than the databases after it.



##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.
+   * \param databases The databases to be merged.
+   * \param fallback The fallback databases. If all the databases didn't answer a query,

Review Comment:
   Ditto, note this is an array, and there's an order in the array.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958077996


##########
tests/python/unittest/test_link_params.py:
##########
@@ -412,17 +412,12 @@ def schedule_fn(sch):
             return True
         return False
 
-    link_params = True
-

Review Comment:
   It's per @masahi's suggestion from the previous PR: https://github.com/apache/tvm/pull/12626#discussion_r956952473



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] sunggg commented on a diff in pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
sunggg commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r959619205


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,22 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief A database composed of multiple databases, allowing users to guide IR rewriting using
+   * combined knowledge of those databases. To each query, it returns the best record among all the
+   * databases given.
+   * \param databases The list of databases to be combined.
+   * \return The combined database.
+   */
+  TVM_DLL static Database UnionDatabase(Array<Database, void> databases);

Review Comment:
   When I saw `UnionDatabase` and `OrderedUnionDatabase`, I initially thought the opposite: I thought `OrderedUnionDatabase` sorts tuning records and returns the best one lol 
   This might be because I usually assign the name without prefix "Ordered/Sorted" by considering "unsorted" as default. I guess this is a matter of preference, so wonder how other folks think. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958077539


##########
src/meta_schedule/database/merged_database.cc:
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+#include "../utils.h"
+
+namespace tvm {
+namespace meta_schedule {
+
+class MergedDatabaseNode : public DatabaseNode {
+ public:
+  Array<Database> preferred;
+  Array<Database> databases;
+  Array<Database> fallback;
+
+  void VisitAttrs(AttrVisitor* v) {
+    v->Visit("preferred", &preferred);
+    v->Visit("databases", &databases);
+    v->Visit("fallback", &fallback);
+  }
+
+  static constexpr const char* _type_key = "meta_schedule.MergedDatabase";
+  TVM_DECLARE_FINAL_OBJECT_INFO(MergedDatabaseNode, DatabaseNode);
+
+ public:
+  Optional<TuningRecord> QueryTuningRecord(const IRModule& mod, const Target& target,
+                                           const String& task_name) final {
+    for (const Database& db : preferred) {
+      if (Optional<TuningRecord> record = db->QueryTuningRecord(mod, target, task_name)) {
+        return record;
+      }
+    }
+    std::vector<TuningRecord> results;
+    results.reserve(databases.size());
+    for (const Database& db : databases) {
+      if (Optional<TuningRecord> record = db->QueryTuningRecord(mod, target, task_name)) {
+        ICHECK(record.value()->run_secs.defined());
+        results.push_back(record.value());
+      }
+    }
+    std::sort(results.begin(), results.end(), SortTuningRecordByMeanRunSecs());
+    if (!results.empty()) {
+      return results[0];
+    }
+    for (const Database& db : fallback) {
+      if (Optional<TuningRecord> record = db->QueryTuningRecord(mod, target, task_name)) {
+        return record;
+      }
+    }
+    return NullOpt;
+  }
+
+  bool HasWorkload(const IRModule& mod) final {
+    LOG(FATAL) << "NotImplementedError: MergedDatabase.HasWorkload";

Review Comment:
   In fact it's not doable. Consider a case where there could be mocked databases like ScheduleFnDatabase



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r960050399


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,22 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief A database composed of multiple databases, allowing users to guide IR rewriting using
+   * combined knowledge of those databases. To each query, it returns the best record among all the
+   * databases given.
+   * \param databases The list of databases to be combined.
+   * \return The combined database.
+   */
+  TVM_DLL static Database UnionDatabase(Array<Database, void> databases);

Review Comment:
   😂 yeah this is a valid perspective too...Is there any better naming for them?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] sunggg commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
sunggg commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958581651


##########
python/tvm/meta_schedule/database/merged_database.py:
##########
@@ -0,0 +1,94 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""A database consists of multiple databases."""
+from typing import List, Union
+
+from tvm._ffi import register_object
+
+from .. import _ffi_api
+from .database import Database
+
+
+@register_object("meta_schedule.MergedDatabase")
+class MergedDatabase(Database):
+    """A database composed of multiple databases, allowing users to guide IR rewriting using
+    combined knowledge of those databases.
+
+    Besides querying from all databases and picking the best running time, this database also
+    comes with two extra sets of databases:
+    - preferred: If the preferred database responds to a query, all responses of other databases
+    will be overridden and ignored.
+    - fallback: If all databases don't respond to a query, the fallback databases will be used.
+
+    Examples
+    --------
+    An example of using the merged database:
+
+    .. code-block:: python
+    def schedule_conv2d(sch: tir.Schedule) -> bool:
+        if "nn_conv2d" in sch.mod.attrs["task_name"]:
+            handcrafted_scheduling(sch)
+            return True
+        return False
+
+    with ms.database.MergedDatabase(
+        preferred=ScheduleFnDatabase(schedule_conv2d),          # override schedule for conv2d
+        databases=[existing_db0, existing_db1, existing_db2],   # use existing databases
+        fallback=libtorch,                                      # fallback to libtorch

Review Comment:
   Would you elaborate a little more in the comment about database for `libtorch`?
   What kind of data entry does it have? Does it also contain MetaSchedule trace and its performance? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r960926524


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,22 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief A database composed of multiple databases, allowing users to guide IR rewriting using
+   * combined knowledge of those databases. To each query, it returns the best record among all the
+   * databases given.
+   * \param databases The list of databases to be combined.
+   * \return The combined database.
+   */
+  TVM_DLL static Database UnionDatabase(Array<Database, void> databases);

Review Comment:
   Thank you @sunggg!!! It definitely clarifies a lot! adding this to the doc 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on PR #12628:
URL: https://github.com/apache/tvm/pull/12628#issuecomment-1231034010

   This is ready for review! CC: @sunggg @zxybazh @vinx13 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on PR #12628:
URL: https://github.com/apache/tvm/pull/12628#issuecomment-1229348006

   Depends on #12626.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on PR #12628:
URL: https://github.com/apache/tvm/pull/12628#issuecomment-1231874507

   Might need some deliberation on the API design


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958681374


##########
python/tvm/meta_schedule/database/merged_database.py:
##########
@@ -0,0 +1,94 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""A database consists of multiple databases."""
+from typing import List, Union
+
+from tvm._ffi import register_object
+
+from .. import _ffi_api
+from .database import Database
+
+
+@register_object("meta_schedule.MergedDatabase")
+class MergedDatabase(Database):
+    """A database composed of multiple databases, allowing users to guide IR rewriting using
+    combined knowledge of those databases.
+
+    Besides querying from all databases and picking the best running time, this database also
+    comes with two extra sets of databases:
+    - preferred: If the preferred database responds to a query, all responses of other databases
+    will be overridden and ignored.
+    - fallback: If all databases don't respond to a query, the fallback databases will be used.
+
+    Examples
+    --------
+    An example of using the merged database:
+
+    .. code-block:: python
+    def schedule_conv2d(sch: tir.Schedule) -> bool:
+        if "nn_conv2d" in sch.mod.attrs["task_name"]:
+            handcrafted_scheduling(sch)
+            return True
+        return False
+
+    with ms.database.MergedDatabase(
+        preferred=ScheduleFnDatabase(schedule_conv2d),          # override schedule for conv2d
+        databases=[existing_db0, existing_db1, existing_db2],   # use existing databases
+        fallback=libtorch,                                      # fallback to libtorch

Review Comment:
   It doesn't use any data entries, possibly no performance at all, but instead, just like `ScheduleFnDatabase`, it's more like a mocked database - we pretend that it's a database, but the underlying implementation might be the simplest logic to offload PrimFunc to libtorch PackedFuncs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] sunggg commented on a diff in pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
sunggg commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r960742815


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,22 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief A database composed of multiple databases, allowing users to guide IR rewriting using
+   * combined knowledge of those databases. To each query, it returns the best record among all the
+   * databases given.
+   * \param databases The list of databases to be combined.
+   * \return The combined database.
+   */
+  TVM_DLL static Database UnionDatabase(Array<Database, void> databases);

Review Comment:
   Discussed offline with @junrushao and learned about more context behind the naming. 
   I'm okay with the naming, but would like to suggest more documentation with examples for better guidance.
   Some examples we could use:
   ```python
   ### Examples 
   Assumption: 
   * db1, d2 does not have tuning records for the target workload. 
   * each db3, db4, db5 have tuning records r3, r4, r5 for target workload respectively.  
   
   Case 1. `UnionDatabase`:  
   merged_db = ms.database.UnionDatabase(                    
       db1, # no record
       db2, # no record
       db3, # has r3
       db4  # has r4
   )
   # This return best one between r3 and r4
   merged_db.query_tuning_record(..., target_workload) 
   
   Case 2. `OrderedUnionDatabase`
   merged_db = ms.database.OrderedUnionDatabase(                    
       db1, # no record
       db2, # no record
       db3, # has r3
       db4  # has r4
   )
   # This return r3
   merged_db.query_tuning_record(..., target_workload) 
   
   Case 3. Mix-use scenario
   merged_db = ms.database.UnionDatabase(                    
       db1, # no record
       db2, # no record
       db3, # has r3
       ms.database.OrderedUnionDatabase( # returns r4
            db4,  # has r4
            db5,  # has r5
       )
   )
   # This return best one between r3 and r4
   merged_db.query_tuning_record(..., target_workload) 
   
   Case 4. Another mix-use scenario
   merged_db = ms.database.UnionDatabase(                    
       db1, # no record
       db2, # no record
       db3, # has r3
       ms.database.UnionDatabase( # returns best one between r4 and r5
            db4,  # has r4
            db5,  # has r5
       )
   )
   # This return best one among r3, r4 and r5
   merged_db.query_tuning_record(..., target_workload) 
   
   Case 5. Yet another mix-use scenario
   merged_db = ms.database.OrderedUnionDatabase(                    
       db1, # no record
       db2, # no record
       ms.database.UnionDatabase( # returns best one between r3 and r4
            db3, # has r3
            db4,  # has r4
       )
       db5,  # has r5
   )
   # This return best one between r3 and r4
   merged_db.query_tuning_record(..., target_workload) 
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] MasterJH5574 commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
MasterJH5574 commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r957985876


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.
+   * \param databases The databases to be merged.
+   * \param fallback The fallback databases. If all the databases didn't answer a query,
+   * the fallback databases will be used.
+   * \return The merged database.
+   */
+  TVM_DLL static Database MergedDatabase(Array<Database, void> preferred,
+                                         Array<Database, void> databases,
+                                         Array<Database, void> fallback);

Review Comment:
   Just curious, what is the second template parameter `void` used for?



##########
src/meta_schedule/database/merged_database.cc:
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+#include "../utils.h"
+
+namespace tvm {
+namespace meta_schedule {
+
+class MergedDatabaseNode : public DatabaseNode {
+ public:
+  Array<Database> preferred;
+  Array<Database> databases;
+  Array<Database> fallback;
+
+  void VisitAttrs(AttrVisitor* v) {
+    v->Visit("preferred", &preferred);
+    v->Visit("databases", &databases);
+    v->Visit("fallback", &fallback);
+  }
+
+  static constexpr const char* _type_key = "meta_schedule.MergedDatabase";
+  TVM_DECLARE_FINAL_OBJECT_INFO(MergedDatabaseNode, DatabaseNode);
+
+ public:
+  Optional<TuningRecord> QueryTuningRecord(const IRModule& mod, const Target& target,
+                                           const String& task_name) final {
+    for (const Database& db : preferred) {
+      if (Optional<TuningRecord> record = db->QueryTuningRecord(mod, target, task_name)) {
+        return record;
+      }
+    }
+    std::vector<TuningRecord> results;
+    results.reserve(databases.size());
+    for (const Database& db : databases) {
+      if (Optional<TuningRecord> record = db->QueryTuningRecord(mod, target, task_name)) {
+        ICHECK(record.value()->run_secs.defined());
+        results.push_back(record.value());
+      }
+    }
+    std::sort(results.begin(), results.end(), SortTuningRecordByMeanRunSecs());
+    if (!results.empty()) {
+      return results[0];
+    }
+    for (const Database& db : fallback) {
+      if (Optional<TuningRecord> record = db->QueryTuningRecord(mod, target, task_name)) {
+        return record;
+      }
+    }
+    return NullOpt;
+  }
+
+  bool HasWorkload(const IRModule& mod) final {
+    LOG(FATAL) << "NotImplementedError: MergedDatabase.HasWorkload";

Review Comment:
   Hmmmm... At least I think this is doable. Just there isn’t such need?



##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.
+   * \param databases The databases to be merged.
+   * \param fallback The fallback databases. If all the databases didn't answer a query,
+   * the fallback databases will be used.

Review Comment:
   Is there any inclusion hierarchy among the three inputs? From the current C++ documents, my understanding is that `databases` must contains all databases in `preferred` and `fallback`. But the documents and example on Python side obviously doesn’t prove my understanding correct. Would it be a good idea to tell a bit about their relation?



##########
tests/python/unittest/test_link_params.py:
##########
@@ -412,17 +412,12 @@ def schedule_fn(sch):
             return True
         return False
 
-    link_params = True
-

Review Comment:
   What’s the impact of the PR on this file :eyes:?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958681790


##########
tests/python/unittest/test_link_params.py:
##########
@@ -412,17 +412,12 @@ def schedule_fn(sch):
             return True
         return False
 
-    link_params = True
-

Review Comment:
   Ooops sure!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958087536


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.
+   * \param databases The databases to be merged.
+   * \param fallback The fallback databases. If all the databases didn't answer a query,
+   * the fallback databases will be used.

Review Comment:
   There is no hierarchy. There is no requirement that `databases` must contain `preferred` or `fallback`. There is no relationship between those three fields. Would be great if you could suggest an enhancement to the doc to avoid confusion :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
junrushao commented on PR #12628:
URL: https://github.com/apache/tvm/pull/12628#issuecomment-1234915543

   Thank you all for super valuable suggestions!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] masahi commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
masahi commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958080856


##########
tests/python/unittest/test_link_params.py:
##########
@@ -412,17 +412,12 @@ def schedule_fn(sch):
             return True
         return False
 
-    link_params = True
-

Review Comment:
   Sorry please preserve `Executor("graph", {"link-params": link_params})`, this is a test for link_params = True case. Without it there is no link_params. 
   
   We can remove ` "relay.FuseOps.link_params": link_params,`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] zxybazh commented on a diff in pull request #12628: [MetaSchedule] Introduce `Union` and `OrderedUnion` in Database

Posted by GitBox <gi...@apache.org>.
zxybazh commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r959908816


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,22 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief A database composed of multiple databases, allowing users to guide IR rewriting using
+   * combined knowledge of those databases. To each query, it returns the best record among all the
+   * databases given.
+   * \param databases The list of databases to be combined.
+   * \return The combined database.
+   */
+  TVM_DLL static Database UnionDatabase(Array<Database, void> databases);

Review Comment:
   I think given `Union Database` it's easy to think of a database which returns the best result among all databases given, and this is consistent with the database api of get top k. For `OrderedUnionDatabase` I agree the name is a bit not straghtforward, maybe it could be changed to something like `FirstAvailableUnionDatabase`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r957956549


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.

Review Comment:
   Good catch!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] junrushao commented on a diff in pull request #12628: [MetaSchedule] Introduce `MergedDatabase`

Posted by GitBox <gi...@apache.org>.
junrushao commented on code in PR #12628:
URL: https://github.com/apache/tvm/pull/12628#discussion_r958076757


##########
include/tvm/meta_schedule/database.h:
##########
@@ -357,6 +357,18 @@ class Database : public runtime::ObjectRef {
    */
   TVM_DLL static Database JSONDatabase(String path_workload, String path_tuning_record,
                                        bool allow_missing);
+  /*!
+   * \brief Create a database merged from multiple databases.
+   * \param preferred The preferred database. If the preferred database responses to a query,
+   * all other databases will be ignored.
+   * \param databases The databases to be merged.
+   * \param fallback The fallback databases. If all the databases didn't answer a query,
+   * the fallback databases will be used.
+   * \return The merged database.
+   */
+  TVM_DLL static Database MergedDatabase(Array<Database, void> preferred,
+                                         Array<Database, void> databases,
+                                         Array<Database, void> fallback);

Review Comment:
   `Database` at that moment is an incomplete type, which won't pass Array's static type checking which requires the `Database` to be a subclass of `ObjectRef`, and therefore, we use the void-trick here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org