Posted to commits@tvm.apache.org by tq...@apache.org on 2022/09/15 23:48:28 UTC

[tvm-site] branch asf-site updated: deploying docs (apache/tvm@1f8b5dec29e6e34b4cf5f092acf5b1d197a59d42)

This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 7903690be9 deploying docs (apache/tvm@1f8b5dec29e6e34b4cf5f092acf5b1d197a59d42)
7903690be9 is described below

commit 7903690be93f2e6c590a150e2175bb1299f6f65c
Author: tvm-bot <95...@users.noreply.github.com>
AuthorDate: Thu Sep 15 23:48:22 2022 +0000

    deploying docs (apache/tvm@1f8b5dec29e6e34b4cf5f092acf5b1d197a59d42)
---
 .../how_to/compile_models/from_darknet.rst.txt     |   2 +-
 .../how_to/compile_models/from_keras.rst.txt       |   2 +-
 .../how_to/compile_models/from_mxnet.rst.txt       |   2 +-
 .../how_to/compile_models/from_oneflow.rst.txt     |   2 +-
 .../how_to/compile_models/from_pytorch.rst.txt     |   2 +-
 .../how_to/compile_models/from_tensorflow.rst.txt  |   2 +-
 .../compile_models/sg_execution_times.rst.txt      |  22 +-
 .../deploy_models/deploy_model_on_android.rst.txt  |   2 +-
 .../deploy_object_detection_pytorch.rst.txt        |   4 +-
 .../deploy_models/deploy_prequantized.rst.txt      |   6 +-
 .../deploy_prequantized_tflite.rst.txt             |   4 +-
 .../how_to/deploy_models/deploy_quantized.rst.txt  |   2 +-
 .../deploy_models/deploy_ssd_gluoncv.rst.txt       |   4 +-
 .../deploy_models/sg_execution_times.rst.txt       |  18 +-
 .../extend_tvm/bring_your_own_datatypes.rst.txt    |   2 +-
 .../how_to/extend_tvm/sg_execution_times.rst.txt   |   8 +-
 .../how_to/extend_tvm/use_pass_instrument.rst.txt  |  16 +-
 .../optimize_operators/opt_conv_cuda.rst.txt       |   2 +-
 .../optimize_operators/opt_conv_tensorcore.rst.txt |   2 +-
 .../how_to/optimize_operators/opt_gemm.rst.txt     |  16 +-
 .../optimize_operators/sg_execution_times.rst.txt  |   8 +-
 .../sg_execution_times.rst.txt                     |  14 +-
 .../tune_conv2d_layer_cuda.rst.txt                 |   4 +-
 .../tune_network_cuda.rst.txt                      |   2 +-
 .../tune_network_x86.rst.txt                       |   4 +-
 .../tune_sparse_x86.rst.txt                        | 425 +++++----------------
 .../tune_with_autotvm/sg_execution_times.rst.txt   |   8 +-
 .../tune_with_autotvm/tune_conv2d_cuda.rst.txt     |  26 +-
 .../work_with_microtvm/micro_autotune.rst.txt      |  16 +-
 .../how_to/work_with_microtvm/micro_train.rst.txt  |  16 +-
 .../work_with_microtvm/sg_execution_times.rst.txt  |  10 +-
 .../work_with_relay/sg_execution_times.rst.txt     |   8 +-
 .../how_to/work_with_schedules/intrin_math.rst.txt |   2 +-
 .../work_with_schedules/sg_execution_times.rst.txt |  16 +-
 .../how_to/work_with_schedules/tensorize.rst.txt   |   2 +-
 .../tutorials/autotvm/sg_execution_times.rst.txt   |   4 +-
 .../frontend/deploy_classification.rst.txt         |   2 +-
 .../tutorials/frontend/deploy_detection.rst.txt    |   2 +-
 .../tutorials/frontend/sg_execution_times.rst.txt  |   6 +-
 .../tutorials/optimize/sg_execution_times.rst.txt  |   6 +-
 .../topic/vta/tutorials/sg_execution_times.rst.txt |   4 +-
 .../tutorial/auto_scheduler_matmul_x86.rst.txt     |   4 +-
 docs/_sources/tutorial/autotvm_matmul_x86.rst.txt  |  20 +-
 docs/_sources/tutorial/autotvm_relay_x86.rst.txt   |  54 +--
 .../tutorial/cross_compilation_and_rpc.rst.txt     |   2 +-
 docs/_sources/tutorial/intro_topi.rst.txt          |   2 +-
 docs/_sources/tutorial/sg_execution_times.rst.txt  |  24 +-
 .../tutorial/tensor_expr_get_started.rst.txt       |  44 +--
 docs/commit_hash                                   |   2 +-
 docs/genindex.html                                 |   2 +
 docs/how_to/compile_models/from_darknet.html       |   2 +-
 docs/how_to/compile_models/from_keras.html         |   2 +-
 docs/how_to/compile_models/from_mxnet.html         |   2 +-
 docs/how_to/compile_models/from_oneflow.html       |  13 +-
 docs/how_to/compile_models/from_pytorch.html       |   7 +-
 docs/how_to/compile_models/from_tensorflow.html    |   2 +-
 docs/how_to/compile_models/sg_execution_times.html |  22 +-
 .../deploy_models/deploy_model_on_android.html     |   2 +-
 .../deploy_object_detection_pytorch.html           |  18 +-
 docs/how_to/deploy_models/deploy_prequantized.html |   6 +-
 .../deploy_models/deploy_prequantized_tflite.html  |   4 +-
 docs/how_to/deploy_models/deploy_quantized.html    |   2 +-
 docs/how_to/deploy_models/deploy_ssd_gluoncv.html  |  40 +-
 docs/how_to/deploy_models/sg_execution_times.html  |  18 +-
 .../extend_tvm/bring_your_own_datatypes.html       |   2 +-
 docs/how_to/extend_tvm/sg_execution_times.html     |   8 +-
 docs/how_to/extend_tvm/use_pass_instrument.html    |  16 +-
 docs/how_to/optimize_operators/opt_conv_cuda.html  |   2 +-
 .../optimize_operators/opt_conv_tensorcore.html    |   2 +-
 docs/how_to/optimize_operators/opt_gemm.html       |  16 +-
 .../optimize_operators/sg_execution_times.html     |   8 +-
 .../sg_execution_times.html                        |  18 +-
 .../tune_conv2d_layer_cuda.html                    |   4 +-
 .../tune_with_autoscheduler/tune_network_cuda.html |   2 +-
 .../tune_with_autoscheduler/tune_network_x86.html  |   4 +-
 .../tune_with_autoscheduler/tune_sparse_x86.html   | 425 +++++----------------
 .../tune_with_autotvm/sg_execution_times.html      |   8 +-
 .../how_to/tune_with_autotvm/tune_conv2d_cuda.html |  26 +-
 docs/how_to/work_with_microtvm/micro_autotune.html |  16 +-
 docs/how_to/work_with_microtvm/micro_train.html    |  16 +-
 .../work_with_microtvm/sg_execution_times.html     |  10 +-
 .../how_to/work_with_relay/sg_execution_times.html |   8 +-
 docs/how_to/work_with_schedules/intrin_math.html   |   2 +-
 .../work_with_schedules/sg_execution_times.html    |  16 +-
 docs/how_to/work_with_schedules/tensorize.html     |   2 +-
 docs/objects.inv                                   | Bin 23439 -> 23449 bytes
 .../classtvm_1_1tir_1_1ScheduleNode-members.html   |  81 ++--
 .../doxygen/classtvm_1_1tir_1_1ScheduleNode.html   |  51 +++
 ...lasstvm_1_1tir_1_1ScheduleNode__coll__graph.svg |   2 +-
 ...stvm_1_1tir_1_1ScheduleNode__inherit__graph.svg |   2 +-
 docs/reference/api/doxygen/database_8h_source.html |   2 +-
 docs/reference/api/doxygen/functions_func_m.html   |   2 +-
 docs/reference/api/doxygen/functions_func_p.html   |   3 +
 docs/reference/api/doxygen/functions_func_u.html   |   2 +-
 docs/reference/api/doxygen/functions_p.html        |   5 +-
 docs/reference/api/doxygen/functions_s.html        |   2 +-
 docs/reference/api/doxygen/functions_t.html        |   6 +-
 docs/reference/api/doxygen/functions_u.html        |   2 +-
 .../api/doxygen/measure__candidate_8h_source.html  |   2 +-
 docs/reference/api/doxygen/postproc_8h_source.html |   2 +-
 .../api/doxygen/schedule__rule_8h_source.html      |   2 +-
 docs/reference/api/doxygen/search/all_11.js        |   3 +-
 docs/reference/api/doxygen/search/all_13.js        |   8 +-
 docs/reference/api/doxygen/search/all_14.js        |   6 +-
 docs/reference/api/doxygen/search/all_15.js        |   2 +-
 docs/reference/api/doxygen/search/all_16.js        |   4 +-
 docs/reference/api/doxygen/search/all_18.js        |   2 +-
 docs/reference/api/doxygen/search/all_e.js         |   6 +-
 docs/reference/api/doxygen/search/functions_10.js  |   3 +-
 docs/reference/api/doxygen/search/functions_12.js  |   2 +-
 docs/reference/api/doxygen/search/functions_15.js  |   2 +-
 docs/reference/api/doxygen/search/functions_d.js   |   6 +-
 .../doxygen/tir_2schedule_2schedule_8h_source.html |   4 +-
 docs/reference/api/doxygen/trace_8h_source.html    |   2 +-
 docs/reference/api/python/auto_scheduler.html      |   4 +-
 docs/reference/api/python/tir.html                 | 112 +++++-
 .../api/typedoc/classes/bytestreamreader.html      |  12 +-
 .../api/typedoc/classes/cachedcallstack.html       |  34 +-
 docs/reference/api/typedoc/classes/dldatatype.html |  12 +-
 docs/reference/api/typedoc/classes/dldevice.html   |  10 +-
 .../reference/api/typedoc/classes/environment.html |  12 +-
 docs/reference/api/typedoc/classes/ffilibrary.html |  20 +-
 .../api/typedoc/classes/graphexecutor.html         |  16 +-
 docs/reference/api/typedoc/classes/instance.html   |  40 +-
 docs/reference/api/typedoc/classes/memory.html     |  34 +-
 docs/reference/api/typedoc/classes/module.html     |  10 +-
 docs/reference/api/typedoc/classes/ndarray.html    |  22 +-
 .../api/typedoc/classes/packedfunccell.html        |   6 +-
 docs/reference/api/typedoc/classes/rpcserver.html  |  14 +-
 docs/reference/api/typedoc/classes/scalar.html     |   6 +-
 .../api/typedoc/classes/webgpucontext.html         |  12 +-
 docs/reference/api/typedoc/enums/argtypecode.html  |  30 +-
 .../api/typedoc/enums/aynccallbackcode.html        |   4 +-
 .../api/typedoc/enums/dldatatypecode.html          |   8 +-
 .../api/typedoc/enums/rpcserverstate.html          |  12 +-
 docs/reference/api/typedoc/enums/sizeof.html       |  18 +-
 docs/reference/api/typedoc/index.html              | 112 +++---
 .../api/typedoc/interfaces/disposable.html         |   2 +-
 .../api/typedoc/interfaces/functioninfo.html       |   6 +-
 .../api/typedoc/interfaces/libraryprovider.html    |   4 +-
 docs/searchindex.js                                |   2 +-
 .../vta/tutorials/autotvm/sg_execution_times.html  |   4 +-
 .../tutorials/frontend/deploy_classification.html  |   2 +-
 .../vta/tutorials/frontend/deploy_detection.html   |   2 +-
 .../vta/tutorials/frontend/sg_execution_times.html |   6 +-
 .../vta/tutorials/optimize/sg_execution_times.html |   6 +-
 docs/topic/vta/tutorials/sg_execution_times.html   |   4 +-
 docs/tutorial/auto_scheduler_matmul_x86.html       |   4 +-
 docs/tutorial/autotvm_matmul_x86.html              |  20 +-
 docs/tutorial/autotvm_relay_x86.html               | 258 ++++++-------
 docs/tutorial/cross_compilation_and_rpc.html       |   2 +-
 docs/tutorial/intro_topi.html                      |   2 +-
 docs/tutorial/sg_execution_times.html              |  24 +-
 docs/tutorial/tensor_expr_get_started.html         |  44 +--
 154 files changed, 1241 insertions(+), 1537 deletions(-)

diff --git a/docs/_sources/how_to/compile_models/from_darknet.rst.txt b/docs/_sources/how_to/compile_models/from_darknet.rst.txt
index 29127929af..ee706e6619 100644
--- a/docs/_sources/how_to/compile_models/from_darknet.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_darknet.rst.txt
@@ -317,7 +317,7 @@ The process is no different from other examples.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  3.838 seconds)
+   **Total running time of the script:** ( 1 minutes  2.475 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_darknet.py:
diff --git a/docs/_sources/how_to/compile_models/from_keras.rst.txt b/docs/_sources/how_to/compile_models/from_keras.rst.txt
index 30629a6d70..98174860da 100644
--- a/docs/_sources/how_to/compile_models/from_keras.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_keras.rst.txt
@@ -228,7 +228,7 @@ Look up prediction top 1 index in 1000 class synset.
  .. code-block:: none
 
     Relay top-1 id: 285, class name: Egyptian cat
-
    1/1 [==============================] - ETA: 0s
    1/1 [==============================] - 1s 1s/step
+
    1/1 [==============================] - ETA: 0s
    1/1 [==============================] - 1s 966ms/step
     Keras top-1 id: 285, class name: Egyptian cat
 
 
diff --git a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
index 4be9a71ecb..9523fcd1be 100644
--- a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
@@ -115,7 +115,7 @@ In this section, we download a pretrained imagenet model and classify an image.
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip38f08d41-9504-4822-aa98-a937255ba425 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip077fb9ce-969e-4a67-ac38-f671c0d05edb from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
     x (1, 3, 224, 224)
 
 
diff --git a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
index 8b0d68fe08..dd37e4fd42 100644
--- a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
@@ -116,7 +116,7 @@ Load a pretrained OneFlow model and save model
  .. code-block:: none
 
     Downloading: "https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip" to /workspace/.oneflow/flowvision_cache/resnet18.zip
-
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
     19%|#9        | 7.99M/41.5M [00:00<00:00, 45.2MB/s]
     35%|###4      | 14.3M/41.5M [00:00<00:00, 42.9MB/s]
     54%|#####3    | 22.3M/41.5M [00:00<00:00, 50.3MB/s]
     67%|######7   | 27.9M/41.5M [00:00<00:00, 52.5MB/s]
     80%|#######9  | 33.0M/41.5M [00:00<00:00, 51.0MB/s]
     97%|#########6| 40.1M/41.5M [00:00<00:00, 57.2MB/s]
    100%|##########| 41.5M/41.5M [00:00<00:00, 54.0MB/s]
+
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
     19%|#9        | 7.99M/41.5M [00:00<00:00, 81.4MB/s]
     38%|###7      | 15.8M/41.5M [00:00<00:00, 78.9MB/s]
     56%|#####6    | 23.3M/41.5M [00:00<00:00, 54.3MB/s]
     72%|#######1  | 29.8M/41.5M [00:00<00:00, 58.5MB/s]
     86%|########6 | 35.9M/41.5M [00:00<00:00, 57.7MB/s]
    100%|##########| 41.5M/41.5M [00:00<00:00, 60.3MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
index 1ae40dd79f..373a26aa59 100644
--- a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
@@ -94,7 +94,7 @@ Load a pretrained PyTorch model
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
-
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
     13%|#2        | 5.73M/44.7M [00:00<00:00, 60.0MB/s]
     26%|##5       | 11.5M/44.7M [00:00<00:00, 58.5MB/s]
     76%|#######5  | 33.9M/44.7M [00:00<00:00, 138MB/s] 
    100%|##########| 44.7M/44.7M [00:00<00:00, 135MB/s]
+
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
     45%|####4     | 19.9M/44.7M [00:00<00:00, 209MB/s]
     89%|########9 | 39.8M/44.7M [00:00<00:00, 158MB/s]
    100%|##########| 44.7M/44.7M [00:00<00:00, 169MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
index d80953b32e..964bb894fa 100644
--- a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
@@ -423,7 +423,7 @@ Run the corresponding model on tensorflow
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  8.238 seconds)
+   **Total running time of the script:** ( 1 minutes  5.535 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_tensorflow.py:
diff --git a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
index 0e07dbb9dd..4de90dce49 100644
--- a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
@@ -5,26 +5,26 @@
 
 Computation times
 =================
-**05:16.532** total execution time for **how_to_compile_models** files:
+**05:05.167** total execution time for **how_to_compile_models** files:
 
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``) | 01:08.238 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``) | 01:05.535 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)       | 01:03.838 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)       | 01:02.475 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)         | 00:40.565 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)         | 00:38.886 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)       | 00:30.384 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)       | 00:27.948 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)           | 00:27.382 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)           | 00:26.324 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)         | 00:25.232 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)         | 00:25.453 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)         | 00:21.564 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)         | 00:21.732 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)       | 00:19.815 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)       | 00:19.222 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)           | 00:17.074 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)           | 00:15.286 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)             | 00:02.439 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)             | 00:02.306 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
index ccf4c4cfe0..ed2f51d866 100644
--- a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
@@ -441,7 +441,7 @@ Execute on TVM
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      15.7978      15.7722      15.9916      15.6921       0.0990   
+      15.7261      15.6897      16.1582      15.5028       0.1797   
                
 
 
diff --git a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
index 20d8e7b83c..9fa2e7ead4 100644
--- a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
@@ -123,7 +123,7 @@ Load pre-trained maskrcnn from torchvision and do tracing
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth" to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
-
      0%|          | 0.00/170M [00:00<?, ?B/s]
     12%|#1        | 19.6M/170M [00:00<00:00, 205MB/s]
     29%|##8       | 48.8M/170M [00:00<00:00, 264MB/s]
     47%|####7     | 80.3M/170M [00:00<00:00, 295MB/s]
     64%|######3   | 108M/170M [00:00<00:00, 293MB/s] 
     80%|########  | 136M/170M [00:00<00:00, 284MB/s]
     97%|#########7| 165M/170M [00:00<00:00, 289MB/s]
    100%|##########| 170M/170M [00:00<00:00, 282MB/s]
+
      0%|          | 0.00/170M [00:00<?, ?B/s]
      5%|4         | 8.27M/170M [00:00<00:01, 86.7MB/s]
     19%|#8        | 32.0M/170M [00:00<00:00, 182MB/s] 
     33%|###2      | 55.5M/170M [00:00<00:00, 211MB/s]
     45%|####4     | 75.7M/170M [00:00<00:00, 175MB/s]
     55%|#####5    | 94.2M/170M [00:00<00:00, 181MB/s]
     69%|######9   | 118M/170M [00:00<00:00, 201MB/s] 
     81%|########  | 137M/170M [00:00<00:00, 188MB/s]
     92%|#########2| 157M/170M [00:00<00:00, 193MB/s]
    100%|##########| 170M/170M [00:00<00:00, 191MB/s]
     /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
       for i in range(dim)
     /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
@@ -295,7 +295,7 @@ Get boxes with score larger than 0.9
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 3 minutes  4.633 seconds)
+   **Total running time of the script:** ( 2 minutes  56.827 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_object_detection_pytorch.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
index 9826b0bb37..5b2428359e 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
@@ -232,7 +232,7 @@ training. Other models require a full post training calibration.
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/mobilenet_v2-b0353104.pth" to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
-
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 175MB/s]
+
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 163MB/s]
 
 
 
@@ -412,7 +412,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      90.3583      90.2349      95.9917      90.1003       0.6798   
+      90.3589      90.1998      96.2542      90.0808       0.7585   
                
 
 
@@ -461,7 +461,7 @@ TODO
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  11.746 seconds)
+   **Total running time of the script:** ( 1 minutes  8.664 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
index b09dd4ee8c..15b16a1945 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
@@ -439,7 +439,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      120.1182     120.0640     122.5953     119.1741      0.4515   
+      120.7587     120.6116     128.2929     119.9034      0.9078   
                
 
 
@@ -476,7 +476,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  3.760 seconds)
+   **Total running time of the script:** ( 1 minutes  58.803 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized_tflite.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
index 9ccdd9ad26..55990a03ed 100644
--- a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
@@ -255,7 +255,7 @@ We create a Relay VM to build and execute the model.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  22.186 seconds)
+   **Total running time of the script:** ( 1 minutes  20.897 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_quantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
index 78a4b401e5..5c654e8fe0 100644
--- a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
@@ -158,7 +158,7 @@ Convert and compile model for CPU.
             data: None
       input_sym_arg_type = in_param.infer_type()[0]
     Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
-
      0%|          | 0/132723 [00:00<?, ?KB/s]
      4%|3         | 4969/132723 [00:00<00:02, 49683.65KB/s]
      9%|8         | 11927/132723 [00:00<00:01, 61381.21KB/s]
     14%|#4        | 18833/132723 [00:00<00:01, 64881.68KB/s]
     19%|#9        | 25633/132723 [00:00<00:01, 66110.05KB/s]
     25%|##4       | 32560/132723 [00:00<00:01, 67245.72KB/s]
     30%|##9       | 39569/132723 [00:00<00:01, 68205.90KB/s]
     35%|###5      | 46558/132723 [00:00<00:01, 68753.42KB/s]
     40%|####      | 53472/132723 [00:00<00:01, 68874.98KB/s]
     46%|####5     | 60558/132723 [00:00<00:01, 69493.78KB/s]
     51%|#####     | 67523/132723 [00:01<00:00, 69539.11KB/s]
     56%|#####6    | 74559/132723 [00:01<00:00, 69785.86KB/s]
     61%|######1   | 81588/132723 [00:01<00:00, 69937.91KB/s]
     67%|######6   | 88627/132723 [00:01<00:00, 70072.09KB/s]
     72%|#######2  | 95635/132723 [00:01<00:00, 69879.11KB/s]
     77%|#######7  | 102691/132723 [00:01<00:00, 70082.50KB/s]
     83%|########2 
 | 109736/132723 [00:01<00:00, 70190.67KB/s]
     88%|########7 | 116756/132723 [00:01<00:00, 69938.09KB/s]
     94%|#########3| 124131/132723 [00:01<00:00, 71077.12KB/s]
     99%|#########8| 131371/132723 [00:01<00:00, 71470.74KB/s]
    100%|##########| 132723/132723 [00:01<00:00, 69077.37KB/s]
+
      0%|          | 0/132723 [00:00<?, ?KB/s]
      4%|4         | 5441/132723 [00:00<00:02, 54406.16KB/s]
     10%|9         | 12645/132723 [00:00<00:01, 64776.29KB/s]
     15%|#5        | 20398/132723 [00:00<00:01, 70596.88KB/s]
     21%|##1       | 28055/132723 [00:00<00:01, 72952.10KB/s]
     27%|##6       | 35707/132723 [00:00<00:01, 74235.86KB/s]
     33%|###2      | 43497/132723 [00:00<00:01, 75476.30KB/s]
     39%|###8      | 51245/132723 [00:00<00:01, 76128.89KB/s]
     44%|####4     | 59026/132723 [00:00<00:00, 76661.67KB/s]
     50%|#####     | 66770/132723 [00:00<00:00, 76903.73KB/s]
     56%|#####6    | 74486/132723 [00:01<00:00, 76980.65KB/s]
     62%|######1   | 82236/132723 [00:01<00:00, 77137.31KB/s]
     68%|######7   | 89976/132723 [00:01<00:00, 77215.39KB/s]
     74%|#######3  | 97739/132723 [00:01<00:00, 77337.60KB/s]
     80%|#######9  | 105567/132723 [00:01<00:00, 77619.90KB/s]
     85%|########5 | 113350/132723 [00:01<00:00, 77678.39KB/s]
     91%|#########
 1| 121181/132723 [00:01<00:00, 77867.51KB/s]
     97%|#########7| 128988/132723 [00:01<00:00, 77926.40KB/s]
    100%|##########| 132723/132723 [00:01<00:00, 75927.83KB/s]
 
 
 
@@ -241,7 +241,7 @@ Display result
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  37.628 seconds)
+   **Total running time of the script:** ( 2 minutes  35.386 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_ssd_gluoncv.py:
diff --git a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
index 6c0a137d7b..d491e09b29 100644
--- a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
@@ -5,24 +5,24 @@
 
 Computation times
 =================
-**11:38.358** total execution time for **how_to_deploy_models** files:
+**11:15.103** total execution time for **how_to_deploy_models** files:
 
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``) | 03:04.633 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``) | 02:56.827 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)                           | 02:37.628 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)                           | 02:35.386 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)           | 02:03.760 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)           | 01:58.803 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)                               | 01:22.186 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)                               | 01:20.897 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)                         | 01:11.746 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)                         | 01:08.664 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)                 | 00:32.246 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)                 | 00:29.259 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_nano.py` (``deploy_model_on_nano.py``)                       | 00:23.644 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_nano.py` (``deploy_model_on_nano.py``)                       | 00:22.892 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)                       | 00:22.508 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)                       | 00:22.367 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_deploy_models_deploy_sparse.py` (``deploy_sparse.py``)                                     | 00:00.007 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
index 01f7c92c10..35fb0cb96e 100644
--- a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
@@ -476,7 +476,7 @@ First let us define two helper functions to get the mobilenet model and a cat im
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zipbcdcf044-386a-4c95-a634-4f83076f4b8e from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip479f8cbc-19b2-4c49-bd9e-ecc16b4f6a39 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 
 
 
diff --git a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
index fb95a7c7c6..ec14b6cbbd 100644
--- a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
@@ -5,14 +5,14 @@
 
 Computation times
 =================
-**00:40.316** total execution time for **how_to_extend_tvm** files:
+**00:41.313** total execution time for **how_to_extend_tvm** files:
 
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``) | 00:37.249 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``) | 00:38.151 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)           | 00:02.141 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)           | 00:02.231 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)                     | 00:00.918 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)                     | 00:00.924 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_extend_tvm_low_level_custom_pass.py` (``low_level_custom_pass.py``)       | 00:00.008 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
index f5bc5b087e..b61844ffaf 100644
--- a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
@@ -216,10 +216,10 @@ profile the execution time of each pass.
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 7733us [7733us] (49.58%; 49.58%)
-    FoldScaleAxis: 7863us [5us] (50.42%; 50.42%)
-            FoldConstant: 7858us [1616us] (50.38%; 99.94%)
-                    InferType: 6242us [6242us] (40.02%; 79.43%)
+    InferType: 6733us [6733us] (46.27%; 46.27%)
+    FoldScaleAxis: 7818us [5us] (53.73%; 53.73%)
+            FoldConstant: 7813us [1630us] (53.69%; 99.93%)
+                    InferType: 6183us [6183us] (42.49%; 79.14%)
 
 
 
@@ -258,10 +258,10 @@ Refer to following sections and :py:func:`tvm.instrument.pass_instrument` for th
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 6329us [6329us] (44.35%; 44.35%)
-    FoldScaleAxis: 7943us [4us] (55.65%; 55.65%)
-            FoldConstant: 7939us [1637us] (55.62%; 99.95%)
-                    InferType: 6302us [6302us] (44.16%; 79.38%)
+    InferType: 6205us [6205us] (44.24%; 44.24%)
+    FoldScaleAxis: 7820us [4us] (55.76%; 55.76%)
+            FoldConstant: 7816us [1685us] (55.73%; 99.94%)
+                    InferType: 6131us [6131us] (43.72%; 78.45%)
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
index 36d16a9ffe..a5a9bedd91 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
@@ -340,7 +340,7 @@ latency of convolution.
 
  .. code-block:: none
 
-    Convolution: 54.151693 ms
+    Convolution: 54.099895 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
index 61a26f6bb9..fb50230444 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
@@ -671,7 +671,7 @@ be able to run on our build server
 
  .. code-block:: none
 
-    conv2d with tensor core: 8.537677 ms
+    conv2d with tensor core: 6.482232 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
index e20d89bb8f..0a261ef902 100644
--- a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
@@ -143,8 +143,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 
  .. code-block:: none
 
-    Numpy running time: 0.018252
-    Baseline: 3.441999
+    Numpy running time: 0.018508
+    Baseline: 3.439445
 
 
 
@@ -239,7 +239,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 
  .. code-block:: none
 
-    Opt1: 0.295153
+    Opt1: 0.298512
 
 
 
@@ -342,7 +342,7 @@ In this tutorial, we chose to vectorize the inner loop row data since it is cach
 
  .. code-block:: none
 
-    Opt2: 0.325529
+    Opt2: 0.332190
 
 
 
@@ -438,7 +438,7 @@ the access pattern for A matrix is more cache friendly.
 
  .. code-block:: none
 
-    Opt3: 0.115407
+    Opt3: 0.115539
 
 
 
@@ -563,7 +563,7 @@ flattening.
 
  .. code-block:: none
 
-    Opt4: 0.110130
+    Opt4: 0.109163
 
 
 
@@ -685,7 +685,7 @@ write to C when all the block results are ready.
 
  .. code-block:: none
 
-    Opt5: 0.110737
+    Opt5: 0.110682
 
 
 
@@ -810,7 +810,7 @@ Furthermore, we can also utilize multi-core processors to do the thread-level pa
 
  .. code-block:: none
 
-    Opt6: 0.147071
+    Opt6: 0.147013
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
index b86751c703..93e5adcd7c 100644
--- a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
@@ -5,12 +5,12 @@
 
 Computation times
 =================
-**00:34.479** total execution time for **how_to_optimize_operators** files:
+**00:34.498** total execution time for **how_to_optimize_operators** files:
 
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)                       | 00:32.111 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)                       | 00:32.243 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``) | 00:01.293 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``) | 00:01.220 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)             | 00:01.074 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)             | 00:01.035 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
index 0af66f6cb5..444f9ef6b1 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
@@ -5,18 +5,18 @@
 
 Computation times
 =================
-**06:42.374** total execution time for **how_to_tune_with_autoscheduler** files:
+**06:24.910** total execution time for **how_to_tune_with_autoscheduler** files:
 
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``) | 03:38.089 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``) | 03:28.377 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)             | 01:24.486 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)             | 01:22.647 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)           | 00:58.199 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)           | 00:56.055 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)               | 00:23.694 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)               | 00:20.373 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)             | 00:09.050 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)           | 00:08.879 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)           | 00:08.856 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)             | 00:08.579 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
index f4d9acf501..37a126c3d3 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
@@ -771,7 +771,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 0.375 ms
+    Execution time of this operator: 0.364 ms
 
 
 
@@ -1378,7 +1378,7 @@ In the example below we resume the status and do 5 more trials.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 3 minutes  38.089 seconds)
+   **Total running time of the script:** ( 3 minutes  28.377 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
index b20735a674..718ff8adf7 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
@@ -647,7 +647,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       8.2668       8.2655       8.2728       8.2623       0.0044   
+       8.1831       8.1809       8.1958       8.1727       0.0096   
                
 
 
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
index 12e1366899..40aa3eb6ac 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
@@ -666,7 +666,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      760.9283     760.8877     761.6646     760.2326      0.5853   
+      753.8122     753.3414     754.9024     753.1927      0.7733   
                
 
 
@@ -694,7 +694,7 @@ Other Tips
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  24.486 seconds)
+   **Total running time of the script:** ( 1 minutes  22.647 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_network_x86.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
index 99215816cc..a08619bdb7 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
@@ -397,339 +397,106 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
                  placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
                  compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
       buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-      preflattened_buffer_map = {placeholder_8: placeholder_15: Buffer(placeholder_13, int32, [33], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_9: placeholder_17: Buffer(placeholder_14, float32, [128, 512], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_7: placeholder_19: Buffer(placeholder_12, int32, [4916], [])} {
-      for (i0.outer.i1.outer.fused: int32, 0, 256) "parallel" {
-        allocate(compute_4: Pointer(global float32), float32, [256]), storage_scope = global {
+      preflattened_buffer_map = {compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_8: placeholder_15: Buffer(placeholder_13, int32, [33], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_7: placeholder_17: Buffer(placeholder_12, int32, [4916], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_9: placeholder_19: Buffer(placeholder_14, float32, [128, 512], [])} {
+      for (i0.outer.i1.outer.fused: int32, 0, 64) "parallel" {
+        allocate(compute_4: Pointer(global float32), float32, [1024]), storage_scope = global {
           for (i.outer.inner: int32, 0, 4) {
-            let cse_var_2: int32 = floormod(i0.outer.i1.outer.fused, 32)
-            let cse_var_1: int32 = (i.outer.inner*64)
-             {
-              compute_5: Buffer(compute_4, float32, [256], [])[cse_var_1] = 0f32
-              compute_5[(cse_var_1 + 1)] = 0f32
-              compute_5[(cse_var_1 + 2)] = 0f32
-              compute_5[(cse_var_1 + 3)] = 0f32
-              compute_5[(cse_var_1 + 4)] = 0f32
-              compute_5[(cse_var_1 + 5)] = 0f32
-              compute_5[(cse_var_1 + 6)] = 0f32
-              compute_5[(cse_var_1 + 7)] = 0f32
-              compute_5[(cse_var_1 + 8)] = 0f32
-              compute_5[(cse_var_1 + 9)] = 0f32
-              compute_5[(cse_var_1 + 10)] = 0f32
-              compute_5[(cse_var_1 + 11)] = 0f32
-              compute_5[(cse_var_1 + 12)] = 0f32
-              compute_5[(cse_var_1 + 13)] = 0f32
-              compute_5[(cse_var_1 + 14)] = 0f32
-              compute_5[(cse_var_1 + 15)] = 0f32
-              compute_5[(cse_var_1 + 16)] = 0f32
-              compute_5[(cse_var_1 + 17)] = 0f32
-              compute_5[(cse_var_1 + 18)] = 0f32
-              compute_5[(cse_var_1 + 19)] = 0f32
-              compute_5[(cse_var_1 + 20)] = 0f32
-              compute_5[(cse_var_1 + 21)] = 0f32
-              compute_5[(cse_var_1 + 22)] = 0f32
-              compute_5[(cse_var_1 + 23)] = 0f32
-              compute_5[(cse_var_1 + 24)] = 0f32
-              compute_5[(cse_var_1 + 25)] = 0f32
-              compute_5[(cse_var_1 + 26)] = 0f32
-              compute_5[(cse_var_1 + 27)] = 0f32
-              compute_5[(cse_var_1 + 28)] = 0f32
-              compute_5[(cse_var_1 + 29)] = 0f32
-              compute_5[(cse_var_1 + 30)] = 0f32
-              compute_5[(cse_var_1 + 31)] = 0f32
-              compute_5[(cse_var_1 + 32)] = 0f32
-              compute_5[(cse_var_1 + 33)] = 0f32
-              compute_5[(cse_var_1 + 34)] = 0f32
-              compute_5[(cse_var_1 + 35)] = 0f32
-              compute_5[(cse_var_1 + 36)] = 0f32
-              compute_5[(cse_var_1 + 37)] = 0f32
-              compute_5[(cse_var_1 + 38)] = 0f32
-              compute_5[(cse_var_1 + 39)] = 0f32
-              compute_5[(cse_var_1 + 40)] = 0f32
-              compute_5[(cse_var_1 + 41)] = 0f32
-              compute_5[(cse_var_1 + 42)] = 0f32
-              compute_5[(cse_var_1 + 43)] = 0f32
-              compute_5[(cse_var_1 + 44)] = 0f32
-              compute_5[(cse_var_1 + 45)] = 0f32
-              compute_5[(cse_var_1 + 46)] = 0f32
-              compute_5[(cse_var_1 + 47)] = 0f32
-              compute_5[(cse_var_1 + 48)] = 0f32
-              compute_5[(cse_var_1 + 49)] = 0f32
-              compute_5[(cse_var_1 + 50)] = 0f32
-              compute_5[(cse_var_1 + 51)] = 0f32
-              compute_5[(cse_var_1 + 52)] = 0f32
-              compute_5[(cse_var_1 + 53)] = 0f32
-              compute_5[(cse_var_1 + 54)] = 0f32
-              compute_5[(cse_var_1 + 55)] = 0f32
-              compute_5[(cse_var_1 + 56)] = 0f32
-              compute_5[(cse_var_1 + 57)] = 0f32
-              compute_5[(cse_var_1 + 58)] = 0f32
-              compute_5[(cse_var_1 + 59)] = 0f32
-              compute_5[(cse_var_1 + 60)] = 0f32
-              compute_5[(cse_var_1 + 61)] = 0f32
-              compute_5[(cse_var_1 + 62)] = 0f32
-              compute_5[(cse_var_1 + 63)] = 0f32
-              for (elem_idx: int32, 0, (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])) {
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  compute_5[cse_var_1] = (compute_5[cse_var_1] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_3: int32 = (cse_var_1 + 1)
-                  compute_5[cse_var_3] = (compute_5[cse_var_3] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_4: int32 = (cse_var_1 + 2)
-                  compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_5: int32 = (cse_var_1 + 3)
-                  compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_6: int32 = (cse_var_1 + 4)
-                  compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_7: int32 = (cse_var_1 + 5)
-                  compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_8: int32 = (cse_var_1 + 6)
-                  compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_9: int32 = (cse_var_1 + 7)
-                  compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_10: int32 = (cse_var_1 + 8)
-                  compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_11: int32 = (cse_var_1 + 9)
-                  compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_12: int32 = (cse_var_1 + 10)
-                  compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_13: int32 = (cse_var_1 + 11)
-                  compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_14: int32 = (cse_var_1 + 12)
-                  compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_15: int32 = (cse_var_1 + 13)
-                  compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_16: int32 = (cse_var_1 + 14)
-                  compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_17: int32 = (cse_var_1 + 15)
-                  compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_18: int32 = (cse_var_1 + 16)
-                  compute_5[cse_var_18] = (compute_5[cse_var_18] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_19: int32 = (cse_var_1 + 17)
-                  compute_5[cse_var_19] = (compute_5[cse_var_19] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_20: int32 = (cse_var_1 + 18)
-                  compute_5[cse_var_20] = (compute_5[cse_var_20] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_21: int32 = (cse_var_1 + 19)
-                  compute_5[cse_var_21] = (compute_5[cse_var_21] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_22: int32 = (cse_var_1 + 20)
-                  compute_5[cse_var_22] = (compute_5[cse_var_22] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_23: int32 = (cse_var_1 + 21)
-                  compute_5[cse_var_23] = (compute_5[cse_var_23] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_24: int32 = (cse_var_1 + 22)
-                  compute_5[cse_var_24] = (compute_5[cse_var_24] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_25: int32 = (cse_var_1 + 23)
-                  compute_5[cse_var_25] = (compute_5[cse_var_25] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_26: int32 = (cse_var_1 + 24)
-                  compute_5[cse_var_26] = (compute_5[cse_var_26] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_27: int32 = (cse_var_1 + 25)
-                  compute_5[cse_var_27] = (compute_5[cse_var_27] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_28: int32 = (cse_var_1 + 26)
-                  compute_5[cse_var_28] = (compute_5[cse_var_28] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_29: int32 = (cse_var_1 + 27)
-                  compute_5[cse_var_29] = (compute_5[cse_var_29] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_30: int32 = (cse_var_1 + 28)
-                  compute_5[cse_var_30] = (compute_5[cse_var_30] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_31: int32 = (cse_var_1 + 29)
-                  compute_5[cse_var_31] = (compute_5[cse_var_31] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_32: int32 = (cse_var_1 + 30)
-                  compute_5[cse_var_32] = (compute_5[cse_var_32] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_33: int32 = (cse_var_1 + 31)
-                  compute_5[cse_var_33] = (compute_5[cse_var_33] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_34: int32 = (cse_var_1 + 32)
-                  compute_5[cse_var_34] = (compute_5[cse_var_34] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_35: int32 = (cse_var_1 + 33)
-                  compute_5[cse_var_35] = (compute_5[cse_var_35] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_36: int32 = (cse_var_1 + 34)
-                  compute_5[cse_var_36] = (compute_5[cse_var_36] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_37: int32 = (cse_var_1 + 35)
-                  compute_5[cse_var_37] = (compute_5[cse_var_37] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_38: int32 = (cse_var_1 + 36)
-                  compute_5[cse_var_38] = (compute_5[cse_var_38] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_39: int32 = (cse_var_1 + 37)
-                  compute_5[cse_var_39] = (compute_5[cse_var_39] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_40: int32 = (cse_var_1 + 38)
-                  compute_5[cse_var_40] = (compute_5[cse_var_40] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_41: int32 = (cse_var_1 + 39)
-                  compute_5[cse_var_41] = (compute_5[cse_var_41] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_42: int32 = (cse_var_1 + 40)
-                  compute_5[cse_var_42] = (compute_5[cse_var_42] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_43: int32 = (cse_var_1 + 41)
-                  compute_5[cse_var_43] = (compute_5[cse_var_43] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_44: int32 = (cse_var_1 + 42)
-                  compute_5[cse_var_44] = (compute_5[cse_var_44] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_45: int32 = (cse_var_1 + 43)
-                  compute_5[cse_var_45] = (compute_5[cse_var_45] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_46: int32 = (cse_var_1 + 44)
-                  compute_5[cse_var_46] = (compute_5[cse_var_46] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_47: int32 = (cse_var_1 + 45)
-                  compute_5[cse_var_47] = (compute_5[cse_var_47] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_48: int32 = (cse_var_1 + 46)
-                  compute_5[cse_var_48] = (compute_5[cse_var_48] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_49: int32 = (cse_var_1 + 47)
-                  compute_5[cse_var_49] = (compute_5[cse_var_49] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_50: int32 = (cse_var_1 + 48)
-                  compute_5[cse_var_50] = (compute_5[cse_var_50] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_51: int32 = (cse_var_1 + 49)
-                  compute_5[cse_var_51] = (compute_5[cse_var_51] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_52: int32 = (cse_var_1 + 50)
-                  compute_5[cse_var_52] = (compute_5[cse_var_52] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_53: int32 = (cse_var_1 + 51)
-                  compute_5[cse_var_53] = (compute_5[cse_var_53] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_54: int32 = (cse_var_1 + 52)
-                  compute_5[cse_var_54] = (compute_5[cse_var_54] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_55: int32 = (cse_var_1 + 53)
-                  compute_5[cse_var_55] = (compute_5[cse_var_55] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_56: int32 = (cse_var_1 + 54)
-                  compute_5[cse_var_56] = (compute_5[cse_var_56] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_57: int32 = (cse_var_1 + 55)
-                  compute_5[cse_var_57] = (compute_5[cse_var_57] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_58: int32 = (cse_var_1 + 56)
-                  compute_5[cse_var_58] = (compute_5[cse_var_58] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_59: int32 = (cse_var_1 + 57)
-                  compute_5[cse_var_59] = (compute_5[cse_var_59] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_60: int32 = (cse_var_1 + 58)
-                  compute_5[cse_var_60] = (compute_5[cse_var_60] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_61: int32 = (cse_var_1 + 59)
-                  compute_5[cse_var_61] = (compute_5[cse_var_61] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_62: int32 = (cse_var_1 + 60)
-                  compute_5[cse_var_62] = (compute_5[cse_var_62] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_63: int32 = (cse_var_1 + 61)
-                  compute_5[cse_var_63] = (compute_5[cse_var_63] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_64: int32 = (cse_var_1 + 62)
-                  compute_5[cse_var_64] = (compute_5[cse_var_64] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-                }
-                if @tir.likely((elem_idx < (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-                  let cse_var_65: int32 = (cse_var_1 + 63)
-                  compute_5[cse_var_65] = (compute_5[cse_var_65] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
+            for (i.inner.init: int32, 0, 16) {
+              let cse_var_1: int32 = ((i.outer.inner*256) + (i.inner.init*16))
+               {
+                compute_5: Buffer(compute_4, float32, [1024], [])[cse_var_1] = 0f32
+                compute_5[(cse_var_1 + 1)] = 0f32
+                compute_5[(cse_var_1 + 2)] = 0f32
+                compute_5[(cse_var_1 + 3)] = 0f32
+                compute_5[(cse_var_1 + 4)] = 0f32
+                compute_5[(cse_var_1 + 5)] = 0f32
+                compute_5[(cse_var_1 + 6)] = 0f32
+                compute_5[(cse_var_1 + 7)] = 0f32
+                compute_5[(cse_var_1 + 8)] = 0f32
+                compute_5[(cse_var_1 + 9)] = 0f32
+                compute_5[(cse_var_1 + 10)] = 0f32
+                compute_5[(cse_var_1 + 11)] = 0f32
+                compute_5[(cse_var_1 + 12)] = 0f32
+                compute_5[(cse_var_1 + 13)] = 0f32
+                compute_5[(cse_var_1 + 14)] = 0f32
+                compute_5[(cse_var_1 + 15)] = 0f32
+              }
+            }
+            for (elem_idx: int32, 0, let cse_var_2: int32 = floormod(i0.outer.i1.outer.fused, 32) in (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])) {
+              for (i.inner: int32, 0, 16) {
+                let cse_var_3: int32 = floormod(i0.outer.i1.outer.fused, 32)
+                 {
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_4: int32 = ((i.outer.inner*256) + (i.inner*16))
+                    compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[((placeholder_3[cse_var_3]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_5: int32 = (((i.outer.inner*256) + (i.inner*16)) + 1)
+                    compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_6: int32 = (((i.outer.inner*256) + (i.inner*16)) + 2)
+                    compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_7: int32 = (((i.outer.inner*256) + (i.inner*16)) + 3)
+                    compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_8: int32 = (((i.outer.inner*256) + (i.inner*16)) + 4)
+                    compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_9: int32 = (((i.outer.inner*256) + (i.inner*16)) + 5)
+                    compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_10: int32 = (((i.outer.inner*256) + (i.inner*16)) + 6)
+                    compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_11: int32 = (((i.outer.inner*256) + (i.inner*16)) + 7)
+                    compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_12: int32 = (((i.outer.inner*256) + (i.inner*16)) + 8)
+                    compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_13: int32 = (((i.outer.inner*256) + (i.inner*16)) + 9)
+                    compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_14: int32 = (((i.outer.inner*256) + (i.inner*16)) + 10)
+                    compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_15: int32 = (((i.outer.inner*256) + (i.inner*16)) + 11)
+                    compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_16: int32 = (((i.outer.inner*256) + (i.inner*16)) + 12)
+                    compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_17: int32 = (((i.outer.inner*256) + (i.inner*16)) + 13)
+                    compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_18: int32 = (((i.outer.inner*256) + (i.inner*16)) + 14)
+                    compute_5[cse_var_18] = (compute_5[cse_var_18] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
+                  if @tir.likely((elem_idx < (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                    let cse_var_19: int32 = (((i.outer.inner*256) + (i.inner*16)) + 15)
+                    compute_5[cse_var_19] = (compute_5[cse_var_19] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+                  }
                 }
               }
             }
           }
-          for (i0.inner: int32, 0, 16) {
-            let cse_var_66: int32 = (((floordiv(i0.outer.i1.outer.fused, 32)*8192) + (i0.inner*512)) + (floormod(i0.outer.i1.outer.fused, 32)*16))
-            compute[ramp(cse_var_66, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_66, 1, 16)]), broadcast(0f32, 16))
+          for (i0.inner: int32, 0, 64) {
+            let cse_var_20: int32 = (((floordiv(i0.outer.i1.outer.fused, 32)*32768) + (i0.inner*512)) + (floormod(i0.outer.i1.outer.fused, 32)*16))
+            compute[ramp(cse_var_20, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_20, 1, 16)]), broadcast(0f32, 16))
           }
         }
       }
@@ -785,7 +552,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 3.106 ms
+    Execution time of this operator: 2.112 ms
 
 
 
diff --git a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
index 85d3b59490..466eb05175 100644
--- a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
@@ -5,16 +5,16 @@
 
 Computation times
 =================
-**00:45.933** total execution time for **how_to_tune_with_autotvm** files:
+**00:46.853** total execution time for **how_to_tune_with_autotvm** files:
 
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)           | 00:45.898 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)           | 00:46.818 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_x86.py` (``tune_relay_x86.py``)               | 00:00.020 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_cuda.py` (``tune_relay_cuda.py``)             | 00:00.005 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_arm.py` (``tune_relay_arm.py``)               | 00:00.005 | 0.0 MB |
-+--------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_mobile_gpu.py` (``tune_relay_mobile_gpu.py``) | 00:00.005 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
+| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_arm.py` (``tune_relay_arm.py``)               | 00:00.005 | 0.0 MB |
++--------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
index 3c736a6923..146c7437ff 100644
--- a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
@@ -1156,8 +1156,8 @@ for this template
     TimeoutError
 
             [('tile_f', [-1, 2, 1, 64]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4909501
-    No: 9   GFLOPS: 115.95/115.95   result: MeasureResult(costs=(0.0019965838571428573,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.0792434215545654, timestamp=1663246830.32257)        [('tile_f', [-1, 1, 4, 8]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 2, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5072689
-    No: 10  GFLOPS: 0.00/115.95     result: Traceback (most recent call last):
+    No: 9   GFLOPS: 177.31/177.31   result: MeasureResult(costs=(0.0013056615444444445,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8902785778045654, timestamp=1663277972.252658)       [('tile_f', [-1, 1, 4, 8]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 2, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5072689
+    No: 10  GFLOPS: 0.00/177.31     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1280,8 +1280,8 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 4, 8]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 64, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5092711
-    No: 11  GFLOPS: 258.69/258.69   result: MeasureResult(costs=(0.0008948907678571429,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.5719780921936035, timestamp=1663246831.0225298)      [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
-    No: 12  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+    No: 11  GFLOPS: 260.55/260.55   result: MeasureResult(costs=(0.0008885024861878452,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6748344898223877, timestamp=1663277973.1763225)      [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
+    No: 12  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1404,7 +1404,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 128, 1, 2]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 256]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,183542
-    No: 13  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+    No: 13  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1527,7 +1527,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 8, 8]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 64]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2482196
-    No: 14  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+    No: 14  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1650,9 +1650,9 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 64, 1, 4]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10306226
-    No: 15  GFLOPS: 5.29/258.69     result: MeasureResult(costs=(0.043792867,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8540875911712646, timestamp=1663246835.5596945)        [('tile_f', [-1, 2, 2, 8]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 8]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,5330964
-    No: 16  GFLOPS: 3.33/258.69     result: MeasureResult(costs=(0.06942636725,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.532775402069092, timestamp=1663246836.8009863)       [('tile_f', [-1, 8, 4, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2140058
-    No: 17  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+    No: 15  GFLOPS: 5.45/260.55     result: MeasureResult(costs=(0.04245233475,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8400776386260986, timestamp=1663277977.7528288)      [('tile_f', [-1, 2, 2, 8]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 8]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,5330964
+    No: 16  GFLOPS: 3.35/260.55     result: MeasureResult(costs=(0.06918441774999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.568036079406738, timestamp=1663277978.9816248) [('tile_f', [-1, 8, 4, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2140058
+    No: 17  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 142, in build
         res = future.result()
       File "/usr/lib/python3.7/concurrent/futures/_base.py", line 435, in result
@@ -1670,8 +1670,8 @@ for this template
     TimeoutError
 
             [('tile_f', [-1, 2, 2, 1]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 16]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10195251
-    No: 18  GFLOPS: 28.15/258.69    result: MeasureResult(costs=(0.008225285214285715,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.3043599128723145, timestamp=1663246847.8164864)       [('tile_f', [-1, 4, 8, 4]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6068603
-    No: 19  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+    No: 18  GFLOPS: 26.56/260.55    result: MeasureResult(costs=(0.008714610333333333,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.167480707168579, timestamp=1663277989.870366) [('tile_f', [-1, 4, 8, 4]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6068603
+    No: 19  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1794,7 +1794,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 16, 4, 8]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 128]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6956993
-    No: 20  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+    No: 20  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1973,7 +1973,7 @@ and measure running time.
     Best config:
     [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
     Finish loading 20 records
-    Time cost of this operator: 0.001297
+    Time cost of this operator: 0.001237
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
index 1973e584e2..1d8e9e3215 100644
--- a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
@@ -329,10 +329,10 @@ Timing the untuned program
     ########## Build without Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)  
     ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  311.9     98.723   (1, 2, 10, 10, 3)  2       1        [311.9]           
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.069     0.971    (1, 6, 10, 10)     1       1        [3.069]           
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.964     0.305    (1, 1, 10, 10, 3)  1       1        [0.964]           
-    Total_time                                    -                                             315.933   -        -                  -       -        -                 
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  308.3     98.717   (1, 2, 10, 10, 3)  2       1        [308.3]           
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.045     0.975    (1, 6, 10, 10)     1       1        [3.045]           
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.963     0.309    (1, 1, 10, 10, 3)  1       1        [0.963]           
+    Total_time                                    -                                             312.308   -        -                  -       -        -                 
 
 
 
@@ -398,10 +398,10 @@ Timing the tuned program
     ########## Build with Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)  
     ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  216.9     98.597   (1, 1, 10, 10, 6)  2       1        [216.9]           
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       2.243     1.02     (1, 6, 10, 10)     1       1        [2.243]           
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.843     0.383    (1, 3, 10, 10, 1)  1       1        [0.843]           
-    Total_time                                    -                                             219.985   -        -                  -       -        -                 
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  79.75     96.642   (1, 6, 10, 10, 1)  2       1        [79.75]           
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.795     2.175    (1, 6, 10, 10)     1       1        [1.795]           
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.976     1.182    (1, 1, 10, 10, 3)  1       1        [0.976]           
+    Total_time                                    -                                             82.521    -        -                  -       -        -                 
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt b/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
index 9fc196889a..269d45314a 100644
--- a/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
@@ -225,7 +225,7 @@ take about **2 minutes** to download the Stanford Cars, while COCO 2017 validati
  .. code-block:: none
 
 
-    '/tmp/tmp9cqkckzz/images/random'
+    '/tmp/tmpl09erv86/images/random'
 
 
 
@@ -325,8 +325,8 @@ objects to other stuff? We can display some examples from our datasets using ``m
 
  .. code-block:: none
 
-    /tmp/tmp9cqkckzz/images/target contains 8144 images
-    /tmp/tmp9cqkckzz/images/random contains 5000 images
+    /tmp/tmpl09erv86/images/target contains 8144 images
+    /tmp/tmpl09erv86/images/random contains 5000 images
 
 
 
@@ -501,13 +501,13 @@ the time on our validation set).
  .. code-block:: none
 
     Epoch 1/3
-    328/328 - 47s - loss: 0.2253 - accuracy: 0.9215 - val_loss: 0.1274 - val_accuracy: 0.9603 - 47s/epoch - 143ms/step
+    328/328 - 47s - loss: 0.2132 - accuracy: 0.9262 - val_loss: 0.1393 - val_accuracy: 0.9566 - 47s/epoch - 142ms/step
     Epoch 2/3
-    328/328 - 44s - loss: 0.0992 - accuracy: 0.9623 - val_loss: 0.1189 - val_accuracy: 0.9641 - 44s/epoch - 133ms/step
+    328/328 - 43s - loss: 0.0981 - accuracy: 0.9622 - val_loss: 0.1133 - val_accuracy: 0.9622 - 43s/epoch - 132ms/step
     Epoch 3/3
-    328/328 - 45s - loss: 0.0663 - accuracy: 0.9753 - val_loss: 0.1578 - val_accuracy: 0.9569 - 45s/epoch - 136ms/step
+    328/328 - 43s - loss: 0.0669 - accuracy: 0.9742 - val_loss: 0.1277 - val_accuracy: 0.9637 - 43s/epoch - 131ms/step
 
-    <keras.callbacks.History object at 0x7f98da090e90>
+    <keras.callbacks.History object at 0x7f6aad2bd110>
 
 
 
@@ -871,7 +871,7 @@ Arduino tutorial for how to do that `on GitHub <https://github.com/guberti/tvm-a
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 4 minutes  37.645 seconds)
+   **Total running time of the script:** ( 4 minutes  24.933 seconds)
 
 
 .. _sphx_glr_download_how_to_work_with_microtvm_micro_train.py:
diff --git a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
index d37cbac961..b105b9b91e 100644
--- a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
@@ -5,16 +5,16 @@
 
 Computation times
 =================
-**05:32.482** total execution time for **how_to_work_with_microtvm** files:
+**05:17.796** total execution time for **how_to_work_with_microtvm** files:
 
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_train.py` (``micro_train.py``)               | 04:37.645 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_train.py` (``micro_train.py``)               | 04:24.933 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)         | 00:43.295 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)         | 00:42.117 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_aot.py` (``micro_aot.py``)                   | 00:08.173 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_aot.py` (``micro_aot.py``)                   | 00:07.451 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)             | 00:03.367 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)             | 00:03.294 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_microtvm_micro_ethosu.py` (``micro_ethosu.py``)             | 00:00.001 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
index 4ddf5fa3c7..50f28f50ab 100644
--- a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
@@ -5,14 +5,14 @@
 
 Computation times
 =================
-**00:43.529** total execution time for **how_to_work_with_relay** files:
+**00:43.333** total execution time for **how_to_work_with_relay** files:
 
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_using_pipeline_executor.py` (``using_pipeline_executor.py``) | 00:32.266 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_using_pipeline_executor.py` (``using_pipeline_executor.py``) | 00:31.687 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)           | 00:09.803 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)           | 00:10.182 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)                             | 00:01.453 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)                             | 00:01.457 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_relay_using_relay_viz.py` (``using_relay_viz.py``)                 | 00:00.007 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt b/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
index f51b1abfad..c0b0820980 100644
--- a/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
@@ -261,7 +261,7 @@ The following example customizes CUDA lowering rule for :code:`exp`.
  .. code-block:: none
 
 
-    <function my_cuda_math_rule at 0x7f987865a3b0>
+    <function my_cuda_math_rule at 0x7f6a2b623dd0>
 
 
 
diff --git a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
index 7290c12631..09af04e794 100644
--- a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
@@ -5,22 +5,22 @@
 
 Computation times
 =================
-**00:06.376** total execution time for **how_to_work_with_schedules** files:
+**00:06.851** total execution time for **how_to_work_with_schedules** files:
 
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)                 | 00:04.104 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)                 | 00:04.579 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)                     | 00:00.966 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)                     | 00:01.010 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)                     | 00:00.575 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)                     | 00:00.551 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)                               | 00:00.546 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)                               | 00:00.530 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_schedules_extern_op.py` (``extern_op.py``)                     | 00:00.098 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_schedule_primitives.py` (``schedule_primitives.py``) | 00:00.046 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_schedule_primitives.py` (``schedule_primitives.py``) | 00:00.042 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)                               | 00:00.027 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)                               | 00:00.026 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_tuple_inputs.py` (``tuple_inputs.py``)               | 00:00.014 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_tuple_inputs.py` (``tuple_inputs.py``)               | 00:00.015 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
index fc8d8d6646..97177fab11 100644
--- a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
@@ -347,7 +347,7 @@ The importing needs to happen before the tensorized GEMV being executed.
                  C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
       buffer_map = {A_1: A, B_1: B, C_1: C}
       preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmp340mp_wk/input0.cc'\nsource_filename = \"/tmp/tmp340mp_wk/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
+      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmpx5xvzfc9/input0.cc'\nsource_filename = \"/tmp/tmpx5xvzfc9/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
       for (i, 0, 1024) {
         for (j.outer: int32, 0, 32) {
           @tir.call_extern("gemv_update", @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
diff --git a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
index 2234fd0cd6..58fc768ec8 100644
--- a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:22.027** total execution time for **topic_vta_tutorials_autotvm** files:
+**00:21.220** total execution time for **topic_vta_tutorials_autotvm** files:
 
 +---------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``) | 00:22.021 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``) | 00:21.213 | 0.0 MB |
 +---------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_alu_vta.py` (``tune_alu_vta.py``)     | 00:00.006 | 0.0 MB |
 +---------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
index bc712df813..2961c89edf 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
@@ -291,7 +291,7 @@ The compilation steps are:
       DeprecationWarning,
     /workspace/vta/tutorials/frontend/deploy_classification.py:213: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
       relay_prog, target=tvm.target.Target(target, host=env.target_host), params=params
-    resnet18_v1 inference graph built in 22.82s!
+    resnet18_v1 inference graph built in 22.74s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
index 793e57aeeb..13b8a26907 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
@@ -335,7 +335,7 @@ The compilation steps are:
       "target_host parameter is going to be deprecated. "
     /workspace/python/tvm/relay/build_module.py:348: DeprecationWarning: Please use input parameter mod (tvm.IRModule) instead of deprecated parameter mod (tvm.relay.function.Function)
       DeprecationWarning,
-    yolov3-tiny inference graph built in 17.48s!
+    yolov3-tiny inference graph built in 16.09s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
index 3eb2d650ec..cf9150c280 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**01:34.267** total execution time for **topic_vta_tutorials_frontend** files:
+**01:31.739** total execution time for **topic_vta_tutorials_frontend** files:
 
 +------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)           | 00:50.368 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)           | 00:48.537 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``) | 00:43.899 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``) | 00:43.203 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
index 8985f1e288..2074ad86ce 100644
--- a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:03.331** total execution time for **topic_vta_tutorials_optimize** files:
+**00:03.009** total execution time for **topic_vta_tutorials_optimize** files:
 
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)         | 00:02.902 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)         | 00:02.598 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``) | 00:00.428 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``) | 00:00.411 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
index 71717b7af9..793fc2bbbc 100644
--- a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:00.757** total execution time for **topic_vta_tutorials** files:
+**00:00.785** total execution time for **topic_vta_tutorials** files:
 
 +---------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``) | 00:00.393 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``) | 00:00.421 | 0.0 MB |
 +---------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_topic_vta_tutorials_vta_get_started.py` (``vta_get_started.py``) | 00:00.364 | 0.0 MB |
 +---------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
index 8d418ec581..5c1f2bb64d 100644
--- a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
+++ b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
@@ -326,7 +326,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 93.914 ms
+    Execution time of this operator: 93.582 ms
 
 
 
@@ -444,7 +444,7 @@ operations.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  19.446 seconds)
+   **Total running time of the script:** ( 1 minutes  18.746 seconds)
 
 
 .. _sphx_glr_download_tutorial_auto_scheduler_matmul_x86.py:
diff --git a/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt b/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
index fb49478d79..f8f715cf30 100644
--- a/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
+++ b/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
@@ -462,16 +462,16 @@ reduce variance, we take 5 measurements and average them.
     waiting for device...
     device available
     Get devices for measurement successfully!
-    No: 1   GFLOPS: 9.90/9.90       result: MeasureResult(costs=(0.027117801400000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.56846022605896, timestamp=1663245559.7403083) [('tile_y', [-1, 1]), ('tile_x', [-1, 256])],None,80
-    No: 2   GFLOPS: 2.39/9.90       result: MeasureResult(costs=(0.11231959799999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.945366621017456, timestamp=1663245561.6962612) [('tile_y', [-1, 4]), ('tile_x', [-1, 8])],None,32
-    No: 3   GFLOPS: 11.81/11.81     result: MeasureResult(costs=(0.0227336622,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5783097743988037, timestamp=1663245562.757294)        [('tile_y', [-1, 64]), ('tile_x', [-1, 32])],None,56
-    No: 4   GFLOPS: 1.85/11.81      result: MeasureResult(costs=(0.1450331794,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.4399099349975586, timestamp=1663245565.77341) [('tile_y', [-1, 1]), ('tile_x', [-1, 4])],None,20
-    No: 5   GFLOPS: 3.68/11.81      result: MeasureResult(costs=(0.07295330219999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.314638614654541, timestamp=1663245567.7472565) [('tile_y', [-1, 256]), ('tile_x', [-1, 16])],None,48
-    No: 6   GFLOPS: 1.85/11.81      result: MeasureResult(costs=(0.1449238076,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.4757537841796875, timestamp=1663245570.2668052)       [('tile_y', [-1, 512]), ('tile_x', [-1, 4])],None,29
-    No: 7   GFLOPS: 0.87/11.81      result: MeasureResult(costs=(0.30979644,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.078179597854614, timestamp=1663245575.387729)   [('tile_y', [-1, 512]), ('tile_x', [-1, 2])],None,19
-    No: 8   GFLOPS: 10.59/11.81     result: MeasureResult(costs=(0.025346019200000004,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.559157133102417, timestamp=1663245575.956249) [('tile_y', [-1, 4]), ('tile_x', [-1, 64])],None,62
-    No: 9   GFLOPS: 1.89/11.81      result: MeasureResult(costs=(0.1419734074,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.370332956314087, timestamp=1663245578.4465785)        [('tile_y', [-1, 2]), ('tile_x', [-1, 2])],None,11
-    No: 10  GFLOPS: 2.75/11.81      result: MeasureResult(costs=(0.0975077014,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6733081340789795, timestamp=1663245580.1720784)       [('tile_y', [-1, 4]), ('tile_x', [-1, 4])],None,22
+    No: 1   GFLOPS: 9.84/9.84       result: MeasureResult(costs=(0.0272695492,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5697076320648193, timestamp=1663276737.7821794)       [('tile_y', [-1, 1]), ('tile_x', [-1, 256])],None,80
+    No: 2   GFLOPS: 2.44/9.84       result: MeasureResult(costs=(0.1101957606,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.9129202365875244, timestamp=1663276739.7083054)       [('tile_y', [-1, 4]), ('tile_x', [-1, 8])],None,32
+    No: 3   GFLOPS: 11.86/11.86     result: MeasureResult(costs=(0.022632376,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5607941150665283, timestamp=1663276740.7634134)        [('tile_y', [-1, 64]), ('tile_x', [-1, 32])],None,56
+    No: 4   GFLOPS: 1.55/11.86      result: MeasureResult(costs=(0.1728821068,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.8814404010772705, timestamp=1663276744.224555)        [('tile_y', [-1, 1]), ('tile_x', [-1, 4])],None,20
+    No: 5   GFLOPS: 3.71/11.86      result: MeasureResult(costs=(0.0723810036,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2968084812164307, timestamp=1663276745.6454818)       [('tile_y', [-1, 256]), ('tile_x', [-1, 16])],None,48
+    No: 6   GFLOPS: 1.84/11.86      result: MeasureResult(costs=(0.14628462999999997,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.5020077228546143, timestamp=1663276748.1910057)        [('tile_y', [-1, 512]), ('tile_x', [-1, 4])],None,29
+    No: 7   GFLOPS: 0.87/11.86      result: MeasureResult(costs=(0.3069285432,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.0363616943359375, timestamp=1663276753.7956307)       [('tile_y', [-1, 512]), ('tile_x', [-1, 2])],None,19
+    No: 8   GFLOPS: 10.59/11.86     result: MeasureResult(costs=(0.025344378,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5466985702514648, timestamp=1663276754.3625996)        [('tile_y', [-1, 4]), ('tile_x', [-1, 64])],None,62
+    No: 9   GFLOPS: 1.86/11.86      result: MeasureResult(costs=(0.14411985700000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.400439739227295, timestamp=1663276756.8816943) [('tile_y', [-1, 2]), ('tile_x', [-1, 2])],None,11
+    No: 10  GFLOPS: 2.67/11.86      result: MeasureResult(costs=(0.10050512380000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7148866653442383, timestamp=1663276758.6540103)        [('tile_y', [-1, 4]), ('tile_x', [-1, 4])],None,22
 
 
 
diff --git a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
index abae0a88bd..c404a47622 100644
--- a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
+++ b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
@@ -327,7 +327,7 @@ standard deviation.
 
  .. code-block:: none
 
-    {'mean': 513.3250088999557, 'median': 513.3628811498056, 'std': 1.4233655232003068}
+    {'mean': 512.3894513099992, 'median': 512.8161163000016, 'std': 2.366645412308052}
 
 
 
@@ -563,30 +563,30 @@ the tuning data to.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  1/25]  Current/Best:   17.54/  17.54 GFLOPS | Progress: (4/20) | 6.53 s
    [Task  1/25]  Current/Best:    6.10/  17.54 GFLOPS | Progress: (8/20) | 9.61 s
    [Task  1/25]  Current/Best:   11.20/  22.19 GFLOPS | Progress: (12/20) | 12.13 s
    [Task  1/25]  Current/Best:   16.48/  22.20 GFLOPS | Progress: (16/20) | 13.85 s
    [Task  1/25]  Current/Best:   11.32/  23.24 GFLOPS | Progress: (20/20) | 15.67 s Done.
-
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  2/25]  Current/Best:   12.19/  12.48 GFLOPS | Progress: (4/20) | 3.98 s
    [Task  2/25]  Current/Best:   12.43/  18.14 GFLOPS | Progress: (8/20) | 5.30 s
    [Task  2/25]  Current/Best:   21.14/  21.14 GFLOPS | Progress: (12/20) | 6.67 s
    [Task  2/25]  Current/Best:   10.67/  21.14 GFLOPS | Progress: (16/20) | 7.97 s
    [Task  2/25]  Current/Best:   18.04/  21.14 GFLOPS | Progress: (20/20) | 9.61 s Done.
-
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  3/25]  Current/Best:    1.63/  10.17 GFLOPS | Progress: (4/20) | 5.92 s
    [Task  3/25]  Current/Best:   15.35/  16.86 GFLOPS | Progress: (8/20) | 7.91 s
    [Task  3/25]  Current/Best:   15.00/  16.86 GFLOPS | Progress: (12/20) | 9.66 s
    [Task  3/25]  Current/Best:    6.79/  23.31 GFLOPS | Progress: (16/20) | 11.68 s
    [Task  3/25]  Current/Best:   11.11/  23.31 GFLOPS | Progress: (20/20) | 16.34 s Done.
-
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  4/25]  Current/Best:    8.91/  18.37 GFLOPS | Progress: (4/20) | 2.43 s
    [Task  4/25]  Current/Best:    6.58/  18.37 GFLOPS | Progress: (8/20) | 7.21 s
    [Task  4/25]  Current/Best:   19.19/  19.19 GFLOPS | Progress: (12/20) | 12.26 s
    [Task  4/25]  Current/Best:   16.35/  19.29 GFLOPS | Progress: (16/20) | 14.66 s
    [Task  4/25]  Current/Best:   12.94/  19.29 GFLOPS | Progress: (20/20) | 16.67 s Done.
-
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  5/25]  Current/Best:    8.88/   9.80 GFLOPS | Progress: (4/20) | 2.70 s
    [Task  5/25]  Current/Best:   11.58/  11.58 GFLOPS | Progress: (8/20) | 4.82 s
    [Task  5/25]  Current/Best:   11.68/  17.75 GFLOPS | Progress: (12/20) | 8.06 s
    [Task  5/25]  Current/Best:   11.54/  21.99 GFLOPS | Progress: (16/20) | 9.49 s
    [Task  5/25]  Current/Best:   12.05/  21.99 GFLOPS | Progress: (20/20) | 11.45 s Done.
-
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  6/25]  Current/Best:   12.02/  19.79 GFLOPS | Progress: (4/20) | 4.17 s
    [Task  6/25]  Current/Best:   18.86/  19.79 GFLOPS | Progress: (8/20) | 5.95 s
    [Task  6/25]  Current/Best:   13.18/  19.79 GFLOPS | Progress: (12/20) | 7.98 s
    [Task  6/25]  Current/Best:   18.91/  19.79 GFLOPS | Progress: (16/20) | 10.25 s
    [Task  6/25]  Current/Best:    3.71/  19.79 GFLOPS | Progress: (20/20) | 12.85 s Done.
-
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  7/25]  Current/Best:    9.71/  12.11 GFLOPS | Progress: (4/20) | 3.74 s
    [Task  7/25]  Current/Best:   19.62/  19.62 GFLOPS | Progress: (8/20) | 5.30 s
    [Task  7/25]  Current/Best:   15.96/  19.62 GFLOPS | Progress: (12/20) | 7.28 s
    [Task  7/25]  Current/Best:   12.11/  20.20 GFLOPS | Progress: (16/20) | 9.39 s
    [Task  7/25]  Current/Best:    6.17/  20.40 GFLOPS | Progress: (20/20) | 11.90 s Done.
-
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  8/25]  Current/Best:    9.68/  13.45 GFLOPS | Progress: (4/20) | 3.00 s
    [Task  8/25]  Current/Best:    9.55/  13.45 GFLOPS | Progress: (8/20) | 8.18 s
    [Task  8/25]  Current/Best:   12.69/  13.45 GFLOPS | Progress: (12/20) | 14.80 s
    [Task  8/25]  Current/Best:   18.40/  18.40 GFLOPS | Progress: (16/20) | 16.94 s
    [Task  8/25]  Current/Best:   19.31/  19.31 GFLOPS | Progress: (20/20) | 24.12 s Done.
-
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  9/25]  Current/Best:   14.31/  14.31 GFLOPS | Progress: (4/20) | 12.00 s
    [Task  9/25]  Current/Best:   21.88/  21.88 GFLOPS | Progress: (8/20) | 13.81 s
    [Task  9/25]  Current/Best:    8.01/  21.88 GFLOPS | Progress: (12/20) | 16.40 s
    [Task  9/25]  Current/Best:   17.87/  21.88 GFLOPS | Progress: (16/20) | 19.28 s
    [Task  9/25]  Current/Best:    9.09/  21.88 GFLOPS | Progress: (20/20) | 27.89 s
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 10/25]  Current/Best:   17.54/  17.54 GFLOPS | Progress: (4/20) | 2.66 s
    [Task 10/25]  Current/Best:   15.63/  17.54 GFLOPS | Progress: (8/20) | 4.36 s
    [Task 10/25]  Current/Best:   11.17/  18.52 GFLOPS | Progress: (12/20) | 5.95 s
    [Task 10/25]  Current/Best:   19.08/  20.07 GFLOPS | Progress: (16/20) | 7.08 s
   [Task 10/25]  Current/Best:    8.52/  20.07 GFLOPS | Progress: (20/20) | 8.65 s Done.
-
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 11/25]  Current/Best:   10.89/  18.18 GFLOPS | Progress: (4/20) | 3.42 s
    [Task 11/25]  Current/Best:   14.83/  18.18 GFLOPS | Progress: (8/20) | 6.30 s
    [Task 11/25]  Current/Best:   15.88/  18.18 GFLOPS | Progress: (12/20) | 8.46 s
    [Task 11/25]  Current/Best:   11.88/  20.62 GFLOPS | Progress: (16/20) | 11.36 s
    [Task 11/25]  Current/Best:   18.56/  20.62 GFLOPS | Progress: (20/20) | 13.50 s Done.
-
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 12/25]  Current/Best:    7.79/  17.83 GFLOPS | Progress: (4/20) | 5.82 s
    [Task 12/25]  Current/Best:    4.95/  17.83 GFLOPS | Progress: (8/20) | 9.77 s
    [Task 12/25]  Current/Best:   18.86/  18.86 GFLOPS | Progress: (12/20) | 11.82 s
    [Task 12/25]  Current/Best:   15.07/  18.86 GFLOPS | Progress: (16/20) | 14.84 s
    [Task 12/25]  Current/Best:   15.08/  18.86 GFLOPS | Progress: (20/20) | 16.82 s Done.
-
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 13/25]  Current/Best:    8.65/  17.25 GFLOPS | Progress: (4/20) | 3.86 s
    [Task 13/25]  Current/Best:   15.56/  20.57 GFLOPS | Progress: (8/20) | 6.45 s
    [Task 13/25]  Current/Best:   18.68/  21.69 GFLOPS | Progress: (12/20) | 9.67 s
    [Task 13/25]  Current/Best:   12.26/  21.69 GFLOPS | Progress: (16/20) | 13.15 s
    [Task 13/25]  Current/Best:   17.52/  21.69 GFLOPS | Progress: (20/20) | 15.56 s Done.
-
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 14/25]  Current/Best:   12.15/  13.38 GFLOPS | Progress: (4/20) | 3.53 s
    [Task 14/25]  Current/Best:    6.09/  13.38 GFLOPS | Progress: (8/20) | 5.75 s
    [Task 14/25]  Current/Best:   19.05/  19.18 GFLOPS | Progress: (12/20) | 8.48 s
    [Task 14/25]  Current/Best:   16.45/  19.18 GFLOPS | Progress: (16/20) | 10.14 s Done.
-
    [Task 14/25]  Current/Best:   17.04/  19.18 GFLOPS | Progress: (20/20) | 11.93 s
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 15/25]  Current/Best:   14.90/  16.48 GFLOPS | Progress: (4/20) | 2.77 s
    [Task 15/25]  Current/Best:   12.73/  17.99 GFLOPS | Progress: (8/20) | 4.16 s
    [Task 15/25]  Current/Best:    9.87/  20.56 GFLOPS | Progress: (12/20) | 6.46 s
    [Task 15/25]  Current/Best:   20.27/  20.56 GFLOPS | Progress: (16/20) | 9.80 s
    [Task 15/25]  Current/Best:    9.52/  20.56 GFLOPS | Progress: (20/20) | 10.83 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 16/25]  Current/Best:   17.41/  17.41 GFLOPS | Progress: (4/20) | 3.12 s
    [Task 16/25]  Current/Best:    3.00/  17.41 GFLOPS | Progress: (8/20) | 4.74 s
    [Task 16/25]  Current/Best:   18.19/  19.30 GFLOPS | Progress: (12/20) | 5.98 s
   [Task 16/25]  Current/Best:   17.95/  19.30 GFLOPS | Progress: (16/20) | 7.38 s
    [Task 16/25]  Current/Best:    9.88/  21.29 GFLOPS | Progress: (20/20) | 9.54 s Done.
-
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 17/25]  Current/Best:   11.96/  16.05 GFLOPS | Progress: (4/20) | 4.91 s
    [Task 17/25]  Current/Best:   13.15/  22.87 GFLOPS | Progress: (8/20) | 7.75 s
    [Task 17/25]  Current/Best:   16.50/  22.87 GFLOPS | Progress: (12/20) | 9.89 s
    [Task 17/25]  Current/Best:   16.44/  22.87 GFLOPS | Progress: (16/20) | 12.13 s
    [Task 17/25]  Current/Best:    9.99/  22.87 GFLOPS | Progress: (20/20) | 14.30 s Done.
-
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 18/25]  Current/Best:   10.50/  16.63 GFLOPS | Progress: (4/20) | 3.89 s
    [Task 18/25]  Current/Best:   10.52/  19.07 GFLOPS | Progress: (8/20) | 7.66 s
    [Task 18/25]  Current/Best:   18.48/  19.07 GFLOPS | Progress: (12/20) | 9.63 s
    [Task 18/25]  Current/Best:   10.38/  19.07 GFLOPS | Progress: (16/20) | 13.58 s
    [Task 18/25]  Current/Best:   20.58/  20.58 GFLOPS | Progress: (20/20) | 15.13 s Done.
-
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 19/25]  Current/Best:    7.27/  19.59 GFLOPS | Progress: (4/20) | 6.16 s
    [Task 19/25]  Current/Best:    2.69/  19.59 GFLOPS | Progress: (8/20) | 9.54 s
    [Task 19/25]  Current/Best:   18.53/  20.77 GFLOPS | Progress: (12/20) | 12.54 s
    [Task 19/25]  Current/Best:   12.76/  20.77 GFLOPS | Progress: (16/20) | 15.56 s
    [Task 19/25]  Current/Best:    2.69/  22.41 GFLOPS | Progress: (20/20) | 18.39 s Done.
-
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 20/25]  Current/Best:    8.33/  14.90 GFLOPS | Progress: (4/20) | 3.40 s Done.
+
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  1/25]  Current/Best:   17.29/  17.29 GFLOPS | Progress: (4/20) | 5.87 s
    [Task  1/25]  Current/Best:    6.11/  17.29 GFLOPS | Progress: (8/20) | 9.44 s
    [Task  1/25]  Current/Best:   11.21/  22.34 GFLOPS | Progress: (12/20) | 11.94 s
    [Task  1/25]  Current/Best:   16.54/  22.34 GFLOPS | Progress: (16/20) | 13.63 s
    [Task  1/25]  Current/Best:   11.26/  23.50 GFLOPS | Progress: (20/20) | 15.40 s Done.
+
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  2/25]  Current/Best:   12.23/  12.25 GFLOPS | Progress: (4/20) | 3.86 s
    [Task  2/25]  Current/Best:   12.50/  17.93 GFLOPS | Progress: (8/20) | 5.16 s
    [Task  2/25]  Current/Best:   20.97/  20.97 GFLOPS | Progress: (12/20) | 6.47 s
    [Task  2/25]  Current/Best:   10.77/  20.97 GFLOPS | Progress: (16/20) | 7.74 s
    [Task  2/25]  Current/Best:   17.54/  20.97 GFLOPS | Progress: (20/20) | 9.33 s Done.
+
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  3/25]  Current/Best:    1.63/  10.17 GFLOPS | Progress: (4/20) | 5.88 s
    [Task  3/25]  Current/Best:   15.38/  16.87 GFLOPS | Progress: (8/20) | 7.82 s
    [Task  3/25]  Current/Best:   15.07/  16.87 GFLOPS | Progress: (12/20) | 9.54 s
    [Task  3/25]  Current/Best:    6.78/  23.37 GFLOPS | Progress: (16/20) | 11.54 s
    [Task  3/25]  Current/Best:   11.09/  23.37 GFLOPS | Progress: (20/20) | 16.20 s Done.
+
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  4/25]  Current/Best:    9.02/  18.47 GFLOPS | Progress: (4/20) | 2.42 s
    [Task  4/25]  Current/Best:    6.50/  18.47 GFLOPS | Progress: (8/20) | 7.16 s
    [Task  4/25]  Current/Best:   20.57/  20.57 GFLOPS | Progress: (12/20) | 12.13 s
    [Task  4/25]  Current/Best:   15.86/  20.57 GFLOPS | Progress: (16/20) | 14.49 s
    [Task  4/25]  Current/Best:   12.95/  20.57 GFLOPS | Progress: (20/20) | 16.46 s Done.
+
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  5/25]  Current/Best:    9.05/   9.70 GFLOPS | Progress: (4/20) | 2.63 s
    [Task  5/25]  Current/Best:   11.57/  11.57 GFLOPS | Progress: (8/20) | 4.76 s
    [Task  5/25]  Current/Best:   11.66/  18.06 GFLOPS | Progress: (12/20) | 7.82 s
    [Task  5/25]  Current/Best:   11.52/  20.53 GFLOPS | Progress: (16/20) | 9.25 s
    [Task  5/25]  Current/Best:   12.08/  21.03 GFLOPS | Progress: (20/20) | 11.19 s Done.
+
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  6/25]  Current/Best:   12.13/  19.89 GFLOPS | Progress: (4/20) | 4.15 s
    [Task  6/25]  Current/Best:   18.97/  19.89 GFLOPS | Progress: (8/20) | 5.93 s
    [Task  6/25]  Current/Best:   13.18/  19.89 GFLOPS | Progress: (12/20) | 7.96 s
    [Task  6/25]  Current/Best:   19.30/  19.89 GFLOPS | Progress: (16/20) | 10.25 s
    [Task  6/25]  Current/Best:    3.72/  19.89 GFLOPS | Progress: (20/20) | 12.87 s Done.
+
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  7/25]  Current/Best:    9.79/  12.12 GFLOPS | Progress: (4/20) | 3.70 s
    [Task  7/25]  Current/Best:   19.48/  20.06 GFLOPS | Progress: (8/20) | 5.25 s
    [Task  7/25]  Current/Best:   14.74/  20.06 GFLOPS | Progress: (12/20) | 7.20 s
    [Task  7/25]  Current/Best:   12.15/  20.12 GFLOPS | Progress: (16/20) | 9.29 s
    [Task  7/25]  Current/Best:    6.00/  20.57 GFLOPS | Progress: (20/20) | 11.81 s Done.
+
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  8/25]  Current/Best:    9.70/  13.70 GFLOPS | Progress: (4/20) | 2.98 s
    [Task  8/25]  Current/Best:    9.30/  13.70 GFLOPS | Progress: (8/20) | 8.08 s
    [Task  8/25]  Current/Best:   12.73/  13.70 GFLOPS | Progress: (12/20) | 14.51 s
    [Task  8/25]  Current/Best:   18.86/  18.86 GFLOPS | Progress: (16/20) | 16.66 s
    [Task  8/25]  Current/Best:   18.52/  18.86 GFLOPS | Progress: (20/20) | 23.76 s Done.
+
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  9/25]  Current/Best:   14.35/  14.35 GFLOPS | Progress: (4/20) | 11.98 s
    [Task  9/25]  Current/Best:   21.67/  21.67 GFLOPS | Progress: (8/20) | 13.73 s
    [Task  9/25]  Current/Best:    8.03/  21.67 GFLOPS | Progress: (12/20) | 16.28 s
    [Task  9/25]  Current/Best:   17.87/  21.67 GFLOPS | Progress: (16/20) | 19.16 s
    [Task  9/25]  Current/Best:    9.09/  21.67 GFLOPS | Progress: (20/20) | 27.75 s
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 10/25]  Current/Best:   18.18/  18.18 GFLOPS | Progress: (4/20) | 2.58 s
    [Task 10/25]  Current/Best:   15.58/  18.18 GFLOPS | Progress: (8/20) | 4.21 s
    [Task 10/25]  Current/Best:   11.38/  18.18 GFLOPS | Progress: (12/20) | 5.78 s
    [Task 10/25]  Current/Best:   19.06/  20.15 GFLOPS | Progress: (16/20) | 6.90 s
   [Task 10/25]  Current/Best:    8.39/  20.15 GFLOPS | Progress: (20/20) | 8.45 s Done.
+
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 11/25]  Current/Best:   10.87/  18.10 GFLOPS | Progress: (4/20) | 3.46 s
    [Task 11/25]  Current/Best:   14.90/  18.10 GFLOPS | Progress: (8/20) | 6.29 s
    [Task 11/25]  Current/Best:   15.93/  18.10 GFLOPS | Progress: (12/20) | 8.43 s
    [Task 11/25]  Current/Best:   11.90/  20.66 GFLOPS | Progress: (16/20) | 11.31 s
    [Task 11/25]  Current/Best:   18.64/  20.66 GFLOPS | Progress: (20/20) | 13.45 s Done.
+
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 12/25]  Current/Best:    7.80/  17.74 GFLOPS | Progress: (4/20) | 5.72 s
    [Task 12/25]  Current/Best:    4.88/  17.74 GFLOPS | Progress: (8/20) | 9.67 s
    [Task 12/25]  Current/Best:   18.86/  18.86 GFLOPS | Progress: (12/20) | 11.69 s
    [Task 12/25]  Current/Best:   15.28/  18.86 GFLOPS | Progress: (16/20) | 14.66 s
    [Task 12/25]  Current/Best:   14.87/  18.86 GFLOPS | Progress: (20/20) | 16.66 s Done.
+
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 13/25]  Current/Best:    8.56/  17.29 GFLOPS | Progress: (4/20) | 3.82 s
    [Task 13/25]  Current/Best:   15.28/  20.85 GFLOPS | Progress: (8/20) | 6.41 s
    [Task 13/25]  Current/Best:   18.28/  21.18 GFLOPS | Progress: (12/20) | 9.52 s
    [Task 13/25]  Current/Best:   12.23/  21.18 GFLOPS | Progress: (16/20) | 12.93 s
    [Task 13/25]  Current/Best:   17.40/  21.18 GFLOPS | Progress: (20/20) | 15.35 s Done.
+
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 14/25]  Current/Best:   12.16/  13.34 GFLOPS | Progress: (4/20) | 3.50 s
    [Task 14/25]  Current/Best:    6.10/  13.34 GFLOPS | Progress: (8/20) | 5.70 s
    [Task 14/25]  Current/Best:   18.72/  18.88 GFLOPS | Progress: (12/20) | 8.39 s
    [Task 14/25]  Current/Best:   16.09/  18.88 GFLOPS | Progress: (16/20) | 10.03 s Done.
+
    [Task 14/25]  Current/Best:   17.13/  18.88 GFLOPS | Progress: (20/20) | 11.80 s
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 15/25]  Current/Best:   15.22/  17.31 GFLOPS | Progress: (4/20) | 2.77 s
    [Task 15/25]  Current/Best:   12.46/  17.78 GFLOPS | Progress: (8/20) | 4.13 s
    [Task 15/25]  Current/Best:    9.71/  21.16 GFLOPS | Progress: (12/20) | 6.41 s
    [Task 15/25]  Current/Best:   20.35/  21.16 GFLOPS | Progress: (16/20) | 9.57 s
    [Task 15/25]  Current/Best:    9.53/  21.16 GFLOPS | Progress: (20/20) | 10.59 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 16/25]  Current/Best:   18.12/  18.12 GFLOPS | Progress: (4/20) | 3.01 s
    [Task 16/25]  Current/Best:    3.03/  18.12 GFLOPS | Progress: (8/20) | 4.63 s
    [Task 16/25]  Current/Best:   17.07/  19.58 GFLOPS | Progress: (12/20) | 5.89 s
   [Task 16/25]  Current/Best:   17.21/  19.58 GFLOPS | Progress: (16/20) | 7.25 s
    [Task 16/25]  Current/Best:    9.80/  21.24 GFLOPS | Progress: (20/20) | 9.42 s Done.
+
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 17/25]  Current/Best:   12.70/  16.16 GFLOPS | Progress: (4/20) | 4.84 s
    [Task 17/25]  Current/Best:   13.10/  22.99 GFLOPS | Progress: (8/20) | 7.64 s
    [Task 17/25]  Current/Best:   16.48/  22.99 GFLOPS | Progress: (12/20) | 9.75 s
    [Task 17/25]  Current/Best:   16.44/  22.99 GFLOPS | Progress: (16/20) | 12.00 s
    [Task 17/25]  Current/Best:   10.01/  22.99 GFLOPS | Progress: (20/20) | 14.17 s Done.
+
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 18/25]  Current/Best:   10.95/  16.81 GFLOPS | Progress: (4/20) | 3.85 s
    [Task 18/25]  Current/Best:   10.52/  18.29 GFLOPS | Progress: (8/20) | 7.54 s
    [Task 18/25]  Current/Best:   18.20/  18.29 GFLOPS | Progress: (12/20) | 9.51 s
    [Task 18/25]  Current/Best:   10.21/  18.29 GFLOPS | Progress: (16/20) | 13.41 s
    [Task 18/25]  Current/Best:   20.54/  20.54 GFLOPS | Progress: (20/20) | 14.96 s Done.
+
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 19/25]  Current/Best:    7.25/  19.87 GFLOPS | Progress: (4/20) | 6.15 s
    [Task 19/25]  Current/Best:    2.68/  19.87 GFLOPS | Progress: (8/20) | 9.54 s
    [Task 19/25]  Current/Best:   19.12/  20.97 GFLOPS | Progress: (12/20) | 12.52 s
    [Task 19/25]  Current/Best:   12.76/  20.97 GFLOPS | Progress: (16/20) | 15.50 s
    [Task 19/25]  Current/Best:    2.69/  22.48 GFLOPS | Progress: (20/20) | 18.34 s Done.
+
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 20/25]  Current/Best:    9.58/  15.24 GFLOPS | Progress: (4/20) | 3.34 s Done.
      Done.
-
    [Task 20/25]  Current/Best:    9.76/  14.90 GFLOPS | Progress: (8/20) | 6.98 s
    [Task 20/25]  Current/Best:    2.30/  14.90 GFLOPS | Progress: (12/20) | 10.89 s
    [Task 20/25]  Current/Best:   11.42/  14.90 GFLOPS | Progress: (16/20) | 14.71 s
    [Task 20/25]  Current/Best:   11.19/  21.95 GFLOPS | Progress: (20/20) | 16.84 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 21/25]  Current/Best:    6.37/  17.71 GFLOPS | Progress: (4/20) | 3.33 s
    [Task 21/25]  Current/Best:   14.54/  17.71 GFLOPS | Progress: (8/20) | 4.97 s
    [Task 21/25]  Current/Best:    1.61/  17.71 GFLOPS | Progress: (12/20) | 7.15 s
    [Task 21/25]  Current/Best:   15.88/  17.71 GFLOPS | Progress: (16/20) | 10.73 s
    [Task 21/25]  Current/Best:    4.46/  17.71 GFLOPS | Progress: (20/20) | 18.07 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
   [Task 22/25]  Current/Best:    2.70/  16.77 GFLOPS | Progress: (4/20) | 2.73 s
    [Task 22/25]  Current/Best:    8.70/  20.10 GFLOPS | Progress: (8/20) | 4.84 s
    [Task 22/25]  Current/Best:   19.86/  20.10 GFLOPS | Progress: (12/20) | 7.22 s
    [Task 22/25]  Current/Best:   15.22/  20.10 GFLOPS | Progress: (16/20) | 9.37 s
    [Task 22/25]  Current/Best:   12.27/  20.10 GFLOPS | Progress: (20/20) | 11.14 s Done.
-
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 23/25]  Current/Best:   16.60/  19.90 GFLOPS | Progress: (4/20) | 3.31 s
    [Task 23/25]  Current/Best:   13.11/  19.90 GFLOPS | Progress: (8/20) | 6.84 s
    [Task 23/25]  Current/Best:   20.60/  21.61 GFLOPS | Progress: (12/20) | 8.69 s
    [Task 23/25]  Current/Best:    6.54/  21.61 GFLOPS | Progress: (16/20) | 15.65 s
    [Task 23/25]  Current/Best:    7.83/  21.61 GFLOPS | Progress: (20/20) | 19.89 s Done.
-
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 24/25]  Current/Best:    8.27/   8.27 GFLOPS | Progress: (4/20) | 11.81 s
    [Task 24/25]  Current/Best:    2.02/   8.27 GFLOPS | Progress: (8/20) | 22.85 s
    [Task 24/25]  Current/Best:    3.61/   8.27 GFLOPS | Progress: (12/20) | 34.43 s Done.
-
    [Task 24/25]  Current/Best:    5.62/   8.48 GFLOPS | Progress: (16/20) | 40.07 s
    [Task 24/25]  Current/Best:    2.99/   8.48 GFLOPS | Progress: (20/20) | 46.27 s Done.
-
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 25/25]  Current/Best:    1.55/   2.73 GFLOPS | Progress: (4/20) | 11.64 s
    [Task 25/25]  Current/Best:    5.90/   7.80 GFLOPS | Progress: (8/20) | 22.96 s
    [Task 25/25]  Current/Best:    5.88/   7.80 GFLOPS | Progress: (12/20) | 34.47 s
    [Task 25/25]  Current/Best:    5.80/   7.92 GFLOPS | Progress: (16/20) | 36.35 s
    [Task 25/25]  Current/Best:    2.81/   8.61 GFLOPS | Progress: (20/20) | 47.03 s
+
    [Task 20/25]  Current/Best:    9.96/  15.24 GFLOPS | Progress: (8/20) | 6.93 s
    [Task 20/25]  Current/Best:    2.32/  15.24 GFLOPS | Progress: (12/20) | 10.83 s
    [Task 20/25]  Current/Best:   11.19/  15.24 GFLOPS | Progress: (16/20) | 14.71 s
    [Task 20/25]  Current/Best:   11.41/  21.59 GFLOPS | Progress: (20/20) | 16.85 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 21/25]  Current/Best:    6.34/  17.73 GFLOPS | Progress: (4/20) | 3.29 s
    [Task 21/25]  Current/Best:   14.53/  17.73 GFLOPS | Progress: (8/20) | 4.91 s
    [Task 21/25]  Current/Best:    1.61/  17.73 GFLOPS | Progress: (12/20) | 7.05 s
    [Task 21/25]  Current/Best:   15.93/  17.73 GFLOPS | Progress: (16/20) | 10.58 s
    [Task 21/25]  Current/Best:    4.46/  17.73 GFLOPS | Progress: (20/20) | 17.89 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 22/25]  Current/Best:    2.70/  16.20 GFLOPS | Progress: (4/20) | 2.71 s
    [Task 22/25]  Current/Best:    8.74/  20.49 GFLOPS | Progress: (8/20) | 4.70 s
    [Task 22/25]  Current/Best:   19.72/  20.49 GFLOPS | Progress: (12/20) | 7.08 s
    [Task 22/25]  Current/Best:   15.29/  20.49 GFLOPS | Progress: (16/20) | 9.21 s
    [Task 22/25]  Current/Best:   12.26/  20.49 GFLOPS | Progress: (20/20) | 10.94 s Done.
+
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 23/25]  Current/Best:   16.64/  19.99 GFLOPS | Progress: (4/20) | 3.30 s
    [Task 23/25]  Current/Best:   13.61/  19.99 GFLOPS | Progress: (8/20) | 6.78 s
    [Task 23/25]  Current/Best:   20.63/  21.74 GFLOPS | Progress: (12/20) | 8.64 s
    [Task 23/25]  Current/Best:    6.56/  21.74 GFLOPS | Progress: (16/20) | 15.76 s
    [Task 23/25]  Current/Best:    7.65/  21.74 GFLOPS | Progress: (20/20) | 20.01 s Done.
+
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 24/25]  Current/Best:    8.47/   8.47 GFLOPS | Progress: (4/20) | 11.80 s
    [Task 24/25]  Current/Best:    2.01/   8.47 GFLOPS | Progress: (8/20) | 22.88 s
    [Task 24/25]  Current/Best:    3.88/   8.47 GFLOPS | Progress: (12/20) | 34.43 s Done.
+
    [Task 24/25]  Current/Best:    6.20/   8.89 GFLOPS | Progress: (16/20) | 40.04 s
    [Task 24/25]  Current/Best:    2.96/   8.89 GFLOPS | Progress: (20/20) | 46.06 s Done.
+
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 25/25]  Current/Best:    1.55/   2.77 GFLOPS | Progress: (4/20) | 11.60 s
    [Task 25/25]  Current/Best:    6.02/   8.29 GFLOPS | Progress: (8/20) | 22.89 s
    [Task 25/25]  Current/Best:    5.95/   8.29 GFLOPS | Progress: (12/20) | 34.38 s
    [Task 25/25]  Current/Best:    5.88/   8.45 GFLOPS | Progress: (16/20) | 36.18 s
    [Task 25/25]  Current/Best:    2.82/   9.18 GFLOPS | Progress: (20/20) | 46.85 s
 
 
 
@@ -748,8 +748,8 @@ improvement in comparing the optimized model to the unoptimized model.
 
  .. code-block:: none
 
-    optimized: {'mean': 411.55453831997875, 'median': 411.3484295499802, 'std': 0.87108492230167}
-    unoptimized: {'mean': 513.3250088999557, 'median': 513.3628811498056, 'std': 1.4233655232003068}
+    optimized: {'mean': 414.2226953699969, 'median': 414.42295164999905, 'std': 1.1449126569482697}
+    unoptimized: {'mean': 512.3894513099992, 'median': 512.8161163000016, 'std': 2.366645412308052}
 
 
 
@@ -772,7 +772,7 @@ profiling/benchmarking.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 10 minutes  34.577 seconds)
+   **Total running time of the script:** ( 10 minutes  24.029 seconds)
 
 
 .. _sphx_glr_download_tutorial_autotvm_relay_x86.py:
diff --git a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
index cad5e8e698..8bd9dc32c0 100644
--- a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
@@ -282,7 +282,7 @@ device and returns the measured cost. Network overhead is excluded.
 
  .. code-block:: none
 
-    1.441e-07 secs/op
+    1.263e-07 secs/op
 
 
 
diff --git a/docs/_sources/tutorial/intro_topi.rst.txt b/docs/_sources/tutorial/intro_topi.rst.txt
index ba19b264d1..9db517eb26 100644
--- a/docs/_sources/tutorial/intro_topi.rst.txt
+++ b/docs/_sources/tutorial/intro_topi.rst.txt
@@ -263,7 +263,7 @@ As you can see, scheduled stages of computation have been accumulated and we can
 
  .. code-block:: none
 
-    [stage(a, placeholder(a, 0x1ff323b0)), stage(b, placeholder(b, 0x22b91370)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(mi [...]
+    [stage(a, placeholder(a, 0xcc33980)), stage(b, placeholder(b, 0x22681240)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min [...]
 
 
 
diff --git a/docs/_sources/tutorial/sg_execution_times.rst.txt b/docs/_sources/tutorial/sg_execution_times.rst.txt
index f229ada444..a7cd01358a 100644
--- a/docs/_sources/tutorial/sg_execution_times.rst.txt
+++ b/docs/_sources/tutorial/sg_execution_times.rst.txt
@@ -5,32 +5,32 @@
 
 Computation times
 =================
-**13:54.846** total execution time for **tutorial** files:
+**13:41.503** total execution time for **tutorial** files:
 
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)                 | 10:34.577 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)                 | 10:24.029 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``) | 01:19.446 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``) | 01:18.746 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)     | 01:02.720 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)     | 01:01.192 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)                 | 00:32.459 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)                 | 00:31.110 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)               | 00:23.992 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)               | 00:24.383 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)       | 00:00.753 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)       | 00:01.187 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)                               | 00:00.741 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)                               | 00:00.699 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``) | 00:00.149 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``) | 00:00.148 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_introduction.py` (``introduction.py``)                           | 00:00.005 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_uma.py` (``uma.py``)                                             | 00:00.001 | 0.0 MB |
-+------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tvmc_python.py` (``tvmc_python.py``)                             | 00:00.001 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_uma.py` (``uma.py``)                                             | 00:00.002 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_tvmc_command_line_driver.py` (``tvmc_command_line_driver.py``)   | 00:00.001 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
+| :ref:`sphx_glr_tutorial_tvmc_python.py` (``tvmc_python.py``)                             | 00:00.001 | 0.0 MB |
++------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_install.py` (``install.py``)                                     | 00:00.001 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
index 39b4fe9d32..5619fb2061 100644
--- a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
+++ b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
@@ -301,7 +301,7 @@ helper function to run a profile of the TVM generated code.
 
  .. code-block:: none
 
-    Numpy running time: 0.000008
+    Numpy running time: 0.000007
     naive: 0.000007
 
 
@@ -403,7 +403,7 @@ compile and run this new schedule with the parallel operation applied:
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    parallel: 0.000007
+    parallel: 0.000008
 
 
 
@@ -512,10 +512,10 @@ We can now compare the different schedules
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                   numpy    7.968120007717516e-06                    1.0
-                   naive    6.6864999999999994e-06    0.8391565379943821
-                parallel              6.9711e-06      0.8748738715340817
-                  vector             2.45637e-05      3.0827472447966207
+                   numpy    7.033520000732097e-06                    1.0
+                   naive              6.6893e-06      0.9510600665532666
+                parallel              8.0827e-06       1.149168552752917
+                  vector    2.5464300000000003e-05    3.6204205003112957
 
 
 
@@ -936,7 +936,7 @@ matrix multiplication.
 
  .. code-block:: none
 
-    Numpy running time: 0.018261
+    Numpy running time: 0.018294
 
 
 
@@ -996,7 +996,7 @@ optimizations.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    none: 3.547660
+    none: 3.442716
 
 
 
@@ -1101,7 +1101,7 @@ schedule.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    blocking: 0.299680
+    blocking: 0.299036
 
 
 
@@ -1199,7 +1199,7 @@ already cache friendly from our previous optimizations.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    vectorization: 0.340141
+    vectorization: 0.335199
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1275,7 +1275,7 @@ more cache friendly.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    loop permutation: 0.118493
+    loop permutation: 0.115906
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1376,7 +1376,7 @@ optimized schedule.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    array packing: 0.109422
+    array packing: 0.108423
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1471,7 +1471,7 @@ to `C` when all the block results are ready.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    block caching: 0.111004
+    block caching: 0.110404
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1559,7 +1559,7 @@ of thread-level parallelization.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    parallelization: 0.145424
+    parallelization: 0.145882
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1640,13 +1640,13 @@ working, we can compare the results.
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                    none      3.5476597945000004                     1.0
-                blocking            0.2996795641     0.08447246395062981
-           vectorization     0.34014130740000004     0.09587765656879702
-        loop permutation            0.1184933813     0.03340043526262084
-           array packing            0.1094217999    0.030843374573187247
-           block caching     0.11100439209999999     0.03128946926424345
-         parallelization     0.14542441309999998    0.040991645626633655
+                    none      3.4427163323000003                     1.0
+                blocking     0.29903623960000003     0.08686055159247487
+           vectorization     0.33519857109999995     0.09736456296300817
+        loop permutation     0.11590583890000002     0.03366697331771333
+           array packing     0.10842323429999998     0.03149351379396539
+           block caching     0.11040388409999999     0.03206882979703462
+         parallelization     0.14588205399999998     0.04237411390282612
 
 
 
@@ -1688,7 +1688,7 @@ the computation for specific platforms.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  2.720 seconds)
+   **Total running time of the script:** ( 1 minutes  1.192 seconds)
 
 
 .. _sphx_glr_download_tutorial_tensor_expr_get_started.py:
diff --git a/docs/commit_hash b/docs/commit_hash
index c6a8d3eb3d..34307ffa6c 100644
--- a/docs/commit_hash
+++ b/docs/commit_hash
@@ -1 +1 @@
-397cf8781eba7a2bcc35e832130801c1d1419c43
+1f8b5dec29e6e34b4cf5f092acf5b1d197a59d42
diff --git a/docs/genindex.html b/docs/genindex.html
index 58ad5845f8..06e9f8f42e 100644
--- a/docs/genindex.html
+++ b/docs/genindex.html
@@ -3020,6 +3020,8 @@
         <li><a href="reference/api/python/topi.html#tvm.topi.nn.pad">(in module tvm.topi.nn)</a>
 </li>
       </ul></li>
+      <li><a href="reference/api/python/tir.html#tvm.tir.Schedule.pad_einsum">pad_einsum() (tvm.tir.Schedule method)</a>
+</li>
       <li><a href="reference/api/python/topi.html#tvm.topi.nn.Workload.padb">padb (tvm.topi.nn.Workload property)</a>
 </li>
       <li><a href="reference/api/python/topi.html#tvm.topi.nn.Workload.padl">padl (tvm.topi.nn.Workload property)</a>
diff --git a/docs/how_to/compile_models/from_darknet.html b/docs/how_to/compile_models/from_darknet.html
index d9c82f551c..75accde847 100644
--- a/docs/how_to/compile_models/from_darknet.html
+++ b/docs/how_to/compile_models/from_darknet.html
@@ -574,7 +574,7 @@ class:[&#39;truck 0.9266&#39;] left:471 top:83 right:689 bottom:169
 class:[&#39;bicycle 0.9984&#39;] left:111 top:113 right:577 bottom:447
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  3.838 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  2.475 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-darknet-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7716f96385bd5abb6e822041e285be54/from_darknet.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_darknet.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/from_keras.html b/docs/how_to/compile_models/from_keras.html
index 7cba98c7da..a16f1e557d 100644
--- a/docs/how_to/compile_models/from_keras.html
+++ b/docs/how_to/compile_models/from_keras.html
@@ -493,7 +493,7 @@ pip install -U tensorflow --user
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Relay top-1 id: 285, class name: Egyptian cat
 
 1/1 [==============================] - ETA: 0s
-1/1 [==============================] - 1s 1s/step
+1/1 [==============================] - 1s 966ms/step
 Keras top-1 id: 285, class name: Egyptian cat
 </pre></div>
 </div>
diff --git a/docs/how_to/compile_models/from_mxnet.html b/docs/how_to/compile_models/from_mxnet.html
index ffc1a4e9a9..ddffb7424d 100644
--- a/docs/how_to/compile_models/from_mxnet.html
+++ b/docs/how_to/compile_models/from_mxnet.html
@@ -427,7 +427,7 @@ to download the full example code</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;x&quot;</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#tuple" title="builtins.tuple" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">x</span><span class="o">.</span><span class="n">shape</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_from_mxnet_001.png" srcset="../../_images/sphx_glr_from_mxnet_001.png" alt="from mxnet" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip38f08d41-9504-4822-aa98-a937255ba425 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+<img src="../../_images/sphx_glr_from_mxnet_001.png" srcset="../../_images/sphx_glr_from_mxnet_001.png" alt="from mxnet" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip077fb9ce-969e-4a67-ac38-f671c0d05edb from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
 x (1, 3, 224, 224)
 </pre></div>
 </div>
diff --git a/docs/how_to/compile_models/from_oneflow.html b/docs/how_to/compile_models/from_oneflow.html
index fbcc36c574..1a2c182d65 100644
--- a/docs/how_to/compile_models/from_oneflow.html
+++ b/docs/how_to/compile_models/from_oneflow.html
@@ -435,13 +435,12 @@ Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdo
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip&quot; to /workspace/.oneflow/flowvision_cache/resnet18.zip
 
   0%|          | 0.00/41.5M [00:00&lt;?, ?B/s]
- 19%|#9        | 7.99M/41.5M [00:00&lt;00:00, 45.2MB/s]
- 35%|###4      | 14.3M/41.5M [00:00&lt;00:00, 42.9MB/s]
- 54%|#####3    | 22.3M/41.5M [00:00&lt;00:00, 50.3MB/s]
- 67%|######7   | 27.9M/41.5M [00:00&lt;00:00, 52.5MB/s]
- 80%|#######9  | 33.0M/41.5M [00:00&lt;00:00, 51.0MB/s]
- 97%|#########6| 40.1M/41.5M [00:00&lt;00:00, 57.2MB/s]
-100%|##########| 41.5M/41.5M [00:00&lt;00:00, 54.0MB/s]
+ 19%|#9        | 7.99M/41.5M [00:00&lt;00:00, 81.4MB/s]
+ 38%|###7      | 15.8M/41.5M [00:00&lt;00:00, 78.9MB/s]
+ 56%|#####6    | 23.3M/41.5M [00:00&lt;00:00, 54.3MB/s]
+ 72%|#######1  | 29.8M/41.5M [00:00&lt;00:00, 58.5MB/s]
+ 86%|########6 | 35.9M/41.5M [00:00&lt;00:00, 57.7MB/s]
+100%|##########| 41.5M/41.5M [00:00&lt;00:00, 60.3MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_pytorch.html b/docs/how_to/compile_models/from_pytorch.html
index fdef4a49d6..9d0c043d9f 100644
--- a/docs/how_to/compile_models/from_pytorch.html
+++ b/docs/how_to/compile_models/from_pytorch.html
@@ -414,10 +414,9 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/resnet18-f37072fd.pth&quot; to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
 
   0%|          | 0.00/44.7M [00:00&lt;?, ?B/s]
- 13%|#2        | 5.73M/44.7M [00:00&lt;00:00, 60.0MB/s]
- 26%|##5       | 11.5M/44.7M [00:00&lt;00:00, 58.5MB/s]
- 76%|#######5  | 33.9M/44.7M [00:00&lt;00:00, 138MB/s]
-100%|##########| 44.7M/44.7M [00:00&lt;00:00, 135MB/s]
+ 45%|####4     | 19.9M/44.7M [00:00&lt;00:00, 209MB/s]
+ 89%|########9 | 39.8M/44.7M [00:00&lt;00:00, 158MB/s]
+100%|##########| 44.7M/44.7M [00:00&lt;00:00, 169MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_tensorflow.html b/docs/how_to/compile_models/from_tensorflow.html
index d863494feb..a8d5c47472 100644
--- a/docs/how_to/compile_models/from_tensorflow.html
+++ b/docs/how_to/compile_models/from_tensorflow.html
@@ -636,7 +636,7 @@ banana (score = 0.00022)
 desk (score = 0.00019)
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  8.238 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  5.535 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-tensorflow-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7f1d3d1b878694c201c614c807cdebc8/from_tensorflow.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_tensorflow.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/sg_execution_times.html b/docs/how_to/compile_models/sg_execution_times.html
index a6919c510a..fc210494f8 100644
--- a/docs/how_to/compile_models/sg_execution_times.html
+++ b/docs/how_to/compile_models/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-compile-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>05:16.532</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
+<p><strong>05:05.167</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 81%" />
@@ -336,43 +336,43 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_tensorflow.html#sphx-glr-how-to-compile-models-from-tensorflow-py"><span class="std std-ref">Compile Tensorflow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tensorflow.py</span></code>)</p></td>
-<td><p>01:08.238</p></td>
+<td><p>01:05.535</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_darknet.html#sphx-glr-how-to-compile-models-from-darknet-py"><span class="std std-ref">Compile YOLO-V2 and YOLO-V3 in DarkNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_darknet.py</span></code>)</p></td>
-<td><p>01:03.838</p></td>
+<td><p>01:02.475</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_paddle.html#sphx-glr-how-to-compile-models-from-paddle-py"><span class="std std-ref">Compile PaddlePaddle Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_paddle.py</span></code>)</p></td>
-<td><p>00:40.565</p></td>
+<td><p>00:38.886</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_oneflow.html#sphx-glr-how-to-compile-models-from-oneflow-py"><span class="std std-ref">Compile OneFlow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_oneflow.py</span></code>)</p></td>
-<td><p>00:30.384</p></td>
+<td><p>00:27.948</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_mxnet.html#sphx-glr-how-to-compile-models-from-mxnet-py"><span class="std std-ref">Compile MXNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_mxnet.py</span></code>)</p></td>
-<td><p>00:27.382</p></td>
+<td><p>00:26.324</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_tflite.html#sphx-glr-how-to-compile-models-from-tflite-py"><span class="std std-ref">Compile TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tflite.py</span></code>)</p></td>
-<td><p>00:25.232</p></td>
+<td><p>00:25.453</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_coreml.html#sphx-glr-how-to-compile-models-from-coreml-py"><span class="std std-ref">Compile CoreML Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_coreml.py</span></code>)</p></td>
-<td><p>00:21.564</p></td>
+<td><p>00:21.732</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_pytorch.html#sphx-glr-how-to-compile-models-from-pytorch-py"><span class="std std-ref">Compile PyTorch Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_pytorch.py</span></code>)</p></td>
-<td><p>00:19.815</p></td>
+<td><p>00:19.222</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_keras.html#sphx-glr-how-to-compile-models-from-keras-py"><span class="std std-ref">Compile Keras Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_keras.py</span></code>)</p></td>
-<td><p>00:17.074</p></td>
+<td><p>00:15.286</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_onnx.html#sphx-glr-how-to-compile-models-from-onnx-py"><span class="std std-ref">Compile ONNX Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_onnx.py</span></code>)</p></td>
-<td><p>00:02.439</p></td>
+<td><p>00:02.306</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/deploy_models/deploy_model_on_android.html b/docs/how_to/deploy_models/deploy_model_on_android.html
index 51beba0b14..976fd14d60 100644
--- a/docs/how_to/deploy_models/deploy_model_on_android.html
+++ b/docs/how_to/deploy_models/deploy_model_on_android.html
@@ -653,7 +653,7 @@ to the remote android device.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  15.7978      15.7722      15.9916      15.6921       0.0990
+  15.7261      15.6897      16.1582      15.5028       0.1797
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
index 498f0e1ae1..6bde5a08ab 100644
--- a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
+++ b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
@@ -436,13 +436,15 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth&quot; to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
 
   0%|          | 0.00/170M [00:00&lt;?, ?B/s]
- 12%|#1        | 19.6M/170M [00:00&lt;00:00, 205MB/s]
- 29%|##8       | 48.8M/170M [00:00&lt;00:00, 264MB/s]
- 47%|####7     | 80.3M/170M [00:00&lt;00:00, 295MB/s]
- 64%|######3   | 108M/170M [00:00&lt;00:00, 293MB/s]
- 80%|########  | 136M/170M [00:00&lt;00:00, 284MB/s]
- 97%|#########7| 165M/170M [00:00&lt;00:00, 289MB/s]
-100%|##########| 170M/170M [00:00&lt;00:00, 282MB/s]
+  5%|4         | 8.27M/170M [00:00&lt;00:01, 86.7MB/s]
+ 19%|#8        | 32.0M/170M [00:00&lt;00:00, 182MB/s]
+ 33%|###2      | 55.5M/170M [00:00&lt;00:00, 211MB/s]
+ 45%|####4     | 75.7M/170M [00:00&lt;00:00, 175MB/s]
+ 55%|#####5    | 94.2M/170M [00:00&lt;00:00, 181MB/s]
+ 69%|######9   | 118M/170M [00:00&lt;00:00, 201MB/s]
+ 81%|########  | 137M/170M [00:00&lt;00:00, 188MB/s]
+ 92%|#########2| 157M/170M [00:00&lt;00:00, 193MB/s]
+100%|##########| 170M/170M [00:00&lt;00:00, 191MB/s]
 /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
   for i in range(dim)
 /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the &#39;trunc&#39; function NOT &#39;floor&#39;). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode=&#39;trunc&#39;), or for actual floor division, use torch.div(a, b, rounding_mode=&#39;floor&#39;).
@@ -540,7 +542,7 @@ torchvision rcnn models.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Get 9 valid boxes
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  4.633 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  56.827 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-object-detection-pytorch-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7795da4b258c8feff986668b95ef57ad/deploy_object_detection_pytorch.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_object_detection_pytorch.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized.html b/docs/how_to/deploy_models/deploy_prequantized.html
index 461da45992..28cb9750c3 100644
--- a/docs/how_to/deploy_models/deploy_prequantized.html
+++ b/docs/how_to/deploy_models/deploy_prequantized.html
@@ -480,7 +480,7 @@ training. Other models require a full post training calibration.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/mobilenet_v2-b0353104.pth&quot; to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
 
   0%|          | 0.00/13.6M [00:00&lt;?, ?B/s]
-100%|##########| 13.6M/13.6M [00:00&lt;00:00, 175MB/s]
+100%|##########| 13.6M/13.6M [00:00&lt;00:00, 163MB/s]
 </pre></div>
 </div>
 </div>
@@ -569,7 +569,7 @@ output values are identical out of 1000 outputs from mobilenet v2.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  90.3583      90.2349      95.9917      90.1003       0.6798
+  90.3589      90.1998      96.2542      90.0808       0.7585
 </pre></div>
 </div>
 <div class="admonition note">
@@ -608,7 +608,7 @@ This includes support for the VNNI 8 bit dot product instruction (CascadeLake or
 <div class="section" id="deploy-a-quantized-tflite-model">
 <h2>Deploy a quantized TFLite Model<a class="headerlink" href="#deploy-a-quantized-tflite-model" title="Permalink to this headline">¶</a></h2>
 <p>TODO</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  11.746 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  8.664 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/fb8217c13f4351224c6cf3aacf1a87fc/deploy_prequantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized_tflite.html b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
index d4783f2faf..9c74d768dd 100644
--- a/docs/how_to/deploy_models/deploy_prequantized_tflite.html
+++ b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
@@ -573,7 +573,7 @@ TFLite Top-5 labels: [387 102 386 341 349]
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  120.1182     120.0640     122.5953     119.1741      0.4515
+  120.7587     120.6116     128.2929     119.9034      0.9078
 </pre></div>
 </div>
 <div class="admonition note">
@@ -601,7 +601,7 @@ network for ARM CPU</span></a>.</p></li>
 </ul>
 </div></blockquote>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  3.760 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  58.803 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-tflite-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/56691c7a27d45da61d112276334640d3/deploy_prequantized_tflite.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized_tflite.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_quantized.html b/docs/how_to/deploy_models/deploy_quantized.html
index 2160683b68..d7af3974b7 100644
--- a/docs/how_to/deploy_models/deploy_quantized.html
+++ b/docs/how_to/deploy_models/deploy_quantized.html
@@ -509,7 +509,7 @@ for calibration. But the accuracy might be impacted.</p>
   DeprecationWarning,
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  22.186 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  20.897 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-quantized-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7810ecf51bfc05f7d5e8a400ac3e815d/deploy_quantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_quantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
index 833ed64a85..d16425f0c3 100644
--- a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
+++ b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
@@ -441,26 +441,24 @@ to your device.</p>
 Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
 
   0%|          | 0/132723 [00:00&lt;?, ?KB/s]
-  4%|3         | 4969/132723 [00:00&lt;00:02, 49683.65KB/s]
-  9%|8         | 11927/132723 [00:00&lt;00:01, 61381.21KB/s]
- 14%|#4        | 18833/132723 [00:00&lt;00:01, 64881.68KB/s]
- 19%|#9        | 25633/132723 [00:00&lt;00:01, 66110.05KB/s]
- 25%|##4       | 32560/132723 [00:00&lt;00:01, 67245.72KB/s]
- 30%|##9       | 39569/132723 [00:00&lt;00:01, 68205.90KB/s]
- 35%|###5      | 46558/132723 [00:00&lt;00:01, 68753.42KB/s]
- 40%|####      | 53472/132723 [00:00&lt;00:01, 68874.98KB/s]
- 46%|####5     | 60558/132723 [00:00&lt;00:01, 69493.78KB/s]
- 51%|#####     | 67523/132723 [00:01&lt;00:00, 69539.11KB/s]
- 56%|#####6    | 74559/132723 [00:01&lt;00:00, 69785.86KB/s]
- 61%|######1   | 81588/132723 [00:01&lt;00:00, 69937.91KB/s]
- 67%|######6   | 88627/132723 [00:01&lt;00:00, 70072.09KB/s]
- 72%|#######2  | 95635/132723 [00:01&lt;00:00, 69879.11KB/s]
- 77%|#######7  | 102691/132723 [00:01&lt;00:00, 70082.50KB/s]
- 83%|########2 | 109736/132723 [00:01&lt;00:00, 70190.67KB/s]
- 88%|########7 | 116756/132723 [00:01&lt;00:00, 69938.09KB/s]
- 94%|#########3| 124131/132723 [00:01&lt;00:00, 71077.12KB/s]
- 99%|#########8| 131371/132723 [00:01&lt;00:00, 71470.74KB/s]
-100%|##########| 132723/132723 [00:01&lt;00:00, 69077.37KB/s]
+  4%|4         | 5441/132723 [00:00&lt;00:02, 54406.16KB/s]
+ 10%|9         | 12645/132723 [00:00&lt;00:01, 64776.29KB/s]
+ 15%|#5        | 20398/132723 [00:00&lt;00:01, 70596.88KB/s]
+ 21%|##1       | 28055/132723 [00:00&lt;00:01, 72952.10KB/s]
+ 27%|##6       | 35707/132723 [00:00&lt;00:01, 74235.86KB/s]
+ 33%|###2      | 43497/132723 [00:00&lt;00:01, 75476.30KB/s]
+ 39%|###8      | 51245/132723 [00:00&lt;00:01, 76128.89KB/s]
+ 44%|####4     | 59026/132723 [00:00&lt;00:00, 76661.67KB/s]
+ 50%|#####     | 66770/132723 [00:00&lt;00:00, 76903.73KB/s]
+ 56%|#####6    | 74486/132723 [00:01&lt;00:00, 76980.65KB/s]
+ 62%|######1   | 82236/132723 [00:01&lt;00:00, 77137.31KB/s]
+ 68%|######7   | 89976/132723 [00:01&lt;00:00, 77215.39KB/s]
+ 74%|#######3  | 97739/132723 [00:01&lt;00:00, 77337.60KB/s]
+ 80%|#######9  | 105567/132723 [00:01&lt;00:00, 77619.90KB/s]
+ 85%|########5 | 113350/132723 [00:01&lt;00:00, 77678.39KB/s]
+ 91%|#########1| 121181/132723 [00:01&lt;00:00, 77867.51KB/s]
+ 97%|#########7| 128988/132723 [00:01&lt;00:00, 77926.40KB/s]
+100%|##########| 132723/132723 [00:01&lt;00:00, 75927.83KB/s]
 </pre></div>
 </div>
 <p>Create TVM runtime and do inference
@@ -503,7 +501,7 @@ Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from h
 <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" srcset="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" alt="deploy ssd gluoncv" class = "sphx-glr-single-img"/><p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  37.628 seconds)</p>
+<img src="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" srcset="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" alt="deploy ssd gluoncv" class = "sphx-glr-single-img"/><p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  35.386 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-ssd-gluoncv-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/cccb17d28e5e8b2e94ea8cd5ec59f6ed/deploy_ssd_gluoncv.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_ssd_gluoncv.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/sg_execution_times.html b/docs/how_to/deploy_models/sg_execution_times.html
index 54e3802d1e..2e7b2538e3 100644
--- a/docs/how_to/deploy_models/sg_execution_times.html
+++ b/docs/how_to/deploy_models/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-deploy-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>11:38.358</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
+<p><strong>11:15.103</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 86%" />
@@ -336,35 +336,35 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_object_detection_pytorch.html#sphx-glr-how-to-deploy-models-deploy-object-detection-pytorch-py"><span class="std std-ref">Compile PyTorch Object Detection Models</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_object_detection_pytorch.py</span></code>)</p></td>
-<td><p>03:04.633</p></td>
+<td><p>02:56.827</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_ssd_gluoncv.html#sphx-glr-how-to-deploy-models-deploy-ssd-gluoncv-py"><span class="std std-ref">Deploy Single Shot Multibox Detector(SSD) model</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_ssd_gluoncv.py</span></code>)</p></td>
-<td><p>02:37.628</p></td>
+<td><p>02:35.386</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_prequantized_tflite.html#sphx-glr-how-to-deploy-models-deploy-prequantized-tflite-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM - Part 3 (TFLite)</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized_tflite.py</span></code>)</p></td>
-<td><p>02:03.760</p></td>
+<td><p>01:58.803</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_quantized.html#sphx-glr-how-to-deploy-models-deploy-quantized-py"><span class="std std-ref">Deploy a Quantized Model on Cuda</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_quantized.py</span></code>)</p></td>
-<td><p>01:22.186</p></td>
+<td><p>01:20.897</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_prequantized.html#sphx-glr-how-to-deploy-models-deploy-prequantized-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized.py</span></code>)</p></td>
-<td><p>01:11.746</p></td>
+<td><p>01:08.664</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_model_on_android.html#sphx-glr-how-to-deploy-models-deploy-model-on-android-py"><span class="std std-ref">Deploy the Pretrained Model on Android</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_android.py</span></code>)</p></td>
-<td><p>00:32.246</p></td>
+<td><p>00:29.259</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_model_on_nano.html#sphx-glr-how-to-deploy-models-deploy-model-on-nano-py"><span class="std std-ref">Deploy the Pretrained Model on Jetson Nano</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_nano.py</span></code>)</p></td>
-<td><p>00:23.644</p></td>
+<td><p>00:22.892</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_model_on_rasp.html#sphx-glr-how-to-deploy-models-deploy-model-on-rasp-py"><span class="std std-ref">Deploy the Pretrained Model on Raspberry Pi</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_rasp.py</span></code>)</p></td>
-<td><p>00:22.508</p></td>
+<td><p>00:22.367</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_sparse.html#sphx-glr-how-to-deploy-models-deploy-sparse-py"><span class="std std-ref">Deploy a Hugging Face Pruned Model on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_sparse.py</span></code>)</p></td>
diff --git a/docs/how_to/extend_tvm/bring_your_own_datatypes.html b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
index 9053410a35..2ad2f9fcf6 100644
--- a/docs/how_to/extend_tvm/bring_your_own_datatypes.html
+++ b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
@@ -612,7 +612,7 @@ In this alpha state of the Bring Your Own Datatypes framework, we have not imple
 <span class="n">module</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">params</span></a> <span class="o">=</span> <span class="n">get_mobilenet</span><span class="p">()</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zipbcdcf044-386a-4c95-a634-4f83076f4b8e from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip479f8cbc-19b2-4c49-bd9e-ecc16b4f6a39 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 </pre></div>
 </div>
 <p>It’s easy to execute MobileNet with native TVM:</p>
diff --git a/docs/how_to/extend_tvm/sg_execution_times.html b/docs/how_to/extend_tvm/sg_execution_times.html
index aea5dc5fb8..f6e1476eb5 100644
--- a/docs/how_to/extend_tvm/sg_execution_times.html
+++ b/docs/how_to/extend_tvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-extend-tvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:40.316</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
+<p><strong>00:41.313</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,15 +336,15 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="bring_your_own_datatypes.html#sphx-glr-how-to-extend-tvm-bring-your-own-datatypes-py"><span class="std std-ref">Bring Your Own Datatypes to TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">bring_your_own_datatypes.py</span></code>)</p></td>
-<td><p>00:37.249</p></td>
+<td><p>00:38.151</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="use_pass_instrument.html#sphx-glr-how-to-extend-tvm-use-pass-instrument-py"><span class="std std-ref">How to Use TVM Pass Instrument</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_instrument.py</span></code>)</p></td>
-<td><p>00:02.141</p></td>
+<td><p>00:02.231</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="use_pass_infra.html#sphx-glr-how-to-extend-tvm-use-pass-infra-py"><span class="std std-ref">How to Use TVM Pass Infra</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_infra.py</span></code>)</p></td>
-<td><p>00:00.918</p></td>
+<td><p>00:00.924</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="low_level_custom_pass.html#sphx-glr-how-to-extend-tvm-low-level-custom-pass-py"><span class="std std-ref">Writing a Customized Pass</span></a> (<code class="docutils literal notranslate"><span class="pre">low_level_custom_pass.py</span></code>)</p></td>
diff --git a/docs/how_to/extend_tvm/use_pass_instrument.html b/docs/how_to/extend_tvm/use_pass_instrument.html
index 1ee1b3f0cd..bdb45423b0 100644
--- a/docs/how_to/extend_tvm/use_pass_instrument.html
+++ b/docs/how_to/extend_tvm/use_pass_instrument.html
@@ -512,10 +512,10 @@ profile the execution time of each passes.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 7733us [7733us] (49.58%; 49.58%)
-FoldScaleAxis: 7863us [5us] (50.42%; 50.42%)
-        FoldConstant: 7858us [1616us] (50.38%; 99.94%)
-                InferType: 6242us [6242us] (40.02%; 79.43%)
+InferType: 6733us [6733us] (46.27%; 46.27%)
+FoldScaleAxis: 7818us [5us] (53.73%; 53.73%)
+        FoldConstant: 7813us [1630us] (53.69%; 99.93%)
+                InferType: 6183us [6183us] (42.49%; 79.14%)
 </pre></div>
 </div>
 </div>
@@ -537,10 +537,10 @@ Refer to following sections and <a class="reference internal" href="../../refere
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 6329us [6329us] (44.35%; 44.35%)
-FoldScaleAxis: 7943us [4us] (55.65%; 55.65%)
-        FoldConstant: 7939us [1637us] (55.62%; 99.95%)
-                InferType: 6302us [6302us] (44.16%; 79.38%)
+InferType: 6205us [6205us] (44.24%; 44.24%)
+FoldScaleAxis: 7820us [4us] (55.76%; 55.76%)
+        FoldConstant: 7816us [1685us] (55.73%; 99.94%)
+                InferType: 6131us [6131us] (43.72%; 78.45%)
 </pre></div>
 </div>
 <p>Register empty list to clear existing instruments.</p>
diff --git a/docs/how_to/optimize_operators/opt_conv_cuda.html b/docs/how_to/optimize_operators/opt_conv_cuda.html
index 5a3cb0a9ba..5a35289198 100644
--- a/docs/how_to/optimize_operators/opt_conv_cuda.html
+++ b/docs/how_to/optimize_operators/opt_conv_cuda.html
@@ -564,7 +564,7 @@ latency of convolution.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Convolution: </span><span class="si">%f</span><span class="s2"> ms&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span> <span class="o">*</span> <span cl [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 54.151693 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 54.099895 ms
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-optimize-operators-opt-conv-cuda-py">
diff --git a/docs/how_to/optimize_operators/opt_conv_tensorcore.html b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
index 1b9320185c..eb832b8ce6 100644
--- a/docs/how_to/optimize_operators/opt_conv_tensorcore.html
+++ b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
@@ -906,7 +906,7 @@ be able to run on our build server</p>
     <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;conv2d with tensor core: </span><span class="si">%f</span><span class="s2"> ms&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span> <span class="o">* [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 8.537677 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 6.482232 ms
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/optimize_operators/opt_gemm.html b/docs/how_to/optimize_operators/opt_gemm.html
index 5a8f3e4f6b..2766fd345e 100644
--- a/docs/how_to/optimize_operators/opt_gemm.html
+++ b/docs/how_to/optimize_operators/opt_gemm.html
@@ -461,8 +461,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Baseline: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018252
-Baseline: 3.441999
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018508
+Baseline: 3.439445
 </pre></div>
 </div>
 <p>In TVM, we can always inspect lower level IR to debug or optimize our schedule.
@@ -522,7 +522,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt1: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.295153
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.298512
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -589,7 +589,7 @@ vastly.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt2: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.325529
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.332190
 </pre></div>
 </div>
 <p>Here is the generated IR after vectorization.</p>
@@ -650,7 +650,7 @@ the access pattern for A matrix is more cache friendly.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt3: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.115407
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.115539
 </pre></div>
 </div>
 <p>Here is the generated IR after loop permutation.</p>
@@ -733,7 +733,7 @@ flattening.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt4: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.110130
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.109163
 </pre></div>
 </div>
 <p>Here is the generated IR after array packing.</p>
@@ -819,7 +819,7 @@ write to C when all the block results are ready.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt5: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.110737
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.110682
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -909,7 +909,7 @@ write to C when all the block results are ready.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt6: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">opt6_time</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.147071
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.147013
 </pre></div>
 </div>
 <p>Here is the generated IR after parallelization.</p>
diff --git a/docs/how_to/optimize_operators/sg_execution_times.html b/docs/how_to/optimize_operators/sg_execution_times.html
index 2f3157ac11..03e4c88785 100644
--- a/docs/how_to/optimize_operators/sg_execution_times.html
+++ b/docs/how_to/optimize_operators/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-optimize-operators-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:34.479</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
+<p><strong>00:34.498</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,15 +336,15 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="opt_gemm.html#sphx-glr-how-to-optimize-operators-opt-gemm-py"><span class="std std-ref">How to optimize GEMM on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_gemm.py</span></code>)</p></td>
-<td><p>00:32.111</p></td>
+<td><p>00:32.243</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="opt_conv_tensorcore.html#sphx-glr-how-to-optimize-operators-opt-conv-tensorcore-py"><span class="std std-ref">How to optimize convolution using TensorCores</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_tensorcore.py</span></code>)</p></td>
-<td><p>00:01.293</p></td>
+<td><p>00:01.220</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="opt_conv_cuda.html#sphx-glr-how-to-optimize-operators-opt-conv-cuda-py"><span class="std std-ref">How to optimize convolution on GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_cuda.py</span></code>)</p></td>
-<td><p>00:01.074</p></td>
+<td><p>00:01.035</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
index 3b535dbd95..29441d9b95 100644
--- a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
+++ b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autoscheduler-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>06:42.374</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
+<p><strong>06:24.910</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 85%" />
@@ -336,27 +336,27 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_conv2d_layer_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py"><span class="std std-ref">Auto-scheduling a Convolution Layer for GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_layer_cuda.py</span></code>)</p></td>
-<td><p>03:38.089</p></td>
+<td><p>03:28.377</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_network_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-x86-py"><span class="std std-ref">Auto-scheduling a Neural Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_x86.py</span></code>)</p></td>
-<td><p>01:24.486</p></td>
+<td><p>01:22.647</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_network_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-cuda-py"><span class="std std-ref">Auto-scheduling a Neural Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_cuda.py</span></code>)</p></td>
-<td><p>00:58.199</p></td>
+<td><p>00:56.055</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_sparse_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-sparse-x86-py"><span class="std std-ref">Auto-scheduling Sparse Matrix Multiplication on CPU with Custom Sketch Rule</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_sparse_x86.py</span></code>)</p></td>
-<td><p>00:23.694</p></td>
+<td><p>00:20.373</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="tune_network_arm.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-arm-py"><span class="std std-ref">Auto-scheduling a Neural Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_arm.py</span></code>)</p></td>
-<td><p>00:09.050</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="tune_network_mali.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-mali-py"><span class="std std-ref">Auto-scheduling a Neural Network for mali GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_mali.py</span></code>)</p></td>
+<td><p>00:08.879</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="tune_network_mali.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-mali-py"><span class="std std-ref">Auto-scheduling a Neural Network for mali GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_mali.py</span></code>)</p></td>
-<td><p>00:08.856</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="tune_network_arm.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-arm-py"><span class="std std-ref">Auto-scheduling a Neural Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_arm.py</span></code>)</p></td>
+<td><p>00:08.579</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
index 824f1834b6..1e22f595eb 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
@@ -1004,7 +1004,7 @@ cooperative fetching, unrolling and operator fusion.</p>
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.375 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.364 ms
 </pre></div>
 </div>
 </div>
@@ -1567,7 +1567,7 @@ In the example below we resume the status and do more 5 trials.</p>
 Get devices for measurement successfully!
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  38.089 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  28.377 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e3e540f3b477c0c52d8eb73e674e8ffd/tune_conv2d_layer_cuda.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_conv2d_layer_cuda.py</span></code></a></p>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
index 270a676883..4a136f12e0 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
@@ -906,7 +906,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   8.2668       8.2655       8.2728       8.2623       0.0044
+   8.1831       8.1809       8.1958       8.1727       0.0096
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
index 149eb8f157..e0aedd54ed 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
@@ -925,7 +925,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  760.9283     760.8877     761.6646     760.2326      0.5853
+  753.8122     753.3414     754.9024     753.1927      0.7733
 </pre></div>
 </div>
 </div>
@@ -947,7 +947,7 @@ to learn how to use the RPC Tracker and RPC Server.
 To use the RPC Tracker in auto-scheduler, replace the runner in <code class="code docutils literal notranslate"><span class="pre">TuningOptions</span></code>
 with <a class="reference internal" href="../../reference/api/python/auto_scheduler.html#tvm.auto_scheduler.RPCRunner" title="tvm.auto_scheduler.RPCRunner"><code class="xref any py py-class docutils literal notranslate"><span class="pre">auto_scheduler.RPCRunner</span></code></a>.</p></li>
 </ol>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  24.486 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  22.647 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-network-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e416b94ca1090b0897c0f6e0df95b911/tune_network_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_network_x86.py</span></code></a></p>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
index 47b8c512e5..e2332b4f86 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
@@ -625,339 +625,106 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
              placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
              compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
   buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-  preflattened_buffer_map = {placeholder_8: placeholder_15: Buffer(placeholder_13, int32, [33], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_9: placeholder_17: Buffer(placeholder_14, float32, [128, 512], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_7: placeholder_19: Buffer(placeholder_12, int32, [4916], [])} {
-  for (i0.outer.i1.outer.fused: int32, 0, 256) &quot;parallel&quot; {
-    allocate(compute_4: Pointer(global float32), float32, [256]), storage_scope = global {
+  preflattened_buffer_map = {compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_8: placeholder_15: Buffer(placeholder_13, int32, [33], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_7: placeholder_17: Buffer(placeholder_12, int32, [4916], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_9: placeholder_19: Buffer(placeholder_14, float32, [128, 512], [])} {
+  for (i0.outer.i1.outer.fused: int32, 0, 64) &quot;parallel&quot; {
+    allocate(compute_4: Pointer(global float32), float32, [1024]), storage_scope = global {
       for (i.outer.inner: int32, 0, 4) {
-        let cse_var_2: int32 = floormod(i0.outer.i1.outer.fused, 32)
-        let cse_var_1: int32 = (i.outer.inner*64)
-         {
-          compute_5: Buffer(compute_4, float32, [256], [])[cse_var_1] = 0f32
-          compute_5[(cse_var_1 + 1)] = 0f32
-          compute_5[(cse_var_1 + 2)] = 0f32
-          compute_5[(cse_var_1 + 3)] = 0f32
-          compute_5[(cse_var_1 + 4)] = 0f32
-          compute_5[(cse_var_1 + 5)] = 0f32
-          compute_5[(cse_var_1 + 6)] = 0f32
-          compute_5[(cse_var_1 + 7)] = 0f32
-          compute_5[(cse_var_1 + 8)] = 0f32
-          compute_5[(cse_var_1 + 9)] = 0f32
-          compute_5[(cse_var_1 + 10)] = 0f32
-          compute_5[(cse_var_1 + 11)] = 0f32
-          compute_5[(cse_var_1 + 12)] = 0f32
-          compute_5[(cse_var_1 + 13)] = 0f32
-          compute_5[(cse_var_1 + 14)] = 0f32
-          compute_5[(cse_var_1 + 15)] = 0f32
-          compute_5[(cse_var_1 + 16)] = 0f32
-          compute_5[(cse_var_1 + 17)] = 0f32
-          compute_5[(cse_var_1 + 18)] = 0f32
-          compute_5[(cse_var_1 + 19)] = 0f32
-          compute_5[(cse_var_1 + 20)] = 0f32
-          compute_5[(cse_var_1 + 21)] = 0f32
-          compute_5[(cse_var_1 + 22)] = 0f32
-          compute_5[(cse_var_1 + 23)] = 0f32
-          compute_5[(cse_var_1 + 24)] = 0f32
-          compute_5[(cse_var_1 + 25)] = 0f32
-          compute_5[(cse_var_1 + 26)] = 0f32
-          compute_5[(cse_var_1 + 27)] = 0f32
-          compute_5[(cse_var_1 + 28)] = 0f32
-          compute_5[(cse_var_1 + 29)] = 0f32
-          compute_5[(cse_var_1 + 30)] = 0f32
-          compute_5[(cse_var_1 + 31)] = 0f32
-          compute_5[(cse_var_1 + 32)] = 0f32
-          compute_5[(cse_var_1 + 33)] = 0f32
-          compute_5[(cse_var_1 + 34)] = 0f32
-          compute_5[(cse_var_1 + 35)] = 0f32
-          compute_5[(cse_var_1 + 36)] = 0f32
-          compute_5[(cse_var_1 + 37)] = 0f32
-          compute_5[(cse_var_1 + 38)] = 0f32
-          compute_5[(cse_var_1 + 39)] = 0f32
-          compute_5[(cse_var_1 + 40)] = 0f32
-          compute_5[(cse_var_1 + 41)] = 0f32
-          compute_5[(cse_var_1 + 42)] = 0f32
-          compute_5[(cse_var_1 + 43)] = 0f32
-          compute_5[(cse_var_1 + 44)] = 0f32
-          compute_5[(cse_var_1 + 45)] = 0f32
-          compute_5[(cse_var_1 + 46)] = 0f32
-          compute_5[(cse_var_1 + 47)] = 0f32
-          compute_5[(cse_var_1 + 48)] = 0f32
-          compute_5[(cse_var_1 + 49)] = 0f32
-          compute_5[(cse_var_1 + 50)] = 0f32
-          compute_5[(cse_var_1 + 51)] = 0f32
-          compute_5[(cse_var_1 + 52)] = 0f32
-          compute_5[(cse_var_1 + 53)] = 0f32
-          compute_5[(cse_var_1 + 54)] = 0f32
-          compute_5[(cse_var_1 + 55)] = 0f32
-          compute_5[(cse_var_1 + 56)] = 0f32
-          compute_5[(cse_var_1 + 57)] = 0f32
-          compute_5[(cse_var_1 + 58)] = 0f32
-          compute_5[(cse_var_1 + 59)] = 0f32
-          compute_5[(cse_var_1 + 60)] = 0f32
-          compute_5[(cse_var_1 + 61)] = 0f32
-          compute_5[(cse_var_1 + 62)] = 0f32
-          compute_5[(cse_var_1 + 63)] = 0f32
-          for (elem_idx: int32, 0, (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])) {
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              compute_5[cse_var_1] = (compute_5[cse_var_1] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_3: int32 = (cse_var_1 + 1)
-              compute_5[cse_var_3] = (compute_5[cse_var_3] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_4: int32 = (cse_var_1 + 2)
-              compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_5: int32 = (cse_var_1 + 3)
-              compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_6: int32 = (cse_var_1 + 4)
-              compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_7: int32 = (cse_var_1 + 5)
-              compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_8: int32 = (cse_var_1 + 6)
-              compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_9: int32 = (cse_var_1 + 7)
-              compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_10: int32 = (cse_var_1 + 8)
-              compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_11: int32 = (cse_var_1 + 9)
-              compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_12: int32 = (cse_var_1 + 10)
-              compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_13: int32 = (cse_var_1 + 11)
-              compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_14: int32 = (cse_var_1 + 12)
-              compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_15: int32 = (cse_var_1 + 13)
-              compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_16: int32 = (cse_var_1 + 14)
-              compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_17: int32 = (cse_var_1 + 15)
-              compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[(((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)])], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_18: int32 = (cse_var_1 + 16)
-              compute_5[cse_var_18] = (compute_5[cse_var_18] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_19: int32 = (cse_var_1 + 17)
-              compute_5[cse_var_19] = (compute_5[cse_var_19] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_20: int32 = (cse_var_1 + 18)
-              compute_5[cse_var_20] = (compute_5[cse_var_20] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_21: int32 = (cse_var_1 + 19)
-              compute_5[cse_var_21] = (compute_5[cse_var_21] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_22: int32 = (cse_var_1 + 20)
-              compute_5[cse_var_22] = (compute_5[cse_var_22] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_23: int32 = (cse_var_1 + 21)
-              compute_5[cse_var_23] = (compute_5[cse_var_23] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_24: int32 = (cse_var_1 + 22)
-              compute_5[cse_var_24] = (compute_5[cse_var_24] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_25: int32 = (cse_var_1 + 23)
-              compute_5[cse_var_25] = (compute_5[cse_var_25] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_26: int32 = (cse_var_1 + 24)
-              compute_5[cse_var_26] = (compute_5[cse_var_26] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_27: int32 = (cse_var_1 + 25)
-              compute_5[cse_var_27] = (compute_5[cse_var_27] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_28: int32 = (cse_var_1 + 26)
-              compute_5[cse_var_28] = (compute_5[cse_var_28] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_29: int32 = (cse_var_1 + 27)
-              compute_5[cse_var_29] = (compute_5[cse_var_29] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_30: int32 = (cse_var_1 + 28)
-              compute_5[cse_var_30] = (compute_5[cse_var_30] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_31: int32 = (cse_var_1 + 29)
-              compute_5[cse_var_31] = (compute_5[cse_var_31] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_32: int32 = (cse_var_1 + 30)
-              compute_5[cse_var_32] = (compute_5[cse_var_32] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_33: int32 = (cse_var_1 + 31)
-              compute_5[cse_var_33] = (compute_5[cse_var_33] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 256)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_34: int32 = (cse_var_1 + 32)
-              compute_5[cse_var_34] = (compute_5[cse_var_34] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_35: int32 = (cse_var_1 + 33)
-              compute_5[cse_var_35] = (compute_5[cse_var_35] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_36: int32 = (cse_var_1 + 34)
-              compute_5[cse_var_36] = (compute_5[cse_var_36] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_37: int32 = (cse_var_1 + 35)
-              compute_5[cse_var_37] = (compute_5[cse_var_37] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_38: int32 = (cse_var_1 + 36)
-              compute_5[cse_var_38] = (compute_5[cse_var_38] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_39: int32 = (cse_var_1 + 37)
-              compute_5[cse_var_39] = (compute_5[cse_var_39] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_40: int32 = (cse_var_1 + 38)
-              compute_5[cse_var_40] = (compute_5[cse_var_40] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_41: int32 = (cse_var_1 + 39)
-              compute_5[cse_var_41] = (compute_5[cse_var_41] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_42: int32 = (cse_var_1 + 40)
-              compute_5[cse_var_42] = (compute_5[cse_var_42] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_43: int32 = (cse_var_1 + 41)
-              compute_5[cse_var_43] = (compute_5[cse_var_43] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_44: int32 = (cse_var_1 + 42)
-              compute_5[cse_var_44] = (compute_5[cse_var_44] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_45: int32 = (cse_var_1 + 43)
-              compute_5[cse_var_45] = (compute_5[cse_var_45] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_46: int32 = (cse_var_1 + 44)
-              compute_5[cse_var_46] = (compute_5[cse_var_46] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_47: int32 = (cse_var_1 + 45)
-              compute_5[cse_var_47] = (compute_5[cse_var_47] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_48: int32 = (cse_var_1 + 46)
-              compute_5[cse_var_48] = (compute_5[cse_var_48] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_49: int32 = (cse_var_1 + 47)
-              compute_5[cse_var_49] = (compute_5[cse_var_49] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 512)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_50: int32 = (cse_var_1 + 48)
-              compute_5[cse_var_50] = (compute_5[cse_var_50] + (placeholder_1[((placeholder_3[cse_var_2]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_51: int32 = (cse_var_1 + 49)
-              compute_5[cse_var_51] = (compute_5[cse_var_51] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_52: int32 = (cse_var_1 + 50)
-              compute_5[cse_var_52] = (compute_5[cse_var_52] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_53: int32 = (cse_var_1 + 51)
-              compute_5[cse_var_53] = (compute_5[cse_var_53] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_54: int32 = (cse_var_1 + 52)
-              compute_5[cse_var_54] = (compute_5[cse_var_54] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_55: int32 = (cse_var_1 + 53)
-              compute_5[cse_var_55] = (compute_5[cse_var_55] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_56: int32 = (cse_var_1 + 54)
-              compute_5[cse_var_56] = (compute_5[cse_var_56] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_57: int32 = (cse_var_1 + 55)
-              compute_5[cse_var_57] = (compute_5[cse_var_57] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_58: int32 = (cse_var_1 + 56)
-              compute_5[cse_var_58] = (compute_5[cse_var_58] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_59: int32 = (cse_var_1 + 57)
-              compute_5[cse_var_59] = (compute_5[cse_var_59] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_60: int32 = (cse_var_1 + 58)
-              compute_5[cse_var_60] = (compute_5[cse_var_60] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_61: int32 = (cse_var_1 + 59)
-              compute_5[cse_var_61] = (compute_5[cse_var_61] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_62: int32 = (cse_var_1 + 60)
-              compute_5[cse_var_62] = (compute_5[cse_var_62] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_63: int32 = (cse_var_1 + 61)
-              compute_5[cse_var_63] = (compute_5[cse_var_63] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_64: int32 = (cse_var_1 + 62)
-              compute_5[cse_var_64] = (compute_5[cse_var_64] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
-            }
-            if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])), dtype=bool) {
-              let cse_var_65: int32 = (cse_var_1 + 63)
-              compute_5[cse_var_65] = (compute_5[cse_var_65] + (placeholder_1[(((placeholder_3[cse_var_2]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*4096) + (i.outer.inner*1024)) + placeholder_2[(placeholder_3[cse_var_2] + elem_idx)]) + 768)], 0f32)))
+        for (i.inner.init: int32, 0, 16) {
+          let cse_var_1: int32 = ((i.outer.inner*256) + (i.inner.init*16))
+           {
+            compute_5: Buffer(compute_4, float32, [1024], [])[cse_var_1] = 0f32
+            compute_5[(cse_var_1 + 1)] = 0f32
+            compute_5[(cse_var_1 + 2)] = 0f32
+            compute_5[(cse_var_1 + 3)] = 0f32
+            compute_5[(cse_var_1 + 4)] = 0f32
+            compute_5[(cse_var_1 + 5)] = 0f32
+            compute_5[(cse_var_1 + 6)] = 0f32
+            compute_5[(cse_var_1 + 7)] = 0f32
+            compute_5[(cse_var_1 + 8)] = 0f32
+            compute_5[(cse_var_1 + 9)] = 0f32
+            compute_5[(cse_var_1 + 10)] = 0f32
+            compute_5[(cse_var_1 + 11)] = 0f32
+            compute_5[(cse_var_1 + 12)] = 0f32
+            compute_5[(cse_var_1 + 13)] = 0f32
+            compute_5[(cse_var_1 + 14)] = 0f32
+            compute_5[(cse_var_1 + 15)] = 0f32
+          }
+        }
+        for (elem_idx: int32, 0, let cse_var_2: int32 = floormod(i0.outer.i1.outer.fused, 32) in (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])) {
+          for (i.inner: int32, 0, 16) {
+            let cse_var_3: int32 = floormod(i0.outer.i1.outer.fused, 32)
+             {
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_4: int32 = ((i.outer.inner*256) + (i.inner*16))
+                compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[((placeholder_3[cse_var_3]*16) + (elem_idx*16))]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_5: int32 = (((i.outer.inner*256) + (i.inner*16)) + 1)
+                compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 1)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_6: int32 = (((i.outer.inner*256) + (i.inner*16)) + 2)
+                compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 2)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_7: int32 = (((i.outer.inner*256) + (i.inner*16)) + 3)
+                compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 3)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_8: int32 = (((i.outer.inner*256) + (i.inner*16)) + 4)
+                compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 4)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_9: int32 = (((i.outer.inner*256) + (i.inner*16)) + 5)
+                compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 5)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_10: int32 = (((i.outer.inner*256) + (i.inner*16)) + 6)
+                compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 6)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_11: int32 = (((i.outer.inner*256) + (i.inner*16)) + 7)
+                compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 7)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_12: int32 = (((i.outer.inner*256) + (i.inner*16)) + 8)
+                compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 8)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_13: int32 = (((i.outer.inner*256) + (i.inner*16)) + 9)
+                compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 9)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_14: int32 = (((i.outer.inner*256) + (i.inner*16)) + 10)
+                compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 10)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_15: int32 = (((i.outer.inner*256) + (i.inner*16)) + 11)
+                compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 11)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_16: int32 = (((i.outer.inner*256) + (i.inner*16)) + 12)
+                compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 12)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_17: int32 = (((i.outer.inner*256) + (i.inner*16)) + 13)
+                compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 13)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_18: int32 = (((i.outer.inner*256) + (i.inner*16)) + 14)
+                compute_5[cse_var_18] = (compute_5[cse_var_18] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 14)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
+              if @tir.likely((elem_idx &lt; (placeholder_3[(cse_var_3 + 1)] - placeholder_3[cse_var_3])), dtype=bool) {
+                let cse_var_19: int32 = (((i.outer.inner*256) + (i.inner*16)) + 15)
+                compute_5[cse_var_19] = (compute_5[cse_var_19] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + 15)]*max(placeholder[((((floordiv(i0.outer.i1.outer.fused, 32)*16384) + (i.outer.inner*4096)) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+              }
             }
           }
         }
       }
-      for (i0.inner: int32, 0, 16) {
-        let cse_var_66: int32 = (((floordiv(i0.outer.i1.outer.fused, 32)*8192) + (i0.inner*512)) + (floormod(i0.outer.i1.outer.fused, 32)*16))
-        compute[ramp(cse_var_66, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_66, 1, 16)]), broadcast(0f32, 16))
+      for (i0.inner: int32, 0, 64) {
+        let cse_var_20: int32 = (((floordiv(i0.outer.i1.outer.fused, 32)*32768) + (i0.inner*512)) + (floormod(i0.outer.i1.outer.fused, 32)*16))
+        compute[ramp(cse_var_20, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_20, 1, 16)]), broadcast(0f32, 16))
       }
     }
   }
@@ -995,7 +762,7 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 3.106 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 2.112 ms
 </pre></div>
 </div>
 <div class="admonition note">
diff --git a/docs/how_to/tune_with_autotvm/sg_execution_times.html b/docs/how_to/tune_with_autotvm/sg_execution_times.html
index c438274bc6..9788b85280 100644
--- a/docs/how_to/tune_with_autotvm/sg_execution_times.html
+++ b/docs/how_to/tune_with_autotvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autotvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:45.933</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
+<p><strong>00:46.853</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,7 +336,7 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_conv2d_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-conv2d-cuda-py"><span class="std std-ref">Tuning High Performance Convolution on NVIDIA GPUs</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_cuda.py</span></code>)</p></td>
-<td><p>00:45.898</p></td>
+<td><p>00:46.818</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_relay_x86.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-x86-py"><span class="std std-ref">Auto-tuning a Convolutional Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_x86.py</span></code>)</p></td>
@@ -347,11 +347,11 @@
 <td><p>00:00.005</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="tune_relay_arm.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-arm-py"><span class="std std-ref">Auto-tuning a Convolutional Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_arm.py</span></code>)</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="tune_relay_mobile_gpu.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-mobile-gpu-py"><span class="std std-ref">Auto-tuning a Convolutional Network for Mobile GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_mobile_gpu.py</span></code>)</p></td>
 <td><p>00:00.005</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="tune_relay_mobile_gpu.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-mobile-gpu-py"><span class="std std-ref">Auto-tuning a Convolutional Network for Mobile GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_mobile_gpu.py</span></code>)</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="tune_relay_arm.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-arm-py"><span class="std std-ref">Auto-tuning a Convolutional Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_arm.py</span></code>)</p></td>
 <td><p>00:00.005</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
diff --git a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
index 8122a944bd..2bacf3fdac 100644
--- a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
+++ b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
@@ -1436,8 +1436,8 @@ No: 8   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
 TimeoutError
 
         [(&#39;tile_f&#39;, [-1, 2, 1, 64]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4909501
-No: 9   GFLOPS: 115.95/115.95   result: MeasureResult(costs=(0.0019965838571428573,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.0792434215545654, timestamp=1663246830.32257)        [(&#39;tile_f&#39;, [-1, 1, 4, 8]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 2, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5072689
-No: 10  GFLOPS: 0.00/115.95     result: Traceback (most recent call last):
+No: 9   GFLOPS: 177.31/177.31   result: MeasureResult(costs=(0.0013056615444444445,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8902785778045654, timestamp=1663277972.252658)       [(&#39;tile_f&#39;, [-1, 1, 4, 8]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 2, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5072689
+No: 10  GFLOPS: 0.00/177.31     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1560,8 +1560,8 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 4, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 64, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5092711
-No: 11  GFLOPS: 258.69/258.69   result: MeasureResult(costs=(0.0008948907678571429,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.5719780921936035, timestamp=1663246831.0225298)      [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
-No: 12  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+No: 11  GFLOPS: 260.55/260.55   result: MeasureResult(costs=(0.0008885024861878452,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6748344898223877, timestamp=1663277973.1763225)      [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
+No: 12  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1684,7 +1684,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 128, 1, 2]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 256]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 0)],None,183542
-No: 13  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+No: 13  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1807,7 +1807,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 8, 8]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 64]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2482196
-No: 14  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+No: 14  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1930,9 +1930,9 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 64, 1, 4]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,10306226
-No: 15  GFLOPS: 5.29/258.69     result: MeasureResult(costs=(0.043792867,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8540875911712646, timestamp=1663246835.5596945)        [(&#39;tile_f&#39;, [-1, 2, 2, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 8]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,5330964
-No: 16  GFLOPS: 3.33/258.69     result: MeasureResult(costs=(0.06942636725,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.532775402069092, timestamp=1663246836.8009863)       [(&#39;tile_f&#39;, [-1, 8, 4, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2140058
-No: 17  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+No: 15  GFLOPS: 5.45/260.55     result: MeasureResult(costs=(0.04245233475,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8400776386260986, timestamp=1663277977.7528288)      [(&#39;tile_f&#39;, [-1, 2, 2, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 8]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,5330964
+No: 16  GFLOPS: 3.35/260.55     result: MeasureResult(costs=(0.06918441774999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.568036079406738, timestamp=1663277978.9816248) [(&#39;tile_f&#39;, [-1, 8, 4, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2140058
+No: 17  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 142, in build
     res = future.result()
   File &quot;/usr/lib/python3.7/concurrent/futures/_base.py&quot;, line 435, in result
@@ -1950,8 +1950,8 @@ No: 17  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
 TimeoutError
 
         [(&#39;tile_f&#39;, [-1, 2, 2, 1]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 16]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,10195251
-No: 18  GFLOPS: 28.15/258.69    result: MeasureResult(costs=(0.008225285214285715,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.3043599128723145, timestamp=1663246847.8164864)       [(&#39;tile_f&#39;, [-1, 4, 8, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6068603
-No: 19  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+No: 18  GFLOPS: 26.56/260.55    result: MeasureResult(costs=(0.008714610333333333,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.167480707168579, timestamp=1663277989.870366) [(&#39;tile_f&#39;, [-1, 4, 8, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6068603
+No: 19  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -2074,7 +2074,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 16, 4, 8]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 128]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6956993
-No: 20  GFLOPS: 0.00/258.69     result: Traceback (most recent call last):
+No: 20  GFLOPS: 0.00/260.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -2237,7 +2237,7 @@ and measure running time.</p>
 Best config:
 [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
 Finish loading 20 records
-Time cost of this operator: 0.001297
+Time cost of this operator: 0.001237
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autotvm-tune-conv2d-cuda-py">
diff --git a/docs/how_to/work_with_microtvm/micro_autotune.html b/docs/how_to/work_with_microtvm/micro_autotune.html
index e34b4c0927..1005c60637 100644
--- a/docs/how_to/work_with_microtvm/micro_autotune.html
+++ b/docs/how_to/work_with_microtvm/micro_autotune.html
@@ -584,10 +584,10 @@ the tuned operator.</p>
 ########## Build without Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
 ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  311.9     98.723   (1, 2, 10, 10, 3)  2       1        [311.9]
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.069     0.971    (1, 6, 10, 10)     1       1        [3.069]
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.964     0.305    (1, 1, 10, 10, 3)  1       1        [0.964]
-Total_time                                    -                                             315.933   -        -                  -       -        -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  308.3     98.717   (1, 2, 10, 10, 3)  2       1        [308.3]
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.045     0.975    (1, 6, 10, 10)     1       1        [3.045]
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.963     0.309    (1, 1, 10, 10, 3)  1       1        [0.963]
+Total_time                                    -                                             312.308   -        -                  -       -        -
 </pre></div>
 </div>
 </div>
@@ -640,10 +640,10 @@ Total_time                                    -
 ########## Build with Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
 ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  216.9     98.597   (1, 1, 10, 10, 6)  2       1        [216.9]
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       2.243     1.02     (1, 6, 10, 10)     1       1        [2.243]
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.843     0.383    (1, 3, 10, 10, 1)  1       1        [0.843]
-Total_time                                    -                                             219.985   -        -                  -       -        -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  79.75     96.642   (1, 6, 10, 10, 1)  2       1        [79.75]
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.795     2.175    (1, 6, 10, 10)     1       1        [1.795]
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.976     1.182    (1, 1, 10, 10, 3)  1       1        [0.976]
+Total_time                                    -                                             82.521    -        -                  -       -        -
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-work-with-microtvm-micro-autotune-py">
diff --git a/docs/how_to/work_with_microtvm/micro_train.html b/docs/how_to/work_with_microtvm/micro_train.html
index 820db7b58b..aa9ae3565c 100644
--- a/docs/how_to/work_with_microtvm/micro_train.html
+++ b/docs/how_to/work_with_microtvm/micro_train.html
@@ -516,7 +516,7 @@ take about <strong>2 minutes</strong> to download the Stanford Cars, while COCO
 <a href="https://docs.python.org/3/library/shutil.html#shutil.move" title="shutil.move" class="sphx-glr-backref-module-shutil sphx-glr-backref-type-py-function"><span class="n">shutil</span><span class="o">.</span><span class="n">move</span></a><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><a href="https://docs.python.org/3/library/stdtypes.html#str" title="builtins.str" class="sphx-glr-backref-module-builtins sphx-glr-backref-typ [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&#39;/tmp/tmp9cqkckzz/images/random&#39;
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&#39;/tmp/tmpl09erv86/images/random&#39;
 </pre></div>
 </div>
 </div>
@@ -576,8 +576,8 @@ objects to other stuff? We can display some examples from our datasets using <co
     <span class="n">plt</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&quot;off&quot;</span><span class="p">)</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_micro_train_001.png" srcset="../../_images/sphx_glr_micro_train_001.png" alt="[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/tmp/tmp9cqkckzz/images/target contains 8144 images
-/tmp/tmp9cqkckzz/images/random contains 5000 images
+<img src="../../_images/sphx_glr_micro_train_001.png" srcset="../../_images/sphx_glr_micro_train_001.png" alt="[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/tmp/tmpl09erv86/images/target contains 8144 images
+/tmp/tmpl09erv86/images/random contains 5000 images
 </pre></div>
 </div>
 </div>
@@ -689,13 +689,13 @@ the time on our validation set).</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Epoch 1/3
-328/328 - 47s - loss: 0.2253 - accuracy: 0.9215 - val_loss: 0.1274 - val_accuracy: 0.9603 - 47s/epoch - 143ms/step
+328/328 - 47s - loss: 0.2132 - accuracy: 0.9262 - val_loss: 0.1393 - val_accuracy: 0.9566 - 47s/epoch - 142ms/step
 Epoch 2/3
-328/328 - 44s - loss: 0.0992 - accuracy: 0.9623 - val_loss: 0.1189 - val_accuracy: 0.9641 - 44s/epoch - 133ms/step
+328/328 - 43s - loss: 0.0981 - accuracy: 0.9622 - val_loss: 0.1133 - val_accuracy: 0.9622 - 43s/epoch - 132ms/step
 Epoch 3/3
-328/328 - 45s - loss: 0.0663 - accuracy: 0.9753 - val_loss: 0.1578 - val_accuracy: 0.9569 - 45s/epoch - 136ms/step
+328/328 - 43s - loss: 0.0669 - accuracy: 0.9742 - val_loss: 0.1277 - val_accuracy: 0.9637 - 43s/epoch - 131ms/step
 
-&lt;keras.callbacks.History object at 0x7f98da090e90&gt;
+&lt;keras.callbacks.History object at 0x7f6aad2bd110&gt;
 </pre></div>
 </div>
 </div>
@@ -961,7 +961,7 @@ as intended.</p>
 <p>From here, we could modify the model to read live images from the camera - we have another
 Arduino tutorial for how to do that <a class="reference external" href="https://github.com/guberti/tvm-arduino-demos/tree/master/examples/person_detection">on GitHub</a>. Alternatively, we could also
 <a class="reference external" href="https://tvm.apache.org/docs/how_to/work_with_microtvm/micro_autotune.html">use TVM’s autotuning capabilities</a> to dramatically improve the model’s performance.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 4 minutes  37.645 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 4 minutes  24.933 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-work-with-microtvm-micro-train-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/b52cec46baf4f78d6bcd94cbe269c8a6/micro_train.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">micro_train.py</span></code></a></p>
diff --git a/docs/how_to/work_with_microtvm/sg_execution_times.html b/docs/how_to/work_with_microtvm/sg_execution_times.html
index 075378b354..a7d359ffe9 100644
--- a/docs/how_to/work_with_microtvm/sg_execution_times.html
+++ b/docs/how_to/work_with_microtvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-microtvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>05:32.482</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
+<p><strong>05:17.796</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,19 +336,19 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_train.html#sphx-glr-how-to-work-with-microtvm-micro-train-py"><span class="std std-ref">Training Vision Models for microTVM on Arduino</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_train.py</span></code>)</p></td>
-<td><p>04:37.645</p></td>
+<td><p>04:24.933</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="micro_autotune.html#sphx-glr-how-to-work-with-microtvm-micro-autotune-py"><span class="std std-ref">Autotuning with microTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_autotune.py</span></code>)</p></td>
-<td><p>00:43.295</p></td>
+<td><p>00:42.117</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_aot.html#sphx-glr-how-to-work-with-microtvm-micro-aot-py"><span class="std std-ref">microTVM Host-Driven AoT</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_aot.py</span></code>)</p></td>
-<td><p>00:08.173</p></td>
+<td><p>00:07.451</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="micro_tflite.html#sphx-glr-how-to-work-with-microtvm-micro-tflite-py"><span class="std std-ref">microTVM with TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_tflite.py</span></code>)</p></td>
-<td><p>00:03.367</p></td>
+<td><p>00:03.294</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_ethosu.html#sphx-glr-how-to-work-with-microtvm-micro-ethosu-py"><span class="std std-ref">Running TVM on bare metal Arm(R) Cortex(R)-M55 CPU and Ethos(TM)-U55 NPU with CMSIS-NN</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_ethosu.py</span></code>)</p></td>
diff --git a/docs/how_to/work_with_relay/sg_execution_times.html b/docs/how_to/work_with_relay/sg_execution_times.html
index 8bc8118cd7..8bf10d34e6 100644
--- a/docs/how_to/work_with_relay/sg_execution_times.html
+++ b/docs/how_to/work_with_relay/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-relay-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:43.529</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
+<p><strong>00:43.333</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,15 +336,15 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="using_pipeline_executor.html#sphx-glr-how-to-work-with-relay-using-pipeline-executor-py"><span class="std std-ref">Using Pipeline Executor in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_pipeline_executor.py</span></code>)</p></td>
-<td><p>00:32.266</p></td>
+<td><p>00:31.687</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="using_external_lib.html#sphx-glr-how-to-work-with-relay-using-external-lib-py"><span class="std std-ref">Using External Libraries in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_external_lib.py</span></code>)</p></td>
-<td><p>00:09.803</p></td>
+<td><p>00:10.182</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="build_gcn.html#sphx-glr-how-to-work-with-relay-build-gcn-py"><span class="std std-ref">Building a Graph Convolutional Network</span></a> (<code class="docutils literal notranslate"><span class="pre">build_gcn.py</span></code>)</p></td>
-<td><p>00:01.453</p></td>
+<td><p>00:01.457</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="using_relay_viz.html#sphx-glr-how-to-work-with-relay-using-relay-viz-py"><span class="std std-ref">Use Relay Visualizer to Visualize Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_relay_viz.py</span></code>)</p></td>
diff --git a/docs/how_to/work_with_schedules/intrin_math.html b/docs/how_to/work_with_schedules/intrin_math.html
index 0cc0cb1178..cb4d093509 100644
--- a/docs/how_to/work_with_schedules/intrin_math.html
+++ b/docs/how_to/work_with_schedules/intrin_math.html
@@ -522,7 +522,7 @@ The following example customizes CUDA lowering rule for <code class="code docuti
 <a href="../../reference/api/python/ir.html#tvm.ir.register_intrin_lowering" title="tvm.ir.register_intrin_lowering" class="sphx-glr-backref-module-tvm-ir sphx-glr-backref-type-py-function"><span class="n">register_intrin_lowering</span></a><span class="p">(</span><span class="s2">&quot;tir.exp&quot;</span><span class="p">,</span> <span class="n">target</span><span class="o">=</span><span class="s2">&quot;cuda&quot;</span><span class="p">,</span> <span class="n">f</span><span class="o">= [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&lt;function my_cuda_math_rule at 0x7f987865a3b0&gt;
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&lt;function my_cuda_math_rule at 0x7f6a2b623dd0&gt;
 </pre></div>
 </div>
 <p>Register the rule to TVM with override option to override existing rule.
diff --git a/docs/how_to/work_with_schedules/sg_execution_times.html b/docs/how_to/work_with_schedules/sg_execution_times.html
index 87bd6614d7..7f4f29bfac 100644
--- a/docs/how_to/work_with_schedules/sg_execution_times.html
+++ b/docs/how_to/work_with_schedules/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-schedules-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:06.376</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
+<p><strong>00:06.851</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,19 +336,19 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="intrin_math.html#sphx-glr-how-to-work-with-schedules-intrin-math-py"><span class="std std-ref">Intrinsics and Math Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">intrin_math.py</span></code>)</p></td>
-<td><p>00:04.104</p></td>
+<td><p>00:04.579</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tensorize.html#sphx-glr-how-to-work-with-schedules-tensorize-py"><span class="std std-ref">Use Tensorize to Leverage Hardware Intrinsics</span></a> (<code class="docutils literal notranslate"><span class="pre">tensorize.py</span></code>)</p></td>
-<td><p>00:00.966</p></td>
+<td><p>00:01.010</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="reduction.html#sphx-glr-how-to-work-with-schedules-reduction-py"><span class="std std-ref">Reduction</span></a> (<code class="docutils literal notranslate"><span class="pre">reduction.py</span></code>)</p></td>
-<td><p>00:00.575</p></td>
+<td><p>00:00.551</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="scan.html#sphx-glr-how-to-work-with-schedules-scan-py"><span class="std std-ref">Scan and Recurrent Kernel</span></a> (<code class="docutils literal notranslate"><span class="pre">scan.py</span></code>)</p></td>
-<td><p>00:00.546</p></td>
+<td><p>00:00.530</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="extern_op.html#sphx-glr-how-to-work-with-schedules-extern-op-py"><span class="std std-ref">External Tensor Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">extern_op.py</span></code>)</p></td>
@@ -356,15 +356,15 @@
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="schedule_primitives.html#sphx-glr-how-to-work-with-schedules-schedule-primitives-py"><span class="std std-ref">Schedule Primitives in TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">schedule_primitives.py</span></code>)</p></td>
-<td><p>00:00.046</p></td>
+<td><p>00:00.042</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tedd.html#sphx-glr-how-to-work-with-schedules-tedd-py"><span class="std std-ref">Use Tensor Expression Debug Display (TEDD) for Visualization</span></a> (<code class="docutils literal notranslate"><span class="pre">tedd.py</span></code>)</p></td>
-<td><p>00:00.027</p></td>
+<td><p>00:00.026</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tuple_inputs.html#sphx-glr-how-to-work-with-schedules-tuple-inputs-py"><span class="std std-ref">Compute and Reduce with Tuple Inputs</span></a> (<code class="docutils literal notranslate"><span class="pre">tuple_inputs.py</span></code>)</p></td>
-<td><p>00:00.014</p></td>
+<td><p>00:00.015</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/work_with_schedules/tensorize.html b/docs/how_to/work_with_schedules/tensorize.html
index 929a501f30..6f6bbb49cb 100644
--- a/docs/how_to/work_with_schedules/tensorize.html
+++ b/docs/how_to/work_with_schedules/tensorize.html
@@ -577,7 +577,7 @@ The importing needs to happen before the tensorized GEMV being executed.</p>
              C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
   buffer_map = {A_1: A, B_1: B, C_1: C}
   preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmp340mp_wk/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmp340mp_wk/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
+  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmpx5xvzfc9/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmpx5xvzfc9/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
   for (i, 0, 1024) {
     for (j.outer: int32, 0, 32) {
       @tir.call_extern(&quot;gemv_update&quot;, @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
diff --git a/docs/objects.inv b/docs/objects.inv
index 3056b6109f..99f743b49e 100644
Binary files a/docs/objects.inv and b/docs/objects.inv differ
diff --git a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html
index 583fa15a45..cf654aee52 100644
--- a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html
+++ b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html
@@ -122,46 +122,47 @@ $(function() {
   <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#aa1612f69ea5b4225d4cda759cd517323">Object</a>(Object &amp;&amp;other)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
   <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a69c32fbd96181f5c21d2c878ab285e4f">operator=</a>(const Object &amp;other)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
   <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ae341e561272ff43cdcbc927bc29ac50d">operator=</a>(Object &amp;&amp;other)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a553dc17c0b49b175cd16881c81b6c789">Parallel</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a0d492efee331e2239a093f4b2017c10f">ref_counter_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a55549a6c23987890246248682560a03d">RefCounterType</a> typedef</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a9e36a8a0e37a76e55068dd534e28c8c5">ReIndex</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a70d353bb52f6fa29fedeb90a6ff872d5">RemoveRV</a>(const BlockRV &amp;block_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a7c44d4f4ea662291ccb9d79383b6fefe">RemoveRV</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a00fcf343d2bc8f36f170c04e5e29d2dc">RemoveRV</a>(const ExprRV &amp;expr_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a059229fe0e254961da406807a97f7a3d">Reorder</a>(const Array&lt; LoopRV &gt; &amp;ordered_loop_rvs)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ad75e0424902b06dca23d46807a9a47d5">ReverseComputeAt</a>(const BlockRV &amp;block_rv, const LoopRV &amp;loop_rv, bool preserve_unit_loops, int index=-1)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a99c902d903680da14339842dd2fd29c7">ReverseComputeInline</a>(const BlockRV &amp;block)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ab185c8eac1065290d84d58e7f4617232">RFactor</a>(const LoopRV &amp;loop_rv, int factor_axis)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ad94d79729ac85aa7c976e23d39066383">RuntimeTypeIndex</a>()</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ab9d2b3a98335b88f168b69deed49eb19">SampleCategorical</a>(const Array&lt; Integer &gt; &amp;candidates, const Array&lt; FloatImm &gt; &amp;probs, Optional&lt; Integer &gt; decision=NullOpt)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#abf9fbec94271b7512c24b6eced230c39">SampleComputeLocation</a>(const BlockRV &amp;block_rv, Optional&lt; Integer &gt; decision=NullOpt)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a2c62b3f9486dd35714df50bc424d6698">SamplePerfectTile</a>(const LoopRV &amp;loop_rv, int n, int max_innermost_factor, Optional&lt; Array&lt; Integer &gt;&gt; decision=NullOpt)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#aae5808dc2e987bf17ef42196457a654d">Schedule</a> class</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">friend</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a3cb60d6112fe5a443ef39bc005c9fbf1">Seed</a>(support::LinearCongruentialEngine::TRandState seed)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a025b5eef0c2516fc1f72eed9ced88807">SetAxisSeparator</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type, const Array&lt; IntImm &gt; &amp;axis_separators)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#aa4760135d373af488a08aaeba7114c48">SetScope</a>(const BlockRV &amp;block_rv, int buffer_index, const String &amp;storage_scope)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ac190a0ab76d8754a35209479bcc6dfa2">Split</a>(const LoopRV &amp;loop_rv, const Array&lt; Optional&lt; ExprRV &gt;&gt; &amp;factors, bool preserve_unit_iters=true)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#abb3612c2598fa2d3ee0e6e3fc3de8a26">state</a>() const =0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a93d1d23f24d903db844f75f51fe09a36">StorageAlign</a>(const BlockRV &amp;block_rv, int buffer_index, int axis, int factor, int offset)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ae3794a03b566e5b1721b44c564992975">Tensorize</a>(const LoopRV &amp;loop_rv, const String &amp;intrin)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#aaca1621ab9c3db0ddd04ac57de79d37f">Tensorize</a>(const BlockRV &amp;block_rv, const String &amp;intrin)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a953bca4123b5a758adfdcd65634a5f3b">trace</a>() const =0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a998b22e37ef63a697a984c8ebcc39ca2">TransformBlockLayout</a>(const BlockRV &amp;block_rv, const IndexMap &amp;index_map)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c">TransformLayout</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type, const IndexMap &amp;index_map)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#abc4294398b140f3ff13a33f94a2f9e5f">TVM_DECLARE_FINAL_OBJECT_INFO</a>(ScheduleNode, runtime::Object)</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a481f01923b14e1851ebd38506e9c66ea">type_index</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a4bfc2586cb55f2af47728187b3256255">type_index_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">TypeIndex2Key</a>(uint32_t tindex)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6ee32a02dd44257da105fbbe5d9c8622">TypeIndex2KeyHash</a>(uint32_t tindex)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6841f97e06e6614dd7e82c6dd41b818a">TypeKey2Index</a>(const std::string &amp;key)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a45cd553c09ec836dfcbff81379647f07">Unannotate</a>(const LoopRV &amp;loop_rv, const String &amp;ann_key)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a7c310bca5d1583e61a3f27052a1dd5d0">Unannotate</a>(const BlockRV &amp;block_rv, const String &amp;ann_key)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#afd548730a6139d19fe24473ad66026d7">unique</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a84ec742f6295f59390592a6d0d90a552">Unroll</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ab4a8cd91959ceab22855ec338978bcee">Vectorize</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#acb747d074e1f99477f7132e4614221a3">WorkOn</a>(const String &amp;func_name)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ae637f126412479ed9bec05fd55376f7f">~ScheduleNode</a>()=default</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a1ac39c82aee1f8de30d5871d5923fc24">PadEinsum</a>(const BlockRV &amp;block_rv, const Array&lt; Integer &gt; &amp;padding)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a553dc17c0b49b175cd16881c81b6c789">Parallel</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a0d492efee331e2239a093f4b2017c10f">ref_counter_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a55549a6c23987890246248682560a03d">RefCounterType</a> typedef</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a9e36a8a0e37a76e55068dd534e28c8c5">ReIndex</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a70d353bb52f6fa29fedeb90a6ff872d5">RemoveRV</a>(const BlockRV &amp;block_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a7c44d4f4ea662291ccb9d79383b6fefe">RemoveRV</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a00fcf343d2bc8f36f170c04e5e29d2dc">RemoveRV</a>(const ExprRV &amp;expr_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a059229fe0e254961da406807a97f7a3d">Reorder</a>(const Array&lt; LoopRV &gt; &amp;ordered_loop_rvs)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ad75e0424902b06dca23d46807a9a47d5">ReverseComputeAt</a>(const BlockRV &amp;block_rv, const LoopRV &amp;loop_rv, bool preserve_unit_loops, int index=-1)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a99c902d903680da14339842dd2fd29c7">ReverseComputeInline</a>(const BlockRV &amp;block)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ab185c8eac1065290d84d58e7f4617232">RFactor</a>(const LoopRV &amp;loop_rv, int factor_axis)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ad94d79729ac85aa7c976e23d39066383">RuntimeTypeIndex</a>()</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">static</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ab9d2b3a98335b88f168b69deed49eb19">SampleCategorical</a>(const Array&lt; Integer &gt; &amp;candidates, const Array&lt; FloatImm &gt; &amp;probs, Optional&lt; Integer &gt; decision=NullOpt)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#abf9fbec94271b7512c24b6eced230c39">SampleComputeLocation</a>(const BlockRV &amp;block_rv, Optional&lt; Integer &gt; decision=NullOpt)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a2c62b3f9486dd35714df50bc424d6698">SamplePerfectTile</a>(const LoopRV &amp;loop_rv, int n, int max_innermost_factor, Optional&lt; Array&lt; Integer &gt;&gt; decision=NullOpt)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#aae5808dc2e987bf17ef42196457a654d">Schedule</a> class</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">friend</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a3cb60d6112fe5a443ef39bc005c9fbf1">Seed</a>(support::LinearCongruentialEngine::TRandState seed)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a025b5eef0c2516fc1f72eed9ced88807">SetAxisSeparator</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type, const Array&lt; IntImm &gt; &amp;axis_separators)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#aa4760135d373af488a08aaeba7114c48">SetScope</a>(const BlockRV &amp;block_rv, int buffer_index, const String &amp;storage_scope)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ac190a0ab76d8754a35209479bcc6dfa2">Split</a>(const LoopRV &amp;loop_rv, const Array&lt; Optional&lt; ExprRV &gt;&gt; &amp;factors, bool preserve_unit_iters=true)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#abb3612c2598fa2d3ee0e6e3fc3de8a26">state</a>() const =0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a93d1d23f24d903db844f75f51fe09a36">StorageAlign</a>(const BlockRV &amp;block_rv, int buffer_index, int axis, int factor, int offset)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ae3794a03b566e5b1721b44c564992975">Tensorize</a>(const LoopRV &amp;loop_rv, const String &amp;intrin)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#aaca1621ab9c3db0ddd04ac57de79d37f">Tensorize</a>(const BlockRV &amp;block_rv, const String &amp;intrin)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a953bca4123b5a758adfdcd65634a5f3b">trace</a>() const =0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a998b22e37ef63a697a984c8ebcc39ca2">TransformBlockLayout</a>(const BlockRV &amp;block_rv, const IndexMap &amp;index_map)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c">TransformLayout</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type, const IndexMap &amp;index_map)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#abc4294398b140f3ff13a33f94a2f9e5f">TVM_DECLARE_FINAL_OBJECT_INFO</a>(ScheduleNode, runtime::Object)</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a481f01923b14e1851ebd38506e9c66ea">type_index</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a4bfc2586cb55f2af47728187b3256255">type_index_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">TypeIndex2Key</a>(uint32_t tindex)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6ee32a02dd44257da105fbbe5d9c8622">TypeIndex2KeyHash</a>(uint32_t tindex)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6841f97e06e6614dd7e82c6dd41b818a">TypeKey2Index</a>(const std::string &amp;key)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a45cd553c09ec836dfcbff81379647f07">Unannotate</a>(const LoopRV &amp;loop_rv, const String &amp;ann_key)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a7c310bca5d1583e61a3f27052a1dd5d0">Unannotate</a>(const BlockRV &amp;block_rv, const String &amp;ann_key)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#afd548730a6139d19fe24473ad66026d7">unique</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a84ec742f6295f59390592a6d0d90a552">Unroll</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ab4a8cd91959ceab22855ec338978bcee">Vectorize</a>(const LoopRV &amp;loop_rv)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#acb747d074e1f99477f7132e4614221a3">WorkOn</a>(const String &amp;func_name)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#ae637f126412479ed9bec05fd55376f7f">~ScheduleNode</a>()=default</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">virtual</span></td></tr>
 </table></div><!-- contents -->
 <!-- start footer part -->
 <hr class="footer"/><address class="footer"><small>
diff --git a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html
index 40a2ae5954..e160ed8813 100644
--- a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html
+++ b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html
@@ -267,6 +267,9 @@ Public Member Functions</h2></td></tr>
 <tr class="memitem:af7ef928082afe7f45b417f3e130792e8"><td class="memItemLeft" align="right" valign="top">virtual <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#af7ef928082afe7f45b417f3e130792e8">DecomposePadding</a> (const <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> &amp;block_rv, const <a class="el" href="classtvm_1_1tir_1_1LoopRV.html">LoopR [...]
 <tr class="memdesc:af7ef928082afe7f45b417f3e130792e8"><td class="mdescLeft">&#160;</td><td class="mdescRight">Decompose a padding block into a block filling const pad values and a block writing in-bound values.  <a href="#af7ef928082afe7f45b417f3e130792e8">More...</a><br /></td></tr>
 <tr class="separator:af7ef928082afe7f45b417f3e130792e8"><td class="memSeparator" colspan="2">&#160;</td></tr>
+<tr class="memitem:a1ac39c82aee1f8de30d5871d5923fc24"><td class="memItemLeft" align="right" valign="top">virtual void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a1ac39c82aee1f8de30d5871d5923fc24">PadEinsum</a> (const <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> &amp;block_rv, const <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1Integer.html">Integer</ [...]
+<tr class="memdesc:a1ac39c82aee1f8de30d5871d5923fc24"><td class="mdescLeft">&#160;</td><td class="mdescRight">Pad the computation of Einsum.  <a href="#a1ac39c82aee1f8de30d5871d5923fc24">More...</a><br /></td></tr>
+<tr class="separator:a1ac39c82aee1f8de30d5871d5923fc24"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a2428fbd498ba2710a22d9ca4bc455957"><td class="memItemLeft" align="right" valign="top">virtual void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a2428fbd498ba2710a22d9ca4bc455957">EnterPostproc</a> ()=0</td></tr>
 <tr class="memdesc:a2428fbd498ba2710a22d9ca4bc455957"><td class="mdescLeft">&#160;</td><td class="mdescRight">A no-op that marks the start of postprocessing phase of scheduling.  <a href="#a2428fbd498ba2710a22d9ca4bc455957">More...</a><br /></td></tr>
 <tr class="separator:a2428fbd498ba2710a22d9ca4bc455957"><td class="memSeparator" colspan="2">&#160;</td></tr>
@@ -1607,6 +1610,54 @@ Additional Inherited Members</h2></td></tr>
 
 <p>Get the <a class="el" href="classtvm_1_1IRModule.html" title="Managed reference class to IRModuleNode. ">IRModule</a> associated with this schedule. </p>
 
+</div>
+</div>
+<a id="a1ac39c82aee1f8de30d5871d5923fc24"></a>
+<h2 class="memtitle"><span class="permalink"><a href="#a1ac39c82aee1f8de30d5871d5923fc24">&#9670;&nbsp;</a></span>PadEinsum()</h2>
+
+<div class="memitem">
+<div class="memproto">
+<table class="mlabels">
+  <tr>
+  <td class="mlabels-left">
+      <table class="memname">
+        <tr>
+          <td class="memname">virtual void tvm::tir::ScheduleNode::PadEinsum </td>
+          <td>(</td>
+          <td class="paramtype">const <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> &amp;&#160;</td>
+          <td class="paramname"><em>block_rv</em>, </td>
+        </tr>
+        <tr>
+          <td class="paramkey"></td>
+          <td></td>
+          <td class="paramtype">const <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1Integer.html">Integer</a> &gt; &amp;&#160;</td>
+          <td class="paramname"><em>padding</em>&#160;</td>
+        </tr>
+        <tr>
+          <td></td>
+          <td>)</td>
+          <td></td><td></td>
+        </tr>
+      </table>
+  </td>
+  <td class="mlabels-right">
+<span class="mlabels"><span class="mlabel">pure virtual</span></span>  </td>
+  </tr>
+</table>
+</div><div class="memdoc">
+
+<p>Pad the computation of Einsum. </p>
+<dl class="params"><dt>Parameters</dt><dd>
+  <table class="params">
+    <tr><td class="paramname">block_rv</td><td>The block that matches the Einsum pattern. </td></tr>
+    <tr><td class="paramname">padding</td><td>The padding for each block iter.</td></tr>
+  </table>
+  </dd>
+</dl>
+<p>This schedule primitive identifies the Einsum pattern in the block body and finds its producer blocks. It then pads the computation of the Einsum pattern and its producer blocks. The output buffer and the producer buffers are resized according to the padding size. It requires the output buffer and the producer buffers to be allocated inside the <a class="el" href="classtvm_1_1tir_1_1PrimFunc.html" title="Managed reference to PrimFuncNode. ">PrimFunc</a>.</p>
+<p>The padding is a list of non-negative integers, each of which corresponds to the padding of one block iter, in the order of the block iters. The block and its producer blocks should have trivial bindings, i.e. each block iter is bound to a single loop variable. After padding, the block iter extent and the corresponding outer loop are extended by the padding size.</p>
+<p>The sizes of the producer buffers are inferred from the padding size of the Einsum computation. The producer buffers are padded with the initial value of the corresponding reduction. </p>
+
 </div>
 </div>
 <a id="a553dc17c0b49b175cd16881c81b6c789"></a>
diff --git a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__coll__graph.svg b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__coll__graph.svg
index ea9ac70d4b..c02237bbeb 100644
--- a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__coll__graph.svg
+++ b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__coll__graph.svg
@@ -27,7 +27,7 @@
 <text text-anchor="start" x="8" y="-40.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Seed()</text>
 <text text-anchor="start" x="8" y="-29.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ForkSeed()</text>
 <text text-anchor="start" x="8" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Get()</text>
-<text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">and 51 more...</text>
+<text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">and 52 more...</text>
 </g>
 <!-- Node3 -->
 <g id="node2" class="node">
diff --git a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__inherit__graph.svg b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__inherit__graph.svg
index 8a48a0c4cb..09a02730da 100644
--- a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__inherit__graph.svg
+++ b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode__inherit__graph.svg
@@ -27,7 +27,7 @@
 <text text-anchor="start" x="8" y="-40.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Seed()</text>
 <text text-anchor="start" x="8" y="-29.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ForkSeed()</text>
 <text text-anchor="start" x="8" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Get()</text>
-<text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">and 51 more...</text>
+<text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">and 52 more...</text>
 </g>
 <!-- Node1 -->
 <g id="node2" class="node">
diff --git a/docs/reference/api/doxygen/database_8h_source.html b/docs/reference/api/doxygen/database_8h_source.html
index 8bbf0704e8..f78c67d542 100644
--- a/docs/reference/api/doxygen/database_8h_source.html
+++ b/docs/reference/api/doxygen/database_8h_source.html
@@ -83,7 +83,7 @@ $(function() {
 <div class="ttc" id="structtvm_1_1meta__schedule_1_1WorkloadHash_html_a7cb09ddc6c76d9d00ddbeab8502d97cb"><div class="ttname"><a href="structtvm_1_1meta__schedule_1_1WorkloadHash.html#a7cb09ddc6c76d9d00ddbeab8502d97cb">tvm::meta_schedule::WorkloadHash::operator()</a></div><div class="ttdeci">size_t operator()(const Workload &amp;a) const</div><div class="ttdef"><b>Definition:</b> database.h:91</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyDatabaseNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyDatabaseNode.html">tvm::meta_schedule::PyDatabaseNode</a></div><div class="ttdoc">The database with customized methods on the python-side. </div><div class="ttdef"><b>Definition:</b> database.h:239</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:659</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
 <div class="ttc" id="classtvm_1_1StructuralEqual_html"><div class="ttname"><a href="classtvm_1_1StructuralEqual.html">tvm::StructuralEqual</a></div><div class="ttdoc">Content-aware structural equality comparator for objects. </div><div class="ttdef"><b>Definition:</b> structural_equal.h:103</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1TuningRecordNode_html_a8cc2d64f796593a1a774eef259f17b29"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1TuningRecordNode.html#a8cc2d64f796593a1a774eef259f17b29">tvm::meta_schedule::TuningRecordNode::trace</a></div><div class="ttdeci">tir::Trace trace</div><div class="ttdoc">The trace tuned. </div><div class="ttdef"><b>Definition:</b> database.h:108</div></div>
 <div class="ttc" id="arg__info_8h_html"><div class="ttname"><a href="arg__info_8h.html">arg_info.h</a></div></div>
diff --git a/docs/reference/api/doxygen/functions_func_m.html b/docs/reference/api/doxygen/functions_func_m.html
index 9fb918fa70..3e78707725 100644
--- a/docs/reference/api/doxygen/functions_func_m.html
+++ b/docs/reference/api/doxygen/functions_func_m.html
@@ -188,7 +188,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1arith_1_1ModularSet.html#a9f54896d98169246c6a24cc338fde500">tvm::arith::ModularSet</a>
 </li>
 <li>Module()
-: <a class="el" href="classtvm_1_1runtime_1_1Module.html#abd1380b3f813c2b6acefca3aaef425f4">tvm::runtime::Module</a>
+: <a class="el" href="classtvm_1_1runtime_1_1Module.html#abfbc619b3b3166d63ec52e399c24bed9">tvm::runtime::Module</a>
 </li>
 <li>Move()
 : <a class="el" href="structtvm_1_1runtime_1_1vm_1_1Instruction.html#a162dc8d73dc2306f066c3ee013ff096f">tvm::runtime::vm::Instruction</a>
diff --git a/docs/reference/api/doxygen/functions_func_p.html b/docs/reference/api/doxygen/functions_func_p.html
index ec62fe0871..d76dea1992 100644
--- a/docs/reference/api/doxygen/functions_func_p.html
+++ b/docs/reference/api/doxygen/functions_func_p.html
@@ -76,6 +76,9 @@ $(function() {
 <li>PacketDone()
 : <a class="el" href="classtvm_1_1runtime_1_1micro__rpc_1_1WriteStream.html#a1745b7d9d5a0e094e129eb7a4c363ac9">tvm::runtime::micro_rpc::WriteStream</a>
 </li>
+<li>PadEinsum()
+: <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a1ac39c82aee1f8de30d5871d5923fc24">tvm::tir::ScheduleNode</a>
+</li>
 <li>parallel()
 : <a class="el" href="classtvm_1_1auto__scheduler_1_1State.html#a2376f0180bc5b5dd4b456f2a75d4a366">tvm::auto_scheduler::State</a>
 , <a class="el" href="classtvm_1_1te_1_1Stage.html#a60a6be10a1a96cb594c1399efabafef3">tvm::te::Stage</a>
diff --git a/docs/reference/api/doxygen/functions_func_u.html b/docs/reference/api/doxygen/functions_func_u.html
index 4b4e0f203d..611cae9ff1 100644
--- a/docs/reference/api/doxygen/functions_func_u.html
+++ b/docs/reference/api/doxygen/functions_func_u.html
@@ -106,7 +106,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1CostModelNode.html#ae35b2b678760b8da57a43d3ae9c24da5">tvm::auto_scheduler::CostModelNode</a>
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1PythonBasedModelNode.html#a2d7849df6c7dbe93bf363c1d9f860a26">tvm::auto_scheduler::PythonBasedModelNode</a>
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1RandomModelNode.html#a7febac6c05d8e2d407f466467769ee32">tvm::auto_scheduler::RandomModelNode</a>
-, <a class="el" href="classtvm_1_1IRModuleNode.html#a94a93385e64ce844299729af6a573015">tvm::IRModuleNode</a>
+, <a class="el" href="classtvm_1_1IRModuleNode.html#abdd8936c6fca33ef9b7c086f8fd58f84">tvm::IRModuleNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1CostModelNode.html#a1bba32eba84db583fe90d1a5bce085f1">tvm::meta_schedule::CostModelNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1PyCostModelNode.html#a970b00b0eb1bf6b88eea2711b58c4d1d">tvm::meta_schedule::PyCostModelNode</a>
 </li>
diff --git a/docs/reference/api/doxygen/functions_p.html b/docs/reference/api/doxygen/functions_p.html
index 8a3df637fb..f59bfa6484 100644
--- a/docs/reference/api/doxygen/functions_p.html
+++ b/docs/reference/api/doxygen/functions_p.html
@@ -134,6 +134,9 @@ $(function() {
 <li>paddings
 : <a class="el" href="structtvm_1_1relay_1_1SpaceToBatchNDAttrs.html#aabc579d65229d49279a1c3a903a99095">tvm::relay::SpaceToBatchNDAttrs</a>
 </li>
+<li>PadEinsum()
+: <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a1ac39c82aee1f8de30d5871d5923fc24">tvm::tir::ScheduleNode</a>
+</li>
 <li>parallel()
 : <a class="el" href="classtvm_1_1auto__scheduler_1_1State.html#a2376f0180bc5b5dd4b456f2a75d4a366">tvm::auto_scheduler::State</a>
 , <a class="el" href="classtvm_1_1te_1_1Stage.html#a60a6be10a1a96cb594c1399efabafef3">tvm::te::Stage</a>
@@ -407,7 +410,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1meta__schedule_1_1SearchStrategyNode.html#ad88e1545e88dc1934c25f4b417077aff">tvm::meta_schedule::SearchStrategyNode</a>
 </li>
 <li>PrimExpr()
-: <a class="el" href="classtvm_1_1PrimExpr.html#a756d3f8b17b019560946524951ae6118">tvm::PrimExpr</a>
+: <a class="el" href="classtvm_1_1PrimExpr.html#a7f0ca30e951608a0b36a77a66d4d19e0">tvm::PrimExpr</a>
 </li>
 <li>PrimFunc()
 : <a class="el" href="classtvm_1_1tir_1_1PrimFunc.html#ab01a529fafaf9fabdfca170605f7b0f8">tvm::tir::PrimFunc</a>
diff --git a/docs/reference/api/doxygen/functions_s.html b/docs/reference/api/doxygen/functions_s.html
index c73f166528..97d8526b7e 100644
--- a/docs/reference/api/doxygen/functions_s.html
+++ b/docs/reference/api/doxygen/functions_s.html
@@ -1054,7 +1054,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a93d1d23f24d903db844f75f51fe09a36">tvm::tir::ScheduleNode</a>
 </li>
 <li>StorageAlignStep()
-: <a class="el" href="classtvm_1_1auto__scheduler_1_1StorageAlignStep.html#a99dbb8c55d9e7d78268b6d43fd348bc7">tvm::auto_scheduler::StorageAlignStep</a>
+: <a class="el" href="classtvm_1_1auto__scheduler_1_1StorageAlignStep.html#af50b7c2f020f8e0a80f5bcc8e559b394">tvm::auto_scheduler::StorageAlignStep</a>
 </li>
 <li>StorageType
 : <a class="el" href="classtvm_1_1runtime_1_1SimpleObjAllocator_1_1ArrayHandler.html#a67e86db3290b1d3bd4aca7e7a2faf187">tvm::runtime::SimpleObjAllocator::ArrayHandler&lt; ArrayType, ElemType &gt;</a>
diff --git a/docs/reference/api/doxygen/functions_t.html b/docs/reference/api/doxygen/functions_t.html
index ae021b82bc..7b708e552c 100644
--- a/docs/reference/api/doxygen/functions_t.html
+++ b/docs/reference/api/doxygen/functions_t.html
@@ -81,7 +81,7 @@ $(function() {
 , <a class="el" href="structtvm_1_1runtime_1_1vm_1_1Instruction.html#a46879dbe84105fb621a6167f8d73b223">tvm::runtime::vm::Instruction</a>
 </li>
 <li>Target()
-: <a class="el" href="classtvm_1_1Target.html#a58a5a1e042e265fe5a6973045226fe1a">tvm::Target</a>
+: <a class="el" href="classtvm_1_1Target.html#a77f3d7cc97d8cfd7172af58b4e784d89">tvm::Target</a>
 </li>
 <li>target
 : <a class="el" href="classtvm_1_1VirtualDeviceNode.html#a8b2d427d9e21886ccaeaae5e9cc55aaf">tvm::VirtualDeviceNode</a>
@@ -1444,7 +1444,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1TypedEnvFunc_3_01R_07Args_8_8_8_08_4.html#a41a6b9014d0feeb628ca7edfd0d26f0b">tvm::TypedEnvFunc&lt; R(Args...)&gt;</a>
 </li>
 <li>TypedPackedFunc()
-: <a class="el" href="classtvm_1_1runtime_1_1TypedPackedFunc_3_01R_07Args_8_8_8_08_4.html#a8941c80982a1b2a289440f3c79bb0ac8">tvm::runtime::TypedPackedFunc&lt; R(Args...)&gt;</a>
+: <a class="el" href="classtvm_1_1runtime_1_1TypedPackedFunc_3_01R_07Args_8_8_8_08_4.html#a36ca0d1876544463ee848766e70e5e96">tvm::runtime::TypedPackedFunc&lt; R(Args...)&gt;</a>
 </li>
 <li>TypeIndex2Key()
 : <a class="el" href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">tvm::runtime::Object</a>
@@ -1467,7 +1467,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1TypeRelation.html#ac26b1897eab8197ed26606ab81b7403b">tvm::TypeRelation</a>
 </li>
 <li>TypeReporter()
-: <a class="el" href="classtvm_1_1TypeReporter.html#aa3dc38a3c84d324d0b3a9f358460a091">tvm::TypeReporter</a>
+: <a class="el" href="classtvm_1_1TypeReporter.html#a8e7e05a07f9f7ad9bea91f27afac9051">tvm::TypeReporter</a>
 </li>
 <li>types
 : <a class="el" href="classtvm_1_1TupleAffineTypeNode.html#a30c834b7e1cb64467e6587ac16ebb187">tvm::TupleAffineTypeNode</a>
diff --git a/docs/reference/api/doxygen/functions_u.html b/docs/reference/api/doxygen/functions_u.html
index 9051d7808e..aee008c4c1 100644
--- a/docs/reference/api/doxygen/functions_u.html
+++ b/docs/reference/api/doxygen/functions_u.html
@@ -122,7 +122,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1CostModelNode.html#ae35b2b678760b8da57a43d3ae9c24da5">tvm::auto_scheduler::CostModelNode</a>
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1PythonBasedModelNode.html#a2d7849df6c7dbe93bf363c1d9f860a26">tvm::auto_scheduler::PythonBasedModelNode</a>
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1RandomModelNode.html#a7febac6c05d8e2d407f466467769ee32">tvm::auto_scheduler::RandomModelNode</a>
-, <a class="el" href="classtvm_1_1IRModuleNode.html#a94a93385e64ce844299729af6a573015">tvm::IRModuleNode</a>
+, <a class="el" href="classtvm_1_1IRModuleNode.html#abdd8936c6fca33ef9b7c086f8fd58f84">tvm::IRModuleNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1CostModelNode.html#a1bba32eba84db583fe90d1a5bce085f1">tvm::meta_schedule::CostModelNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1PyCostModelNode.html#a970b00b0eb1bf6b88eea2711b58c4d1d">tvm::meta_schedule::PyCostModelNode</a>
 </li>
diff --git a/docs/reference/api/doxygen/measure__candidate_8h_source.html b/docs/reference/api/doxygen/measure__candidate_8h_source.html
index 357ecc46c9..f810a31c46 100644
--- a/docs/reference/api/doxygen/measure__candidate_8h_source.html
+++ b/docs/reference/api/doxygen/measure__candidate_8h_source.html
@@ -71,7 +71,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1MeasureCandidateNode_html_a99858dbe74082cc52938ac942523d792"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html#a99858dbe74082cc52938ac942523d792">tvm::meta_schedule::MeasureCandidateNode::VisitAttrs</a></div><div class="ttdeci">void VisitAttrs(tvm::AttrVisitor *v)</div><div class="ttdef"><b>Definition:</b> measure_candidate.h:40</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1MeasureCandidateNode_html_a6891e92cac8712bb690401ed121ae7e8"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html#a6891e92cac8712bb690401ed121ae7e8">tvm::meta_schedule::MeasureCandidateNode::args_info</a></div><div class="ttdeci">Array&lt; ArgInfo &gt; args_info</div><div class="ttdoc">The argument information, e.g., (shape, dtype) for tensors. </div><div class="ttdef"><b>Definition:</b> measure_candidate. [...]
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:659</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
 <div class="ttc" id="arg__info_8h_html"><div class="ttname"><a href="arg__info_8h.html">arg_info.h</a></div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1MeasureCandidateNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html">tvm::meta_schedule::MeasureCandidateNode</a></div><div class="ttdoc">The schedule (with input shapes) to be measured. </div><div class="ttdef"><b>Definition:</b> measure_candidate.h:33</div></div>
 <div class="ttc" id="array_8h_html"><div class="ttname"><a href="array_8h.html">array.h</a></div><div class="ttdoc">Runtime Array container types. </div></div>
diff --git a/docs/reference/api/doxygen/postproc_8h_source.html b/docs/reference/api/doxygen/postproc_8h_source.html
index 8bab5738b2..51438415a5 100644
--- a/docs/reference/api/doxygen/postproc_8h_source.html
+++ b/docs/reference/api/doxygen/postproc_8h_source.html
@@ -73,7 +73,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyPostprocNode_html_a3771e585727ef6dfecc502ffe57fd2a2"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyPostprocNode.html#a3771e585727ef6dfecc502ffe57fd2a2">tvm::meta_schedule::PyPostprocNode::f_apply</a></div><div class="ttdeci">FApply f_apply</div><div class="ttdoc">The packed function to the Apply function. </div><div class="ttdef"><b>Definition:</b> postproc.h:84</div></div>
 <div class="ttc" id="object_8h_html_aaaa3dc5b6dc33f84b2d28f9a81267212"><div class="ttname"><a href="object_8h.html#aaaa3dc5b6dc33f84b2d28f9a81267212">TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS</a></div><div class="ttdeci">#define TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(TypeName, ParentType, ObjectName)</div><div class="ttdef"><b>Definition:</b> object.h:744</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:659</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1TuneContext_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1TuneContext.html">tvm::meta_schedule::TuneContext</a></div><div class="ttdoc">Managed reference to TuneContextNode. </div><div class="ttdef"><b>Definition:</b> tune_context.h:129</div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PostprocNode_html_af7bfe77672b2305982132990781515b4"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PostprocNode.html#af7bfe77672b2305982132990781515b4">tvm::meta_schedule::PostprocNode::_type_key</a></div><div class="ttdeci">static constexpr const char * _type_key</div><div class="ttdef"><b>Definition:</b> postproc.h:57</div></div>
diff --git a/docs/reference/api/doxygen/schedule__rule_8h_source.html b/docs/reference/api/doxygen/schedule__rule_8h_source.html
index aab2c88707..fe2334423b 100644
--- a/docs/reference/api/doxygen/schedule__rule_8h_source.html
+++ b/docs/reference/api/doxygen/schedule__rule_8h_source.html
@@ -78,7 +78,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="object_8h_html_aaaa3dc5b6dc33f84b2d28f9a81267212"><div class="ttname"><a href="object_8h.html#aaaa3dc5b6dc33f84b2d28f9a81267212">TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS</a></div><div class="ttdeci">#define TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(TypeName, ParentType, ObjectName)</div><div class="ttdef"><b>Definition:</b> object.h:744</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyScheduleRuleNode_html_a752192bcb5385b1ba72b7c1856c6f360"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyScheduleRuleNode.html#a752192bcb5385b1ba72b7c1856c6f360">tvm::meta_schedule::PyScheduleRuleNode::f_apply</a></div><div class="ttdeci">FApply f_apply</div><div class="ttdoc">The packed function to the Apply function. </div><div class="ttdef"><b>Definition:</b> schedule_rule.h:91</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:659</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1TuneContext_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1TuneContext.html">tvm::meta_schedule::TuneContext</a></div><div class="ttdoc">Managed reference to TuneContextNode. </div><div class="ttdef"><b>Definition:</b> tune_context.h:129</div></div>
 <div class="ttc" id="array_8h_html"><div class="ttname"><a href="array_8h.html">array.h</a></div><div class="ttdoc">Runtime Array container types. </div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
diff --git a/docs/reference/api/doxygen/search/all_11.js b/docs/reference/api/doxygen/search/all_11.js
index cff51a4e2c..e51a8cd242 100644
--- a/docs/reference/api/doxygen/search/all_11.js
+++ b/docs/reference/api/doxygen/search/all_11.js
@@ -31,10 +31,11 @@ var searchData=
   ['padding_5fmode',['padding_mode',['../structtvm_1_1relay_1_1GridSampleAttrs.html#aab46e9c8c1a6caa0e05605e930570682',1,'tvm::relay::GridSampleAttrs']]],
   ['padding_5fpredicate',['padding_predicate',['../classtvm_1_1arith_1_1IterMapResultNode.html#af982bb1cf020e53e2f7521ee1cf97c2a',1,'tvm::arith::IterMapResultNode']]],
   ['paddings',['paddings',['../structtvm_1_1relay_1_1SpaceToBatchNDAttrs.html#aabc579d65229d49279a1c3a903a99095',1,'tvm::relay::SpaceToBatchNDAttrs']]],
+  ['padeinsum',['PadEinsum',['../classtvm_1_1tir_1_1ScheduleNode.html#a1ac39c82aee1f8de30d5871d5923fc24',1,'tvm::tir::ScheduleNode']]],
   ['page_5fallocator_2eh',['page_allocator.h',['../page__allocator_8h.html',1,'']]],
   ['pagememorymanagercreate',['PageMemoryManagerCreate',['../page__allocator_8h.html#a720dbc7474ac13b93fafb974cfc20bc7',1,'page_allocator.h']]],
   ['papi_2eh',['papi.h',['../papi_8h.html',1,'']]],
-  ['parallel',['Parallel',['../classtvm_1_1tir_1_1ScheduleNode.html#a553dc17c0b49b175cd16881c81b6c789',1,'tvm::tir::ScheduleNode::Parallel()'],['../classtvm_1_1auto__scheduler_1_1State.html#a2376f0180bc5b5dd4b456f2a75d4a366',1,'tvm::auto_scheduler::State::parallel()'],['../classtvm_1_1te_1_1Stage.html#a60a6be10a1a96cb594c1399efabafef3',1,'tvm::te::Stage::parallel()']]],
+  ['parallel',['parallel',['../classtvm_1_1auto__scheduler_1_1State.html#a2376f0180bc5b5dd4b456f2a75d4a366',1,'tvm::auto_scheduler::State::parallel()'],['../classtvm_1_1te_1_1Stage.html#a60a6be10a1a96cb594c1399efabafef3',1,'tvm::te::Stage::parallel()'],['../classtvm_1_1tir_1_1ScheduleNode.html#a553dc17c0b49b175cd16881c81b6c789',1,'tvm::tir::ScheduleNode::Parallel()']]],
   ['parallel_5ffor',['parallel_for',['../namespacetvm_1_1support.html#a8bf1225e8bb1db575578ca2d645fb23c',1,'tvm::support']]],
   ['parallel_5ffor_2eh',['parallel_for.h',['../parallel__for_8h.html',1,'']]],
   ['parallel_5ffor_5fdynamic',['parallel_for_dynamic',['../namespacetvm_1_1support.html#afe4271363c794f1644ce7af5c2266530',1,'tvm::support']]],
diff --git a/docs/reference/api/doxygen/search/all_13.js b/docs/reference/api/doxygen/search/all_13.js
index c9d096f145..12e1d46adf 100644
--- a/docs/reference/api/doxygen/search/all_13.js
+++ b/docs/reference/api/doxygen/search/all_13.js
@@ -12,7 +12,7 @@ var searchData=
   ['randomcomputelocation',['RandomComputeLocation',['../classtvm_1_1meta__schedule_1_1ScheduleRule.html#a1bf485537817533eaf711226f687778c',1,'tvm::meta_schedule::ScheduleRule']]],
   ['randommodel',['RandomModel',['../classtvm_1_1auto__scheduler_1_1RandomModel.html',1,'tvm::auto_scheduler::RandomModel'],['../classtvm_1_1auto__scheduler_1_1RandomModel.html#aa456abf1dc91cbf76935189424d8954f',1,'tvm::auto_scheduler::RandomModel::RandomModel()'],['../classtvm_1_1auto__scheduler_1_1RandomModel.html#ac2b355e61135f2ff57d4f96fe2fba845',1,'tvm::auto_scheduler::RandomModel::RandomModel(::tvm::runtime::ObjectPtr&lt;::tvm::runtime::Object &gt; n)']]],
   ['randommodelnode',['RandomModelNode',['../classtvm_1_1auto__scheduler_1_1RandomModelNode.html',1,'tvm::auto_scheduler']]],
-  ['range',['Range',['../classtvm_1_1Range.html',1,'tvm::Range'],['../classtvm_1_1Range.html#a9d58cccc53897fee0c80ab1437da1f0f',1,'tvm::Range::Range()'],['../classtvm_1_1auto__scheduler_1_1IteratorNode.html#a2751c3164971b3154ffc506e3aebaf91',1,'tvm::auto_scheduler::IteratorNode::range()']]],
+  ['range',['Range',['../classtvm_1_1Range.html',1,'tvm::Range'],['../classtvm_1_1auto__scheduler_1_1IteratorNode.html#a2751c3164971b3154ffc506e3aebaf91',1,'tvm::auto_scheduler::IteratorNode::range()'],['../classtvm_1_1Range.html#a9d58cccc53897fee0c80ab1437da1f0f',1,'tvm::Range::Range()']]],
   ['rangenode',['RangeNode',['../classtvm_1_1RangeNode.html',1,'tvm::RangeNode'],['../classtvm_1_1RangeNode.html#ab845f7ed4ed85e360b730df3450d1aab',1,'tvm::RangeNode::RangeNode()'],['../classtvm_1_1RangeNode.html#a4bbc33969cb484c20306da1d2b9fa1fd',1,'tvm::RangeNode::RangeNode(PrimExpr min, PrimExpr extent, Span span=Span())']]],
   ['ranges',['ranges',['../classtvm_1_1arith_1_1IntConstraintsNode.html#ab23d4d806766c88b0df69dbfb5ebd63c',1,'tvm::arith::IntConstraintsNode']]],
   ['rate',['rate',['../structtvm_1_1relay_1_1DropoutAttrs.html#a0b5a52c24a1be53dbb122a1df9fe22af',1,'tvm::relay::DropoutAttrs']]],
@@ -82,7 +82,7 @@ var searchData=
   ['registerconfigoption',['RegisterConfigOption',['../classtvm_1_1transform_1_1PassContext.html#a6f1d1040cc97320414b4690203f87919',1,'tvm::transform::PassContext']]],
   ['registergenericfunc',['RegisterGenericFunc',['../classtvm_1_1GenericFunc.html#a909acecbf2f34f847a34e587a4570dce',1,'tvm::GenericFunc']]],
   ['registerorget',['RegisterOrGet',['../classtvm_1_1OpRegEntry.html#a39a4d3e7f905eb4e29ca464bcedb05bd',1,'tvm::OpRegEntry::RegisterOrGet()'],['../classtvm_1_1relay_1_1ExecutorRegEntry.html#a03347a2b68269b853a7c0399994951ef',1,'tvm::relay::ExecutorRegEntry::RegisterOrGet()'],['../classtvm_1_1relay_1_1RuntimeRegEntry.html#ae8b479159ccd8b35b75950fcda58dd9d',1,'tvm::relay::RuntimeRegEntry::RegisterOrGet()'],['../classtvm_1_1TargetTagRegEntry.html#a07e0631600484dc0985ca62b1620461c',1,'tvm::T [...]
-  ['registry',['Registry',['../classtvm_1_1ReflectionVTable_1_1Registry.html',1,'tvm::ReflectionVTable::Registry'],['../classtvm_1_1runtime_1_1Registry.html',1,'tvm::runtime::Registry'],['../structTVMMutableFuncRegistry.html#acc1fcd6554c627c1bf3b3c00e1120e9b',1,'TVMMutableFuncRegistry::registry()'],['../structTVMModule.html#a6db21005b9e983207b341e65af4c4ab7',1,'TVMModule::registry()'],['../classtvm_1_1ReflectionVTable_1_1Registry.html#ac8f4637640aa9dffed745303a4cfa827',1,'tvm::Reflection [...]
+  ['registry',['Registry',['../classtvm_1_1ReflectionVTable_1_1Registry.html',1,'tvm::ReflectionVTable::Registry'],['../classtvm_1_1runtime_1_1Registry.html',1,'tvm::runtime::Registry'],['../classtvm_1_1ReflectionVTable_1_1Registry.html#ac8f4637640aa9dffed745303a4cfa827',1,'tvm::ReflectionVTable::Registry::Registry()'],['../structTVMMutableFuncRegistry.html#acc1fcd6554c627c1bf3b3c00e1120e9b',1,'TVMMutableFuncRegistry::registry()'],['../structTVMModule.html#a6db21005b9e983207b341e65af4c4a [...]
   ['registry_2eh',['registry.h',['../registry_8h.html',1,'']]],
   ['regname',['RegName',['../namespacetvm_1_1runtime_1_1vm.html#a3bbbf700719e9dc3dda2bc25210c18ae',1,'tvm::runtime::vm']]],
   ['reindex',['ReIndex',['../classtvm_1_1tir_1_1ScheduleNode.html#a9e36a8a0e37a76e55068dd534e28c8c5',1,'tvm::tir::ScheduleNode']]],
@@ -160,7 +160,7 @@ var searchData=
   ['resize2dattrs',['Resize2DAttrs',['../structtvm_1_1relay_1_1Resize2DAttrs.html',1,'tvm::relay']]],
   ['resize3dattrs',['Resize3DAttrs',['../structtvm_1_1relay_1_1Resize3DAttrs.html',1,'tvm::relay']]],
   ['resolvedependency',['ResolveDependency',['../classtvm_1_1transform_1_1SequentialNode.html#a5549edf77e0a64bd6fcb692603967b8e',1,'tvm::transform::SequentialNode']]],
-  ['result',['Result',['../classtvm_1_1meta__schedule_1_1RunnerFutureNode.html#a1b5438c21c436ce7a864487583fd32b2',1,'tvm::meta_schedule::RunnerFutureNode::Result()'],['../structtvm_1_1runtime_1_1vm_1_1Instruction.html#ae0d33229af059c727db2abd3616660e0',1,'tvm::runtime::vm::Instruction::result()'],['../classtvm_1_1script_1_1ir__builder_1_1IRBuilderNode.html#ae9bab07b47a5fd7f27576cbcfddab953',1,'tvm::script::ir_builder::IRBuilderNode::result()'],['../classtvm_1_1tir_1_1CommReducerNode.html [...]
+  ['result',['result',['../structtvm_1_1runtime_1_1vm_1_1Instruction.html#ae0d33229af059c727db2abd3616660e0',1,'tvm::runtime::vm::Instruction::result()'],['../classtvm_1_1script_1_1ir__builder_1_1IRBuilderNode.html#ae9bab07b47a5fd7f27576cbcfddab953',1,'tvm::script::ir_builder::IRBuilderNode::result()'],['../classtvm_1_1tir_1_1CommReducerNode.html#a7030917568a088215da423fc56882814',1,'tvm::tir::CommReducerNode::result()'],['../classtvm_1_1meta__schedule_1_1RunnerFutureNode.html#a1b5438c21 [...]
   ['result_5f',['result_',['../classtvm_1_1detail_1_1AttrsSEqualVisitor.html#aeda3a91f0b2d1a7a9a075828954ff77f',1,'tvm::detail::AttrsSEqualVisitor']]],
   ['result_5ftype',['result_type',['../classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html#a24d4a3522ee6c4cdeed80dcdcc1424ad',1,'tvm::TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;::result_type()'],['../classtvm_1_1NodeFunctor_3_01R_07const_01ObjectRef_01_6n_00_01Args_8_8_8_08_4.html#ac7f687cb7dda02407b578a6683fa708a',1,'tvm::NodeFunctor&lt; R(const ObjectRef &amp;n, Args...)&gt;::result_type()'],['../classtvm_1_1relay_1_1ExprFunctor_3_01R_07const_01Expr_01_6n [...]
   ['resulttype',['ResultType',['../structtvm_1_1runtime_1_1Array_1_1ValueConverter.html#a0db77cfd8032391d76dffc88eae8e09b',1,'tvm::runtime::Array::ValueConverter']]],
@@ -195,7 +195,7 @@ var searchData=
   ['rewritetensorize',['RewriteTensorize',['../classtvm_1_1meta__schedule_1_1Postproc.html#a95db036cfced4c2575367a26a41498ff',1,'tvm::meta_schedule::Postproc']]],
   ['rewriteunboundblock',['RewriteUnboundBlock',['../classtvm_1_1meta__schedule_1_1Postproc.html#a1836b2278bc24fdc227c490896d92980',1,'tvm::meta_schedule::Postproc']]],
   ['rewriteunsafeselect',['RewriteUnsafeSelect',['../namespacetvm_1_1tir_1_1transform.html#a4fe43327c4454dd05b6e925577443f49',1,'tvm::tir::transform']]],
-  ['rfactor',['rfactor',['../classtvm_1_1auto__scheduler_1_1State.html#a21c27b06d439267f8b981fa05c5f48a0',1,'tvm::auto_scheduler::State::rfactor()'],['../classtvm_1_1te_1_1Schedule.html#a34ae85add41bbed0140726d024d08862',1,'tvm::te::Schedule::rfactor()'],['../classtvm_1_1tir_1_1ScheduleNode.html#ab185c8eac1065290d84d58e7f4617232',1,'tvm::tir::ScheduleNode::RFactor()']]],
+  ['rfactor',['RFactor',['../classtvm_1_1tir_1_1ScheduleNode.html#ab185c8eac1065290d84d58e7f4617232',1,'tvm::tir::ScheduleNode::RFactor()'],['../classtvm_1_1auto__scheduler_1_1State.html#a21c27b06d439267f8b981fa05c5f48a0',1,'tvm::auto_scheduler::State::rfactor()'],['../classtvm_1_1te_1_1Schedule.html#a34ae85add41bbed0140726d024d08862',1,'tvm::te::Schedule::rfactor()']]],
   ['rfactorstep',['RfactorStep',['../classtvm_1_1auto__scheduler_1_1RfactorStep.html',1,'tvm::auto_scheduler::RfactorStep'],['../classtvm_1_1auto__scheduler_1_1RfactorStep.html#a26e6f85b55307f18fab4469e3bd4be0c',1,'tvm::auto_scheduler::RfactorStep::RfactorStep(int stage_id, int iter_id, int factor_iter_id)'],['../classtvm_1_1auto__scheduler_1_1RfactorStep.html#a95575c21441177634178245ab562cb4f',1,'tvm::auto_scheduler::RfactorStep::RfactorStep(dmlc::JSONReader *reader)']]],
   ['rfactorstepnode',['RfactorStepNode',['../classtvm_1_1auto__scheduler_1_1RfactorStepNode.html',1,'tvm::auto_scheduler']]],
   ['rhs',['rhs',['../classtvm_1_1relay_1_1ClauseNode.html#a93217eeea15c1f7c1a659da3da86d3bd',1,'tvm::relay::ClauseNode::rhs()'],['../classtvm_1_1script_1_1printer_1_1AssignDocNode.html#a436fcace00d445213fc367ece59c4067',1,'tvm::script::printer::AssignDocNode::rhs()'],['../classtvm_1_1script_1_1printer_1_1ForDocNode.html#aa72614136675287310ea08520f596642',1,'tvm::script::printer::ForDocNode::rhs()'],['../classtvm_1_1script_1_1printer_1_1ScopeDocNode.html#abf3636ac2820118a3d48f2fea32b2b0b' [...]
diff --git a/docs/reference/api/doxygen/search/all_14.js b/docs/reference/api/doxygen/search/all_14.js
index 98ceaf3ef9..8a2bdfdb19 100644
--- a/docs/reference/api/doxygen/search/all_14.js
+++ b/docs/reference/api/doxygen/search/all_14.js
@@ -250,7 +250,7 @@ var searchData=
   ['solvelinearequations',['SolveLinearEquations',['../namespacetvm_1_1arith.html#ae0290f04432523ab8e5f76edde80071a',1,'tvm::arith']]],
   ['solvelinearinequalities',['SolveLinearInequalities',['../namespacetvm_1_1arith.html#ac59d63560e04431f108e81457b212fdc',1,'tvm::arith']]],
   ['sorted',['sorted',['../structtvm_1_1relay_1_1UniqueAttrs.html#aef434799646533ec9d796393ba01db44',1,'tvm::relay::UniqueAttrs']]],
-  ['source',['Source',['../classtvm_1_1parser_1_1Source.html',1,'tvm::parser::Source'],['../classtvm_1_1parser_1_1Source.html#a0ef9f726abcc6c4c9e81b3a257055df8',1,'tvm::parser::Source::Source()'],['../classtvm_1_1arith_1_1IterMarkNode.html#a8b885a675c88e5a5d142fa68bcba048a',1,'tvm::arith::IterMarkNode::source()'],['../classtvm_1_1arith_1_1IterSplitExprNode.html#a7a129dc9b432359a07c1a1e286c3c66f',1,'tvm::arith::IterSplitExprNode::source()'],['../classtvm_1_1parser_1_1SourceNode.html#a51cc [...]
+  ['source',['Source',['../classtvm_1_1parser_1_1Source.html',1,'tvm::parser::Source'],['../classtvm_1_1arith_1_1IterMarkNode.html#a8b885a675c88e5a5d142fa68bcba048a',1,'tvm::arith::IterMarkNode::source()'],['../classtvm_1_1arith_1_1IterSplitExprNode.html#a7a129dc9b432359a07c1a1e286c3c66f',1,'tvm::arith::IterSplitExprNode::source()'],['../classtvm_1_1parser_1_1SourceNode.html#a51cc3c98e4cdacf0ffdc643c848e09af',1,'tvm::parser::SourceNode::source()'],['../classtvm_1_1tir_1_1ReduceNode.html# [...]
   ['source_5fmap',['source_map',['../classtvm_1_1IRModuleNode.html#a49470c0bfb4b85d9eda7576a837b7031',1,'tvm::IRModuleNode::source_map()'],['../classtvm_1_1parser_1_1SourceMapNode.html#ae22bc1181b066f17f8938868ef22610a',1,'tvm::parser::SourceMapNode::source_map()']]],
   ['source_5fmap_2eh',['source_map.h',['../source__map_8h.html',1,'']]],
   ['source_5fname',['source_name',['../classtvm_1_1DiagnosticBuilder.html#a92d320e1ede24fe5ff47862365002691',1,'tvm::DiagnosticBuilder::source_name()'],['../classtvm_1_1SpanNode.html#ad573167f93facbfbee19983b08bbba3d',1,'tvm::SpanNode::source_name()'],['../classtvm_1_1parser_1_1SourceNode.html#a8d4c50a18eb3e99b14d73d7db2a52af3',1,'tvm::parser::SourceNode::source_name()']]],
@@ -365,9 +365,9 @@ var searchData=
   ['stmtsref',['StmtSRef',['../classtvm_1_1tir_1_1StmtSRef.html',1,'tvm::tir::StmtSRef'],['../classtvm_1_1tir_1_1StmtSRef.html#a31687ace5dc4fe487ffb87d658d86412',1,'tvm::tir::StmtSRef::StmtSRef()']]],
   ['stmtsrefnode',['StmtSRefNode',['../classtvm_1_1tir_1_1StmtSRefNode.html',1,'tvm::tir']]],
   ['stmtvisitor',['StmtVisitor',['../classtvm_1_1tir_1_1StmtVisitor.html',1,'tvm::tir']]],
-  ['stop',['Stop',['../classtvm_1_1runtime_1_1TimerNode.html#a67eb764f2c9e3fb7c2708f01c0c35683',1,'tvm::runtime::TimerNode::Stop()'],['../classtvm_1_1runtime_1_1profiling_1_1MetricCollectorNode.html#aca9679dd49dfbc886b9dc99539cbf0e6',1,'tvm::runtime::profiling::MetricCollectorNode::Stop()'],['../classtvm_1_1runtime_1_1profiling_1_1Profiler.html#aa2000d8cd1970b5d29139ab1831394f0',1,'tvm::runtime::profiling::Profiler::Stop()'],['../structtvm_1_1relay_1_1ArangeAttrs.html#a1eadf1f3964ca83dad [...]
+  ['stop',['stop',['../structtvm_1_1relay_1_1ArangeAttrs.html#a1eadf1f3964ca83dade8edeae7d6d7cf',1,'tvm::relay::ArangeAttrs::stop()'],['../classtvm_1_1script_1_1printer_1_1SliceDocNode.html#aaeb98937e7617cb76fb9662616b89e81',1,'tvm::script::printer::SliceDocNode::stop()'],['../classtvm_1_1runtime_1_1TimerNode.html#a67eb764f2c9e3fb7c2708f01c0c35683',1,'tvm::runtime::TimerNode::Stop()'],['../classtvm_1_1runtime_1_1profiling_1_1MetricCollectorNode.html#aca9679dd49dfbc886b9dc99539cbf0e6',1,' [...]
   ['stopcall',['StopCall',['../classtvm_1_1runtime_1_1profiling_1_1Profiler.html#ad5e6a8e8c9d915c80f494138eedfec3f',1,'tvm::runtime::profiling::Profiler']]],
-  ['storage',['Storage',['../classtvm_1_1runtime_1_1vm_1_1Storage.html',1,'tvm::runtime::vm::Storage'],['../structtvm_1_1runtime_1_1vm_1_1Instruction.html#a3412cabd3b4f42f106f56fc22257f6ca',1,'tvm::runtime::vm::Instruction::storage()'],['../classtvm_1_1runtime_1_1vm_1_1Storage.html#aff0c1264864e6205cfa468f069f62f55',1,'tvm::runtime::vm::Storage::Storage()']]],
+  ['storage',['Storage',['../classtvm_1_1runtime_1_1vm_1_1Storage.html',1,'tvm::runtime::vm::Storage'],['../classtvm_1_1runtime_1_1vm_1_1Storage.html#aff0c1264864e6205cfa468f069f62f55',1,'tvm::runtime::vm::Storage::Storage()'],['../structtvm_1_1runtime_1_1vm_1_1Instruction.html#a3412cabd3b4f42f106f56fc22257f6ca',1,'tvm::runtime::vm::Instruction::storage()']]],
   ['storage_5falign',['storage_align',['../classtvm_1_1auto__scheduler_1_1State.html#ab006690418e43cc9b7ad021c02657ed6',1,'tvm::auto_scheduler::State::storage_align()'],['../classtvm_1_1te_1_1Stage.html#aa73e3a269d84c3b4f0a1994371d67bab',1,'tvm::te::Stage::storage_align()']]],
   ['storage_5falignment',['storage_alignment',['../namespacetvm_1_1tir_1_1attr.html#af27d464f2065dc5f77408df7b94d4bb6',1,'tvm::tir::attr']]],
   ['storage_5fid',['storage_id',['../structTVMGraphExecutorGraphAttr.html#a8a0d6d05adcffbf499aafb6a6700c400',1,'TVMGraphExecutorGraphAttr']]],
diff --git a/docs/reference/api/doxygen/search/all_15.js b/docs/reference/api/doxygen/search/all_15.js
index 970b257e9a..91c8158cd3 100644
--- a/docs/reference/api/doxygen/search/all_15.js
+++ b/docs/reference/api/doxygen/search/all_15.js
@@ -39,7 +39,7 @@ var searchData=
   ['takeattrs',['TakeAttrs',['../structtvm_1_1relay_1_1TakeAttrs.html',1,'tvm::relay']]],
   ['tan',['tan',['../namespacetvm.html#af99838098788d40c80b402f29b3c2e8c',1,'tvm::tan()'],['../namespacetvm_1_1topi.html#a13b757fe52775f43a58d91c0a1330f97',1,'tvm::topi::tan()']]],
   ['tanh',['tanh',['../namespacetvm.html#a12c5457301d8a2c03a2ba1163edd7cee',1,'tvm::tanh()'],['../namespacetvm_1_1topi.html#aec153e599d33c78a7592007cde1c02cb',1,'tvm::topi::tanh()']]],
-  ['target',['Target',['../classtvm_1_1Target.html',1,'tvm::Target'],['../classtvm_1_1Target.html#a58a5a1e042e265fe5a6973045226fe1a',1,'tvm::Target::Target(std::nullptr_t)'],['../classtvm_1_1Target.html#a77f3d7cc97d8cfd7172af58b4e784d89',1,'tvm::Target::Target(const String &amp;tag_or_config_or_target_str)'],['../classtvm_1_1Target.html#ab825b350cf478bf948d807b6fdf636a0',1,'tvm::Target::Target(const Map&lt; String, ObjectRef &gt; &amp;config)'],['../classtvm_1_1Target.html#a1abb29217d8e3 [...]
+  ['target',['Target',['../classtvm_1_1Target.html',1,'tvm::Target'],['../classtvm_1_1auto__scheduler_1_1SearchTaskNode.html#acf4407e0c8dced81b05b34ec0426c933',1,'tvm::auto_scheduler::SearchTaskNode::target()'],['../classtvm_1_1meta__schedule_1_1BuilderInputNode.html#afc001f3e427cfc8c05236b615cfd2868',1,'tvm::meta_schedule::BuilderInputNode::target()'],['../classtvm_1_1meta__schedule_1_1TuningRecordNode.html#a45a380cfa2edfd63056fb1a00f9aac35',1,'tvm::meta_schedule::TuningRecordNode::targ [...]
   ['target_2eh',['target.h',['../target_8h.html',1,'']]],
   ['target_5fburst_5fbytes',['target_burst_bytes',['../structtvm_1_1PoolInfoNode.html#a747c03e3eafc83b053637b735244c6d7',1,'tvm::PoolInfoNode::target_burst_bytes()'],['../structtvm_1_1PoolInfoPropertiesNode.html#aa1efe29e920f5b003894a2ae3304da17',1,'tvm::PoolInfoPropertiesNode::target_burst_bytes()']]],
   ['target_5fhost',['target_host',['../classtvm_1_1auto__scheduler_1_1SearchTaskNode.html#afe27bf8cb82dc8a1b6fffb9e5a3e6c20',1,'tvm::auto_scheduler::SearchTaskNode']]],
diff --git a/docs/reference/api/doxygen/search/all_16.js b/docs/reference/api/doxygen/search/all_16.js
index 5c19111669..a9f4b23567 100644
--- a/docs/reference/api/doxygen/search/all_16.js
+++ b/docs/reference/api/doxygen/search/all_16.js
@@ -29,9 +29,9 @@ var searchData=
   ['unknownattributeaccesspathnode',['UnknownAttributeAccessPathNode',['../classtvm_1_1UnknownAttributeAccessPathNode.html',1,'tvm::UnknownAttributeAccessPathNode'],['../classtvm_1_1UnknownAttributeAccessPathNode.html#a1882e9e591466a2785acc761dc63d56e',1,'tvm::UnknownAttributeAccessPathNode::UnknownAttributeAccessPathNode()']]],
   ['unmatchedcases',['UnmatchedCases',['../namespacetvm_1_1relay.html#aa3a8cace40f8056fd6412f39c3eaa605',1,'tvm::relay']]],
   ['unravel_5findex',['unravel_index',['../namespacetvm_1_1topi.html#a8811a02532bbe3047986bf1a8449ac0e',1,'tvm::topi']]],
-  ['unroll',['unroll',['../classtvm_1_1auto__scheduler_1_1State.html#aa68a9d2e226bae38a36e4be4af1d1ae4',1,'tvm::auto_scheduler::State::unroll()'],['../classtvm_1_1te_1_1Stage.html#af83ad8672660403504f472228b044b33',1,'tvm::te::Stage::unroll()'],['../classtvm_1_1tir_1_1ScheduleNode.html#a84ec742f6295f59390592a6d0d90a552',1,'tvm::tir::ScheduleNode::Unroll()']]],
+  ['unroll',['Unroll',['../classtvm_1_1tir_1_1ScheduleNode.html#a84ec742f6295f59390592a6d0d90a552',1,'tvm::tir::ScheduleNode::Unroll()'],['../classtvm_1_1auto__scheduler_1_1State.html#aa68a9d2e226bae38a36e4be4af1d1ae4',1,'tvm::auto_scheduler::State::unroll()'],['../classtvm_1_1te_1_1Stage.html#af83ad8672660403504f472228b044b33',1,'tvm::te::Stage::unroll()']]],
   ['unrollloop',['UnrollLoop',['../namespacetvm_1_1tir_1_1transform.html#ab2f279e91071fa96a1edb24fa004ea6a',1,'tvm::tir::transform']]],
-  ['update',['update',['../classtvm_1_1te_1_1ScanOpNode.html#ace2bf7e43cd4197324ec6363626fc60a',1,'tvm::te::ScanOpNode::update()'],['../classtvm_1_1arith_1_1ConstIntBoundAnalyzer.html#a5ae0699196c4bbc754bbdd4c3a6c7ca7',1,'tvm::arith::ConstIntBoundAnalyzer::Update()'],['../classtvm_1_1arith_1_1ModularSetAnalyzer.html#a04156fac580981f3005af3b8e676720d',1,'tvm::arith::ModularSetAnalyzer::Update()'],['../classtvm_1_1arith_1_1RewriteSimplifier.html#a5e6752c0702dc2d3e4235797d9d3ac7b',1,'tvm::a [...]
+  ['update',['Update',['../classtvm_1_1arith_1_1ConstIntBoundAnalyzer.html#a5ae0699196c4bbc754bbdd4c3a6c7ca7',1,'tvm::arith::ConstIntBoundAnalyzer::Update()'],['../classtvm_1_1arith_1_1ModularSetAnalyzer.html#a04156fac580981f3005af3b8e676720d',1,'tvm::arith::ModularSetAnalyzer::Update()'],['../classtvm_1_1arith_1_1RewriteSimplifier.html#a5e6752c0702dc2d3e4235797d9d3ac7b',1,'tvm::arith::RewriteSimplifier::Update()'],['../classtvm_1_1arith_1_1CanonicalSimplifier.html#a790c032e12c7d93e9e940 [...]
   ['update_5ffunc',['update_func',['../classtvm_1_1auto__scheduler_1_1PythonBasedModelNode.html#ade9364c152a36501d4f24fa4f0111519',1,'tvm::auto_scheduler::PythonBasedModelNode']]],
   ['updatecostmodel',['UpdateCostModel',['../classtvm_1_1meta__schedule_1_1MeasureCallback.html#afdf5503c6e6f53767de132d91a7b53f9',1,'tvm::meta_schedule::MeasureCallback']]],
   ['updateiters',['UpdateIters',['../classtvm_1_1auto__scheduler_1_1AttachMap.html#ab45b991ef2bcfb1bc191601aac42e778',1,'tvm::auto_scheduler::AttachMap']]],
diff --git a/docs/reference/api/doxygen/search/all_18.js b/docs/reference/api/doxygen/search/all_18.js
index 8331374b77..8551e528a6 100644
--- a/docs/reference/api/doxygen/search/all_18.js
+++ b/docs/reference/api/doxygen/search/all_18.js
@@ -33,7 +33,7 @@ var searchData=
   ['withframe',['WithFrame',['../classtvm_1_1script_1_1printer_1_1IRDocsifierNode.html#aeb321e859e30f7a3917a4ca8db71d472',1,'tvm::script::printer::IRDocsifierNode']]],
   ['withhost',['WithHost',['../classtvm_1_1Target.html#a509ce63995f082c80742ea5ca6ac112f',1,'tvm::Target']]],
   ['withoutattr',['WithoutAttr',['../namespacetvm.html#a7e2bc626db8be997b1562c79df3d9e11',1,'tvm']]],
-  ['workload',['Workload',['../classtvm_1_1meta__schedule_1_1Workload.html',1,'tvm::meta_schedule::Workload'],['../classtvm_1_1meta__schedule_1_1Workload.html#a21ccf9c956b82d50a2579f1c0f592fd0',1,'tvm::meta_schedule::Workload::Workload(IRModule mod)'],['../classtvm_1_1meta__schedule_1_1Workload.html#a8880877517679c82ae63520e28d5e1d8',1,'tvm::meta_schedule::Workload::Workload(IRModule mod, THashCode shash)'],['../classtvm_1_1meta__schedule_1_1TuningRecordNode.html#a42c87f1ec62dae6806c3fe9 [...]
+  ['workload',['Workload',['../classtvm_1_1meta__schedule_1_1Workload.html',1,'tvm::meta_schedule::Workload'],['../classtvm_1_1meta__schedule_1_1TuningRecordNode.html#a42c87f1ec62dae6806c3fe9629c5e7f0',1,'tvm::meta_schedule::TuningRecordNode::workload()'],['../classtvm_1_1meta__schedule_1_1Workload.html#a21ccf9c956b82d50a2579f1c0f592fd0',1,'tvm::meta_schedule::Workload::Workload(IRModule mod)'],['../classtvm_1_1meta__schedule_1_1Workload.html#a8880877517679c82ae63520e28d5e1d8',1,'tvm::me [...]
   ['workload_5fkey',['workload_key',['../classtvm_1_1auto__scheduler_1_1SearchTaskNode.html#a20045d677ba2bc5c5ce461e78543b3e2',1,'tvm::auto_scheduler::SearchTaskNode']]],
   ['workloadequal',['WorkloadEqual',['../structtvm_1_1meta__schedule_1_1WorkloadEqual.html',1,'tvm::meta_schedule']]],
   ['workloadhash',['WorkloadHash',['../structtvm_1_1meta__schedule_1_1WorkloadHash.html',1,'tvm::meta_schedule']]],
diff --git a/docs/reference/api/doxygen/search/all_e.js b/docs/reference/api/doxygen/search/all_e.js
index 8aaebcb9da..bafd7fcb43 100644
--- a/docs/reference/api/doxygen/search/all_e.js
+++ b/docs/reference/api/doxygen/search/all_e.js
@@ -72,7 +72,7 @@ var searchData=
   ['matmulattrs',['MatmulAttrs',['../structtvm_1_1relay_1_1MatmulAttrs.html',1,'tvm::relay']]],
   ['matrix_5fset_5fdiag',['matrix_set_diag',['../namespacetvm_1_1topi.html#aead477c6c9d4f4589d22b8acff82040c',1,'tvm::topi']]],
   ['matrixsetdiagattrs',['MatrixSetDiagAttrs',['../structtvm_1_1relay_1_1MatrixSetDiagAttrs.html',1,'tvm::relay']]],
-  ['max',['Max',['../classtvm_1_1tir_1_1Max.html',1,'tvm::tir::Max'],['../classtvm_1_1tir_1_1Max.html#a7dff11b4dea01bfc7a03eacd077f0729',1,'tvm::tir::Max::Max()'],['../classtvm_1_1arith_1_1IntSet.html#ac215840d3e9fb2817f1e5648e31317c5',1,'tvm::arith::IntSet::max()'],['../classtvm_1_1support_1_1LinearCongruentialEngine.html#a2c5ea87b1155aa7810e0beb3b69b955b',1,'tvm::support::LinearCongruentialEngine::max()'],['../namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb',1,'tvm::max(PrimExpr a, [...]
+  ['max',['Max',['../classtvm_1_1tir_1_1Max.html',1,'tvm::tir::Max'],['../classtvm_1_1arith_1_1IntSet.html#ac215840d3e9fb2817f1e5648e31317c5',1,'tvm::arith::IntSet::max()'],['../classtvm_1_1support_1_1LinearCongruentialEngine.html#a2c5ea87b1155aa7810e0beb3b69b955b',1,'tvm::support::LinearCongruentialEngine::max()'],['../classtvm_1_1tir_1_1Max.html#a7dff11b4dea01bfc7a03eacd077f0729',1,'tvm::tir::Max::Max()'],['../namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb',1,'tvm::max(PrimExpr a, [...]
   ['max_5fcontinuous_5ferror',['max_continuous_error',['../classtvm_1_1auto__scheduler_1_1ProgramMeasurerNode.html#abdc38da91bcdf77be765c1e3d5af3648',1,'tvm::auto_scheduler::ProgramMeasurerNode']]],
   ['max_5fdisplacement',['max_displacement',['../structtvm_1_1relay_1_1CorrelationAttrs.html#ad1d16e2ba537736c8baee2553e1e32bf',1,'tvm::relay::CorrelationAttrs']]],
   ['max_5ffunctions',['max_functions',['../structTVMMutableFuncRegistry.html#a41745f8e0f73f8e4fb2074f5b154b49c',1,'TVMMutableFuncRegistry']]],
@@ -177,7 +177,7 @@ var searchData=
   ['microtvmruntimegetoutput',['MicroTVMRuntimeGetOutput',['../microtvm__runtime_8h.html#a76129be7b6de972791a3f9a1b312acfa',1,'microtvm_runtime.h']]],
   ['microtvmruntimerun',['MicroTVMRuntimeRun',['../microtvm__runtime_8h.html#ac43a544f675dd716e8c279c3e41f6e45',1,'microtvm_runtime.h']]],
   ['microtvmruntimesetinput',['MicroTVMRuntimeSetInput',['../microtvm__runtime_8h.html#aa593edc600f4356f2b560702aa01b113',1,'microtvm_runtime.h']]],
-  ['min',['Min',['../classtvm_1_1tir_1_1Min.html',1,'tvm::tir::Min'],['../classtvm_1_1RangeNode.html#a43d2fb12bb61cf05936a1972d0158b49',1,'tvm::RangeNode::min()'],['../classtvm_1_1tir_1_1ForNode.html#a1d1aa2006328bd84e4911f6d43ceca5c',1,'tvm::tir::ForNode::min()'],['../classtvm_1_1arith_1_1IntSet.html#ae5517de2862e93a801224eed98a57001',1,'tvm::arith::IntSet::min()'],['../classtvm_1_1support_1_1LinearCongruentialEngine.html#aec5f11b588fa3a12294a46c945c34411',1,'tvm::support::LinearCongrue [...]
+  ['min',['Min',['../classtvm_1_1tir_1_1Min.html',1,'tvm::tir::Min'],['../classtvm_1_1tir_1_1Min.html#a3a4403aec40029a5206e22cd334e356b',1,'tvm::tir::Min::Min()'],['../classtvm_1_1RangeNode.html#a43d2fb12bb61cf05936a1972d0158b49',1,'tvm::RangeNode::min()'],['../classtvm_1_1tir_1_1ForNode.html#a1d1aa2006328bd84e4911f6d43ceca5c',1,'tvm::tir::ForNode::min()'],['../classtvm_1_1arith_1_1IntSet.html#ae5517de2862e93a801224eed98a57001',1,'tvm::arith::IntSet::min()'],['../classtvm_1_1support_1_1L [...]
   ['min_5frepeat_5fms',['min_repeat_ms',['../classtvm_1_1auto__scheduler_1_1ProgramRunnerNode.html#a39a865216db9ed6f57dfb22160cae1ff',1,'tvm::auto_scheduler::ProgramRunnerNode']]],
   ['min_5fvalue',['min_value',['../classtvm_1_1arith_1_1ConstIntBoundNode.html#a0761897bf16ab73b848bf360e9b195a3',1,'tvm::arith::ConstIntBoundNode::min_value()'],['../namespacetvm.html#a3b37fa55ea93d6868751a2441996b072',1,'tvm::min_value()']]],
   ['minimum',['minimum',['../namespacetvm_1_1topi.html#a7ac1dc0d99ce93090a4cdf90ab19d4b8',1,'tvm::topi::minimum(const tvm::PrimExpr &amp;a, const tvm::PrimExpr &amp;b)'],['../namespacetvm_1_1topi.html#a0e19dc06a2b1ecbb83b0942fdf836169',1,'tvm::topi::minimum(const tvm::te::Tensor &amp;A, const tvm::te::Tensor &amp;B, std::string name=&quot;T_&quot; &quot;minimum&quot;, std::string tag=kBroadcast)'],['../namespacetvm_1_1topi.html#a28d4ef4b3426bff237215ce356dd5681',1,'tvm::topi::minimum(con [...]
@@ -195,7 +195,7 @@ var searchData=
   ['mixedmodulepassmanager',['MixedModulePassManager',['../namespacetvm.html#abc01352eff102d4902632d097adc0e08',1,'tvm']]],
   ['mma_5ffill',['mma_fill',['../namespacetvm_1_1tir_1_1builtin.html#a307667c449c54cef747d781771f79bab',1,'tvm::tir::builtin']]],
   ['mma_5fstore',['mma_store',['../namespacetvm_1_1tir_1_1builtin.html#a772fb68f083e71e635c50bb503903f22',1,'tvm::tir::builtin']]],
-  ['mod',['Mod',['../classtvm_1_1tir_1_1Mod.html',1,'tvm::tir::Mod'],['../classtvm_1_1tir_1_1Mod.html#a8bb56b57ed569d8f357c4439fd8a2f13',1,'tvm::tir::Mod::Mod()'],['../classtvm_1_1meta__schedule_1_1BuilderInputNode.html#ab2fb058ca54af03b5bc47bf4fac23cf7',1,'tvm::meta_schedule::BuilderInputNode::mod()'],['../classtvm_1_1meta__schedule_1_1WorkloadNode.html#a3929f2761c168c25de6be2247b913911',1,'tvm::meta_schedule::WorkloadNode::mod()'],['../classtvm_1_1meta__schedule_1_1ExtractedTaskNode.ht [...]
+  ['mod',['Mod',['../classtvm_1_1tir_1_1Mod.html',1,'tvm::tir::Mod'],['../classtvm_1_1meta__schedule_1_1BuilderInputNode.html#ab2fb058ca54af03b5bc47bf4fac23cf7',1,'tvm::meta_schedule::BuilderInputNode::mod()'],['../classtvm_1_1meta__schedule_1_1WorkloadNode.html#a3929f2761c168c25de6be2247b913911',1,'tvm::meta_schedule::WorkloadNode::mod()'],['../classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html#a50c40aa8beb57d0f31c36ef360042be6',1,'tvm::meta_schedule::ExtractedTaskNode::mod()'],['../c [...]
   ['mod_5fname',['mod_name',['../structTVMMetadata.html#a32e45fcae0f9328e944a35a885d94276',1,'TVMMetadata::mod_name()'],['../classtvm_1_1runtime_1_1metadata_1_1MetadataNode.html#a1c05bb5eb88b5d55b3abeeb2de263191',1,'tvm::runtime::metadata::MetadataNode::mod_name()']]],
   ['mode',['mode',['../structtvm_1_1relay_1_1MirrorPadAttrs.html#af5381d72f1d9c9abcb9d2e522966ad86',1,'tvm::relay::MirrorPadAttrs::mode()'],['../structtvm_1_1relay_1_1SubPixelAttrs.html#a6f0822aa1ad7672a18ab73c64e83fa99',1,'tvm::relay::SubPixelAttrs::mode()'],['../structtvm_1_1relay_1_1ScatterNDAttrs.html#ab13eeaa700fe7e41666ac04179e0fd62',1,'tvm::relay::ScatterNDAttrs::mode()'],['../structtvm_1_1relay_1_1TakeAttrs.html#a0bf9d25ced9bfc91e766494e5f641e70',1,'tvm::relay::TakeAttrs::mode()' [...]
   ['modnode',['ModNode',['../classtvm_1_1tir_1_1ModNode.html',1,'tvm::tir']]],
diff --git a/docs/reference/api/doxygen/search/functions_10.js b/docs/reference/api/doxygen/search/functions_10.js
index 973b41e883..56bd5761db 100644
--- a/docs/reference/api/doxygen/search/functions_10.js
+++ b/docs/reference/api/doxygen/search/functions_10.js
@@ -8,8 +8,9 @@ var searchData=
   ['packimportstoc',['PackImportsToC',['../namespacetvm_1_1codegen.html#abf02059ebadcdb8bbbe5c840b646d67b',1,'tvm::codegen']]],
   ['packimportstollvm',['PackImportsToLLVM',['../namespacetvm_1_1codegen.html#ab2cd2a65bac4b26427a8ca0abe4e0bd6',1,'tvm::codegen']]],
   ['pad',['pad',['../namespacetvm_1_1topi.html#a3305d377f96cd20c23032eeada2756d5',1,'tvm::topi::pad(const tvm::te::Tensor &amp;t, const tvm::Array&lt; tvm::PrimExpr &gt; &amp;pad_before, tvm::Array&lt; tvm::PrimExpr &gt; pad_after=tvm::Array&lt; tvm::PrimExpr &gt;(), PrimExpr pad_value=PrimExpr(), std::string name=&quot;T_pad&quot;, std::string tag=kElementWise, std::string pad_mode=&quot;constant&quot;, const Array&lt; PrimExpr &gt; *dyn_output_shape=nullptr)'],['../namespacetvm_1_1topi [...]
+  ['padeinsum',['PadEinsum',['../classtvm_1_1tir_1_1ScheduleNode.html#a1ac39c82aee1f8de30d5871d5923fc24',1,'tvm::tir::ScheduleNode']]],
   ['pagememorymanagercreate',['PageMemoryManagerCreate',['../page__allocator_8h.html#a720dbc7474ac13b93fafb974cfc20bc7',1,'page_allocator.h']]],
-  ['parallel',['Parallel',['../classtvm_1_1tir_1_1ScheduleNode.html#a553dc17c0b49b175cd16881c81b6c789',1,'tvm::tir::ScheduleNode::Parallel()'],['../classtvm_1_1auto__scheduler_1_1State.html#a2376f0180bc5b5dd4b456f2a75d4a366',1,'tvm::auto_scheduler::State::parallel()'],['../classtvm_1_1te_1_1Stage.html#a60a6be10a1a96cb594c1399efabafef3',1,'tvm::te::Stage::parallel()']]],
+  ['parallel',['parallel',['../classtvm_1_1auto__scheduler_1_1State.html#a2376f0180bc5b5dd4b456f2a75d4a366',1,'tvm::auto_scheduler::State::parallel()'],['../classtvm_1_1te_1_1Stage.html#a60a6be10a1a96cb594c1399efabafef3',1,'tvm::te::Stage::parallel()'],['../classtvm_1_1tir_1_1ScheduleNode.html#a553dc17c0b49b175cd16881c81b6c789',1,'tvm::tir::ScheduleNode::Parallel()']]],
   ['parallel_5ffor',['parallel_for',['../namespacetvm_1_1support.html#a8bf1225e8bb1db575578ca2d645fb23c',1,'tvm::support']]],
   ['parallel_5ffor_5fdynamic',['parallel_for_dynamic',['../namespacetvm_1_1support.html#afe4271363c794f1644ce7af5c2266530',1,'tvm::support']]],
   ['parallelizevectorizeunroll',['ParallelizeVectorizeUnroll',['../classtvm_1_1meta__schedule_1_1ScheduleRule.html#a0ef9b604081db7a8bf960f3fbfd3a804',1,'tvm::meta_schedule::ScheduleRule']]],
diff --git a/docs/reference/api/doxygen/search/functions_12.js b/docs/reference/api/doxygen/search/functions_12.js
index c9ee61bdac..daf7b2563f 100644
--- a/docs/reference/api/doxygen/search/functions_12.js
+++ b/docs/reference/api/doxygen/search/functions_12.js
@@ -97,7 +97,7 @@ var searchData=
   ['rewritetensorize',['RewriteTensorize',['../classtvm_1_1meta__schedule_1_1Postproc.html#a95db036cfced4c2575367a26a41498ff',1,'tvm::meta_schedule::Postproc']]],
   ['rewriteunboundblock',['RewriteUnboundBlock',['../classtvm_1_1meta__schedule_1_1Postproc.html#a1836b2278bc24fdc227c490896d92980',1,'tvm::meta_schedule::Postproc']]],
   ['rewriteunsafeselect',['RewriteUnsafeSelect',['../namespacetvm_1_1tir_1_1transform.html#a4fe43327c4454dd05b6e925577443f49',1,'tvm::tir::transform']]],
-  ['rfactor',['rfactor',['../classtvm_1_1auto__scheduler_1_1State.html#a21c27b06d439267f8b981fa05c5f48a0',1,'tvm::auto_scheduler::State::rfactor()'],['../classtvm_1_1te_1_1Schedule.html#a34ae85add41bbed0140726d024d08862',1,'tvm::te::Schedule::rfactor()'],['../classtvm_1_1tir_1_1ScheduleNode.html#ab185c8eac1065290d84d58e7f4617232',1,'tvm::tir::ScheduleNode::RFactor()']]],
+  ['rfactor',['RFactor',['../classtvm_1_1tir_1_1ScheduleNode.html#ab185c8eac1065290d84d58e7f4617232',1,'tvm::tir::ScheduleNode::RFactor()'],['../classtvm_1_1auto__scheduler_1_1State.html#a21c27b06d439267f8b981fa05c5f48a0',1,'tvm::auto_scheduler::State::rfactor()'],['../classtvm_1_1te_1_1Schedule.html#a34ae85add41bbed0140726d024d08862',1,'tvm::te::Schedule::rfactor()']]],
   ['rfactorstep',['RfactorStep',['../classtvm_1_1auto__scheduler_1_1RfactorStep.html#a26e6f85b55307f18fab4469e3bd4be0c',1,'tvm::auto_scheduler::RfactorStep::RfactorStep(int stage_id, int iter_id, int factor_iter_id)'],['../classtvm_1_1auto__scheduler_1_1RfactorStep.html#a95575c21441177634178245ab562cb4f',1,'tvm::auto_scheduler::RfactorStep::RfactorStep(dmlc::JSONReader *reader)']]],
   ['right_5fshift',['right_shift',['../namespacetvm.html#ae8ecc0382685a855187bede0c97d93e6',1,'tvm::right_shift(PrimExpr a, PrimExpr b, Span span=Span())'],['../namespacetvm.html#af49dde9dfdeea62e8ad3a6d8db53de0b',1,'tvm::right_shift(const PrimExpr &amp;a, int b, Span span=Span())'],['../namespacetvm.html#a98ff4361d0a24570f8dc32d03cde972a',1,'tvm::right_shift(int a, const PrimExpr &amp;b, Span span=Span())'],['../namespacetvm_1_1topi.html#a9673b9caffb46404b566c3f04a492dfe',1,'tvm::topi:: [...]
   ['rocblas_5fbatch_5fmatmul',['rocblas_batch_matmul',['../namespacetvm_1_1topi_1_1contrib.html#abf1113dd429e1285752b48f62fe12848',1,'tvm::topi::contrib']]],
diff --git a/docs/reference/api/doxygen/search/functions_15.js b/docs/reference/api/doxygen/search/functions_15.js
index f211aa2409..7a55842ac1 100644
--- a/docs/reference/api/doxygen/search/functions_15.js
+++ b/docs/reference/api/doxygen/search/functions_15.js
@@ -22,7 +22,7 @@ var searchData=
   ['unknownattributeaccesspathnode',['UnknownAttributeAccessPathNode',['../classtvm_1_1UnknownAttributeAccessPathNode.html#a1882e9e591466a2785acc761dc63d56e',1,'tvm::UnknownAttributeAccessPathNode']]],
   ['unmatchedcases',['UnmatchedCases',['../namespacetvm_1_1relay.html#aa3a8cace40f8056fd6412f39c3eaa605',1,'tvm::relay']]],
   ['unravel_5findex',['unravel_index',['../namespacetvm_1_1topi.html#a8811a02532bbe3047986bf1a8449ac0e',1,'tvm::topi']]],
-  ['unroll',['unroll',['../classtvm_1_1auto__scheduler_1_1State.html#aa68a9d2e226bae38a36e4be4af1d1ae4',1,'tvm::auto_scheduler::State::unroll()'],['../classtvm_1_1te_1_1Stage.html#af83ad8672660403504f472228b044b33',1,'tvm::te::Stage::unroll()'],['../classtvm_1_1tir_1_1ScheduleNode.html#a84ec742f6295f59390592a6d0d90a552',1,'tvm::tir::ScheduleNode::Unroll()']]],
+  ['unroll',['Unroll',['../classtvm_1_1tir_1_1ScheduleNode.html#a84ec742f6295f59390592a6d0d90a552',1,'tvm::tir::ScheduleNode::Unroll()'],['../classtvm_1_1auto__scheduler_1_1State.html#aa68a9d2e226bae38a36e4be4af1d1ae4',1,'tvm::auto_scheduler::State::unroll()'],['../classtvm_1_1te_1_1Stage.html#af83ad8672660403504f472228b044b33',1,'tvm::te::Stage::unroll()']]],
   ['unrollloop',['UnrollLoop',['../namespacetvm_1_1tir_1_1transform.html#ab2f279e91071fa96a1edb24fa004ea6a',1,'tvm::tir::transform']]],
   ['update',['Update',['../classtvm_1_1arith_1_1ConstIntBoundAnalyzer.html#a5ae0699196c4bbc754bbdd4c3a6c7ca7',1,'tvm::arith::ConstIntBoundAnalyzer::Update()'],['../classtvm_1_1arith_1_1ModularSetAnalyzer.html#a04156fac580981f3005af3b8e676720d',1,'tvm::arith::ModularSetAnalyzer::Update()'],['../classtvm_1_1arith_1_1RewriteSimplifier.html#a5e6752c0702dc2d3e4235797d9d3ac7b',1,'tvm::arith::RewriteSimplifier::Update()'],['../classtvm_1_1arith_1_1CanonicalSimplifier.html#a790c032e12c7d93e9e940 [...]
   ['updatecostmodel',['UpdateCostModel',['../classtvm_1_1meta__schedule_1_1MeasureCallback.html#afdf5503c6e6f53767de132d91a7b53f9',1,'tvm::meta_schedule::MeasureCallback']]],
diff --git a/docs/reference/api/doxygen/search/functions_d.js b/docs/reference/api/doxygen/search/functions_d.js
index ba876e7044..df48d208b7 100644
--- a/docs/reference/api/doxygen/search/functions_d.js
+++ b/docs/reference/api/doxygen/search/functions_d.js
@@ -36,7 +36,7 @@ var searchData=
   ['matchrange',['MatchRange',['../classtvm_1_1arith_1_1IntSet.html#a2f2999336fbba4f436b66bdddce5c57a',1,'tvm::arith::IntSet']]],
   ['matmul',['matmul',['../namespacetvm_1_1topi.html#adae7dcb7e951109ba72192202d182994',1,'tvm::topi']]],
   ['matrix_5fset_5fdiag',['matrix_set_diag',['../namespacetvm_1_1topi.html#aead477c6c9d4f4589d22b8acff82040c',1,'tvm::topi']]],
-  ['max',['Max',['../classtvm_1_1tir_1_1Max.html#a7dff11b4dea01bfc7a03eacd077f0729',1,'tvm::tir::Max::Max()'],['../classtvm_1_1arith_1_1IntSet.html#ac215840d3e9fb2817f1e5648e31317c5',1,'tvm::arith::IntSet::max()'],['../classtvm_1_1support_1_1LinearCongruentialEngine.html#a2c5ea87b1155aa7810e0beb3b69b955b',1,'tvm::support::LinearCongruentialEngine::max()'],['../namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb',1,'tvm::max(PrimExpr a, PrimExpr b, Span span=Span())'],['../namespacetvm.ht [...]
+  ['max',['max',['../classtvm_1_1arith_1_1IntSet.html#ac215840d3e9fb2817f1e5648e31317c5',1,'tvm::arith::IntSet::max()'],['../classtvm_1_1support_1_1LinearCongruentialEngine.html#a2c5ea87b1155aa7810e0beb3b69b955b',1,'tvm::support::LinearCongruentialEngine::max()'],['../classtvm_1_1tir_1_1Max.html#a7dff11b4dea01bfc7a03eacd077f0729',1,'tvm::tir::Max::Max()'],['../namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb',1,'tvm::max(PrimExpr a, PrimExpr b, Span span=Span())'],['../namespacetvm.ht [...]
   ['max_5fvalue',['max_value',['../namespacetvm.html#a4f1398024c0af23699447ef910b654b8',1,'tvm']]],
   ['maxconcurrency',['MaxConcurrency',['../namespacetvm_1_1runtime_1_1threading.html#af8c1c389a74e67bcc3680555288219f8',1,'tvm::runtime::threading']]],
   ['maximum',['maximum',['../namespacetvm_1_1topi.html#afd64bc3e27dfc97002d3add5d7ce4174',1,'tvm::topi::maximum(const tvm::PrimExpr &amp;a, const tvm::PrimExpr &amp;b)'],['../namespacetvm_1_1topi.html#a5338e9297463bc745027fca67daa2ebb',1,'tvm::topi::maximum(const tvm::te::Tensor &amp;A, const tvm::te::Tensor &amp;B, std::string name=&quot;T_&quot; &quot;maximum&quot;, std::string tag=kBroadcast)'],['../namespacetvm_1_1topi.html#a4076a8d6a2b243c548d741e9f6bcfe69',1,'tvm::topi::maximum(con [...]
@@ -66,7 +66,7 @@ var searchData=
   ['microtvmruntimegetoutput',['MicroTVMRuntimeGetOutput',['../microtvm__runtime_8h.html#a76129be7b6de972791a3f9a1b312acfa',1,'microtvm_runtime.h']]],
   ['microtvmruntimerun',['MicroTVMRuntimeRun',['../microtvm__runtime_8h.html#ac43a544f675dd716e8c279c3e41f6e45',1,'microtvm_runtime.h']]],
   ['microtvmruntimesetinput',['MicroTVMRuntimeSetInput',['../microtvm__runtime_8h.html#aa593edc600f4356f2b560702aa01b113',1,'microtvm_runtime.h']]],
-  ['min',['min',['../classtvm_1_1arith_1_1IntSet.html#ae5517de2862e93a801224eed98a57001',1,'tvm::arith::IntSet::min()'],['../classtvm_1_1support_1_1LinearCongruentialEngine.html#aec5f11b588fa3a12294a46c945c34411',1,'tvm::support::LinearCongruentialEngine::min()'],['../classtvm_1_1tir_1_1Min.html#a3a4403aec40029a5206e22cd334e356b',1,'tvm::tir::Min::Min()'],['../namespacetvm.html#aac2abc149c1a47944c37b560181b15c0',1,'tvm::min(PrimExpr a, PrimExpr b, Span span=Span())'],['../namespacetvm.ht [...]
+  ['min',['Min',['../classtvm_1_1tir_1_1Min.html#a3a4403aec40029a5206e22cd334e356b',1,'tvm::tir::Min::Min()'],['../classtvm_1_1arith_1_1IntSet.html#ae5517de2862e93a801224eed98a57001',1,'tvm::arith::IntSet::min()'],['../classtvm_1_1support_1_1LinearCongruentialEngine.html#aec5f11b588fa3a12294a46c945c34411',1,'tvm::support::LinearCongruentialEngine::min()'],['../namespacetvm.html#aac2abc149c1a47944c37b560181b15c0',1,'tvm::min(PrimExpr a, PrimExpr b, Span span=Span())'],['../namespacetvm.ht [...]
   ['min_5fvalue',['min_value',['../namespacetvm.html#a3b37fa55ea93d6868751a2441996b072',1,'tvm']]],
   ['minimum',['minimum',['../namespacetvm_1_1topi.html#a7ac1dc0d99ce93090a4cdf90ab19d4b8',1,'tvm::topi::minimum(const tvm::PrimExpr &amp;a, const tvm::PrimExpr &amp;b)'],['../namespacetvm_1_1topi.html#a0e19dc06a2b1ecbb83b0942fdf836169',1,'tvm::topi::minimum(const tvm::te::Tensor &amp;A, const tvm::te::Tensor &amp;B, std::string name=&quot;T_&quot; &quot;minimum&quot;, std::string tag=kBroadcast)'],['../namespacetvm_1_1topi.html#a28d4ef4b3426bff237215ce356dd5681',1,'tvm::topi::minimum(con [...]
   ['minop',['MinOp',['../namespacetvm_1_1topi.html#aea9a989b0aaa2aef03fe8ee237d8257e',1,'tvm::topi']]],
@@ -79,7 +79,7 @@ var searchData=
   ['mixedmodulepassmanager',['MixedModulePassManager',['../namespacetvm.html#abc01352eff102d4902632d097adc0e08',1,'tvm']]],
   ['mma_5ffill',['mma_fill',['../namespacetvm_1_1tir_1_1builtin.html#a307667c449c54cef747d781771f79bab',1,'tvm::tir::builtin']]],
   ['mma_5fstore',['mma_store',['../namespacetvm_1_1tir_1_1builtin.html#a772fb68f083e71e635c50bb503903f22',1,'tvm::tir::builtin']]],
-  ['mod',['Mod',['../classtvm_1_1tir_1_1Mod.html#a8bb56b57ed569d8f357c4439fd8a2f13',1,'tvm::tir::Mod::Mod()'],['../classtvm_1_1tir_1_1ScheduleNode.html#a6dd7ec20629e09cd0be1aa49e5f57c12',1,'tvm::tir::ScheduleNode::mod()'],['../namespacetvm_1_1topi.html#aaa95d3ad68932ab206efbe0a326db6a2',1,'tvm::topi::mod(const tvm::PrimExpr &amp;a, const tvm::PrimExpr &amp;b)'],['../namespacetvm_1_1topi.html#a4eb4b5a58cf4c5dbbdd4413cfd166882',1,'tvm::topi::mod(const tvm::te::Tensor &amp;A, const tvm::te: [...]
+  ['mod',['mod',['../classtvm_1_1tir_1_1ScheduleNode.html#a6dd7ec20629e09cd0be1aa49e5f57c12',1,'tvm::tir::ScheduleNode::mod()'],['../classtvm_1_1tir_1_1Mod.html#a8bb56b57ed569d8f357c4439fd8a2f13',1,'tvm::tir::Mod::Mod()'],['../namespacetvm_1_1topi.html#aaa95d3ad68932ab206efbe0a326db6a2',1,'tvm::topi::mod(const tvm::PrimExpr &amp;a, const tvm::PrimExpr &amp;b)'],['../namespacetvm_1_1topi.html#a4eb4b5a58cf4c5dbbdd4413cfd166882',1,'tvm::topi::mod(const tvm::te::Tensor &amp;A, const tvm::te: [...]
   ['mod_5fname',['mod_name',['../classtvm_1_1runtime_1_1metadata_1_1MetadataNode.html#a1c05bb5eb88b5d55b3abeeb2de263191',1,'tvm::runtime::metadata::MetadataNode']]],
   ['modularset',['ModularSet',['../classtvm_1_1arith_1_1ModularSet.html#a9f54896d98169246c6a24cc338fde500',1,'tvm::arith::ModularSet']]],
   ['module',['Module',['../classtvm_1_1runtime_1_1Module.html#abfbc619b3b3166d63ec52e399c24bed9',1,'tvm::runtime::Module::Module()'],['../classtvm_1_1runtime_1_1Module.html#abd1380b3f813c2b6acefca3aaef425f4',1,'tvm::runtime::Module::Module(ObjectPtr&lt; Object &gt; n)']]],
diff --git a/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html b/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html
index 8b038aabf4..a4d1ee184e 100644
--- a/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html
+++ b/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html
@@ -66,7 +66,7 @@ $(function() {
 <div class="title">schedule.h</div>  </div>
 </div><!--header-->
 <div class="contents">
-<a href="tir_2schedule_2schedule_8h.html">Go to the documentation of this file.</a><div class="fragment"><div class="line"><a name="l00001"></a><span class="lineno">    1</span>&#160;<span class="comment">/*</span></div><div class="line"><a name="l00002"></a><span class="lineno">    2</span>&#160;<span class="comment"> * Licensed to the Apache Software Foundation (ASF) under one</span></div><div class="line"><a name="l00003"></a><span class="lineno">    3</span>&#160;<span class="comment [...]
+<a href="tir_2schedule_2schedule_8h.html">Go to the documentation of this file.</a><div class="fragment"><div class="line"><a name="l00001"></a><span class="lineno">    1</span>&#160;<span class="comment">/*</span></div><div class="line"><a name="l00002"></a><span class="lineno">    2</span>&#160;<span class="comment"> * Licensed to the Apache Software Foundation (ASF) under one</span></div><div class="line"><a name="l00003"></a><span class="lineno">    3</span>&#160;<span class="comment [...]
 <div class="ttc" id="classtvm_1_1tir_1_1StmtNode_html"><div class="ttname"><a href="classtvm_1_1tir_1_1StmtNode.html">tvm::tir::StmtNode</a></div><div class="ttdoc">Base node of all statements. </div><div class="ttdef"><b>Definition:</b> stmt.h:38</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1BlockRVNode_html_af90b398c502892d19ff3bdf6463d32ab"><div class="ttname"><a href="classtvm_1_1tir_1_1BlockRVNode.html#af90b398c502892d19ff3bdf6463d32ab">tvm::tir::BlockRVNode::VisitAttrs</a></div><div class="ttdeci">void VisitAttrs(tvm::AttrVisitor *v)</div><div class="ttdef"><b>Definition:</b> schedule.h:53</div></div>
 <div class="ttc" id="trace_8h_html"><div class="ttname"><a href="trace_8h.html">trace.h</a></div></div>
@@ -83,7 +83,7 @@ $(function() {
 <div class="ttc" id="object_8h_html_aaaa3dc5b6dc33f84b2d28f9a81267212"><div class="ttname"><a href="object_8h.html#aaaa3dc5b6dc33f84b2d28f9a81267212">TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS</a></div><div class="ttdeci">#define TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(TypeName, ParentType, ObjectName)</div><div class="ttdef"><b>Definition:</b> object.h:744</div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_a9ae244600a5e56c4adc9faf6d88f931e"><div class="ttname"><a href="namespacetvm_1_1tir.html#a9ae244600a5e56c4adc9faf6d88f931e">tvm::tir::ScheduleErrorRenderLevel</a></div><div class="ttdeci">ScheduleErrorRenderLevel</div><div class="ttdoc">The level of detailed error message rendering. </div><div class="ttdef"><b>Definition:</b> schedule.h:31</div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_a9ae244600a5e56c4adc9faf6d88f931ead6733547bb237ce06cddf96357f1b66b"><div class="ttname"><a href="namespacetvm_1_1tir.html#a9ae244600a5e56c4adc9faf6d88f931ead6733547bb237ce06cddf96357f1b66b">tvm::tir::ScheduleErrorRenderLevel::kDetail</a></div><div class="ttdoc">Render a detailed error message. </div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:659</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
 <div class="ttc" id="index__map_8h_html"><div class="ttname"><a href="index__map_8h.html">index_map.h</a></div><div class="ttdoc">Defines a remapping of buffer indices. </div></div>
 <div class="ttc" id="classtvm_1_1support_1_1LinearCongruentialEngine_html_a4d3a3a94a3f3d2dfab4b5ccb1a7e97de"><div class="ttname"><a href="classtvm_1_1support_1_1LinearCongruentialEngine.html#a4d3a3a94a3f3d2dfab4b5ccb1a7e97de">tvm::support::LinearCongruentialEngine::TRandState</a></div><div class="ttdeci">int64_t TRandState</div><div class="ttdef"><b>Definition:</b> random_engine.h:54</div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
diff --git a/docs/reference/api/doxygen/trace_8h_source.html b/docs/reference/api/doxygen/trace_8h_source.html
index 4a0c4df909..8ef9a8c605 100644
--- a/docs/reference/api/doxygen/trace_8h_source.html
+++ b/docs/reference/api/doxygen/trace_8h_source.html
@@ -76,7 +76,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1tir_html_a75918aeef1136f9d6308556902d5bcae"><div class="ttname"><a href="namespacetvm_1_1tir.html#a75918aeef1136f9d6308556902d5bcae">tvm::tir::FTraceDecisionProvider</a></div><div class="ttdeci">runtime::TypedPackedFunc&lt; ObjectRef(const Instruction &amp;inst, const Array&lt; ObjectRef &gt; &amp;inputs, const Array&lt; ObjectRef &gt; &amp;attrs, const Optional&lt; ObjectRef &gt; &amp;decision)&gt; FTraceDecisionProvider</div><div class="ttdoc">A cal [...]
 <div class="ttc" id="instruction_8h_html"><div class="ttname"><a href="instruction_8h.html">instruction.h</a></div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:659</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1TraceNode_html_ad6c859ed32b1e2ae076355eda37df0a2"><div class="ttname"><a href="classtvm_1_1tir_1_1TraceNode.html#ad6c859ed32b1e2ae076355eda37df0a2">tvm::tir::TraceNode::insts</a></div><div class="ttdeci">Array&lt; Instruction &gt; insts</div><div class="ttdoc">The instructions invoked so far in the program execution. </div><div class="ttdef"><b>Definition:</b> trace.h:61</div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1TraceNode_html_a764346045e536fa26b56c9e140de8e7b"><div class="ttname"><a href="classtvm_1_1tir_1_1TraceNode.html#a764346045e536fa26b56c9e140de8e7b">tvm::tir::TraceNode::ApplyToSchedule</a></div><div class="ttdeci">void ApplyToSchedule(Schedule sch, bool remove_postproc, FTraceDecisionProvider decision_provider=nullptr) const</div><div class="ttdoc">Apply the trace to a TensorIR schedule. </div></div>
diff --git a/docs/reference/api/python/auto_scheduler.html b/docs/reference/api/python/auto_scheduler.html
index c1cd9627bd..1c4a70a147 100644
--- a/docs/reference/api/python/auto_scheduler.html
+++ b/docs/reference/api/python/auto_scheduler.html
@@ -1602,7 +1602,7 @@ history states as starting point to perform Evolutionary Search).</p></li>
 
 <dl class="py class">
 <dt class="sig sig-object py" id="tvm.auto_scheduler.SketchPolicy">
-<em class="property"><span class="pre">class</span> </em><span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">SketchPolicy</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">program_cost_model</span></span><span class="o"><span class="pre">=</span></span><span class="defau [...]
+<em class="property"><span class="pre">class</span> </em><span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">SketchPolicy</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">program_cost_model</span></span><span class="o"><span class="pre">=</span></span><span class="defau [...]
 <dd><p>The search policy that searches in a hierarchical search space defined by sketches.
 The policy randomly samples programs from the space defined by sketches and use evolutionary
 search to fine-tune them.</p>
@@ -1886,7 +1886,7 @@ Candidates:
 
 <dl class="py function">
 <dt class="sig sig-object py" id="tvm.auto_scheduler.auto_schedule">
-<span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">auto_schedule</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">search_policy</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em clas [...]
+<span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">auto_schedule</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">search_policy</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em clas [...]
 <dd><p>THIS API IS DEPRECATED.</p>
 <p>Run auto scheduling search for a task.</p>
 <dl class="field-list simple">
diff --git a/docs/reference/api/python/tir.html b/docs/reference/api/python/tir.html
index 9c0f932ca9..0dd1cc6d30 100644
--- a/docs/reference/api/python/tir.html
+++ b/docs/reference/api/python/tir.html
@@ -5562,7 +5562,10 @@ preserve the semantics of computation. Some example of schedules:
 <tr class="row-even"><td><p><a class="reference internal" href="#tvm.tir.Schedule.can_decompose_padding" title="tvm.tir.Schedule.can_decompose_padding"><code class="xref py py-obj docutils literal notranslate"><span class="pre">can_decompose_padding</span></code></a>(block, loop)</p></td>
 <td><p>Check whether the block match padding pattern and can be decomposed.</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="#tvm.tir.Schedule.enter_postproc" title="tvm.tir.Schedule.enter_postproc"><code class="xref py py-obj docutils literal notranslate"><span class="pre">enter_postproc</span></code></a>()</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="#tvm.tir.Schedule.pad_einsum" title="tvm.tir.Schedule.pad_einsum"><code class="xref py py-obj docutils literal notranslate"><span class="pre">pad_einsum</span></code></a>(block, padding)</p></td>
+<td><p>Pad the computation of Einsum.</p></td>
+</tr>
+<tr class="row-even"><td><p><a class="reference internal" href="#tvm.tir.Schedule.enter_postproc" title="tvm.tir.Schedule.enter_postproc"><code class="xref py py-obj docutils literal notranslate"><span class="pre">enter_postproc</span></code></a>()</p></td>
 <td><p>A no-op that marks the start of postprocessing phase of scheduling</p></td>
 </tr>
 </tbody>
@@ -7598,6 +7601,113 @@ reads/writes of the block.</p>
 <dd><p>Check whether the block match padding pattern and can be decomposed.</p>
 </dd></dl>
 
+<dl class="py method">
+<dt class="sig sig-object py" id="tvm.tir.Schedule.pad_einsum">
+<span class="sig-name descname"><span class="pre">pad_einsum</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">block</span></span><span class="p"><span class="pre">:</span></span> <span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">tvm.tir.schedule.schedule.BlockRV</span><span class="p"><span class="pre">,</span> </span><a class="reference external" href="https://docs.python.or [...]
+<dd><p>Pad the computation of Einsum.</p>
+<p>This schedule primitive identifies the Einsum pattern in the block body and finds its
+producer blocks. It then pads the computation of the Einsum pattern and its producer blocks.
+The output buffer and the producer buffers are resized according to the padding size. It
+requires the output buffer and the producer buffers to be allocated inside the PrimFunc.</p>
+<p>The padding is a list of non-negative integers; each element corresponds to the padding for
+each block iter, in the order of block iters. The block and its producer blocks should have
+trivial bindings, i.e. each block iter is bound to a single loop variable. After padding,
+the block iter extent and the corresponding outer loop are extended by the padding size.</p>
+<p>The sizes of the producer buffers are inferred from the padding size of the Einsum
+computation. The producer buffers are padded by the initial value of the corresponding
+reduction.</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters</dt>
+<dd class="field-odd"><ul class="simple">
+<li><p><strong>block</strong> (<em>Union</em><em>[</em><em>BlockRV</em><em>, </em><a class="reference external" href="https://docs.python.org/3/library/stdtypes.html#str" title="(in Python v3.10)"><em>str</em></a><em>]</em>) – The block that matches the Einsum pattern.</p></li>
+<li><p><strong>padding</strong> (<em>List</em><em>[</em><a class="reference external" href="https://docs.python.org/3/library/functions.html#int" title="(in Python v3.10)"><em>int</em></a><em>]</em>) – The padding for each block iter.</p></li>
+</ul>
+</dd>
+</dl>
+<p class="rubric">Examples</p>
+<p>Before applying pad-einsum, in TensorIR, the IR is:</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="nd">@T</span><span class="o">.</span><span class="n">prim_func</span>
+<span class="k">def</span> <span class="nf">before_pad_einsum</span><span class="p">(</span>
+    <span class="n">A</span><span class="p">:</span> <span class="n">T</span><span class="o">.</span><span class="n">Buffer</span><span class="p">[(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">],</span>
+    <span class="n">B</span><span class="p">:</span> <span class="n">T</span><span class="o">.</span><span class="n">Buffer</span><span class="p">[(</span><span class="mi">127</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">],</span>
+    <span class="n">C</span><span class="p">:</span> <span class="n">T</span><span class="o">.</span><span class="n">Buffer</span><span class="p">[(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">],</span>
+<span class="p">)</span> <span class="o">-&gt;</span> <span class="kc">None</span><span class="p">:</span>
+    <span class="n">A_shared</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">alloc_buffer</span><span class="p">((</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">,</span> <span class="n">scope</span><span class="o">=</span><span class="s2">&quot;shared&quot;</span><span class="p">)</span>
+    <span class="n">B_shared</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">alloc_buffer</span><span class="p">((</span><span class="mi">127</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">,</span> <span class="n">scope</span><span class="o">=</span><span class="s2">&quot;shared&quot;</span><span class="p">)</span>
+    <span class="n">C_shared</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">alloc_buffer</span><span class="p">((</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">,</span> <span class="n">scope</span><span class="o">=</span><span class="s2">&quot;shared&quot;</span><span class="p">)</span>
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;A&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SS&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">])</span>
+            <span class="n">A_shared</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span>
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">127</span><span class="p">,</span> <span class="mi">127</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;B&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SS&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">])</span>
+            <span class="n">B_shared</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">B</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span>
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">,</span> <span class="n">i2</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">,</span> <span class="mi">127</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;C_shared&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SSR&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">,< [...]
+            <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">init</span><span class="p">():</span>
+                <span class="n">C_shared</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
+            <span class="n">C_shared</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">C_shared</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">+</span> <span class="n">A_shared</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">k</s [...]
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;C&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SS&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">])</span>
+            <span class="n">C</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">C_shared</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span>
+</pre></div>
+</div>
+<p>Create the schedule and do pad-einsum with specified block:</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">sch</span> <span class="o">=</span> <span class="n">tir</span><span class="o">.</span><span class="n">Schedule</span><span class="p">(</span><span class="n">before_pad_einsum</span><span class="p">,</span> <span class="n">debug_mask</span><span class="o">=</span><span class="s2">&quot;all&quot;</span><span class="p">)</span>
+<span class="n">block</span> <span class="o">=</span> <span class="n">sch</span><span class="o">.</span><span class="n">get_block</span><span class="p">(</span><span class="s2">&quot;C_shared&quot;</span><span class="p">)</span>
+<span class="n">sch</span><span class="o">.</span><span class="n">pad_einsum</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
+<span class="nb">print</span><span class="p">(</span><span class="n">sch</span><span class="o">.</span><span class="n">mod</span><span class="p">[</span><span class="s2">&quot;main&quot;</span><span class="p">]</span><span class="o">.</span><span class="n">script</span><span class="p">())</span>
+</pre></div>
+</div>
+<p>After applying pad-einsum, the IR becomes:</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="nd">@T</span><span class="o">.</span><span class="n">prim_func</span>
+<span class="k">def</span> <span class="nf">after_pad_einsum</span><span class="p">(</span>
+    <span class="n">A</span><span class="p">:</span> <span class="n">T</span><span class="o">.</span><span class="n">Buffer</span><span class="p">[(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">],</span>
+    <span class="n">B</span><span class="p">:</span> <span class="n">T</span><span class="o">.</span><span class="n">Buffer</span><span class="p">[(</span><span class="mi">127</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">],</span>
+    <span class="n">C</span><span class="p">:</span> <span class="n">T</span><span class="o">.</span><span class="n">Buffer</span><span class="p">[(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">),</span> <span class="s2">&quot;float32&quot;</span><span class="p">],</span>
+<span class="p">)</span> <span class="o">-&gt;</span> <span class="kc">None</span><span class="p">:</span>
+    <span class="n">A_shared_padded</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">alloc_buffer</span><span class="p">([</span><span class="mi">128</span><span class="p">,</span> <span class="mi">128</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="s2">&quot;float32&quot;</span><span class="p">,</span> <span class="n">scope</span><span class="o">=</span><span class="s2">&quot;shared&quot;</sp [...]
+    <span class="n">B_shared_padded</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">alloc_buffer</span><span class="p">([</span><span class="mi">128</span><span class="p">,</span> <span class="mi">128</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="s2">&quot;float32&quot;</span><span class="p">,</span> <span class="n">scope</span><span class="o">=</span><span class="s2">&quot;shared&quot;</sp [...]
+    <span class="n">C_shared_padded</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">alloc_buffer</span><span class="p">([</span><span class="mi">128</span><span class="p">,</span> <span class="mi">128</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="s2">&quot;float32&quot;</span><span class="p">,</span> <span class="n">scope</span><span class="o">=</span><span class="s2">&quot;shared&quot;</sp [...]
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">128</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;A&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SS&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">])</span>
+            <span class="n">T</span><span class="o">.</span><span class="n">reads</span><span class="p">(</span><span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="n">T</span><span class="o">.</span><span class="n">writes</span><span class="p">(</span><span class="n">A_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="n">A_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">if_then_else</span><span class="p">(</span>
+                <span class="n">j</span> <span class="o">&lt;</span> <span class="mi">127</span><span class="p">,</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">],</span> <span class="n">T</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="s2">&q [...]
+            <span class="p">)</span>
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">128</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;B&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SS&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">])</span>
+            <span class="n">T</span><span class="o">.</span><span class="n">reads</span><span class="p">(</span><span class="n">B</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="n">T</span><span class="o">.</span><span class="n">writes</span><span class="p">(</span><span class="n">B_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="n">B_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">if_then_else</span><span class="p">(</span>
+                <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">127</span> <span class="ow">and</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="mi">127</span><span class="p">,</span> <span class="n">B</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">],</span> <span class="n">T</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class=" [...]
+            <span class="p">)</span>
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">,</span> <span class="n">i2</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">128</span><span class="p">,</span> <span class="mi">128</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;C_shared&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SSR&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">,< [...]
+            <span class="n">T</span><span class="o">.</span><span class="n">reads</span><span class="p">(</span><span class="n">A_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">k</span><span class="p">],</span> <span class="n">B_shared_padded</span><span class="p">[</span><span class="n">k</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="n">T</span><span class="o">.</span><span class="n">writes</span><span class="p">(</span><span class="n">C_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">init</span><span class="p">():</span>
+                <span class="n">C_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
+            <span class="n">C_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span>
+                <span class="n">C_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">+</span> <span class="n">A_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">k</span><span class="p">]</span> <span class="o">*</span> <span class="n">B_shared_padded</span><span class="p">[</span><span class="n">k</span><span class="p">,</s [...]
+            <span class="p">)</span>
+    <span class="k">for</span> <span class="n">i0</span><span class="p">,</span> <span class="n">i1</span> <span class="ow">in</span> <span class="n">T</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">127</span><span class="p">):</span>
+        <span class="k">with</span> <span class="n">T</span><span class="o">.</span><span class="n">block</span><span class="p">(</span><span class="s2">&quot;C&quot;</span><span class="p">):</span>
+            <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="n">T</span><span class="o">.</span><span class="n">axis</span><span class="o">.</span><span class="n">remap</span><span class="p">(</span><span class="s2">&quot;SS&quot;</span><span class="p">,</span> <span class="p">[</span><span class="n">i0</span><span class="p">,</span> <span class="n">i1</span><span class="p">])</span>
+            <span class="n">T</span><span class="o">.</span><span class="n">reads</span><span class="p">(</span><span class="n">C_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="n">T</span><span class="o">.</span><span class="n">writes</span><span class="p">(</span><span class="n">C</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">])</span>
+            <span class="n">C</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">C_shared_padded</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span>
+</pre></div>
+</div>
+</dd></dl>
+
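The padded-matmul IR above can be mirrored in plain Python to check the decomposition numerically. This is a hedged sketch, not TVM API: `matmul_with_decomposed_padding` is a hypothetical helper that reproduces the four blocks from the printed IR (pad `A`, pad `B`, dense compute over the padded shapes, write-back of the valid region) for an m×(k-1) by (k-1)×n product, following the doc's 127→128 padding.

```python
def matmul_with_decomposed_padding(A, B):
    """Plain-Python illustration (not TVM API) of the pad-einsum decomposition.

    Pads A (m x k-1) and B (k-1 x n) with zeros to the full extent k,
    runs a dense matmul on the padded buffers, then copies back only the
    in-bounds region of C -- mirroring blocks "A", "B", "C_shared", "C"
    in the IR printed above. Assumes n <= k, as in the 128/127 example.
    """
    m, k1 = len(A), len(A[0])  # A: m x (k-1), e.g. 128 x 127
    n = len(B[0])              # B: (k-1) x n
    k = k1 + 1                 # padded extent, e.g. 128

    # Padding stages (blocks "A" / "B"): zero-fill out-of-bounds entries.
    A_pad = [[A[i][j] if j < k1 else 0.0 for j in range(k)] for i in range(m)]
    B_pad = [[B[i][j] if i < k1 and j < n else 0.0 for j in range(k)]
             for i in range(k)]

    # Dense compute over the padded shapes (block "C_shared").
    C_pad = [[sum(A_pad[i][r] * B_pad[r][j] for r in range(k))
              for j in range(k)]
             for i in range(m)]

    # Write-back of the valid region only (block "C").
    return [[C_pad[i][j] for j in range(n)] for i in range(m)]
```

Because the padded entries are all zero, the extra reduction step contributes nothing, so the result matches a direct matmul on the original shapes; the payoff in TVM is that the compute block now iterates over "nice" power-of-two extents.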
 <dl class="py method">
 <dt class="sig sig-object py" id="tvm.tir.Schedule.enter_postproc">
 <span class="sig-name descname"><span class="pre">enter_postproc</span></span><span class="sig-paren">(</span><span class="sig-paren">)</span> <span class="sig-return"><span class="sig-return-icon">&#x2192;</span> <span class="sig-return-typehint"><a class="reference external" href="https://docs.python.org/3/library/constants.html#None" title="(in Python v3.10)"><span class="pre">None</span></a></span></span><a class="headerlink" href="#tvm.tir.Schedule.enter_postproc" title="Permalink t [...]
diff --git a/docs/reference/api/typedoc/classes/bytestreamreader.html b/docs/reference/api/typedoc/classes/bytestreamreader.html
index 01c59912ac..cda84b9736 100644
--- a/docs/reference/api/typedoc/classes/bytestreamreader.html
+++ b/docs/reference/api/typedoc/classes/bytestreamreader.html
@@ -119,7 +119,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -141,7 +141,7 @@
 					<div class="tsd-signature tsd-kind-icon">bytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Uint8Array</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -151,7 +151,7 @@
 					<div class="tsd-signature tsd-kind-icon">offset<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L42">rpc_server.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L42">rpc_server.ts:42</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -168,7 +168,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L63">rpc_server.ts:63</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L63">rpc_server.ts:63</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">Uint8Array</span></h4>
@@ -185,7 +185,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L49">rpc_server.ts:49</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L49">rpc_server.ts:49</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -202,7 +202,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L57">rpc_server.ts:57</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L57">rpc_server.ts:57</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
diff --git a/docs/reference/api/typedoc/classes/cachedcallstack.html b/docs/reference/api/typedoc/classes/cachedcallstack.html
index 5f0a6a16b0..37507ddf45 100644
--- a/docs/reference/api/typedoc/classes/cachedcallstack.html
+++ b/docs/reference/api/typedoc/classes/cachedcallstack.html
@@ -144,7 +144,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L223">memory.ts:223</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L223">memory.ts:223</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -172,7 +172,7 @@
 					<div class="tsd-signature tsd-kind-icon">temp<wbr>Args<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><a href="../interfaces/disposable.html" class="tsd-signature-type">Disposable</a><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = []</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L208">memory.ts:208</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L208">memory.ts:208</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -194,7 +194,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L312">memory.ts:312</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L312">memory.ts:312</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -226,7 +226,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L284">memory.ts:284</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L284">memory.ts:284</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -262,7 +262,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L388">memory.ts:388</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L388">memory.ts:388</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -300,7 +300,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L376">memory.ts:376</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L376">memory.ts:376</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -340,7 +340,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L267">memory.ts:267</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L267">memory.ts:267</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -373,7 +373,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L243">memory.ts:243</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L243">memory.ts:243</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -390,7 +390,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L321">memory.ts:321</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L321">memory.ts:321</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -422,7 +422,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L252">memory.ts:252</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L252">memory.ts:252</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -444,7 +444,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L359">memory.ts:359</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L359">memory.ts:359</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -470,7 +470,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L342">memory.ts:342</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L342">memory.ts:342</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -496,7 +496,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L350">memory.ts:350</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L350">memory.ts:350</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -522,7 +522,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L326">memory.ts:326</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L326">memory.ts:326</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -548,7 +548,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L363">memory.ts:363</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L363">memory.ts:363</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -574,7 +574,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L346">memory.ts:346</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L346">memory.ts:346</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -600,7 +600,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L334">memory.ts:334</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L334">memory.ts:334</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
diff --git a/docs/reference/api/typedoc/classes/dldatatype.html b/docs/reference/api/typedoc/classes/dldatatype.html
index 904bef64fa..921e497006 100644
--- a/docs/reference/api/typedoc/classes/dldatatype.html
+++ b/docs/reference/api/typedoc/classes/dldatatype.html
@@ -119,7 +119,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L262">runtime.ts:262</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L262">runtime.ts:262</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -147,7 +147,7 @@
 					<div class="tsd-signature tsd-kind-icon">bits<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L260">runtime.ts:260</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L260">runtime.ts:260</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">code<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L258">runtime.ts:258</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L258">runtime.ts:258</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -177,7 +177,7 @@
 					<div class="tsd-signature tsd-kind-icon">lanes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L262">runtime.ts:262</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L262">runtime.ts:262</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -199,7 +199,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L279">runtime.ts:279</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L279">runtime.ts:279</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -216,7 +216,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L270">runtime.ts:270</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L270">runtime.ts:270</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">string</span></h4>
diff --git a/docs/reference/api/typedoc/classes/dldevice.html b/docs/reference/api/typedoc/classes/dldevice.html
index 466b260b24..469e054c1c 100644
--- a/docs/reference/api/typedoc/classes/dldevice.html
+++ b/docs/reference/api/typedoc/classes/dldevice.html
@@ -118,7 +118,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L202">runtime.ts:202</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L202">runtime.ts:202</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -146,7 +146,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<wbr>Id<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L200">runtime.ts:200</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L200">runtime.ts:200</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -161,7 +161,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L198">runtime.ts:198</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L198">runtime.ts:198</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -183,7 +183,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L223">runtime.ts:223</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L223">runtime.ts:223</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -205,7 +205,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L230">runtime.ts:230</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L230">runtime.ts:230</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">string</span></h4>
diff --git a/docs/reference/api/typedoc/classes/environment.html b/docs/reference/api/typedoc/classes/environment.html
index 76bc1d4653..c59134486f 100644
--- a/docs/reference/api/typedoc/classes/environment.html
+++ b/docs/reference/api/typedoc/classes/environment.html
@@ -125,7 +125,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/environment.ts#L86">environment.ts:86</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/environment.ts#L86">environment.ts:86</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -169,7 +169,7 @@
 					<aside class="tsd-sources">
 						<p>Implementation of <a href="../interfaces/libraryprovider.html">LibraryProvider</a>.<a href="../interfaces/libraryprovider.html#imports">imports</a></p>
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/environment.ts#L70">environment.ts:70</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/environment.ts#L70">environment.ts:70</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 					<div class="tsd-signature tsd-kind-icon">logger<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>msg<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/environment.ts#L69">environment.ts:69</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/environment.ts#L69">environment.ts:69</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -210,7 +210,7 @@
 					<div class="tsd-signature tsd-kind-icon">packedCFunc<wbr>Table<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">ctypes.FTVMWasmPackedCFunc</span><span class="tsd-signature-symbol"> | </span><span class="tsd-signature-type">undefined</span><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = [undefined,]</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/environment.ts#L78">environment.ts:78</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/environment.ts#L78">environment.ts:78</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -228,7 +228,7 @@
 					<div class="tsd-signature tsd-kind-icon">packedCFunc<wbr>Table<wbr>Free<wbr>Id<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = []</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/environment.ts#L84">environment.ts:84</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/environment.ts#L84">environment.ts:84</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -250,7 +250,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/environment.ts#L105">environment.ts:105</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/environment.ts#L105">environment.ts:105</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/ffilibrary.html b/docs/reference/api/typedoc/classes/ffilibrary.html
index 20f5a56446..04040d77e9 100644
--- a/docs/reference/api/typedoc/classes/ffilibrary.html
+++ b/docs/reference/api/typedoc/classes/ffilibrary.html
@@ -131,7 +131,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L49">runtime.ts:49</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L49">runtime.ts:49</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -156,7 +156,7 @@
 					<div class="tsd-signature tsd-kind-icon">exports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">Function</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L46">runtime.ts:46</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L46">runtime.ts:46</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -166,7 +166,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L45">runtime.ts:45</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L45">runtime.ts:45</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">wasm32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">boolean</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L44">runtime.ts:44</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L44">runtime.ts:44</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -186,7 +186,7 @@
 					<div class="tsd-signature tsd-kind-icon">webGPUContext<span class="tsd-signature-symbol">:</span> <a href="webgpucontext.html" class="tsd-signature-type">WebGPUContext</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L47">runtime.ts:47</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L47">runtime.ts:47</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -203,7 +203,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L76">runtime.ts:76</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L76">runtime.ts:76</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -226,7 +226,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L66">runtime.ts:66</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L66">runtime.ts:66</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -243,7 +243,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L84">runtime.ts:84</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L84">runtime.ts:84</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <a href="cachedcallstack.html" class="tsd-signature-type">CachedCallStack</a></h4>
@@ -260,7 +260,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L95">runtime.ts:95</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L95">runtime.ts:95</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -283,7 +283,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L72">runtime.ts:72</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L72">runtime.ts:72</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
diff --git a/docs/reference/api/typedoc/classes/graphexecutor.html b/docs/reference/api/typedoc/classes/graphexecutor.html
index c433aa060a..e3a386eef5 100644
--- a/docs/reference/api/typedoc/classes/graphexecutor.html
+++ b/docs/reference/api/typedoc/classes/graphexecutor.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L583">runtime.ts:583</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L583">runtime.ts:583</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">module<span class="tsd-signature-symbol">:</span> <a href="module.html" class="tsd-signature-type">Module</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L579">runtime.ts:579</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L579">runtime.ts:579</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L654">runtime.ts:654</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L654">runtime.ts:654</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -224,7 +224,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L597">runtime.ts:597</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L597">runtime.ts:597</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -241,7 +241,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L631">runtime.ts:631</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L631">runtime.ts:631</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -279,7 +279,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L644">runtime.ts:644</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L644">runtime.ts:644</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -310,7 +310,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L621">runtime.ts:621</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L621">runtime.ts:621</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -332,7 +332,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L609">runtime.ts:609</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L609">runtime.ts:609</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/instance.html b/docs/reference/api/typedoc/classes/instance.html
index 1084200d92..7c1223ce95 100644
--- a/docs/reference/api/typedoc/classes/instance.html
+++ b/docs/reference/api/typedoc/classes/instance.html
@@ -139,7 +139,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L692">runtime.ts:692</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L692">runtime.ts:692</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -202,7 +202,7 @@
 					<div class="tsd-signature tsd-kind-icon">exports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">Function</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L684">runtime.ts:684</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L684">runtime.ts:684</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -212,7 +212,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L683">runtime.ts:683</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L683">runtime.ts:683</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -229,7 +229,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L932">runtime.ts:932</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L932">runtime.ts:932</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -260,7 +260,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L994">runtime.ts:994</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L994">runtime.ts:994</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -303,7 +303,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L924">runtime.ts:924</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L924">runtime.ts:924</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -341,7 +341,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L732">runtime.ts:732</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L732">runtime.ts:732</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -358,7 +358,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L952">runtime.ts:952</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L952">runtime.ts:952</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -402,7 +402,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L816">runtime.ts:816</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L816">runtime.ts:816</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -434,7 +434,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L1033">runtime.ts:1033</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L1033">runtime.ts:1033</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -465,7 +465,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L846">runtime.ts:846</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L846">runtime.ts:846</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -497,7 +497,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L750">runtime.ts:750</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L750">runtime.ts:750</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -520,7 +520,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L1013">runtime.ts:1013</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L1013">runtime.ts:1013</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -568,7 +568,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L789">runtime.ts:789</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L789">runtime.ts:789</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -608,7 +608,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L914">runtime.ts:914</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L914">runtime.ts:914</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -646,7 +646,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L1145">runtime.ts:1145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L1145">runtime.ts:1145</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -698,7 +698,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L740">runtime.ts:740</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L740">runtime.ts:740</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -722,7 +722,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L868">runtime.ts:868</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L868">runtime.ts:868</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -754,7 +754,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L857">runtime.ts:857</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L857">runtime.ts:857</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -786,7 +786,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L940">runtime.ts:940</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L940">runtime.ts:940</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/memory.html b/docs/reference/api/typedoc/classes/memory.html
index f3842d5c4e..f2334c68f5 100644
--- a/docs/reference/api/typedoc/classes/memory.html
+++ b/docs/reference/api/typedoc/classes/memory.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L40">memory.ts:40</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L40">memory.ts:40</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -152,7 +152,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Memory</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L32">memory.ts:32</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L32">memory.ts:32</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">wasm32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">boolean</span><span class="tsd-signature-symbol"> = true</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L33">memory.ts:33</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L33">memory.ts:33</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L154">memory.ts:154</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L154">memory.ts:154</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -210,7 +210,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L90">memory.ts:90</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L90">memory.ts:90</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -233,7 +233,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L97">memory.ts:97</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L97">memory.ts:97</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -256,7 +256,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L74">memory.ts:74</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L74">memory.ts:74</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -279,7 +279,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L81">memory.ts:81</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L81">memory.ts:81</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -302,7 +302,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L104">memory.ts:104</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L104">memory.ts:104</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -325,7 +325,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L132">memory.ts:132</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L132">memory.ts:132</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -362,7 +362,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L145">memory.ts:145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L145">memory.ts:145</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -393,7 +393,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L60">memory.ts:60</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L60">memory.ts:60</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -416,7 +416,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L67">memory.ts:67</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L67">memory.ts:67</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -439,7 +439,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L53">memory.ts:53</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L53">memory.ts:53</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -462,7 +462,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L114">memory.ts:114</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L114">memory.ts:114</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -485,7 +485,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L124">memory.ts:124</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L124">memory.ts:124</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -502,7 +502,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/memory.ts#L175">memory.ts:175</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/memory.ts#L175">memory.ts:175</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/module.html b/docs/reference/api/typedoc/classes/module.html
index 16b299a4ab..5d1813f2f2 100644
--- a/docs/reference/api/typedoc/classes/module.html
+++ b/docs/reference/api/typedoc/classes/module.html
@@ -124,7 +124,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L504">runtime.ts:504</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L504">runtime.ts:504</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -170,7 +170,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L502">runtime.ts:502</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L502">runtime.ts:502</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -187,7 +187,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L516">runtime.ts:516</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L516">runtime.ts:516</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -204,7 +204,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L530">runtime.ts:530</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L530">runtime.ts:530</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -236,7 +236,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L561">runtime.ts:561</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L561">runtime.ts:561</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/ndarray.html b/docs/reference/api/typedoc/classes/ndarray.html
index 17382f6ed6..30c5e613ac 100644
--- a/docs/reference/api/typedoc/classes/ndarray.html
+++ b/docs/reference/api/typedoc/classes/ndarray.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L304">runtime.ts:304</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L304">runtime.ts:304</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -158,7 +158,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<span class="tsd-signature-symbol">:</span> <a href="dldevice.html" class="tsd-signature-type">DLDevice</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L297">runtime.ts:297</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L297">runtime.ts:297</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -173,7 +173,7 @@
 					<div class="tsd-signature tsd-kind-icon">dtype<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L293">runtime.ts:293</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L293">runtime.ts:293</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -188,7 +188,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L289">runtime.ts:289</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L289">runtime.ts:289</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -203,7 +203,7 @@
 					<div class="tsd-signature tsd-kind-icon">ndim<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L291">runtime.ts:291</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L291">runtime.ts:291</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -218,7 +218,7 @@
 					<div class="tsd-signature tsd-kind-icon">shape<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L295">runtime.ts:295</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L295">runtime.ts:295</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -240,7 +240,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L370">runtime.ts:370</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L370">runtime.ts:370</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -273,7 +273,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L414">runtime.ts:414</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L414">runtime.ts:414</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -305,7 +305,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L355">runtime.ts:355</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L355">runtime.ts:355</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -322,7 +322,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L474">runtime.ts:474</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L474">runtime.ts:474</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -346,7 +346,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L443">runtime.ts:443</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L443">runtime.ts:443</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/packedfunccell.html b/docs/reference/api/typedoc/classes/packedfunccell.html
index e8b11c288e..e0ce3a32e0 100644
--- a/docs/reference/api/typedoc/classes/packedfunccell.html
+++ b/docs/reference/api/typedoc/classes/packedfunccell.html
@@ -122,7 +122,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L158">runtime.ts:158</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L158">runtime.ts:158</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -147,7 +147,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L157">runtime.ts:157</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L157">runtime.ts:157</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -164,7 +164,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L165">runtime.ts:165</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L165">runtime.ts:165</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
diff --git a/docs/reference/api/typedoc/classes/rpcserver.html b/docs/reference/api/typedoc/classes/rpcserver.html
index 964ccadbe4..1585ef748f 100644
--- a/docs/reference/api/typedoc/classes/rpcserver.html
+++ b/docs/reference/api/typedoc/classes/rpcserver.html
@@ -115,7 +115,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L92">rpc_server.ts:92</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L92">rpc_server.ts:92</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">get<wbr>Imports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">unknown</span><span class="tsd-signat [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L82">rpc_server.ts:82</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L82">rpc_server.ts:82</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -201,7 +201,7 @@
 					<div class="tsd-signature tsd-kind-icon">key<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L78">rpc_server.ts:78</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L78">rpc_server.ts:78</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -211,7 +211,7 @@
 					<div class="tsd-signature tsd-kind-icon">logger<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>msg<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L81">rpc_server.ts:81</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L81">rpc_server.ts:81</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -242,7 +242,7 @@
 					<div class="tsd-signature tsd-kind-icon">socket<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">WebSocket</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L79">rpc_server.ts:79</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L79">rpc_server.ts:79</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -252,7 +252,7 @@
 					<div class="tsd-signature tsd-kind-icon">state<span class="tsd-signature-symbol">:</span> <a href="../enums/rpcserverstate.html" class="tsd-signature-type">RPCServerState</a><span class="tsd-signature-symbol"> = RPCServerState.InitHeader</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L80">rpc_server.ts:80</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L80">rpc_server.ts:80</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -262,7 +262,7 @@
 					<div class="tsd-signature tsd-kind-icon">url<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L77">rpc_server.ts:77</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L77">rpc_server.ts:77</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/classes/scalar.html b/docs/reference/api/typedoc/classes/scalar.html
index 1d460893f0..59c0dc5fac 100644
--- a/docs/reference/api/typedoc/classes/scalar.html
+++ b/docs/reference/api/typedoc/classes/scalar.html
@@ -112,7 +112,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L145">runtime.ts:145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L145">runtime.ts:145</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -137,7 +137,7 @@
 					<div class="tsd-signature tsd-kind-icon">dtype<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L145">runtime.ts:145</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L145">runtime.ts:145</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -152,7 +152,7 @@
 					<div class="tsd-signature tsd-kind-icon">value<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L143">runtime.ts:143</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L143">runtime.ts:143</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/webgpucontext.html b/docs/reference/api/typedoc/classes/webgpucontext.html
index 13b0aa3996..68ea253327 100644
--- a/docs/reference/api/typedoc/classes/webgpucontext.html
+++ b/docs/reference/api/typedoc/classes/webgpucontext.html
@@ -120,7 +120,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L57">webgpu.ts:57</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L57">webgpu.ts:57</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -145,7 +145,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">GPUDevice</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L50">webgpu.ts:50</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L50">webgpu.ts:50</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -155,7 +155,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L51">webgpu.ts:51</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L51">webgpu.ts:51</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -172,7 +172,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L84">webgpu.ts:84</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L84">webgpu.ts:84</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -209,7 +209,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L170">webgpu.ts:170</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L170">webgpu.ts:170</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -238,7 +238,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L67">webgpu.ts:67</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L67">webgpu.ts:67</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/enums/argtypecode.html b/docs/reference/api/typedoc/enums/argtypecode.html
index 82c674f468..0ccee92caa 100644
--- a/docs/reference/api/typedoc/enums/argtypecode.html
+++ b/docs/reference/api/typedoc/enums/argtypecode.html
@@ -106,7 +106,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLDevice<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 6</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L220">ctypes.ts:220</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L220">ctypes.ts:220</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -116,7 +116,7 @@
 					<div class="tsd-signature tsd-kind-icon">Float<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L216">ctypes.ts:216</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L216">ctypes.ts:216</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -126,7 +126,7 @@
 					<div class="tsd-signature tsd-kind-icon">Int<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L214">ctypes.ts:214</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L214">ctypes.ts:214</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -136,7 +136,7 @@
 					<div class="tsd-signature tsd-kind-icon">Null<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L218">ctypes.ts:218</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L218">ctypes.ts:218</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -146,7 +146,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMBytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 12</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L226">ctypes.ts:226</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L226">ctypes.ts:226</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -156,7 +156,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMDLTensor<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 7</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L221">ctypes.ts:221</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L221">ctypes.ts:221</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -166,7 +166,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMData<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 5</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L219">ctypes.ts:219</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L219">ctypes.ts:219</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMModule<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 9</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L223">ctypes.ts:223</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L223">ctypes.ts:223</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -186,7 +186,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMNDArray<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 13</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L227">ctypes.ts:227</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L227">ctypes.ts:227</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -196,7 +196,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMObject<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L222">ctypes.ts:222</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L222">ctypes.ts:222</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -206,7 +206,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMObjectRValue<wbr>Ref<wbr>Arg<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 14</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L228">ctypes.ts:228</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L228">ctypes.ts:228</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -216,7 +216,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMOpaque<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 3</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L217">ctypes.ts:217</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L217">ctypes.ts:217</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -226,7 +226,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMPacked<wbr>Func<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 10</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L224">ctypes.ts:224</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L224">ctypes.ts:224</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -236,7 +236,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 11</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L225">ctypes.ts:225</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L225">ctypes.ts:225</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -246,7 +246,7 @@
 					<div class="tsd-signature tsd-kind-icon">UInt<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L215">ctypes.ts:215</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L215">ctypes.ts:215</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/aynccallbackcode.html b/docs/reference/api/typedoc/enums/aynccallbackcode.html
index 5594f46dcf..1470468998 100644
--- a/docs/reference/api/typedoc/enums/aynccallbackcode.html
+++ b/docs/reference/api/typedoc/enums/aynccallbackcode.html
@@ -93,7 +93,7 @@
 					<div class="tsd-signature tsd-kind-icon">k<wbr>Exception<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 5</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L676">runtime.ts:676</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L676">runtime.ts:676</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -103,7 +103,7 @@
 					<div class="tsd-signature tsd-kind-icon">k<wbr>Return<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L675">runtime.ts:675</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L675">runtime.ts:675</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/dldatatypecode.html b/docs/reference/api/typedoc/enums/dldatatypecode.html
index 7685030780..5757501825 100644
--- a/docs/reference/api/typedoc/enums/dldatatypecode.html
+++ b/docs/reference/api/typedoc/enums/dldatatypecode.html
@@ -95,7 +95,7 @@
 					<div class="tsd-signature tsd-kind-icon">Float<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L242">runtime.ts:242</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L242">runtime.ts:242</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -105,7 +105,7 @@
 					<div class="tsd-signature tsd-kind-icon">Int<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L240">runtime.ts:240</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L240">runtime.ts:240</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -115,7 +115,7 @@
 					<div class="tsd-signature tsd-kind-icon">Opaque<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 3</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L243">runtime.ts:243</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L243">runtime.ts:243</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -125,7 +125,7 @@
 					<div class="tsd-signature tsd-kind-icon">UInt<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L241">runtime.ts:241</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L241">runtime.ts:241</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/rpcserverstate.html b/docs/reference/api/typedoc/enums/rpcserverstate.html
index 81c298794e..5af5d1c14b 100644
--- a/docs/reference/api/typedoc/enums/rpcserverstate.html
+++ b/docs/reference/api/typedoc/enums/rpcserverstate.html
@@ -90,7 +90,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Header<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L27">rpc_server.ts:27</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L27">rpc_server.ts:27</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -100,7 +100,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Header<wbr>Key<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L28">rpc_server.ts:28</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L28">rpc_server.ts:28</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -110,7 +110,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Server<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L29">rpc_server.ts:29</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L29">rpc_server.ts:29</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -120,7 +120,7 @@
 					<div class="tsd-signature tsd-kind-icon">Receive<wbr>Packet<wbr>Body<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L32">rpc_server.ts:32</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L32">rpc_server.ts:32</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -130,7 +130,7 @@
 					<div class="tsd-signature tsd-kind-icon">Receive<wbr>Packet<wbr>Header<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L31">rpc_server.ts:31</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L31">rpc_server.ts:31</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -140,7 +140,7 @@
 					<div class="tsd-signature tsd-kind-icon">Wait<wbr>For<wbr>Callback<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L30">rpc_server.ts:30</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L30">rpc_server.ts:30</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/sizeof.html b/docs/reference/api/typedoc/enums/sizeof.html
index 10efd670be..3412c8f1c2 100644
--- a/docs/reference/api/typedoc/enums/sizeof.html
+++ b/docs/reference/api/typedoc/enums/sizeof.html
@@ -100,7 +100,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLData<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = I32</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L206">ctypes.ts:206</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L206">ctypes.ts:206</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -110,7 +110,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLDevice<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = I32 + I32</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L207">ctypes.ts:207</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L207">ctypes.ts:207</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -120,7 +120,7 @@
 					<div class="tsd-signature tsd-kind-icon">F32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L203">ctypes.ts:203</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L203">ctypes.ts:203</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -130,7 +130,7 @@
 					<div class="tsd-signature tsd-kind-icon">F64<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L204">ctypes.ts:204</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L204">ctypes.ts:204</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -140,7 +140,7 @@
 					<div class="tsd-signature tsd-kind-icon">I32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L201">ctypes.ts:201</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L201">ctypes.ts:201</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -150,7 +150,7 @@
 					<div class="tsd-signature tsd-kind-icon">I64<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L202">ctypes.ts:202</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L202">ctypes.ts:202</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -160,7 +160,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMValue<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L205">ctypes.ts:205</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L205">ctypes.ts:205</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -170,7 +170,7 @@
 					<div class="tsd-signature tsd-kind-icon">U16<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L200">ctypes.ts:200</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L200">ctypes.ts:200</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -180,7 +180,7 @@
 					<div class="tsd-signature tsd-kind-icon">U8<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L199">ctypes.ts:199</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L199">ctypes.ts:199</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/index.html b/docs/reference/api/typedoc/index.html
index f482d46d2b..7a90bd39ff 100644
--- a/docs/reference/api/typedoc/index.html
+++ b/docs/reference/api/typedoc/index.html
@@ -174,7 +174,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Alloc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>shape<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, ndim<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, dtypeCode<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, dtypeBits<span class="tsd [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L112">ctypes.ts:112</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L112">ctypes.ts:112</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -238,7 +238,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>From<wbr>Bytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, data<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nbytes<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">num [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L128">ctypes.ts:128</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L128">ctypes.ts:128</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -282,7 +282,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>From<wbr>To<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>from<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, to<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, stream<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-sig [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L144">ctypes.ts:144</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L144">ctypes.ts:144</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -326,7 +326,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>ToBytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, data<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nbytes<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</sp [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L136">ctypes.ts:136</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L136">ctypes.ts:136</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -370,7 +370,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L121">ctypes.ts:121</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L121">ctypes.ts:121</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -406,7 +406,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMBackend<wbr>PackedCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>argValues<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, argCodes<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nargs<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number< [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L160">ctypes.ts:160</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L160">ctypes.ts:160</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -458,7 +458,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMCFunc<wbr>Set<wbr>Return<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>ret<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, value<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCode<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signa [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L77">ctypes.ts:77</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L77">ctypes.ts:77</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -506,7 +506,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMCb<wbr>Arg<wbr>ToReturn<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>value<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, code<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span c [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L83">ctypes.ts:83</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L83">ctypes.ts:83</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -545,7 +545,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Call<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>func<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, argValues<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCode<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-t [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L67">ctypes.ts:67</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L67">ctypes.ts:67</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -601,7 +601,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>func<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L57">ctypes.ts:57</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L57">ctypes.ts:57</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -637,7 +637,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Get<wbr>Global<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>name<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, out<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span cla [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L100">ctypes.ts:100</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L100">ctypes.ts:100</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -676,7 +676,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>List<wbr>Global<wbr>Names<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>outSize<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, outArray<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&g [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L88">ctypes.ts:88</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L88">ctypes.ts:88</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -715,7 +715,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Register<wbr>Global<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>name<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, f<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, override<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</spa [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L94">ctypes.ts:94</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L94">ctypes.ts:94</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -758,7 +758,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMGet<wbr>Last<wbr>Error<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L34">ctypes.ts:34</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L34">ctypes.ts:34</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -788,7 +788,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L52">ctypes.ts:52</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L52">ctypes.ts:52</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -824,7 +824,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Get<wbr>Function<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, funcName<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, queryImports<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">numbe [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L42">ctypes.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L42">ctypes.ts:42</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -872,7 +872,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Import<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, dep<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-si [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L48">ctypes.ts:48</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L48">ctypes.ts:48</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -912,7 +912,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMSynchronize<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>deviceType<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, deviceId<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, stream<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signatur [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L150">ctypes.ts:150</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L150">ctypes.ts:150</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -954,7 +954,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Alloc<wbr>Space<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>size<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L167">ctypes.ts:167</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L167">ctypes.ts:167</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -990,7 +990,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Free<wbr>Space<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>ptr<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L170">ctypes.ts:170</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L170">ctypes.ts:170</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1026,7 +1026,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Func<wbr>Create<wbr>FromCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>resource<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, out<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&g [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L187">ctypes.ts:187</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L187">ctypes.ts:187</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1066,7 +1066,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>PackedCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>args<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCodes<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nargs<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L179">ctypes.ts:179</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L179">ctypes.ts:179</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1118,7 +1118,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>PackedCFunc<wbr>Finalizer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>resourceHandle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L193">ctypes.ts:193</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L193">ctypes.ts:193</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1154,7 +1154,7 @@
 					<div class="tsd-signature tsd-kind-icon">GPUPointer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L25">webgpu.ts:25</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L25">webgpu.ts:25</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1169,7 +1169,7 @@
 					<div class="tsd-signature tsd-kind-icon">Packed<wbr>Func<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">...</span>args<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol"> &amp; </span><a href="interfaces/disp [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L36">runtime.ts:36</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L36">runtime.ts:36</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1184,7 +1184,7 @@
 					<div class="tsd-signature tsd-kind-icon">Pointer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L25">ctypes.ts:25</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L25">ctypes.ts:25</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1199,7 +1199,7 @@
 					<div class="tsd-signature tsd-kind-icon">Ptr<wbr>Offset<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/ctypes.ts#L28">ctypes.ts:28</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/ctypes.ts#L28">ctypes.ts:28</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1217,7 +1217,7 @@
 					<div class="tsd-signature tsd-kind-icon">RPC_<wbr>MAGIC<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">1045105</span><span class="tsd-signature-symbol"> = 1045105</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/rpc_server.ts#L36">rpc_server.ts:36</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/rpc_server.ts#L36">rpc_server.ts:36</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1239,7 +1239,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/support.ts#L25">support.ts:25</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/support.ts#L25">support.ts:25</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1271,7 +1271,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/support.ts#L39">support.ts:39</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/support.ts#L39">support.ts:39</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1300,7 +1300,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/support.ts#L52">support.ts:52</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/support.ts#L52">support.ts:52</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1337,7 +1337,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/compact.ts#L38">compact.ts:38</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/compact.ts#L38">compact.ts:38</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1368,7 +1368,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L30">webgpu.ts:30</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L30">webgpu.ts:30</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1390,7 +1390,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/environment.ts#L32">environment.ts:32</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/environment.ts#L32">environment.ts:32</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1421,7 +1421,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/compact.ts#L24">compact.ts:24</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/compact.ts#L24">compact.ts:24</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1443,7 +1443,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L1367">runtime.ts:1367</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L1367">runtime.ts:1367</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1508,7 +1508,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/support.ts#L62">support.ts:62</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/support.ts#L62">support.ts:62</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1530,7 +1530,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLData<wbr>Type<wbr>Code<wbr>ToStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L246">runtime.ts:246</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L246">runtime.ts:246</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1539,7 +1539,7 @@
 						<div class="tsd-signature tsd-kind-icon">0<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;int&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L247">runtime.ts:247</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L247">runtime.ts:247</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1549,7 +1549,7 @@
 						<div class="tsd-signature tsd-kind-icon">1<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;uint&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L248">runtime.ts:248</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L248">runtime.ts:248</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1559,7 +1559,7 @@
 						<div class="tsd-signature tsd-kind-icon">2<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;float&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L249">runtime.ts:249</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L249">runtime.ts:249</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1569,7 +1569,7 @@
 						<div class="tsd-signature tsd-kind-icon">3<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;handle&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L250">runtime.ts:250</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L250">runtime.ts:250</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1580,7 +1580,7 @@
 					<div class="tsd-signature tsd-kind-icon">Device<wbr>Enum<wbr>ToStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L175">runtime.ts:175</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L175">runtime.ts:175</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1589,7 +1589,7 @@
 						<div class="tsd-signature tsd-kind-icon">1<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;cpu&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L176">runtime.ts:176</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L176">runtime.ts:176</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1599,7 +1599,7 @@
 						<div class="tsd-signature tsd-kind-icon">15<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;webgpu&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L180">runtime.ts:180</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L180">runtime.ts:180</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1609,7 +1609,7 @@
 						<div class="tsd-signature tsd-kind-icon">2<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;cuda&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L177">runtime.ts:177</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L177">runtime.ts:177</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1619,7 +1619,7 @@
 						<div class="tsd-signature tsd-kind-icon">4<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;opencl&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L178">runtime.ts:178</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L178">runtime.ts:178</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1629,7 +1629,7 @@
 						<div class="tsd-signature tsd-kind-icon">8<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;metal&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L179">runtime.ts:179</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L179">runtime.ts:179</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1640,7 +1640,7 @@
 					<div class="tsd-signature tsd-kind-icon">Device<wbr>Str<wbr>ToEnum<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L183">runtime.ts:183</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L183">runtime.ts:183</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1649,7 +1649,7 @@
 						<div class="tsd-signature tsd-kind-icon">cl<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 4</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L186">runtime.ts:186</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L186">runtime.ts:186</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1659,7 +1659,7 @@
 						<div class="tsd-signature tsd-kind-icon">cpu<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 1</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L184">runtime.ts:184</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L184">runtime.ts:184</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1669,7 +1669,7 @@
 						<div class="tsd-signature tsd-kind-icon">cuda<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 2</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L185">runtime.ts:185</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L185">runtime.ts:185</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1679,7 +1679,7 @@
 						<div class="tsd-signature tsd-kind-icon">metal<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 8</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L189">runtime.ts:189</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L189">runtime.ts:189</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1689,7 +1689,7 @@
 						<div class="tsd-signature tsd-kind-icon">opencl<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 4</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L187">runtime.ts:187</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L187">runtime.ts:187</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1699,7 +1699,7 @@
 						<div class="tsd-signature tsd-kind-icon">vulkan<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 7</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L188">runtime.ts:188</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L188">runtime.ts:188</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1709,7 +1709,7 @@
 						<div class="tsd-signature tsd-kind-icon">webgpu<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 15</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/runtime.ts#L190">runtime.ts:190</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/runtime.ts#L190">runtime.ts:190</a></li>
 							</ul>
 						</aside>
 					</section>
diff --git a/docs/reference/api/typedoc/interfaces/disposable.html b/docs/reference/api/typedoc/interfaces/disposable.html
index 81e0e1d6b2..0343822af8 100644
--- a/docs/reference/api/typedoc/interfaces/disposable.html
+++ b/docs/reference/api/typedoc/interfaces/disposable.html
@@ -113,7 +113,7 @@
 					<div class="tsd-signature tsd-kind-icon">dispose<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/types.ts#L52">types.ts:52</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/types.ts#L52">types.ts:52</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/interfaces/functioninfo.html b/docs/reference/api/typedoc/interfaces/functioninfo.html
index b96b3de5ca..f225322c43 100644
--- a/docs/reference/api/typedoc/interfaces/functioninfo.html
+++ b/docs/reference/api/typedoc/interfaces/functioninfo.html
@@ -95,7 +95,7 @@
 					<div class="tsd-signature tsd-kind-icon">arg_<wbr>types<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L41">webgpu.ts:41</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L41">webgpu.ts:41</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -105,7 +105,7 @@
 					<div class="tsd-signature tsd-kind-icon">launch_<wbr>param_<wbr>tags<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L42">webgpu.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L42">webgpu.ts:42</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -115,7 +115,7 @@
 					<div class="tsd-signature tsd-kind-icon">name<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/webgpu.ts#L40">webgpu.ts:40</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/webgpu.ts#L40">webgpu.ts:40</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/interfaces/libraryprovider.html b/docs/reference/api/typedoc/interfaces/libraryprovider.html
index ad8ad6b57c..cc564af3d2 100644
--- a/docs/reference/api/typedoc/interfaces/libraryprovider.html
+++ b/docs/reference/api/typedoc/interfaces/libraryprovider.html
@@ -112,7 +112,7 @@
 					<div class="tsd-signature tsd-kind-icon">imports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/types.ts#L34">types.ts:34</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/types.ts#L34">types.ts:34</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -127,7 +127,7 @@
 					<div class="tsd-signature tsd-kind-icon">start<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>inst<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">Instance</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/397cf8781/web/src/types.ts#L39">types.ts:39</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/1f8b5dec2/web/src/types.ts#L39">types.ts:39</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/searchindex.js b/docs/searchindex.js
index e7ffc4e771..6b43dcc4c5 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({docnames:["arch/benchmark","arch/convert_layout","arch/debugger","arch/device_target_interactions","arch/frontend/tensorflow","arch/hybrid_script","arch/index","arch/inferbound","arch/introduction_to_module_serialization","arch/microtvm_design","arch/microtvm_project_api","arch/model_library_format","arch/pass_infra","arch/relay_intro","arch/relay_op_strategy","arch/runtime","arch/runtimes/vulkan","arch/security","arch/virtual_machine","contribute/ci","contribute/code_gu [...]
\ No newline at end of file
+Search.setIndex({docnames:["arch/benchmark","arch/convert_layout","arch/debugger","arch/device_target_interactions","arch/frontend/tensorflow","arch/hybrid_script","arch/index","arch/inferbound","arch/introduction_to_module_serialization","arch/microtvm_design","arch/microtvm_project_api","arch/model_library_format","arch/pass_infra","arch/relay_intro","arch/relay_op_strategy","arch/runtime","arch/runtimes/vulkan","arch/security","arch/virtual_machine","contribute/ci","contribute/code_gu [...]
\ No newline at end of file
diff --git a/docs/topic/vta/tutorials/autotvm/sg_execution_times.html b/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
index 1d7269ba04..688d8604fb 100644
--- a/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-autotvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:22.027</strong> total execution time for <strong>topic_vta_tutorials_autotvm</strong> files:</p>
+<p><strong>00:21.220</strong> total execution time for <strong>topic_vta_tutorials_autotvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 82%" />
@@ -336,7 +336,7 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_relay_vta.html#sphx-glr-topic-vta-tutorials-autotvm-tune-relay-vta-py"><span class="std std-ref">Auto-tuning a convolutional network on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_vta.py</span></code>)</p></td>
-<td><p>00:22.021</p></td>
+<td><p>00:21.213</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_alu_vta.html#sphx-glr-topic-vta-tutorials-autotvm-tune-alu-vta-py"><span class="std std-ref">Auto-tuning a ALU fused op on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_alu_vta.py</span></code>)</p></td>
diff --git a/docs/topic/vta/tutorials/frontend/deploy_classification.html b/docs/topic/vta/tutorials/frontend/deploy_classification.html
index 373f3fcb9e..cda37bc68a 100644
--- a/docs/topic/vta/tutorials/frontend/deploy_classification.html
+++ b/docs/topic/vta/tutorials/frontend/deploy_classification.html
@@ -571,7 +571,7 @@ and dense layer which will both be executed in fp32 on the CPU.</p></li>
   DeprecationWarning,
 /workspace/vta/tutorials/frontend/deploy_classification.py:213: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
   relay_prog, target=tvm.target.Target(target, host=env.target_host), params=params
-resnet18_v1 inference graph built in 22.82s!
+resnet18_v1 inference graph built in 22.74s!
 </pre></div>
 </div>
 </div>
diff --git a/docs/topic/vta/tutorials/frontend/deploy_detection.html b/docs/topic/vta/tutorials/frontend/deploy_detection.html
index 8b3d38bd99..74780374ae 100644
--- a/docs/topic/vta/tutorials/frontend/deploy_detection.html
+++ b/docs/topic/vta/tutorials/frontend/deploy_detection.html
@@ -589,7 +589,7 @@ and dense layer which will both be executed in fp32 on the CPU.</p></li>
   &quot;target_host parameter is going to be deprecated. &quot;
 /workspace/python/tvm/relay/build_module.py:348: DeprecationWarning: Please use input parameter mod (tvm.IRModule) instead of deprecated parameter mod (tvm.relay.function.Function)
   DeprecationWarning,
-yolov3-tiny inference graph built in 17.48s!
+yolov3-tiny inference graph built in 16.09s!
 </pre></div>
 </div>
 </div>
diff --git a/docs/topic/vta/tutorials/frontend/sg_execution_times.html b/docs/topic/vta/tutorials/frontend/sg_execution_times.html
index 634efce424..51141474bf 100644
--- a/docs/topic/vta/tutorials/frontend/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/frontend/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-frontend-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>01:34.267</strong> total execution time for <strong>topic_vta_tutorials_frontend</strong> files:</p>
+<p><strong>01:31.739</strong> total execution time for <strong>topic_vta_tutorials_frontend</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_detection.html#sphx-glr-topic-vta-tutorials-frontend-deploy-detection-py"><span class="std std-ref">Deploy Pretrained Vision Detection Model from Darknet on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_detection.py</span></code>)</p></td>
-<td><p>00:50.368</p></td>
+<td><p>00:48.537</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_classification.html#sphx-glr-topic-vta-tutorials-frontend-deploy-classification-py"><span class="std std-ref">Deploy Pretrained Vision Model from MxNet on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_classification.py</span></code>)</p></td>
-<td><p>00:43.899</p></td>
+<td><p>00:43.203</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/topic/vta/tutorials/optimize/sg_execution_times.html b/docs/topic/vta/tutorials/optimize/sg_execution_times.html
index 175a1d69e9..198fc6c10a 100644
--- a/docs/topic/vta/tutorials/optimize/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/optimize/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-optimize-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:03.331</strong> total execution time for <strong>topic_vta_tutorials_optimize</strong> files:</p>
+<p><strong>00:03.009</strong> total execution time for <strong>topic_vta_tutorials_optimize</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="convolution_opt.html#sphx-glr-topic-vta-tutorials-optimize-convolution-opt-py"><span class="std std-ref">2D Convolution Optimization</span></a> (<code class="docutils literal notranslate"><span class="pre">convolution_opt.py</span></code>)</p></td>
-<td><p>00:02.902</p></td>
+<td><p>00:02.598</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="matrix_multiply_opt.html#sphx-glr-topic-vta-tutorials-optimize-matrix-multiply-opt-py"><span class="std std-ref">Matrix Multiply Blocking</span></a> (<code class="docutils literal notranslate"><span class="pre">matrix_multiply_opt.py</span></code>)</p></td>
-<td><p>00:00.428</p></td>
+<td><p>00:00.411</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/topic/vta/tutorials/sg_execution_times.html b/docs/topic/vta/tutorials/sg_execution_times.html
index fe271ad7dd..a56c16f395 100644
--- a/docs/topic/vta/tutorials/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:00.757</strong> total execution time for <strong>topic_vta_tutorials</strong> files:</p>
+<p><strong>00:00.785</strong> total execution time for <strong>topic_vta_tutorials</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 81%" />
@@ -336,7 +336,7 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="matrix_multiply.html#sphx-glr-topic-vta-tutorials-matrix-multiply-py"><span class="std std-ref">Simple Matrix Multiply</span></a> (<code class="docutils literal notranslate"><span class="pre">matrix_multiply.py</span></code>)</p></td>
-<td><p>00:00.393</p></td>
+<td><p>00:00.421</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="vta_get_started.html#sphx-glr-topic-vta-tutorials-vta-get-started-py"><span class="std std-ref">Get Started with VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">vta_get_started.py</span></code>)</p></td>
diff --git a/docs/tutorial/auto_scheduler_matmul_x86.html b/docs/tutorial/auto_scheduler_matmul_x86.html
index b7e79d6618..d6432feffb 100644
--- a/docs/tutorial/auto_scheduler_matmul_x86.html
+++ b/docs/tutorial/auto_scheduler_matmul_x86.html
@@ -565,7 +565,7 @@ operator fusion.</p>
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 93.914 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 93.582 ms
 </pre></div>
 </div>
 </div>
@@ -640,7 +640,7 @@ automatically optimize a matrix multiplication, without the need to specify a
 search template.  It ends a series of examples that starts from the Tensor
 Expression (TE) language that demonstrates how TVM can optimize computational
 operations.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  19.446 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  18.746 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-auto-scheduler-matmul-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/eac4389b114db015e95cb3cdf8b86b83/auto_scheduler_matmul_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">auto_scheduler_matmul_x86.py</span></code></a></p>
diff --git a/docs/tutorial/autotvm_matmul_x86.html b/docs/tutorial/autotvm_matmul_x86.html
index 4797acc731..4f8768032b 100644
--- a/docs/tutorial/autotvm_matmul_x86.html
+++ b/docs/tutorial/autotvm_matmul_x86.html
@@ -669,16 +669,16 @@ reduce variance, we take 5 measurements and average them.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>waiting for device...
 device available
 Get devices for measurement successfully!
-No: 1   GFLOPS: 9.90/9.90       result: MeasureResult(costs=(0.027117801400000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.56846022605896, timestamp=1663245559.7403083) [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 256])],None,80
-No: 2   GFLOPS: 2.39/9.90       result: MeasureResult(costs=(0.11231959799999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.945366621017456, timestamp=1663245561.6962612) [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 8])],None,32
-No: 3   GFLOPS: 11.81/11.81     result: MeasureResult(costs=(0.0227336622,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5783097743988037, timestamp=1663245562.757294)        [(&#39;tile_y&#39;, [-1, 64]), (&#39;tile_x&#39;, [-1, 32])],None,56
-No: 4   GFLOPS: 1.85/11.81      result: MeasureResult(costs=(0.1450331794,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.4399099349975586, timestamp=1663245565.77341) [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 4])],None,20
-No: 5   GFLOPS: 3.68/11.81      result: MeasureResult(costs=(0.07295330219999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.314638614654541, timestamp=1663245567.7472565) [(&#39;tile_y&#39;, [-1, 256]), (&#39;tile_x&#39;, [-1, 16])],None,48
-No: 6   GFLOPS: 1.85/11.81      result: MeasureResult(costs=(0.1449238076,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.4757537841796875, timestamp=1663245570.2668052)       [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 4])],None,29
-No: 7   GFLOPS: 0.87/11.81      result: MeasureResult(costs=(0.30979644,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.078179597854614, timestamp=1663245575.387729)   [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 2])],None,19
-No: 8   GFLOPS: 10.59/11.81     result: MeasureResult(costs=(0.025346019200000004,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.559157133102417, timestamp=1663245575.956249) [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 64])],None,62
-No: 9   GFLOPS: 1.89/11.81      result: MeasureResult(costs=(0.1419734074,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.370332956314087, timestamp=1663245578.4465785)        [(&#39;tile_y&#39;, [-1, 2]), (&#39;tile_x&#39;, [-1, 2])],None,11
-No: 10  GFLOPS: 2.75/11.81      result: MeasureResult(costs=(0.0975077014,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6733081340789795, timestamp=1663245580.1720784)       [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 4])],None,22
+No: 1   GFLOPS: 9.84/9.84       result: MeasureResult(costs=(0.0272695492,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5697076320648193, timestamp=1663276737.7821794)       [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 256])],None,80
+No: 2   GFLOPS: 2.44/9.84       result: MeasureResult(costs=(0.1101957606,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.9129202365875244, timestamp=1663276739.7083054)       [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 8])],None,32
+No: 3   GFLOPS: 11.86/11.86     result: MeasureResult(costs=(0.022632376,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5607941150665283, timestamp=1663276740.7634134)        [(&#39;tile_y&#39;, [-1, 64]), (&#39;tile_x&#39;, [-1, 32])],None,56
+No: 4   GFLOPS: 1.55/11.86      result: MeasureResult(costs=(0.1728821068,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.8814404010772705, timestamp=1663276744.224555)        [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 4])],None,20
+No: 5   GFLOPS: 3.71/11.86      result: MeasureResult(costs=(0.0723810036,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2968084812164307, timestamp=1663276745.6454818)       [(&#39;tile_y&#39;, [-1, 256]), (&#39;tile_x&#39;, [-1, 16])],None,48
+No: 6   GFLOPS: 1.84/11.86      result: MeasureResult(costs=(0.14628462999999997,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.5020077228546143, timestamp=1663276748.1910057)        [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 4])],None,29
+No: 7   GFLOPS: 0.87/11.86      result: MeasureResult(costs=(0.3069285432,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.0363616943359375, timestamp=1663276753.7956307)       [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 2])],None,19
+No: 8   GFLOPS: 10.59/11.86     result: MeasureResult(costs=(0.025344378,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5466985702514648, timestamp=1663276754.3625996)        [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 64])],None,62
+No: 9   GFLOPS: 1.86/11.86      result: MeasureResult(costs=(0.14411985700000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.400439739227295, timestamp=1663276756.8816943) [(&#39;tile_y&#39;, [-1, 2]), (&#39;tile_x&#39;, [-1, 2])],None,11
+No: 10  GFLOPS: 2.67/11.86      result: MeasureResult(costs=(0.10050512380000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7148866653442383, timestamp=1663276758.6540103)        [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 4])],None,22
 </pre></div>
 </div>
 <p>With tuning completed, we can choose the configuration from the log file that
diff --git a/docs/tutorial/autotvm_relay_x86.html b/docs/tutorial/autotvm_relay_x86.html
index 628e8a6564..36d3e1d215 100644
--- a/docs/tutorial/autotvm_relay_x86.html
+++ b/docs/tutorial/autotvm_relay_x86.html
@@ -551,7 +551,7 @@ standard deviation.</p>
 <span class="nb">print</span><span class="p">(</span><a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">unoptimized</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>{&#39;mean&#39;: 513.3250088999557, &#39;median&#39;: 513.3628811498056, &#39;std&#39;: 1.4233655232003068}
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>{&#39;mean&#39;: 512.3894513099992, &#39;median&#39;: 512.8161163000016, &#39;std&#39;: 2.366645412308052}
 </pre></div>
 </div>
 </div>
@@ -706,178 +706,178 @@ depending on the specifics of the model and the target platform.</p>
   &quot;target_host parameter is going to be deprecated. &quot;
 
 [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  1/25]  Current/Best:   17.54/  17.54 GFLOPS | Progress: (4/20) | 6.53 s
-[Task  1/25]  Current/Best:    6.10/  17.54 GFLOPS | Progress: (8/20) | 9.61 s
-[Task  1/25]  Current/Best:   11.20/  22.19 GFLOPS | Progress: (12/20) | 12.13 s
-[Task  1/25]  Current/Best:   16.48/  22.20 GFLOPS | Progress: (16/20) | 13.85 s
-[Task  1/25]  Current/Best:   11.32/  23.24 GFLOPS | Progress: (20/20) | 15.67 s Done.
+[Task  1/25]  Current/Best:   17.29/  17.29 GFLOPS | Progress: (4/20) | 5.87 s
+[Task  1/25]  Current/Best:    6.11/  17.29 GFLOPS | Progress: (8/20) | 9.44 s
+[Task  1/25]  Current/Best:   11.21/  22.34 GFLOPS | Progress: (12/20) | 11.94 s
+[Task  1/25]  Current/Best:   16.54/  22.34 GFLOPS | Progress: (16/20) | 13.63 s
+[Task  1/25]  Current/Best:   11.26/  23.50 GFLOPS | Progress: (20/20) | 15.40 s Done.
 
 [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  2/25]  Current/Best:   12.19/  12.48 GFLOPS | Progress: (4/20) | 3.98 s
-[Task  2/25]  Current/Best:   12.43/  18.14 GFLOPS | Progress: (8/20) | 5.30 s
-[Task  2/25]  Current/Best:   21.14/  21.14 GFLOPS | Progress: (12/20) | 6.67 s
-[Task  2/25]  Current/Best:   10.67/  21.14 GFLOPS | Progress: (16/20) | 7.97 s
-[Task  2/25]  Current/Best:   18.04/  21.14 GFLOPS | Progress: (20/20) | 9.61 s Done.
+[Task  2/25]  Current/Best:   12.23/  12.25 GFLOPS | Progress: (4/20) | 3.86 s
+[Task  2/25]  Current/Best:   12.50/  17.93 GFLOPS | Progress: (8/20) | 5.16 s
+[Task  2/25]  Current/Best:   20.97/  20.97 GFLOPS | Progress: (12/20) | 6.47 s
+[Task  2/25]  Current/Best:   10.77/  20.97 GFLOPS | Progress: (16/20) | 7.74 s
+[Task  2/25]  Current/Best:   17.54/  20.97 GFLOPS | Progress: (20/20) | 9.33 s Done.
 
 [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  3/25]  Current/Best:    1.63/  10.17 GFLOPS | Progress: (4/20) | 5.92 s
-[Task  3/25]  Current/Best:   15.35/  16.86 GFLOPS | Progress: (8/20) | 7.91 s
-[Task  3/25]  Current/Best:   15.00/  16.86 GFLOPS | Progress: (12/20) | 9.66 s
-[Task  3/25]  Current/Best:    6.79/  23.31 GFLOPS | Progress: (16/20) | 11.68 s
-[Task  3/25]  Current/Best:   11.11/  23.31 GFLOPS | Progress: (20/20) | 16.34 s Done.
+[Task  3/25]  Current/Best:    1.63/  10.17 GFLOPS | Progress: (4/20) | 5.88 s
+[Task  3/25]  Current/Best:   15.38/  16.87 GFLOPS | Progress: (8/20) | 7.82 s
+[Task  3/25]  Current/Best:   15.07/  16.87 GFLOPS | Progress: (12/20) | 9.54 s
+[Task  3/25]  Current/Best:    6.78/  23.37 GFLOPS | Progress: (16/20) | 11.54 s
+[Task  3/25]  Current/Best:   11.09/  23.37 GFLOPS | Progress: (20/20) | 16.20 s Done.
 
 [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  4/25]  Current/Best:    8.91/  18.37 GFLOPS | Progress: (4/20) | 2.43 s
-[Task  4/25]  Current/Best:    6.58/  18.37 GFLOPS | Progress: (8/20) | 7.21 s
-[Task  4/25]  Current/Best:   19.19/  19.19 GFLOPS | Progress: (12/20) | 12.26 s
-[Task  4/25]  Current/Best:   16.35/  19.29 GFLOPS | Progress: (16/20) | 14.66 s
-[Task  4/25]  Current/Best:   12.94/  19.29 GFLOPS | Progress: (20/20) | 16.67 s Done.
+[Task  4/25]  Current/Best:    9.02/  18.47 GFLOPS | Progress: (4/20) | 2.42 s
+[Task  4/25]  Current/Best:    6.50/  18.47 GFLOPS | Progress: (8/20) | 7.16 s
+[Task  4/25]  Current/Best:   20.57/  20.57 GFLOPS | Progress: (12/20) | 12.13 s
+[Task  4/25]  Current/Best:   15.86/  20.57 GFLOPS | Progress: (16/20) | 14.49 s
+[Task  4/25]  Current/Best:   12.95/  20.57 GFLOPS | Progress: (20/20) | 16.46 s Done.
 
 [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  5/25]  Current/Best:    8.88/   9.80 GFLOPS | Progress: (4/20) | 2.70 s
-[Task  5/25]  Current/Best:   11.58/  11.58 GFLOPS | Progress: (8/20) | 4.82 s
-[Task  5/25]  Current/Best:   11.68/  17.75 GFLOPS | Progress: (12/20) | 8.06 s
-[Task  5/25]  Current/Best:   11.54/  21.99 GFLOPS | Progress: (16/20) | 9.49 s
-[Task  5/25]  Current/Best:   12.05/  21.99 GFLOPS | Progress: (20/20) | 11.45 s Done.
+[Task  5/25]  Current/Best:    9.05/   9.70 GFLOPS | Progress: (4/20) | 2.63 s
+[Task  5/25]  Current/Best:   11.57/  11.57 GFLOPS | Progress: (8/20) | 4.76 s
+[Task  5/25]  Current/Best:   11.66/  18.06 GFLOPS | Progress: (12/20) | 7.82 s
+[Task  5/25]  Current/Best:   11.52/  20.53 GFLOPS | Progress: (16/20) | 9.25 s
+[Task  5/25]  Current/Best:   12.08/  21.03 GFLOPS | Progress: (20/20) | 11.19 s Done.
 
 [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  6/25]  Current/Best:   12.02/  19.79 GFLOPS | Progress: (4/20) | 4.17 s
-[Task  6/25]  Current/Best:   18.86/  19.79 GFLOPS | Progress: (8/20) | 5.95 s
-[Task  6/25]  Current/Best:   13.18/  19.79 GFLOPS | Progress: (12/20) | 7.98 s
-[Task  6/25]  Current/Best:   18.91/  19.79 GFLOPS | Progress: (16/20) | 10.25 s
-[Task  6/25]  Current/Best:    3.71/  19.79 GFLOPS | Progress: (20/20) | 12.85 s Done.
+[Task  6/25]  Current/Best:   12.13/  19.89 GFLOPS | Progress: (4/20) | 4.15 s
+[Task  6/25]  Current/Best:   18.97/  19.89 GFLOPS | Progress: (8/20) | 5.93 s
+[Task  6/25]  Current/Best:   13.18/  19.89 GFLOPS | Progress: (12/20) | 7.96 s
+[Task  6/25]  Current/Best:   19.30/  19.89 GFLOPS | Progress: (16/20) | 10.25 s
+[Task  6/25]  Current/Best:    3.72/  19.89 GFLOPS | Progress: (20/20) | 12.87 s Done.
 
 [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  7/25]  Current/Best:    9.71/  12.11 GFLOPS | Progress: (4/20) | 3.74 s
-[Task  7/25]  Current/Best:   19.62/  19.62 GFLOPS | Progress: (8/20) | 5.30 s
-[Task  7/25]  Current/Best:   15.96/  19.62 GFLOPS | Progress: (12/20) | 7.28 s
-[Task  7/25]  Current/Best:   12.11/  20.20 GFLOPS | Progress: (16/20) | 9.39 s
-[Task  7/25]  Current/Best:    6.17/  20.40 GFLOPS | Progress: (20/20) | 11.90 s Done.
+[Task  7/25]  Current/Best:    9.79/  12.12 GFLOPS | Progress: (4/20) | 3.70 s
+[Task  7/25]  Current/Best:   19.48/  20.06 GFLOPS | Progress: (8/20) | 5.25 s
+[Task  7/25]  Current/Best:   14.74/  20.06 GFLOPS | Progress: (12/20) | 7.20 s
+[Task  7/25]  Current/Best:   12.15/  20.12 GFLOPS | Progress: (16/20) | 9.29 s
+[Task  7/25]  Current/Best:    6.00/  20.57 GFLOPS | Progress: (20/20) | 11.81 s Done.
 
 [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  8/25]  Current/Best:    9.68/  13.45 GFLOPS | Progress: (4/20) | 3.00 s
-[Task  8/25]  Current/Best:    9.55/  13.45 GFLOPS | Progress: (8/20) | 8.18 s
-[Task  8/25]  Current/Best:   12.69/  13.45 GFLOPS | Progress: (12/20) | 14.80 s
-[Task  8/25]  Current/Best:   18.40/  18.40 GFLOPS | Progress: (16/20) | 16.94 s
-[Task  8/25]  Current/Best:   19.31/  19.31 GFLOPS | Progress: (20/20) | 24.12 s Done.
+[Task  8/25]  Current/Best:    9.70/  13.70 GFLOPS | Progress: (4/20) | 2.98 s
+[Task  8/25]  Current/Best:    9.30/  13.70 GFLOPS | Progress: (8/20) | 8.08 s
+[Task  8/25]  Current/Best:   12.73/  13.70 GFLOPS | Progress: (12/20) | 14.51 s
+[Task  8/25]  Current/Best:   18.86/  18.86 GFLOPS | Progress: (16/20) | 16.66 s
+[Task  8/25]  Current/Best:   18.52/  18.86 GFLOPS | Progress: (20/20) | 23.76 s Done.
 
 [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  9/25]  Current/Best:   14.31/  14.31 GFLOPS | Progress: (4/20) | 12.00 s
-[Task  9/25]  Current/Best:   21.88/  21.88 GFLOPS | Progress: (8/20) | 13.81 s
-[Task  9/25]  Current/Best:    8.01/  21.88 GFLOPS | Progress: (12/20) | 16.40 s
-[Task  9/25]  Current/Best:   17.87/  21.88 GFLOPS | Progress: (16/20) | 19.28 s
-[Task  9/25]  Current/Best:    9.09/  21.88 GFLOPS | Progress: (20/20) | 27.89 s
+[Task  9/25]  Current/Best:   14.35/  14.35 GFLOPS | Progress: (4/20) | 11.98 s
+[Task  9/25]  Current/Best:   21.67/  21.67 GFLOPS | Progress: (8/20) | 13.73 s
+[Task  9/25]  Current/Best:    8.03/  21.67 GFLOPS | Progress: (12/20) | 16.28 s
+[Task  9/25]  Current/Best:   17.87/  21.67 GFLOPS | Progress: (16/20) | 19.16 s
+[Task  9/25]  Current/Best:    9.09/  21.67 GFLOPS | Progress: (20/20) | 27.75 s
 [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 10/25]  Current/Best:   17.54/  17.54 GFLOPS | Progress: (4/20) | 2.66 s
-[Task 10/25]  Current/Best:   15.63/  17.54 GFLOPS | Progress: (8/20) | 4.36 s
-[Task 10/25]  Current/Best:   11.17/  18.52 GFLOPS | Progress: (12/20) | 5.95 s
-[Task 10/25]  Current/Best:   19.08/  20.07 GFLOPS | Progress: (16/20) | 7.08 s
-[Task 10/25]  Current/Best:    8.52/  20.07 GFLOPS | Progress: (20/20) | 8.65 s Done.
+[Task 10/25]  Current/Best:   18.18/  18.18 GFLOPS | Progress: (4/20) | 2.58 s
+[Task 10/25]  Current/Best:   15.58/  18.18 GFLOPS | Progress: (8/20) | 4.21 s
+[Task 10/25]  Current/Best:   11.38/  18.18 GFLOPS | Progress: (12/20) | 5.78 s
+[Task 10/25]  Current/Best:   19.06/  20.15 GFLOPS | Progress: (16/20) | 6.90 s
+[Task 10/25]  Current/Best:    8.39/  20.15 GFLOPS | Progress: (20/20) | 8.45 s Done.
 
 [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 11/25]  Current/Best:   10.89/  18.18 GFLOPS | Progress: (4/20) | 3.42 s
-[Task 11/25]  Current/Best:   14.83/  18.18 GFLOPS | Progress: (8/20) | 6.30 s
-[Task 11/25]  Current/Best:   15.88/  18.18 GFLOPS | Progress: (12/20) | 8.46 s
-[Task 11/25]  Current/Best:   11.88/  20.62 GFLOPS | Progress: (16/20) | 11.36 s
-[Task 11/25]  Current/Best:   18.56/  20.62 GFLOPS | Progress: (20/20) | 13.50 s Done.
+[Task 11/25]  Current/Best:   10.87/  18.10 GFLOPS | Progress: (4/20) | 3.46 s
+[Task 11/25]  Current/Best:   14.90/  18.10 GFLOPS | Progress: (8/20) | 6.29 s
+[Task 11/25]  Current/Best:   15.93/  18.10 GFLOPS | Progress: (12/20) | 8.43 s
+[Task 11/25]  Current/Best:   11.90/  20.66 GFLOPS | Progress: (16/20) | 11.31 s
+[Task 11/25]  Current/Best:   18.64/  20.66 GFLOPS | Progress: (20/20) | 13.45 s Done.
 
 [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 12/25]  Current/Best:    7.79/  17.83 GFLOPS | Progress: (4/20) | 5.82 s
-[Task 12/25]  Current/Best:    4.95/  17.83 GFLOPS | Progress: (8/20) | 9.77 s
-[Task 12/25]  Current/Best:   18.86/  18.86 GFLOPS | Progress: (12/20) | 11.82 s
-[Task 12/25]  Current/Best:   15.07/  18.86 GFLOPS | Progress: (16/20) | 14.84 s
-[Task 12/25]  Current/Best:   15.08/  18.86 GFLOPS | Progress: (20/20) | 16.82 s Done.
+[Task 12/25]  Current/Best:    7.80/  17.74 GFLOPS | Progress: (4/20) | 5.72 s
+[Task 12/25]  Current/Best:    4.88/  17.74 GFLOPS | Progress: (8/20) | 9.67 s
+[Task 12/25]  Current/Best:   18.86/  18.86 GFLOPS | Progress: (12/20) | 11.69 s
+[Task 12/25]  Current/Best:   15.28/  18.86 GFLOPS | Progress: (16/20) | 14.66 s
+[Task 12/25]  Current/Best:   14.87/  18.86 GFLOPS | Progress: (20/20) | 16.66 s Done.
 
 [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 13/25]  Current/Best:    8.65/  17.25 GFLOPS | Progress: (4/20) | 3.86 s
-[Task 13/25]  Current/Best:   15.56/  20.57 GFLOPS | Progress: (8/20) | 6.45 s
-[Task 13/25]  Current/Best:   18.68/  21.69 GFLOPS | Progress: (12/20) | 9.67 s
-[Task 13/25]  Current/Best:   12.26/  21.69 GFLOPS | Progress: (16/20) | 13.15 s
-[Task 13/25]  Current/Best:   17.52/  21.69 GFLOPS | Progress: (20/20) | 15.56 s Done.
+[Task 13/25]  Current/Best:    8.56/  17.29 GFLOPS | Progress: (4/20) | 3.82 s
+[Task 13/25]  Current/Best:   15.28/  20.85 GFLOPS | Progress: (8/20) | 6.41 s
+[Task 13/25]  Current/Best:   18.28/  21.18 GFLOPS | Progress: (12/20) | 9.52 s
+[Task 13/25]  Current/Best:   12.23/  21.18 GFLOPS | Progress: (16/20) | 12.93 s
+[Task 13/25]  Current/Best:   17.40/  21.18 GFLOPS | Progress: (20/20) | 15.35 s Done.
 
 [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 14/25]  Current/Best:   12.15/  13.38 GFLOPS | Progress: (4/20) | 3.53 s
-[Task 14/25]  Current/Best:    6.09/  13.38 GFLOPS | Progress: (8/20) | 5.75 s
-[Task 14/25]  Current/Best:   19.05/  19.18 GFLOPS | Progress: (12/20) | 8.48 s
-[Task 14/25]  Current/Best:   16.45/  19.18 GFLOPS | Progress: (16/20) | 10.14 s Done.
+[Task 14/25]  Current/Best:   12.16/  13.34 GFLOPS | Progress: (4/20) | 3.50 s
+[Task 14/25]  Current/Best:    6.10/  13.34 GFLOPS | Progress: (8/20) | 5.70 s
+[Task 14/25]  Current/Best:   18.72/  18.88 GFLOPS | Progress: (12/20) | 8.39 s
+[Task 14/25]  Current/Best:   16.09/  18.88 GFLOPS | Progress: (16/20) | 10.03 s Done.
 
-[Task 14/25]  Current/Best:   17.04/  19.18 GFLOPS | Progress: (20/20) | 11.93 s
+[Task 14/25]  Current/Best:   17.13/  18.88 GFLOPS | Progress: (20/20) | 11.80 s
 [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 15/25]  Current/Best:   14.90/  16.48 GFLOPS | Progress: (4/20) | 2.77 s
-[Task 15/25]  Current/Best:   12.73/  17.99 GFLOPS | Progress: (8/20) | 4.16 s
-[Task 15/25]  Current/Best:    9.87/  20.56 GFLOPS | Progress: (12/20) | 6.46 s
-[Task 15/25]  Current/Best:   20.27/  20.56 GFLOPS | Progress: (16/20) | 9.80 s
-[Task 15/25]  Current/Best:    9.52/  20.56 GFLOPS | Progress: (20/20) | 10.83 s
+[Task 15/25]  Current/Best:   15.22/  17.31 GFLOPS | Progress: (4/20) | 2.77 s
+[Task 15/25]  Current/Best:   12.46/  17.78 GFLOPS | Progress: (8/20) | 4.13 s
+[Task 15/25]  Current/Best:    9.71/  21.16 GFLOPS | Progress: (12/20) | 6.41 s
+[Task 15/25]  Current/Best:   20.35/  21.16 GFLOPS | Progress: (16/20) | 9.57 s
+[Task 15/25]  Current/Best:    9.53/  21.16 GFLOPS | Progress: (20/20) | 10.59 s
 [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 16/25]  Current/Best:   17.41/  17.41 GFLOPS | Progress: (4/20) | 3.12 s
-[Task 16/25]  Current/Best:    3.00/  17.41 GFLOPS | Progress: (8/20) | 4.74 s
-[Task 16/25]  Current/Best:   18.19/  19.30 GFLOPS | Progress: (12/20) | 5.98 s
-[Task 16/25]  Current/Best:   17.95/  19.30 GFLOPS | Progress: (16/20) | 7.38 s
-[Task 16/25]  Current/Best:    9.88/  21.29 GFLOPS | Progress: (20/20) | 9.54 s Done.
+[Task 16/25]  Current/Best:   18.12/  18.12 GFLOPS | Progress: (4/20) | 3.01 s
+[Task 16/25]  Current/Best:    3.03/  18.12 GFLOPS | Progress: (8/20) | 4.63 s
+[Task 16/25]  Current/Best:   17.07/  19.58 GFLOPS | Progress: (12/20) | 5.89 s
+[Task 16/25]  Current/Best:   17.21/  19.58 GFLOPS | Progress: (16/20) | 7.25 s
+[Task 16/25]  Current/Best:    9.80/  21.24 GFLOPS | Progress: (20/20) | 9.42 s Done.
 
 [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 17/25]  Current/Best:   11.96/  16.05 GFLOPS | Progress: (4/20) | 4.91 s
-[Task 17/25]  Current/Best:   13.15/  22.87 GFLOPS | Progress: (8/20) | 7.75 s
-[Task 17/25]  Current/Best:   16.50/  22.87 GFLOPS | Progress: (12/20) | 9.89 s
-[Task 17/25]  Current/Best:   16.44/  22.87 GFLOPS | Progress: (16/20) | 12.13 s
-[Task 17/25]  Current/Best:    9.99/  22.87 GFLOPS | Progress: (20/20) | 14.30 s Done.
+[Task 17/25]  Current/Best:   12.70/  16.16 GFLOPS | Progress: (4/20) | 4.84 s
+[Task 17/25]  Current/Best:   13.10/  22.99 GFLOPS | Progress: (8/20) | 7.64 s
+[Task 17/25]  Current/Best:   16.48/  22.99 GFLOPS | Progress: (12/20) | 9.75 s
+[Task 17/25]  Current/Best:   16.44/  22.99 GFLOPS | Progress: (16/20) | 12.00 s
+[Task 17/25]  Current/Best:   10.01/  22.99 GFLOPS | Progress: (20/20) | 14.17 s Done.
 
 [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 18/25]  Current/Best:   10.50/  16.63 GFLOPS | Progress: (4/20) | 3.89 s
-[Task 18/25]  Current/Best:   10.52/  19.07 GFLOPS | Progress: (8/20) | 7.66 s
-[Task 18/25]  Current/Best:   18.48/  19.07 GFLOPS | Progress: (12/20) | 9.63 s
-[Task 18/25]  Current/Best:   10.38/  19.07 GFLOPS | Progress: (16/20) | 13.58 s
-[Task 18/25]  Current/Best:   20.58/  20.58 GFLOPS | Progress: (20/20) | 15.13 s Done.
+[Task 18/25]  Current/Best:   10.95/  16.81 GFLOPS | Progress: (4/20) | 3.85 s
+[Task 18/25]  Current/Best:   10.52/  18.29 GFLOPS | Progress: (8/20) | 7.54 s
+[Task 18/25]  Current/Best:   18.20/  18.29 GFLOPS | Progress: (12/20) | 9.51 s
+[Task 18/25]  Current/Best:   10.21/  18.29 GFLOPS | Progress: (16/20) | 13.41 s
+[Task 18/25]  Current/Best:   20.54/  20.54 GFLOPS | Progress: (20/20) | 14.96 s Done.
 
 [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 19/25]  Current/Best:    7.27/  19.59 GFLOPS | Progress: (4/20) | 6.16 s
-[Task 19/25]  Current/Best:    2.69/  19.59 GFLOPS | Progress: (8/20) | 9.54 s
-[Task 19/25]  Current/Best:   18.53/  20.77 GFLOPS | Progress: (12/20) | 12.54 s
-[Task 19/25]  Current/Best:   12.76/  20.77 GFLOPS | Progress: (16/20) | 15.56 s
-[Task 19/25]  Current/Best:    2.69/  22.41 GFLOPS | Progress: (20/20) | 18.39 s Done.
+[Task 19/25]  Current/Best:    7.25/  19.87 GFLOPS | Progress: (4/20) | 6.15 s
+[Task 19/25]  Current/Best:    2.68/  19.87 GFLOPS | Progress: (8/20) | 9.54 s
+[Task 19/25]  Current/Best:   19.12/  20.97 GFLOPS | Progress: (12/20) | 12.52 s
+[Task 19/25]  Current/Best:   12.76/  20.97 GFLOPS | Progress: (16/20) | 15.50 s
+[Task 19/25]  Current/Best:    2.69/  22.48 GFLOPS | Progress: (20/20) | 18.34 s Done.
 
 [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 20/25]  Current/Best:    8.33/  14.90 GFLOPS | Progress: (4/20) | 3.40 s Done.
+[Task 20/25]  Current/Best:    9.58/  15.24 GFLOPS | Progress: (4/20) | 3.34 s Done.
  Done.
 
-[Task 20/25]  Current/Best:    9.76/  14.90 GFLOPS | Progress: (8/20) | 6.98 s
-[Task 20/25]  Current/Best:    2.30/  14.90 GFLOPS | Progress: (12/20) | 10.89 s
-[Task 20/25]  Current/Best:   11.42/  14.90 GFLOPS | Progress: (16/20) | 14.71 s
-[Task 20/25]  Current/Best:   11.19/  21.95 GFLOPS | Progress: (20/20) | 16.84 s
+[Task 20/25]  Current/Best:    9.96/  15.24 GFLOPS | Progress: (8/20) | 6.93 s
+[Task 20/25]  Current/Best:    2.32/  15.24 GFLOPS | Progress: (12/20) | 10.83 s
+[Task 20/25]  Current/Best:   11.19/  15.24 GFLOPS | Progress: (16/20) | 14.71 s
+[Task 20/25]  Current/Best:   11.41/  21.59 GFLOPS | Progress: (20/20) | 16.85 s
 [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 21/25]  Current/Best:    6.37/  17.71 GFLOPS | Progress: (4/20) | 3.33 s
-[Task 21/25]  Current/Best:   14.54/  17.71 GFLOPS | Progress: (8/20) | 4.97 s
-[Task 21/25]  Current/Best:    1.61/  17.71 GFLOPS | Progress: (12/20) | 7.15 s
-[Task 21/25]  Current/Best:   15.88/  17.71 GFLOPS | Progress: (16/20) | 10.73 s
-[Task 21/25]  Current/Best:    4.46/  17.71 GFLOPS | Progress: (20/20) | 18.07 s
+[Task 21/25]  Current/Best:    6.34/  17.73 GFLOPS | Progress: (4/20) | 3.29 s
+[Task 21/25]  Current/Best:   14.53/  17.73 GFLOPS | Progress: (8/20) | 4.91 s
+[Task 21/25]  Current/Best:    1.61/  17.73 GFLOPS | Progress: (12/20) | 7.05 s
+[Task 21/25]  Current/Best:   15.93/  17.73 GFLOPS | Progress: (16/20) | 10.58 s
+[Task 21/25]  Current/Best:    4.46/  17.73 GFLOPS | Progress: (20/20) | 17.89 s
 [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 22/25]  Current/Best:    2.70/  16.77 GFLOPS | Progress: (4/20) | 2.73 s
-[Task 22/25]  Current/Best:    8.70/  20.10 GFLOPS | Progress: (8/20) | 4.84 s
-[Task 22/25]  Current/Best:   19.86/  20.10 GFLOPS | Progress: (12/20) | 7.22 s
-[Task 22/25]  Current/Best:   15.22/  20.10 GFLOPS | Progress: (16/20) | 9.37 s
-[Task 22/25]  Current/Best:   12.27/  20.10 GFLOPS | Progress: (20/20) | 11.14 s Done.
+[Task 22/25]  Current/Best:    2.70/  16.20 GFLOPS | Progress: (4/20) | 2.71 s
+[Task 22/25]  Current/Best:    8.74/  20.49 GFLOPS | Progress: (8/20) | 4.70 s
+[Task 22/25]  Current/Best:   19.72/  20.49 GFLOPS | Progress: (12/20) | 7.08 s
+[Task 22/25]  Current/Best:   15.29/  20.49 GFLOPS | Progress: (16/20) | 9.21 s
+[Task 22/25]  Current/Best:   12.26/  20.49 GFLOPS | Progress: (20/20) | 10.94 s Done.
 
 [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 23/25]  Current/Best:   16.60/  19.90 GFLOPS | Progress: (4/20) | 3.31 s
-[Task 23/25]  Current/Best:   13.11/  19.90 GFLOPS | Progress: (8/20) | 6.84 s
-[Task 23/25]  Current/Best:   20.60/  21.61 GFLOPS | Progress: (12/20) | 8.69 s
-[Task 23/25]  Current/Best:    6.54/  21.61 GFLOPS | Progress: (16/20) | 15.65 s
-[Task 23/25]  Current/Best:    7.83/  21.61 GFLOPS | Progress: (20/20) | 19.89 s Done.
+[Task 23/25]  Current/Best:   16.64/  19.99 GFLOPS | Progress: (4/20) | 3.30 s
+[Task 23/25]  Current/Best:   13.61/  19.99 GFLOPS | Progress: (8/20) | 6.78 s
+[Task 23/25]  Current/Best:   20.63/  21.74 GFLOPS | Progress: (12/20) | 8.64 s
+[Task 23/25]  Current/Best:    6.56/  21.74 GFLOPS | Progress: (16/20) | 15.76 s
+[Task 23/25]  Current/Best:    7.65/  21.74 GFLOPS | Progress: (20/20) | 20.01 s Done.
 
 [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 24/25]  Current/Best:    8.27/   8.27 GFLOPS | Progress: (4/20) | 11.81 s
-[Task 24/25]  Current/Best:    2.02/   8.27 GFLOPS | Progress: (8/20) | 22.85 s
-[Task 24/25]  Current/Best:    3.61/   8.27 GFLOPS | Progress: (12/20) | 34.43 s Done.
+[Task 24/25]  Current/Best:    8.47/   8.47 GFLOPS | Progress: (4/20) | 11.80 s
+[Task 24/25]  Current/Best:    2.01/   8.47 GFLOPS | Progress: (8/20) | 22.88 s
+[Task 24/25]  Current/Best:    3.88/   8.47 GFLOPS | Progress: (12/20) | 34.43 s Done.
 
-[Task 24/25]  Current/Best:    5.62/   8.48 GFLOPS | Progress: (16/20) | 40.07 s
-[Task 24/25]  Current/Best:    2.99/   8.48 GFLOPS | Progress: (20/20) | 46.27 s Done.
+[Task 24/25]  Current/Best:    6.20/   8.89 GFLOPS | Progress: (16/20) | 40.04 s
+[Task 24/25]  Current/Best:    2.96/   8.89 GFLOPS | Progress: (20/20) | 46.06 s Done.
 
 [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 25/25]  Current/Best:    1.55/   2.73 GFLOPS | Progress: (4/20) | 11.64 s
-[Task 25/25]  Current/Best:    5.90/   7.80 GFLOPS | Progress: (8/20) | 22.96 s
-[Task 25/25]  Current/Best:    5.88/   7.80 GFLOPS | Progress: (12/20) | 34.47 s
-[Task 25/25]  Current/Best:    5.80/   7.92 GFLOPS | Progress: (16/20) | 36.35 s
-[Task 25/25]  Current/Best:    2.81/   8.61 GFLOPS | Progress: (20/20) | 47.03 s
+[Task 25/25]  Current/Best:    1.55/   2.77 GFLOPS | Progress: (4/20) | 11.60 s
+[Task 25/25]  Current/Best:    6.02/   8.29 GFLOPS | Progress: (8/20) | 22.89 s
+[Task 25/25]  Current/Best:    5.95/   8.29 GFLOPS | Progress: (12/20) | 34.38 s
+[Task 25/25]  Current/Best:    5.88/   8.45 GFLOPS | Progress: (16/20) | 36.18 s
+[Task 25/25]  Current/Best:    2.82/   9.18 GFLOPS | Progress: (20/20) | 46.85 s
 </pre></div>
 </div>
 <p>The output from this tuning process will look something like this:</p>
@@ -981,8 +981,8 @@ improvement in comparing the optimized model to the unoptimized model.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;unoptimized: </span><span class="si">%s</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="p">(</span><a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">unoptimized</span></a><span class="p">))</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>optimized: {&#39;mean&#39;: 411.55453831997875, &#39;median&#39;: 411.3484295499802, &#39;std&#39;: 0.87108492230167}
-unoptimized: {&#39;mean&#39;: 513.3250088999557, &#39;median&#39;: 513.3628811498056, &#39;std&#39;: 1.4233655232003068}
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>optimized: {&#39;mean&#39;: 414.2226953699969, &#39;median&#39;: 414.42295164999905, &#39;std&#39;: 1.1449126569482697}
+unoptimized: {&#39;mean&#39;: 512.3894513099992, &#39;median&#39;: 512.8161163000016, &#39;std&#39;: 2.366645412308052}
 </pre></div>
 </div>
 </div>
@@ -996,7 +996,7 @@ models.</p>
 <p>Here we presented a simple example using ResNet-50 v2 locally. However, TVM
 supports many more features including cross-compilation, remote execution and
 profiling/benchmarking.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 10 minutes  34.577 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 10 minutes  24.029 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-autotvm-relay-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/57a45d9bef1af358191e7d50043e652c/autotvm_relay_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">autotvm_relay_x86.py</span></code></a></p>
diff --git a/docs/tutorial/cross_compilation_and_rpc.html b/docs/tutorial/cross_compilation_and_rpc.html
index 7f7a457468..25f68ae9d1 100644
--- a/docs/tutorial/cross_compilation_and_rpc.html
+++ b/docs/tutorial/cross_compilation_and_rpc.html
@@ -527,7 +527,7 @@ device and returns the measured cost. Network overhead is excluded.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;</span><span class="si">%g</span><span class="s2"> secs/op&quot;</span> <span class="o">%</span> <span class="n">cost</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>1.441e-07 secs/op
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>1.263e-07 secs/op
 </pre></div>
 </div>
 </div>
diff --git a/docs/tutorial/intro_topi.html b/docs/tutorial/intro_topi.html
index 13b0842e37..9867106581 100644
--- a/docs/tutorial/intro_topi.html
+++ b/docs/tutorial/intro_topi.html
@@ -484,7 +484,7 @@ we can schedule the following series of operations ending with <code class="code
 <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/ir.html#tvm.ir.Array" title="tvm.ir.Array" class="sphx-glr-backref-module-tvm-ir sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">sg</span><span class="o">.</span><span class="n">stages</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>[stage(a, placeholder(a, 0x1ff323b0)), stage(b, placeholder(b, 0x22b91370)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[ [...]
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>[stage(a, placeholder(a, 0xcc33980)), stage(b, placeholder(b, 0x22681240)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[i [...]
 </pre></div>
 </div>
 <p>We can test the correctness by comparing with <code class="code docutils literal notranslate"><span class="pre">numpy</span></code> result as follows</p>
diff --git a/docs/tutorial/sg_execution_times.html b/docs/tutorial/sg_execution_times.html
index 0ff0c3a04c..903421b2fc 100644
--- a/docs/tutorial/sg_execution_times.html
+++ b/docs/tutorial/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-tutorial-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>13:54.846</strong> total execution time for <strong>tutorial</strong> files:</p>
+<p><strong>13:41.503</strong> total execution time for <strong>tutorial</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,35 +336,35 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="autotvm_relay_x86.html#sphx-glr-tutorial-autotvm-relay-x86-py"><span class="std std-ref">Compiling and Optimizing a Model with the Python Interface (AutoTVM)</span></a> (<code class="docutils literal notranslate"><span class="pre">autotvm_relay_x86.py</span></code>)</p></td>
-<td><p>10:34.577</p></td>
+<td><p>10:24.029</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="auto_scheduler_matmul_x86.html#sphx-glr-tutorial-auto-scheduler-matmul-x86-py"><span class="std std-ref">Optimizing Operators with Auto-scheduling</span></a> (<code class="docutils literal notranslate"><span class="pre">auto_scheduler_matmul_x86.py</span></code>)</p></td>
-<td><p>01:19.446</p></td>
+<td><p>01:18.746</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tensor_expr_get_started.html#sphx-glr-tutorial-tensor-expr-get-started-py"><span class="std std-ref">Working with Operators Using Tensor Expression</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_expr_get_started.py</span></code>)</p></td>
-<td><p>01:02.720</p></td>
+<td><p>01:01.192</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="relay_quick_start.html#sphx-glr-tutorial-relay-quick-start-py"><span class="std std-ref">Quick Start Tutorial for Compiling Deep Learning Models</span></a> (<code class="docutils literal notranslate"><span class="pre">relay_quick_start.py</span></code>)</p></td>
-<td><p>00:32.459</p></td>
+<td><p>00:31.110</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="autotvm_matmul_x86.html#sphx-glr-tutorial-autotvm-matmul-x86-py"><span class="std std-ref">Optimizing Operators with Schedule Templates and AutoTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">autotvm_matmul_x86.py</span></code>)</p></td>
-<td><p>00:23.992</p></td>
+<td><p>00:24.383</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tensor_ir_blitz_course.html#sphx-glr-tutorial-tensor-ir-blitz-course-py"><span class="std std-ref">Blitz Course to TensorIR</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_ir_blitz_course.py</span></code>)</p></td>
-<td><p>00:00.753</p></td>
+<td><p>00:01.187</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="intro_topi.html#sphx-glr-tutorial-intro-topi-py"><span class="std std-ref">Introduction to TOPI</span></a> (<code class="docutils literal notranslate"><span class="pre">intro_topi.py</span></code>)</p></td>
-<td><p>00:00.741</p></td>
+<td><p>00:00.699</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="cross_compilation_and_rpc.html#sphx-glr-tutorial-cross-compilation-and-rpc-py"><span class="std std-ref">Cross Compilation and RPC</span></a> (<code class="docutils literal notranslate"><span class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.149</p></td>
+<td><p>00:00.148</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="introduction.html#sphx-glr-tutorial-introduction-py"><span class="std std-ref">Introduction</span></a> (<code class="docutils literal notranslate"><span class="pre">introduction.py</span></code>)</p></td>
@@ -372,14 +372,14 @@
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="uma.html#sphx-glr-tutorial-uma-py"><span class="std std-ref">Making your Hardware Accelerator TVM-ready with UMA</span></a> (<code class="docutils literal notranslate"><span class="pre">uma.py</span></code>)</p></td>
-<td><p>00:00.001</p></td>
+<td><p>00:00.002</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="tvmc_python.html#sphx-glr-tutorial-tvmc-python-py"><span class="std std-ref">Getting Starting using TVMC Python: a high-level API for TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_python.py</span></code>)</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="tvmc_command_line_driver.html#sphx-glr-tutorial-tvmc-command-line-driver-py"><span class="std std-ref">Compiling and Optimizing a Model with TVMC</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_command_line_driver.py</span></code>)</p></td>
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="tvmc_command_line_driver.html#sphx-glr-tutorial-tvmc-command-line-driver-py"><span class="std std-ref">Compiling and Optimizing a Model with TVMC</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_command_line_driver.py</span></code>)</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="tvmc_python.html#sphx-glr-tutorial-tvmc-python-py"><span class="std std-ref">Getting Starting using TVMC Python: a high-level API for TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_python.py</span></code>)</p></td>
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
diff --git a/docs/tutorial/tensor_expr_get_started.html b/docs/tutorial/tensor_expr_get_started.html
index 220f546a30..d4cf00dbd9 100644
--- a/docs/tutorial/tensor_expr_get_started.html
+++ b/docs/tutorial/tensor_expr_get_started.html
@@ -542,7 +542,7 @@ helper function to run a profile of the TVM generated code.</p>
 <span class="n">evaluate_addition</span><span class="p">(</span><span class="n">fadd</span><span class="p">,</span> <a href="../reference/api/python/target.html#tvm.target.Target" title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">tgt</span></a><span class="p">,</span> <span class="s2">&quot;naive&quot;</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#list" ti [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.000008
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.000007
 naive: 0.000007
 </pre></div>
 </div>
@@ -594,7 +594,7 @@ compile and run this new schedule with the parallel operation applied:</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-parallel: 0.000007
+parallel: 0.000008
 </pre></div>
 </div>
 </div>
@@ -668,10 +668,10 @@ vector: 0.000025
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Operator                  Timing             Performance
-   numpy    7.968120007717516e-06                    1.0
-   naive    6.6864999999999994e-06    0.8391565379943821
-parallel              6.9711e-06      0.8748738715340817
-  vector             2.45637e-05      3.0827472447966207
+   numpy    7.033520000732097e-06                    1.0
+   naive              6.6893e-06      0.9510600665532666
+parallel              8.0827e-06       1.149168552752917
+  vector    2.5464300000000003e-05    3.6204205003112957
 </pre></div>
 </div>
 <div class="admonition-code-specialization admonition">
@@ -987,7 +987,7 @@ matrix multiplication.</p>
 <span class="n">answer</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">numpy</span><span class="p">(),</span> <span class="n">b</span><span class="o">.</span><span class="n">numpy</span><span class="p">())</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018261
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018294
 </pre></div>
 </div>
 <p>Now we write a basic matrix multiplication using TVM TE and verify that it
@@ -1030,7 +1030,7 @@ optimizations.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-none: 3.547660
+none: 3.442716
 </pre></div>
 </div>
 <p>Let’s take a look at the intermediate representation of the operator and
@@ -1097,7 +1097,7 @@ schedule.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-blocking: 0.299680
+blocking: 0.299036
 </pre></div>
 </div>
 <p>By reordering the computation to take advantage of caching, you should see a
@@ -1158,7 +1158,7 @@ already cache friendly from our previous optimizations.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-vectorization: 0.340141
+vectorization: 0.335199
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1215,7 +1215,7 @@ more cache friendly.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-loop permutation: 0.118493
+loop permutation: 0.115906
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1293,7 +1293,7 @@ optimized schedule.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-array packing: 0.109422
+array packing: 0.108423
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1369,7 +1369,7 @@ to `C</cite> when all the block results are ready.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-block caching: 0.111004
+block caching: 0.110404
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1438,7 +1438,7 @@ of thread-level parallelization.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-parallelization: 0.145424
+parallelization: 0.145882
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1500,13 +1500,13 @@ working, we can compare the results.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>        Operator                  Timing             Performance
-            none      3.5476597945000004                     1.0
-        blocking            0.2996795641     0.08447246395062981
-   vectorization     0.34014130740000004     0.09587765656879702
-loop permutation            0.1184933813     0.03340043526262084
-   array packing            0.1094217999    0.030843374573187247
-   block caching     0.11100439209999999     0.03128946926424345
- parallelization     0.14542441309999998    0.040991645626633655
+            none      3.4427163323000003                     1.0
+        blocking     0.29903623960000003     0.08686055159247487
+   vectorization     0.33519857109999995     0.09736456296300817
+loop permutation     0.11590583890000002     0.03366697331771333
+   array packing     0.10842323429999998     0.03149351379396539
+   block caching     0.11040388409999999     0.03206882979703462
+ parallelization     0.14588205399999998     0.04237411390282612
 </pre></div>
 </div>
 <p>Note that the outputs on the web page reflect the running times on a
@@ -1538,7 +1538,7 @@ is</p>
 you can build generic templates of the matrix multiplication and other
 operations with tunable parameters that allows you to automatically optimize
 the computation for specific platforms.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  2.720 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  1.192 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-tensor-expr-get-started-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/40a01cffb015a67aaec0fad7e27cf80d/tensor_expr_get_started.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tensor_expr_get_started.py</span></code></a></p>