Posted to commits@tvm.apache.org by tq...@apache.org on 2022/09/19 15:57:52 UTC

[tvm-site] branch asf-site updated: deploying docs (apache/tvm@2af9b90ec191424724842795c552d4c15682eb8c)

This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new f40d5504b5 deploying docs (apache/tvm@2af9b90ec191424724842795c552d4c15682eb8c)
f40d5504b5 is described below

commit f40d5504b5695cb368917baac4c974d1f187d66a
Author: tvm-bot <95...@users.noreply.github.com>
AuthorDate: Mon Sep 19 15:57:45 2022 +0000

    deploying docs (apache/tvm@2af9b90ec191424724842795c552d4c15682eb8c)
---
 .../how_to/compile_models/from_darknet.rst.txt     |   2 +-
 .../how_to/compile_models/from_keras.rst.txt       |   2 +-
 .../how_to/compile_models/from_mxnet.rst.txt       |   2 +-
 .../how_to/compile_models/from_oneflow.rst.txt     |   2 +-
 .../how_to/compile_models/from_pytorch.rst.txt     |   2 +-
 .../how_to/compile_models/from_tensorflow.rst.txt  |   2 +-
 .../compile_models/sg_execution_times.rst.txt      |  22 +-
 .../deploy_models/deploy_model_on_android.rst.txt  |   2 +-
 .../deploy_object_detection_pytorch.rst.txt        |   4 +-
 .../deploy_models/deploy_prequantized.rst.txt      |   6 +-
 .../deploy_prequantized_tflite.rst.txt             |   4 +-
 .../how_to/deploy_models/deploy_quantized.rst.txt  |   2 +-
 .../deploy_models/deploy_ssd_gluoncv.rst.txt       |   4 +-
 .../deploy_models/sg_execution_times.rst.txt       |  18 +-
 .../extend_tvm/bring_your_own_datatypes.rst.txt    |   2 +-
 .../how_to/extend_tvm/sg_execution_times.rst.txt   |  10 +-
 .../how_to/extend_tvm/use_pass_instrument.rst.txt  |  16 +-
 .../optimize_operators/opt_conv_cuda.rst.txt       |   2 +-
 .../optimize_operators/opt_conv_tensorcore.rst.txt |   2 +-
 .../how_to/optimize_operators/opt_gemm.rst.txt     |  16 +-
 .../optimize_operators/sg_execution_times.rst.txt  |   8 +-
 .../sg_execution_times.rst.txt                     |  14 +-
 .../tune_conv2d_layer_cuda.rst.txt                 |   4 +-
 .../tune_network_cuda.rst.txt                      |   2 +-
 .../tune_network_x86.rst.txt                       |   4 +-
 .../tune_sparse_x86.rst.txt                        | 112 +++++++--
 .../tune_with_autotvm/sg_execution_times.rst.txt   |   6 +-
 .../tune_with_autotvm/tune_conv2d_cuda.rst.txt     |  26 +-
 .../work_with_microtvm/micro_autotune.rst.txt      |  16 +-
 .../how_to/work_with_microtvm/micro_train.rst.txt  |  16 +-
 .../work_with_microtvm/sg_execution_times.rst.txt  |  10 +-
 .../work_with_relay/sg_execution_times.rst.txt     |  10 +-
 .../how_to/work_with_schedules/intrin_math.rst.txt |   2 +-
 .../work_with_schedules/sg_execution_times.rst.txt |  14 +-
 .../how_to/work_with_schedules/tensorize.rst.txt   |   2 +-
 .../tutorials/autotvm/sg_execution_times.rst.txt   |   4 +-
 .../frontend/deploy_classification.rst.txt         |   2 +-
 .../tutorials/frontend/deploy_detection.rst.txt    |   2 +-
 .../tutorials/frontend/sg_execution_times.rst.txt  |   6 +-
 .../tutorials/optimize/sg_execution_times.rst.txt  |   6 +-
 .../topic/vta/tutorials/sg_execution_times.rst.txt |   6 +-
 .../tutorial/auto_scheduler_matmul_x86.rst.txt     |   7 +-
 docs/_sources/tutorial/autotvm_matmul_x86.rst.txt  |  20 +-
 docs/_sources/tutorial/autotvm_relay_x86.rst.txt   |  58 ++---
 .../tutorial/cross_compilation_and_rpc.rst.txt     |   2 +-
 docs/_sources/tutorial/intro_topi.rst.txt          |   2 +-
 docs/_sources/tutorial/sg_execution_times.rst.txt  |  26 +-
 .../tutorial/tensor_expr_get_started.rst.txt       |  49 ++--
 docs/commit_hash                                   |   2 +-
 docs/how_to/compile_models/from_darknet.html       |   2 +-
 docs/how_to/compile_models/from_keras.html         |   2 +-
 docs/how_to/compile_models/from_mxnet.html         |   2 +-
 docs/how_to/compile_models/from_oneflow.html       |  14 +-
 docs/how_to/compile_models/from_pytorch.html       |   7 +-
 docs/how_to/compile_models/from_tensorflow.html    |   2 +-
 docs/how_to/compile_models/sg_execution_times.html |  22 +-
 .../deploy_models/deploy_model_on_android.html     |   2 +-
 .../deploy_object_detection_pytorch.html           |  20 +-
 docs/how_to/deploy_models/deploy_prequantized.html |   8 +-
 .../deploy_models/deploy_prequantized_tflite.html  |   4 +-
 docs/how_to/deploy_models/deploy_quantized.html    |   2 +-
 docs/how_to/deploy_models/deploy_ssd_gluoncv.html  |  39 +--
 docs/how_to/deploy_models/sg_execution_times.html  |  18 +-
 .../extend_tvm/bring_your_own_datatypes.html       |   2 +-
 docs/how_to/extend_tvm/sg_execution_times.html     |  10 +-
 docs/how_to/extend_tvm/use_pass_instrument.html    |  16 +-
 docs/how_to/optimize_operators/opt_conv_cuda.html  |   2 +-
 .../optimize_operators/opt_conv_tensorcore.html    |   2 +-
 docs/how_to/optimize_operators/opt_gemm.html       |  16 +-
 .../optimize_operators/sg_execution_times.html     |   8 +-
 .../sg_execution_times.html                        |  14 +-
 .../tune_conv2d_layer_cuda.html                    |   4 +-
 .../tune_with_autoscheduler/tune_network_cuda.html |   2 +-
 .../tune_with_autoscheduler/tune_network_x86.html  |   4 +-
 .../tune_with_autoscheduler/tune_sparse_x86.html   | 112 +++++++--
 .../tune_with_autotvm/sg_execution_times.html      |   6 +-
 .../how_to/tune_with_autotvm/tune_conv2d_cuda.html |  26 +-
 docs/how_to/work_with_microtvm/micro_autotune.html |  16 +-
 docs/how_to/work_with_microtvm/micro_train.html    |  16 +-
 .../work_with_microtvm/sg_execution_times.html     |  10 +-
 .../how_to/work_with_relay/sg_execution_times.html |  10 +-
 docs/how_to/work_with_schedules/intrin_math.html   |   2 +-
 .../work_with_schedules/sg_execution_times.html    |  14 +-
 docs/how_to/work_with_schedules/tensorize.html     |   2 +-
 .../classtvm_1_1tir_1_1ScheduleNode-members.html   |   2 +-
 .../doxygen/classtvm_1_1tir_1_1ScheduleNode.html   |  22 +-
 docs/reference/api/doxygen/database_8h_source.html |   2 +-
 docs/reference/api/doxygen/functions_func_t.html   |   2 +-
 docs/reference/api/doxygen/functions_t.html        |   2 +-
 .../api/doxygen/measure__candidate_8h_source.html  |   2 +-
 docs/reference/api/doxygen/postproc_8h_source.html |   2 +-
 .../api/doxygen/schedule__rule_8h_source.html      |   2 +-
 docs/reference/api/doxygen/search/all_15.js        |   2 +-
 docs/reference/api/doxygen/search/functions_14.js  |   2 +-
 .../doxygen/tir_2schedule_2schedule_8h_source.html |   4 +-
 docs/reference/api/doxygen/trace_8h_source.html    |   2 +-
 docs/reference/api/python/auto_scheduler.html      |   4 +-
 docs/reference/api/python/tir.html                 |  41 +++-
 .../api/typedoc/classes/bytestreamreader.html      |  12 +-
 .../api/typedoc/classes/cachedcallstack.html       |  34 +--
 docs/reference/api/typedoc/classes/dldatatype.html |  12 +-
 docs/reference/api/typedoc/classes/dldevice.html   |  10 +-
 .../reference/api/typedoc/classes/environment.html |  12 +-
 docs/reference/api/typedoc/classes/ffilibrary.html |  20 +-
 .../api/typedoc/classes/graphexecutor.html         |  16 +-
 docs/reference/api/typedoc/classes/instance.html   |  40 ++--
 docs/reference/api/typedoc/classes/memory.html     |  34 +--
 docs/reference/api/typedoc/classes/module.html     |  10 +-
 docs/reference/api/typedoc/classes/ndarray.html    |  22 +-
 .../api/typedoc/classes/packedfunccell.html        |   6 +-
 docs/reference/api/typedoc/classes/rpcserver.html  |  14 +-
 docs/reference/api/typedoc/classes/scalar.html     |   6 +-
 .../api/typedoc/classes/webgpucontext.html         |  12 +-
 docs/reference/api/typedoc/enums/argtypecode.html  |  30 +--
 .../api/typedoc/enums/aynccallbackcode.html        |   4 +-
 .../api/typedoc/enums/dldatatypecode.html          |   8 +-
 .../api/typedoc/enums/rpcserverstate.html          |  12 +-
 docs/reference/api/typedoc/enums/sizeof.html       |  18 +-
 docs/reference/api/typedoc/index.html              | 112 ++++-----
 .../api/typedoc/interfaces/disposable.html         |   2 +-
 .../api/typedoc/interfaces/functioninfo.html       |   6 +-
 .../api/typedoc/interfaces/libraryprovider.html    |   4 +-
 docs/searchindex.js                                |   2 +-
 .../vta/tutorials/autotvm/sg_execution_times.html  |   4 +-
 .../tutorials/frontend/deploy_classification.html  |   2 +-
 .../vta/tutorials/frontend/deploy_detection.html   |   2 +-
 .../vta/tutorials/frontend/sg_execution_times.html |   6 +-
 .../vta/tutorials/optimize/sg_execution_times.html |   6 +-
 docs/topic/vta/tutorials/sg_execution_times.html   |   6 +-
 docs/tutorial/auto_scheduler_matmul_x86.html       |   3 +-
 docs/tutorial/autotvm_matmul_x86.html              |  20 +-
 docs/tutorial/autotvm_relay_x86.html               | 262 ++++++++++-----------
 docs/tutorial/cross_compilation_and_rpc.html       |   2 +-
 docs/tutorial/intro_topi.html                      |   2 +-
 docs/tutorial/sg_execution_times.html              |  30 +--
 docs/tutorial/tensor_expr_get_started.html         |  45 ++--
 136 files changed, 1055 insertions(+), 882 deletions(-)

diff --git a/docs/_sources/how_to/compile_models/from_darknet.rst.txt b/docs/_sources/how_to/compile_models/from_darknet.rst.txt
index e94bc82ca0..d615cd5731 100644
--- a/docs/_sources/how_to/compile_models/from_darknet.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_darknet.rst.txt
@@ -315,7 +315,7 @@ The process is no different from other examples.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  3.949 seconds)
+   **Total running time of the script:** ( 1 minutes  1.628 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_darknet.py:
diff --git a/docs/_sources/how_to/compile_models/from_keras.rst.txt b/docs/_sources/how_to/compile_models/from_keras.rst.txt
index dcae0dcf6a..509c185f80 100644
--- a/docs/_sources/how_to/compile_models/from_keras.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_keras.rst.txt
@@ -228,7 +228,7 @@ Look up prediction top 1 index in 1000 class synset.
  .. code-block:: none
 
     Relay top-1 id: 285, class name: Egyptian cat
-
    1/1 [==============================] - ETA: 0s
    1/1 [==============================] - 1s 987ms/step
+
    1/1 [==============================] - ETA: 0s
    1/1 [==============================] - 1s 952ms/step
     Keras top-1 id: 285, class name: Egyptian cat
 
 
diff --git a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
index 99d8279733..5e6e0c6f9c 100644
--- a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
@@ -115,7 +115,7 @@ In this section, we download a pretrained imagenet model and classify an image.
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip308a65c9-68f3-4e7e-8c8d-ca72843ec134 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip2c209c9a-c05d-45eb-9c4b-0f8f26611a57 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
     x (1, 3, 224, 224)
 
 
diff --git a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
index dda24c93c0..c0b8305000 100644
--- a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
@@ -116,7 +116,7 @@ Load a pretrained OneFlow model and save model
  .. code-block:: none
 
     Downloading: "https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip" to /workspace/.oneflow/flowvision_cache/resnet18.zip
-
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
     19%|#9        | 7.99M/41.5M [00:00<00:00, 48.3MB/s]
     39%|###8      | 16.0M/41.5M [00:00<00:00, 52.5MB/s]
     54%|#####3    | 22.3M/41.5M [00:00<00:00, 48.1MB/s]
     65%|######4   | 26.9M/41.5M [00:00<00:00, 44.1MB/s]
     82%|########2 | 34.1M/41.5M [00:00<00:00, 45.2MB/s]
     96%|#########6| 40.0M/41.5M [00:00<00:00, 46.9MB/s]
    100%|##########| 41.5M/41.5M [00:00<00:00, 48.4MB/s]
+
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
     19%|#9        | 7.99M/41.5M [00:00<00:00, 41.1MB/s]
     39%|###8      | 16.0M/41.5M [00:00<00:00, 45.5MB/s]
     55%|#####4    | 22.6M/41.5M [00:00<00:00, 53.1MB/s]
     68%|######7   | 28.0M/41.5M [00:00<00:00, 50.6MB/s]
     80%|#######9  | 33.1M/41.5M [00:00<00:00, 48.0MB/s]
     92%|#########2| 38.3M/41.5M [00:00<00:00, 46.8MB/s]
    100%|##########| 41.5M/41.5M [00:00<00:00, 48.0MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
index 44f31bd7dd..da5549dc47 100644
--- a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
@@ -94,7 +94,7 @@ Load a pretrained PyTorch model
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
-
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
      8%|8         | 3.66M/44.7M [00:00<00:01, 38.1MB/s]
     17%|#7        | 7.65M/44.7M [00:00<00:00, 40.3MB/s]
     56%|#####6    | 25.2M/44.7M [00:00<00:00, 105MB/s] 
    100%|##########| 44.7M/44.7M [00:00<00:00, 123MB/s]
+
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
     43%|####3     | 19.3M/44.7M [00:00<00:00, 203MB/s]
    100%|##########| 44.7M/44.7M [00:00<00:00, 239MB/s]
    100%|##########| 44.7M/44.7M [00:00<00:00, 234MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
index 140e67a18e..cf9586735f 100644
--- a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
@@ -416,7 +416,7 @@ Run the corresponding model on tensorflow
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  8.460 seconds)
+   **Total running time of the script:** ( 1 minutes  4.903 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_tensorflow.py:
diff --git a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
index 63227f0e6a..cae5bfdfbe 100644
--- a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
@@ -5,26 +5,26 @@
 
 Computation times
 =================
-**05:18.194** total execution time for **how_to_compile_models** files:
+**05:05.089** total execution time for **how_to_compile_models** files:
 
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``) | 01:08.460 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``) | 01:04.903 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)       | 01:03.949 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)       | 01:01.628 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)         | 00:41.236 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)         | 00:38.624 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)       | 00:30.049 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)       | 00:28.011 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)           | 00:27.178 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)           | 00:26.726 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)         | 00:25.552 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)         | 00:24.764 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)         | 00:22.025 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)         | 00:21.644 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)       | 00:21.107 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)       | 00:20.024 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)           | 00:16.229 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)           | 00:16.379 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)             | 00:02.409 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)             | 00:02.386 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
index 7a2279f610..e46e0b57de 100644
--- a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
@@ -434,7 +434,7 @@ Execute on TVM
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      16.2342      16.0759      16.7901      15.7561       0.3800   
+      16.2713      15.7376      20.6259      15.6294       1.4588   
                
 
 
diff --git a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
index a202708cf8..72f26e6aa8 100644
--- a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
@@ -123,7 +123,7 @@ Load pre-trained maskrcnn from torchvision and do tracing
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth" to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
-
      0%|          | 0.00/170M [00:00<?, ?B/s]
      2%|2         | 3.91M/170M [00:00<00:04, 40.9MB/s]
      5%|4         | 7.84M/170M [00:00<00:04, 41.1MB/s]
     12%|#2        | 21.0M/170M [00:00<00:01, 85.3MB/s]
     25%|##4       | 42.2M/170M [00:00<00:00, 139MB/s] 
     39%|###8      | 65.4M/170M [00:00<00:00, 177MB/s]
     49%|####9     | 83.7M/170M [00:00<00:00, 182MB/s]
     60%|#####9    | 101M/170M [00:00<00:00, 181MB/s] 
     71%|#######1  | 121M/170M [00:00<00:00, 189MB/s]
     84%|########3 | 142M/170M [00:00<00:00, 200MB/s]
     99%|#########8| 168M/170M [00:01<00:00, 221MB/s]
    100%|##########| 170M/170M [00:01<00:00, 176MB/s]
+
      0%|          | 0.00/170M [00:00<?, ?B/s]
     11%|#1        | 18.9M/170M [00:00<00:00, 198MB/s]
     24%|##3       | 40.2M/170M [00:00<00:00, 213MB/s]
     39%|###9      | 66.6M/170M [00:00<00:00, 242MB/s]
     55%|#####5    | 93.8M/170M [00:00<00:00, 259MB/s]
     72%|#######1  | 122M/170M [00:00<00:00, 272MB/s] 
     89%|########8 | 151M/170M [00:00<00:00, 282MB/s]
    100%|##########| 170M/170M [00:00<00:00, 265MB/s]
     /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
       for i in range(dim)
     /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
@@ -288,7 +288,7 @@ Get boxes with score larger than 0.9
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  57.751 seconds)
+   **Total running time of the script:** ( 2 minutes  53.848 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_object_detection_pytorch.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
index c60e017e12..68e6e1a9e0 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
@@ -232,7 +232,7 @@ training. Other models require a full post training calibration.
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/mobilenet_v2-b0353104.pth" to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
-
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
     23%|##2       | 3.10M/13.6M [00:00<00:00, 32.5MB/s]
     46%|####5     | 6.20M/13.6M [00:00<00:00, 31.6MB/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 47.9MB/s]
+
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 169MB/s]
 
 
 
@@ -405,7 +405,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      90.2953      90.2327      91.4960      90.1105       0.1933   
+      90.2392      90.1272      93.5011      89.9589       0.4160   
                
 
 
@@ -454,7 +454,7 @@ TODO
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  9.340 seconds)
+   **Total running time of the script:** ( 1 minutes  7.858 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
index 08f1ab5beb..120fb0ef26 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
@@ -432,7 +432,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      119.9456     119.9569     120.4957     119.2269      0.2380   
+      118.1101     118.1413     123.6279     115.9274      1.0153   
                
 
 
@@ -469,7 +469,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  2.546 seconds)
+   **Total running time of the script:** ( 1 minutes  57.521 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized_tflite.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
index 2ef2409c78..6e2f157887 100644
--- a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
@@ -253,7 +253,7 @@ We create a Relay VM to build and execute the model.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  22.959 seconds)
+   **Total running time of the script:** ( 1 minutes  22.663 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_quantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
index 13efab9f8e..14261d2896 100644
--- a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
@@ -158,7 +158,7 @@ Convert and compile model for CPU.
             data: None
       input_sym_arg_type = in_param.infer_type()[0]
     Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
-
      0%|          | 0/132723 [00:00<?, ?KB/s]
      3%|2         | 3874/132723 [00:00<00:03, 38737.27KB/s]
      8%|8         | 11013/132723 [00:00<00:02, 57939.44KB/s]
     14%|#3        | 18488/132723 [00:00<00:01, 65611.77KB/s]
     20%|#9        | 26144/132723 [00:00<00:01, 69932.56KB/s]
     25%|##5       | 33748/132723 [00:00<00:01, 72133.18KB/s]
     31%|###1      | 41382/132723 [00:00<00:01, 73561.85KB/s]
     37%|###6      | 49013/132723 [00:00<00:01, 74458.29KB/s]
     43%|####2     | 56693/132723 [00:00<00:01, 75198.00KB/s]
     48%|####8     | 64345/132723 [00:00<00:00, 75607.94KB/s]
     54%|#####4    | 72058/132723 [00:01<00:00, 76075.29KB/s]
     60%|######    | 79783/132723 [00:01<00:00, 76432.57KB/s]
     66%|######5   | 87454/132723 [00:01<00:00, 76514.88KB/s]
     72%|#######1  | 95128/132723 [00:01<00:00, 76581.32KB/s]
     77%|#######7  | 102788/132723 [00:01<00:00, 76585.44KB/s]
     83%|########3 | 110490/132723 [00:01<00:00, 76713.47KB/s]
     89%|########9 | 118162/132723 [00:01<00:00, 76601.78KB/s]
     95%|#########4| 125887/132723 [00:01<00:00, 76781.28KB/s]
    100%|##########| 132723/132723 [00:01<00:00, 74125.95KB/s]
+
      0%|          | 0/132723 [00:00<?, ?KB/s]
      2%|1         | 2323/132723 [00:00<00:05, 23106.95KB/s]
      6%|5         | 7310/132723 [00:00<00:03, 38812.83KB/s]
     11%|#         | 14533/132723 [00:00<00:02, 54052.23KB/s]
     17%|#6        | 22158/132723 [00:00<00:01, 62802.53KB/s]
     22%|##2       | 29749/132723 [00:00<00:01, 67522.84KB/s]
     28%|##8       | 37313/132723 [00:00<00:01, 70278.47KB/s]
     34%|###3      | 44850/132723 [00:00<00:01, 71938.34KB/s]
     39%|###9      | 52415/132723 [00:00<00:01, 73118.08KB/s]
     45%|####5     | 59961/132723 [00:00<00:00, 73846.02KB/s]
     51%|#####     | 67377/132723 [00:01<00:00, 73938.93KB/s]
     56%|#####6    | 74911/132723 [00:01<00:00, 74365.90KB/s]
     62%|######2   | 82521/132723 [00:01<00:00, 74889.86KB/s]
     68%|######7   | 90022/132723 [00:01<00:00, 74924.27KB/s]
     74%|#######3  | 97657/132723 [00:01<00:00, 75347.24KB/s]
     79%|#######9  | 105203/132723 [00:01<00:00, 75373.61KB/s]
     85%|########4 | 112810/132723 [00:01<00:00, 75581.47KB/s]
     91%|######### | 120409/132723 [00:01<00:00, 75699.60KB/s]
     97%|#########6| 128105/132723 [00:01<00:00, 76072.69KB/s]
    100%|##########| 132723/132723 [00:01<00:00, 71348.18KB/s]
 
 
 
@@ -234,7 +234,7 @@ Display result
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  36.767 seconds)
+   **Total running time of the script:** ( 2 minutes  35.889 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_ssd_gluoncv.py:
diff --git a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
index 694ec25c21..30857abd73 100644
--- a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
@@ -5,24 +5,24 @@
 
 Computation times
 =================
-**11:24.857** total execution time for **how_to_deploy_models** files:
+**11:12.321** total execution time for **how_to_deploy_models** files:
 
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``) | 02:57.751 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``) | 02:53.848 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)                           | 02:36.767 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)                           | 02:35.889 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)           | 02:02.546 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)           | 01:57.521 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)                               | 01:22.959 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)                               | 01:22.663 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)                         | 01:09.340 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)                         | 01:07.858 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)                 | 00:30.067 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)                 | 00:29.641 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_nano.py` (``deploy_model_on_nano.py``)                       | 00:23.191 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_nano.py` (``deploy_model_on_nano.py``)                       | 00:22.665 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)                       | 00:22.229 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)                       | 00:22.230 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_deploy_models_deploy_sparse.py` (``deploy_sparse.py``)                                     | 00:00.006 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
index 4adb50beca..a64e9fb66d 100644
--- a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
@@ -472,7 +472,7 @@ First let us define two helper functions to get the mobilenet model and a cat im
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zipf85191a6-1797-488c-8310-b8b08200946a from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip81e92deb-670b-4325-98c0-cfac8ef7795d from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 
 
 
diff --git a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
index 6e40a2813e..42da86b106 100644
--- a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
@@ -5,14 +5,14 @@
 
 Computation times
 =================
-**00:41.127** total execution time for **how_to_extend_tvm** files:
+**00:40.365** total execution time for **how_to_extend_tvm** files:
 
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``) | 00:38.012 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``) | 00:37.296 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)           | 00:02.193 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)           | 00:02.148 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)                     | 00:00.914 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)                     | 00:00.913 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_low_level_custom_pass.py` (``low_level_custom_pass.py``)       | 00:00.008 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_low_level_custom_pass.py` (``low_level_custom_pass.py``)       | 00:00.007 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
index f97d3b79ea..7cb6f27830 100644
--- a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
@@ -216,10 +216,10 @@ profile the execution time of each passes.
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 6695us [6695us] (46.43%; 46.43%)
-    FoldScaleAxis: 7724us [6us] (53.57%; 53.57%)
-            FoldConstant: 7718us [1578us] (53.53%; 99.93%)
-                    InferType: 6140us [6140us] (42.59%; 79.56%)
+    InferType: 6699us [6699us] (45.83%; 45.83%)
+    FoldScaleAxis: 7917us [5us] (54.17%; 54.17%)
+            FoldConstant: 7912us [1637us] (54.13%; 99.94%)
+                    InferType: 6275us [6275us] (42.93%; 79.31%)
 
 
 
@@ -258,10 +258,10 @@ Refer to following sections and :py:func:`tvm.instrument.pass_instrument` for th
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 6151us [6151us] (44.81%; 44.81%)
-    FoldScaleAxis: 7576us [5us] (55.19%; 55.19%)
-            FoldConstant: 7571us [1519us] (55.16%; 99.94%)
-                    InferType: 6052us [6052us] (44.09%; 79.94%)
+    InferType: 6321us [6321us] (44.63%; 44.63%)
+    FoldScaleAxis: 7842us [5us] (55.37%; 55.37%)
+            FoldConstant: 7838us [1624us] (55.34%; 99.94%)
+                    InferType: 6214us [6214us] (43.87%; 79.28%)
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
index 6a839fe94a..558c81f216 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
@@ -340,7 +340,7 @@ latency of convolution.
 
  .. code-block:: none
 
-    Convolution: 54.149073 ms
+    Convolution: 33.803029 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
index 2431a80cbf..c7502fe3a6 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
@@ -671,7 +671,7 @@ be able to run on our build server
 
  .. code-block:: none
 
-    conv2d with tensor core: 7.165055 ms
+    conv2d with tensor core: 8.071042 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
index 5ca939f151..89e353e054 100644
--- a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
@@ -143,8 +143,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 
  .. code-block:: none
 
-    Numpy running time: 0.019315
-    Baseline: 3.299040
+    Numpy running time: 0.018574
+    Baseline: 3.547201
 
 
 
@@ -239,7 +239,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 
  .. code-block:: none
 
-    Opt1: 0.307622
+    Opt1: 0.294292
 
 
 
@@ -342,7 +342,7 @@ In this tutorial, we chose to vectorize the inner loop row data since it is cach
 
  .. code-block:: none
 
-    Opt2: 0.355002
+    Opt2: 0.329266
 
 
 
@@ -438,7 +438,7 @@ the access pattern for A matrix is more cache friendly.
 
  .. code-block:: none
 
-    Opt3: 0.118580
+    Opt3: 0.115266
 
 
 
@@ -563,7 +563,7 @@ flattening.
 
  .. code-block:: none
 
-    Opt4: 0.109346
+    Opt4: 0.109356
 
 
 
@@ -685,7 +685,7 @@ write to C when all the block results are ready.
 
  .. code-block:: none
 
-    Opt5: 0.110808
+    Opt5: 0.111867
 
 
 
@@ -810,7 +810,7 @@ Furthermore, we can also utilize multi-core processors to do the thread-level pa
 
  .. code-block:: none
 
-    Opt6: 0.146752
+    Opt6: 0.147408
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
index 4e48a3b477..bf3d56b6e5 100644
--- a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
@@ -5,12 +5,12 @@
 
 Computation times
 =================
-**00:34.531** total execution time for **how_to_optimize_operators** files:
+**00:34.777** total execution time for **how_to_optimize_operators** files:
 
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)                       | 00:32.219 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)                       | 00:32.513 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``) | 00:01.245 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``) | 00:01.244 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)             | 00:01.067 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)             | 00:01.019 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
index 248a524eab..cb75710401 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
@@ -5,18 +5,18 @@
 
 Computation times
 =================
-**06:14.329** total execution time for **how_to_tune_with_autoscheduler** files:
+**06:18.811** total execution time for **how_to_tune_with_autoscheduler** files:
 
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``) | 03:18.588 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``) | 03:23.028 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)             | 01:23.035 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)             | 01:22.364 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)           | 00:56.131 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)           | 00:56.087 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)               | 00:19.052 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)               | 00:20.042 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)           | 00:08.818 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)           | 00:08.729 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)             | 00:08.705 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)             | 00:08.562 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
index d37ba4e9e4..1afdaaf1df 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
@@ -771,7 +771,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 0.353 ms
+    Execution time of this operator: 0.351 ms
 
 
 
@@ -1378,7 +1378,7 @@ In the example below we resume the status and do more 5 trials.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 3 minutes  18.588 seconds)
+   **Total running time of the script:** ( 3 minutes  23.028 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
index 9d71e88b97..0c010f3554 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
@@ -643,7 +643,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       8.2317       8.2330       8.2351       8.2270       0.0034   
+       8.1813       8.1746       8.1988       8.1705       0.0125   
                
 
 
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
index 087a9670f5..7ad0a02ac4 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
@@ -662,7 +662,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      763.7009     763.8071     764.2583     763.0374      0.5041   
+      759.0404     759.5762     761.6077     755.9372      2.3458   
                
 
 
@@ -690,7 +690,7 @@ Other Tips
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  23.035 seconds)
+   **Total running time of the script:** ( 1 minutes  22.364 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_network_x86.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
index fda9de978b..63a02949df 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
@@ -397,31 +397,103 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
                  placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
                  compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
       buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-      preflattened_buffer_map = {placeholder_7: placeholder_15: Buffer(placeholder_12, int32, [4916], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_5: placeholder_17: Buffer(placeholder_10, float32, [128, 256], []), placeholder_9: placeholder_18: Buffer(placeholder_14, float32, [128, 512], []), placeholder_8: placeholder_19: Buffer(placeholder_13, int32, [33], [])} {
-      for (i0.outer: int32, 0, 8) "parallel" {
-        allocate(compute_4: Pointer(global float32), float32, [512]), storage_scope = global;
-        for (i1.outer: int32, 0, 16) {
-          for (nb_j.inner: int32, 0, 2) {
-            for (i.inner.init: int32, 0, 16) {
-              for (j.init: int32, 0, 16) {
-                compute_5: Buffer(compute_4, float32, [512], [])[(((i.inner.init*32) + (nb_j.inner*16)) + j.init)] = 0f32
+      preflattened_buffer_map = {placeholder_9: placeholder_15: Buffer(placeholder_14, float32, [128, 512], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_8: placeholder_16: Buffer(placeholder_13, int32, [33], []), placeholder_6: placeholder_17: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_7: placeholder_19: Buffer(placeholder_12, int32, [4916], [])} {
+      for (i0.outer.i1.outer.fused: int32, 0, 32) "parallel" {
+        allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global {
+          for (i.outer.inner: int32, 0, 2) {
+            for (i.inner.init: int32, 0, 64) {
+              let cse_var_1: int32 = ((i.outer.inner*1024) + (i.inner.init*16))
+               {
+                compute_5: Buffer(compute_4, float32, [2048], [])[cse_var_1] = 0f32
+                compute_5[(cse_var_1 + 1)] = 0f32
+                compute_5[(cse_var_1 + 2)] = 0f32
+                compute_5[(cse_var_1 + 3)] = 0f32
+                compute_5[(cse_var_1 + 4)] = 0f32
+                compute_5[(cse_var_1 + 5)] = 0f32
+                compute_5[(cse_var_1 + 6)] = 0f32
+                compute_5[(cse_var_1 + 7)] = 0f32
+                compute_5[(cse_var_1 + 8)] = 0f32
+                compute_5[(cse_var_1 + 9)] = 0f32
+                compute_5[(cse_var_1 + 10)] = 0f32
+                compute_5[(cse_var_1 + 11)] = 0f32
+                compute_5[(cse_var_1 + 12)] = 0f32
+                compute_5[(cse_var_1 + 13)] = 0f32
+                compute_5[(cse_var_1 + 14)] = 0f32
+                compute_5[(cse_var_1 + 15)] = 0f32
               }
             }
-            for (elem_idx: int32, 0, let cse_var_1: int32 = ((i1.outer*2) + nb_j.inner) in (placeholder_3[(cse_var_1 + 1)] - placeholder_3[cse_var_1])) {
-              for (i.inner: int32, 0, 16) {
-                for (j: int32, 0, 16) {
-                  let cse_var_3: int32 = ((i1.outer*2) + nb_j.inner)
-                  let cse_var_2: int32 = (((i.inner*32) + (nb_j.inner*16)) + j)
-                  compute_5[cse_var_2] = (compute_5[cse_var_2] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + j)]*max(placeholder[(((i0.outer*4096) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+            for (elem_idx: int32, 0, (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])) {
+              for (i.inner: int32, 0, 64) {
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_2: int32 = ((i.outer.inner*1024) + (i.inner*16))
+                  compute_5[cse_var_2] = (compute_5[cse_var_2] + (placeholder_1[((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16))]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_3: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 1)
+                  compute_5[cse_var_3] = (compute_5[cse_var_3] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 1)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_4: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 2)
+                  compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 2)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_5: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 3)
+                  compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 3)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_6: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 4)
+                  compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 4)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_7: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 5)
+                  compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 5)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_8: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 6)
+                  compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 6)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_9: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 7)
+                  compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 7)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_10: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 8)
+                  compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 8)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_11: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 9)
+                  compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 9)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_12: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 10)
+                  compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 10)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_13: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 11)
+                  compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 11)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_14: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 12)
+                  compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 12)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_15: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 13)
+                  compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 13)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_16: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 14)
+                  compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 14)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+                }
+                if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+                  let cse_var_17: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 15)
+                  compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 15)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
                 }
               }
             }
           }
-          for (i0.inner: int32, 0, 16) {
-            for (i1.inner: int32, 0, 32) {
-              let cse_var_4: int32 = ((((i0.outer*8192) + (i0.inner*512)) + (i1.outer*32)) + i1.inner)
-              compute[cse_var_4] = max((compute_5[((i0.inner*32) + i1.inner)] + placeholder_4[cse_var_4]), 0f32)
-            }
+          for (i0.inner: int32, 0, 128) {
+            let cse_var_18: int32 = ((i0.inner*512) + (i0.outer.i1.outer.fused*16))
+            compute[ramp(cse_var_18, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_18, 1, 16)]), broadcast(0f32, 16))
           }
         }
       }
@@ -477,7 +549,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 1.472 ms
+    Execution time of this operator: 1.847 ms
 
 
 
diff --git a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
index df683fa623..5ff7f43f75 100644
--- a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
@@ -5,12 +5,12 @@
 
 Computation times
 =================
-**00:46.722** total execution time for **how_to_tune_with_autotvm** files:
+**00:45.782** total execution time for **how_to_tune_with_autotvm** files:
 
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)           | 00:46.684 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)           | 00:45.748 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_x86.py` (``tune_relay_x86.py``)               | 00:00.023 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_x86.py` (``tune_relay_x86.py``)               | 00:00.019 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_cuda.py` (``tune_relay_cuda.py``)             | 00:00.005 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
index 4cf8ce3535..64717b7127 100644
--- a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
@@ -1156,8 +1156,8 @@ for this template
     TimeoutError
 
             [('tile_f', [-1, 2, 1, 64]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4909501
-    No: 9   GFLOPS: 122.34/122.34   result: MeasureResult(costs=(0.0018922917142857143,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.066124200820923, timestamp=1663570950.7812462)       [('tile_f', [-1, 1, 4, 8]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 2, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5072689
-    No: 10  GFLOPS: 0.00/122.34     result: Traceback (most recent call last):
+    No: 9   GFLOPS: 80.76/80.76     result: MeasureResult(costs=(0.002866573742857143,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6295883655548096, timestamp=1663598748.3520875)       [('tile_f', [-1, 1, 4, 8]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 2, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5072689
+    No: 10  GFLOPS: 0.00/80.76      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1280,8 +1280,8 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 4, 8]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 64, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5092711
-    No: 11  GFLOPS: 260.23/260.23   result: MeasureResult(costs=(0.0008895890939226519,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.72236967086792, timestamp=1663570951.7091606)        [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
-    No: 12  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+    No: 11  GFLOPS: 260.43/260.43   result: MeasureResult(costs=(0.0008889050165745856,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.740246057510376, timestamp=1663598749.2615347)       [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
+    No: 12  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1404,7 +1404,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 128, 1, 2]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 256]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,183542
-    No: 13  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+    No: 13  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1527,7 +1527,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 8, 8]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 64]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2482196
-    No: 14  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+    No: 14  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1650,9 +1650,9 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 64, 1, 4]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10306226
-    No: 15  GFLOPS: 5.47/260.23     result: MeasureResult(costs=(0.042334316250000004,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8582491874694824, timestamp=1663570956.3224638)       [('tile_f', [-1, 2, 2, 8]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 8]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,5330964
-    No: 16  GFLOPS: 3.35/260.23     result: MeasureResult(costs=(0.06910300125,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.610316753387451, timestamp=1663570957.5599132)       [('tile_f', [-1, 8, 4, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2140058
-    No: 17  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+    No: 15  GFLOPS: 5.48/260.43     result: MeasureResult(costs=(0.0422195405,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8663091659545898, timestamp=1663598753.808667)        [('tile_f', [-1, 2, 2, 8]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 8]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,5330964
+    No: 16  GFLOPS: 3.36/260.43     result: MeasureResult(costs=(0.06892734625,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.543661117553711, timestamp=1663598755.042103)        [('tile_f', [-1, 8, 4, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2140058
+    No: 17  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 142, in build
         res = future.result()
       File "/usr/lib/python3.7/concurrent/futures/_base.py", line 435, in result
@@ -1670,8 +1670,8 @@ for this template
     TimeoutError
 
             [('tile_f', [-1, 2, 2, 1]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 16]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10195251
-    No: 18  GFLOPS: 25.42/260.23    result: MeasureResult(costs=(0.009107807090909092,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2614367008209229, timestamp=1663570968.5666356)       [('tile_f', [-1, 4, 8, 4]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6068603
-    No: 19  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+    No: 18  GFLOPS: 26.32/260.43    result: MeasureResult(costs=(0.008796826,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.1562161445617676, timestamp=1663598765.9055436)        [('tile_f', [-1, 4, 8, 4]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6068603
+    No: 19  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1794,7 +1794,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 16, 4, 8]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 128]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6956993
-    No: 20  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+    No: 20  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1973,7 +1973,7 @@ and measure running time.
     Best config:
     [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
     Finish loading 20 records
-    Time cost of this operator: 0.001281
+    Time cost of this operator: 0.001219
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
index f1945586fe..345f031cb7 100644
--- a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
@@ -327,10 +327,10 @@ Timing the untuned program
     ########## Build without Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)  
     ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  310.7     98.73    (1, 2, 10, 10, 3)  2       1        [310.7]           
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.018     0.959    (1, 6, 10, 10)     1       1        [3.018]           
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.978     0.311    (1, 1, 10, 10, 3)  1       1        [0.978]           
-    Total_time                                    -                                             314.696   -        -                  -       -        -                 
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  311.5     98.726   (1, 2, 10, 10, 3)  2       1        [311.5]           
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.067     0.972    (1, 6, 10, 10)     1       1        [3.067]           
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.954     0.302    (1, 1, 10, 10, 3)  1       1        [0.954]           
+    Total_time                                    -                                             315.521   -        -                  -       -        -                 
 
 
 
@@ -394,10 +394,10 @@ Timing the tuned program
     ########## Build with Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)  
     ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  79.625    96.674   (1, 6, 10, 10, 1)  2       1        [79.625]          
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.767     2.146    (1, 6, 10, 10)     1       1        [1.767]           
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.972     1.18     (1, 1, 10, 10, 3)  1       1        [0.972]           
-    Total_time                                    -                                             82.365    -        -                  -       -        -                 
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  130.3     97.914   (1, 6, 10, 10, 1)  2       1        [130.3]           
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.805     1.356    (1, 6, 10, 10)     1       1        [1.805]           
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.971     0.73     (1, 1, 10, 10, 3)  1       1        [0.971]           
+    Total_time                                    -                                             133.076   -        -                  -       -        -                 
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt b/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
index 8ec6ea171f..eb87d29d5d 100644
--- a/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
@@ -225,7 +225,7 @@ take about **2 minutes** to download the Stanford Cars, while COCO 2017 validati
  .. code-block:: none
 
 
-    '/tmp/tmp3t0xlwvq/images/random'
+    '/tmp/tmp_74ha2pj/images/random'
 
 
 
@@ -325,8 +325,8 @@ objects to other stuff? We can display some examples from our datasets using ``m
 
  .. code-block:: none
 
-    /tmp/tmp3t0xlwvq/images/target contains 8144 images
-    /tmp/tmp3t0xlwvq/images/random contains 5000 images
+    /tmp/tmp_74ha2pj/images/target contains 8144 images
+    /tmp/tmp_74ha2pj/images/random contains 5000 images
 
 
 
@@ -501,13 +501,13 @@ the time on our validation set).
  .. code-block:: none
 
     Epoch 1/3
-    328/328 - 47s - loss: 0.2141 - accuracy: 0.9268 - val_loss: 0.1278 - val_accuracy: 0.9569 - 47s/epoch - 144ms/step
+    328/328 - 46s - loss: 0.2312 - accuracy: 0.9204 - val_loss: 0.1370 - val_accuracy: 0.9554 - 46s/epoch - 141ms/step
     Epoch 2/3
-    328/328 - 43s - loss: 0.0990 - accuracy: 0.9631 - val_loss: 0.1031 - val_accuracy: 0.9660 - 43s/epoch - 132ms/step
+    328/328 - 43s - loss: 0.0946 - accuracy: 0.9645 - val_loss: 0.1184 - val_accuracy: 0.9603 - 43s/epoch - 132ms/step
     Epoch 3/3
-    328/328 - 43s - loss: 0.0634 - accuracy: 0.9771 - val_loss: 0.1065 - val_accuracy: 0.9611 - 43s/epoch - 132ms/step
+    328/328 - 43s - loss: 0.0677 - accuracy: 0.9740 - val_loss: 0.1360 - val_accuracy: 0.9596 - 43s/epoch - 131ms/step
 
-    <keras.callbacks.History object at 0x7f082c7edb10>
+    <keras.callbacks.History object at 0x7fad68c22fd0>
 
 
 
@@ -864,7 +864,7 @@ Arduino tutorial for how to do that `on GitHub <https://github.com/guberti/tvm-a
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 4 minutes  41.436 seconds)
+   **Total running time of the script:** ( 4 minutes  20.966 seconds)
 
 
 .. _sphx_glr_download_how_to_work_with_microtvm_micro_train.py:
diff --git a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
index d7b32d7c85..4ca06e0fde 100644
--- a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
@@ -5,16 +5,16 @@
 
 Computation times
 =================
-**05:35.915** total execution time for **how_to_work_with_microtvm** files:
+**05:14.379** total execution time for **how_to_work_with_microtvm** files:
 
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_train.py` (``micro_train.py``)               | 04:41.436 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_train.py` (``micro_train.py``)               | 04:20.966 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)         | 00:42.436 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)         | 00:41.957 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_aot.py` (``micro_aot.py``)                   | 00:08.732 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_aot.py` (``micro_aot.py``)                   | 00:08.158 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)             | 00:03.309 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)             | 00:03.296 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_microtvm_micro_ethosu.py` (``micro_ethosu.py``)             | 00:00.001 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
index 017bcac33d..a8293b5177 100644
--- a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
@@ -5,14 +5,14 @@
 
 Computation times
 =================
-**00:46.379** total execution time for **how_to_work_with_relay** files:
+**00:43.074** total execution time for **how_to_work_with_relay** files:
 
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_using_pipeline_executor.py` (``using_pipeline_executor.py``) | 00:32.692 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_using_pipeline_executor.py` (``using_pipeline_executor.py``) | 00:31.385 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)           | 00:11.336 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)           | 00:10.046 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)                             | 00:02.345 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)                             | 00:01.637 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_using_relay_viz.py` (``using_relay_viz.py``)                 | 00:00.007 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_using_relay_viz.py` (``using_relay_viz.py``)                 | 00:00.006 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt b/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
index 88d3f412a0..e470d8e142 100644
--- a/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
@@ -261,7 +261,7 @@ The following example customizes CUDA lowering rule for :code:`exp`.
  .. code-block:: none
 
 
-    <function my_cuda_math_rule at 0x7f07c0901d40>
+    <function my_cuda_math_rule at 0x7fad0dbcb680>
 
 
 
diff --git a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
index d8890e4406..ec7d247d74 100644
--- a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
@@ -5,22 +5,22 @@
 
 Computation times
 =================
-**00:09.899** total execution time for **how_to_work_with_schedules** files:
+**00:07.969** total execution time for **how_to_work_with_schedules** files:
 
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)                 | 00:06.996 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)                 | 00:05.692 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)                     | 00:01.600 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)                     | 00:01.030 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)                     | 00:00.571 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)                     | 00:00.542 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)                               | 00:00.546 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)                               | 00:00.525 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_extern_op.py` (``extern_op.py``)                     | 00:00.100 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_extern_op.py` (``extern_op.py``)                     | 00:00.098 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_schedules_schedule_primitives.py` (``schedule_primitives.py``) | 00:00.042 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)                               | 00:00.029 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)                               | 00:00.027 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_schedules_tuple_inputs.py` (``tuple_inputs.py``)               | 00:00.015 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
index 251f47f368..e514f9d7af 100644
--- a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
@@ -347,7 +347,7 @@ The importing needs to happen before the tensorized GEMV being executed.
                  C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
       buffer_map = {A_1: A, B_1: B, C_1: C}
       preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmpiuk6tos9/input0.cc'\nsource_filename = \"/tmp/tmpiuk6tos9/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
+      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmpdz20gvwn/input0.cc'\nsource_filename = \"/tmp/tmpdz20gvwn/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
       for (i, 0, 1024) {
         for (j.outer: int32, 0, 32) {
           @tir.call_extern("gemv_update", @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
diff --git a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
index 235e10dff5..9807ec01b3 100644
--- a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:21.281** total execution time for **topic_vta_tutorials_autotvm** files:
+**00:21.146** total execution time for **topic_vta_tutorials_autotvm** files:
 
 +---------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``) | 00:21.275 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``) | 00:21.140 | 0.0 MB |
 +---------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_alu_vta.py` (``tune_alu_vta.py``)     | 00:00.006 | 0.0 MB |
 +---------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
index b2339845cd..c5a1a82eac 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
@@ -289,7 +289,7 @@ The compilation steps are:
       DeprecationWarning,
     /workspace/vta/tutorials/frontend/deploy_classification.py:213: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
       relay_prog, target=tvm.target.Target(target, host=env.target_host), params=params
-    resnet18_v1 inference graph built in 22.90s!
+    resnet18_v1 inference graph built in 22.51s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
index 04bf47003e..0cd1d993d0 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
@@ -333,7 +333,7 @@ The compilation steps are:
 
     /workspace/python/tvm/relay/build_module.py:348: DeprecationWarning: Please use input parameter mod (tvm.IRModule) instead of deprecated parameter mod (tvm.relay.function.Function)
       DeprecationWarning,
-    yolov3-tiny inference graph built in 15.89s!
+    yolov3-tiny inference graph built in 15.99s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
index cd44e6ff53..562cafe842 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**01:32.270** total execution time for **topic_vta_tutorials_frontend** files:
+**01:31.138** total execution time for **topic_vta_tutorials_frontend** files:
 
 +------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)           | 00:48.924 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)           | 00:48.460 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``) | 00:43.346 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``) | 00:42.678 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
index 33c98d7037..ce468abb68 100644
--- a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:03.030** total execution time for **topic_vta_tutorials_optimize** files:
+**00:03.009** total execution time for **topic_vta_tutorials_optimize** files:
 
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)         | 00:02.621 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)         | 00:02.613 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``) | 00:00.409 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``) | 00:00.396 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
index e6f4287a47..37c61f97bb 100644
--- a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:00.771** total execution time for **topic_vta_tutorials** files:
+**00:00.736** total execution time for **topic_vta_tutorials** files:
 
 +---------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``) | 00:00.413 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``) | 00:00.394 | 0.0 MB |
 +---------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_vta_get_started.py` (``vta_get_started.py``) | 00:00.359 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_vta_get_started.py` (``vta_get_started.py``) | 00:00.341 | 0.0 MB |
 +---------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
index 7d806f0449..c16bafac1d 100644
--- a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
+++ b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
@@ -326,7 +326,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 93.587 ms
+    Execution time of this operator: 93.459 ms
 
 
 
@@ -442,11 +442,6 @@ Expression (TE) language that demonstrates how TVM can optimize computational
 operations.
 
 
-.. rst-class:: sphx-glr-timing
-
-   **Total running time of the script:** ( 1 minutes  3.483 seconds)
-
-
 .. _sphx_glr_download_tutorial_auto_scheduler_matmul_x86.py:
 
 .. only:: html
diff --git a/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt b/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
index 8a0ebfedfd..533cb18f5f 100644
--- a/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
+++ b/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
@@ -462,16 +462,16 @@ reduce variance, we take 5 measurements and average them.
     waiting for device...
     device available
     Get devices for measurement successfully!
-    No: 1   GFLOPS: 9.80/9.80       result: MeasureResult(costs=(0.0273801888,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.571343183517456, timestamp=1663569700.7747304)        [('tile_y', [-1, 1]), ('tile_x', [-1, 256])],None,80
-    No: 2   GFLOPS: 2.56/9.80       result: MeasureResult(costs=(0.1050429266,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.831789255142212, timestamp=1663569703.1474528)        [('tile_y', [-1, 4]), ('tile_x', [-1, 8])],None,32
-    No: 3   GFLOPS: 11.88/11.88     result: MeasureResult(costs=(0.0225865448,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5976603031158447, timestamp=1663569703.7128532)       [('tile_y', [-1, 64]), ('tile_x', [-1, 32])],None,56
-    No: 4   GFLOPS: 1.51/11.88      result: MeasureResult(costs=(0.1776615278,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.947221040725708, timestamp=1663569707.2437475)        [('tile_y', [-1, 1]), ('tile_x', [-1, 4])],None,20
-    No: 5   GFLOPS: 3.66/11.88      result: MeasureResult(costs=(0.0734260772,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.3109934329986572, timestamp=1663569708.685995)        [('tile_y', [-1, 256]), ('tile_x', [-1, 16])],None,48
-    No: 6   GFLOPS: 1.82/11.88      result: MeasureResult(costs=(0.1478481288,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.487691879272461, timestamp=1663569711.7454987)        [('tile_y', [-1, 512]), ('tile_x', [-1, 4])],None,29
-    No: 7   GFLOPS: 0.83/11.88      result: MeasureResult(costs=(0.3236701332,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.298122406005859, timestamp=1663569717.0876806)        [('tile_y', [-1, 512]), ('tile_x', [-1, 2])],None,19
-    No: 8   GFLOPS: 10.52/11.88     result: MeasureResult(costs=(0.0255259578,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.55462646484375, timestamp=1663569717.6584527) [('tile_y', [-1, 4]), ('tile_x', [-1, 64])],None,62
-    No: 9   GFLOPS: 1.73/11.88      result: MeasureResult(costs=(0.15490296580000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.576029062271118, timestamp=1663569720.3543928) [('tile_y', [-1, 2]), ('tile_x', [-1, 2])],None,11
-    No: 10  GFLOPS: 2.47/11.88      result: MeasureResult(costs=(0.10874074519999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.848226547241211, timestamp=1663569722.2602882) [('tile_y', [-1, 4]), ('tile_x', [-1, 4])],None,22
+    No: 1   GFLOPS: 9.12/9.12       result: MeasureResult(costs=(0.0294415704,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.604203462600708, timestamp=1663597546.662406) [('tile_y', [-1, 1]), ('tile_x', [-1, 256])],None,80
+    No: 2   GFLOPS: 2.55/9.12       result: MeasureResult(costs=(0.1051508428,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8359980583190918, timestamp=1663597549.0406845)       [('tile_y', [-1, 4]), ('tile_x', [-1, 8])],None,32
+    No: 3   GFLOPS: 11.76/11.76     result: MeasureResult(costs=(0.022827451800000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5912885665893555, timestamp=1663597549.610176)        [('tile_y', [-1, 64]), ('tile_x', [-1, 32])],None,56
+    No: 4   GFLOPS: 1.85/11.76      result: MeasureResult(costs=(0.145001262,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.443932056427002, timestamp=1663597552.6316605) [('tile_y', [-1, 1]), ('tile_x', [-1, 4])],None,20
+    No: 5   GFLOPS: 3.71/11.76      result: MeasureResult(costs=(0.07244774979999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2978484630584717, timestamp=1663597554.0604458)        [('tile_y', [-1, 256]), ('tile_x', [-1, 16])],None,48
+    No: 6   GFLOPS: 1.72/11.76      result: MeasureResult(costs=(0.1563153742,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.6690032482147217, timestamp=1663597556.7735612)       [('tile_y', [-1, 512]), ('tile_x', [-1, 4])],None,29
+    No: 7   GFLOPS: 0.87/11.76      result: MeasureResult(costs=(0.3092872586,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.071336507797241, timestamp=1663597562.4192626)        [('tile_y', [-1, 512]), ('tile_x', [-1, 2])],None,19
+    No: 8   GFLOPS: 10.47/11.76     result: MeasureResult(costs=(0.025647052200000003,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5561935901641846, timestamp=1663597562.9923697)       [('tile_y', [-1, 4]), ('tile_x', [-1, 64])],None,62
+    No: 9   GFLOPS: 1.90/11.76      result: MeasureResult(costs=(0.1412330426,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.379261016845703, timestamp=1663597565.4917383)        [('tile_y', [-1, 2]), ('tile_x', [-1, 2])],None,11
+    No: 10  GFLOPS: 2.74/11.76      result: MeasureResult(costs=(0.0980500346,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.689483642578125, timestamp=1663597567.221062) [('tile_y', [-1, 4]), ('tile_x', [-1, 4])],None,22
 
 
 
diff --git a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
index 0a50178751..aa3339ef66 100644
--- a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
+++ b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
@@ -320,7 +320,7 @@ standard deviation.
 
  .. code-block:: none
 
-    {'mean': 512.4212018799153, 'median': 512.6739428502333, 'std': 1.444597566839756}
+    {'mean': 511.21548624999724, 'median': 511.614363700005, 'std': 1.595007203886461}
 
 
 
@@ -554,30 +554,30 @@ the tuning data to.
 
  .. code-block:: none
 
-
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  1/25]  Current/Best:   17.45/  17.45 GFLOPS | Progress: (4/20) | 6.42 s
    [Task  1/25]  Current/Best:    6.08/  17.45 GFLOPS | Progress: (8/20) | 9.52 s
    [Task  1/25]  Current/Best:   11.14/  22.20 GFLOPS | Progress: (12/20) | 12.06 s
    [Task  1/25]  Current/Best:   16.49/  22.33 GFLOPS | Progress: (16/20) | 13.75 s
    [Task  1/25]  Current/Best:   11.29/  23.38 GFLOPS | Progress: (20/20) | 15.54 s Done.
-
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  2/25]  Current/Best:   12.06/  12.27 GFLOPS | Progress: (4/20) | 3.98 s
    [Task  2/25]  Current/Best:   12.46/  18.62 GFLOPS | Progress: (8/20) | 5.29 s
    [Task  2/25]  Current/Best:   20.81/  20.81 GFLOPS | Progress: (12/20) | 6.62 s
    [Task  2/25]  Current/Best:   10.61/  20.81 GFLOPS | Progress: (16/20) | 7.91 s
    [Task  2/25]  Current/Best:   16.77/  20.81 GFLOPS | Progress: (20/20) | 9.55 s Done.
-
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  3/25]  Current/Best:    1.63/  10.15 GFLOPS | Progress: (4/20) | 5.96 s
    [Task  3/25]  Current/Best:   15.31/  16.64 GFLOPS | Progress: (8/20) | 7.92 s
    [Task  3/25]  Current/Best:   14.98/  16.64 GFLOPS | Progress: (12/20) | 9.67 s
    [Task  3/25]  Current/Best:    6.83/  22.97 GFLOPS | Progress: (16/20) | 11.65 s
    [Task  3/25]  Current/Best:   11.02/  22.97 GFLOPS | Progress: (20/20) | 16.33 s Done.
-
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  4/25]  Current/Best:    9.04/  18.61 GFLOPS | Progress: (4/20) | 2.50 s
    [Task  4/25]  Current/Best:    6.60/  18.61 GFLOPS | Progress: (8/20) | 7.25 s
    [Task  4/25]  Current/Best:   20.72/  20.72 GFLOPS | Progress: (12/20) | 12.27 s
    [Task  4/25]  Current/Best:   16.35/  20.72 GFLOPS | Progress: (16/20) | 14.67 s
    [Task  4/25]  Current/Best:   12.72/  20.72 GFLOPS | Progress: (20/20) | 16.73 s Done.
-
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  5/25]  Current/Best:    9.29/   9.88 GFLOPS | Progress: (4/20) | 2.72 s
    [Task  5/25]  Current/Best:   11.60/  11.60 GFLOPS | Progress: (8/20) | 4.79 s
    [Task  5/25]  Current/Best:   11.19/  17.88 GFLOPS | Progress: (12/20) | 7.98 s
    [Task  5/25]  Current/Best:   11.60/  21.93 GFLOPS | Progress: (16/20) | 9.45 s
    [Task  5/25]  Current/Best:   12.07/  21.93 GFLOPS | Progress: (20/20) | 11.39 s Done.
-
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  6/25]  Current/Best:   12.01/  19.97 GFLOPS | Progress: (4/20) | 4.17 s
    [Task  6/25]  Current/Best:   18.93/  19.97 GFLOPS | Progress: (8/20) | 5.94 s
    [Task  6/25]  Current/Best:   13.29/  19.97 GFLOPS | Progress: (12/20) | 7.93 s
    [Task  6/25]  Current/Best:   19.43/  19.97 GFLOPS | Progress: (16/20) | 10.21 s
    [Task  6/25]  Current/Best:    3.73/  19.97 GFLOPS | Progress: (20/20) | 12.81 s Done.
-
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  7/25]  Current/Best:    9.79/  12.11 GFLOPS | Progress: (4/20) | 3.69 s
    [Task  7/25]  Current/Best:   19.47/  20.10 GFLOPS | Progress: (8/20) | 5.24 s
    [Task  7/25]  Current/Best:   15.84/  20.10 GFLOPS | Progress: (12/20) | 7.15 s
    [Task  7/25]  Current/Best:   12.14/  20.10 GFLOPS | Progress: (16/20) | 9.23 s
    [Task  7/25]  Current/Best:    6.12/  20.10 GFLOPS | Progress: (20/20) | 11.74 s Done.
-
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  8/25]  Current/Best:    9.97/  13.88 GFLOPS | Progress: (4/20) | 3.00 s
    [Task  8/25]  Current/Best:    9.68/  13.88 GFLOPS | Progress: (8/20) | 8.06 s
    [Task  8/25]  Current/Best:   12.88/  13.88 GFLOPS | Progress: (12/20) | 14.56 s
    [Task  8/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (16/20) | 16.68 s
    [Task  8/25]  Current/Best:   18.57/  18.80 GFLOPS | Progress: (20/20) | 23.74 s Done.
-
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  9/25]  Current/Best:   14.38/  14.38 GFLOPS | Progress: (4/20) | 12.02 s
    [Task  9/25]  Current/Best:   22.15/  22.15 GFLOPS | Progress: (8/20) | 13.78 s
    [Task  9/25]  Current/Best:    8.00/  22.15 GFLOPS | Progress: (12/20) | 16.32 s
    [Task  9/25]  Current/Best:   17.90/  22.15 GFLOPS | Progress: (16/20) | 19.06 s
    [Task  9/25]  Current/Best:    9.11/  22.15 GFLOPS | Progress: (20/20) | 27.53 s
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 10/25]  Current/Best:   17.66/  17.66 GFLOPS | Progress: (4/20) | 2.63 s
    [Task 10/25]  Current/Best:   15.64/  17.66 GFLOPS | Progress: (8/20) | 4.25 s
    [Task 10/25]  Current/Best:   11.71/  18.99 GFLOPS | Progress: (12/20) | 5.81 s
    [Task 10/25]  Current/Best:   19.08/  20.22 GFLOPS | Progress: (16/20) | 6.93 s
    [Task 10/25]  Current/Best:    8.50/  20.22 GFLOPS | Progress: (20/20) | 8.46 s Done.
-
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 11/25]  Current/Best:   10.85/  18.19 GFLOPS | Progress: (4/20) | 3.40 s
    [Task 11/25]  Current/Best:   14.92/  18.19 GFLOPS | Progress: (8/20) | 6.22 s
    [Task 11/25]  Current/Best:   15.88/  18.19 GFLOPS | Progress: (12/20) | 8.36 s
    [Task 11/25]  Current/Best:   11.87/  20.63 GFLOPS | Progress: (16/20) | 11.35 s
    [Task 11/25]  Current/Best:   18.40/  20.63 GFLOPS | Progress: (20/20) | 13.47 s Done.
-
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 12/25]  Current/Best:    7.76/  17.98 GFLOPS | Progress: (4/20) | 5.75 s
    [Task 12/25]  Current/Best:    5.11/  17.98 GFLOPS | Progress: (8/20) | 9.71 s
    [Task 12/25]  Current/Best:   18.95/  18.95 GFLOPS | Progress: (12/20) | 11.70 s
    [Task 12/25]  Current/Best:   15.19/  18.95 GFLOPS | Progress: (16/20) | 14.62 s
    [Task 12/25]  Current/Best:   15.11/  18.95 GFLOPS | Progress: (20/20) | 16.60 s Done.
-
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 13/25]  Current/Best:    8.31/  17.15 GFLOPS | Progress: (4/20) | 3.82 s
    [Task 13/25]  Current/Best:   15.23/  20.65 GFLOPS | Progress: (8/20) | 6.44 s
    [Task 13/25]  Current/Best:   18.83/  21.24 GFLOPS | Progress: (12/20) | 9.54 s
    [Task 13/25]  Current/Best:   12.25/  21.24 GFLOPS | Progress: (16/20) | 13.01 s
    [Task 13/25]  Current/Best:   17.49/  21.24 GFLOPS | Progress: (20/20) | 15.40 s Done.
-
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 14/25]  Current/Best:   12.18/  13.35 GFLOPS | Progress: (4/20) | 3.48 s
    [Task 14/25]  Current/Best:    6.10/  13.35 GFLOPS | Progress: (8/20) | 5.68 s
    [Task 14/25]  Current/Best:   20.36/  20.36 GFLOPS | Progress: (12/20) | 8.36 s
    [Task 14/25]  Current/Best:   16.12/  20.36 GFLOPS | Progress: (16/20) | 10.07 s Done.
-
    [Task 14/25]  Current/Best:   16.94/  20.36 GFLOPS | Progress: (20/20) | 11.86 s
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 15/25]  Current/Best:   15.69/  17.28 GFLOPS | Progress: (4/20) | 2.77 s
    [Task 15/25]  Current/Best:   12.67/  17.57 GFLOPS | Progress: (8/20) | 4.09 s
    [Task 15/25]  Current/Best:   10.02/  21.66 GFLOPS | Progress: (12/20) | 6.32 s
    [Task 15/25]  Current/Best:   20.30/  21.66 GFLOPS | Progress: (16/20) | 9.95 s
    [Task 15/25]  Current/Best:    9.53/  21.66 GFLOPS | Progress: (20/20) | 10.97 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 16/25]  Current/Best:   19.24/  19.24 GFLOPS | Progress: (4/20) | 3.03 s
    [Task 16/25]  Current/Best:    3.04/  19.24 GFLOPS | Progress: (8/20) | 4.65 s
    [Task 16/25]  Current/Best:   18.20/  19.32 GFLOPS | Progress: (12/20) | 5.88 s
    [Task 16/25]  Current/Best:   18.11/  19.32 GFLOPS | Progress: (16/20) | 7.24 s
    [Task 16/25]  Current/Best:   10.26/  21.28 GFLOPS | Progress: (20/20) | 9.42 s Done.
-
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 17/25]  Current/Best:   12.93/  16.17 GFLOPS | Progress: (4/20) | 4.87 s
    [Task 17/25]  Current/Best:   12.88/  22.95 GFLOPS | Progress: (8/20) | 7.78 s
    [Task 17/25]  Current/Best:   16.51/  22.95 GFLOPS | Progress: (12/20) | 9.89 s
    [Task 17/25]  Current/Best:   16.44/  22.95 GFLOPS | Progress: (16/20) | 12.11 s
    [Task 17/25]  Current/Best:    9.98/  22.95 GFLOPS | Progress: (20/20) | 14.27 s Done.
-
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 18/25]  Current/Best:   10.44/  16.39 GFLOPS | Progress: (4/20) | 3.85 s
    [Task 18/25]  Current/Best:   10.58/  19.03 GFLOPS | Progress: (8/20) | 7.49 s
    [Task 18/25]  Current/Best:   18.99/  19.03 GFLOPS | Progress: (12/20) | 9.45 s
    [Task 18/25]  Current/Best:   10.05/  19.03 GFLOPS | Progress: (16/20) | 13.34 s
    [Task 18/25]  Current/Best:   20.72/  20.72 GFLOPS | Progress: (20/20) | 14.89 s Done.
-
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 19/25]  Current/Best:    7.32/  19.84 GFLOPS | Progress: (4/20) | 6.18 s
    [Task 19/25]  Current/Best:    2.68/  19.84 GFLOPS | Progress: (8/20) | 9.53 s
    [Task 19/25]  Current/Best:   19.14/  20.92 GFLOPS | Progress: (12/20) | 12.51 s
    [Task 19/25]  Current/Best:   12.87/  21.50 GFLOPS | Progress: (16/20) | 15.58 s
    [Task 19/25]  Current/Best:    2.69/  22.45 GFLOPS | Progress: (20/20) | 18.43 s Done.
-
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 20/25]  Current/Best:    9.61/  15.36 GFLOPS | Progress: (4/20) | 3.34 s Done.
+
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  1/25]  Current/Best:   17.63/  17.63 GFLOPS | Progress: (4/20) | 6.33 s
    [Task  1/25]  Current/Best:    6.11/  17.63 GFLOPS | Progress: (8/20) | 9.30 s
    [Task  1/25]  Current/Best:   11.26/  22.23 GFLOPS | Progress: (12/20) | 11.82 s
    [Task  1/25]  Current/Best:   16.53/  22.23 GFLOPS | Progress: (16/20) | 13.52 s
    [Task  1/25]  Current/Best:   11.33/  23.63 GFLOPS | Progress: (20/20) | 15.31 s Done.
+
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  2/25]  Current/Best:   12.27/  12.27 GFLOPS | Progress: (4/20) | 3.89 s
    [Task  2/25]  Current/Best:   12.61/  18.70 GFLOPS | Progress: (8/20) | 5.18 s
    [Task  2/25]  Current/Best:   20.88/  20.88 GFLOPS | Progress: (12/20) | 6.49 s
    [Task  2/25]  Current/Best:   10.63/  20.88 GFLOPS | Progress: (16/20) | 7.74 s
    [Task  2/25]  Current/Best:   17.59/  20.88 GFLOPS | Progress: (20/20) | 9.33 s Done.
+
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  3/25]  Current/Best:    1.63/  10.17 GFLOPS | Progress: (4/20) | 5.87 s
    [Task  3/25]  Current/Best:   15.38/  16.87 GFLOPS | Progress: (8/20) | 7.83 s
    [Task  3/25]  Current/Best:   15.04/  16.87 GFLOPS | Progress: (12/20) | 9.60 s
    [Task  3/25]  Current/Best:    6.81/  23.41 GFLOPS | Progress: (16/20) | 11.55 s
    [Task  3/25]  Current/Best:   11.08/  23.41 GFLOPS | Progress: (20/20) | 16.16 s Done.
+
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  4/25]  Current/Best:    9.07/  17.51 GFLOPS | Progress: (4/20) | 2.41 s
    [Task  4/25]  Current/Best:    6.21/  17.51 GFLOPS | Progress: (8/20) | 7.17 s
    [Task  4/25]  Current/Best:   20.64/  20.64 GFLOPS | Progress: (12/20) | 12.14 s
    [Task  4/25]  Current/Best:   16.33/  20.64 GFLOPS | Progress: (16/20) | 14.52 s
    [Task  4/25]  Current/Best:   12.15/  20.64 GFLOPS | Progress: (20/20) | 16.62 s Done.
+
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  5/25]  Current/Best:    9.14/   9.87 GFLOPS | Progress: (4/20) | 2.64 s
    [Task  5/25]  Current/Best:   11.64/  11.64 GFLOPS | Progress: (8/20) | 4.74 s
    [Task  5/25]  Current/Best:   11.87/  18.09 GFLOPS | Progress: (12/20) | 7.93 s
    [Task  5/25]  Current/Best:   11.46/  21.99 GFLOPS | Progress: (16/20) | 9.36 s
    [Task  5/25]  Current/Best:   12.15/  21.99 GFLOPS | Progress: (20/20) | 11.27 s Done.
+
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  6/25]  Current/Best:   11.99/  20.00 GFLOPS | Progress: (4/20) | 4.10 s
    [Task  6/25]  Current/Best:   18.84/  20.00 GFLOPS | Progress: (8/20) | 5.89 s
    [Task  6/25]  Current/Best:   13.17/  20.00 GFLOPS | Progress: (12/20) | 7.90 s
    [Task  6/25]  Current/Best:   19.65/  20.00 GFLOPS | Progress: (16/20) | 10.16 s
    [Task  6/25]  Current/Best:    3.69/  20.00 GFLOPS | Progress: (20/20) | 12.73 s Done.
+
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  7/25]  Current/Best:    9.80/  12.07 GFLOPS | Progress: (4/20) | 3.69 s
    [Task  7/25]  Current/Best:   18.94/  19.93 GFLOPS | Progress: (8/20) | 5.24 s
    [Task  7/25]  Current/Best:   16.06/  19.93 GFLOPS | Progress: (12/20) | 7.20 s
    [Task  7/25]  Current/Best:   12.15/  19.93 GFLOPS | Progress: (16/20) | 9.29 s
    [Task  7/25]  Current/Best:    5.99/  20.56 GFLOPS | Progress: (20/20) | 11.80 s Done.
+
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  8/25]  Current/Best:    9.75/  13.51 GFLOPS | Progress: (4/20) | 3.00 s
    [Task  8/25]  Current/Best:    9.00/  13.51 GFLOPS | Progress: (8/20) | 8.18 s
    [Task  8/25]  Current/Best:   12.52/  13.51 GFLOPS | Progress: (12/20) | 14.67 s
    [Task  8/25]  Current/Best:   18.97/  18.97 GFLOPS | Progress: (16/20) | 16.81 s
    [Task  8/25]  Current/Best:   18.34/  18.97 GFLOPS | Progress: (20/20) | 23.89 s Done.
+
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  9/25]  Current/Best:   14.35/  14.35 GFLOPS | Progress: (4/20) | 11.97 s
    [Task  9/25]  Current/Best:   21.70/  21.70 GFLOPS | Progress: (8/20) | 13.79 s
    [Task  9/25]  Current/Best:    7.66/  21.70 GFLOPS | Progress: (12/20) | 16.33 s
    [Task  9/25]  Current/Best:   17.92/  21.70 GFLOPS | Progress: (16/20) | 19.21 s
    [Task  9/25]  Current/Best:    9.05/  21.70 GFLOPS | Progress: (20/20) | 27.80 s
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 10/25]  Current/Best:   17.96/  17.96 GFLOPS | Progress: (4/20) | 2.58 s
    [Task 10/25]  Current/Best:   15.66/  17.96 GFLOPS | Progress: (8/20) | 4.20 s
    [Task 10/25]  Current/Best:   11.33/  18.78 GFLOPS | Progress: (12/20) | 5.76 s
    [Task 10/25]  Current/Best:   19.12/  20.07 GFLOPS | Progress: (16/20) | 6.86 s
    [Task 10/25]  Current/Best:    8.36/  20.07 GFLOPS | Progress: (20/20) | 8.42 s Done.
+
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 11/25]  Current/Best:   10.82/  18.17 GFLOPS | Progress: (4/20) | 3.43 s
    [Task 11/25]  Current/Best:   14.96/  18.17 GFLOPS | Progress: (8/20) | 6.27 s
    [Task 11/25]  Current/Best:   15.85/  18.17 GFLOPS | Progress: (12/20) | 8.36 s
    [Task 11/25]  Current/Best:   11.85/  20.71 GFLOPS | Progress: (16/20) | 11.24 s
    [Task 11/25]  Current/Best:   17.77/  20.71 GFLOPS | Progress: (20/20) | 13.37 s Done.
+
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 12/25]  Current/Best:    7.81/  17.64 GFLOPS | Progress: (4/20) | 5.76 s
    [Task 12/25]  Current/Best:    5.07/  17.64 GFLOPS | Progress: (8/20) | 9.74 s
    [Task 12/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (12/20) | 11.75 s
    [Task 12/25]  Current/Best:   14.94/  18.80 GFLOPS | Progress: (16/20) | 14.67 s
    [Task 12/25]  Current/Best:   15.14/  18.80 GFLOPS | Progress: (20/20) | 16.64 s Done.
+
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 13/25]  Current/Best:    8.66/  17.34 GFLOPS | Progress: (4/20) | 3.75 s
    [Task 13/25]  Current/Best:   15.21/  20.78 GFLOPS | Progress: (8/20) | 6.38 s
    [Task 13/25]  Current/Best:   18.51/  20.89 GFLOPS | Progress: (12/20) | 9.54 s
    [Task 13/25]  Current/Best:   12.22/  20.89 GFLOPS | Progress: (16/20) | 12.99 s
    [Task 13/25]  Current/Best:   17.96/  20.89 GFLOPS | Progress: (20/20) | 15.37 s Done.
+
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 14/25]  Current/Best:   12.13/  13.39 GFLOPS | Progress: (4/20) | 3.38 s
    [Task 14/25]  Current/Best:    6.08/  13.39 GFLOPS | Progress: (8/20) | 5.56 s
    [Task 14/25]  Current/Best:   19.46/  19.46 GFLOPS | Progress: (12/20) | 8.25 s
    [Task 14/25]  Current/Best:   14.93/  19.46 GFLOPS | Progress: (16/20) | 9.95 s Done.
+
    [Task 14/25]  Current/Best:   16.76/  19.46 GFLOPS | Progress: (20/20) | 11.69 s
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 15/25]  Current/Best:   15.64/  17.22 GFLOPS | Progress: (4/20) | 2.74 s
    [Task 15/25]  Current/Best:   12.65/  17.57 GFLOPS | Progress: (8/20) | 4.04 s
    [Task 15/25]  Current/Best:   10.00/  21.68 GFLOPS | Progress: (12/20) | 6.26 s
    [Task 15/25]  Current/Best:   20.29/  21.68 GFLOPS | Progress: (16/20) | 9.85 s
    [Task 15/25]  Current/Best:    9.53/  21.68 GFLOPS | Progress: (20/20) | 10.87 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 16/25]  Current/Best:   19.31/  19.31 GFLOPS | Progress: (4/20) | 3.00 s
    [Task 16/25]  Current/Best:    2.99/  19.31 GFLOPS | Progress: (8/20) | 4.62 s
    [Task 16/25]  Current/Best:   17.80/  19.31 GFLOPS | Progress: (12/20) | 5.85 s
    [Task 16/25]  Current/Best:   17.72/  19.31 GFLOPS | Progress: (16/20) | 7.24 s
    [Task 16/25]  Current/Best:    9.84/  21.21 GFLOPS | Progress: (20/20) | 9.39 s Done.
+
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 17/25]  Current/Best:   12.18/  16.08 GFLOPS | Progress: (4/20) | 4.82 s
    [Task 17/25]  Current/Best:   12.82/  23.00 GFLOPS | Progress: (8/20) | 7.71 s
    [Task 17/25]  Current/Best:   16.52/  23.00 GFLOPS | Progress: (12/20) | 9.83 s
    [Task 17/25]  Current/Best:   16.39/  23.00 GFLOPS | Progress: (16/20) | 12.07 s
    [Task 17/25]  Current/Best:    9.97/  23.00 GFLOPS | Progress: (20/20) | 14.22 s Done.
+
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 18/25]  Current/Best:    9.92/  16.60 GFLOPS | Progress: (4/20) | 3.85 s
    [Task 18/25]  Current/Best:   10.59/  19.11 GFLOPS | Progress: (8/20) | 7.56 s
    [Task 18/25]  Current/Best:   18.94/  19.11 GFLOPS | Progress: (12/20) | 9.51 s
    [Task 18/25]  Current/Best:   10.00/  19.11 GFLOPS | Progress: (16/20) | 13.32 s
    [Task 18/25]  Current/Best:   20.77/  20.77 GFLOPS | Progress: (20/20) | 14.85 s Done.
+
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 19/25]  Current/Best:    7.30/  19.90 GFLOPS | Progress: (4/20) | 6.12 s
    [Task 19/25]  Current/Best:    2.69/  19.90 GFLOPS | Progress: (8/20) | 9.50 s
    [Task 19/25]  Current/Best:   17.35/  20.91 GFLOPS | Progress: (12/20) | 12.51 s
    [Task 19/25]  Current/Best:   13.77/  20.91 GFLOPS | Progress: (16/20) | 15.53 s
    [Task 19/25]  Current/Best:    2.70/  22.47 GFLOPS | Progress: (20/20) | 18.41 s Done.
+
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 20/25]  Current/Best:    9.17/  14.25 GFLOPS | Progress: (4/20) | 3.35 s Done.
      Done.
-
    [Task 20/25]  Current/Best:   10.01/  15.36 GFLOPS | Progress: (8/20) | 6.91 s
    [Task 20/25]  Current/Best:    2.32/  15.36 GFLOPS | Progress: (12/20) | 10.87 s
    [Task 20/25]  Current/Best:   11.02/  15.36 GFLOPS | Progress: (16/20) | 14.76 s
    [Task 20/25]  Current/Best:   11.05/  21.33 GFLOPS | Progress: (20/20) | 16.92 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 21/25]  Current/Best:    6.35/  17.71 GFLOPS | Progress: (4/20) | 3.29 s
    [Task 21/25]  Current/Best:   14.64/  17.71 GFLOPS | Progress: (8/20) | 4.93 s
    [Task 21/25]  Current/Best:    1.61/  17.71 GFLOPS | Progress: (12/20) | 7.06 s
    [Task 21/25]  Current/Best:   15.98/  17.71 GFLOPS | Progress: (16/20) | 10.58 s
    [Task 21/25]  Current/Best:    4.43/  17.71 GFLOPS | Progress: (20/20) | 17.84 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 22/25]  Current/Best:    2.70/  16.95 GFLOPS | Progress: (4/20) | 2.71 s
    [Task 22/25]  Current/Best:    9.04/  21.20 GFLOPS | Progress: (8/20) | 4.76 s
    [Task 22/25]  Current/Best:   19.94/  21.20 GFLOPS | Progress: (12/20) | 7.15 s
    [Task 22/25]  Current/Best:   15.44/  21.20 GFLOPS | Progress: (16/20) | 9.29 s
    [Task 22/25]  Current/Best:   12.51/  21.20 GFLOPS | Progress: (20/20) | 11.07 s Done.
-
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 23/25]  Current/Best:   16.61/  19.37 GFLOPS | Progress: (4/20) | 3.35 s
    [Task 23/25]  Current/Best:   13.64/  19.87 GFLOPS | Progress: (8/20) | 6.77 s
    [Task 23/25]  Current/Best:   20.26/  21.63 GFLOPS | Progress: (12/20) | 8.63 s
    [Task 23/25]  Current/Best:    6.55/  21.63 GFLOPS | Progress: (16/20) | 15.62 s
    [Task 23/25]  Current/Best:    7.63/  21.63 GFLOPS | Progress: (20/20) | 19.86 s Done.
-
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 24/25]  Current/Best:    8.37/   8.37 GFLOPS | Progress: (4/20) | 11.84 s
    [Task 24/25]  Current/Best:    1.90/   8.37 GFLOPS | Progress: (8/20) | 22.87 s
    [Task 24/25]  Current/Best:    3.92/   8.37 GFLOPS | Progress: (12/20) | 34.44 s Done.
-
    [Task 24/25]  Current/Best:    6.43/   8.89 GFLOPS | Progress: (16/20) | 40.12 s
    [Task 24/25]  Current/Best:    2.95/   8.89 GFLOPS | Progress: (20/20) | 46.06 s Done.
-
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 25/25]  Current/Best:    1.55/   2.77 GFLOPS | Progress: (4/20) | 11.63 s
    [Task 25/25]  Current/Best:    5.88/   8.44 GFLOPS | Progress: (8/20) | 22.91 s
    [Task 25/25]  Current/Best:    6.02/   8.44 GFLOPS | Progress: (12/20) | 34.22 s
    [Task 25/25]  Current/Best:    5.85/   8.94 GFLOPS | Progress: (16/20) | 36.13 s
    [Task 25/25]  Current/Best:    2.81/   9.18 GFLOPS | Progress: (20/20) | 46.84 s
+
    [Task 20/25]  Current/Best:    9.94/  14.25 GFLOPS | Progress: (8/20) | 6.92 s
    [Task 20/25]  Current/Best:    2.32/  14.51 GFLOPS | Progress: (12/20) | 10.88 s
    [Task 20/25]  Current/Best:   10.91/  14.51 GFLOPS | Progress: (16/20) | 14.74 s
    [Task 20/25]  Current/Best:   10.76/  22.05 GFLOPS | Progress: (20/20) | 16.88 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 21/25]  Current/Best:    6.36/  17.69 GFLOPS | Progress: (4/20) | 3.29 s
    [Task 21/25]  Current/Best:   14.60/  17.69 GFLOPS | Progress: (8/20) | 4.90 s
    [Task 21/25]  Current/Best:    1.61/  17.69 GFLOPS | Progress: (12/20) | 7.06 s
    [Task 21/25]  Current/Best:   15.91/  17.69 GFLOPS | Progress: (16/20) | 10.59 s
    [Task 21/25]  Current/Best:    4.45/  17.69 GFLOPS | Progress: (20/20) | 17.87 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 22/25]  Current/Best:    2.70/  16.92 GFLOPS | Progress: (4/20) | 2.69 s
    [Task 22/25]  Current/Best:    8.71/  20.62 GFLOPS | Progress: (8/20) | 4.67 s
    [Task 22/25]  Current/Best:   19.64/  20.62 GFLOPS | Progress: (12/20) | 7.08 s
    [Task 22/25]  Current/Best:   15.43/  20.62 GFLOPS | Progress: (16/20) | 9.20 s
    [Task 22/25]  Current/Best:   12.78/  20.62 GFLOPS | Progress: (20/20) | 10.97 s Done.
+
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 23/25]  Current/Best:   16.76/  20.09 GFLOPS | Progress: (4/20) | 3.29 s
    [Task 23/25]  Current/Best:   13.51/  20.09 GFLOPS | Progress: (8/20) | 6.70 s
    [Task 23/25]  Current/Best:   20.61/  21.81 GFLOPS | Progress: (12/20) | 8.55 s
    [Task 23/25]  Current/Best:    6.61/  21.81 GFLOPS | Progress: (16/20) | 15.62 s
    [Task 23/25]  Current/Best:    7.68/  21.81 GFLOPS | Progress: (20/20) | 19.84 s Done.
+
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 24/25]  Current/Best:    8.19/   8.19 GFLOPS | Progress: (4/20) | 11.80 s
    [Task 24/25]  Current/Best:    3.32/   8.19 GFLOPS | Progress: (8/20) | 23.05 s
    [Task 24/25]  Current/Best:    3.98/   8.19 GFLOPS | Progress: (12/20) | 33.80 s Done.
+
    [Task 24/25]  Current/Best:    5.48/   8.67 GFLOPS | Progress: (16/20) | 39.37 s
    [Task 24/25]  Current/Best:    3.03/   8.67 GFLOPS | Progress: (20/20) | 45.34 s Done.
+
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 25/25]  Current/Best:    1.55/   2.75 GFLOPS | Progress: (4/20) | 11.61 s
    [Task 25/25]  Current/Best:    5.59/   7.97 GFLOPS | Progress: (8/20) | 22.89 s
    [Task 25/25]  Current/Best:    5.82/   7.97 GFLOPS | Progress: (12/20) | 34.38 s
    [Task 25/25]  Current/Best:    5.62/   8.14 GFLOPS | Progress: (16/20) | 36.26 s
    [Task 25/25]  Current/Best:    2.83/   8.28 GFLOPS | Progress: (20/20) | 46.98 s
 
 
 
@@ -679,8 +679,8 @@ Verify that the optimized model runs and produces the same results:
 
  .. code-block:: none
 
-    class='n02123045 tabby, tabby cat' with probability=0.621104
-    class='n02123159 tiger cat' with probability=0.356378
+    class='n02123045 tabby, tabby cat' with probability=0.621105
+    class='n02123159 tiger cat' with probability=0.356377
     class='n02124075 Egyptian cat' with probability=0.019712
     class='n02129604 tiger, Panthera tigris' with probability=0.001215
     class='n04040759 radiator' with probability=0.000262
@@ -737,8 +737,8 @@ improvement in comparing the optimized model to the unoptimized model.
 
  .. code-block:: none
 
-    optimized: {'mean': 411.6166502700071, 'median': 411.56105894988286, 'std': 1.0797258407980037}
-    unoptimized: {'mean': 512.4212018799153, 'median': 512.6739428502333, 'std': 1.444597566839756}
+    optimized: {'mean': 407.16022603000283, 'median': 407.1732334000103, 'std': 0.8974312165856505}
+    unoptimized: {'mean': 511.21548624999724, 'median': 511.614363700005, 'std': 1.595007203886461}
 
 
 
@@ -761,7 +761,7 @@ profiling/benchmarking.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 10 minutes  28.919 seconds)
+   **Total running time of the script:** ( 10 minutes  25.412 seconds)
 
 
 .. _sphx_glr_download_tutorial_autotvm_relay_x86.py:
diff --git a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
index 2ba3b6b889..9979d1fd75 100644
--- a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
@@ -282,7 +282,7 @@ device and returns the measured cost. Network overhead is excluded.
 
  .. code-block:: none
 
-    1.533e-07 secs/op
+    1.228e-07 secs/op
 
 
 
diff --git a/docs/_sources/tutorial/intro_topi.rst.txt b/docs/_sources/tutorial/intro_topi.rst.txt
index f2f768c6db..ebaf1dd251 100644
--- a/docs/_sources/tutorial/intro_topi.rst.txt
+++ b/docs/_sources/tutorial/intro_topi.rst.txt
@@ -263,7 +263,7 @@ As you can see, scheduled stages of computation have been accumulated and we can
 
  .. code-block:: none
 
-    [stage(a, placeholder(a, 0x1fa8da40)), stage(b, placeholder(b, 0x1f6e8b50)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(mi [...]
+    [stage(a, placeholder(a, 0x57a23f0)), stage(b, placeholder(b, 0x27909010)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min [...]
 
 
 
diff --git a/docs/_sources/tutorial/sg_execution_times.rst.txt b/docs/_sources/tutorial/sg_execution_times.rst.txt
index 4c96a1fc4c..e9b9811e4c 100644
--- a/docs/_sources/tutorial/sg_execution_times.rst.txt
+++ b/docs/_sources/tutorial/sg_execution_times.rst.txt
@@ -5,32 +5,32 @@
 
 Computation times
 =================
-**13:29.356** total execution time for **tutorial** files:
+**13:12.068** total execution time for **tutorial** files:
 
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)                 | 10:28.919 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)                 | 10:25.412 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``) | 01:03.483 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)     | 01:00.703 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)     | 00:58.348 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``) | 00:48.639 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)                 | 00:31.656 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)                 | 00:31.093 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)               | 00:25.004 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)               | 00:24.126 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)       | 00:01.068 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)       | 00:01.233 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)                               | 00:00.699 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)                               | 00:00.702 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``) | 00:00.167 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``) | 00:00.152 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_introduction.py` (``introduction.py``)                           | 00:00.007 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_introduction.py` (``introduction.py``)                           | 00:00.005 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_uma.py` (``uma.py``)                                             | 00:00.001 | 0.0 MB |
-+------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tvmc_python.py` (``tvmc_python.py``)                             | 00:00.001 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_uma.py` (``uma.py``)                                             | 00:00.002 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_tvmc_command_line_driver.py` (``tvmc_command_line_driver.py``)   | 00:00.001 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
+| :ref:`sphx_glr_tutorial_tvmc_python.py` (``tvmc_python.py``)                             | 00:00.001 | 0.0 MB |
++------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_install.py` (``install.py``)                                     | 00:00.001 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
index 088e4c22cf..1540cfc129 100644
--- a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
+++ b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
@@ -294,8 +294,8 @@ helper function to run a profile of the TVM generated code.
 
  .. code-block:: none
 
-    Numpy running time: 0.000007
-    naive: 0.000009
+    Numpy running time: 0.000008
+    naive: 0.000007
 
 
 
@@ -394,7 +394,7 @@ compile and run this new schedule with the parallel operation applied:
 
  .. code-block:: none
 
-    parallel: 0.000009
+    parallel: 0.000008
 
 
 
@@ -501,10 +501,10 @@ We can now compare the different schedules
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                   numpy    7.336000053328462e-06                    1.0
-                   naive    8.708999999999999e-06     1.1871592062010121
-                parallel              9.2118e-06      1.2556979188979773
-                  vector    2.4513500000000002e-05    3.3415348721102895
+                   numpy    7.625989999269223e-06                    1.0
+                   naive    6.6650000000000006e-06    0.8739848859805335
+                parallel    8.055700000000001e-06     1.0563480939224883
+                  vector    2.4568400000000004e-05    3.2216669576480332
 
 
 
@@ -925,7 +925,7 @@ matrix multiplication.
 
  .. code-block:: none
 
-    Numpy running time: 0.018508
+    Numpy running time: 0.019004
 
 
 
@@ -983,7 +983,7 @@ optimizations.
 
  .. code-block:: none
 
-    none: 3.224370
+    none: 3.399834
 
 
 
@@ -1086,7 +1086,7 @@ schedule.
 
  .. code-block:: none
 
-    blocking: 0.302546
+    blocking: 0.294778
 
 
 
@@ -1182,7 +1182,7 @@ already cache friendly from our previous optimizations.
 
  .. code-block:: none
 
-    vectorization: 0.336048
+    vectorization: 0.336641
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1256,7 +1256,7 @@ more cache friendly.
 
  .. code-block:: none
 
-    loop permutation: 0.113060
+    loop permutation: 0.115950
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1355,7 +1355,7 @@ optimized schedule.
 
  .. code-block:: none
 
-    array packing: 0.107321
+    array packing: 0.109689
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1448,7 +1448,7 @@ to `C` when all the block results are ready.
 
  .. code-block:: none
 
-    block caching: 0.110543
+    block caching: 0.110457
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1534,7 +1534,7 @@ of thread-level parallelization.
 
  .. code-block:: none
 
-    parallelization: 0.146259
+    parallelization: 0.146722
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1615,13 +1615,13 @@ working, we can compare the results.
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                    none      3.2243697173999997                     1.0
-                blocking     0.30254556819999995     0.09383091726961149
-           vectorization            0.3360484456     0.10422143707235156
-        loop permutation            0.1130601591     0.03506426651071736
-           array packing            0.1073209683     0.03328432459865032
-           block caching            0.1105431506     0.03428364619710468
-         parallelization     0.14625904979999999    0.045360508446263825
+                    none            3.3998342633                     1.0
+                blocking            0.2947782946       0.086703724879188
+           vectorization            0.3366409768     0.09901687868550518
+        loop permutation     0.11595009150000002      0.0341046305555656
+           array packing            0.1096886654     0.03226294486882789
+           block caching     0.11045665819999999    0.032488836115436646
+         parallelization            0.1467221105    0.043155665581647004
 
 
 
@@ -1661,6 +1661,11 @@ operations with tunable parameters that allows you to automatically optimize
 the computation for specific platforms.
 
 
+.. rst-class:: sphx-glr-timing
+
+   **Total running time of the script:** ( 1 minutes  0.703 seconds)
+
+
 .. _sphx_glr_download_tutorial_tensor_expr_get_started.py:
 
 .. only:: html
diff --git a/docs/commit_hash b/docs/commit_hash
index a25a3c31f4..b08477b0e3 100644
--- a/docs/commit_hash
+++ b/docs/commit_hash
@@ -1 +1 @@
-60cf692a63a22cd2698273c4945f037b4b22474b
+2af9b90ec191424724842795c552d4c15682eb8c
diff --git a/docs/how_to/compile_models/from_darknet.html b/docs/how_to/compile_models/from_darknet.html
index 1901507dce..0aa5219012 100644
--- a/docs/how_to/compile_models/from_darknet.html
+++ b/docs/how_to/compile_models/from_darknet.html
@@ -572,7 +572,7 @@ class:[&#39;truck 0.9266&#39;] left:471 top:83 right:689 bottom:169
 class:[&#39;bicycle 0.9984&#39;] left:111 top:113 right:577 bottom:447
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  3.949 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  1.628 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-darknet-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7716f96385bd5abb6e822041e285be54/from_darknet.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_darknet.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/from_keras.html b/docs/how_to/compile_models/from_keras.html
index 003eb47dd6..5c96618a47 100644
--- a/docs/how_to/compile_models/from_keras.html
+++ b/docs/how_to/compile_models/from_keras.html
@@ -493,7 +493,7 @@ pip install -U tensorflow --user
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Relay top-1 id: 285, class name: Egyptian cat
 
 1/1 [==============================] - ETA: 0s
-1/1 [==============================] - 1s 987ms/step
+1/1 [==============================] - 1s 952ms/step
 Keras top-1 id: 285, class name: Egyptian cat
 </pre></div>
 </div>
diff --git a/docs/how_to/compile_models/from_mxnet.html b/docs/how_to/compile_models/from_mxnet.html
index 8b0aaf4020..9840b3b8ce 100644
--- a/docs/how_to/compile_models/from_mxnet.html
+++ b/docs/how_to/compile_models/from_mxnet.html
@@ -427,7 +427,7 @@ to download the full example code</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;x&quot;</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#tuple" title="builtins.tuple" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">x</span><span class="o">.</span><span class="n">shape</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_from_mxnet_001.png" srcset="../../_images/sphx_glr_from_mxnet_001.png" alt="from mxnet" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip308a65c9-68f3-4e7e-8c8d-ca72843ec134 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+<img src="../../_images/sphx_glr_from_mxnet_001.png" srcset="../../_images/sphx_glr_from_mxnet_001.png" alt="from mxnet" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip2c209c9a-c05d-45eb-9c4b-0f8f26611a57 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
 x (1, 3, 224, 224)
 </pre></div>
 </div>
diff --git a/docs/how_to/compile_models/from_oneflow.html b/docs/how_to/compile_models/from_oneflow.html
index e6b3b44b7a..d0e71183a3 100644
--- a/docs/how_to/compile_models/from_oneflow.html
+++ b/docs/how_to/compile_models/from_oneflow.html
@@ -435,13 +435,13 @@ Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdo
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip&quot; to /workspace/.oneflow/flowvision_cache/resnet18.zip
 
   0%|          | 0.00/41.5M [00:00&lt;?, ?B/s]
- 19%|#9        | 7.99M/41.5M [00:00&lt;00:00, 48.3MB/s]
- 39%|###8      | 16.0M/41.5M [00:00&lt;00:00, 52.5MB/s]
- 54%|#####3    | 22.3M/41.5M [00:00&lt;00:00, 48.1MB/s]
- 65%|######4   | 26.9M/41.5M [00:00&lt;00:00, 44.1MB/s]
- 82%|########2 | 34.1M/41.5M [00:00&lt;00:00, 45.2MB/s]
- 96%|#########6| 40.0M/41.5M [00:00&lt;00:00, 46.9MB/s]
-100%|##########| 41.5M/41.5M [00:00&lt;00:00, 48.4MB/s]
+ 19%|#9        | 7.99M/41.5M [00:00&lt;00:00, 41.1MB/s]
+ 39%|###8      | 16.0M/41.5M [00:00&lt;00:00, 45.5MB/s]
+ 55%|#####4    | 22.6M/41.5M [00:00&lt;00:00, 53.1MB/s]
+ 68%|######7   | 28.0M/41.5M [00:00&lt;00:00, 50.6MB/s]
+ 80%|#######9  | 33.1M/41.5M [00:00&lt;00:00, 48.0MB/s]
+ 92%|#########2| 38.3M/41.5M [00:00&lt;00:00, 46.8MB/s]
+100%|##########| 41.5M/41.5M [00:00&lt;00:00, 48.0MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_pytorch.html b/docs/how_to/compile_models/from_pytorch.html
index fc75bfed3f..5ce7f37e32 100644
--- a/docs/how_to/compile_models/from_pytorch.html
+++ b/docs/how_to/compile_models/from_pytorch.html
@@ -414,10 +414,9 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/resnet18-f37072fd.pth&quot; to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
 
   0%|          | 0.00/44.7M [00:00&lt;?, ?B/s]
-  8%|8         | 3.66M/44.7M [00:00&lt;00:01, 38.1MB/s]
- 17%|#7        | 7.65M/44.7M [00:00&lt;00:00, 40.3MB/s]
- 56%|#####6    | 25.2M/44.7M [00:00&lt;00:00, 105MB/s]
-100%|##########| 44.7M/44.7M [00:00&lt;00:00, 123MB/s]
+ 43%|####3     | 19.3M/44.7M [00:00&lt;00:00, 203MB/s]
+100%|##########| 44.7M/44.7M [00:00&lt;00:00, 239MB/s]
+100%|##########| 44.7M/44.7M [00:00&lt;00:00, 234MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_tensorflow.html b/docs/how_to/compile_models/from_tensorflow.html
index 7c04b30e4d..c8cad0d8c2 100644
--- a/docs/how_to/compile_models/from_tensorflow.html
+++ b/docs/how_to/compile_models/from_tensorflow.html
@@ -632,7 +632,7 @@ banana (score = 0.00022)
 desk (score = 0.00019)
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  8.460 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  4.903 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-tensorflow-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7f1d3d1b878694c201c614c807cdebc8/from_tensorflow.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_tensorflow.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/sg_execution_times.html b/docs/how_to/compile_models/sg_execution_times.html
index b0585b8c84..d2f26a3647 100644
--- a/docs/how_to/compile_models/sg_execution_times.html
+++ b/docs/how_to/compile_models/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-compile-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>05:18.194</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
+<p><strong>05:05.089</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 81%" />
@@ -336,43 +336,43 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_tensorflow.html#sphx-glr-how-to-compile-models-from-tensorflow-py"><span class="std std-ref">Compile Tensorflow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tensorflow.py</span></code>)</p></td>
-<td><p>01:08.460</p></td>
+<td><p>01:04.903</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_darknet.html#sphx-glr-how-to-compile-models-from-darknet-py"><span class="std std-ref">Compile YOLO-V2 and YOLO-V3 in DarkNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_darknet.py</span></code>)</p></td>
-<td><p>01:03.949</p></td>
+<td><p>01:01.628</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_paddle.html#sphx-glr-how-to-compile-models-from-paddle-py"><span class="std std-ref">Compile PaddlePaddle Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_paddle.py</span></code>)</p></td>
-<td><p>00:41.236</p></td>
+<td><p>00:38.624</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_oneflow.html#sphx-glr-how-to-compile-models-from-oneflow-py"><span class="std std-ref">Compile OneFlow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_oneflow.py</span></code>)</p></td>
-<td><p>00:30.049</p></td>
+<td><p>00:28.011</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_mxnet.html#sphx-glr-how-to-compile-models-from-mxnet-py"><span class="std std-ref">Compile MXNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_mxnet.py</span></code>)</p></td>
-<td><p>00:27.178</p></td>
+<td><p>00:26.726</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_tflite.html#sphx-glr-how-to-compile-models-from-tflite-py"><span class="std std-ref">Compile TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tflite.py</span></code>)</p></td>
-<td><p>00:25.552</p></td>
+<td><p>00:24.764</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_coreml.html#sphx-glr-how-to-compile-models-from-coreml-py"><span class="std std-ref">Compile CoreML Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_coreml.py</span></code>)</p></td>
-<td><p>00:22.025</p></td>
+<td><p>00:21.644</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_pytorch.html#sphx-glr-how-to-compile-models-from-pytorch-py"><span class="std std-ref">Compile PyTorch Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_pytorch.py</span></code>)</p></td>
-<td><p>00:21.107</p></td>
+<td><p>00:20.024</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_keras.html#sphx-glr-how-to-compile-models-from-keras-py"><span class="std std-ref">Compile Keras Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_keras.py</span></code>)</p></td>
-<td><p>00:16.229</p></td>
+<td><p>00:16.379</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_onnx.html#sphx-glr-how-to-compile-models-from-onnx-py"><span class="std std-ref">Compile ONNX Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_onnx.py</span></code>)</p></td>
-<td><p>00:02.409</p></td>
+<td><p>00:02.386</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/deploy_models/deploy_model_on_android.html b/docs/how_to/deploy_models/deploy_model_on_android.html
index bc0504b3a1..e42f836ac2 100644
--- a/docs/how_to/deploy_models/deploy_model_on_android.html
+++ b/docs/how_to/deploy_models/deploy_model_on_android.html
@@ -649,7 +649,7 @@ to the remote android device.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  16.2342      16.0759      16.7901      15.7561       0.3800
+  16.2713      15.7376      20.6259      15.6294       1.4588
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
index cb8cfb75a1..6b9ccebadc 100644
--- a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
+++ b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
@@ -436,17 +436,13 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth&quot; to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
 
   0%|          | 0.00/170M [00:00&lt;?, ?B/s]
-  2%|2         | 3.91M/170M [00:00&lt;00:04, 40.9MB/s]
-  5%|4         | 7.84M/170M [00:00&lt;00:04, 41.1MB/s]
- 12%|#2        | 21.0M/170M [00:00&lt;00:01, 85.3MB/s]
- 25%|##4       | 42.2M/170M [00:00&lt;00:00, 139MB/s]
- 39%|###8      | 65.4M/170M [00:00&lt;00:00, 177MB/s]
- 49%|####9     | 83.7M/170M [00:00&lt;00:00, 182MB/s]
- 60%|#####9    | 101M/170M [00:00&lt;00:00, 181MB/s]
- 71%|#######1  | 121M/170M [00:00&lt;00:00, 189MB/s]
- 84%|########3 | 142M/170M [00:00&lt;00:00, 200MB/s]
- 99%|#########8| 168M/170M [00:01&lt;00:00, 221MB/s]
-100%|##########| 170M/170M [00:01&lt;00:00, 176MB/s]
+ 11%|#1        | 18.9M/170M [00:00&lt;00:00, 198MB/s]
+ 24%|##3       | 40.2M/170M [00:00&lt;00:00, 213MB/s]
+ 39%|###9      | 66.6M/170M [00:00&lt;00:00, 242MB/s]
+ 55%|#####5    | 93.8M/170M [00:00&lt;00:00, 259MB/s]
+ 72%|#######1  | 122M/170M [00:00&lt;00:00, 272MB/s]
+ 89%|########8 | 151M/170M [00:00&lt;00:00, 282MB/s]
+100%|##########| 170M/170M [00:00&lt;00:00, 265MB/s]
 /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
   for i in range(dim)
 /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the &#39;trunc&#39; function NOT &#39;floor&#39;). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode=&#39;trunc&#39;), or for actual floor division, use torch.div(a, b, rounding_mode=&#39;floor&#39;).
@@ -540,7 +536,7 @@ torchvision rcnn models.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Get 9 valid boxes
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  57.751 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  53.848 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-object-detection-pytorch-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7795da4b258c8feff986668b95ef57ad/deploy_object_detection_pytorch.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_object_detection_pytorch.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized.html b/docs/how_to/deploy_models/deploy_prequantized.html
index 23e4911753..df5ef1450b 100644
--- a/docs/how_to/deploy_models/deploy_prequantized.html
+++ b/docs/how_to/deploy_models/deploy_prequantized.html
@@ -480,9 +480,7 @@ training. Other models require a full post training calibration.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/mobilenet_v2-b0353104.pth&quot; to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
 
   0%|          | 0.00/13.6M [00:00&lt;?, ?B/s]
- 23%|##2       | 3.10M/13.6M [00:00&lt;00:00, 32.5MB/s]
- 46%|####5     | 6.20M/13.6M [00:00&lt;00:00, 31.6MB/s]
-100%|##########| 13.6M/13.6M [00:00&lt;00:00, 47.9MB/s]
+100%|##########| 13.6M/13.6M [00:00&lt;00:00, 169MB/s]
 </pre></div>
 </div>
 </div>
@@ -567,7 +565,7 @@ output values are identical out of 1000 outputs from mobilenet v2.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  90.2953      90.2327      91.4960      90.1105       0.1933
+  90.2392      90.1272      93.5011      89.9589       0.4160
 </pre></div>
 </div>
 <div class="admonition note">
@@ -606,7 +604,7 @@ This includes support for the VNNI 8 bit dot product instruction (CascadeLake or
 <div class="section" id="deploy-a-quantized-tflite-model">
 <h2>Deploy a quantized TFLite Model<a class="headerlink" href="#deploy-a-quantized-tflite-model" title="Permalink to this headline">¶</a></h2>
 <p>TODO</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  9.340 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  7.858 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/fb8217c13f4351224c6cf3aacf1a87fc/deploy_prequantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized_tflite.html b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
index 3073564818..c5adc5c342 100644
--- a/docs/how_to/deploy_models/deploy_prequantized_tflite.html
+++ b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
@@ -569,7 +569,7 @@ TFLite Top-5 labels: [387 102 386 341 349]
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  119.9456     119.9569     120.4957     119.2269      0.2380
+  118.1101     118.1413     123.6279     115.9274      1.0153
 </pre></div>
 </div>
 <div class="admonition note">
@@ -597,7 +597,7 @@ network for ARM CPU</span></a>.</p></li>
 </ul>
 </div></blockquote>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  2.546 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  57.521 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-tflite-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/56691c7a27d45da61d112276334640d3/deploy_prequantized_tflite.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized_tflite.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_quantized.html b/docs/how_to/deploy_models/deploy_quantized.html
index e33ebced9b..d6133c975b 100644
--- a/docs/how_to/deploy_models/deploy_quantized.html
+++ b/docs/how_to/deploy_models/deploy_quantized.html
@@ -507,7 +507,7 @@ for calibration. But the accuracy might be impacted.</p>
   DeprecationWarning,
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  22.959 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  22.663 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-quantized-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7810ecf51bfc05f7d5e8a400ac3e815d/deploy_quantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_quantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
index 8f45548138..13001e8f02 100644
--- a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
+++ b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
@@ -441,24 +441,25 @@ to your device.</p>
 Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
 
   0%|          | 0/132723 [00:00&lt;?, ?KB/s]
-  3%|2         | 3874/132723 [00:00&lt;00:03, 38737.27KB/s]
-  8%|8         | 11013/132723 [00:00&lt;00:02, 57939.44KB/s]
- 14%|#3        | 18488/132723 [00:00&lt;00:01, 65611.77KB/s]
- 20%|#9        | 26144/132723 [00:00&lt;00:01, 69932.56KB/s]
- 25%|##5       | 33748/132723 [00:00&lt;00:01, 72133.18KB/s]
- 31%|###1      | 41382/132723 [00:00&lt;00:01, 73561.85KB/s]
- 37%|###6      | 49013/132723 [00:00&lt;00:01, 74458.29KB/s]
- 43%|####2     | 56693/132723 [00:00&lt;00:01, 75198.00KB/s]
- 48%|####8     | 64345/132723 [00:00&lt;00:00, 75607.94KB/s]
- 54%|#####4    | 72058/132723 [00:01&lt;00:00, 76075.29KB/s]
- 60%|######    | 79783/132723 [00:01&lt;00:00, 76432.57KB/s]
- 66%|######5   | 87454/132723 [00:01&lt;00:00, 76514.88KB/s]
- 72%|#######1  | 95128/132723 [00:01&lt;00:00, 76581.32KB/s]
- 77%|#######7  | 102788/132723 [00:01&lt;00:00, 76585.44KB/s]
- 83%|########3 | 110490/132723 [00:01&lt;00:00, 76713.47KB/s]
- 89%|########9 | 118162/132723 [00:01&lt;00:00, 76601.78KB/s]
- 95%|#########4| 125887/132723 [00:01&lt;00:00, 76781.28KB/s]
-100%|##########| 132723/132723 [00:01&lt;00:00, 74125.95KB/s]
+  2%|1         | 2323/132723 [00:00&lt;00:05, 23106.95KB/s]
+  6%|5         | 7310/132723 [00:00&lt;00:03, 38812.83KB/s]
+ 11%|#         | 14533/132723 [00:00&lt;00:02, 54052.23KB/s]
+ 17%|#6        | 22158/132723 [00:00&lt;00:01, 62802.53KB/s]
+ 22%|##2       | 29749/132723 [00:00&lt;00:01, 67522.84KB/s]
+ 28%|##8       | 37313/132723 [00:00&lt;00:01, 70278.47KB/s]
+ 34%|###3      | 44850/132723 [00:00&lt;00:01, 71938.34KB/s]
+ 39%|###9      | 52415/132723 [00:00&lt;00:01, 73118.08KB/s]
+ 45%|####5     | 59961/132723 [00:00&lt;00:00, 73846.02KB/s]
+ 51%|#####     | 67377/132723 [00:01&lt;00:00, 73938.93KB/s]
+ 56%|#####6    | 74911/132723 [00:01&lt;00:00, 74365.90KB/s]
+ 62%|######2   | 82521/132723 [00:01&lt;00:00, 74889.86KB/s]
+ 68%|######7   | 90022/132723 [00:01&lt;00:00, 74924.27KB/s]
+ 74%|#######3  | 97657/132723 [00:01&lt;00:00, 75347.24KB/s]
+ 79%|#######9  | 105203/132723 [00:01&lt;00:00, 75373.61KB/s]
+ 85%|########4 | 112810/132723 [00:01&lt;00:00, 75581.47KB/s]
+ 91%|######### | 120409/132723 [00:01&lt;00:00, 75699.60KB/s]
+ 97%|#########6| 128105/132723 [00:01&lt;00:00, 76072.69KB/s]
+100%|##########| 132723/132723 [00:01&lt;00:00, 71348.18KB/s]
 </pre></div>
 </div>
 <p>Create TVM runtime and do inference
@@ -497,7 +498,7 @@ Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from h
 <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" srcset="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" alt="deploy ssd gluoncv" class = "sphx-glr-single-img"/><p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  36.767 seconds)</p>
+<img src="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" srcset="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" alt="deploy ssd gluoncv" class = "sphx-glr-single-img"/><p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  35.889 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-ssd-gluoncv-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/cccb17d28e5e8b2e94ea8cd5ec59f6ed/deploy_ssd_gluoncv.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_ssd_gluoncv.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/sg_execution_times.html b/docs/how_to/deploy_models/sg_execution_times.html
index f249cf94b3..a97d96590c 100644
--- a/docs/how_to/deploy_models/sg_execution_times.html
+++ b/docs/how_to/deploy_models/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-deploy-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>11:24.857</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
+<p><strong>11:12.321</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 86%" />
@@ -336,35 +336,35 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_object_detection_pytorch.html#sphx-glr-how-to-deploy-models-deploy-object-detection-pytorch-py"><span class="std std-ref">Compile PyTorch Object Detection Models</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_object_detection_pytorch.py</span></code>)</p></td>
-<td><p>02:57.751</p></td>
+<td><p>02:53.848</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_ssd_gluoncv.html#sphx-glr-how-to-deploy-models-deploy-ssd-gluoncv-py"><span class="std std-ref">Deploy Single Shot Multibox Detector(SSD) model</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_ssd_gluoncv.py</span></code>)</p></td>
-<td><p>02:36.767</p></td>
+<td><p>02:35.889</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_prequantized_tflite.html#sphx-glr-how-to-deploy-models-deploy-prequantized-tflite-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM - Part 3 (TFLite)</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized_tflite.py</span></code>)</p></td>
-<td><p>02:02.546</p></td>
+<td><p>01:57.521</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_quantized.html#sphx-glr-how-to-deploy-models-deploy-quantized-py"><span class="std std-ref">Deploy a Quantized Model on Cuda</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_quantized.py</span></code>)</p></td>
-<td><p>01:22.959</p></td>
+<td><p>01:22.663</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_prequantized.html#sphx-glr-how-to-deploy-models-deploy-prequantized-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized.py</span></code>)</p></td>
-<td><p>01:09.340</p></td>
+<td><p>01:07.858</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_model_on_android.html#sphx-glr-how-to-deploy-models-deploy-model-on-android-py"><span class="std std-ref">Deploy the Pretrained Model on Android</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_android.py</span></code>)</p></td>
-<td><p>00:30.067</p></td>
+<td><p>00:29.641</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_model_on_nano.html#sphx-glr-how-to-deploy-models-deploy-model-on-nano-py"><span class="std std-ref">Deploy the Pretrained Model on Jetson Nano</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_nano.py</span></code>)</p></td>
-<td><p>00:23.191</p></td>
+<td><p>00:22.665</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_model_on_rasp.html#sphx-glr-how-to-deploy-models-deploy-model-on-rasp-py"><span class="std std-ref">Deploy the Pretrained Model on Raspberry Pi</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_rasp.py</span></code>)</p></td>
-<td><p>00:22.229</p></td>
+<td><p>00:22.230</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_sparse.html#sphx-glr-how-to-deploy-models-deploy-sparse-py"><span class="std std-ref">Deploy a Hugging Face Pruned Model on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_sparse.py</span></code>)</p></td>
diff --git a/docs/how_to/extend_tvm/bring_your_own_datatypes.html b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
index 5f64f911fb..0e681471a7 100644
--- a/docs/how_to/extend_tvm/bring_your_own_datatypes.html
+++ b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
@@ -608,7 +608,7 @@ In this alpha state of the Bring Your Own Datatypes framework, we have not imple
 <span class="n">module</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">params</span></a> <span class="o">=</span> <span class="n">get_mobilenet</span><span class="p">()</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zipf85191a6-1797-488c-8310-b8b08200946a from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip81e92deb-670b-4325-98c0-cfac8ef7795d from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 </pre></div>
 </div>
 <p>It’s easy to execute MobileNet with native TVM:</p>
diff --git a/docs/how_to/extend_tvm/sg_execution_times.html b/docs/how_to/extend_tvm/sg_execution_times.html
index 751f99e679..31c4a829a7 100644
--- a/docs/how_to/extend_tvm/sg_execution_times.html
+++ b/docs/how_to/extend_tvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-extend-tvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:41.127</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
+<p><strong>00:40.365</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,19 +336,19 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="bring_your_own_datatypes.html#sphx-glr-how-to-extend-tvm-bring-your-own-datatypes-py"><span class="std std-ref">Bring Your Own Datatypes to TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">bring_your_own_datatypes.py</span></code>)</p></td>
-<td><p>00:38.012</p></td>
+<td><p>00:37.296</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="use_pass_instrument.html#sphx-glr-how-to-extend-tvm-use-pass-instrument-py"><span class="std std-ref">How to Use TVM Pass Instrument</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_instrument.py</span></code>)</p></td>
-<td><p>00:02.193</p></td>
+<td><p>00:02.148</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="use_pass_infra.html#sphx-glr-how-to-extend-tvm-use-pass-infra-py"><span class="std std-ref">How to Use TVM Pass Infra</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_infra.py</span></code>)</p></td>
-<td><p>00:00.914</p></td>
+<td><p>00:00.913</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="low_level_custom_pass.html#sphx-glr-how-to-extend-tvm-low-level-custom-pass-py"><span class="std std-ref">Writing a Customized Pass</span></a> (<code class="docutils literal notranslate"><span class="pre">low_level_custom_pass.py</span></code>)</p></td>
-<td><p>00:00.008</p></td>
+<td><p>00:00.007</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/extend_tvm/use_pass_instrument.html b/docs/how_to/extend_tvm/use_pass_instrument.html
index 4528043ccc..ab500025fc 100644
--- a/docs/how_to/extend_tvm/use_pass_instrument.html
+++ b/docs/how_to/extend_tvm/use_pass_instrument.html
@@ -512,10 +512,10 @@ profile the execution time of each passes.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 6695us [6695us] (46.43%; 46.43%)
-FoldScaleAxis: 7724us [6us] (53.57%; 53.57%)
-        FoldConstant: 7718us [1578us] (53.53%; 99.93%)
-                InferType: 6140us [6140us] (42.59%; 79.56%)
+InferType: 6699us [6699us] (45.83%; 45.83%)
+FoldScaleAxis: 7917us [5us] (54.17%; 54.17%)
+        FoldConstant: 7912us [1637us] (54.13%; 99.94%)
+                InferType: 6275us [6275us] (42.93%; 79.31%)
 </pre></div>
 </div>
 </div>
@@ -537,10 +537,10 @@ Refer to following sections and <a class="reference internal" href="../../refere
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 6151us [6151us] (44.81%; 44.81%)
-FoldScaleAxis: 7576us [5us] (55.19%; 55.19%)
-        FoldConstant: 7571us [1519us] (55.16%; 99.94%)
-                InferType: 6052us [6052us] (44.09%; 79.94%)
+InferType: 6321us [6321us] (44.63%; 44.63%)
+FoldScaleAxis: 7842us [5us] (55.37%; 55.37%)
+        FoldConstant: 7838us [1624us] (55.34%; 99.94%)
+                InferType: 6214us [6214us] (43.87%; 79.28%)
 </pre></div>
 </div>
 <p>Register empty list to clear existing instruments.</p>
diff --git a/docs/how_to/optimize_operators/opt_conv_cuda.html b/docs/how_to/optimize_operators/opt_conv_cuda.html
index 4e232aebea..188d60dd83 100644
--- a/docs/how_to/optimize_operators/opt_conv_cuda.html
+++ b/docs/how_to/optimize_operators/opt_conv_cuda.html
@@ -564,7 +564,7 @@ latency of convolution.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Convolution: </span><span class="si">%f</span><span class="s2"> ms&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span> <span class="o">*</span> <span cl [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 54.149073 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 33.803029 ms
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-optimize-operators-opt-conv-cuda-py">
diff --git a/docs/how_to/optimize_operators/opt_conv_tensorcore.html b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
index b505e65a35..28bcbe2daa 100644
--- a/docs/how_to/optimize_operators/opt_conv_tensorcore.html
+++ b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
@@ -906,7 +906,7 @@ be able to run on our build server</p>
     <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;conv2d with tensor core: </span><span class="si">%f</span><span class="s2"> ms&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span> <span class="o">* [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 7.165055 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 8.071042 ms
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/optimize_operators/opt_gemm.html b/docs/how_to/optimize_operators/opt_gemm.html
index bcec3daf4f..7b1c176f5c 100644
--- a/docs/how_to/optimize_operators/opt_gemm.html
+++ b/docs/how_to/optimize_operators/opt_gemm.html
@@ -461,8 +461,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Baseline: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.019315
-Baseline: 3.299040
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018574
+Baseline: 3.547201
 </pre></div>
 </div>
 <p>In TVM, we can always inspect lower level IR to debug or optimize our schedule.
@@ -522,7 +522,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt1: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.307622
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.294292
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -589,7 +589,7 @@ vastly.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt2: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.355002
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.329266
 </pre></div>
 </div>
 <p>Here is the generated IR after vectorization.</p>
@@ -650,7 +650,7 @@ the access pattern for A matrix is more cache friendly.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt3: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.118580
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.115266
 </pre></div>
 </div>
 <p>Here is the generated IR after loop permutation.</p>
@@ -733,7 +733,7 @@ flattening.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt4: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.109346
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.109356
 </pre></div>
 </div>
 <p>Here is the generated IR after array packing.</p>
@@ -819,7 +819,7 @@ write to C when all the block results are ready.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt5: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.110808
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.111867
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -909,7 +909,7 @@ write to C when all the block results are ready.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt6: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">opt6_time</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.146752
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.147408
 </pre></div>
 </div>
 <p>Here is the generated IR after parallelization.</p>
diff --git a/docs/how_to/optimize_operators/sg_execution_times.html b/docs/how_to/optimize_operators/sg_execution_times.html
index f89fe14143..8cb0a286b1 100644
--- a/docs/how_to/optimize_operators/sg_execution_times.html
+++ b/docs/how_to/optimize_operators/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-optimize-operators-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:34.531</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
+<p><strong>00:34.777</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,15 +336,15 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="opt_gemm.html#sphx-glr-how-to-optimize-operators-opt-gemm-py"><span class="std std-ref">How to optimize GEMM on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_gemm.py</span></code>)</p></td>
-<td><p>00:32.219</p></td>
+<td><p>00:32.513</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="opt_conv_tensorcore.html#sphx-glr-how-to-optimize-operators-opt-conv-tensorcore-py"><span class="std std-ref">How to optimize convolution using TensorCores</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_tensorcore.py</span></code>)</p></td>
-<td><p>00:01.245</p></td>
+<td><p>00:01.244</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="opt_conv_cuda.html#sphx-glr-how-to-optimize-operators-opt-conv-cuda-py"><span class="std std-ref">How to optimize convolution on GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_cuda.py</span></code>)</p></td>
-<td><p>00:01.067</p></td>
+<td><p>00:01.019</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
index a300184a2d..a4b82e900e 100644
--- a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
+++ b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autoscheduler-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>06:14.329</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
+<p><strong>06:18.811</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 85%" />
@@ -336,27 +336,27 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_conv2d_layer_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py"><span class="std std-ref">Auto-scheduling a Convolution Layer for GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_layer_cuda.py</span></code>)</p></td>
-<td><p>03:18.588</p></td>
+<td><p>03:23.028</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_network_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-x86-py"><span class="std std-ref">Auto-scheduling a Neural Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_x86.py</span></code>)</p></td>
-<td><p>01:23.035</p></td>
+<td><p>01:22.364</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_network_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-cuda-py"><span class="std std-ref">Auto-scheduling a Neural Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_cuda.py</span></code>)</p></td>
-<td><p>00:56.131</p></td>
+<td><p>00:56.087</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_sparse_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-sparse-x86-py"><span class="std std-ref">Auto-scheduling Sparse Matrix Multiplication on CPU with Custom Sketch Rule</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_sparse_x86.py</span></code>)</p></td>
-<td><p>00:19.052</p></td>
+<td><p>00:20.042</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_network_mali.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-mali-py"><span class="std std-ref">Auto-scheduling a Neural Network for mali GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_mali.py</span></code>)</p></td>
-<td><p>00:08.818</p></td>
+<td><p>00:08.729</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_network_arm.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-arm-py"><span class="std std-ref">Auto-scheduling a Neural Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_arm.py</span></code>)</p></td>
-<td><p>00:08.705</p></td>
+<td><p>00:08.562</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
index 20af8eec1d..04dbaefdac 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
@@ -1004,7 +1004,7 @@ cooperative fetching, unrolling and operator fusion.</p>
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.353 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.351 ms
 </pre></div>
 </div>
 </div>
@@ -1567,7 +1567,7 @@ In the example below we resume the status and do more 5 trials.</p>
 Get devices for measurement successfully!
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  18.588 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  23.028 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e3e540f3b477c0c52d8eb73e674e8ffd/tune_conv2d_layer_cuda.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_conv2d_layer_cuda.py</span></code></a></p>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
index 97323b8353..7b9d7b6293 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
@@ -902,7 +902,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   8.2317       8.2330       8.2351       8.2270       0.0034
+   8.1813       8.1746       8.1988       8.1705       0.0125
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
index 652e1bd9bb..32872a217d 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
@@ -921,7 +921,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  763.7009     763.8071     764.2583     763.0374      0.5041
+  759.0404     759.5762     761.6077     755.9372      2.3458
 </pre></div>
 </div>
 </div>
@@ -943,7 +943,7 @@ to learn how to use the RPC Tracker and RPC Server.
 To use the RPC Tracker in auto-scheduler, replace the runner in <code class="code docutils literal notranslate"><span class="pre">TuningOptions</span></code>
 with <a class="reference internal" href="../../reference/api/python/auto_scheduler.html#tvm.auto_scheduler.RPCRunner" title="tvm.auto_scheduler.RPCRunner"><code class="xref any py py-class docutils literal notranslate"><span class="pre">auto_scheduler.RPCRunner</span></code></a>.</p></li>
 </ol>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  23.035 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  22.364 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-network-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e416b94ca1090b0897c0f6e0df95b911/tune_network_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_network_x86.py</span></code></a></p>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
index 1d178d87d8..65526c8f73 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
@@ -625,31 +625,103 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
              placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
              compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
   buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-  preflattened_buffer_map = {placeholder_7: placeholder_15: Buffer(placeholder_12, int32, [4916], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_5: placeholder_17: Buffer(placeholder_10, float32, [128, 256], []), placeholder_9: placeholder_18: Buffer(placeholder_14, float32, [128, 512], []), placeholder_8: placeholder_19: Buffer(placeholder_13, int32, [33], [])} {
-  for (i0.outer: int32, 0, 8) &quot;parallel&quot; {
-    allocate(compute_4: Pointer(global float32), float32, [512]), storage_scope = global;
-    for (i1.outer: int32, 0, 16) {
-      for (nb_j.inner: int32, 0, 2) {
-        for (i.inner.init: int32, 0, 16) {
-          for (j.init: int32, 0, 16) {
-            compute_5: Buffer(compute_4, float32, [512], [])[(((i.inner.init*32) + (nb_j.inner*16)) + j.init)] = 0f32
+  preflattened_buffer_map = {placeholder_9: placeholder_15: Buffer(placeholder_14, float32, [128, 512], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_8: placeholder_16: Buffer(placeholder_13, int32, [33], []), placeholder_6: placeholder_17: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_7: placeholder_19: Buffer(placeholder_12, int32, [4916], [])} {
+  for (i0.outer.i1.outer.fused: int32, 0, 32) &quot;parallel&quot; {
+    allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global {
+      for (i.outer.inner: int32, 0, 2) {
+        for (i.inner.init: int32, 0, 64) {
+          let cse_var_1: int32 = ((i.outer.inner*1024) + (i.inner.init*16))
+           {
+            compute_5: Buffer(compute_4, float32, [2048], [])[cse_var_1] = 0f32
+            compute_5[(cse_var_1 + 1)] = 0f32
+            compute_5[(cse_var_1 + 2)] = 0f32
+            compute_5[(cse_var_1 + 3)] = 0f32
+            compute_5[(cse_var_1 + 4)] = 0f32
+            compute_5[(cse_var_1 + 5)] = 0f32
+            compute_5[(cse_var_1 + 6)] = 0f32
+            compute_5[(cse_var_1 + 7)] = 0f32
+            compute_5[(cse_var_1 + 8)] = 0f32
+            compute_5[(cse_var_1 + 9)] = 0f32
+            compute_5[(cse_var_1 + 10)] = 0f32
+            compute_5[(cse_var_1 + 11)] = 0f32
+            compute_5[(cse_var_1 + 12)] = 0f32
+            compute_5[(cse_var_1 + 13)] = 0f32
+            compute_5[(cse_var_1 + 14)] = 0f32
+            compute_5[(cse_var_1 + 15)] = 0f32
           }
         }
-        for (elem_idx: int32, 0, let cse_var_1: int32 = ((i1.outer*2) + nb_j.inner) in (placeholder_3[(cse_var_1 + 1)] - placeholder_3[cse_var_1])) {
-          for (i.inner: int32, 0, 16) {
-            for (j: int32, 0, 16) {
-              let cse_var_3: int32 = ((i1.outer*2) + nb_j.inner)
-              let cse_var_2: int32 = (((i.inner*32) + (nb_j.inner*16)) + j)
-              compute_5[cse_var_2] = (compute_5[cse_var_2] + (placeholder_1[(((placeholder_3[cse_var_3]*16) + (elem_idx*16)) + j)]*max(placeholder[(((i0.outer*4096) + (i.inner*256)) + placeholder_2[(placeholder_3[cse_var_3] + elem_idx)])], 0f32)))
+        for (elem_idx: int32, 0, (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])) {
+          for (i.inner: int32, 0, 64) {
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_2: int32 = ((i.outer.inner*1024) + (i.inner*16))
+              compute_5[cse_var_2] = (compute_5[cse_var_2] + (placeholder_1[((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16))]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_3: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 1)
+              compute_5[cse_var_3] = (compute_5[cse_var_3] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 1)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_4: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 2)
+              compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 2)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_5: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 3)
+              compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 3)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_6: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 4)
+              compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 4)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_7: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 5)
+              compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 5)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_8: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 6)
+              compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 6)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_9: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 7)
+              compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 7)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_10: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 8)
+              compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 8)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_11: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 9)
+              compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 9)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_12: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 10)
+              compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 10)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_13: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 11)
+              compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 11)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_14: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 12)
+              compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 12)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_15: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 13)
+              compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 13)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_16: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 14)
+              compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 14)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+            }
+            if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
+              let cse_var_17: int32 = (((i.outer.inner*1024) + (i.inner*16)) + 15)
+              compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + 15)]*max(placeholder[(((i.outer.inner*16384) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
             }
           }
         }
       }
-      for (i0.inner: int32, 0, 16) {
-        for (i1.inner: int32, 0, 32) {
-          let cse_var_4: int32 = ((((i0.outer*8192) + (i0.inner*512)) + (i1.outer*32)) + i1.inner)
-          compute[cse_var_4] = max((compute_5[((i0.inner*32) + i1.inner)] + placeholder_4[cse_var_4]), 0f32)
-        }
+      for (i0.inner: int32, 0, 128) {
+        let cse_var_18: int32 = ((i0.inner*512) + (i0.outer.i1.outer.fused*16))
+        compute[ramp(cse_var_18, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_18, 1, 16)]), broadcast(0f32, 16))
       }
     }
   }
@@ -687,7 +759,7 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 1.472 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 1.847 ms
 </pre></div>
 </div>
 <div class="admonition note">
diff --git a/docs/how_to/tune_with_autotvm/sg_execution_times.html b/docs/how_to/tune_with_autotvm/sg_execution_times.html
index 5403db2dad..f90a399e79 100644
--- a/docs/how_to/tune_with_autotvm/sg_execution_times.html
+++ b/docs/how_to/tune_with_autotvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autotvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:46.722</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
+<p><strong>00:45.782</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_conv2d_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-conv2d-cuda-py"><span class="std std-ref">Tuning High Performance Convolution on NVIDIA GPUs</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_cuda.py</span></code>)</p></td>
-<td><p>00:46.684</p></td>
+<td><p>00:45.748</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_relay_x86.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-x86-py"><span class="std std-ref">Auto-tuning a Convolutional Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_x86.py</span></code>)</p></td>
-<td><p>00:00.023</p></td>
+<td><p>00:00.019</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_relay_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-cuda-py"><span class="std std-ref">Auto-tuning a Convolutional Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_cuda.py</span></code>)</p></td>
diff --git a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
index e426ce811a..0a3dd775b8 100644
--- a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
+++ b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
@@ -1436,8 +1436,8 @@ No: 8   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
 TimeoutError
 
         [(&#39;tile_f&#39;, [-1, 2, 1, 64]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4909501
-No: 9   GFLOPS: 122.34/122.34   result: MeasureResult(costs=(0.0018922917142857143,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.066124200820923, timestamp=1663570950.7812462)       [(&#39;tile_f&#39;, [-1, 1, 4, 8]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 2, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5072689
-No: 10  GFLOPS: 0.00/122.34     result: Traceback (most recent call last):
+No: 9   GFLOPS: 80.76/80.76     result: MeasureResult(costs=(0.002866573742857143,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6295883655548096, timestamp=1663598748.3520875)       [(&#39;tile_f&#39;, [-1, 1, 4, 8]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 2, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5072689
+No: 10  GFLOPS: 0.00/80.76      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1560,8 +1560,8 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 4, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 64, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5092711
-No: 11  GFLOPS: 260.23/260.23   result: MeasureResult(costs=(0.0008895890939226519,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.72236967086792, timestamp=1663570951.7091606)        [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
-No: 12  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+No: 11  GFLOPS: 260.43/260.43   result: MeasureResult(costs=(0.0008889050165745856,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.740246057510376, timestamp=1663598749.2615347)       [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
+No: 12  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1684,7 +1684,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 128, 1, 2]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 256]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 0)],None,183542
-No: 13  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+No: 13  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1807,7 +1807,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 8, 8]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 64]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2482196
-No: 14  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+No: 14  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1930,9 +1930,9 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 64, 1, 4]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,10306226
-No: 15  GFLOPS: 5.47/260.23     result: MeasureResult(costs=(0.042334316250000004,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8582491874694824, timestamp=1663570956.3224638)       [(&#39;tile_f&#39;, [-1, 2, 2, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 8]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,5330964
-No: 16  GFLOPS: 3.35/260.23     result: MeasureResult(costs=(0.06910300125,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.610316753387451, timestamp=1663570957.5599132)       [(&#39;tile_f&#39;, [-1, 8, 4, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2140058
-No: 17  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+No: 15  GFLOPS: 5.48/260.43     result: MeasureResult(costs=(0.0422195405,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8663091659545898, timestamp=1663598753.808667)        [(&#39;tile_f&#39;, [-1, 2, 2, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 8]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,5330964
+No: 16  GFLOPS: 3.36/260.43     result: MeasureResult(costs=(0.06892734625,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.543661117553711, timestamp=1663598755.042103)        [(&#39;tile_f&#39;, [-1, 8, 4, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2140058
+No: 17  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 142, in build
     res = future.result()
   File &quot;/usr/lib/python3.7/concurrent/futures/_base.py&quot;, line 435, in result
@@ -1950,8 +1950,8 @@ No: 17  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
 TimeoutError
 
         [(&#39;tile_f&#39;, [-1, 2, 2, 1]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 16]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,10195251
-No: 18  GFLOPS: 25.42/260.23    result: MeasureResult(costs=(0.009107807090909092,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2614367008209229, timestamp=1663570968.5666356)       [(&#39;tile_f&#39;, [-1, 4, 8, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6068603
-No: 19  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+No: 18  GFLOPS: 26.32/260.43    result: MeasureResult(costs=(0.008796826,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.1562161445617676, timestamp=1663598765.9055436)        [(&#39;tile_f&#39;, [-1, 4, 8, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6068603
+No: 19  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -2074,7 +2074,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 16, 4, 8]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 128]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6956993
-No: 20  GFLOPS: 0.00/260.23     result: Traceback (most recent call last):
+No: 20  GFLOPS: 0.00/260.43     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -2237,7 +2237,7 @@ and measure running time.</p>
 Best config:
 [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
 Finish loading 20 records
-Time cost of this operator: 0.001281
+Time cost of this operator: 0.001219
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autotvm-tune-conv2d-cuda-py">
diff --git a/docs/how_to/work_with_microtvm/micro_autotune.html b/docs/how_to/work_with_microtvm/micro_autotune.html
index eefc416878..0e5aec0736 100644
--- a/docs/how_to/work_with_microtvm/micro_autotune.html
+++ b/docs/how_to/work_with_microtvm/micro_autotune.html
@@ -582,10 +582,10 @@ the tuned operator.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>########## Build without Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
 ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  310.7     98.73    (1, 2, 10, 10, 3)  2       1        [310.7]
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.018     0.959    (1, 6, 10, 10)     1       1        [3.018]
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.978     0.311    (1, 1, 10, 10, 3)  1       1        [0.978]
-Total_time                                    -                                             314.696   -        -                  -       -        -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  311.5     98.726   (1, 2, 10, 10, 3)  2       1        [311.5]
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.067     0.972    (1, 6, 10, 10)     1       1        [3.067]
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.954     0.302    (1, 1, 10, 10, 3)  1       1        [0.954]
+Total_time                                    -                                             315.521   -        -                  -       -        -
 </pre></div>
 </div>
 </div>
@@ -636,10 +636,10 @@ Total_time                                    -
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>########## Build with Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
 ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  79.625    96.674   (1, 6, 10, 10, 1)  2       1        [79.625]
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.767     2.146    (1, 6, 10, 10)     1       1        [1.767]
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.972     1.18     (1, 1, 10, 10, 3)  1       1        [0.972]
-Total_time                                    -                                             82.365    -        -                  -       -        -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  130.3     97.914   (1, 6, 10, 10, 1)  2       1        [130.3]
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.805     1.356    (1, 6, 10, 10)     1       1        [1.805]
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.971     0.73     (1, 1, 10, 10, 3)  1       1        [0.971]
+Total_time                                    -                                             133.076   -        -                  -       -        -
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-work-with-microtvm-micro-autotune-py">
diff --git a/docs/how_to/work_with_microtvm/micro_train.html b/docs/how_to/work_with_microtvm/micro_train.html
index e9a9a3f505..59d2d35941 100644
--- a/docs/how_to/work_with_microtvm/micro_train.html
+++ b/docs/how_to/work_with_microtvm/micro_train.html
@@ -516,7 +516,7 @@ take about <strong>2 minutes</strong> to download the Stanford Cars, while COCO
 <a href="https://docs.python.org/3/library/shutil.html#shutil.move" title="shutil.move" class="sphx-glr-backref-module-shutil sphx-glr-backref-type-py-function"><span class="n">shutil</span><span class="o">.</span><span class="n">move</span></a><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><a href="https://docs.python.org/3/library/stdtypes.html#str" title="builtins.str" class="sphx-glr-backref-module-builtins sphx-glr-backref-typ [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&#39;/tmp/tmp3t0xlwvq/images/random&#39;
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&#39;/tmp/tmp_74ha2pj/images/random&#39;
 </pre></div>
 </div>
 </div>
@@ -576,8 +576,8 @@ objects to other stuff? We can display some examples from our datasets using <co
     <span class="n">plt</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&quot;off&quot;</span><span class="p">)</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_micro_train_001.png" srcset="../../_images/sphx_glr_micro_train_001.png" alt="[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/tmp/tmp3t0xlwvq/images/target contains 8144 images
-/tmp/tmp3t0xlwvq/images/random contains 5000 images
+<img src="../../_images/sphx_glr_micro_train_001.png" srcset="../../_images/sphx_glr_micro_train_001.png" alt="[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/tmp/tmp_74ha2pj/images/target contains 8144 images
+/tmp/tmp_74ha2pj/images/random contains 5000 images
 </pre></div>
 </div>
 </div>
@@ -689,13 +689,13 @@ the time on our validation set).</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Epoch 1/3
-328/328 - 47s - loss: 0.2141 - accuracy: 0.9268 - val_loss: 0.1278 - val_accuracy: 0.9569 - 47s/epoch - 144ms/step
+328/328 - 46s - loss: 0.2312 - accuracy: 0.9204 - val_loss: 0.1370 - val_accuracy: 0.9554 - 46s/epoch - 141ms/step
 Epoch 2/3
-328/328 - 43s - loss: 0.0990 - accuracy: 0.9631 - val_loss: 0.1031 - val_accuracy: 0.9660 - 43s/epoch - 132ms/step
+328/328 - 43s - loss: 0.0946 - accuracy: 0.9645 - val_loss: 0.1184 - val_accuracy: 0.9603 - 43s/epoch - 132ms/step
 Epoch 3/3
-328/328 - 43s - loss: 0.0634 - accuracy: 0.9771 - val_loss: 0.1065 - val_accuracy: 0.9611 - 43s/epoch - 132ms/step
+328/328 - 43s - loss: 0.0677 - accuracy: 0.9740 - val_loss: 0.1360 - val_accuracy: 0.9596 - 43s/epoch - 131ms/step
 
-&lt;keras.callbacks.History object at 0x7f082c7edb10&gt;
+&lt;keras.callbacks.History object at 0x7fad68c22fd0&gt;
 </pre></div>
 </div>
 </div>
@@ -957,7 +957,7 @@ as intended.</p>
 <p>From here, we could modify the model to read live images from the camera - we have another
 Arduino tutorial for how to do that <a class="reference external" href="https://github.com/guberti/tvm-arduino-demos/tree/master/examples/person_detection">on GitHub</a>. Alternatively, we could also
 <a class="reference external" href="https://tvm.apache.org/docs/how_to/work_with_microtvm/micro_autotune.html">use TVM’s autotuning capabilities</a> to dramatically improve the model’s performance.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 4 minutes  41.436 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 4 minutes  20.966 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-work-with-microtvm-micro-train-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/b52cec46baf4f78d6bcd94cbe269c8a6/micro_train.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">micro_train.py</span></code></a></p>
diff --git a/docs/how_to/work_with_microtvm/sg_execution_times.html b/docs/how_to/work_with_microtvm/sg_execution_times.html
index 500352d15d..c198281607 100644
--- a/docs/how_to/work_with_microtvm/sg_execution_times.html
+++ b/docs/how_to/work_with_microtvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-microtvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>05:35.915</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
+<p><strong>05:14.379</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,19 +336,19 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_train.html#sphx-glr-how-to-work-with-microtvm-micro-train-py"><span class="std std-ref">Training Vision Models for microTVM on Arduino</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_train.py</span></code>)</p></td>
-<td><p>04:41.436</p></td>
+<td><p>04:20.966</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="micro_autotune.html#sphx-glr-how-to-work-with-microtvm-micro-autotune-py"><span class="std std-ref">Autotuning with microTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_autotune.py</span></code>)</p></td>
-<td><p>00:42.436</p></td>
+<td><p>00:41.957</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_aot.html#sphx-glr-how-to-work-with-microtvm-micro-aot-py"><span class="std std-ref">microTVM Host-Driven AoT</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_aot.py</span></code>)</p></td>
-<td><p>00:08.732</p></td>
+<td><p>00:08.158</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="micro_tflite.html#sphx-glr-how-to-work-with-microtvm-micro-tflite-py"><span class="std std-ref">microTVM with TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_tflite.py</span></code>)</p></td>
-<td><p>00:03.309</p></td>
+<td><p>00:03.296</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_ethosu.html#sphx-glr-how-to-work-with-microtvm-micro-ethosu-py"><span class="std std-ref">Running TVM on bare metal Arm(R) Cortex(R)-M55 CPU and Ethos(TM)-U55 NPU with CMSIS-NN</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_ethosu.py</span></code>)</p></td>
diff --git a/docs/how_to/work_with_relay/sg_execution_times.html b/docs/how_to/work_with_relay/sg_execution_times.html
index 03e8f34653..8b84851ead 100644
--- a/docs/how_to/work_with_relay/sg_execution_times.html
+++ b/docs/how_to/work_with_relay/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-relay-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:46.379</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
+<p><strong>00:43.074</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,19 +336,19 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="using_pipeline_executor.html#sphx-glr-how-to-work-with-relay-using-pipeline-executor-py"><span class="std std-ref">Using Pipeline Executor in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_pipeline_executor.py</span></code>)</p></td>
-<td><p>00:32.692</p></td>
+<td><p>00:31.385</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="using_external_lib.html#sphx-glr-how-to-work-with-relay-using-external-lib-py"><span class="std std-ref">Using External Libraries in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_external_lib.py</span></code>)</p></td>
-<td><p>00:11.336</p></td>
+<td><p>00:10.046</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="build_gcn.html#sphx-glr-how-to-work-with-relay-build-gcn-py"><span class="std std-ref">Building a Graph Convolutional Network</span></a> (<code class="docutils literal notranslate"><span class="pre">build_gcn.py</span></code>)</p></td>
-<td><p>00:02.345</p></td>
+<td><p>00:01.637</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="using_relay_viz.html#sphx-glr-how-to-work-with-relay-using-relay-viz-py"><span class="std std-ref">Use Relay Visualizer to Visualize Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_relay_viz.py</span></code>)</p></td>
-<td><p>00:00.007</p></td>
+<td><p>00:00.006</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/work_with_schedules/intrin_math.html b/docs/how_to/work_with_schedules/intrin_math.html
index 2a0397836f..287a6ac38b 100644
--- a/docs/how_to/work_with_schedules/intrin_math.html
+++ b/docs/how_to/work_with_schedules/intrin_math.html
@@ -522,7 +522,7 @@ The following example customizes CUDA lowering rule for <code class="code docuti
 <a href="../../reference/api/python/ir.html#tvm.ir.register_intrin_lowering" title="tvm.ir.register_intrin_lowering" class="sphx-glr-backref-module-tvm-ir sphx-glr-backref-type-py-function"><span class="n">register_intrin_lowering</span></a><span class="p">(</span><span class="s2">&quot;tir.exp&quot;</span><span class="p">,</span> <span class="n">target</span><span class="o">=</span><span class="s2">&quot;cuda&quot;</span><span class="p">,</span> <span class="n">f</span><span class="o">= [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&lt;function my_cuda_math_rule at 0x7f07c0901d40&gt;
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&lt;function my_cuda_math_rule at 0x7fad0dbcb680&gt;
 </pre></div>
 </div>
 <p>Register the rule to TVM with override option to override existing rule.
diff --git a/docs/how_to/work_with_schedules/sg_execution_times.html b/docs/how_to/work_with_schedules/sg_execution_times.html
index b5289f2792..b8b1890659 100644
--- a/docs/how_to/work_with_schedules/sg_execution_times.html
+++ b/docs/how_to/work_with_schedules/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-schedules-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:09.899</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
+<p><strong>00:07.969</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,23 +336,23 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="intrin_math.html#sphx-glr-how-to-work-with-schedules-intrin-math-py"><span class="std std-ref">Intrinsics and Math Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">intrin_math.py</span></code>)</p></td>
-<td><p>00:06.996</p></td>
+<td><p>00:05.692</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tensorize.html#sphx-glr-how-to-work-with-schedules-tensorize-py"><span class="std std-ref">Use Tensorize to Leverage Hardware Intrinsics</span></a> (<code class="docutils literal notranslate"><span class="pre">tensorize.py</span></code>)</p></td>
-<td><p>00:01.600</p></td>
+<td><p>00:01.030</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="reduction.html#sphx-glr-how-to-work-with-schedules-reduction-py"><span class="std std-ref">Reduction</span></a> (<code class="docutils literal notranslate"><span class="pre">reduction.py</span></code>)</p></td>
-<td><p>00:00.571</p></td>
+<td><p>00:00.542</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="scan.html#sphx-glr-how-to-work-with-schedules-scan-py"><span class="std std-ref">Scan and Recurrent Kernel</span></a> (<code class="docutils literal notranslate"><span class="pre">scan.py</span></code>)</p></td>
-<td><p>00:00.546</p></td>
+<td><p>00:00.525</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="extern_op.html#sphx-glr-how-to-work-with-schedules-extern-op-py"><span class="std std-ref">External Tensor Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">extern_op.py</span></code>)</p></td>
-<td><p>00:00.100</p></td>
+<td><p>00:00.098</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="schedule_primitives.html#sphx-glr-how-to-work-with-schedules-schedule-primitives-py"><span class="std std-ref">Schedule Primitives in TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">schedule_primitives.py</span></code>)</p></td>
@@ -360,7 +360,7 @@
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tedd.html#sphx-glr-how-to-work-with-schedules-tedd-py"><span class="std std-ref">Use Tensor Expression Debug Display (TEDD) for Visualization</span></a> (<code class="docutils literal notranslate"><span class="pre">tedd.py</span></code>)</p></td>
-<td><p>00:00.029</p></td>
+<td><p>00:00.027</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tuple_inputs.html#sphx-glr-how-to-work-with-schedules-tuple-inputs-py"><span class="std std-ref">Compute and Reduce with Tuple Inputs</span></a> (<code class="docutils literal notranslate"><span class="pre">tuple_inputs.py</span></code>)</p></td>
diff --git a/docs/how_to/work_with_schedules/tensorize.html b/docs/how_to/work_with_schedules/tensorize.html
index b00f869b52..7705e0dfdb 100644
--- a/docs/how_to/work_with_schedules/tensorize.html
+++ b/docs/how_to/work_with_schedules/tensorize.html
@@ -577,7 +577,7 @@ The importing needs to happen before the tensorized GEMV being executed.</p>
              C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
   buffer_map = {A_1: A, B_1: B, C_1: C}
   preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmpiuk6tos9/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmpiuk6tos9/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
+  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmpdz20gvwn/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmpdz20gvwn/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
   for (i, 0, 1024) {
     for (j.outer: int32, 0, 32) {
       @tir.call_extern(&quot;gemv_update&quot;, @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
diff --git a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html
index cf654aee52..7413ad6853 100644
--- a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html
+++ b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode-members.html
@@ -149,7 +149,7 @@ $(function() {
   <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#aaca1621ab9c3db0ddd04ac57de79d37f">Tensorize</a>(const BlockRV &amp;block_rv, const String &amp;intrin)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
   <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a953bca4123b5a758adfdcd65634a5f3b">trace</a>() const =0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
   <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a998b22e37ef63a697a984c8ebcc39ca2">TransformBlockLayout</a>(const BlockRV &amp;block_rv, const IndexMap &amp;index_map)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c">TransformLayout</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type, const IndexMap &amp;index_map)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></td></tr>
+  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a55848f8f7a3293731cc4f4ed3832e901">TransformLayout</a>(const BlockRV &amp;block_rv, int buffer_index, BufferIndexType buffer_index_type, const IndexMap &amp;index_map, const Optional&lt; IndexMap &gt; &amp;pad_value=NullOpt)=0</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"><span class="mlabel">pure virtual</span></ [...]
   <tr><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#abc4294398b140f3ff13a33f94a2f9e5f">TVM_DECLARE_FINAL_OBJECT_INFO</a>(ScheduleNode, runtime::Object)</td><td class="entry"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html">tvm::tir::ScheduleNode</a></td><td class="entry"></td></tr>
   <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a481f01923b14e1851ebd38506e9c66ea">type_index</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
   <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a4bfc2586cb55f2af47728187b3256255">type_index_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
diff --git a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html
index e160ed8813..4cbfef6645 100644
--- a/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html
+++ b/docs/reference/api/doxygen/classtvm_1_1tir_1_1ScheduleNode.html
@@ -255,9 +255,9 @@ Public Member Functions</h2></td></tr>
 <tr class="memitem:a7c310bca5d1583e61a3f27052a1dd5d0"><td class="memItemLeft" align="right" valign="top">virtual void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a7c310bca5d1583e61a3f27052a1dd5d0">Unannotate</a> (const <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> &amp;block_rv, const <a class="el" href="classtvm_1_1runtime_1_1String.html">String</a> &amp;ann_key)=0</td></tr>
 <tr class="memdesc:a7c310bca5d1583e61a3f27052a1dd5d0"><td class="mdescLeft">&#160;</td><td class="mdescRight">Unannotate a block's annotation with key ann_key.  <a href="#a7c310bca5d1583e61a3f27052a1dd5d0">More...</a><br /></td></tr>
 <tr class="separator:a7c310bca5d1583e61a3f27052a1dd5d0"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a63d45b3109e1dbebcdd4d4f2223b395c"><td class="memItemLeft" align="right" valign="top">virtual void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c">TransformLayout</a> (const <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> &amp;block_rv, int buffer_index, <a class="el" href="namespacetvm_1_1tir.html#a1c8232edeb2fcce8eb95477c5153237a">BufferIndexType</a> buffer [...]
-<tr class="memdesc:a63d45b3109e1dbebcdd4d4f2223b395c"><td class="mdescLeft">&#160;</td><td class="mdescRight">Apply a transformation represented by <a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> to buffer.  <a href="#a63d45b3109e1dbebcdd4d4f2223b395c">More...</a><br /></td></tr>
-<tr class="separator:a63d45b3109e1dbebcdd4d4f2223b395c"><td class="memSeparator" colspan="2">&#160;</td></tr>
+<tr class="memitem:a55848f8f7a3293731cc4f4ed3832e901"><td class="memItemLeft" align="right" valign="top">virtual void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a55848f8f7a3293731cc4f4ed3832e901">TransformLayout</a> (const <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> &amp;block_rv, int buffer_index, <a class="el" href="namespacetvm_1_1tir.html#a1c8232edeb2fcce8eb95477c5153237a">BufferIndexType</a> buffer [...]
+<tr class="memdesc:a55848f8f7a3293731cc4f4ed3832e901"><td class="mdescLeft">&#160;</td><td class="mdescRight">Apply a transformation represented by <a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> to buffer.  <a href="#a55848f8f7a3293731cc4f4ed3832e901">More...</a><br /></td></tr>
+<tr class="separator:a55848f8f7a3293731cc4f4ed3832e901"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a998b22e37ef63a697a984c8ebcc39ca2"><td class="memItemLeft" align="right" valign="top">virtual void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a998b22e37ef63a697a984c8ebcc39ca2">TransformBlockLayout</a> (const <a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> &amp;block_rv, const <a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> &amp;index_map)=0</td></tr>
 <tr class="memdesc:a998b22e37ef63a697a984c8ebcc39ca2"><td class="mdescLeft">&#160;</td><td class="mdescRight">Apply a transformation represented by <a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> to block.  <a href="#a998b22e37ef63a697a984c8ebcc39ca2">More...</a><br /></td></tr>
 <tr class="separator:a998b22e37ef63a697a984c8ebcc39ca2"><td class="memSeparator" colspan="2">&#160;</td></tr>
@@ -2642,8 +2642,8 @@ Additional Inherited Members</h2></td></tr>
 
 </div>
 </div>
-<a id="a63d45b3109e1dbebcdd4d4f2223b395c"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a63d45b3109e1dbebcdd4d4f2223b395c">&#9670;&nbsp;</a></span>TransformLayout()</h2>
+<a id="a55848f8f7a3293731cc4f4ed3832e901"></a>
+<h2 class="memtitle"><span class="permalink"><a href="#a55848f8f7a3293731cc4f4ed3832e901">&#9670;&nbsp;</a></span>TransformLayout()</h2>
 
 <div class="memitem">
 <div class="memproto">
@@ -2673,7 +2673,13 @@ Additional Inherited Members</h2></td></tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">const <a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> &amp;&#160;</td>
-          <td class="paramname"><em>index_map</em>&#160;</td>
+          <td class="paramname"><em>index_map</em>, </td>
+        </tr>
+        <tr>
+          <td class="paramkey"></td>
+          <td></td>
+          <td class="paramtype">const <a class="el" href="classtvm_1_1runtime_1_1Optional.html">Optional</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> &gt; &amp;&#160;</td>
+          <td class="paramname"><em>pad_value</em> = <code><a class="el" href="namespacetvm.html#aae7034e3e41c18e7fb78ff32bfc6a318">NullOpt</a></code>&#160;</td>
         </tr>
         <tr>
           <td></td>
@@ -2694,10 +2700,12 @@ Additional Inherited Members</h2></td></tr>
     <tr><td class="paramname">block_rv</td><td>The block that accesses the target buffer. </td></tr>
     <tr><td class="paramname">buffer_index</td><td>The index of the buffer in block's read or write region. </td></tr>
     <tr><td class="paramname">buffer_index_type</td><td>The type of the buffer index, kRead or kWrite. </td></tr>
-    <tr><td class="paramname">index_map</td><td>The transformation to apply. </td></tr>
+    <tr><td class="paramname">index_map</td><td>The transformation to apply.</td></tr>
+    <tr><td class="paramname">pad_value</td><td>The value to write into padding introduced by the transformation. If the schedule contains a producer block for the specified buffer, the pad value will be written as part of the producer block if possible, or after the producer block otherwise. If the buffer is instead an input, an annotation block will be inserted to state that the padding contains the known value.</td></tr>
   </table>
   </dd>
 </dl>
+<p>Note: If applied to an input buffer, the calling scope is responsible for ensuring that the pad_value is present. Algebraic simplifications, branch elimination, and other optimizations may assume that this precondition is met, and may produce incorrect results if it is not. </p>
 
 </div>
 </div>
diff --git a/docs/reference/api/doxygen/database_8h_source.html b/docs/reference/api/doxygen/database_8h_source.html
index f78c67d542..eb21b1e211 100644
--- a/docs/reference/api/doxygen/database_8h_source.html
+++ b/docs/reference/api/doxygen/database_8h_source.html
@@ -83,7 +83,7 @@ $(function() {
 <div class="ttc" id="structtvm_1_1meta__schedule_1_1WorkloadHash_html_a7cb09ddc6c76d9d00ddbeab8502d97cb"><div class="ttname"><a href="structtvm_1_1meta__schedule_1_1WorkloadHash.html#a7cb09ddc6c76d9d00ddbeab8502d97cb">tvm::meta_schedule::WorkloadHash::operator()</a></div><div class="ttdeci">size_t operator()(const Workload &amp;a) const</div><div class="ttdef"><b>Definition:</b> database.h:91</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyDatabaseNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyDatabaseNode.html">tvm::meta_schedule::PyDatabaseNode</a></div><div class="ttdoc">The database with customized methods on the python-side. </div><div class="ttdef"><b>Definition:</b> database.h:239</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:694</div></div>
 <div class="ttc" id="classtvm_1_1StructuralEqual_html"><div class="ttname"><a href="classtvm_1_1StructuralEqual.html">tvm::StructuralEqual</a></div><div class="ttdoc">Content-aware structural equality comparator for objects. </div><div class="ttdef"><b>Definition:</b> structural_equal.h:103</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1TuningRecordNode_html_a8cc2d64f796593a1a774eef259f17b29"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1TuningRecordNode.html#a8cc2d64f796593a1a774eef259f17b29">tvm::meta_schedule::TuningRecordNode::trace</a></div><div class="ttdeci">tir::Trace trace</div><div class="ttdoc">The trace tuned. </div><div class="ttdef"><b>Definition:</b> database.h:108</div></div>
 <div class="ttc" id="arg__info_8h_html"><div class="ttname"><a href="arg__info_8h.html">arg_info.h</a></div></div>
diff --git a/docs/reference/api/doxygen/functions_func_t.html b/docs/reference/api/doxygen/functions_func_t.html
index 7be48f887e..0176c5e1cd 100644
--- a/docs/reference/api/doxygen/functions_func_t.html
+++ b/docs/reference/api/doxygen/functions_func_t.html
@@ -215,7 +215,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a998b22e37ef63a697a984c8ebcc39ca2">tvm::tir::ScheduleNode</a>
 </li>
 <li>TransformLayout()
-: <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c">tvm::tir::ScheduleNode</a>
+: <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a55848f8f7a3293731cc4f4ed3832e901">tvm::tir::ScheduleNode</a>
 </li>
 <li>TryDowncast()
 : <a class="el" href="classtvm_1_1TracedObject.html#a1f8bbeb719ce563a5cb4d9ea36e125d2">tvm::TracedObject&lt; RefT &gt;</a>
diff --git a/docs/reference/api/doxygen/functions_t.html b/docs/reference/api/doxygen/functions_t.html
index 50d5dbb4a5..1d24d300c8 100644
--- a/docs/reference/api/doxygen/functions_t.html
+++ b/docs/reference/api/doxygen/functions_t.html
@@ -368,7 +368,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1te_1_1TransformNode.html#a034d22228133e50074502bfe1f495935">tvm::te::TransformNode</a>
 </li>
 <li>TransformLayout()
-: <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c">tvm::tir::ScheduleNode</a>
+: <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a55848f8f7a3293731cc4f4ed3832e901">tvm::tir::ScheduleNode</a>
 </li>
 <li>transpose_a
 : <a class="el" href="structtvm_1_1relay_1_1BatchMatmulAttrs.html#aea3a5e93559981fc31122615d677d831">tvm::relay::BatchMatmulAttrs</a>
diff --git a/docs/reference/api/doxygen/measure__candidate_8h_source.html b/docs/reference/api/doxygen/measure__candidate_8h_source.html
index f810a31c46..c112db434d 100644
--- a/docs/reference/api/doxygen/measure__candidate_8h_source.html
+++ b/docs/reference/api/doxygen/measure__candidate_8h_source.html
@@ -71,7 +71,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1MeasureCandidateNode_html_a99858dbe74082cc52938ac942523d792"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html#a99858dbe74082cc52938ac942523d792">tvm::meta_schedule::MeasureCandidateNode::VisitAttrs</a></div><div class="ttdeci">void VisitAttrs(tvm::AttrVisitor *v)</div><div class="ttdef"><b>Definition:</b> measure_candidate.h:40</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1MeasureCandidateNode_html_a6891e92cac8712bb690401ed121ae7e8"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html#a6891e92cac8712bb690401ed121ae7e8">tvm::meta_schedule::MeasureCandidateNode::args_info</a></div><div class="ttdeci">Array&lt; ArgInfo &gt; args_info</div><div class="ttdoc">The argument information, e.g., (shape, dtype) for tensors. </div><div class="ttdef"><b>Definition:</b> measure_candidate. [...]
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:694</div></div>
 <div class="ttc" id="arg__info_8h_html"><div class="ttname"><a href="arg__info_8h.html">arg_info.h</a></div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1MeasureCandidateNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html">tvm::meta_schedule::MeasureCandidateNode</a></div><div class="ttdoc">The schedule (with input shapes) to be measured. </div><div class="ttdef"><b>Definition:</b> measure_candidate.h:33</div></div>
 <div class="ttc" id="array_8h_html"><div class="ttname"><a href="array_8h.html">array.h</a></div><div class="ttdoc">Runtime Array container types. </div></div>
diff --git a/docs/reference/api/doxygen/postproc_8h_source.html b/docs/reference/api/doxygen/postproc_8h_source.html
index e4721f4c6a..a5570a414a 100644
--- a/docs/reference/api/doxygen/postproc_8h_source.html
+++ b/docs/reference/api/doxygen/postproc_8h_source.html
@@ -75,7 +75,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyPostprocNode_html_a3771e585727ef6dfecc502ffe57fd2a2"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyPostprocNode.html#a3771e585727ef6dfecc502ffe57fd2a2">tvm::meta_schedule::PyPostprocNode::f_apply</a></div><div class="ttdeci">FApply f_apply</div><div class="ttdoc">The packed function to the Apply function. </div><div class="ttdef"><b>Definition:</b> postproc.h:166</div></div>
 <div class="ttc" id="object_8h_html_aaaa3dc5b6dc33f84b2d28f9a81267212"><div class="ttname"><a href="object_8h.html#aaaa3dc5b6dc33f84b2d28f9a81267212">TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS</a></div><div class="ttdeci">#define TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(TypeName, ParentType, ObjectName)</div><div class="ttdef"><b>Definition:</b> object.h:744</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:694</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1TuneContext_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1TuneContext.html">tvm::meta_schedule::TuneContext</a></div><div class="ttdoc">Managed reference to TuneContextNode. </div><div class="ttdef"><b>Definition:</b> tune_context.h:135</div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1Postproc_html_a1b95aac48704d0c0740ede2040b942bb"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1Postproc.html#a1b95aac48704d0c0740ede2040b942bb">tvm::meta_schedule::Postproc::FAsString</a></div><div class="ttdeci">runtime::TypedPackedFunc&lt; String()&gt; FAsString</div><div class="ttdoc">Get the postprocessor function as string with name. </div><div class="ttdef"><b>Definition:</b> postproc.h:94</div></div>
diff --git a/docs/reference/api/doxygen/schedule__rule_8h_source.html b/docs/reference/api/doxygen/schedule__rule_8h_source.html
index 1b753aaf42..a18acec3d3 100644
--- a/docs/reference/api/doxygen/schedule__rule_8h_source.html
+++ b/docs/reference/api/doxygen/schedule__rule_8h_source.html
@@ -80,7 +80,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="object_8h_html_aaaa3dc5b6dc33f84b2d28f9a81267212"><div class="ttname"><a href="object_8h.html#aaaa3dc5b6dc33f84b2d28f9a81267212">TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS</a></div><div class="ttdeci">#define TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(TypeName, ParentType, ObjectName)</div><div class="ttdef"><b>Definition:</b> object.h:744</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyScheduleRuleNode_html_a752192bcb5385b1ba72b7c1856c6f360"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyScheduleRuleNode.html#a752192bcb5385b1ba72b7c1856c6f360">tvm::meta_schedule::PyScheduleRuleNode::f_apply</a></div><div class="ttdeci">FApply f_apply</div><div class="ttdoc">The packed function to the Apply function. </div><div class="ttdef"><b>Definition:</b> schedule_rule.h:263</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:694</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1TuneContext_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1TuneContext.html">tvm::meta_schedule::TuneContext</a></div><div class="ttdoc">Managed reference to TuneContextNode. </div><div class="ttdef"><b>Definition:</b> tune_context.h:135</div></div>
 <div class="ttc" id="array_8h_html"><div class="ttname"><a href="array_8h.html">array.h</a></div><div class="ttdoc">Runtime Array container types. </div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
diff --git a/docs/reference/api/doxygen/search/all_15.js b/docs/reference/api/doxygen/search/all_15.js
index b98da53838..1617c601cc 100644
--- a/docs/reference/api/doxygen/search/all_15.js
+++ b/docs/reference/api/doxygen/search/all_15.js
@@ -196,7 +196,7 @@ var searchData=
   ['transform_5fsteps',['transform_steps',['../classtvm_1_1auto__scheduler_1_1StateNode.html#a980f03e5744ed104cf231219a4895d5e',1,'tvm::auto_scheduler::StateNode']]],
   ['transformblocklayout',['TransformBlockLayout',['../classtvm_1_1tir_1_1ScheduleNode.html#a998b22e37ef63a697a984c8ebcc39ca2',1,'tvm::tir::ScheduleNode']]],
   ['transformed_5fvariables',['transformed_variables',['../classtvm_1_1te_1_1TransformNode.html#a034d22228133e50074502bfe1f495935',1,'tvm::te::TransformNode']]],
-  ['transformlayout',['TransformLayout',['../classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c',1,'tvm::tir::ScheduleNode']]],
+  ['transformlayout',['TransformLayout',['../classtvm_1_1tir_1_1ScheduleNode.html#a55848f8f7a3293731cc4f4ed3832e901',1,'tvm::tir::ScheduleNode']]],
   ['transformnode',['TransformNode',['../classtvm_1_1te_1_1TransformNode.html',1,'tvm::te']]],
   ['transpose',['transpose',['../namespacetvm_1_1topi.html#a1488ee98fd053e8b01b481f720df77fa',1,'tvm::topi']]],
   ['transpose_5fa',['transpose_a',['../structtvm_1_1relay_1_1MatmulAttrs.html#a397aa1573fc7e0bc13930390298a22fc',1,'tvm::relay::MatmulAttrs::transpose_a()'],['../structtvm_1_1relay_1_1BatchMatmulAttrs.html#aea3a5e93559981fc31122615d677d831',1,'tvm::relay::BatchMatmulAttrs::transpose_a()']]],
diff --git a/docs/reference/api/doxygen/search/functions_14.js b/docs/reference/api/doxygen/search/functions_14.js
index 9e07e7f19a..a1b12d1f8d 100644
--- a/docs/reference/api/doxygen/search/functions_14.js
+++ b/docs/reference/api/doxygen/search/functions_14.js
@@ -68,7 +68,7 @@ var searchData=
   ['transform',['Transform',['../classtvm_1_1te_1_1Transform.html#a51422cc2290f6b87fe61edb0db691125',1,'tvm::te::Transform']]],
   ['transform_5flayout',['transform_layout',['../classtvm_1_1te_1_1Stage.html#acec77eca6c9a4f1738a7c119d7ac2c2c',1,'tvm::te::Stage']]],
   ['transformblocklayout',['TransformBlockLayout',['../classtvm_1_1tir_1_1ScheduleNode.html#a998b22e37ef63a697a984c8ebcc39ca2',1,'tvm::tir::ScheduleNode']]],
-  ['transformlayout',['TransformLayout',['../classtvm_1_1tir_1_1ScheduleNode.html#a63d45b3109e1dbebcdd4d4f2223b395c',1,'tvm::tir::ScheduleNode']]],
+  ['transformlayout',['TransformLayout',['../classtvm_1_1tir_1_1ScheduleNode.html#a55848f8f7a3293731cc4f4ed3832e901',1,'tvm::tir::ScheduleNode']]],
   ['transpose',['transpose',['../namespacetvm_1_1topi.html#a1488ee98fd053e8b01b481f720df77fa',1,'tvm::topi']]],
   ['traverseafterreduce',['TraverseAfterReduce',['../namespacetvm_1_1topi_1_1cuda.html#a9009672dab261008d66d4e59d896935f',1,'tvm::topi::cuda']]],
   ['traversebeforereduce',['TraverseBeforeReduce',['../namespacetvm_1_1topi_1_1cuda.html#a9d51320c5b9bd9147018689b1b5f1279',1,'tvm::topi::cuda']]],
diff --git a/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html b/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html
index 8cba00829b..374fc7e07b 100644
--- a/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html
+++ b/docs/reference/api/doxygen/tir_2schedule_2schedule_8h_source.html
@@ -66,7 +66,7 @@ $(function() {
 <div class="title">schedule.h</div>  </div>
 </div><!--header-->
 <div class="contents">
-<a href="tir_2schedule_2schedule_8h.html">Go to the documentation of this file.</a><div class="fragment"><div class="line"><a name="l00001"></a><span class="lineno">    1</span>&#160;<span class="comment">/*</span></div><div class="line"><a name="l00002"></a><span class="lineno">    2</span>&#160;<span class="comment"> * Licensed to the Apache Software Foundation (ASF) under one</span></div><div class="line"><a name="l00003"></a><span class="lineno">    3</span>&#160;<span class="comment [...]
+<a href="tir_2schedule_2schedule_8h.html">Go to the documentation of this file.</a><div class="fragment"><div class="line"><a name="l00001"></a><span class="lineno">    1</span>&#160;<span class="comment">/*</span></div><div class="line"><a name="l00002"></a><span class="lineno">    2</span>&#160;<span class="comment"> * Licensed to the Apache Software Foundation (ASF) under one</span></div><div class="line"><a name="l00003"></a><span class="lineno">    3</span>&#160;<span class="comment [...]
 <div class="ttc" id="namespacetvm_1_1script_1_1ir__builder_1_1tir_html_acd41556b0c4088d0f309ef5495aaebe3"><div class="ttname"><a href="namespacetvm_1_1script_1_1ir__builder_1_1tir.html#acd41556b0c4088d0f309ef5495aaebe3">tvm::script::ir_builder::tir::Unroll</a></div><div class="ttdeci">ForFrame Unroll(PrimExpr start, PrimExpr stop, Optional&lt; Map&lt; String, ObjectRef &gt;&gt; annotations=NullOpt)</div><div class="ttdoc">The unrolled For statement. </div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1StmtNode_html"><div class="ttname"><a href="classtvm_1_1tir_1_1StmtNode.html">tvm::tir::StmtNode</a></div><div class="ttdoc">Base node of all statements. </div><div class="ttdef"><b>Definition:</b> stmt.h:38</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1BlockRVNode_html_af90b398c502892d19ff3bdf6463d32ab"><div class="ttname"><a href="classtvm_1_1tir_1_1BlockRVNode.html#af90b398c502892d19ff3bdf6463d32ab">tvm::tir::BlockRVNode::VisitAttrs</a></div><div class="ttdeci">void VisitAttrs(tvm::AttrVisitor *v)</div><div class="ttdef"><b>Definition:</b> schedule.h:53</div></div>
@@ -84,7 +84,7 @@ $(function() {
 <div class="ttc" id="object_8h_html_aaaa3dc5b6dc33f84b2d28f9a81267212"><div class="ttname"><a href="object_8h.html#aaaa3dc5b6dc33f84b2d28f9a81267212">TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS</a></div><div class="ttdeci">#define TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(TypeName, ParentType, ObjectName)</div><div class="ttdef"><b>Definition:</b> object.h:744</div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_a9ae244600a5e56c4adc9faf6d88f931e"><div class="ttname"><a href="namespacetvm_1_1tir.html#a9ae244600a5e56c4adc9faf6d88f931e">tvm::tir::ScheduleErrorRenderLevel</a></div><div class="ttdeci">ScheduleErrorRenderLevel</div><div class="ttdoc">The level of detailed error message rendering. </div><div class="ttdef"><b>Definition:</b> schedule.h:31</div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_a9ae244600a5e56c4adc9faf6d88f931ead6733547bb237ce06cddf96357f1b66b"><div class="ttname"><a href="namespacetvm_1_1tir.html#a9ae244600a5e56c4adc9faf6d88f931ead6733547bb237ce06cddf96357f1b66b">tvm::tir::ScheduleErrorRenderLevel::kDetail</a></div><div class="ttdoc">Render a detailed error message. </div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:694</div></div>
 <div class="ttc" id="index__map_8h_html"><div class="ttname"><a href="index__map_8h.html">index_map.h</a></div><div class="ttdoc">Defines a remapping of buffer indices. </div></div>
 <div class="ttc" id="classtvm_1_1support_1_1LinearCongruentialEngine_html_a4d3a3a94a3f3d2dfab4b5ccb1a7e97de"><div class="ttname"><a href="classtvm_1_1support_1_1LinearCongruentialEngine.html#a4d3a3a94a3f3d2dfab4b5ccb1a7e97de">tvm::support::LinearCongruentialEngine::TRandState</a></div><div class="ttdeci">int64_t TRandState</div><div class="ttdef"><b>Definition:</b> random_engine.h:54</div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
diff --git a/docs/reference/api/doxygen/trace_8h_source.html b/docs/reference/api/doxygen/trace_8h_source.html
index 8ef9a8c605..9293d61593 100644
--- a/docs/reference/api/doxygen/trace_8h_source.html
+++ b/docs/reference/api/doxygen/trace_8h_source.html
@@ -76,7 +76,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1tir_html_a75918aeef1136f9d6308556902d5bcae"><div class="ttname"><a href="namespacetvm_1_1tir.html#a75918aeef1136f9d6308556902d5bcae">tvm::tir::FTraceDecisionProvider</a></div><div class="ttdeci">runtime::TypedPackedFunc&lt; ObjectRef(const Instruction &amp;inst, const Array&lt; ObjectRef &gt; &amp;inputs, const Array&lt; ObjectRef &gt; &amp;attrs, const Optional&lt; ObjectRef &gt; &amp;decision)&gt; FTraceDecisionProvider</div><div class="ttdoc">A cal [...]
 <div class="ttc" id="instruction_8h_html"><div class="ttname"><a href="instruction_8h.html">instruction.h</a></div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
-<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:679</div></div>
+<div class="ttc" id="classtvm_1_1tir_1_1Schedule_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Schedule.html">tvm::tir::Schedule</a></div><div class="ttdoc">Managed reference to ScheduleNode. </div><div class="ttdef"><b>Definition:</b> schedule.h:694</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1TraceNode_html_ad6c859ed32b1e2ae076355eda37df0a2"><div class="ttname"><a href="classtvm_1_1tir_1_1TraceNode.html#ad6c859ed32b1e2ae076355eda37df0a2">tvm::tir::TraceNode::insts</a></div><div class="ttdeci">Array&lt; Instruction &gt; insts</div><div class="ttdoc">The instructions invoked so far in the program execution. </div><div class="ttdef"><b>Definition:</b> trace.h:61</div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1TraceNode_html_a764346045e536fa26b56c9e140de8e7b"><div class="ttname"><a href="classtvm_1_1tir_1_1TraceNode.html#a764346045e536fa26b56c9e140de8e7b">tvm::tir::TraceNode::ApplyToSchedule</a></div><div class="ttdeci">void ApplyToSchedule(Schedule sch, bool remove_postproc, FTraceDecisionProvider decision_provider=nullptr) const</div><div class="ttdoc">Apply the trace to a TensorIR schedule. </div></div>
diff --git a/docs/reference/api/python/auto_scheduler.html b/docs/reference/api/python/auto_scheduler.html
index a334cb4738..956b2e7568 100644
--- a/docs/reference/api/python/auto_scheduler.html
+++ b/docs/reference/api/python/auto_scheduler.html
@@ -1602,7 +1602,7 @@ history states as starting point to perform Evolutionary Search).</p></li>
 
 <dl class="py class">
 <dt class="sig sig-object py" id="tvm.auto_scheduler.SketchPolicy">
-<em class="property"><span class="pre">class</span> </em><span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">SketchPolicy</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">program_cost_model</span></span><span class="o"><span class="pre">=</span></span><span class="defau [...]
+<em class="property"><span class="pre">class</span> </em><span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">SketchPolicy</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">program_cost_model</span></span><span class="o"><span class="pre">=</span></span><span class="defau [...]
 <dd><p>The search policy that searches in a hierarchical search space defined by sketches.
 The policy randomly samples programs from the space defined by sketches and use evolutionary
 search to fine-tune them.</p>
@@ -1886,7 +1886,7 @@ Candidates:
 
 <dl class="py function">
 <dt class="sig sig-object py" id="tvm.auto_scheduler.auto_schedule">
-<span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">auto_schedule</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">search_policy</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em clas [...]
+<span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">auto_schedule</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">search_policy</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em clas [...]
 <dd><p>THIS API IS DEPRECATED.</p>
 <p>Run auto scheduling search for a task.</p>
 <dl class="field-list simple">
diff --git a/docs/reference/api/python/tir.html b/docs/reference/api/python/tir.html
index 8f2c106dea..b103aa7eb5 100644
--- a/docs/reference/api/python/tir.html
+++ b/docs/reference/api/python/tir.html
@@ -2668,8 +2668,9 @@ index map.</p></li>
 <dd class="field-odd"><ul class="simple">
 <li><p><strong>mapping_function</strong> (<em>Callable</em>) – The function to map from source indices to target indices.
 The function should accept <cite>tir.Var</cite> parameters and return
-a list. Each element of the returned list should be a
-<cite>tir.PrimExpr</cite>.</p></li>
+either a <cite>tir.PrimExpr</cite> or a list of <cite>tir.PrimExpr</cite>.
+Returning a <cite>tir.PrimExpr</cite> is equivalent to returning a
+list of length 1 containing that <cite>tir.PrimExpr</cite>.</p></li>
 <li><p><strong>ndim</strong> (<em>Optional</em><em>[</em><a class="reference external" href="https://docs.python.org/3/library/functions.html#int" title="(in Python v3.10)"><em>int</em></a><em>]</em>) – The dimensionality of the buffer to which this
 transformation should be applied.  If mapping_function uses
 variadic argument <cite>*args</cite>, <cite>ndim</cite> must be specified.  If
@@ -2699,9 +2700,12 @@ index map.</p></li>
 <dt class="field-odd">Parameters</dt>
 <dd class="field-odd"><ul class="simple">
 <li><p><strong>mapping_function</strong> (<em>Callable</em>) – The function to map from source indices to target indices.
-The function should accept tir.Var parameters and return a
-list. Each element of the returned list should be either a
-<cite>tir.PrimExpr</cite> or the object <cite>IndexMap.AXIS_SEPARATOR</cite>.</p></li>
+The function should accept tir.Var parameters and return
+either a <cite>tir.PrimExpr</cite> or a list.  Each element of the
+returned list should be either a <cite>tir.PrimExpr</cite> or the
+object <cite>IndexMap.AXIS_SEPARATOR</cite>.  Returning a
+<cite>tir.PrimExpr</cite> is equivalent to returning a list of length
+1 containing that <cite>tir.PrimExpr</cite>.</p></li>
 <li><p><strong>ndim</strong> (<em>Optional</em><em>[</em><a class="reference external" href="https://docs.python.org/3/library/functions.html#int" title="(in Python v3.10)"><em>int</em></a><em>]</em>) – The dimensionality of the buffer to which this
 transformation should be applied.  If mapping_function uses
 variadic argument <cite>*args</cite>, ndim must be specified.  If
@@ -5562,7 +5566,7 @@ preserve the semantics of computation. Some example of schedules:
 <tr class="row-odd"><td><p><a class="reference internal" href="#tvm.tir.Schedule.unannotate" title="tvm.tir.Schedule.unannotate"><code class="xref py py-obj docutils literal notranslate"><span class="pre">unannotate</span></code></a>(block_or_loop, ann_key)</p></td>
 <td><p>Unannotate a block/loop's annotation with key ann_key</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="#tvm.tir.Schedule.transform_layout" title="tvm.tir.Schedule.transform_layout"><code class="xref py py-obj docutils literal notranslate"><span class="pre">transform_layout</span></code></a>(block, buffer, index_map)</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="#tvm.tir.Schedule.transform_layout" title="tvm.tir.Schedule.transform_layout"><code class="xref py py-obj docutils literal notranslate"><span class="pre">transform_layout</span></code></a>(block, buffer, index_map[, ...])</p></td>
 <td><p>Apply a transformation represented by IndexMap to buffer</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="#tvm.tir.Schedule.transform_block_layout" title="tvm.tir.Schedule.transform_block_layout"><code class="xref py py-obj docutils literal notranslate"><span class="pre">transform_block_layout</span></code></a>(block, index_map)</p></td>
@@ -7364,7 +7368,7 @@ block are divisible by the subspace represented by the loops starting at the giv
 
 <dl class="py method">
 <dt class="sig sig-object py" id="tvm.tir.Schedule.transform_layout">
-<span class="sig-name descname"><span class="pre">transform_layout</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">block</span></span><span class="p"><span class="pre">:</span></span> <span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">tvm.tir.schedule.schedule.BlockRV</span><span class="p"><span class="pre">,</span> </span><a class="reference external" href="https://docs.pyt [...]
+<span class="sig-name descname"><span class="pre">transform_layout</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">block</span></span><span class="p"><span class="pre">:</span></span> <span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">tvm.tir.schedule.schedule.BlockRV</span><span class="p"><span class="pre">,</span> </span><a class="reference external" href="https://docs.pyt [...]
 <dd><p>Apply a transformation represented by IndexMap to buffer</p>
 <dl class="field-list simple">
 <dt class="field-odd">Parameters</dt>
@@ -7389,6 +7393,29 @@ contains IndexMap.AXIS_SEPARATOR, the SetAxisSeparators
 primitive will be called in addition to the
 TransformLayout primitive.</p>
 </p></li>
+<li><p><strong>pad_value</strong> (<em>Optional</em><em>[</em><em>Union</em><em>[</em><a class="reference external" href="https://docs.python.org/3/library/functions.html#int" title="(in Python v3.10)"><em>int</em></a><em>, </em><a class="reference external" href="https://docs.python.org/3/library/functions.html#float" title="(in Python v3.10)"><em>float</em></a><em>, </em><a class="reference internal" href="ir.html#tvm.ir.PrimExpr" title="tvm.ir.PrimExpr"><em>PrimExpr</em></a><em>, </em [...]
+transformation.  If the schedule contains a producer block
+for the specified buffer, the pad value will be written as
+part of the producer block if possible, or after the producer
+block otherwise.  Otherwise, if the buffer is an input, an
+annotation block will be inserted to state that the padding
+contains the known value.</p>
+<p>The pad value may not contain instances of BufferLoad,
+except where it loads a value from the buffer being
+transformed (e.g. to create a circular buffer with
+padding that consists of repeated elements).</p>
+<p>Note: If applied to an input buffer, the calling scope is
+responsible for ensuring that the pad_value is present.
+Algebraic simplifications, branch elimination, and other
+optimizations may assume that this precondition is met, and
+may return incorrect results if it is not.</p>
+<p>If None, the transformation may not introduce padding.</p>
+<p>If an int, float or PrimExpr, the pad value is the
+specific value to be present in the padding.</p>
+<p>If an IndexMap or Callable, the pad value is the
+value to be present in the padding in terms of the
+transformed index.</p>
+</p></li>
 </ul>
 </dd>
 </dl>
diff --git a/docs/reference/api/typedoc/classes/bytestreamreader.html b/docs/reference/api/typedoc/classes/bytestreamreader.html
index 3bdc4ecefb..28e9b9d6cc 100644
--- a/docs/reference/api/typedoc/classes/bytestreamreader.html
+++ b/docs/reference/api/typedoc/classes/bytestreamreader.html
@@ -119,7 +119,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -141,7 +141,7 @@
 					<div class="tsd-signature tsd-kind-icon">bytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Uint8Array</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -151,7 +151,7 @@
 					<div class="tsd-signature tsd-kind-icon">offset<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L42">rpc_server.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L42">rpc_server.ts:42</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -168,7 +168,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L63">rpc_server.ts:63</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L63">rpc_server.ts:63</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">Uint8Array</span></h4>
@@ -185,7 +185,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L49">rpc_server.ts:49</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L49">rpc_server.ts:49</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -202,7 +202,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L57">rpc_server.ts:57</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L57">rpc_server.ts:57</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
diff --git a/docs/reference/api/typedoc/classes/cachedcallstack.html b/docs/reference/api/typedoc/classes/cachedcallstack.html
index 82cb229982..b78e900fc2 100644
--- a/docs/reference/api/typedoc/classes/cachedcallstack.html
+++ b/docs/reference/api/typedoc/classes/cachedcallstack.html
@@ -144,7 +144,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L223">memory.ts:223</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L223">memory.ts:223</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -172,7 +172,7 @@
 					<div class="tsd-signature tsd-kind-icon">temp<wbr>Args<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><a href="../interfaces/disposable.html" class="tsd-signature-type">Disposable</a><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = []</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L208">memory.ts:208</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L208">memory.ts:208</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -194,7 +194,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L312">memory.ts:312</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L312">memory.ts:312</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -226,7 +226,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L284">memory.ts:284</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L284">memory.ts:284</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -262,7 +262,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L388">memory.ts:388</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L388">memory.ts:388</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -300,7 +300,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L376">memory.ts:376</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L376">memory.ts:376</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -340,7 +340,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L267">memory.ts:267</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L267">memory.ts:267</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -373,7 +373,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L243">memory.ts:243</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L243">memory.ts:243</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -390,7 +390,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L321">memory.ts:321</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L321">memory.ts:321</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -422,7 +422,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L252">memory.ts:252</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L252">memory.ts:252</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -444,7 +444,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L359">memory.ts:359</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L359">memory.ts:359</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -470,7 +470,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L342">memory.ts:342</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L342">memory.ts:342</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -496,7 +496,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L350">memory.ts:350</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L350">memory.ts:350</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -522,7 +522,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L326">memory.ts:326</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L326">memory.ts:326</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -548,7 +548,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L363">memory.ts:363</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L363">memory.ts:363</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -574,7 +574,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L346">memory.ts:346</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L346">memory.ts:346</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -600,7 +600,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L334">memory.ts:334</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L334">memory.ts:334</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
diff --git a/docs/reference/api/typedoc/classes/dldatatype.html b/docs/reference/api/typedoc/classes/dldatatype.html
index f2da4de391..cab6743132 100644
--- a/docs/reference/api/typedoc/classes/dldatatype.html
+++ b/docs/reference/api/typedoc/classes/dldatatype.html
@@ -119,7 +119,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L262">runtime.ts:262</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L262">runtime.ts:262</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -147,7 +147,7 @@
 					<div class="tsd-signature tsd-kind-icon">bits<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L260">runtime.ts:260</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L260">runtime.ts:260</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">code<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L258">runtime.ts:258</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L258">runtime.ts:258</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -177,7 +177,7 @@
 					<div class="tsd-signature tsd-kind-icon">lanes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L262">runtime.ts:262</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L262">runtime.ts:262</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -199,7 +199,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L279">runtime.ts:279</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L279">runtime.ts:279</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -216,7 +216,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L270">runtime.ts:270</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L270">runtime.ts:270</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">string</span></h4>
diff --git a/docs/reference/api/typedoc/classes/dldevice.html b/docs/reference/api/typedoc/classes/dldevice.html
index d1bb24131f..73b12fd5f1 100644
--- a/docs/reference/api/typedoc/classes/dldevice.html
+++ b/docs/reference/api/typedoc/classes/dldevice.html
@@ -118,7 +118,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L202">runtime.ts:202</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L202">runtime.ts:202</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -146,7 +146,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<wbr>Id<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L200">runtime.ts:200</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L200">runtime.ts:200</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -161,7 +161,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L198">runtime.ts:198</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L198">runtime.ts:198</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -183,7 +183,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L223">runtime.ts:223</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L223">runtime.ts:223</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -205,7 +205,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L230">runtime.ts:230</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L230">runtime.ts:230</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">string</span></h4>
diff --git a/docs/reference/api/typedoc/classes/environment.html b/docs/reference/api/typedoc/classes/environment.html
index 27824e1bd2..b43f04c388 100644
--- a/docs/reference/api/typedoc/classes/environment.html
+++ b/docs/reference/api/typedoc/classes/environment.html
@@ -125,7 +125,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/environment.ts#L86">environment.ts:86</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/environment.ts#L86">environment.ts:86</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -169,7 +169,7 @@
 					<aside class="tsd-sources">
 						<p>Implementation of <a href="../interfaces/libraryprovider.html">LibraryProvider</a>.<a href="../interfaces/libraryprovider.html#imports">imports</a></p>
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/environment.ts#L70">environment.ts:70</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/environment.ts#L70">environment.ts:70</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 					<div class="tsd-signature tsd-kind-icon">logger<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>msg<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/environment.ts#L69">environment.ts:69</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/environment.ts#L69">environment.ts:69</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -210,7 +210,7 @@
 					<div class="tsd-signature tsd-kind-icon">packedCFunc<wbr>Table<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">ctypes.FTVMWasmPackedCFunc</span><span class="tsd-signature-symbol"> | </span><span class="tsd-signature-type">undefined</span><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = [undefined,]</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/environment.ts#L78">environment.ts:78</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/environment.ts#L78">environment.ts:78</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -228,7 +228,7 @@
 					<div class="tsd-signature tsd-kind-icon">packedCFunc<wbr>Table<wbr>Free<wbr>Id<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = []</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/environment.ts#L84">environment.ts:84</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/environment.ts#L84">environment.ts:84</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -250,7 +250,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/environment.ts#L105">environment.ts:105</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/environment.ts#L105">environment.ts:105</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/ffilibrary.html b/docs/reference/api/typedoc/classes/ffilibrary.html
index 4c99934978..4bca6e48c8 100644
--- a/docs/reference/api/typedoc/classes/ffilibrary.html
+++ b/docs/reference/api/typedoc/classes/ffilibrary.html
@@ -131,7 +131,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L49">runtime.ts:49</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L49">runtime.ts:49</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -156,7 +156,7 @@
 					<div class="tsd-signature tsd-kind-icon">exports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">Function</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L46">runtime.ts:46</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L46">runtime.ts:46</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -166,7 +166,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L45">runtime.ts:45</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L45">runtime.ts:45</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">wasm32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">boolean</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L44">runtime.ts:44</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L44">runtime.ts:44</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -186,7 +186,7 @@
 					<div class="tsd-signature tsd-kind-icon">webGPUContext<span class="tsd-signature-symbol">:</span> <a href="webgpucontext.html" class="tsd-signature-type">WebGPUContext</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L47">runtime.ts:47</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L47">runtime.ts:47</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -203,7 +203,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L76">runtime.ts:76</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L76">runtime.ts:76</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -226,7 +226,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L66">runtime.ts:66</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L66">runtime.ts:66</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -243,7 +243,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L84">runtime.ts:84</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L84">runtime.ts:84</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <a href="cachedcallstack.html" class="tsd-signature-type">CachedCallStack</a></h4>
@@ -260,7 +260,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L95">runtime.ts:95</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L95">runtime.ts:95</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -283,7 +283,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L72">runtime.ts:72</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L72">runtime.ts:72</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
diff --git a/docs/reference/api/typedoc/classes/graphexecutor.html b/docs/reference/api/typedoc/classes/graphexecutor.html
index 0b8285edbf..2a5494d404 100644
--- a/docs/reference/api/typedoc/classes/graphexecutor.html
+++ b/docs/reference/api/typedoc/classes/graphexecutor.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L583">runtime.ts:583</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L583">runtime.ts:583</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">module<span class="tsd-signature-symbol">:</span> <a href="module.html" class="tsd-signature-type">Module</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L579">runtime.ts:579</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L579">runtime.ts:579</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L654">runtime.ts:654</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L654">runtime.ts:654</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -224,7 +224,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L597">runtime.ts:597</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L597">runtime.ts:597</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -241,7 +241,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L631">runtime.ts:631</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L631">runtime.ts:631</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -279,7 +279,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L644">runtime.ts:644</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L644">runtime.ts:644</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -310,7 +310,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L621">runtime.ts:621</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L621">runtime.ts:621</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -332,7 +332,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L609">runtime.ts:609</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L609">runtime.ts:609</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/instance.html b/docs/reference/api/typedoc/classes/instance.html
index 5d2ee5b5d7..81f486575e 100644
--- a/docs/reference/api/typedoc/classes/instance.html
+++ b/docs/reference/api/typedoc/classes/instance.html
@@ -139,7 +139,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L692">runtime.ts:692</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L692">runtime.ts:692</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -202,7 +202,7 @@
 					<div class="tsd-signature tsd-kind-icon">exports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">Function</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L684">runtime.ts:684</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L684">runtime.ts:684</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -212,7 +212,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L683">runtime.ts:683</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L683">runtime.ts:683</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -229,7 +229,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L932">runtime.ts:932</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L932">runtime.ts:932</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -260,7 +260,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L994">runtime.ts:994</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L994">runtime.ts:994</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -303,7 +303,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L924">runtime.ts:924</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L924">runtime.ts:924</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -341,7 +341,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L732">runtime.ts:732</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L732">runtime.ts:732</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -358,7 +358,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L952">runtime.ts:952</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L952">runtime.ts:952</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -402,7 +402,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L816">runtime.ts:816</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L816">runtime.ts:816</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -434,7 +434,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L1033">runtime.ts:1033</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L1033">runtime.ts:1033</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -465,7 +465,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L846">runtime.ts:846</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L846">runtime.ts:846</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -497,7 +497,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L750">runtime.ts:750</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L750">runtime.ts:750</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -520,7 +520,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L1013">runtime.ts:1013</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L1013">runtime.ts:1013</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -568,7 +568,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L789">runtime.ts:789</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L789">runtime.ts:789</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -608,7 +608,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L914">runtime.ts:914</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L914">runtime.ts:914</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -646,7 +646,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L1145">runtime.ts:1145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L1145">runtime.ts:1145</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -698,7 +698,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L740">runtime.ts:740</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L740">runtime.ts:740</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -722,7 +722,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L868">runtime.ts:868</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L868">runtime.ts:868</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -754,7 +754,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L857">runtime.ts:857</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L857">runtime.ts:857</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -786,7 +786,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L940">runtime.ts:940</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L940">runtime.ts:940</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/memory.html b/docs/reference/api/typedoc/classes/memory.html
index 7f24029ea0..9498e6fa34 100644
--- a/docs/reference/api/typedoc/classes/memory.html
+++ b/docs/reference/api/typedoc/classes/memory.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L40">memory.ts:40</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L40">memory.ts:40</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -152,7 +152,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Memory</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L32">memory.ts:32</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L32">memory.ts:32</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">wasm32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">boolean</span><span class="tsd-signature-symbol"> = true</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L33">memory.ts:33</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L33">memory.ts:33</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L154">memory.ts:154</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L154">memory.ts:154</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -210,7 +210,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L90">memory.ts:90</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L90">memory.ts:90</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -233,7 +233,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L97">memory.ts:97</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L97">memory.ts:97</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -256,7 +256,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L74">memory.ts:74</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L74">memory.ts:74</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -279,7 +279,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L81">memory.ts:81</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L81">memory.ts:81</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -302,7 +302,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L104">memory.ts:104</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L104">memory.ts:104</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -325,7 +325,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L132">memory.ts:132</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L132">memory.ts:132</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -362,7 +362,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L145">memory.ts:145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L145">memory.ts:145</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -393,7 +393,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L60">memory.ts:60</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L60">memory.ts:60</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -416,7 +416,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L67">memory.ts:67</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L67">memory.ts:67</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -439,7 +439,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L53">memory.ts:53</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L53">memory.ts:53</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -462,7 +462,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L114">memory.ts:114</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L114">memory.ts:114</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -485,7 +485,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L124">memory.ts:124</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L124">memory.ts:124</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -502,7 +502,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/memory.ts#L175">memory.ts:175</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/memory.ts#L175">memory.ts:175</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/module.html b/docs/reference/api/typedoc/classes/module.html
index 0364d6d76b..ccb299c65b 100644
--- a/docs/reference/api/typedoc/classes/module.html
+++ b/docs/reference/api/typedoc/classes/module.html
@@ -124,7 +124,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L504">runtime.ts:504</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L504">runtime.ts:504</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -170,7 +170,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L502">runtime.ts:502</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L502">runtime.ts:502</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -187,7 +187,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L516">runtime.ts:516</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L516">runtime.ts:516</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -204,7 +204,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L530">runtime.ts:530</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L530">runtime.ts:530</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -236,7 +236,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L561">runtime.ts:561</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L561">runtime.ts:561</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/ndarray.html b/docs/reference/api/typedoc/classes/ndarray.html
index d834550efa..e21f9ef313 100644
--- a/docs/reference/api/typedoc/classes/ndarray.html
+++ b/docs/reference/api/typedoc/classes/ndarray.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L304">runtime.ts:304</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L304">runtime.ts:304</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -158,7 +158,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<span class="tsd-signature-symbol">:</span> <a href="dldevice.html" class="tsd-signature-type">DLDevice</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L297">runtime.ts:297</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L297">runtime.ts:297</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -173,7 +173,7 @@
 					<div class="tsd-signature tsd-kind-icon">dtype<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L293">runtime.ts:293</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L293">runtime.ts:293</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -188,7 +188,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L289">runtime.ts:289</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L289">runtime.ts:289</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -203,7 +203,7 @@
 					<div class="tsd-signature tsd-kind-icon">ndim<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L291">runtime.ts:291</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L291">runtime.ts:291</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -218,7 +218,7 @@
 					<div class="tsd-signature tsd-kind-icon">shape<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L295">runtime.ts:295</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L295">runtime.ts:295</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -240,7 +240,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L370">runtime.ts:370</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L370">runtime.ts:370</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -273,7 +273,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L414">runtime.ts:414</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L414">runtime.ts:414</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -305,7 +305,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L355">runtime.ts:355</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L355">runtime.ts:355</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -322,7 +322,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L474">runtime.ts:474</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L474">runtime.ts:474</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -346,7 +346,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L443">runtime.ts:443</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L443">runtime.ts:443</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/packedfunccell.html b/docs/reference/api/typedoc/classes/packedfunccell.html
index b4bcd483f1..c91b49fb9e 100644
--- a/docs/reference/api/typedoc/classes/packedfunccell.html
+++ b/docs/reference/api/typedoc/classes/packedfunccell.html
@@ -122,7 +122,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L158">runtime.ts:158</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L158">runtime.ts:158</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -147,7 +147,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L157">runtime.ts:157</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L157">runtime.ts:157</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -164,7 +164,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L165">runtime.ts:165</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L165">runtime.ts:165</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
diff --git a/docs/reference/api/typedoc/classes/rpcserver.html b/docs/reference/api/typedoc/classes/rpcserver.html
index 0b31f9ca0b..68ed0216c3 100644
--- a/docs/reference/api/typedoc/classes/rpcserver.html
+++ b/docs/reference/api/typedoc/classes/rpcserver.html
@@ -115,7 +115,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L92">rpc_server.ts:92</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L92">rpc_server.ts:92</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">get<wbr>Imports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">unknown</span><span class="tsd-signat [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L82">rpc_server.ts:82</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L82">rpc_server.ts:82</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -201,7 +201,7 @@
 					<div class="tsd-signature tsd-kind-icon">key<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L78">rpc_server.ts:78</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L78">rpc_server.ts:78</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -211,7 +211,7 @@
 					<div class="tsd-signature tsd-kind-icon">logger<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>msg<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L81">rpc_server.ts:81</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L81">rpc_server.ts:81</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -242,7 +242,7 @@
 					<div class="tsd-signature tsd-kind-icon">socket<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">WebSocket</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L79">rpc_server.ts:79</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L79">rpc_server.ts:79</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -252,7 +252,7 @@
 					<div class="tsd-signature tsd-kind-icon">state<span class="tsd-signature-symbol">:</span> <a href="../enums/rpcserverstate.html" class="tsd-signature-type">RPCServerState</a><span class="tsd-signature-symbol"> = RPCServerState.InitHeader</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L80">rpc_server.ts:80</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L80">rpc_server.ts:80</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -262,7 +262,7 @@
 					<div class="tsd-signature tsd-kind-icon">url<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L77">rpc_server.ts:77</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L77">rpc_server.ts:77</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/classes/scalar.html b/docs/reference/api/typedoc/classes/scalar.html
index c63944ee7e..3a44a38ed2 100644
--- a/docs/reference/api/typedoc/classes/scalar.html
+++ b/docs/reference/api/typedoc/classes/scalar.html
@@ -112,7 +112,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L145">runtime.ts:145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L145">runtime.ts:145</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -137,7 +137,7 @@
 					<div class="tsd-signature tsd-kind-icon">dtype<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L145">runtime.ts:145</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L145">runtime.ts:145</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -152,7 +152,7 @@
 					<div class="tsd-signature tsd-kind-icon">value<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L143">runtime.ts:143</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L143">runtime.ts:143</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/webgpucontext.html b/docs/reference/api/typedoc/classes/webgpucontext.html
index f65580f146..b074ea0bd7 100644
--- a/docs/reference/api/typedoc/classes/webgpucontext.html
+++ b/docs/reference/api/typedoc/classes/webgpucontext.html
@@ -120,7 +120,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L57">webgpu.ts:57</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L57">webgpu.ts:57</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -145,7 +145,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">GPUDevice</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L50">webgpu.ts:50</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L50">webgpu.ts:50</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -155,7 +155,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L51">webgpu.ts:51</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L51">webgpu.ts:51</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -172,7 +172,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L84">webgpu.ts:84</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L84">webgpu.ts:84</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -209,7 +209,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L170">webgpu.ts:170</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L170">webgpu.ts:170</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -238,7 +238,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L67">webgpu.ts:67</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L67">webgpu.ts:67</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/enums/argtypecode.html b/docs/reference/api/typedoc/enums/argtypecode.html
index b380abf6fd..ae0035a3ac 100644
--- a/docs/reference/api/typedoc/enums/argtypecode.html
+++ b/docs/reference/api/typedoc/enums/argtypecode.html
@@ -106,7 +106,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLDevice<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 6</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L220">ctypes.ts:220</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L220">ctypes.ts:220</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -116,7 +116,7 @@
 					<div class="tsd-signature tsd-kind-icon">Float<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L216">ctypes.ts:216</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L216">ctypes.ts:216</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -126,7 +126,7 @@
 					<div class="tsd-signature tsd-kind-icon">Int<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L214">ctypes.ts:214</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L214">ctypes.ts:214</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -136,7 +136,7 @@
 					<div class="tsd-signature tsd-kind-icon">Null<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L218">ctypes.ts:218</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L218">ctypes.ts:218</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -146,7 +146,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMBytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 12</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L226">ctypes.ts:226</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L226">ctypes.ts:226</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -156,7 +156,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMDLTensor<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 7</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L221">ctypes.ts:221</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L221">ctypes.ts:221</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -166,7 +166,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMData<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 5</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L219">ctypes.ts:219</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L219">ctypes.ts:219</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMModule<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 9</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L223">ctypes.ts:223</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L223">ctypes.ts:223</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -186,7 +186,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMNDArray<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 13</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L227">ctypes.ts:227</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L227">ctypes.ts:227</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -196,7 +196,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMObject<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L222">ctypes.ts:222</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L222">ctypes.ts:222</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -206,7 +206,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMObjectRValue<wbr>Ref<wbr>Arg<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 14</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L228">ctypes.ts:228</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L228">ctypes.ts:228</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -216,7 +216,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMOpaque<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 3</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L217">ctypes.ts:217</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L217">ctypes.ts:217</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -226,7 +226,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMPacked<wbr>Func<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 10</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L224">ctypes.ts:224</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L224">ctypes.ts:224</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -236,7 +236,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 11</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L225">ctypes.ts:225</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L225">ctypes.ts:225</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -246,7 +246,7 @@
 					<div class="tsd-signature tsd-kind-icon">UInt<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L215">ctypes.ts:215</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L215">ctypes.ts:215</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/aynccallbackcode.html b/docs/reference/api/typedoc/enums/aynccallbackcode.html
index 3bc1d8d685..59b662efe0 100644
--- a/docs/reference/api/typedoc/enums/aynccallbackcode.html
+++ b/docs/reference/api/typedoc/enums/aynccallbackcode.html
@@ -93,7 +93,7 @@
 					<div class="tsd-signature tsd-kind-icon">k<wbr>Exception<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 5</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L676">runtime.ts:676</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L676">runtime.ts:676</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -103,7 +103,7 @@
 					<div class="tsd-signature tsd-kind-icon">k<wbr>Return<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L675">runtime.ts:675</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L675">runtime.ts:675</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/dldatatypecode.html b/docs/reference/api/typedoc/enums/dldatatypecode.html
index a962ecb6ad..127aa9f597 100644
--- a/docs/reference/api/typedoc/enums/dldatatypecode.html
+++ b/docs/reference/api/typedoc/enums/dldatatypecode.html
@@ -95,7 +95,7 @@
 					<div class="tsd-signature tsd-kind-icon">Float<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L242">runtime.ts:242</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L242">runtime.ts:242</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -105,7 +105,7 @@
 					<div class="tsd-signature tsd-kind-icon">Int<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L240">runtime.ts:240</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L240">runtime.ts:240</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -115,7 +115,7 @@
 					<div class="tsd-signature tsd-kind-icon">Opaque<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 3</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L243">runtime.ts:243</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L243">runtime.ts:243</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -125,7 +125,7 @@
 					<div class="tsd-signature tsd-kind-icon">UInt<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L241">runtime.ts:241</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L241">runtime.ts:241</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/rpcserverstate.html b/docs/reference/api/typedoc/enums/rpcserverstate.html
index 9f93170839..02964b028c 100644
--- a/docs/reference/api/typedoc/enums/rpcserverstate.html
+++ b/docs/reference/api/typedoc/enums/rpcserverstate.html
@@ -90,7 +90,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Header<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L27">rpc_server.ts:27</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L27">rpc_server.ts:27</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -100,7 +100,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Header<wbr>Key<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L28">rpc_server.ts:28</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L28">rpc_server.ts:28</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -110,7 +110,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Server<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L29">rpc_server.ts:29</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L29">rpc_server.ts:29</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -120,7 +120,7 @@
 					<div class="tsd-signature tsd-kind-icon">Receive<wbr>Packet<wbr>Body<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L32">rpc_server.ts:32</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L32">rpc_server.ts:32</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -130,7 +130,7 @@
 					<div class="tsd-signature tsd-kind-icon">Receive<wbr>Packet<wbr>Header<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L31">rpc_server.ts:31</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L31">rpc_server.ts:31</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -140,7 +140,7 @@
 					<div class="tsd-signature tsd-kind-icon">Wait<wbr>For<wbr>Callback<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L30">rpc_server.ts:30</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L30">rpc_server.ts:30</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/sizeof.html b/docs/reference/api/typedoc/enums/sizeof.html
index e49af8bc5d..e9db3ee507 100644
--- a/docs/reference/api/typedoc/enums/sizeof.html
+++ b/docs/reference/api/typedoc/enums/sizeof.html
@@ -100,7 +100,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLData<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = I32</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L206">ctypes.ts:206</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L206">ctypes.ts:206</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -110,7 +110,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLDevice<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = I32 + I32</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L207">ctypes.ts:207</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L207">ctypes.ts:207</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -120,7 +120,7 @@
 					<div class="tsd-signature tsd-kind-icon">F32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L203">ctypes.ts:203</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L203">ctypes.ts:203</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -130,7 +130,7 @@
 					<div class="tsd-signature tsd-kind-icon">F64<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L204">ctypes.ts:204</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L204">ctypes.ts:204</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -140,7 +140,7 @@
 					<div class="tsd-signature tsd-kind-icon">I32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L201">ctypes.ts:201</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L201">ctypes.ts:201</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -150,7 +150,7 @@
 					<div class="tsd-signature tsd-kind-icon">I64<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L202">ctypes.ts:202</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L202">ctypes.ts:202</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -160,7 +160,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMValue<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L205">ctypes.ts:205</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L205">ctypes.ts:205</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -170,7 +170,7 @@
 					<div class="tsd-signature tsd-kind-icon">U16<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L200">ctypes.ts:200</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L200">ctypes.ts:200</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -180,7 +180,7 @@
 					<div class="tsd-signature tsd-kind-icon">U8<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L199">ctypes.ts:199</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L199">ctypes.ts:199</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/index.html b/docs/reference/api/typedoc/index.html
index a53d323ba4..d367d831b5 100644
--- a/docs/reference/api/typedoc/index.html
+++ b/docs/reference/api/typedoc/index.html
@@ -174,7 +174,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Alloc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>shape<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, ndim<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, dtypeCode<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, dtypeBits<span class="tsd [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L112">ctypes.ts:112</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L112">ctypes.ts:112</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -238,7 +238,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>From<wbr>Bytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, data<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nbytes<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">num [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L128">ctypes.ts:128</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L128">ctypes.ts:128</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -282,7 +282,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>From<wbr>To<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>from<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, to<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, stream<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-sig [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L144">ctypes.ts:144</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L144">ctypes.ts:144</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -326,7 +326,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>ToBytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, data<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nbytes<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</sp [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L136">ctypes.ts:136</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L136">ctypes.ts:136</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -370,7 +370,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L121">ctypes.ts:121</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L121">ctypes.ts:121</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -406,7 +406,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMBackend<wbr>PackedCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>argValues<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, argCodes<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nargs<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number< [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L160">ctypes.ts:160</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L160">ctypes.ts:160</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -458,7 +458,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMCFunc<wbr>Set<wbr>Return<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>ret<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, value<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCode<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signa [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L77">ctypes.ts:77</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L77">ctypes.ts:77</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -506,7 +506,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMCb<wbr>Arg<wbr>ToReturn<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>value<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, code<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span c [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L83">ctypes.ts:83</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L83">ctypes.ts:83</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -545,7 +545,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Call<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>func<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, argValues<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCode<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-t [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L67">ctypes.ts:67</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L67">ctypes.ts:67</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -601,7 +601,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>func<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L57">ctypes.ts:57</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L57">ctypes.ts:57</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -637,7 +637,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Get<wbr>Global<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>name<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, out<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span cla [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L100">ctypes.ts:100</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L100">ctypes.ts:100</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -676,7 +676,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>List<wbr>Global<wbr>Names<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>outSize<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, outArray<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&g [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L88">ctypes.ts:88</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L88">ctypes.ts:88</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -715,7 +715,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Register<wbr>Global<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>name<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, f<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, override<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</spa [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L94">ctypes.ts:94</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L94">ctypes.ts:94</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -758,7 +758,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMGet<wbr>Last<wbr>Error<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L34">ctypes.ts:34</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L34">ctypes.ts:34</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -788,7 +788,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L52">ctypes.ts:52</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L52">ctypes.ts:52</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -824,7 +824,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Get<wbr>Function<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, funcName<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, queryImports<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">numbe [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L42">ctypes.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L42">ctypes.ts:42</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -872,7 +872,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Import<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, dep<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-si [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L48">ctypes.ts:48</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L48">ctypes.ts:48</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -912,7 +912,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMSynchronize<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>deviceType<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, deviceId<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, stream<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signatur [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L150">ctypes.ts:150</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L150">ctypes.ts:150</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -954,7 +954,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Alloc<wbr>Space<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>size<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L167">ctypes.ts:167</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L167">ctypes.ts:167</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -990,7 +990,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Free<wbr>Space<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>ptr<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L170">ctypes.ts:170</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L170">ctypes.ts:170</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1026,7 +1026,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Func<wbr>Create<wbr>FromCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>resource<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, out<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&g [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L187">ctypes.ts:187</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L187">ctypes.ts:187</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1066,7 +1066,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>PackedCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>args<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCodes<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nargs<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L179">ctypes.ts:179</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L179">ctypes.ts:179</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1118,7 +1118,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>PackedCFunc<wbr>Finalizer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>resourceHandle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L193">ctypes.ts:193</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L193">ctypes.ts:193</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1154,7 +1154,7 @@
 					<div class="tsd-signature tsd-kind-icon">GPUPointer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L25">webgpu.ts:25</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L25">webgpu.ts:25</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1169,7 +1169,7 @@
 					<div class="tsd-signature tsd-kind-icon">Packed<wbr>Func<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">...</span>args<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol"> &amp; </span><a href="interfaces/disp [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L36">runtime.ts:36</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L36">runtime.ts:36</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1184,7 +1184,7 @@
 					<div class="tsd-signature tsd-kind-icon">Pointer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L25">ctypes.ts:25</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L25">ctypes.ts:25</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1199,7 +1199,7 @@
 					<div class="tsd-signature tsd-kind-icon">Ptr<wbr>Offset<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/ctypes.ts#L28">ctypes.ts:28</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/ctypes.ts#L28">ctypes.ts:28</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1217,7 +1217,7 @@
 					<div class="tsd-signature tsd-kind-icon">RPC_<wbr>MAGIC<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">1045105</span><span class="tsd-signature-symbol"> = 1045105</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/rpc_server.ts#L36">rpc_server.ts:36</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/rpc_server.ts#L36">rpc_server.ts:36</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1239,7 +1239,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/support.ts#L25">support.ts:25</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/support.ts#L25">support.ts:25</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1271,7 +1271,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/support.ts#L39">support.ts:39</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/support.ts#L39">support.ts:39</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1300,7 +1300,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/support.ts#L52">support.ts:52</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/support.ts#L52">support.ts:52</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1337,7 +1337,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/compact.ts#L38">compact.ts:38</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/compact.ts#L38">compact.ts:38</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1368,7 +1368,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L30">webgpu.ts:30</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L30">webgpu.ts:30</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1390,7 +1390,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/environment.ts#L32">environment.ts:32</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/environment.ts#L32">environment.ts:32</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1421,7 +1421,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/compact.ts#L24">compact.ts:24</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/compact.ts#L24">compact.ts:24</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1443,7 +1443,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L1367">runtime.ts:1367</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L1367">runtime.ts:1367</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1508,7 +1508,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/support.ts#L62">support.ts:62</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/support.ts#L62">support.ts:62</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1530,7 +1530,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLData<wbr>Type<wbr>Code<wbr>ToStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L246">runtime.ts:246</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L246">runtime.ts:246</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1539,7 +1539,7 @@
 						<div class="tsd-signature tsd-kind-icon">0<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;int&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L247">runtime.ts:247</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L247">runtime.ts:247</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1549,7 +1549,7 @@
 						<div class="tsd-signature tsd-kind-icon">1<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;uint&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L248">runtime.ts:248</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L248">runtime.ts:248</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1559,7 +1559,7 @@
 						<div class="tsd-signature tsd-kind-icon">2<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;float&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L249">runtime.ts:249</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L249">runtime.ts:249</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1569,7 +1569,7 @@
 						<div class="tsd-signature tsd-kind-icon">3<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;handle&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L250">runtime.ts:250</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L250">runtime.ts:250</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1580,7 +1580,7 @@
 					<div class="tsd-signature tsd-kind-icon">Device<wbr>Enum<wbr>ToStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L175">runtime.ts:175</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L175">runtime.ts:175</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1589,7 +1589,7 @@
 						<div class="tsd-signature tsd-kind-icon">1<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;cpu&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L176">runtime.ts:176</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L176">runtime.ts:176</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1599,7 +1599,7 @@
 						<div class="tsd-signature tsd-kind-icon">15<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;webgpu&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L180">runtime.ts:180</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L180">runtime.ts:180</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1609,7 +1609,7 @@
 						<div class="tsd-signature tsd-kind-icon">2<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;cuda&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L177">runtime.ts:177</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L177">runtime.ts:177</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1619,7 +1619,7 @@
 						<div class="tsd-signature tsd-kind-icon">4<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;opencl&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L178">runtime.ts:178</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L178">runtime.ts:178</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1629,7 +1629,7 @@
 						<div class="tsd-signature tsd-kind-icon">8<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;metal&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L179">runtime.ts:179</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L179">runtime.ts:179</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1640,7 +1640,7 @@
 					<div class="tsd-signature tsd-kind-icon">Device<wbr>Str<wbr>ToEnum<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L183">runtime.ts:183</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L183">runtime.ts:183</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1649,7 +1649,7 @@
 						<div class="tsd-signature tsd-kind-icon">cl<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 4</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L186">runtime.ts:186</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L186">runtime.ts:186</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1659,7 +1659,7 @@
 						<div class="tsd-signature tsd-kind-icon">cpu<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 1</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L184">runtime.ts:184</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L184">runtime.ts:184</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1669,7 +1669,7 @@
 						<div class="tsd-signature tsd-kind-icon">cuda<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 2</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L185">runtime.ts:185</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L185">runtime.ts:185</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1679,7 +1679,7 @@
 						<div class="tsd-signature tsd-kind-icon">metal<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 8</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L189">runtime.ts:189</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L189">runtime.ts:189</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1689,7 +1689,7 @@
 						<div class="tsd-signature tsd-kind-icon">opencl<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 4</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L187">runtime.ts:187</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L187">runtime.ts:187</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1699,7 +1699,7 @@
 						<div class="tsd-signature tsd-kind-icon">vulkan<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 7</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L188">runtime.ts:188</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L188">runtime.ts:188</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1709,7 +1709,7 @@
 						<div class="tsd-signature tsd-kind-icon">webgpu<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 15</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/runtime.ts#L190">runtime.ts:190</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/runtime.ts#L190">runtime.ts:190</a></li>
 							</ul>
 						</aside>
 					</section>
diff --git a/docs/reference/api/typedoc/interfaces/disposable.html b/docs/reference/api/typedoc/interfaces/disposable.html
index 2a9cbbbac7..bdf6e57be2 100644
--- a/docs/reference/api/typedoc/interfaces/disposable.html
+++ b/docs/reference/api/typedoc/interfaces/disposable.html
@@ -113,7 +113,7 @@
 					<div class="tsd-signature tsd-kind-icon">dispose<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/types.ts#L52">types.ts:52</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/types.ts#L52">types.ts:52</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/interfaces/functioninfo.html b/docs/reference/api/typedoc/interfaces/functioninfo.html
index e4522e9825..91e7c64bf1 100644
--- a/docs/reference/api/typedoc/interfaces/functioninfo.html
+++ b/docs/reference/api/typedoc/interfaces/functioninfo.html
@@ -95,7 +95,7 @@
 					<div class="tsd-signature tsd-kind-icon">arg_<wbr>types<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L41">webgpu.ts:41</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L41">webgpu.ts:41</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -105,7 +105,7 @@
 					<div class="tsd-signature tsd-kind-icon">launch_<wbr>param_<wbr>tags<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L42">webgpu.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L42">webgpu.ts:42</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -115,7 +115,7 @@
 					<div class="tsd-signature tsd-kind-icon">name<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/webgpu.ts#L40">webgpu.ts:40</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/webgpu.ts#L40">webgpu.ts:40</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/interfaces/libraryprovider.html b/docs/reference/api/typedoc/interfaces/libraryprovider.html
index ff5a64f312..7c1deb6bde 100644
--- a/docs/reference/api/typedoc/interfaces/libraryprovider.html
+++ b/docs/reference/api/typedoc/interfaces/libraryprovider.html
@@ -112,7 +112,7 @@
 					<div class="tsd-signature tsd-kind-icon">imports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/types.ts#L34">types.ts:34</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/types.ts#L34">types.ts:34</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -127,7 +127,7 @@
 					<div class="tsd-signature tsd-kind-icon">start<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>inst<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">Instance</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/60cf692a6/web/src/types.ts#L39">types.ts:39</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/2af9b90ec/web/src/types.ts#L39">types.ts:39</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/searchindex.js b/docs/searchindex.js
index 3541262e80..189b590b6e 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({docnames:["arch/benchmark","arch/convert_layout","arch/debugger","arch/device_target_interactions","arch/frontend/tensorflow","arch/hybrid_script","arch/index","arch/inferbound","arch/introduction_to_module_serialization","arch/microtvm_design","arch/microtvm_project_api","arch/model_library_format","arch/pass_infra","arch/relay_intro","arch/relay_op_strategy","arch/runtime","arch/runtimes/vulkan","arch/security","arch/virtual_machine","contribute/ci","contribute/code_gu [...]
\ No newline at end of file
+Search.setIndex({docnames:["arch/benchmark","arch/convert_layout","arch/debugger","arch/device_target_interactions","arch/frontend/tensorflow","arch/hybrid_script","arch/index","arch/inferbound","arch/introduction_to_module_serialization","arch/microtvm_design","arch/microtvm_project_api","arch/model_library_format","arch/pass_infra","arch/relay_intro","arch/relay_op_strategy","arch/runtime","arch/runtimes/vulkan","arch/security","arch/virtual_machine","contribute/ci","contribute/code_gu [...]
\ No newline at end of file
diff --git a/docs/topic/vta/tutorials/autotvm/sg_execution_times.html b/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
index 2fc69e3f85..4c4ae42b99 100644
--- a/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-autotvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:21.281</strong> total execution time for <strong>topic_vta_tutorials_autotvm</strong> files:</p>
+<p><strong>00:21.146</strong> total execution time for <strong>topic_vta_tutorials_autotvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 82%" />
@@ -336,7 +336,7 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_relay_vta.html#sphx-glr-topic-vta-tutorials-autotvm-tune-relay-vta-py"><span class="std std-ref">Auto-tuning a convolutional network on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_vta.py</span></code>)</p></td>
-<td><p>00:21.275</p></td>
+<td><p>00:21.140</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_alu_vta.html#sphx-glr-topic-vta-tutorials-autotvm-tune-alu-vta-py"><span class="std std-ref">Auto-tuning a ALU fused op on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_alu_vta.py</span></code>)</p></td>
diff --git a/docs/topic/vta/tutorials/frontend/deploy_classification.html b/docs/topic/vta/tutorials/frontend/deploy_classification.html
index 68e0715c52..220528aff5 100644
--- a/docs/topic/vta/tutorials/frontend/deploy_classification.html
+++ b/docs/topic/vta/tutorials/frontend/deploy_classification.html
@@ -569,7 +569,7 @@ and dense layer which will both be executed in fp32 on the CPU.</p></li>
   DeprecationWarning,
 /workspace/vta/tutorials/frontend/deploy_classification.py:213: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
   relay_prog, target=tvm.target.Target(target, host=env.target_host), params=params
-resnet18_v1 inference graph built in 22.90s!
+resnet18_v1 inference graph built in 22.51s!
 </pre></div>
 </div>
 </div>
diff --git a/docs/topic/vta/tutorials/frontend/deploy_detection.html b/docs/topic/vta/tutorials/frontend/deploy_detection.html
index 3b2bad3799..041d9aee73 100644
--- a/docs/topic/vta/tutorials/frontend/deploy_detection.html
+++ b/docs/topic/vta/tutorials/frontend/deploy_detection.html
@@ -587,7 +587,7 @@ and dense layer which will both be executed in fp32 on the CPU.</p></li>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/relay/build_module.py:348: DeprecationWarning: Please use input parameter mod (tvm.IRModule) instead of deprecated parameter mod (tvm.relay.function.Function)
   DeprecationWarning,
-yolov3-tiny inference graph built in 15.89s!
+yolov3-tiny inference graph built in 15.99s!
 </pre></div>
 </div>
 </div>
diff --git a/docs/topic/vta/tutorials/frontend/sg_execution_times.html b/docs/topic/vta/tutorials/frontend/sg_execution_times.html
index 77e62479e7..ae1bc9513f 100644
--- a/docs/topic/vta/tutorials/frontend/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/frontend/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-frontend-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>01:32.270</strong> total execution time for <strong>topic_vta_tutorials_frontend</strong> files:</p>
+<p><strong>01:31.138</strong> total execution time for <strong>topic_vta_tutorials_frontend</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_detection.html#sphx-glr-topic-vta-tutorials-frontend-deploy-detection-py"><span class="std std-ref">Deploy Pretrained Vision Detection Model from Darknet on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_detection.py</span></code>)</p></td>
-<td><p>00:48.924</p></td>
+<td><p>00:48.460</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_classification.html#sphx-glr-topic-vta-tutorials-frontend-deploy-classification-py"><span class="std std-ref">Deploy Pretrained Vision Model from MxNet on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_classification.py</span></code>)</p></td>
-<td><p>00:43.346</p></td>
+<td><p>00:42.678</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/topic/vta/tutorials/optimize/sg_execution_times.html b/docs/topic/vta/tutorials/optimize/sg_execution_times.html
index 0b9c4df465..377b289503 100644
--- a/docs/topic/vta/tutorials/optimize/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/optimize/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-optimize-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:03.030</strong> total execution time for <strong>topic_vta_tutorials_optimize</strong> files:</p>
+<p><strong>00:03.009</strong> total execution time for <strong>topic_vta_tutorials_optimize</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="convolution_opt.html#sphx-glr-topic-vta-tutorials-optimize-convolution-opt-py"><span class="std std-ref">2D Convolution Optimization</span></a> (<code class="docutils literal notranslate"><span class="pre">convolution_opt.py</span></code>)</p></td>
-<td><p>00:02.621</p></td>
+<td><p>00:02.613</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="matrix_multiply_opt.html#sphx-glr-topic-vta-tutorials-optimize-matrix-multiply-opt-py"><span class="std std-ref">Matrix Multiply Blocking</span></a> (<code class="docutils literal notranslate"><span class="pre">matrix_multiply_opt.py</span></code>)</p></td>
-<td><p>00:00.409</p></td>
+<td><p>00:00.396</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/topic/vta/tutorials/sg_execution_times.html b/docs/topic/vta/tutorials/sg_execution_times.html
index e1549685e4..ef4fba400e 100644
--- a/docs/topic/vta/tutorials/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:00.771</strong> total execution time for <strong>topic_vta_tutorials</strong> files:</p>
+<p><strong>00:00.736</strong> total execution time for <strong>topic_vta_tutorials</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 81%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="matrix_multiply.html#sphx-glr-topic-vta-tutorials-matrix-multiply-py"><span class="std std-ref">Simple Matrix Multiply</span></a> (<code class="docutils literal notranslate"><span class="pre">matrix_multiply.py</span></code>)</p></td>
-<td><p>00:00.413</p></td>
+<td><p>00:00.394</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="vta_get_started.html#sphx-glr-topic-vta-tutorials-vta-get-started-py"><span class="std std-ref">Get Started with VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">vta_get_started.py</span></code>)</p></td>
-<td><p>00:00.359</p></td>
+<td><p>00:00.341</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/tutorial/auto_scheduler_matmul_x86.html b/docs/tutorial/auto_scheduler_matmul_x86.html
index d9df3e106c..6044115edb 100644
--- a/docs/tutorial/auto_scheduler_matmul_x86.html
+++ b/docs/tutorial/auto_scheduler_matmul_x86.html
@@ -565,7 +565,7 @@ operator fusion.</p>
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 93.587 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 93.459 ms
 </pre></div>
 </div>
 </div>
@@ -639,7 +639,6 @@ automatically optimize a matrix multiplication, without the need to specify a
 search template.  It ends a series of examples that starts from the Tensor
 Expression (TE) language that demonstrates how TVM can optimize computational
 operations.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  3.483 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-auto-scheduler-matmul-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/eac4389b114db015e95cb3cdf8b86b83/auto_scheduler_matmul_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">auto_scheduler_matmul_x86.py</span></code></a></p>
diff --git a/docs/tutorial/autotvm_matmul_x86.html b/docs/tutorial/autotvm_matmul_x86.html
index 049cbed317..3ff4ebcd2a 100644
--- a/docs/tutorial/autotvm_matmul_x86.html
+++ b/docs/tutorial/autotvm_matmul_x86.html
@@ -669,16 +669,16 @@ reduce variance, we take 5 measurements and average them.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>waiting for device...
 device available
 Get devices for measurement successfully!
-No: 1   GFLOPS: 9.80/9.80       result: MeasureResult(costs=(0.0273801888,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.571343183517456, timestamp=1663569700.7747304)        [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 256])],None,80
-No: 2   GFLOPS: 2.56/9.80       result: MeasureResult(costs=(0.1050429266,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.831789255142212, timestamp=1663569703.1474528)        [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 8])],None,32
-No: 3   GFLOPS: 11.88/11.88     result: MeasureResult(costs=(0.0225865448,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5976603031158447, timestamp=1663569703.7128532)       [(&#39;tile_y&#39;, [-1, 64]), (&#39;tile_x&#39;, [-1, 32])],None,56
-No: 4   GFLOPS: 1.51/11.88      result: MeasureResult(costs=(0.1776615278,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.947221040725708, timestamp=1663569707.2437475)        [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 4])],None,20
-No: 5   GFLOPS: 3.66/11.88      result: MeasureResult(costs=(0.0734260772,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.3109934329986572, timestamp=1663569708.685995)        [(&#39;tile_y&#39;, [-1, 256]), (&#39;tile_x&#39;, [-1, 16])],None,48
-No: 6   GFLOPS: 1.82/11.88      result: MeasureResult(costs=(0.1478481288,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.487691879272461, timestamp=1663569711.7454987)        [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 4])],None,29
-No: 7   GFLOPS: 0.83/11.88      result: MeasureResult(costs=(0.3236701332,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.298122406005859, timestamp=1663569717.0876806)        [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 2])],None,19
-No: 8   GFLOPS: 10.52/11.88     result: MeasureResult(costs=(0.0255259578,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.55462646484375, timestamp=1663569717.6584527) [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 64])],None,62
-No: 9   GFLOPS: 1.73/11.88      result: MeasureResult(costs=(0.15490296580000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.576029062271118, timestamp=1663569720.3543928) [(&#39;tile_y&#39;, [-1, 2]), (&#39;tile_x&#39;, [-1, 2])],None,11
-No: 10  GFLOPS: 2.47/11.88      result: MeasureResult(costs=(0.10874074519999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.848226547241211, timestamp=1663569722.2602882) [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 4])],None,22
+No: 1   GFLOPS: 9.12/9.12       result: MeasureResult(costs=(0.0294415704,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.604203462600708, timestamp=1663597546.662406) [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 256])],None,80
+No: 2   GFLOPS: 2.55/9.12       result: MeasureResult(costs=(0.1051508428,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8359980583190918, timestamp=1663597549.0406845)       [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 8])],None,32
+No: 3   GFLOPS: 11.76/11.76     result: MeasureResult(costs=(0.022827451800000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5912885665893555, timestamp=1663597549.610176)        [(&#39;tile_y&#39;, [-1, 64]), (&#39;tile_x&#39;, [-1, 32])],None,56
+No: 4   GFLOPS: 1.85/11.76      result: MeasureResult(costs=(0.145001262,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.443932056427002, timestamp=1663597552.6316605) [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 4])],None,20
+No: 5   GFLOPS: 3.71/11.76      result: MeasureResult(costs=(0.07244774979999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2978484630584717, timestamp=1663597554.0604458)        [(&#39;tile_y&#39;, [-1, 256]), (&#39;tile_x&#39;, [-1, 16])],None,48
+No: 6   GFLOPS: 1.72/11.76      result: MeasureResult(costs=(0.1563153742,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.6690032482147217, timestamp=1663597556.7735612)       [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 4])],None,29
+No: 7   GFLOPS: 0.87/11.76      result: MeasureResult(costs=(0.3092872586,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.071336507797241, timestamp=1663597562.4192626)        [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 2])],None,19
+No: 8   GFLOPS: 10.47/11.76     result: MeasureResult(costs=(0.025647052200000003,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5561935901641846, timestamp=1663597562.9923697)       [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 64])],None,62
+No: 9   GFLOPS: 1.90/11.76      result: MeasureResult(costs=(0.1412330426,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.379261016845703, timestamp=1663597565.4917383)        [(&#39;tile_y&#39;, [-1, 2]), (&#39;tile_x&#39;, [-1, 2])],None,11
+No: 10  GFLOPS: 2.74/11.76      result: MeasureResult(costs=(0.0980500346,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.689483642578125, timestamp=1663597567.221062) [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 4])],None,22
 </pre></div>
 </div>
 <p>With tuning completed, we can choose the configuration from the log file that
diff --git a/docs/tutorial/autotvm_relay_x86.html b/docs/tutorial/autotvm_relay_x86.html
index 1279bccd76..2cb410ee20 100644
--- a/docs/tutorial/autotvm_relay_x86.html
+++ b/docs/tutorial/autotvm_relay_x86.html
@@ -547,7 +547,7 @@ standard deviation.</p>
 <span class="nb">print</span><span class="p">(</span><a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">unoptimized</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>{&#39;mean&#39;: 512.4212018799153, &#39;median&#39;: 512.6739428502333, &#39;std&#39;: 1.444597566839756}
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>{&#39;mean&#39;: 511.21548624999724, &#39;median&#39;: 511.614363700005, &#39;std&#39;: 1.595007203886461}
 </pre></div>
 </div>
 </div>
@@ -699,178 +699,178 @@ depending on the specifics of the model and the target platform.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>[Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  1/25]  Current/Best:   17.45/  17.45 GFLOPS | Progress: (4/20) | 6.42 s
-[Task  1/25]  Current/Best:    6.08/  17.45 GFLOPS | Progress: (8/20) | 9.52 s
-[Task  1/25]  Current/Best:   11.14/  22.20 GFLOPS | Progress: (12/20) | 12.06 s
-[Task  1/25]  Current/Best:   16.49/  22.33 GFLOPS | Progress: (16/20) | 13.75 s
-[Task  1/25]  Current/Best:   11.29/  23.38 GFLOPS | Progress: (20/20) | 15.54 s Done.
+[Task  1/25]  Current/Best:   17.63/  17.63 GFLOPS | Progress: (4/20) | 6.33 s
+[Task  1/25]  Current/Best:    6.11/  17.63 GFLOPS | Progress: (8/20) | 9.30 s
+[Task  1/25]  Current/Best:   11.26/  22.23 GFLOPS | Progress: (12/20) | 11.82 s
+[Task  1/25]  Current/Best:   16.53/  22.23 GFLOPS | Progress: (16/20) | 13.52 s
+[Task  1/25]  Current/Best:   11.33/  23.63 GFLOPS | Progress: (20/20) | 15.31 s Done.
 
 [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  2/25]  Current/Best:   12.06/  12.27 GFLOPS | Progress: (4/20) | 3.98 s
-[Task  2/25]  Current/Best:   12.46/  18.62 GFLOPS | Progress: (8/20) | 5.29 s
-[Task  2/25]  Current/Best:   20.81/  20.81 GFLOPS | Progress: (12/20) | 6.62 s
-[Task  2/25]  Current/Best:   10.61/  20.81 GFLOPS | Progress: (16/20) | 7.91 s
-[Task  2/25]  Current/Best:   16.77/  20.81 GFLOPS | Progress: (20/20) | 9.55 s Done.
+[Task  2/25]  Current/Best:   12.27/  12.27 GFLOPS | Progress: (4/20) | 3.89 s
+[Task  2/25]  Current/Best:   12.61/  18.70 GFLOPS | Progress: (8/20) | 5.18 s
+[Task  2/25]  Current/Best:   20.88/  20.88 GFLOPS | Progress: (12/20) | 6.49 s
+[Task  2/25]  Current/Best:   10.63/  20.88 GFLOPS | Progress: (16/20) | 7.74 s
+[Task  2/25]  Current/Best:   17.59/  20.88 GFLOPS | Progress: (20/20) | 9.33 s Done.
 
 [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  3/25]  Current/Best:    1.63/  10.15 GFLOPS | Progress: (4/20) | 5.96 s
-[Task  3/25]  Current/Best:   15.31/  16.64 GFLOPS | Progress: (8/20) | 7.92 s
-[Task  3/25]  Current/Best:   14.98/  16.64 GFLOPS | Progress: (12/20) | 9.67 s
-[Task  3/25]  Current/Best:    6.83/  22.97 GFLOPS | Progress: (16/20) | 11.65 s
-[Task  3/25]  Current/Best:   11.02/  22.97 GFLOPS | Progress: (20/20) | 16.33 s Done.
+[Task  3/25]  Current/Best:    1.63/  10.17 GFLOPS | Progress: (4/20) | 5.87 s
+[Task  3/25]  Current/Best:   15.38/  16.87 GFLOPS | Progress: (8/20) | 7.83 s
+[Task  3/25]  Current/Best:   15.04/  16.87 GFLOPS | Progress: (12/20) | 9.60 s
+[Task  3/25]  Current/Best:    6.81/  23.41 GFLOPS | Progress: (16/20) | 11.55 s
+[Task  3/25]  Current/Best:   11.08/  23.41 GFLOPS | Progress: (20/20) | 16.16 s Done.
 
 [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  4/25]  Current/Best:    9.04/  18.61 GFLOPS | Progress: (4/20) | 2.50 s
-[Task  4/25]  Current/Best:    6.60/  18.61 GFLOPS | Progress: (8/20) | 7.25 s
-[Task  4/25]  Current/Best:   20.72/  20.72 GFLOPS | Progress: (12/20) | 12.27 s
-[Task  4/25]  Current/Best:   16.35/  20.72 GFLOPS | Progress: (16/20) | 14.67 s
-[Task  4/25]  Current/Best:   12.72/  20.72 GFLOPS | Progress: (20/20) | 16.73 s Done.
+[Task  4/25]  Current/Best:    9.07/  17.51 GFLOPS | Progress: (4/20) | 2.41 s
+[Task  4/25]  Current/Best:    6.21/  17.51 GFLOPS | Progress: (8/20) | 7.17 s
+[Task  4/25]  Current/Best:   20.64/  20.64 GFLOPS | Progress: (12/20) | 12.14 s
+[Task  4/25]  Current/Best:   16.33/  20.64 GFLOPS | Progress: (16/20) | 14.52 s
+[Task  4/25]  Current/Best:   12.15/  20.64 GFLOPS | Progress: (20/20) | 16.62 s Done.
 
 [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  5/25]  Current/Best:    9.29/   9.88 GFLOPS | Progress: (4/20) | 2.72 s
-[Task  5/25]  Current/Best:   11.60/  11.60 GFLOPS | Progress: (8/20) | 4.79 s
-[Task  5/25]  Current/Best:   11.19/  17.88 GFLOPS | Progress: (12/20) | 7.98 s
-[Task  5/25]  Current/Best:   11.60/  21.93 GFLOPS | Progress: (16/20) | 9.45 s
-[Task  5/25]  Current/Best:   12.07/  21.93 GFLOPS | Progress: (20/20) | 11.39 s Done.
+[Task  5/25]  Current/Best:    9.14/   9.87 GFLOPS | Progress: (4/20) | 2.64 s
+[Task  5/25]  Current/Best:   11.64/  11.64 GFLOPS | Progress: (8/20) | 4.74 s
+[Task  5/25]  Current/Best:   11.87/  18.09 GFLOPS | Progress: (12/20) | 7.93 s
+[Task  5/25]  Current/Best:   11.46/  21.99 GFLOPS | Progress: (16/20) | 9.36 s
+[Task  5/25]  Current/Best:   12.15/  21.99 GFLOPS | Progress: (20/20) | 11.27 s Done.
 
 [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  6/25]  Current/Best:   12.01/  19.97 GFLOPS | Progress: (4/20) | 4.17 s
-[Task  6/25]  Current/Best:   18.93/  19.97 GFLOPS | Progress: (8/20) | 5.94 s
-[Task  6/25]  Current/Best:   13.29/  19.97 GFLOPS | Progress: (12/20) | 7.93 s
-[Task  6/25]  Current/Best:   19.43/  19.97 GFLOPS | Progress: (16/20) | 10.21 s
-[Task  6/25]  Current/Best:    3.73/  19.97 GFLOPS | Progress: (20/20) | 12.81 s Done.
+[Task  6/25]  Current/Best:   11.99/  20.00 GFLOPS | Progress: (4/20) | 4.10 s
+[Task  6/25]  Current/Best:   18.84/  20.00 GFLOPS | Progress: (8/20) | 5.89 s
+[Task  6/25]  Current/Best:   13.17/  20.00 GFLOPS | Progress: (12/20) | 7.90 s
+[Task  6/25]  Current/Best:   19.65/  20.00 GFLOPS | Progress: (16/20) | 10.16 s
+[Task  6/25]  Current/Best:    3.69/  20.00 GFLOPS | Progress: (20/20) | 12.73 s Done.
 
 [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  7/25]  Current/Best:    9.79/  12.11 GFLOPS | Progress: (4/20) | 3.69 s
-[Task  7/25]  Current/Best:   19.47/  20.10 GFLOPS | Progress: (8/20) | 5.24 s
-[Task  7/25]  Current/Best:   15.84/  20.10 GFLOPS | Progress: (12/20) | 7.15 s
-[Task  7/25]  Current/Best:   12.14/  20.10 GFLOPS | Progress: (16/20) | 9.23 s
-[Task  7/25]  Current/Best:    6.12/  20.10 GFLOPS | Progress: (20/20) | 11.74 s Done.
+[Task  7/25]  Current/Best:    9.80/  12.07 GFLOPS | Progress: (4/20) | 3.69 s
+[Task  7/25]  Current/Best:   18.94/  19.93 GFLOPS | Progress: (8/20) | 5.24 s
+[Task  7/25]  Current/Best:   16.06/  19.93 GFLOPS | Progress: (12/20) | 7.20 s
+[Task  7/25]  Current/Best:   12.15/  19.93 GFLOPS | Progress: (16/20) | 9.29 s
+[Task  7/25]  Current/Best:    5.99/  20.56 GFLOPS | Progress: (20/20) | 11.80 s Done.
 
 [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  8/25]  Current/Best:    9.97/  13.88 GFLOPS | Progress: (4/20) | 3.00 s
-[Task  8/25]  Current/Best:    9.68/  13.88 GFLOPS | Progress: (8/20) | 8.06 s
-[Task  8/25]  Current/Best:   12.88/  13.88 GFLOPS | Progress: (12/20) | 14.56 s
-[Task  8/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (16/20) | 16.68 s
-[Task  8/25]  Current/Best:   18.57/  18.80 GFLOPS | Progress: (20/20) | 23.74 s Done.
+[Task  8/25]  Current/Best:    9.75/  13.51 GFLOPS | Progress: (4/20) | 3.00 s
+[Task  8/25]  Current/Best:    9.00/  13.51 GFLOPS | Progress: (8/20) | 8.18 s
+[Task  8/25]  Current/Best:   12.52/  13.51 GFLOPS | Progress: (12/20) | 14.67 s
+[Task  8/25]  Current/Best:   18.97/  18.97 GFLOPS | Progress: (16/20) | 16.81 s
+[Task  8/25]  Current/Best:   18.34/  18.97 GFLOPS | Progress: (20/20) | 23.89 s Done.
 
 [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  9/25]  Current/Best:   14.38/  14.38 GFLOPS | Progress: (4/20) | 12.02 s
-[Task  9/25]  Current/Best:   22.15/  22.15 GFLOPS | Progress: (8/20) | 13.78 s
-[Task  9/25]  Current/Best:    8.00/  22.15 GFLOPS | Progress: (12/20) | 16.32 s
-[Task  9/25]  Current/Best:   17.90/  22.15 GFLOPS | Progress: (16/20) | 19.06 s
-[Task  9/25]  Current/Best:    9.11/  22.15 GFLOPS | Progress: (20/20) | 27.53 s
+[Task  9/25]  Current/Best:   14.35/  14.35 GFLOPS | Progress: (4/20) | 11.97 s
+[Task  9/25]  Current/Best:   21.70/  21.70 GFLOPS | Progress: (8/20) | 13.79 s
+[Task  9/25]  Current/Best:    7.66/  21.70 GFLOPS | Progress: (12/20) | 16.33 s
+[Task  9/25]  Current/Best:   17.92/  21.70 GFLOPS | Progress: (16/20) | 19.21 s
+[Task  9/25]  Current/Best:    9.05/  21.70 GFLOPS | Progress: (20/20) | 27.80 s
 [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 10/25]  Current/Best:   17.66/  17.66 GFLOPS | Progress: (4/20) | 2.63 s
-[Task 10/25]  Current/Best:   15.64/  17.66 GFLOPS | Progress: (8/20) | 4.25 s
-[Task 10/25]  Current/Best:   11.71/  18.99 GFLOPS | Progress: (12/20) | 5.81 s
-[Task 10/25]  Current/Best:   19.08/  20.22 GFLOPS | Progress: (16/20) | 6.93 s
-[Task 10/25]  Current/Best:    8.50/  20.22 GFLOPS | Progress: (20/20) | 8.46 s Done.
+[Task 10/25]  Current/Best:   17.96/  17.96 GFLOPS | Progress: (4/20) | 2.58 s
+[Task 10/25]  Current/Best:   15.66/  17.96 GFLOPS | Progress: (8/20) | 4.20 s
+[Task 10/25]  Current/Best:   11.33/  18.78 GFLOPS | Progress: (12/20) | 5.76 s
+[Task 10/25]  Current/Best:   19.12/  20.07 GFLOPS | Progress: (16/20) | 6.86 s
+[Task 10/25]  Current/Best:    8.36/  20.07 GFLOPS | Progress: (20/20) | 8.42 s Done.
 
 [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 11/25]  Current/Best:   10.85/  18.19 GFLOPS | Progress: (4/20) | 3.40 s
-[Task 11/25]  Current/Best:   14.92/  18.19 GFLOPS | Progress: (8/20) | 6.22 s
-[Task 11/25]  Current/Best:   15.88/  18.19 GFLOPS | Progress: (12/20) | 8.36 s
-[Task 11/25]  Current/Best:   11.87/  20.63 GFLOPS | Progress: (16/20) | 11.35 s
-[Task 11/25]  Current/Best:   18.40/  20.63 GFLOPS | Progress: (20/20) | 13.47 s Done.
+[Task 11/25]  Current/Best:   10.82/  18.17 GFLOPS | Progress: (4/20) | 3.43 s
+[Task 11/25]  Current/Best:   14.96/  18.17 GFLOPS | Progress: (8/20) | 6.27 s
+[Task 11/25]  Current/Best:   15.85/  18.17 GFLOPS | Progress: (12/20) | 8.36 s
+[Task 11/25]  Current/Best:   11.85/  20.71 GFLOPS | Progress: (16/20) | 11.24 s
+[Task 11/25]  Current/Best:   17.77/  20.71 GFLOPS | Progress: (20/20) | 13.37 s Done.
 
 [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 12/25]  Current/Best:    7.76/  17.98 GFLOPS | Progress: (4/20) | 5.75 s
-[Task 12/25]  Current/Best:    5.11/  17.98 GFLOPS | Progress: (8/20) | 9.71 s
-[Task 12/25]  Current/Best:   18.95/  18.95 GFLOPS | Progress: (12/20) | 11.70 s
-[Task 12/25]  Current/Best:   15.19/  18.95 GFLOPS | Progress: (16/20) | 14.62 s
-[Task 12/25]  Current/Best:   15.11/  18.95 GFLOPS | Progress: (20/20) | 16.60 s Done.
+[Task 12/25]  Current/Best:    7.81/  17.64 GFLOPS | Progress: (4/20) | 5.76 s
+[Task 12/25]  Current/Best:    5.07/  17.64 GFLOPS | Progress: (8/20) | 9.74 s
+[Task 12/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (12/20) | 11.75 s
+[Task 12/25]  Current/Best:   14.94/  18.80 GFLOPS | Progress: (16/20) | 14.67 s
+[Task 12/25]  Current/Best:   15.14/  18.80 GFLOPS | Progress: (20/20) | 16.64 s Done.
 
 [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 13/25]  Current/Best:    8.31/  17.15 GFLOPS | Progress: (4/20) | 3.82 s
-[Task 13/25]  Current/Best:   15.23/  20.65 GFLOPS | Progress: (8/20) | 6.44 s
-[Task 13/25]  Current/Best:   18.83/  21.24 GFLOPS | Progress: (12/20) | 9.54 s
-[Task 13/25]  Current/Best:   12.25/  21.24 GFLOPS | Progress: (16/20) | 13.01 s
-[Task 13/25]  Current/Best:   17.49/  21.24 GFLOPS | Progress: (20/20) | 15.40 s Done.
+[Task 13/25]  Current/Best:    8.66/  17.34 GFLOPS | Progress: (4/20) | 3.75 s
+[Task 13/25]  Current/Best:   15.21/  20.78 GFLOPS | Progress: (8/20) | 6.38 s
+[Task 13/25]  Current/Best:   18.51/  20.89 GFLOPS | Progress: (12/20) | 9.54 s
+[Task 13/25]  Current/Best:   12.22/  20.89 GFLOPS | Progress: (16/20) | 12.99 s
+[Task 13/25]  Current/Best:   17.96/  20.89 GFLOPS | Progress: (20/20) | 15.37 s Done.
 
 [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 14/25]  Current/Best:   12.18/  13.35 GFLOPS | Progress: (4/20) | 3.48 s
-[Task 14/25]  Current/Best:    6.10/  13.35 GFLOPS | Progress: (8/20) | 5.68 s
-[Task 14/25]  Current/Best:   20.36/  20.36 GFLOPS | Progress: (12/20) | 8.36 s
-[Task 14/25]  Current/Best:   16.12/  20.36 GFLOPS | Progress: (16/20) | 10.07 s Done.
+[Task 14/25]  Current/Best:   12.13/  13.39 GFLOPS | Progress: (4/20) | 3.38 s
+[Task 14/25]  Current/Best:    6.08/  13.39 GFLOPS | Progress: (8/20) | 5.56 s
+[Task 14/25]  Current/Best:   19.46/  19.46 GFLOPS | Progress: (12/20) | 8.25 s
+[Task 14/25]  Current/Best:   14.93/  19.46 GFLOPS | Progress: (16/20) | 9.95 s Done.
 
-[Task 14/25]  Current/Best:   16.94/  20.36 GFLOPS | Progress: (20/20) | 11.86 s
+[Task 14/25]  Current/Best:   16.76/  19.46 GFLOPS | Progress: (20/20) | 11.69 s
 [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 15/25]  Current/Best:   15.69/  17.28 GFLOPS | Progress: (4/20) | 2.77 s
-[Task 15/25]  Current/Best:   12.67/  17.57 GFLOPS | Progress: (8/20) | 4.09 s
-[Task 15/25]  Current/Best:   10.02/  21.66 GFLOPS | Progress: (12/20) | 6.32 s
-[Task 15/25]  Current/Best:   20.30/  21.66 GFLOPS | Progress: (16/20) | 9.95 s
-[Task 15/25]  Current/Best:    9.53/  21.66 GFLOPS | Progress: (20/20) | 10.97 s
+[Task 15/25]  Current/Best:   15.64/  17.22 GFLOPS | Progress: (4/20) | 2.74 s
+[Task 15/25]  Current/Best:   12.65/  17.57 GFLOPS | Progress: (8/20) | 4.04 s
+[Task 15/25]  Current/Best:   10.00/  21.68 GFLOPS | Progress: (12/20) | 6.26 s
+[Task 15/25]  Current/Best:   20.29/  21.68 GFLOPS | Progress: (16/20) | 9.85 s
+[Task 15/25]  Current/Best:    9.53/  21.68 GFLOPS | Progress: (20/20) | 10.87 s
 [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 16/25]  Current/Best:   19.24/  19.24 GFLOPS | Progress: (4/20) | 3.03 s
-[Task 16/25]  Current/Best:    3.04/  19.24 GFLOPS | Progress: (8/20) | 4.65 s
-[Task 16/25]  Current/Best:   18.20/  19.32 GFLOPS | Progress: (12/20) | 5.88 s
-[Task 16/25]  Current/Best:   18.11/  19.32 GFLOPS | Progress: (16/20) | 7.27 s
-[Task 16/25]  Current/Best:   10.26/  21.28 GFLOPS | Progress: (20/20) | 9.42 s Done.
+[Task 16/25]  Current/Best:   19.31/  19.31 GFLOPS | Progress: (4/20) | 3.00 s
+[Task 16/25]  Current/Best:    2.99/  19.31 GFLOPS | Progress: (8/20) | 4.62 s
+[Task 16/25]  Current/Best:   17.80/  19.31 GFLOPS | Progress: (12/20) | 5.85 s
+[Task 16/25]  Current/Best:   17.72/  19.31 GFLOPS | Progress: (16/20) | 7.24 s
+[Task 16/25]  Current/Best:    9.84/  21.21 GFLOPS | Progress: (20/20) | 9.39 s Done.
 
 [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 17/25]  Current/Best:   12.93/  16.17 GFLOPS | Progress: (4/20) | 4.87 s
-[Task 17/25]  Current/Best:   12.88/  22.95 GFLOPS | Progress: (8/20) | 7.78 s
-[Task 17/25]  Current/Best:   16.51/  22.95 GFLOPS | Progress: (12/20) | 9.89 s
-[Task 17/25]  Current/Best:   16.44/  22.95 GFLOPS | Progress: (16/20) | 12.11 s
-[Task 17/25]  Current/Best:    9.98/  22.95 GFLOPS | Progress: (20/20) | 14.27 s Done.
+[Task 17/25]  Current/Best:   12.18/  16.08 GFLOPS | Progress: (4/20) | 4.82 s
+[Task 17/25]  Current/Best:   12.82/  23.00 GFLOPS | Progress: (8/20) | 7.71 s
+[Task 17/25]  Current/Best:   16.52/  23.00 GFLOPS | Progress: (12/20) | 9.83 s
+[Task 17/25]  Current/Best:   16.39/  23.00 GFLOPS | Progress: (16/20) | 12.07 s
+[Task 17/25]  Current/Best:    9.97/  23.00 GFLOPS | Progress: (20/20) | 14.22 s Done.
 
 [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 18/25]  Current/Best:   10.44/  16.39 GFLOPS | Progress: (4/20) | 3.85 s
-[Task 18/25]  Current/Best:   10.58/  19.03 GFLOPS | Progress: (8/20) | 7.49 s
-[Task 18/25]  Current/Best:   18.99/  19.03 GFLOPS | Progress: (12/20) | 9.45 s
-[Task 18/25]  Current/Best:   10.05/  19.03 GFLOPS | Progress: (16/20) | 13.34 s
-[Task 18/25]  Current/Best:   20.72/  20.72 GFLOPS | Progress: (20/20) | 14.89 s Done.
+[Task 18/25]  Current/Best:    9.92/  16.60 GFLOPS | Progress: (4/20) | 3.85 s
+[Task 18/25]  Current/Best:   10.59/  19.11 GFLOPS | Progress: (8/20) | 7.56 s
+[Task 18/25]  Current/Best:   18.94/  19.11 GFLOPS | Progress: (12/20) | 9.51 s
+[Task 18/25]  Current/Best:   10.00/  19.11 GFLOPS | Progress: (16/20) | 13.32 s
+[Task 18/25]  Current/Best:   20.77/  20.77 GFLOPS | Progress: (20/20) | 14.85 s Done.
 
 [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 19/25]  Current/Best:    7.32/  19.84 GFLOPS | Progress: (4/20) | 6.18 s
-[Task 19/25]  Current/Best:    2.68/  19.84 GFLOPS | Progress: (8/20) | 9.53 s
-[Task 19/25]  Current/Best:   19.14/  20.92 GFLOPS | Progress: (12/20) | 12.51 s
-[Task 19/25]  Current/Best:   12.87/  21.50 GFLOPS | Progress: (16/20) | 15.58 s
-[Task 19/25]  Current/Best:    2.69/  22.45 GFLOPS | Progress: (20/20) | 18.43 s Done.
+[Task 19/25]  Current/Best:    7.30/  19.90 GFLOPS | Progress: (4/20) | 6.12 s
+[Task 19/25]  Current/Best:    2.69/  19.90 GFLOPS | Progress: (8/20) | 9.50 s
+[Task 19/25]  Current/Best:   17.35/  20.91 GFLOPS | Progress: (12/20) | 12.51 s
+[Task 19/25]  Current/Best:   13.77/  20.91 GFLOPS | Progress: (16/20) | 15.53 s
+[Task 19/25]  Current/Best:    2.70/  22.47 GFLOPS | Progress: (20/20) | 18.41 s Done.
 
 [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 20/25]  Current/Best:    9.61/  15.36 GFLOPS | Progress: (4/20) | 3.34 s Done.
+[Task 20/25]  Current/Best:    9.17/  14.25 GFLOPS | Progress: (4/20) | 3.35 s Done.
  Done.
 
-[Task 20/25]  Current/Best:   10.01/  15.36 GFLOPS | Progress: (8/20) | 6.91 s
-[Task 20/25]  Current/Best:    2.32/  15.36 GFLOPS | Progress: (12/20) | 10.87 s
-[Task 20/25]  Current/Best:   11.02/  15.36 GFLOPS | Progress: (16/20) | 14.76 s
-[Task 20/25]  Current/Best:   11.05/  21.33 GFLOPS | Progress: (20/20) | 16.92 s
+[Task 20/25]  Current/Best:    9.94/  14.25 GFLOPS | Progress: (8/20) | 6.92 s
+[Task 20/25]  Current/Best:    2.32/  14.51 GFLOPS | Progress: (12/20) | 10.88 s
+[Task 20/25]  Current/Best:   10.91/  14.51 GFLOPS | Progress: (16/20) | 14.74 s
+[Task 20/25]  Current/Best:   10.76/  22.05 GFLOPS | Progress: (20/20) | 16.88 s
 [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 21/25]  Current/Best:    6.35/  17.71 GFLOPS | Progress: (4/20) | 3.29 s
-[Task 21/25]  Current/Best:   14.64/  17.71 GFLOPS | Progress: (8/20) | 4.93 s
-[Task 21/25]  Current/Best:    1.61/  17.71 GFLOPS | Progress: (12/20) | 7.06 s
-[Task 21/25]  Current/Best:   15.98/  17.71 GFLOPS | Progress: (16/20) | 10.58 s
-[Task 21/25]  Current/Best:    4.43/  17.71 GFLOPS | Progress: (20/20) | 17.84 s
+[Task 21/25]  Current/Best:    6.36/  17.69 GFLOPS | Progress: (4/20) | 3.29 s
+[Task 21/25]  Current/Best:   14.60/  17.69 GFLOPS | Progress: (8/20) | 4.90 s
+[Task 21/25]  Current/Best:    1.61/  17.69 GFLOPS | Progress: (12/20) | 7.06 s
+[Task 21/25]  Current/Best:   15.91/  17.69 GFLOPS | Progress: (16/20) | 10.59 s
+[Task 21/25]  Current/Best:    4.45/  17.69 GFLOPS | Progress: (20/20) | 17.87 s
 [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 22/25]  Current/Best:    2.70/  16.95 GFLOPS | Progress: (4/20) | 2.71 s
-[Task 22/25]  Current/Best:    9.04/  21.20 GFLOPS | Progress: (8/20) | 4.76 s
-[Task 22/25]  Current/Best:   19.94/  21.20 GFLOPS | Progress: (12/20) | 7.15 s
-[Task 22/25]  Current/Best:   15.44/  21.20 GFLOPS | Progress: (16/20) | 9.29 s
-[Task 22/25]  Current/Best:   12.51/  21.20 GFLOPS | Progress: (20/20) | 11.07 s Done.
+[Task 22/25]  Current/Best:    2.70/  16.92 GFLOPS | Progress: (4/20) | 2.69 s
+[Task 22/25]  Current/Best:    8.71/  20.62 GFLOPS | Progress: (8/20) | 4.67 s
+[Task 22/25]  Current/Best:   19.64/  20.62 GFLOPS | Progress: (12/20) | 7.08 s
+[Task 22/25]  Current/Best:   15.43/  20.62 GFLOPS | Progress: (16/20) | 9.20 s
+[Task 22/25]  Current/Best:   12.78/  20.62 GFLOPS | Progress: (20/20) | 10.97 s Done.
 
 [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 23/25]  Current/Best:   16.61/  19.37 GFLOPS | Progress: (4/20) | 3.35 s
-[Task 23/25]  Current/Best:   13.64/  19.87 GFLOPS | Progress: (8/20) | 6.77 s
-[Task 23/25]  Current/Best:   20.26/  21.63 GFLOPS | Progress: (12/20) | 8.63 s
-[Task 23/25]  Current/Best:    6.55/  21.63 GFLOPS | Progress: (16/20) | 15.62 s
-[Task 23/25]  Current/Best:    7.63/  21.63 GFLOPS | Progress: (20/20) | 19.86 s Done.
+[Task 23/25]  Current/Best:   16.76/  20.09 GFLOPS | Progress: (4/20) | 3.29 s
+[Task 23/25]  Current/Best:   13.51/  20.09 GFLOPS | Progress: (8/20) | 6.70 s
+[Task 23/25]  Current/Best:   20.61/  21.81 GFLOPS | Progress: (12/20) | 8.55 s
+[Task 23/25]  Current/Best:    6.61/  21.81 GFLOPS | Progress: (16/20) | 15.62 s
+[Task 23/25]  Current/Best:    7.68/  21.81 GFLOPS | Progress: (20/20) | 19.84 s Done.
 
 [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 24/25]  Current/Best:    8.37/   8.37 GFLOPS | Progress: (4/20) | 11.84 s
-[Task 24/25]  Current/Best:    1.90/   8.37 GFLOPS | Progress: (8/20) | 22.87 s
-[Task 24/25]  Current/Best:    3.92/   8.37 GFLOPS | Progress: (12/20) | 34.44 s Done.
+[Task 24/25]  Current/Best:    8.19/   8.19 GFLOPS | Progress: (4/20) | 11.80 s
+[Task 24/25]  Current/Best:    3.32/   8.19 GFLOPS | Progress: (8/20) | 23.05 s
+[Task 24/25]  Current/Best:    3.98/   8.19 GFLOPS | Progress: (12/20) | 33.80 s Done.
 
-[Task 24/25]  Current/Best:    6.43/   8.89 GFLOPS | Progress: (16/20) | 40.12 s
-[Task 24/25]  Current/Best:    2.95/   8.89 GFLOPS | Progress: (20/20) | 46.06 s Done.
+[Task 24/25]  Current/Best:    5.48/   8.67 GFLOPS | Progress: (16/20) | 39.37 s
+[Task 24/25]  Current/Best:    3.03/   8.67 GFLOPS | Progress: (20/20) | 45.34 s Done.
 
 [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 25/25]  Current/Best:    1.55/   2.77 GFLOPS | Progress: (4/20) | 11.63 s
-[Task 25/25]  Current/Best:    5.88/   8.44 GFLOPS | Progress: (8/20) | 22.91 s
-[Task 25/25]  Current/Best:    6.02/   8.44 GFLOPS | Progress: (12/20) | 34.22 s
-[Task 25/25]  Current/Best:    5.85/   8.94 GFLOPS | Progress: (16/20) | 36.13 s
-[Task 25/25]  Current/Best:    2.81/   9.18 GFLOPS | Progress: (20/20) | 46.84 s
+[Task 25/25]  Current/Best:    1.55/   2.75 GFLOPS | Progress: (4/20) | 11.61 s
+[Task 25/25]  Current/Best:    5.59/   7.97 GFLOPS | Progress: (8/20) | 22.89 s
+[Task 25/25]  Current/Best:    5.82/   7.97 GFLOPS | Progress: (12/20) | 34.38 s
+[Task 25/25]  Current/Best:    5.62/   8.14 GFLOPS | Progress: (16/20) | 36.26 s
+[Task 25/25]  Current/Best:    2.83/   8.28 GFLOPS | Progress: (20/20) | 46.98 s
 </pre></div>
 </div>
 <p>The output from this tuning process will look something like this:</p>
@@ -934,8 +934,8 @@ model using optimized operators to speed up our computations.</p>
     <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;class=&#39;</span><span class="si">%s</span><span class="s2">&#39; with probability=</span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="p">(</span><a href="https://docs.python.org/3/library/stdtypes.html#list" title="builtins.list" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">labels</span></a [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>class=&#39;n02123045 tabby, tabby cat&#39; with probability=0.621104
-class=&#39;n02123159 tiger cat&#39; with probability=0.356378
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>class=&#39;n02123045 tabby, tabby cat&#39; with probability=0.621105
+class=&#39;n02123159 tiger cat&#39; with probability=0.356377
 class=&#39;n02124075 Egyptian cat&#39; with probability=0.019712
 class=&#39;n02129604 tiger, Panthera tigris&#39; with probability=0.001215
 class=&#39;n04040759 radiator&#39; with probability=0.000262
@@ -972,8 +972,8 @@ improvement in comparing the optimized model to the unoptimized model.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;unoptimized: </span><span class="si">%s</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="p">(</span><a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">unoptimized</span></a><span class="p">))</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>optimized: {&#39;mean&#39;: 411.6166502700071, &#39;median&#39;: 411.56105894988286, &#39;std&#39;: 1.0797258407980037}
-unoptimized: {&#39;mean&#39;: 512.4212018799153, &#39;median&#39;: 512.6739428502333, &#39;std&#39;: 1.444597566839756}
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>optimized: {&#39;mean&#39;: 407.16022603000283, &#39;median&#39;: 407.1732334000103, &#39;std&#39;: 0.8974312165856505}
+unoptimized: {&#39;mean&#39;: 511.21548624999724, &#39;median&#39;: 511.614363700005, &#39;std&#39;: 1.595007203886461}
 </pre></div>
 </div>
 </div>
@@ -987,7 +987,7 @@ models.</p>
 <p>Here we presented a simple example using ResNet-50 v2 locally. However, TVM
 supports many more features including cross-compilation, remote execution and
 profiling/benchmarking.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 10 minutes  28.919 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 10 minutes  25.412 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-autotvm-relay-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/57a45d9bef1af358191e7d50043e652c/autotvm_relay_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">autotvm_relay_x86.py</span></code></a></p>
diff --git a/docs/tutorial/cross_compilation_and_rpc.html b/docs/tutorial/cross_compilation_and_rpc.html
index 24fd90605f..d179bbfdf8 100644
--- a/docs/tutorial/cross_compilation_and_rpc.html
+++ b/docs/tutorial/cross_compilation_and_rpc.html
@@ -527,7 +527,7 @@ device and returns the measured cost. Network overhead is excluded.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;</span><span class="si">%g</span><span class="s2"> secs/op&quot;</span> <span class="o">%</span> <span class="n">cost</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>1.533e-07 secs/op
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>1.228e-07 secs/op
 </pre></div>
 </div>
 </div>
diff --git a/docs/tutorial/intro_topi.html b/docs/tutorial/intro_topi.html
index 700839286b..9e0eb45da4 100644
--- a/docs/tutorial/intro_topi.html
+++ b/docs/tutorial/intro_topi.html
@@ -484,7 +484,7 @@ we can schedule the following series of operations ending with <code class="code
 <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/ir.html#tvm.ir.Array" title="tvm.ir.Array" class="sphx-glr-backref-module-tvm-ir sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">sg</span><span class="o">.</span><span class="n">stages</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>[stage(a, placeholder(a, 0x1fa8da40)), stage(b, placeholder(b, 0x1f6e8b50)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[ [...]
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>[stage(a, placeholder(a, 0x57a23f0)), stage(b, placeholder(b, 0x27909010)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[i [...]
 </pre></div>
 </div>
 <p>We can test the correctness by comparing with <code class="code docutils literal notranslate"><span class="pre">numpy</span></code> result as follows</p>
diff --git a/docs/tutorial/sg_execution_times.html b/docs/tutorial/sg_execution_times.html
index f4596fe0fc..e3654418a0 100644
--- a/docs/tutorial/sg_execution_times.html
+++ b/docs/tutorial/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-tutorial-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>13:29.356</strong> total execution time for <strong>tutorial</strong> files:</p>
+<p><strong>13:12.068</strong> total execution time for <strong>tutorial</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,50 +336,50 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="autotvm_relay_x86.html#sphx-glr-tutorial-autotvm-relay-x86-py"><span class="std std-ref">Compiling and Optimizing a Model with the Python Interface (AutoTVM)</span></a> (<code class="docutils literal notranslate"><span class="pre">autotvm_relay_x86.py</span></code>)</p></td>
-<td><p>10:28.919</p></td>
+<td><p>10:25.412</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="auto_scheduler_matmul_x86.html#sphx-glr-tutorial-auto-scheduler-matmul-x86-py"><span class="std std-ref">Optimizing Operators with Auto-scheduling</span></a> (<code class="docutils literal notranslate"><span class="pre">auto_scheduler_matmul_x86.py</span></code>)</p></td>
-<td><p>01:03.483</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="tensor_expr_get_started.html#sphx-glr-tutorial-tensor-expr-get-started-py"><span class="std std-ref">Working with Operators Using Tensor Expression</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_expr_get_started.py</span></code>)</p></td>
+<td><p>01:00.703</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="tensor_expr_get_started.html#sphx-glr-tutorial-tensor-expr-get-started-py"><span class="std std-ref">Working with Operators Using Tensor Expression</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_expr_get_started.py</span></code>)</p></td>
-<td><p>00:58.348</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="auto_scheduler_matmul_x86.html#sphx-glr-tutorial-auto-scheduler-matmul-x86-py"><span class="std std-ref">Optimizing Operators with Auto-scheduling</span></a> (<code class="docutils literal notranslate"><span class="pre">auto_scheduler_matmul_x86.py</span></code>)</p></td>
+<td><p>00:48.639</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="relay_quick_start.html#sphx-glr-tutorial-relay-quick-start-py"><span class="std std-ref">Quick Start Tutorial for Compiling Deep Learning Models</span></a> (<code class="docutils literal notranslate"><span class="pre">relay_quick_start.py</span></code>)</p></td>
-<td><p>00:31.656</p></td>
+<td><p>00:31.093</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="autotvm_matmul_x86.html#sphx-glr-tutorial-autotvm-matmul-x86-py"><span class="std std-ref">Optimizing Operators with Schedule Templates and AutoTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">autotvm_matmul_x86.py</span></code>)</p></td>
-<td><p>00:25.004</p></td>
+<td><p>00:24.126</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tensor_ir_blitz_course.html#sphx-glr-tutorial-tensor-ir-blitz-course-py"><span class="std std-ref">Blitz Course to TensorIR</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_ir_blitz_course.py</span></code>)</p></td>
-<td><p>00:01.068</p></td>
+<td><p>00:01.233</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="intro_topi.html#sphx-glr-tutorial-intro-topi-py"><span class="std std-ref">Introduction to TOPI</span></a> (<code class="docutils literal notranslate"><span class="pre">intro_topi.py</span></code>)</p></td>
-<td><p>00:00.699</p></td>
+<td><p>00:00.702</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="cross_compilation_and_rpc.html#sphx-glr-tutorial-cross-compilation-and-rpc-py"><span class="std std-ref">Cross Compilation and RPC</span></a> (<code class="docutils literal notranslate"><span class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.167</p></td>
+<td><p>00:00.152</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="introduction.html#sphx-glr-tutorial-introduction-py"><span class="std std-ref">Introduction</span></a> (<code class="docutils literal notranslate"><span class="pre">introduction.py</span></code>)</p></td>
-<td><p>00:00.007</p></td>
+<td><p>00:00.005</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="uma.html#sphx-glr-tutorial-uma-py"><span class="std std-ref">Making your Hardware Accelerator TVM-ready with UMA</span></a> (<code class="docutils literal notranslate"><span class="pre">uma.py</span></code>)</p></td>
-<td><p>00:00.001</p></td>
+<td><p>00:00.002</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="tvmc_python.html#sphx-glr-tutorial-tvmc-python-py"><span class="std std-ref">Getting Starting using TVMC Python: a high-level API for TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_python.py</span></code>)</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="tvmc_command_line_driver.html#sphx-glr-tutorial-tvmc-command-line-driver-py"><span class="std std-ref">Compiling and Optimizing a Model with TVMC</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_command_line_driver.py</span></code>)</p></td>
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="tvmc_command_line_driver.html#sphx-glr-tutorial-tvmc-command-line-driver-py"><span class="std std-ref">Compiling and Optimizing a Model with TVMC</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_command_line_driver.py</span></code>)</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="tvmc_python.html#sphx-glr-tutorial-tvmc-python-py"><span class="std std-ref">Getting Starting using TVMC Python: a high-level API for TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_python.py</span></code>)</p></td>
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
diff --git a/docs/tutorial/tensor_expr_get_started.html b/docs/tutorial/tensor_expr_get_started.html
index e30eb39f28..7a5fb53145 100644
--- a/docs/tutorial/tensor_expr_get_started.html
+++ b/docs/tutorial/tensor_expr_get_started.html
@@ -538,8 +538,8 @@ helper function to run a profile of the TVM generated code.</p>
 <span class="n">evaluate_addition</span><span class="p">(</span><span class="n">fadd</span><span class="p">,</span> <a href="../reference/api/python/target.html#tvm.target.Target" title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">tgt</span></a><span class="p">,</span> <span class="s2">&quot;naive&quot;</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#list" ti [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.000007
-naive: 0.000009
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.000008
+naive: 0.000007
 </pre></div>
 </div>
 </div>
@@ -588,7 +588,7 @@ compile and run this new schedule with the parallel operation applied:</p>
 <span class="n">evaluate_addition</span><span class="p">(</span><span class="n">fadd_parallel</span><span class="p">,</span> <a href="../reference/api/python/target.html#tvm.target.Target" title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">tgt</span></a><span class="p">,</span> <span class="s2">&quot;parallel&quot;</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.h [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>parallel: 0.000009
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>parallel: 0.000008
 </pre></div>
 </div>
 </div>
@@ -660,10 +660,10 @@ factor to be the number of threads on your CPU.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Operator                  Timing             Performance
-   numpy    7.336000053328462e-06                    1.0
-   naive    8.708999999999999e-06     1.1871592062010121
-parallel              9.2118e-06      1.2556979188979773
-  vector    2.4513500000000002e-05    3.3415348721102895
+   numpy    7.625989999269223e-06                    1.0
+   naive    6.6650000000000006e-06    0.8739848859805335
+parallel    8.055700000000001e-06     1.0563480939224883
+  vector    2.4568400000000004e-05    3.2216669576480332
 </pre></div>
 </div>
 <div class="admonition-code-specialization admonition">
@@ -979,7 +979,7 @@ matrix multiplication.</p>
 <span class="n">answer</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">numpy</span><span class="p">(),</span> <span class="n">b</span><span class="o">.</span><span class="n">numpy</span><span class="p">())</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018508
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.019004
 </pre></div>
 </div>
 <p>Now we write a basic matrix multiplication using TVM TE and verify that it
@@ -1020,7 +1020,7 @@ optimizations.</p>
 <span class="n">evaluate_operation</span><span class="p">(</span><a href="../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">s</span></a><span class="p">,</span> <span class="p">[</span><a href="../reference/api/python/te.html#tvm.te.Tensor" title="tvm.te.Tensor" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>none: 3.224370
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>none: 3.399834
 </pre></div>
 </div>
 <p>Let’s take a look at the intermediate representation of the operator and
@@ -1085,7 +1085,7 @@ schedule.</p>
 <span class="n">evaluate_operation</span><span class="p">(</span><a href="../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">s</span></a><span class="p">,</span> <span class="p">[</span><a href="../reference/api/python/te.html#tvm.te.Tensor" title="tvm.te.Tensor" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>blocking: 0.302546
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>blocking: 0.294778
 </pre></div>
 </div>
 <p>By reordering the computation to take advantage of caching, you should see a
@@ -1144,7 +1144,7 @@ already cache friendly from our previous optimizations.</p>
 <span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/driver.html#tvm.lower" title="tvm.lower" class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span class="n">tvm</span><span class="o">.</span><span class="n">lower</span></a><span class="p">(</span><a href="../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>vectorization: 0.336048
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>vectorization: 0.336641
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1199,7 +1199,7 @@ more cache friendly.</p>
 <span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/driver.html#tvm.lower" title="tvm.lower" class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span class="n">tvm</span><span class="o">.</span><span class="n">lower</span></a><span class="p">(</span><a href="../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>loop permutation: 0.113060
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>loop permutation: 0.115950
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1275,7 +1275,7 @@ optimized schedule.</p>
 <span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/driver.html#tvm.lower" title="tvm.lower" class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span class="n">tvm</span><span class="o">.</span><span class="n">lower</span></a><span class="p">(</span><a href="../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>array packing: 0.107321
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>array packing: 0.109689
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1349,7 +1349,7 @@ to `C</cite> when all the block results are ready.</p>
 <span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/driver.html#tvm.lower" title="tvm.lower" class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span class="n">tvm</span><span class="o">.</span><span class="n">lower</span></a><span class="p">(</span><a href="../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>block caching: 0.110543
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>block caching: 0.110457
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1416,7 +1416,7 @@ of thread-level parallelization.</p>
 <span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/driver.html#tvm.lower" title="tvm.lower" class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span class="n">tvm</span><span class="o">.</span><span class="n">lower</span></a><span class="p">(</span><a href="../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>parallelization: 0.146259
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>parallelization: 0.146722
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1478,13 +1478,13 @@ working, we can compare the results.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>        Operator                  Timing             Performance
-            none      3.2243697173999997                     1.0
-        blocking     0.30254556819999995     0.09383091726961149
-   vectorization            0.3360484456     0.10422143707235156
-loop permutation            0.1130601591     0.03506426651071736
-   array packing            0.1073209683     0.03328432459865032
-   block caching            0.1105431506     0.03428364619710468
- parallelization     0.14625904979999999    0.045360508446263825
+            none            3.3998342633                     1.0
+        blocking            0.2947782946       0.086703724879188
+   vectorization            0.3366409768     0.09901687868550518
+loop permutation     0.11595009150000002      0.0341046305555656
+   array packing            0.1096886654     0.03226294486882789
+   block caching     0.11045665819999999    0.032488836115436646
+ parallelization            0.1467221105    0.043155665581647004
 </pre></div>
 </div>
 <p>Note that the outputs on the web page reflect the running times on a
@@ -1516,6 +1516,7 @@ is</p>
 you can build generic templates of the matrix multiplication and other
 operations with tunable parameters that allow you to automatically optimize
 the computation for specific platforms.</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  0.703 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-tensor-expr-get-started-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/40a01cffb015a67aaec0fad7e27cf80d/tensor_expr_get_started.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tensor_expr_get_started.py</span></code></a></p>