Posted to commits@tvm.apache.org by tq...@apache.org on 2022/08/20 02:37:16 UTC

[tvm-site] branch asf-site updated: deploying docs (apache/tvm@bdcfa01eae3ffe8c6d39aa26d0d1e5b311d47efb)

This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 2986c03cf deploying docs (apache/tvm@bdcfa01eae3ffe8c6d39aa26d0d1e5b311d47efb)
2986c03cf is described below

commit 2986c03cf910b25b55d6837d1682c7faf2867b33
Author: tvm-bot <95...@users.noreply.github.com>
AuthorDate: Sat Aug 20 02:37:09 2022 +0000

    deploying docs (apache/tvm@bdcfa01eae3ffe8c6d39aa26d0d1e5b311d47efb)
---
 .../opt_gemm.ipynb                                 |   2 +-
 .../deploy_detection.py                            |   2 +-
 .../deploy_detection.ipynb                         |   2 +-
 .../96137df89d8034b548f407123ec50ce9/opt_gemm.py   |   2 +-
 docs/_sources/arch/pass_infra.rst.txt              |   2 +-
 docs/_sources/arch/security.rst.txt                |   2 +-
 .../how_to/compile_models/from_darknet.rst.txt     |   2 +-
 .../how_to/compile_models/from_mxnet.rst.txt       |   2 +-
 .../how_to/compile_models/from_oneflow.rst.txt     |   2 +-
 .../how_to/compile_models/from_pytorch.rst.txt     |   2 +-
 .../how_to/compile_models/from_tensorflow.rst.txt  |   2 +-
 .../compile_models/sg_execution_times.rst.txt      |  22 +-
 docs/_sources/how_to/deploy/index.rst.txt          |   2 +-
 .../deploy_models/deploy_model_on_android.rst.txt  |   2 +-
 .../deploy_object_detection_pytorch.rst.txt        |   4 +-
 .../deploy_models/deploy_prequantized.rst.txt      |   6 +-
 .../deploy_prequantized_tflite.rst.txt             |   4 +-
 .../how_to/deploy_models/deploy_quantized.rst.txt  |   2 +-
 .../deploy_models/deploy_ssd_gluoncv.rst.txt       |   4 +-
 .../deploy_models/sg_execution_times.rst.txt       |  18 +-
 .../extend_tvm/bring_your_own_datatypes.rst.txt    |   4 +-
 .../how_to/extend_tvm/sg_execution_times.rst.txt   |  10 +-
 .../how_to/extend_tvm/use_pass_instrument.rst.txt  |  16 +-
 .../optimize_operators/opt_conv_cuda.rst.txt       |   2 +-
 .../optimize_operators/opt_conv_tensorcore.rst.txt |   2 +-
 .../how_to/optimize_operators/opt_gemm.rst.txt     |  18 +-
 .../optimize_operators/sg_execution_times.rst.txt  |   8 +-
 .../sg_execution_times.rst.txt                     |  14 +-
 .../tune_conv2d_layer_cuda.rst.txt                 |  11 +-
 .../tune_network_cuda.rst.txt                      |   2 +-
 .../tune_network_x86.rst.txt                       |   4 +-
 .../tune_sparse_x86.rst.txt                        |  86 +++++--
 .../tune_with_autotvm/sg_execution_times.rst.txt   |   8 +-
 .../tune_with_autotvm/tune_conv2d_cuda.rst.txt     |  26 +--
 .../work_with_microtvm/micro_autotune.rst.txt      |  16 +-
 .../how_to/work_with_microtvm/micro_train.rst.txt  |  16 +-
 .../work_with_microtvm/sg_execution_times.rst.txt  |  10 +-
 .../work_with_relay/sg_execution_times.rst.txt     |   8 +-
 .../how_to/work_with_schedules/intrin_math.rst.txt |   2 +-
 .../work_with_schedules/sg_execution_times.rst.txt |  14 +-
 .../how_to/work_with_schedules/tensorize.rst.txt   |   2 +-
 docs/_sources/index.rst.txt                        |   2 +-
 .../reference/langref/hybrid_script.rst.txt        |   2 +-
 .../tutorials/autotvm/sg_execution_times.rst.txt   |   4 +-
 .../frontend/deploy_classification.rst.txt         |   2 +-
 .../tutorials/frontend/deploy_detection.rst.txt    |   4 +-
 .../tutorials/frontend/sg_execution_times.rst.txt  |   6 +-
 .../tutorials/optimize/sg_execution_times.rst.txt  |   6 +-
 .../topic/vta/tutorials/sg_execution_times.rst.txt |   6 +-
 .../tutorial/auto_scheduler_matmul_x86.rst.txt     |   2 +-
 docs/_sources/tutorial/autotvm_matmul_x86.rst.txt  |  20 +-
 docs/_sources/tutorial/autotvm_relay_x86.rst.txt   |  54 ++---
 .../tutorial/cross_compilation_and_rpc.rst.txt     |   2 +-
 docs/_sources/tutorial/intro_topi.rst.txt          |   2 +-
 docs/_sources/tutorial/sg_execution_times.rst.txt  |  22 +-
 .../tutorial/tensor_expr_get_started.rst.txt       |  49 ++--
 docs/arch/pass_infra.html                          |   2 +-
 docs/arch/security.html                            |   2 +-
 docs/commit_hash                                   |   2 +-
 docs/how_to/compile_models/from_darknet.html       |   2 +-
 docs/how_to/compile_models/from_mxnet.html         |   2 +-
 docs/how_to/compile_models/from_oneflow.html       |  14 +-
 docs/how_to/compile_models/from_pytorch.html       |   4 +-
 docs/how_to/compile_models/from_tensorflow.html    |   2 +-
 docs/how_to/compile_models/sg_execution_times.html |  26 +--
 docs/how_to/deploy/index.html                      |   2 +-
 .../deploy_models/deploy_model_on_android.html     |   2 +-
 .../deploy_object_detection_pytorch.html           |  18 +-
 docs/how_to/deploy_models/deploy_prequantized.html |   6 +-
 .../deploy_models/deploy_prequantized_tflite.html  |   4 +-
 docs/how_to/deploy_models/deploy_quantized.html    |   2 +-
 docs/how_to/deploy_models/deploy_ssd_gluoncv.html  |  38 +--
 docs/how_to/deploy_models/sg_execution_times.html  |  18 +-
 .../extend_tvm/bring_your_own_datatypes.html       |   4 +-
 docs/how_to/extend_tvm/sg_execution_times.html     |  10 +-
 docs/how_to/extend_tvm/use_pass_instrument.html    |  16 +-
 docs/how_to/optimize_operators/opt_conv_cuda.html  |   2 +-
 .../optimize_operators/opt_conv_tensorcore.html    |   2 +-
 docs/how_to/optimize_operators/opt_gemm.html       |  18 +-
 .../optimize_operators/sg_execution_times.html     |   8 +-
 .../sg_execution_times.html                        |  14 +-
 .../tune_conv2d_layer_cuda.html                    |   7 +-
 .../tune_with_autoscheduler/tune_network_cuda.html |   2 +-
 .../tune_with_autoscheduler/tune_network_x86.html  |   4 +-
 .../tune_with_autoscheduler/tune_sparse_x86.html   |  86 +++++--
 .../tune_with_autotvm/sg_execution_times.html      |   8 +-
 .../how_to/tune_with_autotvm/tune_conv2d_cuda.html |  26 +--
 docs/how_to/work_with_microtvm/micro_autotune.html |  16 +-
 docs/how_to/work_with_microtvm/micro_train.html    |  16 +-
 .../work_with_microtvm/sg_execution_times.html     |  10 +-
 .../how_to/work_with_relay/sg_execution_times.html |   8 +-
 docs/how_to/work_with_schedules/intrin_math.html   |   2 +-
 .../work_with_schedules/sg_execution_times.html    |  14 +-
 docs/how_to/work_with_schedules/tensorize.html     |   2 +-
 docs/index.html                                    |   2 +-
 docs/reference/api/doxygen/annotated.html          |   2 +-
 docs/reference/api/doxygen/c__backend__api_8h.html |   2 +-
 .../api/doxygen/classtvm_1_1FuncTypeNode.html      |   2 +-
 .../api/doxygen/classtvm_1_1TargetNode.html        |   4 +-
 ...m_1_1auto__scheduler_1_1CacheWriteStepNode.html |   4 +-
 .../classtvm_1_1auto__scheduler_1_1State.html      |   4 +-
 ...sstvm_1_1meta__schedule_1_1PyCostModelNode.html |   2 +-
 .../api/doxygen/classtvm_1_1te_1_1Schedule.html    |   8 +-
 docs/reference/api/doxygen/hierarchy.html          |   2 +-
 .../doxygen/local__response__norm_8h_source.html   |   2 +-
 docs/reference/api/doxygen/namespacetvm.html       |  24 +-
 .../doxygen/namespacetvm_1_1auto__scheduler.html   |   2 +-
 .../namespacetvm_1_1relay_1_1transform.html        |   4 +-
 docs/reference/api/doxygen/namespacetvm_1_1te.html |   2 +-
 .../api/doxygen/namespacetvm_1_1topi.html          |  18 +-
 docs/reference/api/doxygen/nn_2bnn_8h_source.html  |   2 +-
 .../reference/api/doxygen/nn_2dense_8h_source.html |   2 +-
 .../api/doxygen/nn_2pooling_8h_source.html         |   2 +-
 .../api/doxygen/nn_2softmax_8h_source.html         |   2 +-
 docs/reference/api/doxygen/reduction_8h.html       |   2 +-
 .../reference/api/doxygen/reduction_8h_source.html |   8 +-
 .../reference/api/doxygen/relay_2transform_8h.html |   2 +-
 .../api/doxygen/relay_2transform_8h_source.html    |   2 +-
 .../api/doxygen/runtime_2crt_2module_8h.html       |   2 +-
 docs/reference/api/doxygen/target_8h_source.html   |   2 +-
 docs/reference/api/doxygen/tir_2op_8h.html         |  12 +-
 docs/reference/api/doxygen/tir_2op_8h_source.html  |   8 +-
 docs/reference/api/doxygen/topi_2nn_8h_source.html |   2 +-
 .../api/doxygen/topi_2transform_8h_source.html     |   2 +-
 docs/reference/api/doxygen/transform__step_8h.html |   2 +-
 docs/reference/api/python/auto_scheduler.html      |   4 +-
 docs/reference/api/python/error.html               |   2 +-
 docs/reference/api/python/ir.html                  |   2 +-
 docs/reference/api/python/relay/index.html         |   4 +-
 docs/reference/api/python/runtime.html             |   2 +-
 docs/reference/api/python/target.html              |   2 +-
 docs/reference/api/python/te.html                  |   4 +-
 docs/reference/api/python/tir.html                 |   2 +-
 docs/reference/api/python/topi.html                |   8 +-
 .../api/typedoc/classes/bytestreamreader.html      |  12 +-
 .../api/typedoc/classes/cachedcallstack.html       |  34 +--
 docs/reference/api/typedoc/classes/dldatatype.html |  12 +-
 docs/reference/api/typedoc/classes/dldevice.html   |  10 +-
 .../reference/api/typedoc/classes/environment.html |  12 +-
 docs/reference/api/typedoc/classes/ffilibrary.html |  20 +-
 .../api/typedoc/classes/graphexecutor.html         |  16 +-
 docs/reference/api/typedoc/classes/instance.html   |  40 ++--
 docs/reference/api/typedoc/classes/memory.html     |  34 +--
 docs/reference/api/typedoc/classes/module.html     |  10 +-
 docs/reference/api/typedoc/classes/ndarray.html    |  22 +-
 .../api/typedoc/classes/packedfunccell.html        |   6 +-
 docs/reference/api/typedoc/classes/rpcserver.html  |  14 +-
 docs/reference/api/typedoc/classes/scalar.html     |   6 +-
 .../api/typedoc/classes/webgpucontext.html         |  12 +-
 docs/reference/api/typedoc/enums/argtypecode.html  |  30 +--
 .../api/typedoc/enums/aynccallbackcode.html        |   4 +-
 .../api/typedoc/enums/dldatatypecode.html          |   8 +-
 .../api/typedoc/enums/rpcserverstate.html          |  12 +-
 docs/reference/api/typedoc/enums/sizeof.html       |  18 +-
 docs/reference/api/typedoc/index.html              | 112 ++++-----
 .../api/typedoc/interfaces/disposable.html         |   2 +-
 .../api/typedoc/interfaces/functioninfo.html       |   6 +-
 .../api/typedoc/interfaces/libraryprovider.html    |   4 +-
 docs/reference/langref/hybrid_script.html          |   2 +-
 docs/reference/langref/relay_op.html               |   2 +-
 docs/searchindex.js                                |   2 +-
 .../vta/tutorials/autotvm/sg_execution_times.html  |   4 +-
 .../tutorials/frontend/deploy_classification.html  |   2 +-
 .../vta/tutorials/frontend/deploy_detection.html   |   4 +-
 .../vta/tutorials/frontend/sg_execution_times.html |   6 +-
 .../vta/tutorials/optimize/sg_execution_times.html |   6 +-
 docs/topic/vta/tutorials/sg_execution_times.html   |   6 +-
 docs/tutorial/auto_scheduler_matmul_x86.html       |   2 +-
 docs/tutorial/autotvm_matmul_x86.html              |  20 +-
 docs/tutorial/autotvm_relay_x86.html               | 258 ++++++++++-----------
 docs/tutorial/cross_compilation_and_rpc.html       |   2 +-
 docs/tutorial/intro_topi.html                      |   2 +-
 docs/tutorial/sg_execution_times.html              |  28 +--
 docs/tutorial/tensor_expr_get_started.html         |  45 ++--
 174 files changed, 1044 insertions(+), 938 deletions(-)

diff --git a/docs/_downloads/0f8d36b3ffd04a5a08089dc671eb788e/opt_gemm.ipynb b/docs/_downloads/0f8d36b3ffd04a5a08089dc671eb788e/opt_gemm.ipynb
index 19c4dc5b8..30061793b 100644
--- a/docs/_downloads/0f8d36b3ffd04a5a08089dc671eb788e/opt_gemm.ipynb
+++ b/docs/_downloads/0f8d36b3ffd04a5a08089dc671eb788e/opt_gemm.ipynb
@@ -256,7 +256,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Parallel\nFuthermore, we can also utilize multi-core processors to do the thread-level parallelization.\n\n"
+        "## Parallel\nFurthermore, we can also utilize multi-core processors to do the thread-level parallelization.\n\n"
       ]
     },
     {
diff --git a/docs/_downloads/65b9451c8de050d7cd9da2fe5a49acc6/deploy_detection.py b/docs/_downloads/65b9451c8de050d7cd9da2fe5a49acc6/deploy_detection.py
index 2d9ddb416..0c430e8f9 100644
--- a/docs/_downloads/65b9451c8de050d7cd9da2fe5a49acc6/deploy_detection.py
+++ b/docs/_downloads/65b9451c8de050d7cd9da2fe5a49acc6/deploy_detection.py
@@ -130,7 +130,7 @@ pack_dict = {
 # The ``start_pack`` and ``stop_pack`` labels indicate where
 # to start and end the graph packing relay pass: in other words
 # where to start and finish offloading to VTA.
-# the number 4 indicate the the ``start_pack`` index is 4, the
+# the number 4 indicate the ``start_pack`` index is 4, the
 # number 186 indicate the ``stop_pack index`` is 186, by using
 # name and index number, here we can located to correct place
 # where to start/end when there are multiple ``nn.max_pool2d``
diff --git a/docs/_downloads/66e1a42229aae7ed49ac268f520e6727/deploy_detection.ipynb b/docs/_downloads/66e1a42229aae7ed49ac268f520e6727/deploy_detection.ipynb
index eafc3eb54..0aaa23f5a 100644
--- a/docs/_downloads/66e1a42229aae7ed49ac268f520e6727/deploy_detection.ipynb
+++ b/docs/_downloads/66e1a42229aae7ed49ac268f520e6727/deploy_detection.ipynb
@@ -87,7 +87,7 @@
       },
       "outputs": [],
       "source": [
-        "# Load VTA parameters from the 3rdparty/vta-hw/config/vta_config.json file\nenv = vta.get_env()\n# Set ``device=arm_cpu`` to run inference on the CPU\n# or ``device=vta`` to run inference on the FPGA.\ndevice = \"vta\"\ntarget = env.target if device == \"vta\" else env.target_vta_cpu\n\npack_dict = {\n    \"yolov3-tiny\": [\"nn.max_pool2d\", \"cast\", 4, 186],\n}\n\n# Name of Darknet model to compile\n# The ``start_pack`` and ``stop_pack`` labels indicate where\n# to start and e [...]
+        "# Load VTA parameters from the 3rdparty/vta-hw/config/vta_config.json file\nenv = vta.get_env()\n# Set ``device=arm_cpu`` to run inference on the CPU\n# or ``device=vta`` to run inference on the FPGA.\ndevice = \"vta\"\ntarget = env.target if device == \"vta\" else env.target_vta_cpu\n\npack_dict = {\n    \"yolov3-tiny\": [\"nn.max_pool2d\", \"cast\", 4, 186],\n}\n\n# Name of Darknet model to compile\n# The ``start_pack`` and ``stop_pack`` labels indicate where\n# to start and e [...]
       ]
     },
     {
diff --git a/docs/_downloads/96137df89d8034b548f407123ec50ce9/opt_gemm.py b/docs/_downloads/96137df89d8034b548f407123ec50ce9/opt_gemm.py
index d2ec711c2..249a4e26e 100644
--- a/docs/_downloads/96137df89d8034b548f407123ec50ce9/opt_gemm.py
+++ b/docs/_downloads/96137df89d8034b548f407123ec50ce9/opt_gemm.py
@@ -346,7 +346,7 @@ print(tvm.lower(s, [A, B, C], simple_mode=True))
 ###################################################################################################
 # Parallel
 # --------
-# Futhermore, we can also utilize multi-core processors to do the thread-level parallelization.
+# Furthermore, we can also utilize multi-core processors to do the thread-level parallelization.
 
 s = te.create_schedule(C.op)
 
diff --git a/docs/_sources/arch/pass_infra.rst.txt b/docs/_sources/arch/pass_infra.rst.txt
index 9e76251cc..1e320dceb 100644
--- a/docs/_sources/arch/pass_infra.rst.txt
+++ b/docs/_sources/arch/pass_infra.rst.txt
@@ -51,7 +51,7 @@ scheme through `Sequential`_ and `Block`_, respectively. With such constructs,
 these modern frameworks are able to conveniently add modules/layers to their
 containers and build up neural networks easily.
 
-The design of the Relay pass infra is largely inspired by the the hierarchical
+The design of the Relay pass infra is largely inspired by the hierarchical
 pass manager used in LLVM and the block-style containers used in the popular
 deep learning frameworks. The major goals of the pass infra include:
 
diff --git a/docs/_sources/arch/security.rst.txt b/docs/_sources/arch/security.rst.txt
index 22dbfc3cd..c2603dd33 100644
--- a/docs/_sources/arch/security.rst.txt
+++ b/docs/_sources/arch/security.rst.txt
@@ -28,7 +28,7 @@ We strongly encourage folks to report such problems to our private security mail
 
 Please note that the security mailing list should only be used for reporting undisclosed security vulnerabilities and managing the process of fixing such vulnerabilities. We cannot accept regular bug reports or other queries at this address. All mail sent to this address that does not relate to an undisclosed security problem in our source code will be ignored.
 Questions about: if a vulnerability applies to your particular application obtaining further information on a published vulnerability availability of patches
-and/or new releases should be addressed to to the user discuss forum.
+and/or new releases should be addressed to the user Discuss forum.
 
 The private security mailing address is: `security@apache.org <se...@apache.org>`_.
 Feel free to consult the `Apache Security guide <https://www.apache.org/security/>`_.
diff --git a/docs/_sources/how_to/compile_models/from_darknet.rst.txt b/docs/_sources/how_to/compile_models/from_darknet.rst.txt
index 4bc3a3683..82ceed1e6 100644
--- a/docs/_sources/how_to/compile_models/from_darknet.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_darknet.rst.txt
@@ -317,7 +317,7 @@ The process is no different from other examples.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  1.139 seconds)
+   **Total running time of the script:** ( 1 minutes  5.116 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_darknet.py:
diff --git a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
index aa1f06521..f5b0bc5ef 100644
--- a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
@@ -115,7 +115,7 @@ In this section, we download a pretrained imagenet model and classify an image.
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zipfd30f395-f5ff-4b3c-9faa-ce429c571eec from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip8b84f8d6-39a9-41c1-b841-c30cb3b6f630 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
     x (1, 3, 224, 224)
 
 
diff --git a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
index 58702d5af..7b32da558 100644
--- a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
@@ -113,7 +113,7 @@ Load a pretrained OneFlow model and save model
  .. code-block:: none
 
     Downloading: "https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip" to /workspace/.oneflow/flowvision_cache/resnet18.zip
-
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
     21%|##1       | 8.81M/41.5M [00:00<00:00, 92.4MB/s]
     42%|####2     | 17.6M/41.5M [00:00<00:00, 68.6MB/s]
     59%|#####9    | 24.5M/41.5M [00:00<00:00, 40.0MB/s]
     82%|########2 | 34.1M/41.5M [00:00<00:00, 51.0MB/s]
    100%|##########| 41.5M/41.5M [00:00<00:00, 59.4MB/s]
+
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
     15%|#5        | 6.33M/41.5M [00:00<00:01, 34.8MB/s]
     23%|##3       | 9.65M/41.5M [00:00<00:01, 27.3MB/s]
     37%|###6      | 15.2M/41.5M [00:00<00:00, 37.2MB/s]
     46%|####6     | 19.1M/41.5M [00:00<00:00, 34.4MB/s]
     58%|#####7    | 24.0M/41.5M [00:00<00:00, 30.1MB/s]
     77%|#######7  | 32.0M/41.5M [00:00<00:00, 39.9MB/s]
     87%|########6 | 36.1M/41.5M [00:01<00:00, 39.9MB/s]
     97%|#########6| 40.1M/41.5M [00:01<00:00, 36.0MB/s]
    100%|##########| 41.5M/41.5M [00:01<00:00, 36.3MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
index 67170eef8..074f1b93d 100644
--- a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
@@ -94,7 +94,7 @@ Load a pretrained PyTorch model
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
-
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
     48%|####8     | 21.6M/44.7M [00:00<00:00, 226MB/s]
    100%|##########| 44.7M/44.7M [00:00<00:00, 255MB/s]
+
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
     44%|####4     | 19.8M/44.7M [00:00<00:00, 207MB/s]
    100%|##########| 44.7M/44.7M [00:00<00:00, 238MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
index e011b73b6..99e81d136 100644
--- a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
@@ -423,7 +423,7 @@ Run the corresponding model on tensorflow
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  2.342 seconds)
+   **Total running time of the script:** ( 1 minutes  0.864 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_tensorflow.py:
diff --git a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
index e278b95d9..e0f78ebec 100644
--- a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
@@ -5,26 +5,26 @@
 
 Computation times
 =================
-**05:01.673** total execution time for **how_to_compile_models** files:
+**05:03.223** total execution time for **how_to_compile_models** files:
 
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``) | 01:02.342 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)       | 01:05.116 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)       | 01:01.139 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``) | 01:00.864 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)         | 00:39.461 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)         | 00:39.238 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)       | 00:28.160 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)       | 00:29.048 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)           | 00:25.808 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)           | 00:25.201 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)         | 00:24.177 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)         | 00:24.474 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)         | 00:23.146 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)         | 00:22.189 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)       | 00:19.420 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)       | 00:19.194 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)           | 00:15.703 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)           | 00:15.453 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)             | 00:02.316 | 0.0 MB |
+| :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)             | 00:02.446 | 0.0 MB |
 +-----------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/deploy/index.rst.txt b/docs/_sources/how_to/deploy/index.rst.txt
index 73269e85d..f28883446 100644
--- a/docs/_sources/how_to/deploy/index.rst.txt
+++ b/docs/_sources/how_to/deploy/index.rst.txt
@@ -70,7 +70,7 @@ After you get the TVM runtime library, you can link the compiled library
 
 A model (optimized or not by TVM) can be cross compiled by TVM for
 different architectures such as ``aarch64`` on a ``x64_64`` host. Once the model
-is cross compiled it is neccessary to have a runtime compatible with the target
+is cross compiled it is necessary to have a runtime compatible with the target
 architecture to be able to run the cross compiled model.
 
 
diff --git a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
index 6883a8617..aae5b0bf6 100644
--- a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
@@ -441,7 +441,7 @@ Execute on TVM
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      16.0282      15.9883      16.2973      15.8762       0.1440   
+      15.8096      15.7889      15.9555      15.6716       0.0876   
                
 
 
diff --git a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
index 489b045bc..51e3759e5 100644
--- a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
@@ -123,7 +123,7 @@ Load pre-trained maskrcnn from torchvision and do tracing
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth" to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
-
      0%|          | 0.00/170M [00:00<?, ?B/s]
      2%|2         | 3.87M/170M [00:00<00:04, 40.4MB/s]
      5%|4         | 8.02M/170M [00:00<00:04, 42.3MB/s]
     19%|#9        | 32.7M/170M [00:00<00:01, 141MB/s] 
     34%|###3      | 56.9M/170M [00:00<00:00, 185MB/s]
     47%|####6     | 79.5M/170M [00:00<00:00, 204MB/s]
     60%|######    | 102M/170M [00:00<00:00, 216MB/s] 
     73%|#######3  | 125M/170M [00:00<00:00, 222MB/s]
     87%|########7 | 148M/170M [00:00<00:00, 228MB/s]
    100%|##########| 170M/170M [00:00<00:00, 199MB/s]
+
      0%|          | 0.00/170M [00:00<?, ?B/s]
     12%|#1        | 20.1M/170M [00:00<00:00, 210MB/s]
     26%|##6       | 44.4M/170M [00:00<00:00, 237MB/s]
     41%|####      | 69.1M/170M [00:00<00:00, 247MB/s]
     56%|#####5    | 95.0M/170M [00:00<00:00, 257MB/s]
     72%|#######1  | 122M/170M [00:00<00:00, 265MB/s] 
     87%|########7 | 148M/170M [00:00<00:00, 268MB/s]
    100%|##########| 170M/170M [00:00<00:00, 261MB/s]
     /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
       for i in range(dim)
     /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
@@ -292,7 +292,7 @@ Get boxes with score larger than 0.9
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 3 minutes  5.158 seconds)
+   **Total running time of the script:** ( 2 minutes  57.719 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_object_detection_pytorch.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
index 09e454020..494b2430f 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
@@ -232,7 +232,7 @@ training. Other models require a full post training calibration.
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/mobilenet_v2-b0353104.pth" to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
-
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 176MB/s]
+
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 168MB/s]
 
 
 
@@ -412,7 +412,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      90.3587      90.3543      91.8387      90.0842       0.1957   
+      90.1677      90.0595      96.0860      89.9498       0.6435   
                
 
 
@@ -461,7 +461,7 @@ TODO
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  10.869 seconds)
+   **Total running time of the script:** ( 1 minutes  9.702 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
index 3de91c68f..0579f4109 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
@@ -439,7 +439,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      121.0594     120.9719     124.8811     120.1730      0.5547   
+      118.7824     118.7393     121.0677     118.0321      0.3706   
                
 
 
@@ -476,7 +476,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  52.696 seconds)
+   **Total running time of the script:** ( 1 minutes  53.612 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized_tflite.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
index 9c2f6a234..ae7a18aee 100644
--- a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
@@ -255,7 +255,7 @@ We create a Relay VM to build and execute the model.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  20.780 seconds)
+   **Total running time of the script:** ( 1 minutes  51.204 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_quantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
index 3533ff027..1bce208ff 100644
--- a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
@@ -158,7 +158,7 @@ Convert and compile model for CPU.
             data: None
       input_sym_arg_type = in_param.infer_type()[0]
     Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
-
      0%|          | 0/132723 [00:00<?, ?KB/s]
      4%|3         | 5018/132723 [00:00<00:02, 50173.35KB/s]
      9%|9         | 12125/132723 [00:00<00:01, 62461.36KB/s]
     15%|#5        | 20091/132723 [00:00<00:01, 70309.28KB/s]
     21%|##1       | 28115/132723 [00:00<00:01, 74223.67KB/s]
     27%|##7       | 36142/132723 [00:00<00:01, 76401.99KB/s]
     33%|###3      | 44155/132723 [00:00<00:01, 77665.09KB/s]
     39%|###9      | 52160/132723 [00:00<00:01, 78434.15KB/s]
     45%|####5     | 60211/132723 [00:00<00:00, 79092.22KB/s]
     51%|#####1    | 68300/132723 [00:00<00:00, 79650.92KB/s]
     58%|#####7    | 76331/132723 [00:01<00:00, 79852.16KB/s]
     64%|######3   | 84387/132723 [00:01<00:00, 80066.11KB/s]
     70%|######9   | 92394/132723 [00:01<00:00, 79949.90KB/s]
     76%|#######5  | 100427/132723 [00:01<00:00, 80063.39KB/s]
     82%|########1 | 108459/132723 [00:01<00:00, 80131.23KB/s]
     88%|########7 | 116560/132723 [00:01<00:00, 80393.72KB/s]
     94%|#########3| 124600/132723 [00:01<00:00, 80183.86KB/s]
    100%|#########9| 132620/132723 [00:01<00:00, 80185.97KB/s]
    100%|##########| 132723/132723 [00:01<00:00, 77828.16KB/s]
+
      0%|          | 0/132723 [00:00<?, ?KB/s]
      4%|3         | 4967/132723 [00:00<00:02, 49666.61KB/s]
     10%|9         | 13035/132723 [00:00<00:01, 67905.21KB/s]
     15%|#4        | 19826/132723 [00:00<00:02, 46076.00KB/s]
     21%|##1       | 28056/132723 [00:00<00:01, 57188.46KB/s]
     26%|##5       | 34460/132723 [00:00<00:01, 50663.10KB/s]
     32%|###2      | 42576/132723 [00:00<00:01, 58931.78KB/s]
     38%|###8      | 50764/132723 [00:00<00:01, 65338.42KB/s]
     44%|####4     | 58975/132723 [00:00<00:01, 70119.11KB/s]
     51%|#####     | 67141/132723 [00:01<00:00, 73457.31KB/s]
     57%|#####6    | 75377/132723 [00:01<00:00, 76058.53KB/s]
     63%|######2   | 83197/132723 [00:01<00:00, 55172.82KB/s]
     69%|######8   | 91333/132723 [00:01<00:00, 61258.99KB/s]
     74%|#######4  | 98301/132723 [00:01<00:00, 45110.55KB/s]
     78%|#######8  | 104088/132723 [00:01<00:00, 47690.73KB/s]
     85%|########4 | 112283/132723 [00:01<00:00, 55370.12KB/s]
     91%|######### | 120519/132723 [00:02<00:00, 61916.00KB/s]
     97%|#########7| 128752/132723 [00:02<00:00, 67168.56KB/s]
    100%|##########| 132723/132723 [00:02<00:00, 57332.06KB/s]
 
 
 
@@ -241,7 +241,7 @@ Display result
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  40.544 seconds)
+   **Total running time of the script:** ( 2 minutes  34.754 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_ssd_gluoncv.py:
diff --git a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
index d0722daa1..d6174fdb6 100644
--- a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
@@ -5,24 +5,24 @@
 
 Computation times
 =================
-**11:25.711** total execution time for **how_to_deploy_models** files:
+**11:40.675** total execution time for **how_to_deploy_models** files:
 
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``) | 03:05.158 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``) | 02:57.719 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)                           | 02:40.544 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)                           | 02:34.754 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)           | 01:52.696 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)           | 01:53.612 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)                               | 01:20.780 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)                               | 01:51.204 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)                         | 01:10.869 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)                         | 01:09.702 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)                 | 00:30.326 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)                 | 00:29.531 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_nano.py` (``deploy_model_on_nano.py``)                       | 00:22.963 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_nano.py` (``deploy_model_on_nano.py``)                       | 00:22.311 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)                       | 00:22.369 | 0.0 MB |
+| :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)                       | 00:21.836 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_deploy_models_deploy_sparse.py` (``deploy_sparse.py``)                                     | 00:00.006 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
index 293411787..90c0029d4 100644
--- a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
@@ -476,7 +476,7 @@ First let us define two helper functions to get the mobilenet model and a cat im
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip2870b69e-a103-4d4c-95ee-3a9991999beb from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip6222c333-f560-463f-93e3-c67ac93647d9 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 
 
 
@@ -590,7 +590,7 @@ Now, to actually convert the entire network, we have written `a pass in Relay <h
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-      Check failed: (lower) is false: Intrinsic lowering function for target llvm, intrinsic name tir.sqrt, type 150 not found
+      Check failed: (lower) is false: FloatImm lowering function for target llvm type 150 not found
 
 
 
diff --git a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
index 9291fe425..fe1f9dba4 100644
--- a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
@@ -5,14 +5,14 @@
 
 Computation times
 =================
-**00:41.678** total execution time for **how_to_extend_tvm** files:
+**00:41.101** total execution time for **how_to_extend_tvm** files:
 
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``) | 00:38.397 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``) | 00:37.861 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)           | 00:02.305 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)           | 00:02.287 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)                     | 00:00.967 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)                     | 00:00.944 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_extend_tvm_low_level_custom_pass.py` (``low_level_custom_pass.py``)       | 00:00.009 | 0.0 MB |
+| :ref:`sphx_glr_how_to_extend_tvm_low_level_custom_pass.py` (``low_level_custom_pass.py``)       | 00:00.008 | 0.0 MB |
 +-------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
index fd42ba1ab..ea255a313 100644
--- a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
@@ -216,10 +216,10 @@ profile the execution time of each passes.
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 6885us [6885us] (46.33%; 46.33%)
-    FoldScaleAxis: 7975us [6us] (53.67%; 53.67%)
-            FoldConstant: 7968us [1598us] (53.62%; 99.92%)
-                    InferType: 6370us [6370us] (42.87%; 79.94%)
+    InferType: 6965us [6965us] (46.40%; 46.40%)
+    FoldScaleAxis: 8047us [6us] (53.60%; 53.60%)
+            FoldConstant: 8041us [1642us] (53.56%; 99.93%)
+                    InferType: 6399us [6399us] (42.63%; 79.58%)
 
 
 
@@ -258,10 +258,10 @@ Refer to following sections and :py:func:`tvm.instrument.pass_instrument` for th
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 6441us [6441us] (45.00%; 45.00%)
-    FoldScaleAxis: 7872us [5us] (55.00%; 55.00%)
-            FoldConstant: 7866us [1620us] (54.96%; 99.93%)
-                    InferType: 6246us [6246us] (43.64%; 79.40%)
+    InferType: 6454us [6454us] (44.60%; 44.60%)
+    FoldScaleAxis: 8016us [6us] (55.40%; 55.40%)
+            FoldConstant: 8010us [1656us] (55.36%; 99.93%)
+                    InferType: 6354us [6354us] (43.91%; 79.33%)
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
index 329422133..264993683 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
@@ -340,7 +340,7 @@ latency of convolution.
 
  .. code-block:: none
 
-    Convolution: 54.245192 ms
+    Convolution: 54.269767 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
index b3e80856d..d227ece41 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
@@ -671,7 +671,7 @@ be able to run on our build server
 
  .. code-block:: none
 
-    conv2d with tensor core: 9.119490 ms
+    conv2d with tensor core: 7.168272 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
index d92577fe0..c8826a4ec 100644
--- a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
@@ -143,8 +143,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 
  .. code-block:: none
 
-    Numpy running time: 0.018935
-    Baseline: 3.409753
+    Numpy running time: 0.018922
+    Baseline: 3.316339
 
 
 
@@ -239,7 +239,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 
  .. code-block:: none
 
-    Opt1: 0.316767
+    Opt1: 0.305577
 
 
 
@@ -342,7 +342,7 @@ In this tutorial, we chose to vectorize the inner loop row data since it is cach
 
  .. code-block:: none
 
-    Opt2: 0.346355
+    Opt2: 0.336554
 
 
 
@@ -438,7 +438,7 @@ the access pattern for A matrix is more cache friendly.
 
  .. code-block:: none
 
-    Opt3: 0.119693
+    Opt3: 0.115825
 
 
 
@@ -563,7 +563,7 @@ flattening.
 
  .. code-block:: none
 
-    Opt4: 0.110624
+    Opt4: 0.108125
 
 
 
@@ -685,7 +685,7 @@ write to C when all the block results are ready.
 
  .. code-block:: none
 
-    Opt5: 0.111787
+    Opt5: 0.111973
 
 
 
@@ -761,7 +761,7 @@ Here is the generated IR after blocking.
 
 Parallel
 --------
-Futhermore, we can also utilize multi-core processors to do the thread-level parallelization.
+Furthermore, we can also utilize multi-core processors to do the thread-level parallelization.
 
 .. GENERATED FROM PYTHON SOURCE LINES 350-385
 
@@ -810,7 +810,7 @@ Futhermore, we can also utilize multi-core processors to do the thread-level par
 
  .. code-block:: none
 
-    Opt6: 0.147276
+    Opt6: 0.146406
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
index e749a7667..82d193f20 100644
--- a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
@@ -5,12 +5,12 @@
 
 Computation times
 =================
-**00:35.008** total execution time for **how_to_optimize_operators** files:
+**00:34.475** total execution time for **how_to_optimize_operators** files:
 
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)                       | 00:32.738 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)                       | 00:32.142 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``) | 00:01.280 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``) | 00:01.279 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)             | 00:00.990 | 0.0 MB |
+| :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)             | 00:01.053 | 0.0 MB |
 +-----------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
index 865c51ebd..ba74db975 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
@@ -5,18 +5,18 @@
 
 Computation times
 =================
-**06:10.861** total execution time for **how_to_tune_with_autoscheduler** files:
+**06:12.217** total execution time for **how_to_tune_with_autoscheduler** files:
 
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``) | 03:19.348 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``) | 03:26.198 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)             | 01:23.967 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)             | 01:22.571 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)           | 00:48.069 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)           | 00:46.992 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)               | 00:21.280 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)               | 00:18.754 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)           | 00:09.184 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)           | 00:08.990 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)             | 00:09.013 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)             | 00:08.712 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
index d0f852464..6b5f8d08d 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
@@ -206,6 +206,13 @@ file and apply it.
 
 
 
+.. rst-class:: sphx-glr-script-out
+
+ .. code-block:: none
+
+    .T
+
+
 
 
 
@@ -771,7 +778,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 0.364 ms
+    Execution time of this operator: 0.366 ms
 
 
 
@@ -1378,7 +1385,7 @@ In the example below we resume the status and do more 5 trials.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 3 minutes  19.348 seconds)
+   **Total running time of the script:** ( 3 minutes  26.198 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
index 7e287f7dc..f1dd5bf21 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
@@ -647,7 +647,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       9.9529       9.9579       9.9821       9.9188       0.0261   
+      10.0015       9.9899      10.0619       9.9527       0.0453   
                
 
 
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
index d40cd8eee..086f7ce74 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
@@ -666,7 +666,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      761.4623     761.3826     762.0521     760.9522      0.4525   
+      753.4226     753.4309     753.6561     753.1808      0.1941   
                
 
 
@@ -694,7 +694,7 @@ Other Tips
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  23.967 seconds)
+   **Total running time of the script:** ( 1 minutes  22.571 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_network_x86.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
index f5176ac4c..4a4fb245f 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
@@ -397,29 +397,79 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
                  placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
                  compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
       buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-      preflattened_buffer_map = {placeholder_8: placeholder_15: Buffer(placeholder_13, int32, [33], []), placeholder_7: placeholder_16: Buffer(placeholder_12, int32, [4916], []), placeholder_9: placeholder_17: Buffer(placeholder_14, float32, [128, 512], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_6: placeholder_19: Buffer(placeholder_11, float32, [4916, 16, 1], [])} {
-      for (i0.outer.i1.outer.fused: int32, 0, 32) "parallel" {
-        allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global {
-          for (i.outer.inner: int32, 0, 32) {
-            for (i.inner.init: int32, 0, 4) {
-              for (j.init: int32, 0, 16) {
-                compute_5: Buffer(compute_4, float32, [2048], [])[(((i.outer.inner*64) + (i.inner.init*16)) + j.init)] = 0f32
+      preflattened_buffer_map = {placeholder_9: placeholder_15: Buffer(placeholder_14, float32, [128, 512], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_7: placeholder_17: Buffer(placeholder_12, int32, [4916], []), placeholder_8: placeholder_18: Buffer(placeholder_13, int32, [33], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_5: placeholder_19: Buffer(placeholder_10, float32, [128, 256], [])} {
+      for (i0.outer: int32, 0, 2) "parallel" {
+        allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global;
+        for (i1.outer: int32, 0, 16) {
+          for (i.outer.inner: int32, 0, 8) {
+            for (nb_j.inner: int32, 0, 2) {
+              for (i.inner.init: int32, 0, 8) {
+                let cse_var_1: int32 = (((i.outer.inner*256) + (i.inner.init*32)) + (nb_j.inner*16))
+                 {
+                  compute_5: Buffer(compute_4, float32, [2048], [])[cse_var_1] = 0f32
+                  compute_5[(cse_var_1 + 1)] = 0f32
+                  compute_5[(cse_var_1 + 2)] = 0f32
+                  compute_5[(cse_var_1 + 3)] = 0f32
+                  compute_5[(cse_var_1 + 4)] = 0f32
+                  compute_5[(cse_var_1 + 5)] = 0f32
+                  compute_5[(cse_var_1 + 6)] = 0f32
+                  compute_5[(cse_var_1 + 7)] = 0f32
+                  compute_5[(cse_var_1 + 8)] = 0f32
+                  compute_5[(cse_var_1 + 9)] = 0f32
+                  compute_5[(cse_var_1 + 10)] = 0f32
+                  compute_5[(cse_var_1 + 11)] = 0f32
+                  compute_5[(cse_var_1 + 12)] = 0f32
+                  compute_5[(cse_var_1 + 13)] = 0f32
+                  compute_5[(cse_var_1 + 14)] = 0f32
+                  compute_5[(cse_var_1 + 15)] = 0f32
+                }
               }
-            }
-            for (elem_idx: int32, 0, (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])) {
-              for (i.inner: int32, 0, 4) {
-                for (j: int32, 0, 16) {
-                  if @tir.likely((elem_idx < (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
-                    let cse_var_1: int32 = (((i.outer.inner*64) + (i.inner*16)) + j)
-                    compute_5[cse_var_1] = (compute_5[cse_var_1] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + j)]*max(placeholder[(((i.outer.inner*1024) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+              for (elem_idx: int32, 0, let cse_var_2: int32 = ((i1.outer*2) + nb_j.inner) in (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])) {
+                for (i.inner: int32, 0, 8) {
+                  let cse_var_21: int32 = (elem_idx*16)
+                  let cse_var_20: int32 = ((i1.outer*2) + nb_j.inner)
+                  let cse_var_19: int32 = (((i0.outer*16384) + (i.outer.inner*2048)) + (i.inner*256))
+                  let cse_var_18: int32 = (((i.outer.inner*256) + (i.inner*32)) + (nb_j.inner*16))
+                  let cse_var_17: int32 = (cse_var_18 + 9)
+                  let cse_var_16: int32 = (cse_var_18 + 8)
+                  let cse_var_15: int32 = (cse_var_18 + 7)
+                  let cse_var_14: int32 = (cse_var_18 + 6)
+                  let cse_var_13: int32 = (cse_var_18 + 5)
+                  let cse_var_12: int32 = (cse_var_18 + 4)
+                  let cse_var_11: int32 = (cse_var_18 + 3)
+                  let cse_var_10: int32 = (cse_var_18 + 2)
+                  let cse_var_9: int32 = (cse_var_18 + 15)
+                  let cse_var_8: int32 = (cse_var_18 + 14)
+                  let cse_var_7: int32 = (cse_var_18 + 13)
+                  let cse_var_6: int32 = (cse_var_18 + 12)
+                  let cse_var_5: int32 = (cse_var_18 + 11)
+                  let cse_var_4: int32 = (cse_var_18 + 10)
+                  let cse_var_3: int32 = (cse_var_18 + 1)
+                   {
+                    compute_5[cse_var_18] = (compute_5[cse_var_18] + (placeholder_1[((placeholder_3[cse_var_20]*16) + cse_var_21)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_3] = (compute_5[cse_var_3] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 1)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 2)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 3)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 4)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 5)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 6)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 7)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 8)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 9)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 10)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 11)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 12)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 13)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 14)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                    compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 15)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
                   }
                 }
               }
             }
           }
-          for (i0.inner: int32, 0, 128) {
-            let cse_var_2: int32 = ((i0.inner*512) + (i0.outer.i1.outer.fused*16))
-            compute[ramp(cse_var_2, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_2, 1, 16)]), broadcast(0f32, 16))
+          for (i0.inner: int32, 0, 64) {
+            let cse_var_22: int32 = (((i0.outer*32768) + (i0.inner*512)) + (i1.outer*32))
+            compute[ramp(cse_var_22, 1, 32)] = max((compute_5[ramp((i0.inner*32), 1, 32)] + placeholder_4[ramp(cse_var_22, 1, 32)]), broadcast(0f32, 32))
           }
         }
       }
@@ -475,7 +525,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 1.465 ms
+    Execution time of this operator: 1.808 ms
 
 
 
diff --git a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
index 038d2c6aa..c26209672 100644
--- a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
@@ -5,14 +5,14 @@
 
 Computation times
 =================
-**00:46.252** total execution time for **how_to_tune_with_autotvm** files:
+**00:45.843** total execution time for **how_to_tune_with_autotvm** files:
 
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)           | 00:46.213 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)           | 00:45.808 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_x86.py` (``tune_relay_x86.py``)               | 00:00.023 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_x86.py` (``tune_relay_x86.py``)               | 00:00.020 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_cuda.py` (``tune_relay_cuda.py``)             | 00:00.006 | 0.0 MB |
+| :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_cuda.py` (``tune_relay_cuda.py``)             | 00:00.005 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_arm.py` (``tune_relay_arm.py``)               | 00:00.005 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
index d1831b083..83c529425 100644
--- a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
@@ -1156,8 +1156,8 @@ for this template
     TimeoutError
 
             [('tile_f', [-1, 2, 1, 64]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4909501
-    No: 9   GFLOPS: 80.88/80.88     result: MeasureResult(costs=(0.002862254142857143,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6815993785858154, timestamp=1660956105.774105)        [('tile_f', [-1, 1, 4, 8]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 2, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5072689
-    No: 10  GFLOPS: 0.00/80.88      result: Traceback (most recent call last):
+    No: 9   GFLOPS: 181.84/181.84   result: MeasureResult(costs=(0.0012730925666666667,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.0782089233398438, timestamp=1660955750.7828293)      [('tile_f', [-1, 1, 4, 8]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 2, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5072689
+    No: 10  GFLOPS: 0.00/181.84     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1280,8 +1280,8 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 4, 8]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 64, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5092711
-    No: 11  GFLOPS: 260.94/260.94   result: MeasureResult(costs=(0.0008871673259668507,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7710692882537842, timestamp=1660956106.696446)       [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
-    No: 12  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+    No: 11  GFLOPS: 261.55/261.55   result: MeasureResult(costs=(0.0008851229944751381,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7694880962371826, timestamp=1660955751.698509)       [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
+    No: 12  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1404,7 +1404,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 128, 1, 2]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 256]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,183542
-    No: 13  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+    No: 13  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1527,7 +1527,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 8, 8]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 64]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2482196
-    No: 14  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+    No: 14  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1650,9 +1650,9 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 64, 1, 4]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10306226
-    No: 15  GFLOPS: 5.28/260.94     result: MeasureResult(costs=(0.043832007,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8575994968414307, timestamp=1660956111.309146) [('tile_f', [-1, 2, 2, 8]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 8]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,5330964
-    No: 16  GFLOPS: 3.35/260.94     result: MeasureResult(costs=(0.06914761450000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.611823081970215, timestamp=1660956112.54857)   [('tile_f', [-1, 8, 4, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2140058
-    No: 17  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+    No: 15  GFLOPS: 5.47/261.55     result: MeasureResult(costs=(0.04229653225,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8152391910552979, timestamp=1660955756.22979)        [('tile_f', [-1, 2, 2, 8]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 8]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,5330964
+    No: 16  GFLOPS: 3.33/261.55     result: MeasureResult(costs=(0.06949369325,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.554236173629761, timestamp=1660955757.4622016)       [('tile_f', [-1, 8, 4, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2140058
+    No: 17  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 142, in build
         res = future.result()
       File "/usr/lib/python3.7/concurrent/futures/_base.py", line 435, in result
@@ -1670,8 +1670,8 @@ for this template
     TimeoutError
 
             [('tile_f', [-1, 2, 2, 1]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 16]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10195251
-    No: 18  GFLOPS: 28.12/260.94    result: MeasureResult(costs=(0.008233555571428571,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2862968444824219, timestamp=1660956123.5834103)       [('tile_f', [-1, 4, 8, 4]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6068603
-    No: 19  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+    No: 18  GFLOPS: 26.90/261.55    result: MeasureResult(costs=(0.008606866785714285,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2779605388641357, timestamp=1660955768.4915335)       [('tile_f', [-1, 4, 8, 4]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 1, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6068603
+    No: 19  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1794,7 +1794,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 16, 4, 8]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 128]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6956993
-    No: 20  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+    No: 20  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
@@ -1973,7 +1973,7 @@ and measure running time.
     Best config:
     [('tile_f', [-1, 8, 2, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 1]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4264713
     Finish loading 20 records
-    Time cost of this operator: 0.001251
+    Time cost of this operator: 0.001285
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
index d6361c2cc..e1a623111 100644
--- a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
@@ -329,10 +329,10 @@ Timing the untuned program
     ########## Build without Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)  
     ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  321.6     98.753   (1, 2, 10, 10, 3)  2       1        [321.6]           
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.108     0.954    (1, 6, 10, 10)     1       1        [3.108]           
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.953     0.293    (1, 1, 10, 10, 3)  1       1        [0.953]           
-    Total_time                                    -                                             325.662   -        -                  -       -        -                 
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  311.4     98.717   (1, 2, 10, 10, 3)  2       1        [311.4]           
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.045     0.965    (1, 6, 10, 10)     1       1        [3.045]           
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         1.002     0.318    (1, 1, 10, 10, 3)  1       1        [1.002]           
+    Total_time                                    -                                             315.448   -        -                  -       -        -                 
 
 
 
@@ -398,10 +398,10 @@ Timing the tuned program
     ########## Build with Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)  
     ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  130.3     97.893   (1, 6, 10, 10, 1)  2       1        [130.3]           
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.804     1.356    (1, 6, 10, 10)     1       1        [1.804]           
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         1.0       0.751    (1, 1, 10, 10, 3)  1       1        [1.0]             
-    Total_time                                    -                                             133.104   -        -                  -       -        -                 
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  151.2     98.223   (1, 6, 10, 10, 1)  2       1        [151.2]           
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.772     1.151    (1, 6, 10, 10)     1       1        [1.772]           
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.963     0.626    (1, 1, 10, 10, 3)  1       1        [0.963]           
+    Total_time                                    -                                             153.935   -        -                  -       -        -                 
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt b/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
index 11ad5dd5a..85be39c68 100644
--- a/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/micro_train.rst.txt
@@ -225,7 +225,7 @@ take about **2 minutes** to download the Stanford Cars, while COCO 2017 validati
  .. code-block:: none
 
 
-    '/tmp/tmps76zrvrg/images/random'
+    '/tmp/tmpwlew76_p/images/random'
 
 
 
@@ -325,8 +325,8 @@ objects to other stuff? We can display some examples from our datasets using ``m
 
  .. code-block:: none
 
-    /tmp/tmps76zrvrg/images/target contains 8144 images
-    /tmp/tmps76zrvrg/images/random contains 5000 images
+    /tmp/tmpwlew76_p/images/target contains 8144 images
+    /tmp/tmpwlew76_p/images/random contains 5000 images
 
 
 
@@ -501,13 +501,13 @@ the time on our validation set).
  .. code-block:: none
 
     Epoch 1/3
-    328/328 - 55s - loss: 0.2238 - accuracy: 0.9232 - val_loss: 0.1414 - val_accuracy: 0.9551
+    328/328 - 55s - loss: 0.2688 - accuracy: 0.9145 - val_loss: 0.1875 - val_accuracy: 0.9456
     Epoch 2/3
-    328/328 - 53s - loss: 0.1021 - accuracy: 0.9629 - val_loss: 0.1238 - val_accuracy: 0.9596
+    328/328 - 52s - loss: 0.1037 - accuracy: 0.9618 - val_loss: 0.1280 - val_accuracy: 0.9630
     Epoch 3/3
-    328/328 - 52s - loss: 0.0651 - accuracy: 0.9757 - val_loss: 0.1455 - val_accuracy: 0.9532
+    328/328 - 52s - loss: 0.0645 - accuracy: 0.9758 - val_loss: 0.1197 - val_accuracy: 0.9660
 
-    <keras.callbacks.History object at 0x7f39001d2810>
+    <keras.callbacks.History object at 0x7f89d005d590>
 
 
 
@@ -864,7 +864,7 @@ Arduino tutorial for how to do that `on GitHub <https://github.com/guberti/tvm-a
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 4 minutes  45.650 seconds)
+   **Total running time of the script:** ( 5 minutes  33.813 seconds)
 
 
 .. _sphx_glr_download_how_to_work_with_microtvm_micro_train.py:
diff --git a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
index 5ae1d00aa..f27a172c5 100644
--- a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
@@ -5,16 +5,16 @@
 
 Computation times
 =================
-**05:40.534** total execution time for **how_to_work_with_microtvm** files:
+**06:27.051** total execution time for **how_to_work_with_microtvm** files:
 
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_train.py` (``micro_train.py``)               | 04:45.650 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_train.py` (``micro_train.py``)               | 05:33.813 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)         | 00:43.578 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)         | 00:42.328 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_aot.py` (``micro_aot.py``)                   | 00:07.836 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_aot.py` (``micro_aot.py``)                   | 00:07.591 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)             | 00:03.468 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)             | 00:03.318 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_microtvm_micro_ethosu.py` (``micro_ethosu.py``)             | 00:00.001 | 0.0 MB |
 +---------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
index 53129a5dc..c6ab80710 100644
--- a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
@@ -5,14 +5,14 @@
 
 Computation times
 =================
-**00:45.905** total execution time for **how_to_work_with_relay** files:
+**00:41.565** total execution time for **how_to_work_with_relay** files:
 
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_using_pipeline_executor.py` (``using_pipeline_executor.py``) | 00:32.611 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_using_pipeline_executor.py` (``using_pipeline_executor.py``) | 00:31.265 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)           | 00:11.160 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)           | 00:08.753 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)                             | 00:02.127 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)                             | 00:01.539 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_relay_using_relay_viz.py` (``using_relay_viz.py``)                 | 00:00.007 | 0.0 MB |
 +----------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt b/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
index f197f9de2..40f806828 100644
--- a/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/intrin_math.rst.txt
@@ -261,7 +261,7 @@ The following example customizes CUDA lowering rule for :code:`exp`.
  .. code-block:: none
 
 
-    <function my_cuda_math_rule at 0x7f39011f1710>
+    <function my_cuda_math_rule at 0x7f8a421d0dd0>
 
 
 
diff --git a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
index 01479b7ab..934baad98 100644
--- a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
@@ -5,22 +5,22 @@
 
 Computation times
 =================
-**00:04.238** total execution time for **how_to_work_with_schedules** files:
+**00:04.157** total execution time for **how_to_work_with_schedules** files:
 
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)                 | 00:01.934 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)                 | 00:01.917 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)                     | 00:01.045 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)                     | 00:00.996 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)                     | 00:00.544 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)                     | 00:00.539 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)                               | 00:00.524 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)                               | 00:00.520 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_extern_op.py` (``extern_op.py``)                     | 00:00.105 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_extern_op.py` (``extern_op.py``)                     | 00:00.101 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_schedules_schedule_primitives.py` (``schedule_primitives.py``) | 00:00.042 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)                               | 00:00.027 | 0.0 MB |
+| :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)                               | 00:00.028 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_how_to_work_with_schedules_tuple_inputs.py` (``tuple_inputs.py``)               | 00:00.015 | 0.0 MB |
 +------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
index d3542e47a..ce2ecc184 100644
--- a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
@@ -347,7 +347,7 @@ The importing needs to happen before the tensorized GEMV being executed.
                  C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
       buffer_map = {A_1: A, B_1: B, C_1: C}
       preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmp2atr81uo/input0.cc'\nsource_filename = \"/tmp/tmp2atr81uo/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
+      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmpf8ixcx8o/input0.cc'\nsource_filename = \"/tmp/tmpf8ixcx8o/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
       for (i, 0, 1024) {
         for (j.outer: int32, 0, 32) {
           @tir.call_extern("gemv_update", @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
diff --git a/docs/_sources/index.rst.txt b/docs/_sources/index.rst.txt
index a264c9beb..95b193767 100644
--- a/docs/_sources/index.rst.txt
+++ b/docs/_sources/index.rst.txt
@@ -18,7 +18,7 @@
 Apache TVM Documentation
 ========================
 
-Welcome to the the documentation for Apache TVM, a deep learning compiler that
+Welcome to the documentation for Apache TVM, a deep learning compiler that
 enables access to high-performance machine learning anywhere for everyone.
 TVM's diverse community of hardware vendors, compiler engineers and ML
 researchers work together to build a unified, programmable software stack, that
diff --git a/docs/_sources/reference/langref/hybrid_script.rst.txt b/docs/_sources/reference/langref/hybrid_script.rst.txt
index 1def162a7..eeed07a03 100644
--- a/docs/_sources/reference/langref/hybrid_script.rst.txt
+++ b/docs/_sources/reference/langref/hybrid_script.rst.txt
@@ -110,7 +110,7 @@ In HalideIR, loops have in total 4 types: ``serial``, ``unrolled``, ``parallel``
 
 Here we use ``range`` aka ``serial``, ``unroll``, ``parallel``, and ``vectorize``,
 these **4** keywords to annotate the corresponding types of for loops.
-The the usage is roughly the same as Python standard ``range``.
+The usage is roughly the same as Python standard ``range``.
 
 Besides all the loop types supported in Halide, ``const_range`` is supported for some specific conditions.
 Sometimes, ``tvm.container.Array`` is desired to pass as an argument, but in TVM-HalideIR, there is no
diff --git a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
index b83b6d002..d6c93fe88 100644
--- a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:22.073** total execution time for **topic_vta_tutorials_autotvm** files:
+**00:21.515** total execution time for **topic_vta_tutorials_autotvm** files:
 
 +---------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``) | 00:22.066 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``) | 00:21.508 | 0.0 MB |
 +---------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_alu_vta.py` (``tune_alu_vta.py``)     | 00:00.007 | 0.0 MB |
 +---------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
index b7c2b829f..08aad7001 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
@@ -291,7 +291,7 @@ The compilation steps are:
       DeprecationWarning,
     /workspace/vta/tutorials/frontend/deploy_classification.py:213: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
       relay_prog, target=tvm.target.Target(target, host=env.target_host), params=params
-    resnet18_v1 inference graph built in 24.29s!
+    resnet18_v1 inference graph built in 23.34s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
index a55610dfc..3d502d8be 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
@@ -177,7 +177,7 @@ Execute on CPU vs. VTA, and define the model.
     # The ``start_pack`` and ``stop_pack`` labels indicate where
     # to start and end the graph packing relay pass: in other words
     # where to start and finish offloading to VTA.
-    # the number 4 indicate the the ``start_pack`` index is 4, the
+    # the number 4 indicate the ``start_pack`` index is 4, the
     # number 186 indicate the ``stop_pack index`` is 186, by using
     # name and index number, here we can located to correct place
     # where to start/end when there are multiple ``nn.max_pool2d``
@@ -335,7 +335,7 @@ The compilation steps are:
       "target_host parameter is going to be deprecated. "
     /workspace/python/tvm/relay/build_module.py:411: DeprecationWarning: Please use input parameter mod (tvm.IRModule) instead of deprecated parameter mod (tvm.relay.function.Function)
       DeprecationWarning,
-    yolov3-tiny inference graph built in 16.72s!
+    yolov3-tiny inference graph built in 16.18s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
index 3d3386514..6495306e1 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**01:32.308** total execution time for **topic_vta_tutorials_frontend** files:
+**01:32.674** total execution time for **topic_vta_tutorials_frontend** files:
 
 +------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)           | 00:48.157 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)           | 00:49.203 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``) | 00:44.151 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``) | 00:43.470 | 0.0 MB |
 +------------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
index 3a28ed4d6..8c19305b9 100644
--- a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:03.346** total execution time for **topic_vta_tutorials_optimize** files:
+**00:03.307** total execution time for **topic_vta_tutorials_optimize** files:
 
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)         | 00:02.938 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)         | 00:02.900 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``) | 00:00.408 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``) | 00:00.407 | 0.0 MB |
 +--------------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
index 9323b110b..b89fa298a 100644
--- a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:00.768** total execution time for **topic_vta_tutorials** files:
+**00:00.745** total execution time for **topic_vta_tutorials** files:
 
 +---------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``) | 00:00.404 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``) | 00:00.398 | 0.0 MB |
 +---------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_topic_vta_tutorials_vta_get_started.py` (``vta_get_started.py``) | 00:00.364 | 0.0 MB |
+| :ref:`sphx_glr_topic_vta_tutorials_vta_get_started.py` (``vta_get_started.py``) | 00:00.347 | 0.0 MB |
 +---------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
index df3e9cd84..c87f6dd57 100644
--- a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
+++ b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
@@ -328,7 +328,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 93.504 ms
+    Execution time of this operator: 93.750 ms
 
 
 
diff --git a/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt b/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
index a5e3550e8..f9d2378c3 100644
--- a/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
+++ b/docs/_sources/tutorial/autotvm_matmul_x86.rst.txt
@@ -462,16 +462,16 @@ reduce variance, we take 5 measurements and average them.
     waiting for device...
     device available
     Get devices for measurement successfully!
-    No: 1   GFLOPS: 9.56/9.56       result: MeasureResult(costs=(0.028093525799999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5862655639648438, timestamp=1660954892.565883)        [('tile_y', [-1, 1]), ('tile_x', [-1, 256])],None,80
-    No: 2   GFLOPS: 2.63/9.56       result: MeasureResult(costs=(0.10218028900000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.785834550857544, timestamp=1660954894.3652294) [('tile_y', [-1, 4]), ('tile_x', [-1, 8])],None,32
-    No: 3   GFLOPS: 11.80/11.80     result: MeasureResult(costs=(0.0227458472,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5830347537994385, timestamp=1660954895.4353027)       [('tile_y', [-1, 64]), ('tile_x', [-1, 32])],None,56
-    No: 4   GFLOPS: 1.63/11.80      result: MeasureResult(costs=(0.1642617204,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.7438504695892334, timestamp=1660954898.7687416)       [('tile_y', [-1, 1]), ('tile_x', [-1, 4])],None,20
-    No: 5   GFLOPS: 3.61/11.80      result: MeasureResult(costs=(0.07439890299999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.3300185203552246, timestamp=1660954900.2274308)        [('tile_y', [-1, 256]), ('tile_x', [-1, 16])],None,48
-    No: 6   GFLOPS: 1.88/11.80      result: MeasureResult(costs=(0.1430079806,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.416994333267212, timestamp=1660954903.2184362)        [('tile_y', [-1, 512]), ('tile_x', [-1, 4])],None,29
-    No: 7   GFLOPS: 0.87/11.80      result: MeasureResult(costs=(0.30716909300000006,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.0313568115234375, timestamp=1660954908.295409) [('tile_y', [-1, 512]), ('tile_x', [-1, 2])],None,19
-    No: 8   GFLOPS: 9.91/11.80      result: MeasureResult(costs=(0.027098215199999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5780808925628662, timestamp=1660954908.892745)        [('tile_y', [-1, 4]), ('tile_x', [-1, 64])],None,62
-    No: 9   GFLOPS: 1.63/11.80      result: MeasureResult(costs=(0.16499144100000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.7602829933166504, timestamp=1660954911.7741477)        [('tile_y', [-1, 2]), ('tile_x', [-1, 2])],None,11
-    No: 10  GFLOPS: 2.46/11.80      result: MeasureResult(costs=(0.10900768639999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.874542474746704, timestamp=1660954913.6869636) [('tile_y', [-1, 4]), ('tile_x', [-1, 4])],None,22
+    No: 1   GFLOPS: 9.89/9.89       result: MeasureResult(costs=(0.027148858000000005,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5685811042785645, timestamp=1660954524.2582011)       [('tile_y', [-1, 1]), ('tile_x', [-1, 256])],None,80
+    No: 2   GFLOPS: 2.76/9.89       result: MeasureResult(costs=(0.0972901468,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7027850151062012, timestamp=1660954525.977176)        [('tile_y', [-1, 4]), ('tile_x', [-1, 8])],None,32
+    No: 3   GFLOPS: 11.85/11.85     result: MeasureResult(costs=(0.022652452400000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5598218441009521, timestamp=1660954527.0345967)       [('tile_y', [-1, 64]), ('tile_x', [-1, 32])],None,56
+    No: 4   GFLOPS: 1.86/11.85      result: MeasureResult(costs=(0.144490027,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.433370351791382, timestamp=1660954530.0399187) [('tile_y', [-1, 1]), ('tile_x', [-1, 4])],None,20
+    No: 5   GFLOPS: 3.64/11.85      result: MeasureResult(costs=(0.0736710072,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.319761037826538, timestamp=1660954531.487127) [('tile_y', [-1, 256]), ('tile_x', [-1, 16])],None,48
+    No: 6   GFLOPS: 1.75/11.85      result: MeasureResult(costs=(0.15376552059999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.5849740505218506, timestamp=1660954534.6411028)        [('tile_y', [-1, 512]), ('tile_x', [-1, 4])],None,29
+    No: 7   GFLOPS: 0.87/11.85      result: MeasureResult(costs=(0.3082451514,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.0465192794799805, timestamp=1660954539.734188)        [('tile_y', [-1, 512]), ('tile_x', [-1, 2])],None,19
+    No: 8   GFLOPS: 10.52/11.85     result: MeasureResult(costs=(0.0255276866,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5613396167755127, timestamp=1660954540.304363)        [('tile_y', [-1, 4]), ('tile_x', [-1, 64])],None,62
+    No: 9   GFLOPS: 1.66/11.85      result: MeasureResult(costs=(0.1621344482,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.6930415630340576, timestamp=1660954543.1178157)       [('tile_y', [-1, 2]), ('tile_x', [-1, 2])],None,11
+    No: 10  GFLOPS: 2.71/11.85      result: MeasureResult(costs=(0.0989214036,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6874268054962158, timestamp=1660954544.8631723)       [('tile_y', [-1, 4]), ('tile_x', [-1, 4])],None,22
 
 
 
diff --git a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
index 66a3b8f2b..23f5d1595 100644
--- a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
+++ b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
@@ -327,7 +327,7 @@ standard deviation.
 
  .. code-block:: none
 
-    {'mean': 497.01132035, 'median': 496.9583858500016, 'std': 0.719820825023517}
+    {'mean': 496.51104268999006, 'median': 495.9818490999851, 'std': 1.709787608932748}
 
 
 
@@ -563,30 +563,30 @@ the tuning data to.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  1/25]  Current/Best:   17.50/  17.50 GFLOPS | Progress: (4/20) | 6.44 s
    [Task  1/25]  Current/Best:    6.16/  17.50 GFLOPS | Progress: (8/20) | 9.48 s
    [Task  1/25]  Current/Best:   11.53/  22.75 GFLOPS | Progress: (12/20) | 11.94 s
    [Task  1/25]  Current/Best:   16.38/  22.76 GFLOPS | Progress: (16/20) | 13.64 s
    [Task  1/25]  Current/Best:   11.59/  23.80 GFLOPS | Progress: (20/20) | 15.38 s Done.
-
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  2/25]  Current/Best:   12.20/  13.28 GFLOPS | Progress: (4/20) | 3.75 s
    [Task  2/25]  Current/Best:   13.21/  18.32 GFLOPS | Progress: (8/20) | 5.05 s
    [Task  2/25]  Current/Best:   21.18/  21.18 GFLOPS | Progress: (12/20) | 6.37 s
    [Task  2/25]  Current/Best:   12.13/  21.18 GFLOPS | Progress: (16/20) | 7.65 s
    [Task  2/25]  Current/Best:   18.74/  21.18 GFLOPS | Progress: (20/20) | 9.28 s Done.
-
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  3/25]  Current/Best:    1.63/  10.81 GFLOPS | Progress: (4/20) | 5.89 s
    [Task  3/25]  Current/Best:   15.28/  16.77 GFLOPS | Progress: (8/20) | 7.84 s
    [Task  3/25]  Current/Best:   14.94/  16.77 GFLOPS | Progress: (12/20) | 9.56 s
    [Task  3/25]  Current/Best:    7.23/  23.62 GFLOPS | Progress: (16/20) | 11.49 s
    [Task  3/25]  Current/Best:   12.62/  23.62 GFLOPS | Progress: (20/20) | 16.04 s Done.
-
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  4/25]  Current/Best:    9.54/  20.42 GFLOPS | Progress: (4/20) | 2.42 s
    [Task  4/25]  Current/Best:    6.88/  20.42 GFLOPS | Progress: (8/20) | 6.77 s
    [Task  4/25]  Current/Best:   22.07/  22.07 GFLOPS | Progress: (12/20) | 11.37 s
    [Task  4/25]  Current/Best:   16.70/  22.07 GFLOPS | Progress: (16/20) | 13.63 s
    [Task  4/25]  Current/Best:   13.37/  22.07 GFLOPS | Progress: (20/20) | 15.65 s Done.
-
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  5/25]  Current/Best:    9.68/  10.13 GFLOPS | Progress: (4/20) | 2.63 s
    [Task  5/25]  Current/Best:   11.81/  12.77 GFLOPS | Progress: (8/20) | 4.69 s
    [Task  5/25]  Current/Best:    9.76/  18.03 GFLOPS | Progress: (12/20) | 7.84 s
    [Task  5/25]  Current/Best:   11.57/  19.57 GFLOPS | Progress: (16/20) | 9.29 s
    [Task  5/25]  Current/Best:   11.92/  20.95 GFLOPS | Progress: (20/20) | 11.20 s Done.
-
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  6/25]  Current/Best:   12.30/  20.58 GFLOPS | Progress: (4/20) | 4.03 s
    [Task  6/25]  Current/Best:   19.05/  20.58 GFLOPS | Progress: (8/20) | 5.81 s
    [Task  6/25]  Current/Best:   13.24/  20.58 GFLOPS | Progress: (12/20) | 7.74 s
    [Task  6/25]  Current/Best:   20.10/  20.58 GFLOPS | Progress: (16/20) | 10.02 s
    [Task  6/25]  Current/Best:    3.71/  20.58 GFLOPS | Progress: (20/20) | 12.55 s Done.
-
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  7/25]  Current/Best:   11.17/  12.72 GFLOPS | Progress: (4/20) | 3.68 s
    [Task  7/25]  Current/Best:   20.24/  21.02 GFLOPS | Progress: (8/20) | 5.20 s
    [Task  7/25]  Current/Best:   15.89/  21.02 GFLOPS | Progress: (12/20) | 7.18 s
    [Task  7/25]  Current/Best:   12.22/  21.02 GFLOPS | Progress: (16/20) | 9.22 s
    [Task  7/25]  Current/Best:    6.46/  21.68 GFLOPS | Progress: (20/20) | 11.68 s Done.
-
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  8/25]  Current/Best:   10.36/  14.62 GFLOPS | Progress: (4/20) | 2.91 s
    [Task  8/25]  Current/Best:    9.56/  14.62 GFLOPS | Progress: (8/20) | 7.73 s
    [Task  8/25]  Current/Best:   12.78/  14.62 GFLOPS | Progress: (12/20) | 13.98 s
    [Task  8/25]  Current/Best:   19.00/  19.00 GFLOPS | Progress: (16/20) | 16.07 s
    [Task  8/25]  Current/Best:   20.22/  20.22 GFLOPS | Progress: (20/20) | 22.57 s Done.
-
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  9/25]  Current/Best:   14.19/  15.63 GFLOPS | Progress: (4/20) | 12.05 s
    [Task  9/25]  Current/Best:   23.25/  23.25 GFLOPS | Progress: (8/20) | 13.89 s
    [Task  9/25]  Current/Best:    8.24/  23.25 GFLOPS | Progress: (12/20) | 16.29 s
    [Task  9/25]  Current/Best:   17.92/  23.25 GFLOPS | Progress: (16/20) | 18.96 s
    [Task  9/25]  Current/Best:    9.09/  23.25 GFLOPS | Progress: (20/20) | 26.64 s
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 10/25]  Current/Best:   18.18/  18.18 GFLOPS | Progress: (4/20) | 2.64 s
    [Task 10/25]  Current/Best:   15.54/  18.18 GFLOPS | Progress: (8/20) | 4.22 s
    [Task 10/25]  Current/Best:   12.83/  18.98 GFLOPS | Progress: (12/20) | 5.75 s
    [Task 10/25]  Current/Best:   19.09/  20.48 GFLOPS | Progress: (16/20) | 6.88 s
    [Task 10/25]  Current/Best:    8.92/  20.48 GFLOPS | Progress: (20/20) | 8.44 s Done.
-
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 11/25]  Current/Best:   12.32/  18.19 GFLOPS | Progress: (4/20) | 3.37 s
    [Task 11/25]  Current/Best:   16.86/  18.19 GFLOPS | Progress: (8/20) | 6.14 s
    [Task 11/25]  Current/Best:   18.20/  18.20 GFLOPS | Progress: (12/20) | 8.22 s
    [Task 11/25]  Current/Best:   13.49/  20.95 GFLOPS | Progress: (16/20) | 11.02 s
    [Task 11/25]  Current/Best:   19.44/  21.59 GFLOPS | Progress: (20/20) | 13.06 s Done.
-
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 12/25]  Current/Best:    7.79/  18.19 GFLOPS | Progress: (4/20) | 5.46 s
    [Task 12/25]  Current/Best:    5.21/  18.19 GFLOPS | Progress: (8/20) | 9.15 s
    [Task 12/25]  Current/Best:   19.07/  19.07 GFLOPS | Progress: (12/20) | 11.14 s
    [Task 12/25]  Current/Best:   14.50/  19.07 GFLOPS | Progress: (16/20) | 13.96 s
    [Task 12/25]  Current/Best:   15.21/  19.07 GFLOPS | Progress: (20/20) | 15.94 s Done.
-
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 13/25]  Current/Best:    8.71/  16.96 GFLOPS | Progress: (4/20) | 3.75 s
    [Task 13/25]  Current/Best:   15.99/  20.76 GFLOPS | Progress: (8/20) | 6.21 s
    [Task 13/25]  Current/Best:   19.37/  21.08 GFLOPS | Progress: (12/20) | 9.17 s
    [Task 13/25]  Current/Best:   12.22/  21.08 GFLOPS | Progress: (16/20) | 12.61 s
    [Task 13/25]  Current/Best:   18.69/  21.08 GFLOPS | Progress: (20/20) | 14.86 s Done.
-
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 14/25]  Current/Best:   13.60/  13.60 GFLOPS | Progress: (4/20) | 3.29 s
    [Task 14/25]  Current/Best:    6.10/  13.60 GFLOPS | Progress: (8/20) | 5.47 s
    [Task 14/25]  Current/Best:   20.81/  20.81 GFLOPS | Progress: (12/20) | 8.04 s
    [Task 14/25]  Current/Best:   17.10/  20.81 GFLOPS | Progress: (16/20) | 9.74 s Done.
-
    [Task 14/25]  Current/Best:   17.54/  20.81 GFLOPS | Progress: (20/20) | 11.50 s
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 15/25]  Current/Best:   16.18/  17.55 GFLOPS | Progress: (4/20) | 2.77 s
    [Task 15/25]  Current/Best:   14.36/  18.12 GFLOPS | Progress: (8/20) | 4.12 s
    [Task 15/25]  Current/Best:   10.33/  22.22 GFLOPS | Progress: (12/20) | 6.25 s
    [Task 15/25]  Current/Best:   20.25/  22.22 GFLOPS | Progress: (16/20) | 9.22 s
    [Task 15/25]  Current/Best:    9.70/  22.22 GFLOPS | Progress: (20/20) | 10.19 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 16/25]  Current/Best:   20.25/  20.25 GFLOPS | Progress: (4/20) | 3.01 s
    [Task 16/25]  Current/Best:    3.04/  20.25 GFLOPS | Progress: (8/20) | 4.64 s
    [Task 16/25]  Current/Best:   19.44/  20.25 GFLOPS | Progress: (12/20) | 5.87 s
    [Task 16/25]  Current/Best:   17.89/  20.25 GFLOPS | Progress: (16/20) | 7.23 s
    [Task 16/25]  Current/Best:   10.04/  21.94 GFLOPS | Progress: (20/20) | 9.27 s Done.
-
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 17/25]  Current/Best:   11.86/  18.10 GFLOPS | Progress: (4/20) | 4.80 s
    [Task 17/25]  Current/Best:   14.40/  22.93 GFLOPS | Progress: (8/20) | 7.67 s
    [Task 17/25]  Current/Best:   18.06/  22.93 GFLOPS | Progress: (12/20) | 9.72 s
    [Task 17/25]  Current/Best:   16.42/  22.93 GFLOPS | Progress: (16/20) | 11.86 s
    [Task 17/25]  Current/Best:   10.01/  22.93 GFLOPS | Progress: (20/20) | 14.01 s Done.
-
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 18/25]  Current/Best:   11.40/  17.84 GFLOPS | Progress: (4/20) | 3.75 s
    [Task 18/25]  Current/Best:   10.63/  19.89 GFLOPS | Progress: (8/20) | 7.22 s
    [Task 18/25]  Current/Best:   19.28/  19.89 GFLOPS | Progress: (12/20) | 9.15 s
    [Task 18/25]  Current/Best:   10.00/  19.89 GFLOPS | Progress: (16/20) | 12.73 s
    [Task 18/25]  Current/Best:   20.53/  20.53 GFLOPS | Progress: (20/20) | 14.27 s Done.
-
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 19/25]  Current/Best:    7.07/  20.06 GFLOPS | Progress: (4/20) | 6.09 s
    [Task 19/25]  Current/Best:    2.69/  20.06 GFLOPS | Progress: (8/20) | 9.32 s
    [Task 19/25]  Current/Best:   19.61/  20.82 GFLOPS | Progress: (12/20) | 12.09 s
    [Task 19/25]  Current/Best:   14.75/  21.28 GFLOPS | Progress: (16/20) | 14.91 s
    [Task 19/25]  Current/Best:    2.70/  22.62 GFLOPS | Progress: (20/20) | 17.70 s Done.
-
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 20/25]  Current/Best:    9.01/  14.91 GFLOPS | Progress: (4/20) | 3.35 s Done.
+
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  1/25]  Current/Best:   17.50/  17.50 GFLOPS | Progress: (4/20) | 6.33 s
    [Task  1/25]  Current/Best:    6.15/  17.50 GFLOPS | Progress: (8/20) | 9.37 s
    [Task  1/25]  Current/Best:   11.51/  22.82 GFLOPS | Progress: (12/20) | 11.83 s
    [Task  1/25]  Current/Best:   16.48/  22.82 GFLOPS | Progress: (16/20) | 13.52 s
    [Task  1/25]  Current/Best:   11.62/  23.82 GFLOPS | Progress: (20/20) | 15.25 s Done.
+
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  2/25]  Current/Best:   12.21/  13.18 GFLOPS | Progress: (4/20) | 3.81 s
    [Task  2/25]  Current/Best:   13.94/  18.06 GFLOPS | Progress: (8/20) | 5.12 s
    [Task  2/25]  Current/Best:   20.99/  20.99 GFLOPS | Progress: (12/20) | 6.45 s
    [Task  2/25]  Current/Best:   12.36/  20.99 GFLOPS | Progress: (16/20) | 7.72 s
    [Task  2/25]  Current/Best:   18.89/  20.99 GFLOPS | Progress: (20/20) | 9.28 s Done.
+
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  3/25]  Current/Best:    1.63/  10.79 GFLOPS | Progress: (4/20) | 5.89 s
    [Task  3/25]  Current/Best:   15.31/  16.87 GFLOPS | Progress: (8/20) | 7.83 s
    [Task  3/25]  Current/Best:   14.96/  16.87 GFLOPS | Progress: (12/20) | 9.55 s
    [Task  3/25]  Current/Best:    7.20/  23.69 GFLOPS | Progress: (16/20) | 11.47 s
    [Task  3/25]  Current/Best:   11.20/  23.69 GFLOPS | Progress: (20/20) | 16.02 s Done.
+
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  4/25]  Current/Best:    9.55/  20.39 GFLOPS | Progress: (4/20) | 2.42 s
    [Task  4/25]  Current/Best:    6.86/  20.39 GFLOPS | Progress: (8/20) | 6.78 s
    [Task  4/25]  Current/Best:   22.22/  22.22 GFLOPS | Progress: (12/20) | 11.32 s
    [Task  4/25]  Current/Best:   17.44/  22.22 GFLOPS | Progress: (16/20) | 13.53 s
    [Task  4/25]  Current/Best:   13.17/  22.22 GFLOPS | Progress: (20/20) | 15.54 s Done.
+
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  5/25]  Current/Best:    9.52/  10.25 GFLOPS | Progress: (4/20) | 2.61 s
    [Task  5/25]  Current/Best:   11.74/  12.69 GFLOPS | Progress: (8/20) | 4.68 s
    [Task  5/25]  Current/Best:   11.26/  18.10 GFLOPS | Progress: (12/20) | 7.81 s
    [Task  5/25]  Current/Best:   11.61/  22.45 GFLOPS | Progress: (16/20) | 9.29 s
    [Task  5/25]  Current/Best:   11.90/  22.45 GFLOPS | Progress: (20/20) | 11.17 s Done.
+
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  6/25]  Current/Best:   12.17/  20.71 GFLOPS | Progress: (4/20) | 3.99 s
    [Task  6/25]  Current/Best:   18.96/  20.71 GFLOPS | Progress: (8/20) | 5.77 s
    [Task  6/25]  Current/Best:   13.31/  20.71 GFLOPS | Progress: (12/20) | 7.70 s
    [Task  6/25]  Current/Best:   19.97/  20.71 GFLOPS | Progress: (16/20) | 9.94 s
    [Task  6/25]  Current/Best:    3.72/  20.71 GFLOPS | Progress: (20/20) | 12.46 s Done.
+
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  7/25]  Current/Best:   11.19/  12.94 GFLOPS | Progress: (4/20) | 3.67 s
    [Task  7/25]  Current/Best:   20.31/  21.15 GFLOPS | Progress: (8/20) | 5.20 s
    [Task  7/25]  Current/Best:   14.33/  21.15 GFLOPS | Progress: (12/20) | 7.16 s
    [Task  7/25]  Current/Best:   12.23/  21.15 GFLOPS | Progress: (16/20) | 9.19 s
    [Task  7/25]  Current/Best:    6.31/  21.63 GFLOPS | Progress: (20/20) | 11.66 s Done.
+
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  8/25]  Current/Best:   10.14/  14.09 GFLOPS | Progress: (4/20) | 2.92 s
    [Task  8/25]  Current/Best:   10.09/  14.09 GFLOPS | Progress: (8/20) | 7.64 s
    [Task  8/25]  Current/Best:   12.60/  14.09 GFLOPS | Progress: (12/20) | 13.76 s
    [Task  8/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (16/20) | 15.86 s
    [Task  8/25]  Current/Best:   20.26/  20.26 GFLOPS | Progress: (20/20) | 22.38 s Done.
+
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task  9/25]  Current/Best:   14.31/  14.31 GFLOPS | Progress: (4/20) | 12.00 s
    [Task  9/25]  Current/Best:   23.35/  23.35 GFLOPS | Progress: (8/20) | 13.82 s
    [Task  9/25]  Current/Best:    8.23/  23.35 GFLOPS | Progress: (12/20) | 16.19 s
    [Task  9/25]  Current/Best:   17.84/  23.35 GFLOPS | Progress: (16/20) | 18.88 s
    [Task  9/25]  Current/Best:    9.21/  23.35 GFLOPS | Progress: (20/20) | 26.65 s
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 10/25]  Current/Best:   18.24/  18.24 GFLOPS | Progress: (4/20) | 2.59 s
    [Task 10/25]  Current/Best:   15.42/  18.24 GFLOPS | Progress: (8/20) | 4.16 s
    [Task 10/25]  Current/Best:   12.49/  18.88 GFLOPS | Progress: (12/20) | 5.69 s
    [Task 10/25]  Current/Best:   19.14/  20.30 GFLOPS | Progress: (16/20) | 6.81 s
    [Task 10/25]  Current/Best:    8.81/  20.30 GFLOPS | Progress: (20/20) | 8.38 s Done.
+
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 11/25]  Current/Best:   11.71/  18.26 GFLOPS | Progress: (4/20) | 3.34 s
    [Task 11/25]  Current/Best:   16.82/  18.26 GFLOPS | Progress: (8/20) | 6.11 s
    [Task 11/25]  Current/Best:   16.42/  18.26 GFLOPS | Progress: (12/20) | 8.21 s
    [Task 11/25]  Current/Best:   13.49/  20.96 GFLOPS | Progress: (16/20) | 10.92 s
    [Task 11/25]  Current/Best:   19.43/  21.55 GFLOPS | Progress: (20/20) | 12.97 s Done.
+
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 12/25]  Current/Best:    7.72/  18.11 GFLOPS | Progress: (4/20) | 5.47 s
    [Task 12/25]  Current/Best:    5.20/  18.11 GFLOPS | Progress: (8/20) | 9.17 s
    [Task 12/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (12/20) | 11.16 s
    [Task 12/25]  Current/Best:   15.02/  18.80 GFLOPS | Progress: (16/20) | 13.95 s
    [Task 12/25]  Current/Best:   15.10/  18.86 GFLOPS | Progress: (20/20) | 15.87 s Done.
+
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 13/25]  Current/Best:    8.75/  17.30 GFLOPS | Progress: (4/20) | 3.70 s
    [Task 13/25]  Current/Best:   16.07/  20.87 GFLOPS | Progress: (8/20) | 6.14 s
    [Task 13/25]  Current/Best:   19.54/  21.46 GFLOPS | Progress: (12/20) | 9.01 s
    [Task 13/25]  Current/Best:   12.23/  21.46 GFLOPS | Progress: (16/20) | 12.39 s
    [Task 13/25]  Current/Best:   18.66/  21.46 GFLOPS | Progress: (20/20) | 14.70 s Done.
+
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 14/25]  Current/Best:   13.72/  13.72 GFLOPS | Progress: (4/20) | 3.27 s
    [Task 14/25]  Current/Best:    6.12/  13.72 GFLOPS | Progress: (8/20) | 5.45 s
    [Task 14/25]  Current/Best:   20.51/  20.51 GFLOPS | Progress: (12/20) | 7.98 s
    [Task 14/25]  Current/Best:   17.17/  20.51 GFLOPS | Progress: (16/20) | 9.64 s Done.
+
    [Task 14/25]  Current/Best:   17.53/  20.51 GFLOPS | Progress: (20/20) | 11.44 s
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 15/25]  Current/Best:   16.16/  17.42 GFLOPS | Progress: (4/20) | 2.76 s
    [Task 15/25]  Current/Best:   14.42/  17.88 GFLOPS | Progress: (8/20) | 4.10 s
    [Task 15/25]  Current/Best:   10.40/  22.36 GFLOPS | Progress: (12/20) | 6.17 s
    [Task 15/25]  Current/Best:   20.27/  22.36 GFLOPS | Progress: (16/20) | 9.17 s
    [Task 15/25]  Current/Best:    9.67/  22.36 GFLOPS | Progress: (20/20) | 10.15 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 16/25]  Current/Best:   20.44/  20.44 GFLOPS | Progress: (4/20) | 2.99 s
    [Task 16/25]  Current/Best:    3.04/  20.44 GFLOPS | Progress: (8/20) | 4.61 s
    [Task 16/25]  Current/Best:   19.46/  20.44 GFLOPS | Progress: (12/20) | 5.83 s
    [Task 16/25]  Current/Best:   18.31/  20.44 GFLOPS | Progress: (16/20) | 7.18 s
    [Task 16/25]  Current/Best:   10.00/  21.59 GFLOPS | Progress: (20/20) | 9.23 s Done.
+
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 17/25]  Current/Best:   13.37/  18.13 GFLOPS | Progress: (4/20) | 4.75 s
    [Task 17/25]  Current/Best:   14.43/  23.25 GFLOPS | Progress: (8/20) | 7.61 s
    [Task 17/25]  Current/Best:   18.46/  23.25 GFLOPS | Progress: (12/20) | 9.67 s
    [Task 17/25]  Current/Best:   16.40/  23.25 GFLOPS | Progress: (16/20) | 11.79 s
    [Task 17/25]  Current/Best:   10.05/  23.25 GFLOPS | Progress: (20/20) | 13.93 s Done.
+
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 18/25]  Current/Best:   11.26/  18.00 GFLOPS | Progress: (4/20) | 3.73 s
    [Task 18/25]  Current/Best:   10.55/  20.01 GFLOPS | Progress: (8/20) | 7.15 s
    [Task 18/25]  Current/Best:   19.14/  20.01 GFLOPS | Progress: (12/20) | 9.09 s
    [Task 18/25]  Current/Best:    9.96/  20.01 GFLOPS | Progress: (16/20) | 12.67 s
    [Task 18/25]  Current/Best:   20.27/  20.27 GFLOPS | Progress: (20/20) | 14.21 s Done.
+
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 19/25]  Current/Best:    7.03/  20.27 GFLOPS | Progress: (4/20) | 6.10 s
    [Task 19/25]  Current/Best:    2.69/  20.27 GFLOPS | Progress: (8/20) | 9.33 s
    [Task 19/25]  Current/Best:   19.84/  21.41 GFLOPS | Progress: (12/20) | 12.10 s
    [Task 19/25]  Current/Best:   15.48/  21.64 GFLOPS | Progress: (16/20) | 14.92 s
    [Task 19/25]  Current/Best:    2.70/  23.07 GFLOPS | Progress: (20/20) | 17.69 s Done.
+
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 20/25]  Current/Best:    9.41/  15.46 GFLOPS | Progress: (4/20) | 3.32 s Done.
      Done.
-
    [Task 20/25]  Current/Best:   10.46/  14.91 GFLOPS | Progress: (8/20) | 6.74 s
    [Task 20/25]  Current/Best:    2.33/  16.54 GFLOPS | Progress: (12/20) | 10.70 s
    [Task 20/25]  Current/Best:   12.55/  16.54 GFLOPS | Progress: (16/20) | 14.29 s
    [Task 20/25]  Current/Best:   13.64/  21.51 GFLOPS | Progress: (20/20) | 16.38 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 21/25]  Current/Best:    6.38/  17.57 GFLOPS | Progress: (4/20) | 3.28 s
    [Task 21/25]  Current/Best:   14.48/  17.57 GFLOPS | Progress: (8/20) | 4.89 s
    [Task 21/25]  Current/Best:    1.61/  17.57 GFLOPS | Progress: (12/20) | 7.05 s
    [Task 21/25]  Current/Best:   17.95/  17.95 GFLOPS | Progress: (16/20) | 10.53 s
    [Task 21/25]  Current/Best:    4.45/  17.95 GFLOPS | Progress: (20/20) | 17.85 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 22/25]  Current/Best:    2.70/  17.00 GFLOPS | Progress: (4/20) | 2.74 s
    [Task 22/25]  Current/Best:    8.70/  21.91 GFLOPS | Progress: (8/20) | 4.71 s
    [Task 22/25]  Current/Best:   19.72/  21.91 GFLOPS | Progress: (12/20) | 7.05 s
    [Task 22/25]  Current/Best:   15.01/  21.91 GFLOPS | Progress: (16/20) | 9.11 s
    [Task 22/25]  Current/Best:   15.24/  21.91 GFLOPS | Progress: (20/20) | 10.84 s Done.
-
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 23/25]  Current/Best:   17.31/  20.25 GFLOPS | Progress: (4/20) | 3.32 s
    [Task 23/25]  Current/Best:   16.15/  20.25 GFLOPS | Progress: (8/20) | 6.68 s
    [Task 23/25]  Current/Best:   20.75/  21.19 GFLOPS | Progress: (12/20) | 8.50 s
    [Task 23/25]  Current/Best:    6.23/  21.19 GFLOPS | Progress: (16/20) | 15.70 s
    [Task 23/25]  Current/Best:    7.64/  21.19 GFLOPS | Progress: (20/20) | 19.97 s Done.
-
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 24/25]  Current/Best:    8.42/   8.42 GFLOPS | Progress: (4/20) | 11.84 s
    [Task 24/25]  Current/Best:    1.95/   8.42 GFLOPS | Progress: (8/20) | 22.88 s
    [Task 24/25]  Current/Best:    3.85/   8.42 GFLOPS | Progress: (12/20) | 34.46 s Done.
-
    [Task 24/25]  Current/Best:    7.13/   8.78 GFLOPS | Progress: (16/20) | 39.92 s
    [Task 24/25]  Current/Best:    3.31/   8.86 GFLOPS | Progress: (20/20) | 45.85 s Done.
-
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 25/25]  Current/Best:    1.55/   2.88 GFLOPS | Progress: (4/20) | 11.63 s
    [Task 25/25]  Current/Best:    5.50/   7.86 GFLOPS | Progress: (8/20) | 22.94 s
    [Task 25/25]  Current/Best:    5.60/   7.86 GFLOPS | Progress: (12/20) | 34.44 s
    [Task 25/25]  Current/Best:    5.57/   9.25 GFLOPS | Progress: (16/20) | 36.25 s
    [Task 25/25]  Current/Best:    2.88/   9.25 GFLOPS | Progress: (20/20) | 46.92 s
+
    [Task 20/25]  Current/Best:   10.18/  15.46 GFLOPS | Progress: (8/20) | 6.61 s
    [Task 20/25]  Current/Best:    2.32/  16.65 GFLOPS | Progress: (12/20) | 10.67 s
    [Task 20/25]  Current/Best:   11.20/  16.65 GFLOPS | Progress: (16/20) | 14.46 s
    [Task 20/25]  Current/Best:   13.08/  22.14 GFLOPS | Progress: (20/20) | 16.54 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 21/25]  Current/Best:    6.40/  17.66 GFLOPS | Progress: (4/20) | 3.26 s
    [Task 21/25]  Current/Best:   14.63/  17.66 GFLOPS | Progress: (8/20) | 4.80 s
    [Task 21/25]  Current/Best:    1.61/  17.66 GFLOPS | Progress: (12/20) | 6.96 s
    [Task 21/25]  Current/Best:   18.12/  18.12 GFLOPS | Progress: (16/20) | 10.41 s
    [Task 21/25]  Current/Best:    4.47/  18.12 GFLOPS | Progress: (20/20) | 17.57 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 22/25]  Current/Best:    2.70/  17.05 GFLOPS | Progress: (4/20) | 2.70 s
    [Task 22/25]  Current/Best:    8.73/  21.88 GFLOPS | Progress: (8/20) | 4.68 s
    [Task 22/25]  Current/Best:   20.00/  21.88 GFLOPS | Progress: (12/20) | 7.00 s
    [Task 22/25]  Current/Best:   14.92/  21.88 GFLOPS | Progress: (16/20) | 9.08 s
    [Task 22/25]  Current/Best:   14.24/  21.88 GFLOPS | Progress: (20/20) | 10.75 s Done.
+
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 23/25]  Current/Best:   17.56/  20.54 GFLOPS | Progress: (4/20) | 3.29 s
    [Task 23/25]  Current/Best:   15.49/  20.54 GFLOPS | Progress: (8/20) | 6.54 s
    [Task 23/25]  Current/Best:   20.94/  21.51 GFLOPS | Progress: (12/20) | 8.35 s
    [Task 23/25]  Current/Best:    6.30/  21.51 GFLOPS | Progress: (16/20) | 15.24 s
    [Task 23/25]  Current/Best:    7.67/  21.51 GFLOPS | Progress: (20/20) | 19.46 s Done.
+
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 24/25]  Current/Best:    8.10/   8.10 GFLOPS | Progress: (4/20) | 11.85 s
    [Task 24/25]  Current/Best:    3.61/   8.10 GFLOPS | Progress: (8/20) | 23.11 s
    [Task 24/25]  Current/Best:    4.68/   8.10 GFLOPS | Progress: (12/20) | 33.84 s Done.
+
    [Task 24/25]  Current/Best:    7.16/   8.95 GFLOPS | Progress: (16/20) | 39.22 s
    [Task 24/25]  Current/Best:    3.28/   8.99 GFLOPS | Progress: (20/20) | 45.12 s Done.
+
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
    [Task 25/25]  Current/Best:    1.55/   2.89 GFLOPS | Progress: (4/20) | 11.64 s
    [Task 25/25]  Current/Best:    5.70/   7.89 GFLOPS | Progress: (8/20) | 22.96 s
    [Task 25/25]  Current/Best:    6.02/   7.89 GFLOPS | Progress: (12/20) | 34.45 s
    [Task 25/25]  Current/Best:    5.78/   8.93 GFLOPS | Progress: (16/20) | 36.33 s
    [Task 25/25]  Current/Best:    2.90/   9.05 GFLOPS | Progress: (20/20) | 46.99 s
 
 
 
@@ -748,8 +748,8 @@ improvement in comparing the optimized model to the unoptimized model.
 
  .. code-block:: none
 
-    optimized: {'mean': 413.67374406999943, 'median': 413.19100769999295, 'std': 1.3671008475901267}
-    unoptimized: {'mean': 497.01132035, 'median': 496.9583858500016, 'std': 0.719820825023517}
+    optimized: {'mean': 414.7075195700063, 'median': 414.6528698500106, 'std': 1.308842791249852}
+    unoptimized: {'mean': 496.51104268999006, 'median': 495.9818490999851, 'std': 1.709787608932748}
 
 
 
@@ -772,7 +772,7 @@ profiling/benchmarking.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 10 minutes  21.737 seconds)
+   **Total running time of the script:** ( 10 minutes  16.869 seconds)
 
 
 .. _sphx_glr_download_tutorial_autotvm_relay_x86.py:
diff --git a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
index 2d2b829f3..0f81a1e36 100644
--- a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
@@ -282,7 +282,7 @@ device and returns the measured cost. Network overhead is excluded.
 
  .. code-block:: none
 
-    1.246e-07 secs/op
+    1.279e-07 secs/op
 
 
 
diff --git a/docs/_sources/tutorial/intro_topi.rst.txt b/docs/_sources/tutorial/intro_topi.rst.txt
index 8f59aca82..16358f97e 100644
--- a/docs/_sources/tutorial/intro_topi.rst.txt
+++ b/docs/_sources/tutorial/intro_topi.rst.txt
@@ -263,7 +263,7 @@ As you can see, scheduled stages of computation have been accumulated and we can
 
  .. code-block:: none
 
-    [stage(a, placeholder(a, 0x1c281cb0)), stage(b, placeholder(b, 0x1c282b20)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(mi [...]
+    [stage(a, placeholder(a, 0x124ca830)), stage(b, placeholder(b, 0x12482b70)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(mi [...]
 
 
 
diff --git a/docs/_sources/tutorial/sg_execution_times.rst.txt b/docs/_sources/tutorial/sg_execution_times.rst.txt
index 239659a51..7f6654800 100644
--- a/docs/_sources/tutorial/sg_execution_times.rst.txt
+++ b/docs/_sources/tutorial/sg_execution_times.rst.txt
@@ -5,32 +5,32 @@
 
 Computation times
 =================
-**13:09.031** total execution time for **tutorial** files:
+**13:05.923** total execution time for **tutorial** files:
 
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)                 | 10:21.737 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)                 | 10:16.869 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)     | 01:00.306 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)     | 00:59.413 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``) | 00:48.890 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``) | 00:53.320 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)                 | 00:31.358 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)                 | 00:30.823 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)               | 00:24.696 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)               | 00:24.118 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)       | 00:01.172 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)                               | 00:00.708 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)                               | 00:00.706 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)       | 00:00.513 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``) | 00:00.156 | 0.0 MB |
+| :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``) | 00:00.151 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_introduction.py` (``introduction.py``)                           | 00:00.005 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_uma.py` (``uma.py``)                                             | 00:00.001 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_tutorial_tvmc_python.py` (``tvmc_python.py``)                             | 00:00.001 | 0.0 MB |
-+------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_install.py` (``install.py``)                                     | 00:00.001 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_tutorial_tvmc_command_line_driver.py` (``tvmc_command_line_driver.py``)   | 00:00.001 | 0.0 MB |
 +------------------------------------------------------------------------------------------+-----------+--------+
+| :ref:`sphx_glr_tutorial_tvmc_python.py` (``tvmc_python.py``)                             | 00:00.001 | 0.0 MB |
++------------------------------------------------------------------------------------------+-----------+--------+
diff --git a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
index 5e0c743ca..54af6d1b2 100644
--- a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
+++ b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
@@ -301,8 +301,8 @@ helper function to run a profile of the TVM generated code.
 
  .. code-block:: none
 
-    Numpy running time: 0.000008
-    naive: 0.000007
+    Numpy running time: 0.000012
+    naive: 0.000010
 
 
 
@@ -403,7 +403,7 @@ compile and run this new schedule with the parallel operation applied:
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    parallel: 0.000006
+    parallel: 0.000011
 
 
 
@@ -512,10 +512,10 @@ We can now compare the different schedules
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                   numpy    8.439260000159265e-06                    1.0
-                   naive    6.598399999999999e-06     0.7818695003916782
-                parallel              5.9822e-06      0.7088536198537673
-                  vector    2.4694700000000003e-05     2.926168882050555
+                   numpy    1.2196850002510473e-05                   1.0
+                   naive             1.02922e-05      0.8438408275810196
+                parallel             1.10824e-05      0.9086280472186601
+                  vector    2.4550700000000002e-05    2.0128721755983503
 
 
 
@@ -936,7 +936,7 @@ matrix multiplication.
 
  .. code-block:: none
 
-    Numpy running time: 0.018766
+    Numpy running time: 0.018257
 
 
 
@@ -996,7 +996,7 @@ optimizations.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    none: 3.327939
+    none: 3.290384
 
 
 
@@ -1101,7 +1101,7 @@ schedule.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    blocking: 0.311314
+    blocking: 0.301687
 
 
 
@@ -1199,7 +1199,7 @@ already cache friendly from our previous optimizations.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    vectorization: 0.346457
+    vectorization: 0.336929
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1275,7 +1275,7 @@ more cache friendly.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    loop permutation: 0.114644
+    loop permutation: 0.118714
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1376,7 +1376,7 @@ optimized schedule.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    array packing: 0.107473
+    array packing: 0.109632
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1471,7 +1471,7 @@ to `C` when all the block results are ready.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    block caching: 0.110177
+    block caching: 0.111129
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1559,7 +1559,7 @@ of thread-level parallelization.
 
     /workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
       "target_host parameter is going to be deprecated. "
-    parallelization: 0.146244
+    parallelization: 0.145083
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1640,13 +1640,13 @@ working, we can compare the results.
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                    none            3.3279385334                     1.0
-                blocking            0.3113135968     0.09354547677956822
-           vectorization            0.3464572697      0.1041056696879677
-        loop permutation             0.114643695    0.034448861915389366
-           array packing     0.10747286520000002     0.03229412566409392
-           block caching            0.1101772656     0.03310676098558738
-         parallelization            0.1462436361    0.043944211899427627
+                    none            3.2903840148                     1.0
+                blocking     0.30168669979999996     0.09168738312702307
+           vectorization             0.336928585     0.10239795217959677
+        loop permutation     0.11871362569999999     0.03607895770403437
+           array packing     0.10963219760000001     0.03331896736273921
+           block caching            0.1111290892     0.03377389651181939
+         parallelization            0.1450830362     0.04409304067471243
 
 
 
@@ -1686,11 +1686,6 @@ operations with tunable parameters that allows you to automatically optimize
 the computation for specific platforms.
 
 
-.. rst-class:: sphx-glr-timing
-
-   **Total running time of the script:** ( 1 minutes  0.306 seconds)
-
-
 .. _sphx_glr_download_tutorial_tensor_expr_get_started.py:
 
 .. only:: html
diff --git a/docs/arch/pass_infra.html b/docs/arch/pass_infra.html
index 249bc9c84..cf0d59b7c 100644
--- a/docs/arch/pass_infra.html
+++ b/docs/arch/pass_infra.html
@@ -381,7 +381,7 @@ Gluon, also have the tendency to enable pass-style layer construction
 scheme through <a class="reference external" href="https://pytorch.org/docs/stable/nn.html?highlight=sequential#torch.nn.Sequential">Sequential</a> and <a class="reference external" href="https://mxnet.apache.org/api/python/docs/api/gluon/block.html#gluon-block">Block</a>, respectively. With such constructs,
 these modern frameworks are able to conveniently add modules/layers to their
 containers and build up neural networks easily.</p>
-<p>The design of the Relay pass infra is largely inspired by the the hierarchical
+<p>The design of the Relay pass infra is largely inspired by the hierarchical
 pass manager used in LLVM and the block-style containers used in the popular
 deep learning frameworks. The major goals of the pass infra include:</p>
 <ol class="arabic simple">
diff --git a/docs/arch/security.html b/docs/arch/security.html
index 48a09e720..cc1c14ebc 100644
--- a/docs/arch/security.html
+++ b/docs/arch/security.html
@@ -362,7 +362,7 @@
 We strongly encourage folks to report such problems to our private security mailing list first, before disclosing them in a public forum.</p>
 <p>Please note that the security mailing list should only be used for reporting undisclosed security vulnerabilities and managing the process of fixing such vulnerabilities. We cannot accept regular bug reports or other queries at this address. All mail sent to this address that does not relate to an undisclosed security problem in our source code will be ignored.
 Questions about: if a vulnerability applies to your particular application obtaining further information on a published vulnerability availability of patches
-and/or new releases should be addressed to to the user discuss forum.</p>
+and/or new releases should be addressed to the user Discuss forum.</p>
 <p>The private security mailing address is: <a class="reference external" href="mailto:security&#37;&#52;&#48;apache&#46;org">security<span>&#64;</span>apache<span>&#46;</span>org</a>.
 Feel free to consult the <a class="reference external" href="https://www.apache.org/security/">Apache Security guide</a>.</p>
 </div>
diff --git a/docs/commit_hash b/docs/commit_hash
index 7f504b31f..68db659ae 100644
--- a/docs/commit_hash
+++ b/docs/commit_hash
@@ -1 +1 @@
-9d6039b879364a320b3202d8504626019da3a8f3
+bdcfa01eae3ffe8c6d39aa26d0d1e5b311d47efb
diff --git a/docs/how_to/compile_models/from_darknet.html b/docs/how_to/compile_models/from_darknet.html
index b3dbdb95a..fb93e72d6 100644
--- a/docs/how_to/compile_models/from_darknet.html
+++ b/docs/how_to/compile_models/from_darknet.html
@@ -574,7 +574,7 @@ class:[&#39;truck 0.9266&#39;] left:471 top:83 right:689 bottom:169
 class:[&#39;bicycle 0.9984&#39;] left:111 top:113 right:577 bottom:447
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  1.139 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  5.116 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-darknet-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7716f96385bd5abb6e822041e285be54/from_darknet.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_darknet.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/from_mxnet.html b/docs/how_to/compile_models/from_mxnet.html
index e4de76605..9b1ee2f08 100644
--- a/docs/how_to/compile_models/from_mxnet.html
+++ b/docs/how_to/compile_models/from_mxnet.html
@@ -427,7 +427,7 @@ to download the full example code</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;x&quot;</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#tuple" title="builtins.tuple" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">x</span><span class="o">.</span><span class="n">shape</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_from_mxnet_001.png" srcset="../../_images/sphx_glr_from_mxnet_001.png" alt="from mxnet" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zipfd30f395-f5ff-4b3c-9faa-ce429c571eec from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+<img src="../../_images/sphx_glr_from_mxnet_001.png" srcset="../../_images/sphx_glr_from_mxnet_001.png" alt="from mxnet" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip8b84f8d6-39a9-41c1-b841-c30cb3b6f630 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
 x (1, 3, 224, 224)
 </pre></div>
 </div>
diff --git a/docs/how_to/compile_models/from_oneflow.html b/docs/how_to/compile_models/from_oneflow.html
index 0818fc908..e120937cd 100644
--- a/docs/how_to/compile_models/from_oneflow.html
+++ b/docs/how_to/compile_models/from_oneflow.html
@@ -432,11 +432,15 @@ python3 -m pip install -f https://release.oneflow.info <span class="nv">oneflow<
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip&quot; to /workspace/.oneflow/flowvision_cache/resnet18.zip
 
   0%|          | 0.00/41.5M [00:00&lt;?, ?B/s]
- 21%|##1       | 8.81M/41.5M [00:00&lt;00:00, 92.4MB/s]
- 42%|####2     | 17.6M/41.5M [00:00&lt;00:00, 68.6MB/s]
- 59%|#####9    | 24.5M/41.5M [00:00&lt;00:00, 40.0MB/s]
- 82%|########2 | 34.1M/41.5M [00:00&lt;00:00, 51.0MB/s]
-100%|##########| 41.5M/41.5M [00:00&lt;00:00, 59.4MB/s]
+ 15%|#5        | 6.33M/41.5M [00:00&lt;00:01, 34.8MB/s]
+ 23%|##3       | 9.65M/41.5M [00:00&lt;00:01, 27.3MB/s]
+ 37%|###6      | 15.2M/41.5M [00:00&lt;00:00, 37.2MB/s]
+ 46%|####6     | 19.1M/41.5M [00:00&lt;00:00, 34.4MB/s]
+ 58%|#####7    | 24.0M/41.5M [00:00&lt;00:00, 30.1MB/s]
+ 77%|#######7  | 32.0M/41.5M [00:00&lt;00:00, 39.9MB/s]
+ 87%|########6 | 36.1M/41.5M [00:01&lt;00:00, 39.9MB/s]
+ 97%|#########6| 40.1M/41.5M [00:01&lt;00:00, 36.0MB/s]
+100%|##########| 41.5M/41.5M [00:01&lt;00:00, 36.3MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_pytorch.html b/docs/how_to/compile_models/from_pytorch.html
index e8bcff28b..d6d3df4d3 100644
--- a/docs/how_to/compile_models/from_pytorch.html
+++ b/docs/how_to/compile_models/from_pytorch.html
@@ -414,8 +414,8 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/resnet18-f37072fd.pth&quot; to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
 
   0%|          | 0.00/44.7M [00:00&lt;?, ?B/s]
- 48%|####8     | 21.6M/44.7M [00:00&lt;00:00, 226MB/s]
-100%|##########| 44.7M/44.7M [00:00&lt;00:00, 255MB/s]
+ 44%|####4     | 19.8M/44.7M [00:00&lt;00:00, 207MB/s]
+100%|##########| 44.7M/44.7M [00:00&lt;00:00, 238MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_tensorflow.html b/docs/how_to/compile_models/from_tensorflow.html
index c9c2d3649..959f63585 100644
--- a/docs/how_to/compile_models/from_tensorflow.html
+++ b/docs/how_to/compile_models/from_tensorflow.html
@@ -636,7 +636,7 @@ banana (score = 0.00022)
 desk (score = 0.00019)
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  2.342 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  0.864 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-tensorflow-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7f1d3d1b878694c201c614c807cdebc8/from_tensorflow.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_tensorflow.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/sg_execution_times.html b/docs/how_to/compile_models/sg_execution_times.html
index d67d31119..2006311b3 100644
--- a/docs/how_to/compile_models/sg_execution_times.html
+++ b/docs/how_to/compile_models/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-compile-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>05:01.673</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
+<p><strong>05:03.223</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 81%" />
@@ -335,44 +335,44 @@
 <col style="width: 8%" />
 </colgroup>
 <tbody>
-<tr class="row-odd"><td><p><a class="reference internal" href="from_tensorflow.html#sphx-glr-how-to-compile-models-from-tensorflow-py"><span class="std std-ref">Compile Tensorflow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tensorflow.py</span></code>)</p></td>
-<td><p>01:02.342</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="from_darknet.html#sphx-glr-how-to-compile-models-from-darknet-py"><span class="std std-ref">Compile YOLO-V2 and YOLO-V3 in DarkNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_darknet.py</span></code>)</p></td>
+<td><p>01:05.116</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="from_darknet.html#sphx-glr-how-to-compile-models-from-darknet-py"><span class="std std-ref">Compile YOLO-V2 and YOLO-V3 in DarkNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_darknet.py</span></code>)</p></td>
-<td><p>01:01.139</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="from_tensorflow.html#sphx-glr-how-to-compile-models-from-tensorflow-py"><span class="std std-ref">Compile Tensorflow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tensorflow.py</span></code>)</p></td>
+<td><p>01:00.864</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_paddle.html#sphx-glr-how-to-compile-models-from-paddle-py"><span class="std std-ref">Compile PaddlePaddle Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_paddle.py</span></code>)</p></td>
-<td><p>00:39.461</p></td>
+<td><p>00:39.238</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_oneflow.html#sphx-glr-how-to-compile-models-from-oneflow-py"><span class="std std-ref">Compile OneFlow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_oneflow.py</span></code>)</p></td>
-<td><p>00:28.160</p></td>
+<td><p>00:29.048</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_mxnet.html#sphx-glr-how-to-compile-models-from-mxnet-py"><span class="std std-ref">Compile MXNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_mxnet.py</span></code>)</p></td>
-<td><p>00:25.808</p></td>
+<td><p>00:25.201</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_tflite.html#sphx-glr-how-to-compile-models-from-tflite-py"><span class="std std-ref">Compile TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tflite.py</span></code>)</p></td>
-<td><p>00:24.177</p></td>
+<td><p>00:24.474</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_coreml.html#sphx-glr-how-to-compile-models-from-coreml-py"><span class="std std-ref">Compile CoreML Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_coreml.py</span></code>)</p></td>
-<td><p>00:23.146</p></td>
+<td><p>00:22.189</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_pytorch.html#sphx-glr-how-to-compile-models-from-pytorch-py"><span class="std std-ref">Compile PyTorch Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_pytorch.py</span></code>)</p></td>
-<td><p>00:19.420</p></td>
+<td><p>00:19.194</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="from_keras.html#sphx-glr-how-to-compile-models-from-keras-py"><span class="std std-ref">Compile Keras Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_keras.py</span></code>)</p></td>
-<td><p>00:15.703</p></td>
+<td><p>00:15.453</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="from_onnx.html#sphx-glr-how-to-compile-models-from-onnx-py"><span class="std std-ref">Compile ONNX Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_onnx.py</span></code>)</p></td>
-<td><p>00:02.316</p></td>
+<td><p>00:02.446</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/deploy/index.html b/docs/how_to/deploy/index.html
index 36020451f..05617e918 100644
--- a/docs/how_to/deploy/index.html
+++ b/docs/how_to/deploy/index.html
@@ -406,7 +406,7 @@ After you get the TVM runtime library, you can link the compiled library</p>
 </div>
 <p>A model (optimized or not by TVM) can be cross compiled by TVM for
 different architectures such as <code class="docutils literal notranslate"><span class="pre">aarch64</span></code> on a <code class="docutils literal notranslate"><span class="pre">x64_64</span></code> host. Once the model
-is cross compiled it is neccessary to have a runtime compatible with the target
+is cross compiled it is necessary to have a runtime compatible with the target
 architecture to be able to run the cross compiled model.</p>
 </div>
 <div class="section" id="cross-compile-the-tvm-runtime-for-other-architectures">
diff --git a/docs/how_to/deploy_models/deploy_model_on_android.html b/docs/how_to/deploy_models/deploy_model_on_android.html
index f1a914bc7..f85f62201 100644
--- a/docs/how_to/deploy_models/deploy_model_on_android.html
+++ b/docs/how_to/deploy_models/deploy_model_on_android.html
@@ -653,7 +653,7 @@ to the remote android device.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  16.0282      15.9883      16.2973      15.8762       0.1440
+  15.8096      15.7889      15.9555      15.6716       0.0876
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
index 7b56962d9..fbafbcb4b 100644
--- a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
+++ b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
@@ -436,15 +436,13 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth&quot; to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
 
   0%|          | 0.00/170M [00:00&lt;?, ?B/s]
-  2%|2         | 3.87M/170M [00:00&lt;00:04, 40.4MB/s]
-  5%|4         | 8.02M/170M [00:00&lt;00:04, 42.3MB/s]
- 19%|#9        | 32.7M/170M [00:00&lt;00:01, 141MB/s]
- 34%|###3      | 56.9M/170M [00:00&lt;00:00, 185MB/s]
- 47%|####6     | 79.5M/170M [00:00&lt;00:00, 204MB/s]
- 60%|######    | 102M/170M [00:00&lt;00:00, 216MB/s]
- 73%|#######3  | 125M/170M [00:00&lt;00:00, 222MB/s]
- 87%|########7 | 148M/170M [00:00&lt;00:00, 228MB/s]
-100%|##########| 170M/170M [00:00&lt;00:00, 199MB/s]
+ 12%|#1        | 20.1M/170M [00:00&lt;00:00, 210MB/s]
+ 26%|##6       | 44.4M/170M [00:00&lt;00:00, 237MB/s]
+ 41%|####      | 69.1M/170M [00:00&lt;00:00, 247MB/s]
+ 56%|#####5    | 95.0M/170M [00:00&lt;00:00, 257MB/s]
+ 72%|#######1  | 122M/170M [00:00&lt;00:00, 265MB/s]
+ 87%|########7 | 148M/170M [00:00&lt;00:00, 268MB/s]
+100%|##########| 170M/170M [00:00&lt;00:00, 261MB/s]
 /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
   for i in range(dim)
 /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the &#39;trunc&#39; function NOT &#39;floor&#39;). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode=&#39;trunc&#39;), or for actual floor division, use torch.div(a, b, rounding_mode=&#39;floor&#39;).
@@ -539,7 +537,7 @@ torchvision rcnn models.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Get 9 valid boxes
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  5.158 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  57.719 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-object-detection-pytorch-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7795da4b258c8feff986668b95ef57ad/deploy_object_detection_pytorch.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_object_detection_pytorch.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized.html b/docs/how_to/deploy_models/deploy_prequantized.html
index b629deab8..816d81725 100644
--- a/docs/how_to/deploy_models/deploy_prequantized.html
+++ b/docs/how_to/deploy_models/deploy_prequantized.html
@@ -480,7 +480,7 @@ training. Other models require a full post training calibration.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/mobilenet_v2-b0353104.pth&quot; to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
 
   0%|          | 0.00/13.6M [00:00&lt;?, ?B/s]
-100%|##########| 13.6M/13.6M [00:00&lt;00:00, 176MB/s]
+100%|##########| 13.6M/13.6M [00:00&lt;00:00, 168MB/s]
 </pre></div>
 </div>
 </div>
@@ -569,7 +569,7 @@ output values are identical out of 1000 outputs from mobilenet v2.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  90.3587      90.3543      91.8387      90.0842       0.1957
+  90.1677      90.0595      96.0860      89.9498       0.6435
 </pre></div>
 </div>
 <div class="admonition note">
@@ -608,7 +608,7 @@ This includes support for the VNNI 8 bit dot product instruction (CascadeLake or
 <div class="section" id="deploy-a-quantized-tflite-model">
 <h2>Deploy a quantized TFLite Model<a class="headerlink" href="#deploy-a-quantized-tflite-model" title="Permalink to this headline">¶</a></h2>
 <p>TODO</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  10.869 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  9.702 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/fb8217c13f4351224c6cf3aacf1a87fc/deploy_prequantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized_tflite.html b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
index 38a0c0612..7ac89ee0e 100644
--- a/docs/how_to/deploy_models/deploy_prequantized_tflite.html
+++ b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
@@ -573,7 +573,7 @@ TFLite Top-5 labels: [387 102 386 341 349]
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  121.0594     120.9719     124.8811     120.1730      0.5547
+  118.7824     118.7393     121.0677     118.0321      0.3706
 </pre></div>
 </div>
 <div class="admonition note">
@@ -601,7 +601,7 @@ network for ARM CPU</span></a>.</p></li>
 </ul>
 </div></blockquote>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  52.696 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  53.612 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-tflite-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/56691c7a27d45da61d112276334640d3/deploy_prequantized_tflite.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized_tflite.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_quantized.html b/docs/how_to/deploy_models/deploy_quantized.html
index ce83a4826..43d23d98f 100644
--- a/docs/how_to/deploy_models/deploy_quantized.html
+++ b/docs/how_to/deploy_models/deploy_quantized.html
@@ -509,7 +509,7 @@ for calibration. But the accuracy might be impacted.</p>
   DeprecationWarning,
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  20.780 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  51.204 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-quantized-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7810ecf51bfc05f7d5e8a400ac3e815d/deploy_quantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_quantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
index 0cbbaf87d..6e42fb526 100644
--- a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
+++ b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
@@ -441,24 +441,24 @@ to your device.</p>
 Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
 
   0%|          | 0/132723 [00:00&lt;?, ?KB/s]
-  4%|3         | 5018/132723 [00:00&lt;00:02, 50173.35KB/s]
-  9%|9         | 12125/132723 [00:00&lt;00:01, 62461.36KB/s]
- 15%|#5        | 20091/132723 [00:00&lt;00:01, 70309.28KB/s]
- 21%|##1       | 28115/132723 [00:00&lt;00:01, 74223.67KB/s]
- 27%|##7       | 36142/132723 [00:00&lt;00:01, 76401.99KB/s]
- 33%|###3      | 44155/132723 [00:00&lt;00:01, 77665.09KB/s]
- 39%|###9      | 52160/132723 [00:00&lt;00:01, 78434.15KB/s]
- 45%|####5     | 60211/132723 [00:00&lt;00:00, 79092.22KB/s]
- 51%|#####1    | 68300/132723 [00:00&lt;00:00, 79650.92KB/s]
- 58%|#####7    | 76331/132723 [00:01&lt;00:00, 79852.16KB/s]
- 64%|######3   | 84387/132723 [00:01&lt;00:00, 80066.11KB/s]
- 70%|######9   | 92394/132723 [00:01&lt;00:00, 79949.90KB/s]
- 76%|#######5  | 100427/132723 [00:01&lt;00:00, 80063.39KB/s]
- 82%|########1 | 108459/132723 [00:01&lt;00:00, 80131.23KB/s]
- 88%|########7 | 116560/132723 [00:01&lt;00:00, 80393.72KB/s]
- 94%|#########3| 124600/132723 [00:01&lt;00:00, 80183.86KB/s]
-100%|#########9| 132620/132723 [00:01&lt;00:00, 80185.97KB/s]
-100%|##########| 132723/132723 [00:01&lt;00:00, 77828.16KB/s]
+  4%|3         | 4967/132723 [00:00&lt;00:02, 49666.61KB/s]
+ 10%|9         | 13035/132723 [00:00&lt;00:01, 67905.21KB/s]
+ 15%|#4        | 19826/132723 [00:00&lt;00:02, 46076.00KB/s]
+ 21%|##1       | 28056/132723 [00:00&lt;00:01, 57188.46KB/s]
+ 26%|##5       | 34460/132723 [00:00&lt;00:01, 50663.10KB/s]
+ 32%|###2      | 42576/132723 [00:00&lt;00:01, 58931.78KB/s]
+ 38%|###8      | 50764/132723 [00:00&lt;00:01, 65338.42KB/s]
+ 44%|####4     | 58975/132723 [00:00&lt;00:01, 70119.11KB/s]
+ 51%|#####     | 67141/132723 [00:01&lt;00:00, 73457.31KB/s]
+ 57%|#####6    | 75377/132723 [00:01&lt;00:00, 76058.53KB/s]
+ 63%|######2   | 83197/132723 [00:01&lt;00:00, 55172.82KB/s]
+ 69%|######8   | 91333/132723 [00:01&lt;00:00, 61258.99KB/s]
+ 74%|#######4  | 98301/132723 [00:01&lt;00:00, 45110.55KB/s]
+ 78%|#######8  | 104088/132723 [00:01&lt;00:00, 47690.73KB/s]
+ 85%|########4 | 112283/132723 [00:01&lt;00:00, 55370.12KB/s]
+ 91%|######### | 120519/132723 [00:02&lt;00:00, 61916.00KB/s]
+ 97%|#########7| 128752/132723 [00:02&lt;00:00, 67168.56KB/s]
+100%|##########| 132723/132723 [00:02&lt;00:00, 57332.06KB/s]
 </pre></div>
 </div>
 <p>Create TVM runtime and do inference
@@ -501,7 +501,7 @@ Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from h
 <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" srcset="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" alt="deploy ssd gluoncv" class = "sphx-glr-single-img"/><p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  40.544 seconds)</p>
+<img src="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" srcset="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" alt="deploy ssd gluoncv" class = "sphx-glr-single-img"/><p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  34.754 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-ssd-gluoncv-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/cccb17d28e5e8b2e94ea8cd5ec59f6ed/deploy_ssd_gluoncv.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_ssd_gluoncv.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/sg_execution_times.html b/docs/how_to/deploy_models/sg_execution_times.html
index cac80ac41..762d741bf 100644
--- a/docs/how_to/deploy_models/sg_execution_times.html
+++ b/docs/how_to/deploy_models/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-deploy-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>11:25.711</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
+<p><strong>11:40.675</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 86%" />
@@ -336,35 +336,35 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_object_detection_pytorch.html#sphx-glr-how-to-deploy-models-deploy-object-detection-pytorch-py"><span class="std std-ref">Compile PyTorch Object Detection Models</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_object_detection_pytorch.py</span></code>)</p></td>
-<td><p>03:05.158</p></td>
+<td><p>02:57.719</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_ssd_gluoncv.html#sphx-glr-how-to-deploy-models-deploy-ssd-gluoncv-py"><span class="std std-ref">Deploy Single Shot Multibox Detector(SSD) model</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_ssd_gluoncv.py</span></code>)</p></td>
-<td><p>02:40.544</p></td>
+<td><p>02:34.754</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_prequantized_tflite.html#sphx-glr-how-to-deploy-models-deploy-prequantized-tflite-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM - Part 3 (TFLite)</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized_tflite.py</span></code>)</p></td>
-<td><p>01:52.696</p></td>
+<td><p>01:53.612</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_quantized.html#sphx-glr-how-to-deploy-models-deploy-quantized-py"><span class="std std-ref">Deploy a Quantized Model on Cuda</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_quantized.py</span></code>)</p></td>
-<td><p>01:20.780</p></td>
+<td><p>01:51.204</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_prequantized.html#sphx-glr-how-to-deploy-models-deploy-prequantized-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized.py</span></code>)</p></td>
-<td><p>01:10.869</p></td>
+<td><p>01:09.702</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_model_on_android.html#sphx-glr-how-to-deploy-models-deploy-model-on-android-py"><span class="std std-ref">Deploy the Pretrained Model on Android</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_android.py</span></code>)</p></td>
-<td><p>00:30.326</p></td>
+<td><p>00:29.531</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_model_on_nano.html#sphx-glr-how-to-deploy-models-deploy-model-on-nano-py"><span class="std std-ref">Deploy the Pretrained Model on Jetson Nano</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_nano.py</span></code>)</p></td>
-<td><p>00:22.963</p></td>
+<td><p>00:22.311</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_model_on_rasp.html#sphx-glr-how-to-deploy-models-deploy-model-on-rasp-py"><span class="std std-ref">Deploy the Pretrained Model on Raspberry Pi</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_rasp.py</span></code>)</p></td>
-<td><p>00:22.369</p></td>
+<td><p>00:21.836</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_sparse.html#sphx-glr-how-to-deploy-models-deploy-sparse-py"><span class="std std-ref">Deploy a Hugging Face Pruned Model on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_sparse.py</span></code>)</p></td>
diff --git a/docs/how_to/extend_tvm/bring_your_own_datatypes.html b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
index 735f3db26..691560d6b 100644
--- a/docs/how_to/extend_tvm/bring_your_own_datatypes.html
+++ b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
@@ -612,7 +612,7 @@ In this alpha state of the Bring Your Own Datatypes framework, we have not imple
 <span class="n">module</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">params</span></a> <span class="o">=</span> <span class="n">get_mobilenet</span><span class="p">()</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip2870b69e-a103-4d4c-95ee-3a9991999beb from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip6222c333-f560-463f-93e3-c67ac93647d9 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 </pre></div>
 </div>
 <p>It’s easy to execute MobileNet with native TVM:</p>
@@ -676,7 +676,7 @@ In this alpha state of the Bring Your Own Datatypes framework, we have not imple
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-  Check failed: (lower) is false: Intrinsic lowering function for target llvm, intrinsic name tir.sqrt, type 150 not found
+  Check failed: (lower) is false: FloatImm lowering function for target llvm type 150 not found
 </pre></div>
 </div>
 <p>When we attempt to run the model, we get a familiar error telling us that more functions need to be registered for myfloat.</p>
diff --git a/docs/how_to/extend_tvm/sg_execution_times.html b/docs/how_to/extend_tvm/sg_execution_times.html
index d876335bf..2b430f306 100644
--- a/docs/how_to/extend_tvm/sg_execution_times.html
+++ b/docs/how_to/extend_tvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-extend-tvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:41.678</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
+<p><strong>00:41.101</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,19 +336,19 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="bring_your_own_datatypes.html#sphx-glr-how-to-extend-tvm-bring-your-own-datatypes-py"><span class="std std-ref">Bring Your Own Datatypes to TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">bring_your_own_datatypes.py</span></code>)</p></td>
-<td><p>00:38.397</p></td>
+<td><p>00:37.861</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="use_pass_instrument.html#sphx-glr-how-to-extend-tvm-use-pass-instrument-py"><span class="std std-ref">How to Use TVM Pass Instrument</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_instrument.py</span></code>)</p></td>
-<td><p>00:02.305</p></td>
+<td><p>00:02.287</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="use_pass_infra.html#sphx-glr-how-to-extend-tvm-use-pass-infra-py"><span class="std std-ref">How to Use TVM Pass Infra</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_infra.py</span></code>)</p></td>
-<td><p>00:00.967</p></td>
+<td><p>00:00.944</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="low_level_custom_pass.html#sphx-glr-how-to-extend-tvm-low-level-custom-pass-py"><span class="std std-ref">Writing a Customized Pass</span></a> (<code class="docutils literal notranslate"><span class="pre">low_level_custom_pass.py</span></code>)</p></td>
-<td><p>00:00.009</p></td>
+<td><p>00:00.008</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/extend_tvm/use_pass_instrument.html b/docs/how_to/extend_tvm/use_pass_instrument.html
index 704441dfd..cb05e2ac8 100644
--- a/docs/how_to/extend_tvm/use_pass_instrument.html
+++ b/docs/how_to/extend_tvm/use_pass_instrument.html
@@ -512,10 +512,10 @@ profile the execution time of each passes.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 6885us [6885us] (46.33%; 46.33%)
-FoldScaleAxis: 7975us [6us] (53.67%; 53.67%)
-        FoldConstant: 7968us [1598us] (53.62%; 99.92%)
-                InferType: 6370us [6370us] (42.87%; 79.94%)
+InferType: 6965us [6965us] (46.40%; 46.40%)
+FoldScaleAxis: 8047us [6us] (53.60%; 53.60%)
+        FoldConstant: 8041us [1642us] (53.56%; 99.93%)
+                InferType: 6399us [6399us] (42.63%; 79.58%)
 </pre></div>
 </div>
 </div>
@@ -537,10 +537,10 @@ Refer to following sections and <a class="reference internal" href="../../refere
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 6441us [6441us] (45.00%; 45.00%)
-FoldScaleAxis: 7872us [5us] (55.00%; 55.00%)
-        FoldConstant: 7866us [1620us] (54.96%; 99.93%)
-                InferType: 6246us [6246us] (43.64%; 79.40%)
+InferType: 6454us [6454us] (44.60%; 44.60%)
+FoldScaleAxis: 8016us [6us] (55.40%; 55.40%)
+        FoldConstant: 8010us [1656us] (55.36%; 99.93%)
+                InferType: 6354us [6354us] (43.91%; 79.33%)
 </pre></div>
 </div>
 <p>Register empty list to clear existing instruments.</p>
diff --git a/docs/how_to/optimize_operators/opt_conv_cuda.html b/docs/how_to/optimize_operators/opt_conv_cuda.html
index c5b943448..2702dc617 100644
--- a/docs/how_to/optimize_operators/opt_conv_cuda.html
+++ b/docs/how_to/optimize_operators/opt_conv_cuda.html
@@ -564,7 +564,7 @@ latency of convolution.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Convolution: </span><span class="si">%f</span><span class="s2"> ms&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span> <span class="o">*</span> <span cl [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 54.245192 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 54.269767 ms
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-optimize-operators-opt-conv-cuda-py">
diff --git a/docs/how_to/optimize_operators/opt_conv_tensorcore.html b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
index 4a0727b55..90ebdff7a 100644
--- a/docs/how_to/optimize_operators/opt_conv_tensorcore.html
+++ b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
@@ -906,7 +906,7 @@ be able to run on our build server</p>
     <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;conv2d with tensor core: </span><span class="si">%f</span><span class="s2"> ms&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span> <span class="o">* [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 9.119490 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 7.168272 ms
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/optimize_operators/opt_gemm.html b/docs/how_to/optimize_operators/opt_gemm.html
index 0bc41de8d..8a668d583 100644
--- a/docs/how_to/optimize_operators/opt_gemm.html
+++ b/docs/how_to/optimize_operators/opt_gemm.html
@@ -461,8 +461,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Baseline: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018935
-Baseline: 3.409753
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018922
+Baseline: 3.316339
 </pre></div>
 </div>
 <p>In TVM, we can always inspect lower level IR to debug or optimize our schedule.
@@ -522,7 +522,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt1: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.316767
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.305577
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -589,7 +589,7 @@ vastly.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt2: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.346355
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.336554
 </pre></div>
 </div>
 <p>Here is the generated IR after vectorization.</p>
@@ -650,7 +650,7 @@ the access pattern for A matrix is more cache friendly.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt3: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.119693
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.115825
 </pre></div>
 </div>
 <p>Here is the generated IR after loop permutation.</p>
@@ -733,7 +733,7 @@ flattening.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt4: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.110624
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.108125
 </pre></div>
 </div>
 <p>Here is the generated IR after array packing.</p>
@@ -819,7 +819,7 @@ write to C when all the block results are ready.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt5: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">evaluator</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.111787
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.111973
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -873,7 +873,7 @@ write to C when all the block results are ready.</p>
 </div>
 <div class="section" id="parallel">
 <h2>Parallel<a class="headerlink" href="#parallel" title="Permalink to this headline">¶</a></h2>
-<p>Futhermore, we can also utilize multi-core processors to do the thread-level parallelization.</p>
+<p>Furthermore, we can also utilize multi-core processors to do the thread-level parallelization.</p>
 <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><a href="../../reference/api/python/te.html#tvm.te.Schedule" title="tvm.te.Schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">s</span></a> <span class="o">=</span> <a href="../../reference/api/python/te.html#tvm.te.create_schedule" title="tvm.te.create_schedule" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-function">< [...]
 
 <a href="../../reference/api/python/te.html#tvm.te.Tensor" title="tvm.te.Tensor" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">CC</span></a> <span class="o">=</span> <a href="../../reference/api/python/te.html#tvm.te.Schedule.cache_write" title="tvm.te.Schedule.cache_write" class="sphx-glr-backref-module-tvm-te sphx-glr-backref-type-py-method"><span class="n">s</span><span class="o">.</span><span class="n">cache_write</spa [...]
@@ -909,7 +909,7 @@ write to C when all the block results are ready.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Opt6: </span><span class="si">%f</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="n">opt6_time</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.147276
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.146406
 </pre></div>
 </div>
 <p>Here is the generated IR after parallelization.</p>
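Note on the opt_gemm.html timings above: the updated values can be compared against the baseline with a short script. This is only a sketch and is not part of the deployed docs. The numbers are copied from the "+" lines of the hunks above, the names NEW_TIMINGS and speedups are hypothetical, and the values are assumed to be seconds, as reported by the evaluator's mean.

```python
# Timings copied from the "+" lines of the opt_gemm.html hunks above.
# Assumption: values are seconds (the mean reported by the evaluator).
NEW_TIMINGS = {
    "Baseline": 3.316339,
    "Opt1": 0.305577,
    "Opt2": 0.336554,
    "Opt3": 0.115825,
    "Opt4": 0.108125,
    "Opt5": 0.111973,
    "Opt6": 0.146406,
}


def speedups(timings, reference="Baseline"):
    """Return each entry's speedup relative to the reference entry."""
    ref = timings[reference]
    return {name: ref / t for name, t in timings.items()}


if __name__ == "__main__":
    for name, factor in speedups(NEW_TIMINGS).items():
        print(f"{name}: {factor:.1f}x vs baseline")
```

On these numbers Opt4 is the fastest variant in this run, and Opt6 is slower than both Opt4 and Opt5. The script only restates the measurements; it does not explain them.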
diff --git a/docs/how_to/optimize_operators/sg_execution_times.html b/docs/how_to/optimize_operators/sg_execution_times.html
index e4b20f29a..13e68da16 100644
--- a/docs/how_to/optimize_operators/sg_execution_times.html
+++ b/docs/how_to/optimize_operators/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-optimize-operators-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:35.008</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
+<p><strong>00:34.475</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,15 +336,15 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="opt_gemm.html#sphx-glr-how-to-optimize-operators-opt-gemm-py"><span class="std std-ref">How to optimize GEMM on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_gemm.py</span></code>)</p></td>
-<td><p>00:32.738</p></td>
+<td><p>00:32.142</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="opt_conv_tensorcore.html#sphx-glr-how-to-optimize-operators-opt-conv-tensorcore-py"><span class="std std-ref">How to optimize convolution using TensorCores</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_tensorcore.py</span></code>)</p></td>
-<td><p>00:01.280</p></td>
+<td><p>00:01.279</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="opt_conv_cuda.html#sphx-glr-how-to-optimize-operators-opt-conv-cuda-py"><span class="std std-ref">How to optimize convolution on GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_cuda.py</span></code>)</p></td>
-<td><p>00:00.990</p></td>
+<td><p>00:01.053</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
index bfa9f22f3..d39cc49fb 100644
--- a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
+++ b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autoscheduler-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>06:10.861</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
+<p><strong>06:12.217</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 85%" />
@@ -336,27 +336,27 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_conv2d_layer_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py"><span class="std std-ref">Auto-scheduling a Convolution Layer for GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_layer_cuda.py</span></code>)</p></td>
-<td><p>03:19.348</p></td>
+<td><p>03:26.198</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_network_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-x86-py"><span class="std std-ref">Auto-scheduling a Neural Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_x86.py</span></code>)</p></td>
-<td><p>01:23.967</p></td>
+<td><p>01:22.571</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_network_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-cuda-py"><span class="std std-ref">Auto-scheduling a Neural Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_cuda.py</span></code>)</p></td>
-<td><p>00:48.069</p></td>
+<td><p>00:46.992</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_sparse_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-sparse-x86-py"><span class="std std-ref">Auto-scheduling Sparse Matrix Multiplication on CPU with Custom Sketch Rule</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_sparse_x86.py</span></code>)</p></td>
-<td><p>00:21.280</p></td>
+<td><p>00:18.754</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_network_mali.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-mali-py"><span class="std std-ref">Auto-scheduling a Neural Network for mali GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_mali.py</span></code>)</p></td>
-<td><p>00:09.184</p></td>
+<td><p>00:08.990</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_network_arm.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-arm-py"><span class="std std-ref">Auto-scheduling a Neural Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_arm.py</span></code>)</p></td>
-<td><p>00:09.013</p></td>
+<td><p>00:08.712</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
index c7b0cf3c2..13d34e811 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
@@ -475,6 +475,9 @@ file and apply it.</p>
 <span class="k">del</span> <span class="n">measure_ctx</span>
 </pre></div>
 </div>
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>.T
+</pre></div>
+</div>
 <p>We can lower the schedule to see the IR after auto-scheduling.
 The auto-scheduler correctly performs optimizations including multi-level tiling,
 cooperative fetching, unrolling and operator fusion.</p>
@@ -1004,7 +1007,7 @@ cooperative fetching, unrolling and operator fusion.</p>
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.364 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.366 ms
 </pre></div>
 </div>
 </div>
@@ -1567,7 +1570,7 @@ In the example below we resume the status and do more 5 trials.</p>
 Get devices for measurement successfully!
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  19.348 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  26.198 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e3e540f3b477c0c52d8eb73e674e8ffd/tune_conv2d_layer_cuda.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_conv2d_layer_cuda.py</span></code></a></p>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
index 6a153bc1a..e9df03e21 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
@@ -906,7 +906,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   9.9529       9.9579       9.9821       9.9188       0.0261
+  10.0015       9.9899      10.0619       9.9527       0.0453
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
index f5f2ae0b4..56adab97f 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
@@ -925,7 +925,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  761.4623     761.3826     762.0521     760.9522      0.4525
+  753.4226     753.4309     753.6561     753.1808      0.1941
 </pre></div>
 </div>
 </div>
@@ -947,7 +947,7 @@ to learn how to use the RPC Tracker and RPC Server.
 To use the RPC Tracker in auto-scheduler, replace the runner in <code class="code docutils literal notranslate"><span class="pre">TuningOptions</span></code>
 with <a class="reference internal" href="../../reference/api/python/auto_scheduler.html#tvm.auto_scheduler.RPCRunner" title="tvm.auto_scheduler.RPCRunner"><code class="xref any py py-class docutils literal notranslate"><span class="pre">auto_scheduler.RPCRunner</span></code></a>.</p></li>
 </ol>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  23.967 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  22.571 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-network-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e416b94ca1090b0897c0f6e0df95b911/tune_network_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_network_x86.py</span></code></a></p>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
index f4d1f4d8f..5f152a7fc 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
@@ -625,29 +625,79 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
              placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
              compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
   buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-  preflattened_buffer_map = {placeholder_8: placeholder_15: Buffer(placeholder_13, int32, [33], []), placeholder_7: placeholder_16: Buffer(placeholder_12, int32, [4916], []), placeholder_9: placeholder_17: Buffer(placeholder_14, float32, [128, 512], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_6: placeholder_19: Buffer(placeholder_11, float32, [4916, 16, 1], [])} {
-  for (i0.outer.i1.outer.fused: int32, 0, 32) &quot;parallel&quot; {
-    allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global {
-      for (i.outer.inner: int32, 0, 32) {
-        for (i.inner.init: int32, 0, 4) {
-          for (j.init: int32, 0, 16) {
-            compute_5: Buffer(compute_4, float32, [2048], [])[(((i.outer.inner*64) + (i.inner.init*16)) + j.init)] = 0f32
+  preflattened_buffer_map = {placeholder_9: placeholder_15: Buffer(placeholder_14, float32, [128, 512], []), placeholder_6: placeholder_16: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_7: placeholder_17: Buffer(placeholder_12, int32, [4916], []), placeholder_8: placeholder_18: Buffer(placeholder_13, int32, [33], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_5: placeholder_19: Buffer(placeholder_10, float32, [128, 256], [])} {
+  for (i0.outer: int32, 0, 2) &quot;parallel&quot; {
+    allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global;
+    for (i1.outer: int32, 0, 16) {
+      for (i.outer.inner: int32, 0, 8) {
+        for (nb_j.inner: int32, 0, 2) {
+          for (i.inner.init: int32, 0, 8) {
+            let cse_var_1: int32 = (((i.outer.inner*256) + (i.inner.init*32)) + (nb_j.inner*16))
+             {
+              compute_5: Buffer(compute_4, float32, [2048], [])[cse_var_1] = 0f32
+              compute_5[(cse_var_1 + 1)] = 0f32
+              compute_5[(cse_var_1 + 2)] = 0f32
+              compute_5[(cse_var_1 + 3)] = 0f32
+              compute_5[(cse_var_1 + 4)] = 0f32
+              compute_5[(cse_var_1 + 5)] = 0f32
+              compute_5[(cse_var_1 + 6)] = 0f32
+              compute_5[(cse_var_1 + 7)] = 0f32
+              compute_5[(cse_var_1 + 8)] = 0f32
+              compute_5[(cse_var_1 + 9)] = 0f32
+              compute_5[(cse_var_1 + 10)] = 0f32
+              compute_5[(cse_var_1 + 11)] = 0f32
+              compute_5[(cse_var_1 + 12)] = 0f32
+              compute_5[(cse_var_1 + 13)] = 0f32
+              compute_5[(cse_var_1 + 14)] = 0f32
+              compute_5[(cse_var_1 + 15)] = 0f32
+            }
           }
-        }
-        for (elem_idx: int32, 0, (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])) {
-          for (i.inner: int32, 0, 4) {
-            for (j: int32, 0, 16) {
-              if @tir.likely((elem_idx &lt; (placeholder_3[(i0.outer.i1.outer.fused + 1)] - placeholder_3[i0.outer.i1.outer.fused])), dtype=bool) {
-                let cse_var_1: int32 = (((i.outer.inner*64) + (i.inner*16)) + j)
-                compute_5[cse_var_1] = (compute_5[cse_var_1] + (placeholder_1[(((placeholder_3[i0.outer.i1.outer.fused]*16) + (elem_idx*16)) + j)]*max(placeholder[(((i.outer.inner*1024) + (i.inner*256)) + placeholder_2[(placeholder_3[i0.outer.i1.outer.fused] + elem_idx)])], 0f32)))
+          for (elem_idx: int32, 0, let cse_var_2: int32 = ((i1.outer*2) + nb_j.inner) in (placeholder_3[(cse_var_2 + 1)] - placeholder_3[cse_var_2])) {
+            for (i.inner: int32, 0, 8) {
+              let cse_var_21: int32 = (elem_idx*16)
+              let cse_var_20: int32 = ((i1.outer*2) + nb_j.inner)
+              let cse_var_19: int32 = (((i0.outer*16384) + (i.outer.inner*2048)) + (i.inner*256))
+              let cse_var_18: int32 = (((i.outer.inner*256) + (i.inner*32)) + (nb_j.inner*16))
+              let cse_var_17: int32 = (cse_var_18 + 9)
+              let cse_var_16: int32 = (cse_var_18 + 8)
+              let cse_var_15: int32 = (cse_var_18 + 7)
+              let cse_var_14: int32 = (cse_var_18 + 6)
+              let cse_var_13: int32 = (cse_var_18 + 5)
+              let cse_var_12: int32 = (cse_var_18 + 4)
+              let cse_var_11: int32 = (cse_var_18 + 3)
+              let cse_var_10: int32 = (cse_var_18 + 2)
+              let cse_var_9: int32 = (cse_var_18 + 15)
+              let cse_var_8: int32 = (cse_var_18 + 14)
+              let cse_var_7: int32 = (cse_var_18 + 13)
+              let cse_var_6: int32 = (cse_var_18 + 12)
+              let cse_var_5: int32 = (cse_var_18 + 11)
+              let cse_var_4: int32 = (cse_var_18 + 10)
+              let cse_var_3: int32 = (cse_var_18 + 1)
+               {
+                compute_5[cse_var_18] = (compute_5[cse_var_18] + (placeholder_1[((placeholder_3[cse_var_20]*16) + cse_var_21)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_3] = (compute_5[cse_var_3] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 1)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_10] = (compute_5[cse_var_10] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 2)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_11] = (compute_5[cse_var_11] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 3)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_12] = (compute_5[cse_var_12] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 4)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_13] = (compute_5[cse_var_13] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 5)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_14] = (compute_5[cse_var_14] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 6)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_15] = (compute_5[cse_var_15] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 7)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_16] = (compute_5[cse_var_16] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 8)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_17] = (compute_5[cse_var_17] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 9)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_4] = (compute_5[cse_var_4] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 10)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_5] = (compute_5[cse_var_5] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 11)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_6] = (compute_5[cse_var_6] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 12)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_7] = (compute_5[cse_var_7] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 13)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_8] = (compute_5[cse_var_8] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 14)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
+                compute_5[cse_var_9] = (compute_5[cse_var_9] + (placeholder_1[(((placeholder_3[cse_var_20]*16) + cse_var_21) + 15)]*max(placeholder[(cse_var_19 + placeholder_2[(placeholder_3[cse_var_20] + elem_idx)])], 0f32)))
               }
             }
           }
         }
       }
-      for (i0.inner: int32, 0, 128) {
-        let cse_var_2: int32 = ((i0.inner*512) + (i0.outer.i1.outer.fused*16))
-        compute[ramp(cse_var_2, 1, 16)] = max((compute_5[ramp((i0.inner*16), 1, 16)] + placeholder_4[ramp(cse_var_2, 1, 16)]), broadcast(0f32, 16))
+      for (i0.inner: int32, 0, 64) {
+        let cse_var_22: int32 = (((i0.outer*32768) + (i0.inner*512)) + (i1.outer*32))
+        compute[ramp(cse_var_22, 1, 32)] = max((compute_5[ramp((i0.inner*32), 1, 32)] + placeholder_4[ramp(cse_var_22, 1, 32)]), broadcast(0f32, 32))
       }
     }
   }
@@ -685,7 +735,7 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 1.465 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 1.808 ms
 </pre></div>
 </div>
 <div class="admonition note">
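Note on the tune_sparse_x86.html IR above: the index arithmetic of the new initialization loop can be checked with a small script. This is a sketch in plain Python rather than TVM. The loop extents and strides are read off the "+" lines of the IR, and the function name init_indices is hypothetical. It checks that, within one (i0.outer, i1.outer) iteration, the unrolled stores write each of the 2048 elements of compute_5 exactly once.

```python
def init_indices():
    """Enumerate the compute_5 indices written by the unrolled init loop."""
    indices = []
    for i_outer_inner in range(8):          # i.outer.inner
        for nb_j_inner in range(2):         # nb_j.inner
            for i_inner_init in range(8):   # i.inner.init
                cse_var_1 = (
                    i_outer_inner * 256 + i_inner_init * 32 + nb_j_inner * 16
                )
                # The IR unrolls 16 stores: cse_var_1 + 0 .. cse_var_1 + 15.
                indices.extend(cse_var_1 + k for k in range(16))
    return indices


if __name__ == "__main__":
    idx = init_indices()
    print(len(idx), len(set(idx)), min(idx), max(idx))
```

The same enumeration could be reused to check the accumulation indices (cse_var_18 and its offsets), since they follow the same stride pattern with i.inner in place of i.inner.init.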
diff --git a/docs/how_to/tune_with_autotvm/sg_execution_times.html b/docs/how_to/tune_with_autotvm/sg_execution_times.html
index 00e6d6e68..7a97022db 100644
--- a/docs/how_to/tune_with_autotvm/sg_execution_times.html
+++ b/docs/how_to/tune_with_autotvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autotvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:46.252</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
+<p><strong>00:45.843</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,15 +336,15 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_conv2d_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-conv2d-cuda-py"><span class="std std-ref">Tuning High Performance Convolution on NVIDIA GPUs</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_cuda.py</span></code>)</p></td>
-<td><p>00:46.213</p></td>
+<td><p>00:45.808</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_relay_x86.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-x86-py"><span class="std std-ref">Auto-tuning a Convolutional Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_x86.py</span></code>)</p></td>
-<td><p>00:00.023</p></td>
+<td><p>00:00.020</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_relay_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-cuda-py"><span class="std std-ref">Auto-tuning a Convolutional Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_cuda.py</span></code>)</p></td>
-<td><p>00:00.006</p></td>
+<td><p>00:00.005</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_relay_arm.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-arm-py"><span class="std std-ref">Auto-tuning a Convolutional Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_arm.py</span></code>)</p></td>
diff --git a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
index 3aaf5dc8f..060c6a542 100644
--- a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
+++ b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
@@ -1436,8 +1436,8 @@ No: 8   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
 TimeoutError
 
         [(&#39;tile_f&#39;, [-1, 2, 1, 64]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4909501
-No: 9   GFLOPS: 80.88/80.88     result: MeasureResult(costs=(0.002862254142857143,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6815993785858154, timestamp=1660956105.774105)        [(&#39;tile_f&#39;, [-1, 1, 4, 8]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 2, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5072689
-No: 10  GFLOPS: 0.00/80.88      result: Traceback (most recent call last):
+No: 9   GFLOPS: 181.84/181.84   result: MeasureResult(costs=(0.0012730925666666667,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.0782089233398438, timestamp=1660955750.7828293)      [(&#39;tile_f&#39;, [-1, 1, 4, 8]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 2, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5072689
+No: 10  GFLOPS: 0.00/181.84     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1560,8 +1560,8 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 4, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 64, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,5092711
-No: 11  GFLOPS: 260.94/260.94   result: MeasureResult(costs=(0.0008871673259668507,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7710692882537842, timestamp=1660956106.696446)       [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
-No: 12  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+No: 11  GFLOPS: 261.55/261.55   result: MeasureResult(costs=(0.0008851229944751381,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7694880962371826, timestamp=1660955751.698509)       [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
+No: 12  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1684,7 +1684,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 128, 1, 2]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 256]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 0)],None,183542
-No: 13  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+No: 13  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1807,7 +1807,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 8, 8]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 64]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2482196
-No: 14  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+No: 14  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -1930,9 +1930,9 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 64, 1, 4]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 2]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,10306226
-No: 15  GFLOPS: 5.28/260.94     result: MeasureResult(costs=(0.043832007,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8575994968414307, timestamp=1660956111.309146) [(&#39;tile_f&#39;, [-1, 2, 2, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 8]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,5330964
-No: 16  GFLOPS: 3.35/260.94     result: MeasureResult(costs=(0.06914761450000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.611823081970215, timestamp=1660956112.54857)   [(&#39;tile_f&#39;, [-1, 8, 4, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2140058
-No: 17  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+No: 15  GFLOPS: 5.47/261.55     result: MeasureResult(costs=(0.04229653225,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.8152391910552979, timestamp=1660955756.22979)        [(&#39;tile_f&#39;, [-1, 2, 2, 8]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 8]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,5330964
+No: 16  GFLOPS: 3.33/261.55     result: MeasureResult(costs=(0.06949369325,), error_no=MeasureErrorNo.NO_ERROR, all_cost=4.554236173629761, timestamp=1660955757.4622016)       [(&#39;tile_f&#39;, [-1, 8, 4, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2140058
+No: 17  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 142, in build
     res = future.result()
   File &quot;/usr/lib/python3.7/concurrent/futures/_base.py&quot;, line 435, in result
@@ -1950,8 +1950,8 @@ No: 17  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
 TimeoutError
 
         [(&#39;tile_f&#39;, [-1, 2, 2, 1]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 16]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,10195251
-No: 18  GFLOPS: 28.12/260.94    result: MeasureResult(costs=(0.008233555571428571,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2862968444824219, timestamp=1660956123.5834103)       [(&#39;tile_f&#39;, [-1, 4, 8, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6068603
-No: 19  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+No: 18  GFLOPS: 26.90/261.55    result: MeasureResult(costs=(0.008606866785714285,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.2779605388641357, timestamp=1660955768.4915335)       [(&#39;tile_f&#39;, [-1, 4, 8, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6068603
+No: 19  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -2074,7 +2074,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 871, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 16, 4, 8]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 128]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6956993
-No: 20  GFLOPS: 0.00/260.94     result: Traceback (most recent call last):
+No: 20  GFLOPS: 0.00/261.55     result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 588, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 540, in _build_func_common
@@ -2237,7 +2237,7 @@ and measure running time.</p>
 Best config:
 [(&#39;tile_f&#39;, [-1, 8, 2, 1]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 2, 1]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4264713
 Finish loading 20 records
-Time cost of this operator: 0.001251
+Time cost of this operator: 0.001285
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autotvm-tune-conv2d-cuda-py">
diff --git a/docs/how_to/work_with_microtvm/micro_autotune.html b/docs/how_to/work_with_microtvm/micro_autotune.html
index 5363bb2f8..c1e8dd43a 100644
--- a/docs/how_to/work_with_microtvm/micro_autotune.html
+++ b/docs/how_to/work_with_microtvm/micro_autotune.html
@@ -584,10 +584,10 @@ the tuned operator.</p>
 ########## Build without Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
 ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  321.6     98.753   (1, 2, 10, 10, 3)  2       1        [321.6]
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.108     0.954    (1, 6, 10, 10)     1       1        [3.108]
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.953     0.293    (1, 1, 10, 10, 3)  1       1        [0.953]
-Total_time                                    -                                             325.662   -        -                  -       -        -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  311.4     98.717   (1, 2, 10, 10, 3)  2       1        [311.4]
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.045     0.965    (1, 6, 10, 10)     1       1        [3.045]
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         1.002     0.318    (1, 1, 10, 10, 3)  1       1        [1.002]
+Total_time                                    -                                             315.448   -        -                  -       -        -
 </pre></div>
 </div>
 </div>
@@ -640,10 +640,10 @@ Total_time                                    -
 ########## Build with Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
 ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  130.3     97.893   (1, 6, 10, 10, 1)  2       1        [130.3]
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.804     1.356    (1, 6, 10, 10)     1       1        [1.804]
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         1.0       0.751    (1, 1, 10, 10, 3)  1       1        [1.0]
-Total_time                                    -                                             133.104   -        -                  -       -        -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  151.2     98.223   (1, 6, 10, 10, 1)  2       1        [151.2]
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.772     1.151    (1, 6, 10, 10)     1       1        [1.772]
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.963     0.626    (1, 1, 10, 10, 3)  1       1        [0.963]
+Total_time                                    -                                             153.935   -        -                  -       -        -
 </pre></div>
 </div>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-work-with-microtvm-micro-autotune-py">
diff --git a/docs/how_to/work_with_microtvm/micro_train.html b/docs/how_to/work_with_microtvm/micro_train.html
index fac341955..4772ec619 100644
--- a/docs/how_to/work_with_microtvm/micro_train.html
+++ b/docs/how_to/work_with_microtvm/micro_train.html
@@ -516,7 +516,7 @@ take about <strong>2 minutes</strong> to download the Stanford Cars, while COCO
 <a href="https://docs.python.org/3/library/shutil.html#shutil.move" title="shutil.move" class="sphx-glr-backref-module-shutil sphx-glr-backref-type-py-function"><span class="n">shutil</span><span class="o">.</span><span class="n">move</span></a><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><a href="https://docs.python.org/3/library/stdtypes.html#str" title="builtins.str" class="sphx-glr-backref-module-builtins sphx-glr-backref-typ [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&#39;/tmp/tmps76zrvrg/images/random&#39;
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&#39;/tmp/tmpwlew76_p/images/random&#39;
 </pre></div>
 </div>
 </div>
@@ -576,8 +576,8 @@ objects to other stuff? We can display some examples from our datasets using <co
     <span class="n">plt</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&quot;off&quot;</span><span class="p">)</span>
 </pre></div>
 </div>
-<img src="../../_images/sphx_glr_micro_train_001.png" srcset="../../_images/sphx_glr_micro_train_001.png" alt="[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/tmp/tmps76zrvrg/images/target contains 8144 images
-/tmp/tmps76zrvrg/images/random contains 5000 images
+<img src="../../_images/sphx_glr_micro_train_001.png" srcset="../../_images/sphx_glr_micro_train_001.png" alt="[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]" class = "sphx-glr-single-img"/><div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/tmp/tmpwlew76_p/images/target contains 8144 images
+/tmp/tmpwlew76_p/images/random contains 5000 images
 </pre></div>
 </div>
 </div>
@@ -689,13 +689,13 @@ the time on our validation set).</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Epoch 1/3
-328/328 - 55s - loss: 0.2238 - accuracy: 0.9232 - val_loss: 0.1414 - val_accuracy: 0.9551
+328/328 - 55s - loss: 0.2688 - accuracy: 0.9145 - val_loss: 0.1875 - val_accuracy: 0.9456
 Epoch 2/3
-328/328 - 53s - loss: 0.1021 - accuracy: 0.9629 - val_loss: 0.1238 - val_accuracy: 0.9596
+328/328 - 52s - loss: 0.1037 - accuracy: 0.9618 - val_loss: 0.1280 - val_accuracy: 0.9630
 Epoch 3/3
-328/328 - 52s - loss: 0.0651 - accuracy: 0.9757 - val_loss: 0.1455 - val_accuracy: 0.9532
+328/328 - 52s - loss: 0.0645 - accuracy: 0.9758 - val_loss: 0.1197 - val_accuracy: 0.9660
 
-&lt;keras.callbacks.History object at 0x7f39001d2810&gt;
+&lt;keras.callbacks.History object at 0x7f89d005d590&gt;
 </pre></div>
 </div>
 </div>
@@ -957,7 +957,7 @@ as intended.</p>
 <p>From here, we could modify the model to read live images from the camera - we have another
 Arduino tutorial for how to do that <a class="reference external" href="https://github.com/guberti/tvm-arduino-demos/tree/master/examples/person_detection">on GitHub</a>. Alternatively, we could also
 <a class="reference external" href="https://tvm.apache.org/docs/how_to/work_with_microtvm/micro_autotune.html">use TVM’s autotuning capabilities</a> to dramatically improve the model’s performance.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 4 minutes  45.650 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 5 minutes  33.813 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-work-with-microtvm-micro-train-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/b52cec46baf4f78d6bcd94cbe269c8a6/micro_train.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">micro_train.py</span></code></a></p>
diff --git a/docs/how_to/work_with_microtvm/sg_execution_times.html b/docs/how_to/work_with_microtvm/sg_execution_times.html
index da157f873..7e2ade92b 100644
--- a/docs/how_to/work_with_microtvm/sg_execution_times.html
+++ b/docs/how_to/work_with_microtvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-microtvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>05:40.534</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
+<p><strong>06:27.051</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,19 +336,19 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_train.html#sphx-glr-how-to-work-with-microtvm-micro-train-py"><span class="std std-ref">Training Vision Models for microTVM on Arduino</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_train.py</span></code>)</p></td>
-<td><p>04:45.650</p></td>
+<td><p>05:33.813</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="micro_autotune.html#sphx-glr-how-to-work-with-microtvm-micro-autotune-py"><span class="std std-ref">Autotuning with microTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_autotune.py</span></code>)</p></td>
-<td><p>00:43.578</p></td>
+<td><p>00:42.328</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_aot.html#sphx-glr-how-to-work-with-microtvm-micro-aot-py"><span class="std std-ref">microTVM Host-Driven AoT</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_aot.py</span></code>)</p></td>
-<td><p>00:07.836</p></td>
+<td><p>00:07.591</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="micro_tflite.html#sphx-glr-how-to-work-with-microtvm-micro-tflite-py"><span class="std std-ref">microTVM with TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_tflite.py</span></code>)</p></td>
-<td><p>00:03.468</p></td>
+<td><p>00:03.318</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="micro_ethosu.html#sphx-glr-how-to-work-with-microtvm-micro-ethosu-py"><span class="std std-ref">Running TVM on bare metal Arm(R) Cortex(R)-M55 CPU and Ethos(TM)-U55 NPU with CMSIS-NN</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_ethosu.py</span></code>)</p></td>
diff --git a/docs/how_to/work_with_relay/sg_execution_times.html b/docs/how_to/work_with_relay/sg_execution_times.html
index a0f0d4a32..f8bec6064 100644
--- a/docs/how_to/work_with_relay/sg_execution_times.html
+++ b/docs/how_to/work_with_relay/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-relay-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:45.905</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
+<p><strong>00:41.565</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,15 +336,15 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="using_pipeline_executor.html#sphx-glr-how-to-work-with-relay-using-pipeline-executor-py"><span class="std std-ref">Using Pipeline Executor in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_pipeline_executor.py</span></code>)</p></td>
-<td><p>00:32.611</p></td>
+<td><p>00:31.265</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="using_external_lib.html#sphx-glr-how-to-work-with-relay-using-external-lib-py"><span class="std std-ref">Using External Libraries in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_external_lib.py</span></code>)</p></td>
-<td><p>00:11.160</p></td>
+<td><p>00:08.753</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="build_gcn.html#sphx-glr-how-to-work-with-relay-build-gcn-py"><span class="std std-ref">Building a Graph Convolutional Network</span></a> (<code class="docutils literal notranslate"><span class="pre">build_gcn.py</span></code>)</p></td>
-<td><p>00:02.127</p></td>
+<td><p>00:01.539</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="using_relay_viz.html#sphx-glr-how-to-work-with-relay-using-relay-viz-py"><span class="std std-ref">Use Relay Visualizer to Visualize Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_relay_viz.py</span></code>)</p></td>
diff --git a/docs/how_to/work_with_schedules/intrin_math.html b/docs/how_to/work_with_schedules/intrin_math.html
index 8845202ba..9796c679b 100644
--- a/docs/how_to/work_with_schedules/intrin_math.html
+++ b/docs/how_to/work_with_schedules/intrin_math.html
@@ -522,7 +522,7 @@ The following example customizes CUDA lowering rule for <code class="code docuti
 <a href="../../reference/api/python/ir.html#tvm.ir.register_intrin_lowering" title="tvm.ir.register_intrin_lowering" class="sphx-glr-backref-module-tvm-ir sphx-glr-backref-type-py-function"><span class="n">register_intrin_lowering</span></a><span class="p">(</span><span class="s2">&quot;tir.exp&quot;</span><span class="p">,</span> <span class="n">target</span><span class="o">=</span><span class="s2">&quot;cuda&quot;</span><span class="p">,</span> <span class="n">f</span><span class="o">= [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&lt;function my_cuda_math_rule at 0x7f39011f1710&gt;
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>&lt;function my_cuda_math_rule at 0x7f8a421d0dd0&gt;
 </pre></div>
 </div>
 <p>Register the rule to TVM with override option to override existing rule.
diff --git a/docs/how_to/work_with_schedules/sg_execution_times.html b/docs/how_to/work_with_schedules/sg_execution_times.html
index 1fe54c3d3..9cdaff551 100644
--- a/docs/how_to/work_with_schedules/sg_execution_times.html
+++ b/docs/how_to/work_with_schedules/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-schedules-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:04.238</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
+<p><strong>00:04.157</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,23 +336,23 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="intrin_math.html#sphx-glr-how-to-work-with-schedules-intrin-math-py"><span class="std std-ref">Intrinsics and Math Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">intrin_math.py</span></code>)</p></td>
-<td><p>00:01.934</p></td>
+<td><p>00:01.917</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tensorize.html#sphx-glr-how-to-work-with-schedules-tensorize-py"><span class="std std-ref">Use Tensorize to Leverage Hardware Intrinsics</span></a> (<code class="docutils literal notranslate"><span class="pre">tensorize.py</span></code>)</p></td>
-<td><p>00:01.045</p></td>
+<td><p>00:00.996</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="reduction.html#sphx-glr-how-to-work-with-schedules-reduction-py"><span class="std std-ref">Reduction</span></a> (<code class="docutils literal notranslate"><span class="pre">reduction.py</span></code>)</p></td>
-<td><p>00:00.544</p></td>
+<td><p>00:00.539</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="scan.html#sphx-glr-how-to-work-with-schedules-scan-py"><span class="std std-ref">Scan and Recurrent Kernel</span></a> (<code class="docutils literal notranslate"><span class="pre">scan.py</span></code>)</p></td>
-<td><p>00:00.524</p></td>
+<td><p>00:00.520</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="extern_op.html#sphx-glr-how-to-work-with-schedules-extern-op-py"><span class="std std-ref">External Tensor Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">extern_op.py</span></code>)</p></td>
-<td><p>00:00.105</p></td>
+<td><p>00:00.101</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="schedule_primitives.html#sphx-glr-how-to-work-with-schedules-schedule-primitives-py"><span class="std std-ref">Schedule Primitives in TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">schedule_primitives.py</span></code>)</p></td>
@@ -360,7 +360,7 @@
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="tedd.html#sphx-glr-how-to-work-with-schedules-tedd-py"><span class="std std-ref">Use Tensor Expression Debug Display (TEDD) for Visualization</span></a> (<code class="docutils literal notranslate"><span class="pre">tedd.py</span></code>)</p></td>
-<td><p>00:00.027</p></td>
+<td><p>00:00.028</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tuple_inputs.html#sphx-glr-how-to-work-with-schedules-tuple-inputs-py"><span class="std std-ref">Compute and Reduce with Tuple Inputs</span></a> (<code class="docutils literal notranslate"><span class="pre">tuple_inputs.py</span></code>)</p></td>
diff --git a/docs/how_to/work_with_schedules/tensorize.html b/docs/how_to/work_with_schedules/tensorize.html
index 55d6f7b43..942fb9056 100644
--- a/docs/how_to/work_with_schedules/tensorize.html
+++ b/docs/how_to/work_with_schedules/tensorize.html
@@ -577,7 +577,7 @@ The importing needs to happen before the tensorized GEMV being executed.</p>
              C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
   buffer_map = {A_1: A, B_1: B, C_1: C}
   preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmp2atr81uo/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmp2atr81uo/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
+  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmpf8ixcx8o/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmpf8ixcx8o/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
   for (i, 0, 1024) {
     for (j.outer: int32, 0, 32) {
       @tir.call_extern(&quot;gemv_update&quot;, @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
diff --git a/docs/index.html b/docs/index.html
index 170ebdc1f..76b30c405 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -328,7 +328,7 @@
             
   <div class="section" id="apache-tvm-documentation">
 <h1>Apache TVM Documentation<a class="headerlink" href="#apache-tvm-documentation" title="Permalink to this headline">¶</a></h1>
-<p>Welcome to the the documentation for Apache TVM, a deep learning compiler that
+<p>Welcome to the documentation for Apache TVM, a deep learning compiler that
 enables access to high-performance machine learning anywhere for everyone.
 TVM’s diverse community of hardware vendors, compiler engineers and ML
 researchers work together to build a unified, programmable software stack, that
diff --git a/docs/reference/api/doxygen/annotated.html b/docs/reference/api/doxygen/annotated.html
index c10f8ef5d..d7eacf37d 100644
--- a/docs/reference/api/doxygen/annotated.html
+++ b/docs/reference/api/doxygen/annotated.html
@@ -112,7 +112,7 @@ $(function() {
 <tr id="row_1_1_8_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStep.html" target="_self">CacheReadStep</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStepNode.html" title="Cache read step that corresponds to te::Schedule::cache_read. ">CacheReadStep [...]
 <tr id="row_1_1_9_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStepNode.html" target="_self">CacheReadStepNode</a></td><td class="desc">Cache read step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#a38ef95a62faf0c15f132847efa20249b" title="create a cache read of original tensor for [...]
 <tr id="row_1_1_10_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStep.html" target="_self">CacheWriteStep</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html" title="Cache write step that corresponds to te::Schedule::cache_write. ">CacheWr [...]
-<tr id="row_1_1_11_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html" target="_self">CacheWriteStepNode</a></td><td class="desc">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for produc [...]
+<tr id="row_1_1_11_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html" target="_self">CacheWriteStepNode</a></td><td class="desc">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for produc [...]
 <tr id="row_1_1_12_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStep.html" target="_self">ComputeAtStep</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStepNode.html" title="Compute at step that corresponds to te::Stage::compute_at. ">ComputeAtStepNo [...]
 <tr id="row_1_1_13_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStepNode.html" target="_self">ComputeAtStepNode</a></td><td class="desc">Compute at step that corresponds to <a class="el" href="classtvm_1_1te_1_1Stage.html#a071545484de7a894c01ccf0e77183730" title="specify the schedule to be computed at the p [...]
 <tr id="row_1_1_14_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAG.html" target="_self">ComputeDAG</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAGNode.html" title="The auto-scheduler&#39;s computational graph and related program analyses. ">ComputeD [...]
diff --git a/docs/reference/api/doxygen/c__backend__api_8h.html b/docs/reference/api/doxygen/c__backend__api_8h.html
index 43e4d39b6..6d783506c 100644
--- a/docs/reference/api/doxygen/c__backend__api_8h.html
+++ b/docs/reference/api/doxygen/c__backend__api_8h.html
@@ -177,7 +177,7 @@ Functions</h2></td></tr>
     <tr><td class="paramname">args</td><td>The arguments </td></tr>
     <tr><td class="paramname">type_codes</td><td>The type codes of the arguments </td></tr>
     <tr><td class="paramname">num_args</td><td>Number of arguments. </td></tr>
-    <tr><td class="paramname">out_ret_value</td><td>The output value of the the return value. </td></tr>
+    <tr><td class="paramname">out_ret_value</td><td>The output value of the return value. </td></tr>
     <tr><td class="paramname">out_ret_tcode</td><td>The output type code of the return value. </td></tr>
     <tr><td class="paramname">resource_handle</td><td>Pointer to associated resource.</td></tr>
   </table>
diff --git a/docs/reference/api/doxygen/classtvm_1_1FuncTypeNode.html b/docs/reference/api/doxygen/classtvm_1_1FuncTypeNode.html
index 1483c74d4..8cd079db8 100644
--- a/docs/reference/api/doxygen/classtvm_1_1FuncTypeNode.html
+++ b/docs/reference/api/doxygen/classtvm_1_1FuncTypeNode.html
@@ -408,7 +408,7 @@ Additional Inherited Members</h2></td></tr>
 </div><div class="memdoc">
 
 <p>potential constraint the type need to obey </p>
-<dl class="section note"><dt>Note</dt><dd>this field is reserved for futher purposes. </dd></dl>
+<dl class="section note"><dt>Note</dt><dd>this field is reserved for further purposes. </dd></dl>
 
 </div>
 </div>
diff --git a/docs/reference/api/doxygen/classtvm_1_1TargetNode.html b/docs/reference/api/doxygen/classtvm_1_1TargetNode.html
index 4b58a1d89..d1f711c9d 100644
--- a/docs/reference/api/doxygen/classtvm_1_1TargetNode.html
+++ b/docs/reference/api/doxygen/classtvm_1_1TargetNode.html
@@ -163,7 +163,7 @@ Public Attributes</h2></td></tr>
 <tr class="memdesc:abdeae1bf6e037771b1b931f26dba15c6"><td class="mdescLeft">&#160;</td><td class="mdescRight"><a class="el" href="classtvm_1_1Target.html" title="Managed reference class to TargetNode. ">Target</a> host information, must be <a class="el" href="classtvm_1_1Target.html" title="Managed reference class to TargetNode. ">Target</a> type.  <a href="#abdeae1bf6e037771b1b931f26dba15c6">More...</a><br /></td></tr>
 <tr class="separator:abdeae1bf6e037771b1b931f26dba15c6"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a3046260cd16b7b134fa99705b41d2aee"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1String.html">String</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1TargetNode.html#a3046260cd16b7b134fa99705b41d2aee">tag</a></td></tr>
-<tr class="memdesc:a3046260cd16b7b134fa99705b41d2aee"><td class="mdescLeft">&#160;</td><td class="mdescRight">Tag of the the target, can be empty.  <a href="#a3046260cd16b7b134fa99705b41d2aee">More...</a><br /></td></tr>
+<tr class="memdesc:a3046260cd16b7b134fa99705b41d2aee"><td class="mdescLeft">&#160;</td><td class="mdescRight">Tag of the target, can be empty.  <a href="#a3046260cd16b7b134fa99705b41d2aee">More...</a><br /></td></tr>
 <tr class="separator:a3046260cd16b7b134fa99705b41d2aee"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:aec9e821b23172eb9460f46df0dc346fb"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1runtime_1_1String.html">String</a> &gt;&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1TargetNode.html#aec9e821b23172eb9460f46df0dc346fb">keys</a></td></tr>
 <tr class="memdesc:aec9e821b23172eb9460f46df0dc346fb"><td class="mdescLeft">&#160;</td><td class="mdescRight">Keys for this target.  <a href="#aec9e821b23172eb9460f46df0dc346fb">More...</a><br /></td></tr>
@@ -857,7 +857,7 @@ template&lt;typename TObjectRef &gt; </div>
       </table>
 </div><div class="memdoc">
 
-<p>Tag of the the target, can be empty. </p>
+<p>Tag of the target, can be empty. </p>
 
 </div>
 </div>
diff --git a/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html b/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html
index 338b016b0..b92186a12 100644
--- a/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html
+++ b/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html
@@ -72,7 +72,7 @@ $(function() {
 </div><!--header-->
 <div class="contents">
 
-<p>Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The the tensor will take over body of original tens...">te::Schedule::cache_write</a>.  
+<p>Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The tensor will take over body of original tensor o...">te::Schedule::cache_write</a>.  
  <a href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html#details">More...</a></p>
 
 <p><code>#include &lt;<a class="el" href="transform__step_8h_source.html">transform_step.h</a>&gt;</code></p>
@@ -213,7 +213,7 @@ Additional Inherited Members</h2></td></tr>
 <tr class="separator:af4407d2b59132e803ff791482dbe0145 inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
 </table>
 <a name="details" id="details"></a><h2 class="groupheader">Detailed Description</h2>
-<div class="textblock"><p>Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The the tensor will take over body of original tens...">te::Schedule::cache_write</a>. </p>
+<div class="textblock"><p>Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The tensor will take over body of original tensor o...">te::Schedule::cache_write</a>. </p>
 <dl class="section note"><dt>Note</dt><dd>Cache write step will add an extra stage to the original <a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAG.html" title="Managed reference to ComputeDAGNode. ">ComputeDAG</a>, a up-to-date <a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAG.html" title="Managed reference to ComputeDAGNode. ">ComputeDAG</a> is stored in <a class="el" href="classtvm_1_1auto__scheduler_1_1State.html" title="Managed reference to StateNode. ">Stat [...]
 </div><h2 class="groupheader">Member Function Documentation</h2>
 <a id="a734ea6cd3430fda8db43c78b2d6ea166"></a>
diff --git a/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1State.html b/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1State.html
index 5e2fa27fb..7f0ebb785 100644
--- a/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1State.html
+++ b/docs/reference/api/doxygen/classtvm_1_1auto__scheduler_1_1State.html
@@ -141,7 +141,7 @@ Public Member Functions</h2></td></tr>
 <tr class="memdesc:a4ff71b692f0eabfabf515ed91b59a116"><td class="mdescLeft">&#160;</td><td class="mdescRight">The schedule primitive corresponding to <code><a class="el" href="classtvm_1_1te_1_1Schedule.html#a38ef95a62faf0c15f132847efa20249b" title="create a cache read of original tensor for readers. This will mutate the body of the readers...">te::Schedule::cache_read</a></code>.  <a href="#a4ff71b692f0eabfabf515ed91b59a116">More...</a><br /></td></tr>
 <tr class="separator:a4ff71b692f0eabfabf515ed91b59a116"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a385adc36d7cb242e8204fe14c4df8335"><td class="memItemLeft" align="right" valign="top">int&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1auto__scheduler_1_1State.html#a385adc36d7cb242e8204fe14c4df8335">cache_write</a> (int stage_id, const <a class="el" href="classtvm_1_1runtime_1_1String.html">String</a> &amp;scope_name, const <a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAG.html">ComputeDAG</a> &amp;dag)</td></tr>
-<tr class="memdesc:a385adc36d7cb242e8204fe14c4df8335"><td class="mdescLeft">&#160;</td><td class="mdescRight">The schedule primitive corresponding to <code><a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The the tensor will take over body of original tens...">te::Schedule::cache_write</a></code>.  <a href="#a385adc36d7cb242e8204fe14c4df8335">More...</a><br /></td></tr>
+<tr class="memdesc:a385adc36d7cb242e8204fe14c4df8335"><td class="mdescLeft">&#160;</td><td class="mdescRight">The schedule primitive corresponding to <code><a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The tensor will take over body of original tensor o...">te::Schedule::cache_write</a></code>.  <a href="#a385adc36d7cb242e8204fe14c4df8335">More...</a><br /></td></tr>
 <tr class="separator:a385adc36d7cb242e8204fe14c4df8335"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a21c27b06d439267f8b981fa05c5f48a0"><td class="memItemLeft" align="right" valign="top">int&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1auto__scheduler_1_1State.html#a21c27b06d439267f8b981fa05c5f48a0">rfactor</a> (int stage_id, const <a class="el" href="classtvm_1_1auto__scheduler_1_1Iterator.html">Iterator</a> &amp;it, int factor_iter_id, const <a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAG.html">ComputeDAG</a> &amp; [...]
 <tr class="memdesc:a21c27b06d439267f8b981fa05c5f48a0"><td class="mdescLeft">&#160;</td><td class="mdescRight">The schedule primitive corresponding to <code><a class="el" href="classtvm_1_1te_1_1Schedule.html#a34ae85add41bbed0140726d024d08862" title="Factor a reduction axis in tensor&#39;s schedule to be an explicit axis. This will create a new stage tha...">te::Schedule::rfactor</a></code>.  <a href="#a21c27b06d439267f8b981fa05c5f48a0">More...</a><br /></td></tr>
@@ -381,7 +381,7 @@ Additional Inherited Members</h2></td></tr>
       </table>
 </div><div class="memdoc">
 
-<p>The schedule primitive corresponding to <code><a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The the tensor will take over body of original tens...">te::Schedule::cache_write</a></code>. </p>
+<p>The schedule primitive corresponding to <code><a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The tensor will take over body of original tensor o...">te::Schedule::cache_write</a></code>. </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">stage_id</td><td>The index of the stage to be cache_write. </td></tr>
diff --git a/docs/reference/api/doxygen/classtvm_1_1meta__schedule_1_1PyCostModelNode.html b/docs/reference/api/doxygen/classtvm_1_1meta__schedule_1_1PyCostModelNode.html
index 520cdd3b9..d554965bf 100644
--- a/docs/reference/api/doxygen/classtvm_1_1meta__schedule_1_1PyCostModelNode.html
+++ b/docs/reference/api/doxygen/classtvm_1_1meta__schedule_1_1PyCostModelNode.html
@@ -303,7 +303,7 @@ Additional Inherited Members</h2></td></tr>
   <table class="params">
     <tr><td class="paramname">context</td><td>The tuning context. </td></tr>
     <tr><td class="paramname">candidates</td><td>The measure candidates. </td></tr>
-    <tr><td class="paramname">p_addr</td><td>The address to save the the estimated running results. </td></tr>
+    <tr><td class="paramname">p_addr</td><td>The address to save the estimated running results. </td></tr>
   </table>
   </dd>
 </dl>
diff --git a/docs/reference/api/doxygen/classtvm_1_1te_1_1Schedule.html b/docs/reference/api/doxygen/classtvm_1_1te_1_1Schedule.html
index 5f1fb8347..05d5d3a65 100644
--- a/docs/reference/api/doxygen/classtvm_1_1te_1_1Schedule.html
+++ b/docs/reference/api/doxygen/classtvm_1_1te_1_1Schedule.html
@@ -122,10 +122,10 @@ Public Member Functions</h2></td></tr>
 <tr class="memdesc:a38ef95a62faf0c15f132847efa20249b"><td class="mdescLeft">&#160;</td><td class="mdescRight">create a cache read of original tensor for readers. This will mutate the body of the readers. A new stage will be created for the tensor.  <a href="#a38ef95a62faf0c15f132847efa20249b">More...</a><br /></td></tr>
 <tr class="separator:a38ef95a62faf0c15f132847efa20249b"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:ada9825f59ef130a0ab0b3a01ea348d71"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a> &gt;&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71">cache_write</a> (const <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="c [...]
-<tr class="memdesc:ada9825f59ef130a0ab0b3a01ea348d71"><td class="mdescLeft">&#160;</td><td class="mdescRight">Create a cache write tensor for producing tensor. The the tensor will take over body of original tensor op.  <a href="#ada9825f59ef130a0ab0b3a01ea348d71">More...</a><br /></td></tr>
+<tr class="memdesc:ada9825f59ef130a0ab0b3a01ea348d71"><td class="mdescLeft">&#160;</td><td class="mdescRight">Create a cache write tensor for producing tensor. The tensor will take over body of original tensor op.  <a href="#ada9825f59ef130a0ab0b3a01ea348d71">More...</a><br /></td></tr>
 <tr class="separator:ada9825f59ef130a0ab0b3a01ea348d71"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a15582f96d0aaf9a2bd9f2afcad3935d4"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1te_1_1Schedule.html#a15582f96d0aaf9a2bd9f2afcad3935d4">cache_write</a> (const <a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a> &amp;tensor, const std::string &amp;scope)</td></tr>
-<tr class="memdesc:a15582f96d0aaf9a2bd9f2afcad3935d4"><td class="mdescLeft">&#160;</td><td class="mdescRight">Create a cache write tensor for producing tensor. The the tensor will take over body of original tensor op.  <a href="#a15582f96d0aaf9a2bd9f2afcad3935d4">More...</a><br /></td></tr>
+<tr class="memdesc:a15582f96d0aaf9a2bd9f2afcad3935d4"><td class="mdescLeft">&#160;</td><td class="mdescRight">Create a cache write tensor for producing tensor. The tensor will take over body of original tensor op.  <a href="#a15582f96d0aaf9a2bd9f2afcad3935d4">More...</a><br /></td></tr>
 <tr class="separator:a15582f96d0aaf9a2bd9f2afcad3935d4"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a34ae85add41bbed0140726d024d08862"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a> &gt;&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1te_1_1Schedule.html#a34ae85add41bbed0140726d024d08862">rfactor</a> (const <a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a> &amp;tensor, const <a class="el" [...]
 <tr class="memdesc:a34ae85add41bbed0140726d024d08862"><td class="mdescLeft">&#160;</td><td class="mdescRight">Factor a reduction axis in tensor's schedule to be an explicit axis. This will create a new stage that generated the new tensor with axis as the first dimension. The tensor's body will be rewritten as a reduction over the factored tensor.  <a href="#a34ae85add41bbed0140726d024d08862">More...</a><br /></td></tr>
@@ -377,7 +377,7 @@ Additional Inherited Members</h2></td></tr>
       </table>
 </div><div class="memdoc">
 
-<p>Create a cache write tensor for producing tensor. The the tensor will take over body of original tensor op. </p>
+<p>Create a cache write tensor for producing tensor. The tensor will take over body of original tensor op. </p>
 <p>This function can be used to do data layout transformation. If there is a split/fuse/reorder on the data parallel axis of tensor before cache_write is called. The intermediate cache stores the data in the layout as the iteration order of leave axis. The data will be transformed back to the original layout in the original tensor. User can further call compute_inline to inline the original layout and keep the data stored in the transformed layout.</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
@@ -416,7 +416,7 @@ Additional Inherited Members</h2></td></tr>
       </table>
 </div><div class="memdoc">
 
-<p>Create a cache write tensor for producing tensor. The the tensor will take over body of original tensor op. </p>
+<p>Create a cache write tensor for producing tensor. The tensor will take over body of original tensor op. </p>
 <p>This function can be used to do data layout transformation. If there is a split/fuse/reorder on the data parallel axis of tensor before cache_write is called. The intermediate cache stores the data in the layout as the iteration order of leave axis. The data will be transformed back to the original layout in the original tensor. User can further call compute_inline to inline the original layout and keep the data stored in the transformed layout.</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
diff --git a/docs/reference/api/doxygen/hierarchy.html b/docs/reference/api/doxygen/hierarchy.html
index cd7765d61..9404901a7 100644
--- a/docs/reference/api/doxygen/hierarchy.html
+++ b/docs/reference/api/doxygen/hierarchy.html
@@ -234,7 +234,7 @@ This inheritance list is sorted roughly, but not completely, alphabetically:</di
 <tr id="row_101_29_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_101_29_" class="arrow" onclick="toggleFolder('101_29_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1StepNode.html" target="_self">tvm::auto_scheduler::StepNode</a></td><td class="desc">The base class of transformation steps. Each step has its corresponding <a class="e [...]
 <tr id="row_101_29_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1AnnotationStepNode.html" target="_self">tvm::auto_scheduler::AnnotationStepNode</a></td><td class="desc">Annotation step that corresponds to vectorize, parallel, unroll and thread binding. (i.e. <a class="el" href="classtvm_1_1te_1_1Stage.html#a44d33e [...]
 <tr id="row_101_29_1_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStepNode.html" target="_self">tvm::auto_scheduler::CacheReadStepNode</a></td><td class="desc">Cache read step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#a38ef95a62faf0c15f132847efa20249b" title="create a cache rea [...]
-<tr id="row_101_29_2_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html" target="_self">tvm::auto_scheduler::CacheWriteStepNode</a></td><td class="desc">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache  [...]
+<tr id="row_101_29_2_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html" target="_self">tvm::auto_scheduler::CacheWriteStepNode</a></td><td class="desc">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache  [...]
 <tr id="row_101_29_3_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStepNode.html" target="_self">tvm::auto_scheduler::ComputeAtStepNode</a></td><td class="desc">Compute at step that corresponds to <a class="el" href="classtvm_1_1te_1_1Stage.html#a071545484de7a894c01ccf0e77183730" title="specify the schedule  [...]
 <tr id="row_101_29_4_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeInlineStepNode.html" target="_self">tvm::auto_scheduler::ComputeInlineStepNode</a></td><td class="desc">Compute inline step that corresponds to <a class="el" href="classtvm_1_1te_1_1Stage.html#a1c58b35e37561739440b322c29d30c3b" title="Compute t [...]
 <tr id="row_101_29_5_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeRootStepNode.html" target="_self">tvm::auto_scheduler::ComputeRootStepNode</a></td><td class="desc">Compute root step that corresponds to <a class="el" href="classtvm_1_1te_1_1Stage.html#a95b58b2d2ec034ecd0bdb99f95c0b0ba" title="Compute the fun [...]
diff --git a/docs/reference/api/doxygen/local__response__norm_8h_source.html b/docs/reference/api/doxygen/local__response__norm_8h_source.html
index 85cb00549..aa52f00d9 100644
--- a/docs/reference/api/doxygen/local__response__norm_8h_source.html
+++ b/docs/reference/api/doxygen/local__response__norm_8h_source.html
@@ -76,7 +76,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1topi_html_a13aaf23f0ab77f1ed4a7d4b7816bf210"><div class="ttname"><a href="namespacetvm_1_1topi.html#a13aaf23f0ab77f1ed4a7d4b7816bf210">tvm::topi::kBroadcast</a></div><div class="ttdeci">constexpr auto kBroadcast</div><div class="ttdef"><b>Definition:</b> tags.h:36</div></div>
 <div class="ttc" id="classtvm_1_1Range_html"><div class="ttname"><a href="classtvm_1_1Range.html">tvm::Range</a></div><div class="ttdoc">Range constainer. </div><div class="ttdef"><b>Definition:</b> expr.h:711</div></div>
 <div class="ttc" id="namespacetvm_html_a16f9cd9219b505e2cc05c5a7558ac61f"><div class="ttname"><a href="namespacetvm.html#a16f9cd9219b505e2cc05c5a7558ac61f">tvm::div</a></div><div class="ttdeci">PrimExpr div(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute division in C semantics. </div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a0250c4095f19ae8a22ed85bc4ce5a40d"><div class="ttname"><a href="namespacetvm_1_1topi.html#a0250c4095f19ae8a22ed85bc4ce5a40d">tvm::topi::kElementWise</a></div><div class="ttdeci">constexpr auto kElementWise</div><div class="ttdef"><b>Definition:</b> tags.h:32</div></div>
 <div class="ttc" id="namespacetvm_1_1te_html_aae384e9b73c2271905486e4a74b69265"><div class="ttname"><a href="namespacetvm_1_1te.html#aae384e9b73c2271905486e4a74b69265">tvm::te::reduce_axis</a></div><div class="ttdeci">IterVar reduce_axis(Range dom, std::string name=&quot;rv&quot;)</div><div class="ttdoc">Create a new IterVar for reduction operations. </div></div>
diff --git a/docs/reference/api/doxygen/namespacetvm.html b/docs/reference/api/doxygen/namespacetvm.html
index b2e69d75a..fde5015d8 100644
--- a/docs/reference/api/doxygen/namespacetvm.html
+++ b/docs/reference/api/doxygen/namespacetvm.html
@@ -976,22 +976,22 @@ Functions</h2></td></tr>
 <tr class="memdesc:ac06a47be8386f28b313903e174dfe151"><td class="mdescLeft">&#160;</td><td class="mdescRight">Check if x is infinite.  <a href="#ac06a47be8386f28b313903e174dfe151">More...</a><br /></td></tr>
 <tr class="separator:ac06a47be8386f28b313903e174dfe151"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:afdad0c0329bd39949ba8d296cfb85d76"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">sum</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> source, <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IterVar.html">tir: [...]
-<tr class="memdesc:afdad0c0329bd39949ba8d296cfb85d76"><td class="mdescLeft">&#160;</td><td class="mdescRight">sum of of source expression over axis  <a href="#afdad0c0329bd39949ba8d296cfb85d76">More...</a><br /></td></tr>
+<tr class="memdesc:afdad0c0329bd39949ba8d296cfb85d76"><td class="mdescLeft">&#160;</td><td class="mdescRight">sum of source expression over axis  <a href="#afdad0c0329bd39949ba8d296cfb85d76">More...</a><br /></td></tr>
 <tr class="separator:afdad0c0329bd39949ba8d296cfb85d76"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:adeeaff4fb29f75a9da8ff4d67723c693"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">all</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> source, <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IterVar.html">tir: [...]
-<tr class="memdesc:adeeaff4fb29f75a9da8ff4d67723c693"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical And of of source expression over axis  <a href="#adeeaff4fb29f75a9da8ff4d67723c693">More...</a><br /></td></tr>
+<tr class="memdesc:adeeaff4fb29f75a9da8ff4d67723c693"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical And of source expression over axis  <a href="#adeeaff4fb29f75a9da8ff4d67723c693">More...</a><br /></td></tr>
 <tr class="separator:adeeaff4fb29f75a9da8ff4d67723c693"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a5efd9942cdee5a56cfc438ba523c04f0"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">any</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> source, <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IterVar.html">tir: [...]
-<tr class="memdesc:a5efd9942cdee5a56cfc438ba523c04f0"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical Or of of source expression over axis  <a href="#a5efd9942cdee5a56cfc438ba523c04f0">More...</a><br /></td></tr>
+<tr class="memdesc:a5efd9942cdee5a56cfc438ba523c04f0"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical Or of source expression over axis  <a href="#a5efd9942cdee5a56cfc438ba523c04f0">More...</a><br /></td></tr>
 <tr class="separator:a5efd9942cdee5a56cfc438ba523c04f0"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a6dd252d98b23eaa28c43d6da6cffc1ec">max</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> source, <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IterVar.html">tir: [...]
-<tr class="memdesc:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="mdescLeft">&#160;</td><td class="mdescRight">max of of source expression over axis  <a href="#a6dd252d98b23eaa28c43d6da6cffc1ec">More...</a><br /></td></tr>
+<tr class="memdesc:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="mdescLeft">&#160;</td><td class="mdescRight">max of source expression over axis  <a href="#a6dd252d98b23eaa28c43d6da6cffc1ec">More...</a><br /></td></tr>
 <tr class="separator:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a2ea31d1a02499ea4b87694b4eeaa2306">min</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> source, <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IterVar.html">tir: [...]
-<tr class="memdesc:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="mdescLeft">&#160;</td><td class="mdescRight">max of of source expression over axis  <a href="#a2ea31d1a02499ea4b87694b4eeaa2306">More...</a><br /></td></tr>
+<tr class="memdesc:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="mdescLeft">&#160;</td><td class="mdescRight">max of source expression over axis  <a href="#a2ea31d1a02499ea4b87694b4eeaa2306">More...</a><br /></td></tr>
 <tr class="separator:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">prod</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> source, <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IterVar.html">tir [...]
-<tr class="memdesc:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="mdescLeft">&#160;</td><td class="mdescRight">product of of source expression over axis  <a href="#a32a87ae9eacafb2b5b71b28bcc9ef35e">More...</a><br /></td></tr>
+<tr class="memdesc:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="mdescLeft">&#160;</td><td class="mdescRight">product of source expression over axis  <a href="#a32a87ae9eacafb2b5b71b28bcc9ef35e">More...</a><br /></td></tr>
 <tr class="separator:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:aaff65dde3044433b2220677aedf4855f"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#aaff65dde3044433b2220677aedf4855f">floor</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> x, <a class="el" href="classtvm_1_1Span.html">Span</a> span=<a class="el" href="classtvm_1_1Span.html">Span</a>())</td></tr>
 <tr class="memdesc:aaff65dde3044433b2220677aedf4855f"><td class="mdescLeft">&#160;</td><td class="mdescRight">Calculate floor(x)  <a href="#aaff65dde3044433b2220677aedf4855f">More...</a><br /></td></tr>
@@ -2052,7 +2052,7 @@ Variables</h2></td></tr>
       </table>
 </div><div class="memdoc">
 
-<p>logical And of of source expression over axis </p>
+<p>logical And of source expression over axis </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">source</td><td>The source expression. </td></tr>
@@ -2103,7 +2103,7 @@ Variables</h2></td></tr>
       </table>
 </div><div class="memdoc">
 
-<p>logical Or of of source expression over axis </p>
+<p>logical Or of source expression over axis </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">source</td><td>The source expression. </td></tr>
@@ -7064,7 +7064,7 @@ template&lt;typename RefT &gt; </div>
       </table>
 </div><div class="memdoc">
 
-<p>max of of source expression over axis </p>
+<p>max of source expression over axis </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">source</td><td>The source expression. </td></tr>
@@ -7408,7 +7408,7 @@ template&lt;typename RefT &gt; </div>
       </table>
 </div><div class="memdoc">
 
-<p>max of of source expression over axis </p>
+<p>max of source expression over axis </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">source</td><td>The source expression. </td></tr>
@@ -11394,7 +11394,7 @@ template&lt;typename TB &gt; </div>
       </table>
 </div><div class="memdoc">
 
-<p>product of of source expression over axis </p>
+<p>product of source expression over axis </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">source</td><td>The source expression. </td></tr>
@@ -12324,7 +12324,7 @@ template&lt;typename TB &gt; </div>
       </table>
 </div><div class="memdoc">
 
-<p>sum of of source expression over axis </p>
+<p>sum of source expression over axis </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">source</td><td>The source expression. </td></tr>
diff --git a/docs/reference/api/doxygen/namespacetvm_1_1auto__scheduler.html b/docs/reference/api/doxygen/namespacetvm_1_1auto__scheduler.html
index af689fbe1..34de2f7fe 100644
--- a/docs/reference/api/doxygen/namespacetvm_1_1auto__scheduler.html
+++ b/docs/reference/api/doxygen/namespacetvm_1_1auto__scheduler.html
@@ -109,7 +109,7 @@ Classes</h2></td></tr>
 <tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html" title="Cache write step that corresponds to te::Schedule::cache_write. ">CacheWriteStepNode</a>.  <a href="classtvm_1_1auto__scheduler_1_1CacheWriteStep.html#details">More...</a><br /></td></tr>
 <tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:"><td class="memItemLeft" align="right" valign="top">class &#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html">CacheWriteStepNode</a></td></tr>
-<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The the tensor will take over body of original tens...">te::Schedule::cache_write</a>.  <a href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html#details">More...</a><br /></td></tr>
+<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The tensor will take over body of original tensor o...">te::Schedule::cache_write</a>.  <a href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html#details">More...</a><br /></td></tr>
 <tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:"><td class="memItemLeft" align="right" valign="top">class &#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStep.html">ComputeAtStep</a></td></tr>
 <tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStepNode.html" title="Compute at step that corresponds to te::Stage::compute_at. ">ComputeAtStepNode</a>.  <a href="classtvm_1_1auto__scheduler_1_1ComputeAtStep.html#details">More...</a><br /></td></tr>
diff --git a/docs/reference/api/doxygen/namespacetvm_1_1relay_1_1transform.html b/docs/reference/api/doxygen/namespacetvm_1_1relay_1_1transform.html
index 405b10b67..225210f81 100644
--- a/docs/reference/api/doxygen/namespacetvm_1_1relay_1_1transform.html
+++ b/docs/reference/api/doxygen/namespacetvm_1_1relay_1_1transform.html
@@ -104,7 +104,7 @@ Functions</h2></td></tr>
 <tr class="memdesc:a2425d757b896168a109498e8d34ba960"><td class="mdescLeft">&#160;</td><td class="mdescRight">Split function with huge number of arguments to smaller pieces.  <a href="#a2425d757b896168a109498e8d34ba960">More...</a><br /></td></tr>
 <tr class="separator:a2425d757b896168a109498e8d34ba960"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a2a6be6024a96a84f7230faa2519f1a97"><td class="memItemLeft" align="right" valign="top"><a class="el" href="namespacetvm_1_1relay_1_1transform.html#afa666ade112e9955059095d695238a9a">Pass</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1relay_1_1transform.html#a2a6be6024a96a84f7230faa2519f1a97">FuseOps</a> (int fuse_opt_level=-1)</td></tr>
-<tr class="memdesc:a2a6be6024a96a84f7230faa2519f1a97"><td class="mdescLeft">&#160;</td><td class="mdescRight">Fuse operations into expr into seperate functions.  <a href="#a2a6be6024a96a84f7230faa2519f1a97">More...</a><br /></td></tr>
+<tr class="memdesc:a2a6be6024a96a84f7230faa2519f1a97"><td class="mdescLeft">&#160;</td><td class="mdescRight">Fuse operations into expr into separate functions.  <a href="#a2a6be6024a96a84f7230faa2519f1a97">More...</a><br /></td></tr>
 <tr class="separator:a2a6be6024a96a84f7230faa2519f1a97"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a8f3eee7092f7e3e58e1c76f4498e32e7"><td class="memItemLeft" align="right" valign="top"><a class="el" href="namespacetvm_1_1relay_1_1transform.html#afa666ade112e9955059095d695238a9a">Pass</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1relay_1_1transform.html#a8f3eee7092f7e3e58e1c76f4498e32e7">DefuseOps</a> ()</td></tr>
 <tr class="memdesc:a8f3eee7092f7e3e58e1c76f4498e32e7"><td class="mdescLeft">&#160;</td><td class="mdescRight">The inverse operation of FuseOps. It transforms a fused program returned by FuseOps into the program before FuseOps. (i.e. x == DefuseOps(FuseOps(x)))  <a href="#a8f3eee7092f7e3e58e1c76f4498e32e7">More...</a><br /></td></tr>
@@ -927,7 +927,7 @@ Functions</h2></td></tr>
       </table>
 </div><div class="memdoc">
 
-<p>Fuse operations into expr into seperate functions. </p>
+<p>Fuse operations into expr into separate functions. </p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">fuse_opt_level</td><td>Optimization level. <a class="el" href="classtvm_1_1relay_1_1If.html">If</a> it is -1 it will be inferred from pass context.</td></tr>
diff --git a/docs/reference/api/doxygen/namespacetvm_1_1te.html b/docs/reference/api/doxygen/namespacetvm_1_1te.html
index ff8cd8fff..dfed3e3ad 100644
--- a/docs/reference/api/doxygen/namespacetvm_1_1te.html
+++ b/docs/reference/api/doxygen/namespacetvm_1_1te.html
@@ -2797,7 +2797,7 @@ template&lt;typename T &gt; </div>
 <p>Postprocessing the Stmt generated by ScheduleOps to create a PrimFunc that can then be used for further TIR optimizations. </p>
 <p>Perform this translation before running any TIR optimizations.</p>
 <p>List of actions taken by the function:</p><ul>
-<li>Remove occurences of <a class="el" href="classtvm_1_1te_1_1Tensor.html" title="Tensor structure representing a possible input, or intermediate computation result. ">te::Tensor</a>, <a class="el" href="classtvm_1_1te_1_1Operation.html" title="Operation that produces tensors. ">te::Operation</a> in the IR and replace them by corresponding IR nodes via <a class="el" href="classtvm_1_1tir_1_1Buffer.html" title="Buffer is a symbolic n-darray structure. It is a composition of primitive sym [...]
+<li>Remove occurrences of <a class="el" href="classtvm_1_1te_1_1Tensor.html" title="Tensor structure representing a possible input, or intermediate computation result. ">te::Tensor</a>, <a class="el" href="classtvm_1_1te_1_1Operation.html" title="Operation that produces tensors. ">te::Operation</a> in the IR and replace them by corresponding IR nodes via <a class="el" href="classtvm_1_1tir_1_1Buffer.html" title="Buffer is a symbolic n-darray structure. It is a composition of primitive sy [...]
 <li>Add annotation of extern buffers using the buffer_map field in the PrimFunc type.</li>
 </ul>
 <dl class="params"><dt>Parameters</dt><dd>
diff --git a/docs/reference/api/doxygen/namespacetvm_1_1topi.html b/docs/reference/api/doxygen/namespacetvm_1_1topi.html
index 9b2fd9250..2e3247698 100644
--- a/docs/reference/api/doxygen/namespacetvm_1_1topi.html
+++ b/docs/reference/api/doxygen/namespacetvm_1_1topi.html
@@ -588,7 +588,7 @@ Functions</h2></td></tr>
 <tr class="memdesc:a38fe82b0db9eab041324da16e532baff"><td class="mdescLeft">&#160;</td><td class="mdescRight">Wrap <a class="el" href="namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb" title="take maximum of two values ">tvm::max</a> to ensure we get the correct overload.  <a href="#a38fe82b0db9eab041324da16e532baff">More...</a><br /></td></tr>
 <tr class="separator:a38fe82b0db9eab041324da16e532baff"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:af62dd10dd04c1fbf820581b14498de6e"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1topi.html#af62dd10dd04c1fbf820581b14498de6e">ProdOp</a> (<a class="el" href="classtvm_1_1PrimExpr.html">PrimExpr</a> source, <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1_1tir_1_1IterVar [...]
-<tr class="memdesc:af62dd10dd04c1fbf820581b14498de6e"><td class="mdescLeft">&#160;</td><td class="mdescRight">Wrap <a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e" title="product of of source expression over axis ">tvm::prod</a> to ensure we get the correct overload.  <a href="#af62dd10dd04c1fbf820581b14498de6e">More...</a><br /></td></tr>
+<tr class="memdesc:af62dd10dd04c1fbf820581b14498de6e"><td class="mdescLeft">&#160;</td><td class="mdescRight">Wrap <a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e" title="product of source expression over axis ">tvm::prod</a> to ensure we get the correct overload.  <a href="#af62dd10dd04c1fbf820581b14498de6e">More...</a><br /></td></tr>
 <tr class="separator:af62dd10dd04c1fbf820581b14498de6e"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:abee7c35e8c15e2e61afe35852dfcb252"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1topi.html#abee7c35e8c15e2e61afe35852dfcb252">sum</a> (const <a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a> &amp;data, const <a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a>&lt; <a class="el" href="classtvm_1 [...]
 <tr class="memdesc:abee7c35e8c15e2e61afe35852dfcb252"><td class="mdescLeft">&#160;</td><td class="mdescRight">Creates an operation that sums array elements over a given axis.  <a href="#abee7c35e8c15e2e61afe35852dfcb252">More...</a><br /></td></tr>
@@ -2776,7 +2776,7 @@ Variables</h2></td></tr>
   <table class="params">
     <tr><td class="paramname">data</td><td>The input tensor. </td></tr>
     <tr><td class="paramname">axis</td><td>The axes along which the reduction is performed. </td></tr>
-    <tr><td class="paramname">func</td><td>The reduction function eg. <a class="el" href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76" title="sum of of source expression over axis ">tvm::sum</a> </td></tr>
+    <tr><td class="paramname">func</td><td>The reduction function eg. <a class="el" href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76" title="sum of source expression over axis ">tvm::sum</a> </td></tr>
     <tr><td class="paramname">keepdims</td><td>If this is set to true, the axes which are reduced are left in the result as dimensions with size one. This enables the result to broadcast correctly against the input array. </td></tr>
     <tr><td class="paramname">atleast1d</td><td>Whether the output need to be atleast1d.</td></tr>
   </table>
@@ -3621,7 +3621,7 @@ Variables</h2></td></tr>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">data</td><td>The input tensor. </td></tr>
-    <tr><td class="paramname">func</td><td>The reduction function eg. <a class="el" href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76" title="sum of of source expression over axis ">tvm::sum</a> </td></tr>
+    <tr><td class="paramname">func</td><td>The reduction function eg. <a class="el" href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76" title="sum of source expression over axis ">tvm::sum</a> </td></tr>
     <tr><td class="paramname">target_shape</td><td>The output Tensor shape. </td></tr>
     <tr><td class="paramname">reduce_axes</td><td>The real axes along which the reduction is performed. </td></tr>
     <tr><td class="paramname">squeeze_axes</td><td>The real axes to squeeze. Unsqueezed, reduced axes will have shape 1 in the output tensor. </td></tr>
@@ -3763,7 +3763,7 @@ Variables</h2></td></tr>
   <table class="params">
     <tr><td class="paramname">x</td><td>The input tensor </td></tr>
     <tr><td class="paramname">begin</td><td>The indices to begin with in the slicing </td></tr>
-    <tr><td class="paramname">end</td><td>Indicies indicating end of the slice </td></tr>
+    <tr><td class="paramname">end</td><td>Indices indicating end of the slice </td></tr>
     <tr><td class="paramname">strides</td><td>Specifies the stride values, it can be negative in that case, the input tensor will be reversed in that particular axis </td></tr>
     <tr><td class="paramname">name</td><td>The name of the operation </td></tr>
     <tr><td class="paramname">tag</td><td>The tag to mark the operation</td></tr>
@@ -3837,7 +3837,7 @@ Variables</h2></td></tr>
   <table class="params">
     <tr><td class="paramname">x</td><td>The input tensor </td></tr>
     <tr><td class="paramname">begin</td><td>The indices to begin with in the slicing </td></tr>
-    <tr><td class="paramname">end</td><td>Indicies indicating end of the slice </td></tr>
+    <tr><td class="paramname">end</td><td>Indices indicating end of the slice </td></tr>
     <tr><td class="paramname">strides</td><td>Specifies the stride values, it can be negative in that case, the input tensor will be reversed in that particular axis </td></tr>
     <tr><td class="paramname">name</td><td>The name of the operation </td></tr>
     <tr><td class="paramname">tag</td><td>The tag to mark the operation</td></tr>
@@ -11316,7 +11316,7 @@ Variables</h2></td></tr>
 </table>
 </div><div class="memdoc">
 
-<p>Wrap <a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e" title="product of of source expression over axis ">tvm::prod</a> to ensure we get the correct overload. </p>
+<p>Wrap <a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e" title="product of source expression over axis ">tvm::prod</a> to ensure we get the correct overload. </p>
 
 </div>
 </div>
@@ -12913,7 +12913,7 @@ template&lt;typename T &gt; </div>
   <table class="params">
     <tr><td class="paramname">x</td><td>The input tensor </td></tr>
     <tr><td class="paramname">begin</td><td>The indices to begin with in the slicing </td></tr>
-    <tr><td class="paramname">end</td><td>Indicies indicating end of the slice </td></tr>
+    <tr><td class="paramname">end</td><td>Indices indicating end of the slice </td></tr>
     <tr><td class="paramname">strides</td><td>Specifies the stride values, it can be negative in that case, the input tensor will be reversed in that particular axis </td></tr>
     <tr><td class="paramname">slice_mode</td><td>Specifies the slice mode </td></tr>
     <tr><td class="paramname">name</td><td>The name of the operation </td></tr>
@@ -13000,7 +13000,7 @@ template&lt;typename T &gt; </div>
   <table class="params">
     <tr><td class="paramname">x</td><td>The input tensor </td></tr>
     <tr><td class="paramname">begin</td><td>The indices to begin with in the slicing </td></tr>
-    <tr><td class="paramname">end</td><td>Indicies indicating end of the slice </td></tr>
+    <tr><td class="paramname">end</td><td>Indices indicating end of the slice </td></tr>
     <tr><td class="paramname">strides</td><td>Specifies the stride values, it can be negative in that case, the input tensor will be reversed in that particular axis </td></tr>
     <tr><td class="paramname">axes</td><td>Axes along which slicing is applied. When it is specified, the length of begin, end, strides, and axes argument must be equal </td></tr>
     <tr><td class="paramname">slice_mode</td><td>Specifies the slice mode </td></tr>
@@ -13076,7 +13076,7 @@ template&lt;typename T &gt; </div>
   <table class="params">
     <tr><td class="paramname">ishape</td><td>The input tensor shape </td></tr>
     <tr><td class="paramname">begin</td><td>The indices to begin with in the slicing </td></tr>
-    <tr><td class="paramname">end</td><td>Indicies indicating end of the slice </td></tr>
+    <tr><td class="paramname">end</td><td>Indices indicating end of the slice </td></tr>
     <tr><td class="paramname">strides</td><td>Specifies the stride values, it can be negative in that case, the input tensor will be reversed in that particular axis </td></tr>
     <tr><td class="paramname">axes</td><td>Axes along which slicing is applied. When it is specified, the length of begin, end, strides, and axes argument must be equal </td></tr>
     <tr><td class="paramname">slice_mode</td><td>Specifies the slice mode</td></tr>
diff --git a/docs/reference/api/doxygen/nn_2bnn_8h_source.html b/docs/reference/api/doxygen/nn_2bnn_8h_source.html
index 985bcb310..2ac08a162 100644
--- a/docs/reference/api/doxygen/nn_2bnn_8h_source.html
+++ b/docs/reference/api/doxygen/nn_2bnn_8h_source.html
@@ -78,7 +78,7 @@ $(function() {
 <div class="ttc" id="constant__utils_8h_html"><div class="ttname"><a href="constant__utils_8h.html">constant_utils.h</a></div><div class="ttdoc">Utility functions for handling constants in TVM expressions. </div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_af580cd1bea6e862f41c7fad4c4c7eea3"><div class="ttname"><a href="namespacetvm_1_1topi.html#af580cd1bea6e862f41c7fad4c4c7eea3">tvm::topi::sign</a></div><div class="ttdeci">Tensor sign(const Tensor &amp;x, std::string name=&quot;T_sign&quot;, std::string tag=kElementWise)</div><div class="ttdoc">Returns the sign of the tensor. </div><div class="ttdef"><b>Definition:</b> elemwise.h:211</div></div>
 <div class="ttc" id="classtvm_1_1Range_html"><div class="ttname"><a href="classtvm_1_1Range.html">tvm::Range</a></div><div class="ttdoc">Range constainer. </div><div class="ttdef"><b>Definition:</b> expr.h:711</div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a0250c4095f19ae8a22ed85bc4ce5a40d"><div class="ttname"><a href="namespacetvm_1_1topi.html#a0250c4095f19ae8a22ed85bc4ce5a40d">tvm::topi::kElementWise</a></div><div class="ttdeci">constexpr auto kElementWise</div><div class="ttdef"><b>Definition:</b> tags.h:32</div></div>
 <div class="ttc" id="namespacetvm_html_a8f30aa0685ca52f846843e76a1ad1dc7"><div class="ttname"><a href="namespacetvm.html#a8f30aa0685ca52f846843e76a1ad1dc7">tvm::indexdiv</a></div><div class="ttdeci">PrimExpr indexdiv(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute floor(a / b) where a and b are non-negative. </div></div>
diff --git a/docs/reference/api/doxygen/nn_2dense_8h_source.html b/docs/reference/api/doxygen/nn_2dense_8h_source.html
index 8295e6e33..62a14763e 100644
--- a/docs/reference/api/doxygen/nn_2dense_8h_source.html
+++ b/docs/reference/api/doxygen/nn_2dense_8h_source.html
@@ -75,7 +75,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1topi_1_1nn_html_a34e1a8305acf89ef2f745c8d99bf8e89"><div class="ttname"><a href="namespacetvm_1_1topi_1_1nn.html#a34e1a8305acf89ef2f745c8d99bf8e89">tvm::topi::nn::dense</a></div><div class="ttdeci">tvm::te::Tensor dense(const tvm::te::Tensor &amp;data, const tvm::te::Tensor &amp;weight, const tvm::te::Tensor &amp;bias, const DataType &amp;out_dtype)</div><div class="ttdoc">Creates an operation that calculates data * weight^T + bias. </div><div class="t [...]
 <div class="ttc" id="classtvm_1_1runtime_1_1ObjectRef_html_a17d8d5ad92691f9e18e3e0ae8ef69e4f"><div class="ttname"><a href="classtvm_1_1runtime_1_1ObjectRef.html#a17d8d5ad92691f9e18e3e0ae8ef69e4f">tvm::runtime::ObjectRef::defined</a></div><div class="ttdeci">bool defined() const</div><div class="ttdef"><b>Definition:</b> object.h:544</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1DataType_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1DataType.html">tvm::runtime::DataType</a></div><div class="ttdoc">Runtime primitive data type. </div><div class="ttdef"><b>Definition:</b> data_type.h:41</div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="namespacetvm_1_1te_html_aae384e9b73c2271905486e4a74b69265"><div class="ttname"><a href="namespacetvm_1_1te.html#aae384e9b73c2271905486e4a74b69265">tvm::te::reduce_axis</a></div><div class="ttdeci">IterVar reduce_axis(Range dom, std::string name=&quot;rv&quot;)</div><div class="ttdoc">Create a new IterVar for reduction operations. </div></div>
 <div class="ttc" id="classtvm_1_1te_1_1Tensor_html"><div class="ttname"><a href="classtvm_1_1te_1_1Tensor.html">tvm::te::Tensor</a></div><div class="ttdoc">Tensor structure representing a possible input, or intermediate computation result. </div><div class="ttdef"><b>Definition:</b> tensor.h:102</div></div>
 <div class="ttc" id="operation_8h_html"><div class="ttname"><a href="operation_8h.html">operation.h</a></div><div class="ttdoc">Operation node can generate one or multiple Tensors. </div></div>
diff --git a/docs/reference/api/doxygen/nn_2pooling_8h_source.html b/docs/reference/api/doxygen/nn_2pooling_8h_source.html
index 3351de5e5..98b0120d8 100644
--- a/docs/reference/api/doxygen/nn_2pooling_8h_source.html
+++ b/docs/reference/api/doxygen/nn_2pooling_8h_source.html
@@ -92,7 +92,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_html_a16f9cd9219b505e2cc05c5a7558ac61f"><div class="ttname"><a href="namespacetvm.html#a16f9cd9219b505e2cc05c5a7558ac61f">tvm::div</a></div><div class="ttdeci">PrimExpr div(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute division in C semantics. </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html_aed6387e67d18b9d5ad18f510fd600a25"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html#aed6387e67d18b9d5ad18f510fd600a25">tvm::runtime::Array::size</a></div><div class="ttdeci">size_t size() const</div><div class="ttdef"><b>Definition:</b> array.h:399</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_1_1nn_html_a3ffa0974d8cdcd5b8ca7afb3cfbaf53ca9780bd19bf6706355258ca09ddeab335"><div class="ttname"><a href="namespacetvm_1_1topi_1_1nn.html#a3ffa0974d8cdcd5b8ca7afb3cfbaf53ca9780bd19bf6706355258ca09ddeab335">tvm::topi::nn::kAvgPool</a></div><div class="ttdef"><b>Definition:</b> pooling.h:45</div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_1_1nn_html_a27571804c2096b32ab05e7b3e32c5af6"><div class="ttname"><a href="namespacetvm_1_1topi_1_1nn.html#a27571804c2096b32ab05e7b3e32c5af6">tvm::topi::nn::pool_impl_nd</a></div><div class="ttdeci">Tensor pool_impl_nd(const Tensor &amp;x, const Array&lt; PrimExpr &gt; &amp;kernel_size, const Array&lt; PrimExpr &gt; &amp;stride_size, const Array&lt; PrimExpr &gt; &amp;dilation_size, const Array&lt; PrimExpr &gt; &amp;padding_size, PoolType pool_t [...]
 <div class="ttc" id="pad__utils_8h_html"><div class="ttname"><a href="pad__utils_8h.html">pad_utils.h</a></div><div class="ttdoc">Padding helpers. </div></div>
diff --git a/docs/reference/api/doxygen/nn_2softmax_8h_source.html b/docs/reference/api/doxygen/nn_2softmax_8h_source.html
index e17fc2f84..ed060fa5b 100644
--- a/docs/reference/api/doxygen/nn_2softmax_8h_source.html
+++ b/docs/reference/api/doxygen/nn_2softmax_8h_source.html
@@ -77,7 +77,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1topi_1_1nn_html_ac0e20b6b30ec8296c1f037866d3bf772"><div class="ttname"><a href="namespacetvm_1_1topi_1_1nn.html#ac0e20b6b30ec8296c1f037866d3bf772">tvm::topi::nn::log_softmax</a></div><div class="ttdeci">Tensor log_softmax(const Tensor &amp;x, std::string name=&quot;tensor&quot;, std::string tag=&quot;log_softmax_output&quot;)</div><div class="ttdoc">Log softmax activation. </div><div class="ttdef"><b>Definition:</b> softmax.h:126</div></div>
 <div class="ttc" id="classtvm_1_1Range_html"><div class="ttname"><a href="classtvm_1_1Range.html">tvm::Range</a></div><div class="ttdoc">Range constainer. </div><div class="ttdef"><b>Definition:</b> expr.h:711</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a38fe82b0db9eab041324da16e532baff"><div class="ttname"><a href="namespacetvm_1_1topi.html#a38fe82b0db9eab041324da16e532baff">tvm::topi::MaxOp</a></div><div class="ttdeci">PrimExpr MaxOp(PrimExpr source, Array&lt; IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">Wrap tvm::max to ensure we get the correct overload. </div><div class="ttdef"><b>Definition:</b> reduction.h:302</div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="namespacetvm_html_a0df5ca82d2c566f628ebb2f1e84a3fcb"><div class="ttname"><a href="namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb">tvm::max</a></div><div class="ttdeci">PrimExpr max(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">take maximum of two values </div></div>
 <div class="ttc" id="namespacetvm_1_1te_html_aae384e9b73c2271905486e4a74b69265"><div class="ttname"><a href="namespacetvm_1_1te.html#aae384e9b73c2271905486e4a74b69265">tvm::te::reduce_axis</a></div><div class="ttdeci">IterVar reduce_axis(Range dom, std::string name=&quot;rv&quot;)</div><div class="ttdoc">Create a new IterVar for reduction operations. </div></div>
diff --git a/docs/reference/api/doxygen/reduction_8h.html b/docs/reference/api/doxygen/reduction_8h.html
index 6d6089b14..0973c4fde 100644
--- a/docs/reference/api/doxygen/reduction_8h.html
+++ b/docs/reference/api/doxygen/reduction_8h.html
@@ -152,7 +152,7 @@ Functions</h2></td></tr>
 <tr class="memdesc:a38fe82b0db9eab041324da16e532baff"><td class="mdescLeft">&#160;</td><td class="mdescRight">Wrap <a class="el" href="namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb" title="take maximum of two values ">tvm::max</a> to ensure we get the correct overload.  <a href="namespacetvm_1_1topi.html#a38fe82b0db9eab041324da16e532baff">More...</a><br /></td></tr>
 <tr class="separator:a38fe82b0db9eab041324da16e532baff"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:af62dd10dd04c1fbf820581b14498de6e"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1topi.html#af62dd10dd04c1fbf820581b14498de6e">tvm::topi::ProdOp</a> (PrimExpr source, Array&lt; IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</td></tr>
-<tr class="memdesc:af62dd10dd04c1fbf820581b14498de6e"><td class="mdescLeft">&#160;</td><td class="mdescRight">Wrap <a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e" title="product of of source expression over axis ">tvm::prod</a> to ensure we get the correct overload.  <a href="namespacetvm_1_1topi.html#af62dd10dd04c1fbf820581b14498de6e">More...</a><br /></td></tr>
+<tr class="memdesc:af62dd10dd04c1fbf820581b14498de6e"><td class="mdescLeft">&#160;</td><td class="mdescRight">Wrap <a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e" title="product of source expression over axis ">tvm::prod</a> to ensure we get the correct overload.  <a href="namespacetvm_1_1topi.html#af62dd10dd04c1fbf820581b14498de6e">More...</a><br /></td></tr>
 <tr class="separator:af62dd10dd04c1fbf820581b14498de6e"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:abee7c35e8c15e2e61afe35852dfcb252"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1topi.html#abee7c35e8c15e2e61afe35852dfcb252">tvm::topi::sum</a> (const <a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a> &amp;data, const Array&lt; Integer &gt; &amp;axis, bool keepdims=false, bool atleast1d=false)</td></tr>
 <tr class="memdesc:abee7c35e8c15e2e61afe35852dfcb252"><td class="mdescLeft">&#160;</td><td class="mdescRight">Creates an operation that sums array elements over a given axis.  <a href="namespacetvm_1_1topi.html#abee7c35e8c15e2e61afe35852dfcb252">More...</a><br /></td></tr>
diff --git a/docs/reference/api/doxygen/reduction_8h_source.html b/docs/reference/api/doxygen/reduction_8h_source.html
index fe2589ad5..089cda219 100644
--- a/docs/reference/api/doxygen/reduction_8h_source.html
+++ b/docs/reference/api/doxygen/reduction_8h_source.html
@@ -97,12 +97,12 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1topi_html_aea9a989b0aaa2aef03fe8ee237d8257e"><div class="ttname"><a href="namespacetvm_1_1topi.html#aea9a989b0aaa2aef03fe8ee237d8257e">tvm::topi::MinOp</a></div><div class="ttdeci">PrimExpr MinOp(PrimExpr source, Array&lt; IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">Wrap tvm::min to ensure we get the correct overload. </div><div class="ttdef"><b>Definition:</b> reduction.h:296</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_aa86493f01692ff5ecfdb2d202306854c"><div class="ttname"><a href="namespacetvm_1_1topi.html#aa86493f01692ff5ecfdb2d202306854c">tvm::topi::identity</a></div><div class="ttdeci">Tensor identity(const Tensor &amp;x, std::string name=&quot;T_identity&quot;, std::string tag=kElementWise)</div><div class="ttdoc">Creates an operation that returns identity of a given tensor. </div><div class="ttdef"><b>Definition:</b> elemwise.h:151</div></div>
 <div class="ttc" id="elemwise_8h_html"><div class="ttname"><a href="elemwise_8h.html">elemwise.h</a></div><div class="ttdoc">Elementwise op constructions. </div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a7dd84c370a3377aec67ce83f94605df9"><div class="ttname"><a href="namespacetvm_1_1topi.html#a7dd84c370a3377aec67ce83f94605df9">tvm::topi::FIdentity</a></div><div class="ttdeci">std::function&lt; Array&lt; PrimExpr &gt;(std::vector&lt; DataType &gt; types)&gt; FIdentity</div><div class="ttdoc">An initializer function for a reduction. </div><div class="ttdef"><b>Definition:</b> reduction.h:257</div></div>
 <div class="ttc" id="namespacetvm_html_a0df5ca82d2c566f628ebb2f1e84a3fcb"><div class="ttname"><a href="namespacetvm.html#a0df5ca82d2c566f628ebb2f1e84a3fcb">tvm::max</a></div><div class="ttdeci">PrimExpr max(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">take maximum of two values </div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a05ddefea973989205ef8cb14fcfe6ffe"><div class="ttname"><a href="namespacetvm_1_1topi.html#a05ddefea973989205ef8cb14fcfe6ffe">tvm::topi::FCommReduce</a></div><div class="ttdeci">std::function&lt; Array&lt; PrimExpr &gt;(Array&lt; PrimExpr &gt; exprs, const Array&lt; IterVar &gt; &amp;axis, PrimExpr *condition)&gt; FCommReduce</div><div class="ttdoc">The operation to use for CommReduceIdx. </div><div class="ttdef"><b>Definition:</b> reduction. [...]
-<div class="ttc" id="namespacetvm_html_a5efd9942cdee5a56cfc438ba523c04f0"><div class="ttname"><a href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">tvm::any</a></div><div class="ttdeci">PrimExpr any(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical Or of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_a5efd9942cdee5a56cfc438ba523c04f0"><div class="ttname"><a href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">tvm::any</a></div><div class="ttdeci">PrimExpr any(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical Or of source expression over axis </div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_af62dd10dd04c1fbf820581b14498de6e"><div class="ttname"><a href="namespacetvm_1_1topi.html#af62dd10dd04c1fbf820581b14498de6e">tvm::topi::ProdOp</a></div><div class="ttdeci">PrimExpr ProdOp(PrimExpr source, Array&lt; IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">Wrap tvm::prod to ensure we get the correct overload. </div><div class="ttdef"><b>Definition:</b> reduction.h:308</div></div>
 <div class="ttc" id="namespacetvm_1_1te_html_aae384e9b73c2271905486e4a74b69265"><div class="ttname"><a href="namespacetvm_1_1te.html#aae384e9b73c2271905486e4a74b69265">tvm::te::reduce_axis</a></div><div class="ttdeci">IterVar reduce_axis(Range dom, std::string name=&quot;rv&quot;)</div><div class="ttdoc">Create a new IterVar for reduction operations. </div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a7b1acf424786ee187f0f19a725b85d8c"><div class="ttname"><a href="namespacetvm_1_1topi.html#a7b1acf424786ee187f0f19a725b85d8c">tvm::topi::kCommReduce</a></div><div class="ttdeci">constexpr auto kCommReduce</div><div class="ttdef"><b>Definition:</b> tags.h:34</div></div>
@@ -111,7 +111,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html_a6b097149e69ea03fe3b812a3f5f7fcd9"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html#a6b097149e69ea03fe3b812a3f5f7fcd9">tvm::runtime::Array::end</a></div><div class="ttdeci">iterator end() const</div><div class="ttdef"><b>Definition:</b> array.h:369</div></div>
 <div class="ttc" id="classtvm_1_1te_1_1Tensor_html"><div class="ttname"><a href="classtvm_1_1te_1_1Tensor.html">tvm::te::Tensor</a></div><div class="ttdoc">Tensor structure representing a possible input, or intermediate computation result. </div><div class="ttdef"><b>Definition:</b> tensor.h:102</div></div>
 <div class="ttc" id="operation_8h_html"><div class="ttname"><a href="operation_8h.html">operation.h</a></div><div class="ttdoc">Operation node can generate one or multiple Tensors. </div></div>
-<div class="ttc" id="namespacetvm_html_adeeaff4fb29f75a9da8ff4d67723c693"><div class="ttname"><a href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">tvm::all</a></div><div class="ttdeci">PrimExpr all(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical And of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_adeeaff4fb29f75a9da8ff4d67723c693"><div class="ttname"><a href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">tvm::all</a></div><div class="ttdeci">PrimExpr all(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical And of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1Select_html"><div class="ttname"><a href="classtvm_1_1tir_1_1Select.html">tvm::tir::Select</a></div><div class="ttdoc">Managed reference to SelectNode. </div><div class="ttdef"><b>Definition:</b> expr.h:589</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a4bc269a40cbdbac3b8b764950820dc8c"><div class="ttname"><a href="namespacetvm_1_1topi.html#a4bc269a40cbdbac3b8b764950820dc8c">tvm::topi::prod</a></div><div class="ttdeci">Tensor prod(const Tensor &amp;data, const Array&lt; Integer &gt; &amp;axis, bool keepdims=false, bool atleast1d=false)</div><div class="ttdoc">Creates product operation over given axis. </div><div class="ttdef"><b>Definition:</b> reduction.h:568</div></div>
 <div class="ttc" id="topi_2transform_8h_html"><div class="ttname"><a href="topi_2transform_8h.html">transform.h</a></div><div class="ttdoc">Transform op constructors. </div></div>
@@ -126,7 +126,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1PrimExpr_html"><div class="ttname"><a href="classtvm_1_1PrimExpr.html">tvm::PrimExpr</a></div><div class="ttdoc">Reference to PrimExprNode. </div><div class="ttdef"><b>Definition:</b> expr.h:112</div></div>
 <div class="ttc" id="ravel__unravel_8h_html"><div class="ttname"><a href="ravel__unravel_8h.html">ravel_unravel.h</a></div><div class="ttdoc">Index ravel and unraval operations. </div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_ab7fb7a9f1651970c4ba04a48acdb695f"><div class="ttname"><a href="namespacetvm_1_1topi.html#ab7fb7a9f1651970c4ba04a48acdb695f">tvm::topi::DoCommReduce</a></div><div class="ttdeci">Tensor DoCommReduce(const Tensor &amp;data, FReduce func, const Array&lt; PrimExpr &gt; &amp;target_shape, const std::vector&lt; int &gt; &amp;reduce_axes, const std::vector&lt; int &gt; &amp;squeeze_axes, Span span=Span())</div><div class="ttdoc">Create a reduction  [...]
-<div class="ttc" id="namespacetvm_html_a32a87ae9eacafb2b5b71b28bcc9ef35e"><div class="ttname"><a href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">tvm::prod</a></div><div class="ttdeci">PrimExpr prod(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">product of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_a32a87ae9eacafb2b5b71b28bcc9ef35e"><div class="ttname"><a href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">tvm::prod</a></div><div class="ttdeci">PrimExpr prod(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">product of source expression over axis </div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_aed8fdf7a1568bacd2b2d2dd53192c59e"><div class="ttname"><a href="namespacetvm_1_1topi.html#aed8fdf7a1568bacd2b2d2dd53192c59e">tvm::topi::argmin</a></div><div class="ttdeci">Tensor argmin(const Tensor &amp;data, const Array&lt; Integer &gt; &amp;axis, bool keepdims=false, bool atleast1d=false, bool select_last_index=false)</div><div class="ttdoc">Creates an operation that finds the indices of the minimum values over a given axis. </div><div cl [...]
 <div class="ttc" id="namespacetvm_1_1topi_html_aec9d2c654a75e1be977d159b87a6b8f5"><div class="ttname"><a href="namespacetvm_1_1topi.html#aec9d2c654a75e1be977d159b87a6b8f5">tvm::topi::CommReduce</a></div><div class="ttdeci">Tensor CommReduce(const Tensor &amp;data, const Array&lt; Integer &gt; &amp;axis, FReduce func, bool keepdims, bool atleast1d)</div><div class="ttdoc">Create a reduction operation. </div><div class="ttdef"><b>Definition:</b> reduction.h:182</div></div>
 </div><!-- fragment --></div><!-- contents -->
diff --git a/docs/reference/api/doxygen/relay_2transform_8h.html b/docs/reference/api/doxygen/relay_2transform_8h.html
index 912468a0a..c879c312a 100644
--- a/docs/reference/api/doxygen/relay_2transform_8h.html
+++ b/docs/reference/api/doxygen/relay_2transform_8h.html
@@ -143,7 +143,7 @@ Functions</h2></td></tr>
 <tr class="memdesc:a2425d757b896168a109498e8d34ba960"><td class="mdescLeft">&#160;</td><td class="mdescRight">Split function with huge number of arguments to smaller pieces.  <a href="namespacetvm_1_1relay_1_1transform.html#a2425d757b896168a109498e8d34ba960">More...</a><br /></td></tr>
 <tr class="separator:a2425d757b896168a109498e8d34ba960"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a2a6be6024a96a84f7230faa2519f1a97"><td class="memItemLeft" align="right" valign="top">Pass&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1relay_1_1transform.html#a2a6be6024a96a84f7230faa2519f1a97">tvm::relay::transform::FuseOps</a> (int fuse_opt_level=-1)</td></tr>
-<tr class="memdesc:a2a6be6024a96a84f7230faa2519f1a97"><td class="mdescLeft">&#160;</td><td class="mdescRight">Fuse operations into expr into seperate functions.  <a href="namespacetvm_1_1relay_1_1transform.html#a2a6be6024a96a84f7230faa2519f1a97">More...</a><br /></td></tr>
+<tr class="memdesc:a2a6be6024a96a84f7230faa2519f1a97"><td class="mdescLeft">&#160;</td><td class="mdescRight">Fuse operations into expr into separate functions.  <a href="namespacetvm_1_1relay_1_1transform.html#a2a6be6024a96a84f7230faa2519f1a97">More...</a><br /></td></tr>
 <tr class="separator:a2a6be6024a96a84f7230faa2519f1a97"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a8f3eee7092f7e3e58e1c76f4498e32e7"><td class="memItemLeft" align="right" valign="top">Pass&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm_1_1relay_1_1transform.html#a8f3eee7092f7e3e58e1c76f4498e32e7">tvm::relay::transform::DefuseOps</a> ()</td></tr>
 <tr class="memdesc:a8f3eee7092f7e3e58e1c76f4498e32e7"><td class="mdescLeft">&#160;</td><td class="mdescRight">The inverse operation of FuseOps. It transforms a fused program returned by FuseOps into the program before FuseOps. (i.e. x == DefuseOps(FuseOps(x)))  <a href="namespacetvm_1_1relay_1_1transform.html#a8f3eee7092f7e3e58e1c76f4498e32e7">More...</a><br /></td></tr>
diff --git a/docs/reference/api/doxygen/relay_2transform_8h_source.html b/docs/reference/api/doxygen/relay_2transform_8h_source.html
index 748eefa9a..fcf230e3c 100644
--- a/docs/reference/api/doxygen/relay_2transform_8h_source.html
+++ b/docs/reference/api/doxygen/relay_2transform_8h_source.html
@@ -121,7 +121,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1relay_1_1transform_html_a03b053f3d99d5c420ddc8492e6b987bf"><div class="ttname"><a href="namespacetvm_1_1relay_1_1transform.html#a03b053f3d99d5c420ddc8492e6b987bf">tvm::relay::transform::RewriteAnnotatedOps</a></div><div class="ttdeci">Pass RewriteAnnotatedOps(int fallback_device)</div><div class="ttdoc">Rewrite the annotated program. </div></div>
 <div class="ttc" id="namespacetvm_1_1relay_1_1transform_html_a6185cc89297d9216551db7a3816d5180"><div class="ttname"><a href="namespacetvm_1_1relay_1_1transform.html#a6185cc89297d9216551db7a3816d5180">tvm::relay::transform::ToBasicBlockNormalForm</a></div><div class="ttdeci">Pass ToBasicBlockNormalForm()</div><div class="ttdoc">Turn an expression to Basic Block Normal Form. </div></div>
 <div class="ttc" id="namespacetvm_1_1relay_1_1transform_html_a2101aa797e69d398012ef94b63db51da"><div class="ttname"><a href="namespacetvm_1_1relay_1_1transform.html#a2101aa797e69d398012ef94b63db51da">tvm::relay::transform::CreateFunctionPass</a></div><div class="ttdeci">Pass CreateFunctionPass(const runtime::TypedPackedFunc&lt; Function(Function, IRModule, PassContext)&gt; &amp;pass_func, int opt_level, String name, tvm::Array&lt; String &gt; required)</div></div>
-<div class="ttc" id="namespacetvm_1_1relay_1_1transform_html_a2a6be6024a96a84f7230faa2519f1a97"><div class="ttname"><a href="namespacetvm_1_1relay_1_1transform.html#a2a6be6024a96a84f7230faa2519f1a97">tvm::relay::transform::FuseOps</a></div><div class="ttdeci">Pass FuseOps(int fuse_opt_level=-1)</div><div class="ttdoc">Fuse operations into expr into seperate functions. </div></div>
+<div class="ttc" id="namespacetvm_1_1relay_1_1transform_html_a2a6be6024a96a84f7230faa2519f1a97"><div class="ttname"><a href="namespacetvm_1_1relay_1_1transform.html#a2a6be6024a96a84f7230faa2519f1a97">tvm::relay::transform::FuseOps</a></div><div class="ttdeci">Pass FuseOps(int fuse_opt_level=-1)</div><div class="ttdoc">Fuse operations into expr into separate functions. </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1ObjectRef_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></div><div class="ttdoc">Base class of all object reference. </div><div class="ttdef"><b>Definition:</b> object.h:511</div></div>
 <div class="ttc" id="namespacetvm_1_1relay_html_aa65d5cde84db61b456ce982b5328fae2"><div class="ttname"><a href="namespacetvm_1_1relay.html#aa65d5cde84db61b456ce982b5328fae2">tvm::relay::SubstituteBoundVars</a></div><div class="ttdeci">Function SubstituteBoundVars(const Function &amp;func, const tvm::Map&lt; Var, Expr &gt; &amp;binds)</div><div class="ttdoc">Substitute variables with new variables (including function parameters) in a function. This is a helper function usually called by o [...]
 <div class="ttc" id="relay_2attrs_2transform_8h_html"><div class="ttname"><a href="relay_2attrs_2transform_8h.html">transform.h</a></div><div class="ttdoc">Transform operators. </div></div>
diff --git a/docs/reference/api/doxygen/runtime_2crt_2module_8h.html b/docs/reference/api/doxygen/runtime_2crt_2module_8h.html
index b8841688f..f47d1548b 100644
--- a/docs/reference/api/doxygen/runtime_2crt_2module_8h.html
+++ b/docs/reference/api/doxygen/runtime_2crt_2module_8h.html
@@ -161,7 +161,7 @@ Functions</h2></td></tr>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">mod</td><td>The module instance to register. </td></tr>
-    <tr><td class="paramname">out_handle</td><td>Pointer to recieve the newly-minted handle for this module. </td></tr>
+    <tr><td class="paramname">out_handle</td><td>Pointer to receive the newly-minted handle for this module. </td></tr>
   </table>
   </dd>
 </dl>
diff --git a/docs/reference/api/doxygen/target_8h_source.html b/docs/reference/api/doxygen/target_8h_source.html
index e30662c88..74616d0a1 100644
--- a/docs/reference/api/doxygen/target_8h_source.html
+++ b/docs/reference/api/doxygen/target_8h_source.html
@@ -85,7 +85,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1TargetNode_html_acedf257c039c25a6a16bf36b664d35c6"><div class="ttname"><a href="classtvm_1_1TargetNode.html#acedf257c039c25a6a16bf36b664d35c6">tvm::TargetNode::SEqualReduce</a></div><div class="ttdeci">bool SEqualReduce(const TargetNode *other, SEqualReducer equal) const</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Object_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></div><div class="ttdoc">base class of all object containers. </div><div class="ttdef"><b>Definition:</b> object.h:167</div></div>
 <div class="ttc" id="classtvm_1_1TargetNode_html_ac19a4ee0f0ec7ea607ec746bc4551b71"><div class="ttname"><a href="classtvm_1_1TargetNode.html#ac19a4ee0f0ec7ea607ec746bc4551b71">tvm::TargetNode::kind</a></div><div class="ttdeci">TargetKind kind</div><div class="ttdoc">The kind of the target device. </div><div class="ttdef"><b>Definition:</b> target.h:49</div></div>
-<div class="ttc" id="classtvm_1_1TargetNode_html_a3046260cd16b7b134fa99705b41d2aee"><div class="ttname"><a href="classtvm_1_1TargetNode.html#a3046260cd16b7b134fa99705b41d2aee">tvm::TargetNode::tag</a></div><div class="ttdeci">String tag</div><div class="ttdoc">Tag of the the target, can be empty. </div><div class="ttdef"><b>Definition:</b> target.h:53</div></div>
+<div class="ttc" id="classtvm_1_1TargetNode_html_a3046260cd16b7b134fa99705b41d2aee"><div class="ttname"><a href="classtvm_1_1TargetNode.html#a3046260cd16b7b134fa99705b41d2aee">tvm::TargetNode::tag</a></div><div class="ttdeci">String tag</div><div class="ttdoc">Tag of the target, can be empty. </div><div class="ttdef"><b>Definition:</b> target.h:53</div></div>
 <div class="ttc" id="classtvm_1_1TargetKind_html"><div class="ttname"><a href="classtvm_1_1TargetKind.html">tvm::TargetKind</a></div><div class="ttdoc">Managed reference class to TargetKindNode. </div><div class="ttdef"><b>Definition:</b> target_kind.h:145</div></div>
 <div class="ttc" id="classtvm_1_1AttrVisitor_html"><div class="ttname"><a href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a></div><div class="ttdoc">Visitor class to get the attributes of an AST/IR node. The content is going to be called for each fie...</div><div class="ttdef"><b>Definition:</b> reflection.h:52</div></div>
 <div class="ttc" id="target__kind_8h_html"><div class="ttname"><a href="target__kind_8h.html">target_kind.h</a></div><div class="ttdoc">Target kind registry. </div></div>
diff --git a/docs/reference/api/doxygen/tir_2op_8h.html b/docs/reference/api/doxygen/tir_2op_8h.html
index cb8c83d4a..946007f45 100644
--- a/docs/reference/api/doxygen/tir_2op_8h.html
+++ b/docs/reference/api/doxygen/tir_2op_8h.html
@@ -263,22 +263,22 @@ Functions</h2></td></tr>
 <tr class="memdesc:ac06a47be8386f28b313903e174dfe151"><td class="mdescLeft">&#160;</td><td class="mdescRight">Check if x is infinite.  <a href="namespacetvm.html#ac06a47be8386f28b313903e174dfe151">More...</a><br /></td></tr>
 <tr class="separator:ac06a47be8386f28b313903e174dfe151"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:afdad0c0329bd39949ba8d296cfb85d76"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a> (PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</td></tr>
-<tr class="memdesc:afdad0c0329bd39949ba8d296cfb85d76"><td class="mdescLeft">&#160;</td><td class="mdescRight">sum of of source expression over axis  <a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">More...</a><br /></td></tr>
+<tr class="memdesc:afdad0c0329bd39949ba8d296cfb85d76"><td class="mdescLeft">&#160;</td><td class="mdescRight">sum of source expression over axis  <a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">More...</a><br /></td></tr>
 <tr class="separator:afdad0c0329bd39949ba8d296cfb85d76"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:adeeaff4fb29f75a9da8ff4d67723c693"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">tvm::all</a> (PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</td></tr>
-<tr class="memdesc:adeeaff4fb29f75a9da8ff4d67723c693"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical And of of source expression over axis  <a href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">More...</a><br /></td></tr>
+<tr class="memdesc:adeeaff4fb29f75a9da8ff4d67723c693"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical And of source expression over axis  <a href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">More...</a><br /></td></tr>
 <tr class="separator:adeeaff4fb29f75a9da8ff4d67723c693"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a5efd9942cdee5a56cfc438ba523c04f0"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">tvm::any</a> (PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</td></tr>
-<tr class="memdesc:a5efd9942cdee5a56cfc438ba523c04f0"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical Or of of source expression over axis  <a href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">More...</a><br /></td></tr>
+<tr class="memdesc:a5efd9942cdee5a56cfc438ba523c04f0"><td class="mdescLeft">&#160;</td><td class="mdescRight">logical Or of source expression over axis  <a href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">More...</a><br /></td></tr>
 <tr class="separator:a5efd9942cdee5a56cfc438ba523c04f0"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a6dd252d98b23eaa28c43d6da6cffc1ec">tvm::max</a> (PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</td></tr>
-<tr class="memdesc:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="mdescLeft">&#160;</td><td class="mdescRight">max of of source expression over axis  <a href="namespacetvm.html#a6dd252d98b23eaa28c43d6da6cffc1ec">More...</a><br /></td></tr>
+<tr class="memdesc:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="mdescLeft">&#160;</td><td class="mdescRight">max of source expression over axis  <a href="namespacetvm.html#a6dd252d98b23eaa28c43d6da6cffc1ec">More...</a><br /></td></tr>
 <tr class="separator:a6dd252d98b23eaa28c43d6da6cffc1ec"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a2ea31d1a02499ea4b87694b4eeaa2306">tvm::min</a> (PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</td></tr>
-<tr class="memdesc:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="mdescLeft">&#160;</td><td class="mdescRight">max of of source expression over axis  <a href="namespacetvm.html#a2ea31d1a02499ea4b87694b4eeaa2306">More...</a><br /></td></tr>
+<tr class="memdesc:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="mdescLeft">&#160;</td><td class="mdescRight">min of source expression over axis  <a href="namespacetvm.html#a2ea31d1a02499ea4b87694b4eeaa2306">More...</a><br /></td></tr>
 <tr class="separator:a2ea31d1a02499ea4b87694b4eeaa2306"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">tvm::prod</a> (PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</td></tr>
-<tr class="memdesc:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="mdescLeft">&#160;</td><td class="mdescRight">product of of source expression over axis  <a href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">More...</a><br /></td></tr>
+<tr class="memdesc:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="mdescLeft">&#160;</td><td class="mdescRight">product of source expression over axis  <a href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">More...</a><br /></td></tr>
 <tr class="separator:a32a87ae9eacafb2b5b71b28bcc9ef35e"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:aaff65dde3044433b2220677aedf4855f"><td class="memItemLeft" align="right" valign="top">PrimExpr&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="namespacetvm.html#aaff65dde3044433b2220677aedf4855f">tvm::floor</a> (PrimExpr x, Span span=Span())</td></tr>
 <tr class="memdesc:aaff65dde3044433b2220677aedf4855f"><td class="mdescLeft">&#160;</td><td class="mdescRight">Calculate floor(x)  <a href="namespacetvm.html#aaff65dde3044433b2220677aedf4855f">More...</a><br /></td></tr>
diff --git a/docs/reference/api/doxygen/tir_2op_8h_source.html b/docs/reference/api/doxygen/tir_2op_8h_source.html
index 2e30e7f23..b9d7730cd 100644
--- a/docs/reference/api/doxygen/tir_2op_8h_source.html
+++ b/docs/reference/api/doxygen/tir_2op_8h_source.html
@@ -141,7 +141,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1runtime_1_1DataType_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1DataType.html">tvm::runtime::DataType</a></div><div class="ttdoc">Runtime primitive data type. </div><div class="ttdef"><b>Definition:</b> data_type.h:41</div></div>
 <div class="ttc" id="namespacetvm_html_a2cc4aceb274161870cfd06afacda5ca1"><div class="ttname"><a href="namespacetvm.html#a2cc4aceb274161870cfd06afacda5ca1">tvm::log1p</a></div><div class="ttdeci">PrimExpr log1p(PrimExpr x, Span span=Span())</div><div class="ttdef"><b>Definition:</b> op.h:704</div></div>
 <div class="ttc" id="tir_2expr_8h_html"><div class="ttname"><a href="tir_2expr_8h.html">expr.h</a></div><div class="ttdoc">TIR expressions. </div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="namespacetvm_html_a34084606675cd2c73c6b0f10e1618280"><div class="ttname"><a href="namespacetvm.html#a34084606675cd2c73c6b0f10e1618280">tvm::reinterpret</a></div><div class="ttdeci">PrimExpr reinterpret(const DataType &amp;t, PrimExpr value, Span span=Span())</div><div class="ttdoc">perform reinterpret cast value to type. </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="namespacetvm_html_a8f30aa0685ca52f846843e76a1ad1dc7"><div class="ttname"><a href="namespacetvm.html#a8f30aa0685ca52f846843e76a1ad1dc7">tvm::indexdiv</a></div><div class="ttdeci">PrimExpr indexdiv(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute floor(a / b) where a and b are non-negative. </div></div>
@@ -161,7 +161,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_html_a15f25703cfce73c75cb4cd33c74ea8f0"><div class="ttname"><a href="namespacetvm.html#a15f25703cfce73c75cb4cd33c74ea8f0">tvm::shapediv</a></div><div class="ttdeci">PrimExpr shapediv(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute ceil(a / b) where a and b are non-negative. </div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_aed3f57cf8d1c3546f075701898c5b70f"><div class="ttname"><a href="namespacetvm_1_1tir.html#aed3f57cf8d1c3546f075701898c5b70f">tvm::tir::make_zero</a></div><div class="ttdeci">PrimExpr make_zero(DataType t, Span span=Span())</div><div class="ttdoc">Make a const zero expr. </div><div class="ttdef"><b>Definition:</b> op.h:944</div></div>
 <div class="ttc" id="namespacetvm_html_a3f6d8fba545c2944efc83b57e6190459"><div class="ttname"><a href="namespacetvm.html#a3f6d8fba545c2944efc83b57e6190459">tvm::bitwise_neg</a></div><div class="ttdeci">PrimExpr bitwise_neg(PrimExpr a, Span span=Span())</div><div class="ttdoc">take bitwise negation of two values </div></div>
-<div class="ttc" id="namespacetvm_html_a5efd9942cdee5a56cfc438ba523c04f0"><div class="ttname"><a href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">tvm::any</a></div><div class="ttdeci">PrimExpr any(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical Or of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_a5efd9942cdee5a56cfc438ba523c04f0"><div class="ttname"><a href="namespacetvm.html#a5efd9942cdee5a56cfc438ba523c04f0">tvm::any</a></div><div class="ttdeci">PrimExpr any(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical Or of source expression over axis </div></div>
 <div class="ttc" id="namespacetvm_html_a139870d327497d548e2ef8bddba2f114"><div class="ttname"><a href="namespacetvm.html#a139870d327497d548e2ef8bddba2f114">tvm::erf</a></div><div class="ttdeci">PrimExpr erf(PrimExpr x, Span span=Span())</div><div class="ttdef"><b>Definition:</b> op.h:696</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1EvaluateNode_html"><div class="ttname"><a href="classtvm_1_1tir_1_1EvaluateNode.html">tvm::tir::EvaluateNode</a></div><div class="ttdoc">Evaluates an expression. This is mostly used for putting a Call node into Stmt. </div><div class="ttdef"><b>Definition:</b> stmt.h:868</div></div>
 <div class="ttc" id="namespacetvm_html_a96d86ba91e4855c84879ba886465cacf"><div class="ttname"><a href="namespacetvm.html#a96d86ba91e4855c84879ba886465cacf">tvm::nextafter</a></div><div class="ttdeci">PrimExpr nextafter(PrimExpr x, PrimExpr y, Span span=Span())</div><div class="ttdef"><b>Definition:</b> op.h:726</div></div>
@@ -174,7 +174,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_html_aa8e1cc91eb14b427e3018836d82e15e6"><div class="ttname"><a href="namespacetvm.html#aa8e1cc91eb14b427e3018836d82e15e6">tvm::acos</a></div><div class="ttdeci">PrimExpr acos(PrimExpr x, Span span=Span())</div><div class="ttdef"><b>Definition:</b> op.h:712</div></div>
 <div class="ttc" id="namespacetvm_html_a48fb9755f38ffcfcd03592a47ffbbd14"><div class="ttname"><a href="namespacetvm.html#a48fb9755f38ffcfcd03592a47ffbbd14">tvm::GetType</a></div><div class="ttdeci">Type GetType(const PrimExpr &amp;expr)</div><div class="ttdoc">Get the type of the expression under the unified type system. </div></div>
 <div class="ttc" id="namespacetvm_html_ac1b3a94a13d11c02d7e79cad2638e74a"><div class="ttname"><a href="namespacetvm.html#ac1b3a94a13d11c02d7e79cad2638e74a">tvm::log2</a></div><div class="ttdeci">PrimExpr log2(PrimExpr x, Span span=Span())</div><div class="ttdef"><b>Definition:</b> op.h:702</div></div>
-<div class="ttc" id="namespacetvm_html_adeeaff4fb29f75a9da8ff4d67723c693"><div class="ttname"><a href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">tvm::all</a></div><div class="ttdeci">PrimExpr all(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical And of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_adeeaff4fb29f75a9da8ff4d67723c693"><div class="ttname"><a href="namespacetvm.html#adeeaff4fb29f75a9da8ff4d67723c693">tvm::all</a></div><div class="ttdeci">PrimExpr all(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">logical And of source expression over axis </div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_a48cd6ae7623f42cddbb05cc008c33711"><div class="ttname"><a href="namespacetvm_1_1tir.html#a48cd6ae7623f42cddbb05cc008c33711">tvm::tir::IsPointerType</a></div><div class="ttdeci">bool IsPointerType(const Type &amp;type, const DataType &amp;element_type)</div><div class="ttdoc">Check if type is a pointer to a runtime element type. </div><div class="ttdef"><b>Definition:</b> op.h:739</div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_a5c414d5e54c099ad7287be302aac8f02"><div class="ttname"><a href="namespacetvm_1_1tir.html#a5c414d5e54c099ad7287be302aac8f02">tvm::tir::is_const_int</a></div><div class="ttdeci">bool is_const_int(const PrimExpr &amp;x, int64_t value)</div><div class="ttdoc">Check whether x is a constant integer expression. </div><div class="ttdef"><b>Definition:</b> op.h:892</div></div>
 <div class="ttc" id="namespacetvm_html_ac3932d85fd31819eae6a80841296af51"><div class="ttname"><a href="namespacetvm.html#ac3932d85fd31819eae6a80841296af51">tvm::not_equal</a></div><div class="ttdeci">PrimExpr not_equal(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">not_equal </div></div>
@@ -213,7 +213,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_html_aa048961a5d19e9f32071c1372809ecbd"><div class="ttname"><a href="namespacetvm.html#aa048961a5d19e9f32071c1372809ecbd">tvm::sigmoid</a></div><div class="ttdeci">PrimExpr sigmoid(PrimExpr x, Span span=Span())</div><div class="ttdef"><b>Definition:</b> op.h:698</div></div>
 <div class="ttc" id="tir_2op_8h_html_a8fc539385c2bb11740d0a6bef19be7b8"><div class="ttname"><a href="tir_2op_8h.html#a8fc539385c2bb11740d0a6bef19be7b8">TVM_DEFINE_BINOP_CONST_VAL_OVERLOAD_SPANNED</a></div><div class="ttdeci">#define TVM_DEFINE_BINOP_CONST_VAL_OVERLOAD_SPANNED(Name)</div><div class="ttdef"><b>Definition:</b> op.h:982</div></div>
 <div class="ttc" id="namespacetvm_html_ad4fceb4266c6e7644fa373eacf73359f"><div class="ttname"><a href="namespacetvm.html#ad4fceb4266c6e7644fa373eacf73359f">tvm::left_shift</a></div><div class="ttdeci">PrimExpr left_shift(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">left shift operator </div></div>
-<div class="ttc" id="namespacetvm_html_a32a87ae9eacafb2b5b71b28bcc9ef35e"><div class="ttname"><a href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">tvm::prod</a></div><div class="ttdeci">PrimExpr prod(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">product of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_a32a87ae9eacafb2b5b71b28bcc9ef35e"><div class="ttname"><a href="namespacetvm.html#a32a87ae9eacafb2b5b71b28bcc9ef35e">tvm::prod</a></div><div class="ttdeci">PrimExpr prod(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">product of source expression over axis </div></div>
 <div class="ttc" id="namespacetvm_1_1tir_html_a48bad3db162b334837716bf8e7ba9285"><div class="ttname"><a href="namespacetvm_1_1tir.html#a48bad3db162b334837716bf8e7ba9285">tvm::tir::is_zero</a></div><div class="ttdeci">bool is_zero(const PrimExpr &amp;x)</div><div class="ttdoc">Check whether x is a constant integer 0. </div><div class="ttdef"><b>Definition:</b> op.h:829</div></div>
 <div class="ttc" id="namespacetvm_html_a41918af1a1dc386388639a9d3ad06c5d"><div class="ttname"><a href="namespacetvm.html#a41918af1a1dc386388639a9d3ad06c5d">tvm::DataType</a></div><div class="ttdeci">runtime::DataType DataType</div><div class="ttdef"><b>Definition:</b> data_type.h:389</div></div>
 <div class="ttc" id="namespacetvm_html_a52fa1dc57423a077eb098960162e7b85"><div class="ttname"><a href="namespacetvm.html#a52fa1dc57423a077eb098960162e7b85">tvm::less</a></div><div class="ttdeci">PrimExpr less(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">less </div></div>
diff --git a/docs/reference/api/doxygen/topi_2nn_8h_source.html b/docs/reference/api/doxygen/topi_2nn_8h_source.html
index 15e5958cd..767d1e85c 100644
--- a/docs/reference/api/doxygen/topi_2nn_8h_source.html
+++ b/docs/reference/api/doxygen/topi_2nn_8h_source.html
@@ -97,7 +97,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_html_a16f9cd9219b505e2cc05c5a7558ac61f"><div class="ttname"><a href="namespacetvm.html#a16f9cd9219b505e2cc05c5a7558ac61f">tvm::div</a></div><div class="ttdeci">PrimExpr div(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute division in C semantics. </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html_aed6387e67d18b9d5ad18f510fd600a25"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html#aed6387e67d18b9d5ad18f510fd600a25">tvm::runtime::Array::size</a></div><div class="ttdeci">size_t size() const</div><div class="ttdef"><b>Definition:</b> array.h:399</div></div>
 <div class="ttc" id="tir_2expr_8h_html"><div class="ttname"><a href="tir_2expr_8h.html">expr.h</a></div><div class="ttdoc">TIR expressions. </div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="namespacetvm_1_1topi_html_a0250c4095f19ae8a22ed85bc4ce5a40d"><div class="ttname"><a href="namespacetvm_1_1topi.html#a0250c4095f19ae8a22ed85bc4ce5a40d">tvm::topi::kElementWise</a></div><div class="ttdeci">constexpr auto kElementWise</div><div class="ttdef"><b>Definition:</b> tags.h:32</div></div>
 <div class="ttc" id="namespacetvm_html_a8f30aa0685ca52f846843e76a1ad1dc7"><div class="ttname"><a href="namespacetvm.html#a8f30aa0685ca52f846843e76a1ad1dc7">tvm::indexdiv</a></div><div class="ttdeci">PrimExpr indexdiv(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute floor(a / b) where a and b are non-negative. </div></div>
diff --git a/docs/reference/api/doxygen/topi_2transform_8h_source.html b/docs/reference/api/doxygen/topi_2transform_8h_source.html
index 0713169dd..177cd5cd1 100644
--- a/docs/reference/api/doxygen/topi_2transform_8h_source.html
+++ b/docs/reference/api/doxygen/topi_2transform_8h_source.html
@@ -119,7 +119,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1topi_html_acb438962b08475a05e086907bf8eb26a"><div class="ttname"><a href="namespacetvm_1_1topi.html#acb438962b08475a05e086907bf8eb26a">tvm::topi::stack</a></div><div class="ttdeci">Tensor stack(const Array&lt; Tensor &gt; &amp;inputs, int axis=0, std::string name=&quot;T_stack&quot;, std::string tag=kInjective)</div><div class="ttdoc">Join a sequence of tensors along a new axis. </div><div class="ttdef"><b>Definition:</b> transform.h:529</div></div>
 <div class="ttc" id="tensor__utils_8h_html"><div class="ttname"><a href="tensor__utils_8h.html">tensor_utils.h</a></div><div class="ttdoc">Utility functions for handling tensor. </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1DataType_html_a237a714a6a16e14aa01fa4ac52426551"><div class="ttname"><a href="classtvm_1_1runtime_1_1DataType.html#a237a714a6a16e14aa01fa4ac52426551">tvm::runtime::DataType::Float</a></div><div class="ttdeci">static DataType Float(int bits, int lanes=1)</div><div class="ttdoc">Construct a float type. </div><div class="ttdef"><b>Definition:</b> data_type.h:168</div></div>
-<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of of source expression over axis </div></div>
+<div class="ttc" id="namespacetvm_html_afdad0c0329bd39949ba8d296cfb85d76"><div class="ttname"><a href="namespacetvm.html#afdad0c0329bd39949ba8d296cfb85d76">tvm::sum</a></div><div class="ttdeci">PrimExpr sum(PrimExpr source, Array&lt; tir::IterVar &gt; axis, Array&lt; PrimExpr &gt; init={}, Span span=Span())</div><div class="ttdoc">sum of source expression over axis </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Array_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array</a></div><div class="ttdoc">Array, container representing a contiguous sequence of ObjectRefs. </div><div class="ttdef"><b>Definition:</b> array.h:270</div></div>
 <div class="ttc" id="classtvm_1_1tir_1_1IndexMap_html"><div class="ttname"><a href="classtvm_1_1tir_1_1IndexMap.html">tvm::tir::IndexMap</a></div><div class="ttdef"><b>Definition:</b> index_map.h:154</div></div>
 <div class="ttc" id="namespacetvm_html_a8f30aa0685ca52f846843e76a1ad1dc7"><div class="ttname"><a href="namespacetvm.html#a8f30aa0685ca52f846843e76a1ad1dc7">tvm::indexdiv</a></div><div class="ttdeci">PrimExpr indexdiv(PrimExpr a, PrimExpr b, Span span=Span())</div><div class="ttdoc">compute floor(a / b) where a and b are non-negative. </div></div>
diff --git a/docs/reference/api/doxygen/transform__step_8h.html b/docs/reference/api/doxygen/transform__step_8h.html
index 22580bcfc..bbc568b17 100644
--- a/docs/reference/api/doxygen/transform__step_8h.html
+++ b/docs/reference/api/doxygen/transform__step_8h.html
@@ -183,7 +183,7 @@ Classes</h2></td></tr>
 <tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStepNode.html" title="Cache read step that corresponds to te::Schedule::cache_read. ">CacheReadStepNode</a>.  <a href="classtvm_1_1auto__scheduler_1_1CacheReadStep.html#details">More...</a><br /></td></tr>
 <tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:"><td class="memItemLeft" align="right" valign="top">class &#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html">tvm::auto_scheduler::CacheWriteStepNode</a></td></tr>
-<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The the tensor will take over body of original tens...">te::Schedule::cache_write</a>.  <a href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html#details">More...</a><br /></td></tr>
+<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Cache write step that corresponds to <a class="el" href="classtvm_1_1te_1_1Schedule.html#ada9825f59ef130a0ab0b3a01ea348d71" title="Create a cache write tensor for producing tensor. The tensor will take over body of original tensor o...">te::Schedule::cache_write</a>.  <a href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html#details">More...</a><br /></td></tr>
 <tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:"><td class="memItemLeft" align="right" valign="top">class &#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStep.html">tvm::auto_scheduler::CacheWriteStep</a></td></tr>
 <tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Managed reference to <a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html" title="Cache write step that corresponds to te::Schedule::cache_write. ">CacheWriteStepNode</a>.  <a href="classtvm_1_1auto__scheduler_1_1CacheWriteStep.html#details">More...</a><br /></td></tr>
diff --git a/docs/reference/api/python/auto_scheduler.html b/docs/reference/api/python/auto_scheduler.html
index 9d5a4f0c8..5b12f26e3 100644
--- a/docs/reference/api/python/auto_scheduler.html
+++ b/docs/reference/api/python/auto_scheduler.html
@@ -1602,7 +1602,7 @@ history states as starting point to perform Evolutionary Search).</p></li>
 
 <dl class="py class">
 <dt class="sig sig-object py" id="tvm.auto_scheduler.SketchPolicy">
-<em class="property"><span class="pre">class</span> </em><span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">SketchPolicy</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">program_cost_model</span></span><span class="o"><span class="pre">=</span></span><span class="defau [...]
+<em class="property"><span class="pre">class</span> </em><span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">SketchPolicy</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">program_cost_model</span></span><span class="o"><span class="pre">=</span></span><span class="defau [...]
 <dd><p>The search policy that searches in a hierarchical search space defined by sketches.
 The policy randomly samples programs from the space defined by sketches and use evolutionary
 search to fine-tune them.</p>
@@ -1886,7 +1886,7 @@ Candidates:
 
 <dl class="py function">
 <dt class="sig sig-object py" id="tvm.auto_scheduler.auto_schedule">
-<span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">auto_schedule</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">search_policy</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em clas [...]
+<span class="sig-prename descclassname"><span class="pre">tvm.auto_scheduler.</span></span><span class="sig-name descname"><span class="pre">auto_schedule</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">task</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">search_policy</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em clas [...]
 <dd><p>THIS API IS DEPRECATED.</p>
 <p>Run auto scheduling search for a task.</p>
 <dl class="field-list simple">
diff --git a/docs/reference/api/python/error.html b/docs/reference/api/python/error.html
index 181ad1fdb..bb3b3055d 100644
--- a/docs/reference/api/python/error.html
+++ b/docs/reference/api/python/error.html
@@ -367,7 +367,7 @@
 <span id="tvm-error"></span><h1>tvm.error<a class="headerlink" href="#module-tvm.error" title="Permalink to this headline">¶</a></h1>
 <p>Structured error classes in TVM.</p>
 <p>Each error class takes an error message as its input.
-See the example sections for for suggested message conventions.
+See the example sections for suggested message conventions.
 To make the code more readable, we recommend that developers
 copy the examples and raise errors with the same message convention.</p>
 <div class="admonition note">
diff --git a/docs/reference/api/python/ir.html b/docs/reference/api/python/ir.html
index 9998d39d9..ecea68e47 100644
--- a/docs/reference/api/python/ir.html
+++ b/docs/reference/api/python/ir.html
@@ -756,7 +756,7 @@ that the two nodes are structurally equal to each other.</p>
 <dd class="field-odd"><ul class="simple">
 <li><p><strong>node</strong> (<a class="reference internal" href="runtime.html#tvm.runtime.Object" title="tvm.runtime.Object"><em>Object</em></a>) – The input to be hashed.</p></li>
 <li><p><strong>map_free_vars</strong> (<a class="reference external" href="https://docs.python.org/3/library/functions.html#bool" title="(in Python v3.10)"><em>bool</em></a>) – If map_free_vars is set to true, we will hash free variables
-by the order of their occurences. Otherwise, we will hash by
+by the order of their occurrences. Otherwise, we will hash by
 their in-memory pointer address.</p></li>
 </ul>
 </dd>
diff --git a/docs/reference/api/python/relay/index.html b/docs/reference/api/python/relay/index.html
index 6181e4564..e67a4f7e6 100644
--- a/docs/reference/api/python/relay/index.html
+++ b/docs/reference/api/python/relay/index.html
@@ -435,7 +435,7 @@
 <td><p>Returns the indices of the minimum values along an axis.</p></td>
 </tr>
 <tr class="row-odd"><td><p><code class="xref py py-obj docutils literal notranslate"><span class="pre">argsort</span></code>(data[, axis, is_ascend, dtype])</p></td>
-<td><p>Performs sorting along the given axis and returns an array of indicies having same shape as an input array that index data in sorted order.</p></td>
+<td><p>Performs sorting along the given axis and returns an array of indices having same shape as an input array that index data in sorted order.</p></td>
 </tr>
 <tr class="row-even"><td><p><code class="xref py py-obj docutils literal notranslate"><span class="pre">argwhere</span></code>(condition)</p></td>
 <td><p>Find the indices of elements of a tensor that are non-zero.</p></td>
@@ -1973,7 +1973,7 @@ multiple indices, default is False (first index).</p></li>
 <dl class="py function">
 <dt class="sig sig-object py">
 <span class="sig-prename descclassname"><span class="pre">tvm.relay.</span></span><span class="sig-name descname"><span class="pre">argsort</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">data</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">axis</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">-</span> <span class="pre">1</span></span></em>, <em clas [...]
-<dd><p>Performs sorting along the given axis and returns an array of indicies
+<dd><p>Performs sorting along the given axis and returns an array of indices
 having same shape as an input array that index data in sorted order.</p>
 <dl class="field-list simple">
 <dt class="field-odd">Parameters</dt>
diff --git a/docs/reference/api/python/runtime.html b/docs/reference/api/python/runtime.html
index 7fd461040..0038d02f5 100644
--- a/docs/reference/api/python/runtime.html
+++ b/docs/reference/api/python/runtime.html
@@ -679,7 +679,7 @@ Returns None for all other devices.</p>
 <dt class="sig sig-object py" id="tvm.runtime.Device.warp_size">
 <em class="property"><span class="pre">property</span> </em><span class="sig-name descname"><span class="pre">warp_size</span></span><a class="headerlink" href="#tvm.runtime.Device.warp_size" title="Permalink to this definition">¶</a></dt>
 <dd><p>Number of threads that execute concurrently.</p>
-<p>Returns device value for for cuda, rocm, and vulkan.  Returns
+<p>Returns device value for cuda, rocm, and vulkan.  Returns
 1 for metal and opencl devices, regardless of the physical
 device.  Returns remote device value for RPC devices.  Returns
 None for all other devices.</p>
diff --git a/docs/reference/api/python/target.html b/docs/reference/api/python/target.html
index 1a49387cc..799a35c8e 100644
--- a/docs/reference/api/python/target.html
+++ b/docs/reference/api/python/target.html
@@ -920,7 +920,7 @@ testing.</p>
 <dd><p>Returns a dict of tags, which maps each tag name to its corresponding target.</p>
 <dl class="field-list simple">
 <dt class="field-odd">Returns</dt>
-<dd class="field-odd"><p><strong>tag_dict</strong> – The dict of tags mapping each tag name to to its corresponding target.
+<dd class="field-odd"><p><strong>tag_dict</strong> – The dict of tags mapping each tag name to its corresponding target.
 None if TVM is built in runtime-only mode.</p>
 </dd>
 <dt class="field-even">Return type</dt>
diff --git a/docs/reference/api/python/te.html b/docs/reference/api/python/te.html
index 08eb933f8..cc0dd2c12 100644
--- a/docs/reference/api/python/te.html
+++ b/docs/reference/api/python/te.html
@@ -2593,7 +2593,7 @@ be useful for the compiler to detect scan body faster.</p></li>
 </ul>
 </dd>
 <dt class="field-even">Returns</dt>
-<dd class="field-even"><p><strong>tensor</strong> – The created tensor or tuple of tensors it it contains multiple outputs.</p>
+<dd class="field-even"><p><strong>tensor</strong> – The created tensor, or a tuple of tensors if it contains multiple outputs.</p>
 </dd>
 <dt class="field-odd">Return type</dt>
 <dd class="field-odd"><p><a class="reference internal" href="#tvm.te.Tensor" title="tvm.te.Tensor">Tensor</a> or list of Tensors</p>
@@ -2652,7 +2652,7 @@ by default dtype will be same as inputs.</p></li>
 </dl>
 <dl class="field-list simple">
 <dt class="field-odd">Returns</dt>
-<dd class="field-odd"><p><strong>tensor</strong> – The created tensor or tuple of tensors it it contains multiple outputs.</p>
+<dd class="field-odd"><p><strong>tensor</strong> – The created tensor, or a tuple of tensors if it contains multiple outputs.</p>
 </dd>
 <dt class="field-even">Return type</dt>
 <dd class="field-even"><p><a class="reference internal" href="#tvm.te.Tensor" title="tvm.te.Tensor">Tensor</a> or list of Tensors</p>
diff --git a/docs/reference/api/python/tir.html b/docs/reference/api/python/tir.html
index e064b2cc5..f12ac5483 100644
--- a/docs/reference/api/python/tir.html
+++ b/docs/reference/api/python/tir.html
@@ -2826,7 +2826,7 @@ Same as call_packed, except that the first argument is the function name
 <p>The argument to packed function can be Expr or Buffer.
 The argument is the corresponding POD type when Expr is presented.</p>
 <p>When the argument is Buffer, the corresponding PackedFunc
-will recieve an TVMArrayHandle whose content is valid during the callback period.
+will receive a TVMArrayHandle whose content is valid during the callback period.
 If the PackedFunc is a python callback, then the corresponding argument is NDArray.</p>
 <dl class="field-list simple">
 <dt class="field-odd">Parameters</dt>
diff --git a/docs/reference/api/python/topi.html b/docs/reference/api/python/topi.html
index 39c10ab97..94a37e740 100644
--- a/docs/reference/api/python/topi.html
+++ b/docs/reference/api/python/topi.html
@@ -1913,7 +1913,7 @@ by default dtype will be same as inputs.</p></li>
 </dl>
 <dl class="field-list simple">
 <dt class="field-odd">Returns</dt>
-<dd class="field-odd"><p><strong>tensor</strong> – The created tensor or tuple of tensors it it contains multiple outputs.</p>
+<dd class="field-odd"><p><strong>tensor</strong> – The created tensor or tuple of tensors if it contains multiple outputs.</p>
 </dd>
 <dt class="field-even">Return type</dt>
 <dd class="field-even"><p><a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor">Tensor</a> or list of Tensors</p>
@@ -3712,7 +3712,7 @@ This gives frequency components of the signal as they change over time.
 <li><p><strong>a</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a>) – The tensor to be sliced.</p></li>
 <li><p><strong>v</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a>) – The values to set</p></li>
 <li><p><strong>begin</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a>) – The indices to begin with in the slicing.</p></li>
-<li><p><strong>end</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a>) – Indicies indicating end of the slice.</p></li>
+<li><p><strong>end</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a>) – Indices indicating end of the slice.</p></li>
 <li><p><strong>strides</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a><em>, </em><em>optional</em>) – Specifies the stride values. A stride can be negative;
 in that case, the input tensor will be reversed
 in that particular axis.</p></li>
@@ -3736,7 +3736,7 @@ in that particular axis.</p></li>
 <dd class="field-odd"><ul class="simple">
 <li><p><strong>a</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a>) – The tensor to be sliced.</p></li>
 <li><p><strong>begin</strong> (<em>list of int</em>) – The indices to begin with in the slicing.</p></li>
-<li><p><strong>end</strong> (<em>list of int</em>) – Indicies indicating end of the slice.</p></li>
+<li><p><strong>end</strong> (<em>list of int</em>) – Indices indicating end of the slice.</p></li>
 <li><p><strong>strides</strong> (<em>list of int</em><em>, </em><em>optional</em>) – Specifies the stride values. A stride can be negative;
 in that case, the input tensor will be reversed
 in that particular axis.</p></li>
@@ -7281,7 +7281,7 @@ the target.</p>
 <dd class="field-odd"><ul class="simple">
 <li><p><strong>a</strong> (<a class="reference internal" href="te.html#tvm.te.Tensor" title="tvm.te.Tensor"><em>tvm.te.Tensor</em></a>) – The tensor to be sliced.</p></li>
 <li><p><strong>begin</strong> (<em>list of int</em>) – The indices to begin with in the slicing.</p></li>
-<li><p><strong>end</strong> (<em>list of int</em>) – Indicies indicating end of the slice.</p></li>
+<li><p><strong>end</strong> (<em>list of int</em>) – Indices indicating end of the slice.</p></li>
 <li><p><strong>strides</strong> (<em>list of int</em><em>, </em><em>optional</em>) – Specifies the stride values. A stride can be negative;
 in that case, the input tensor will be reversed
 in that particular axis.</p></li>
diff --git a/docs/reference/api/typedoc/classes/bytestreamreader.html b/docs/reference/api/typedoc/classes/bytestreamreader.html
index 3896fc9d2..492df5474 100644
--- a/docs/reference/api/typedoc/classes/bytestreamreader.html
+++ b/docs/reference/api/typedoc/classes/bytestreamreader.html
@@ -119,7 +119,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -141,7 +141,7 @@
 					<div class="tsd-signature tsd-kind-icon">bytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Uint8Array</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L43">rpc_server.ts:43</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -151,7 +151,7 @@
 					<div class="tsd-signature tsd-kind-icon">offset<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L42">rpc_server.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L42">rpc_server.ts:42</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -168,7 +168,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L63">rpc_server.ts:63</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L63">rpc_server.ts:63</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">Uint8Array</span></h4>
@@ -185,7 +185,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L49">rpc_server.ts:49</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L49">rpc_server.ts:49</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -202,7 +202,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L57">rpc_server.ts:57</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L57">rpc_server.ts:57</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
diff --git a/docs/reference/api/typedoc/classes/cachedcallstack.html b/docs/reference/api/typedoc/classes/cachedcallstack.html
index f6df5b2a7..af48a9a8f 100644
--- a/docs/reference/api/typedoc/classes/cachedcallstack.html
+++ b/docs/reference/api/typedoc/classes/cachedcallstack.html
@@ -144,7 +144,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L223">memory.ts:223</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L223">memory.ts:223</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -172,7 +172,7 @@
 					<div class="tsd-signature tsd-kind-icon">temp<wbr>Args<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><a href="../interfaces/disposable.html" class="tsd-signature-type">Disposable</a><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = []</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L208">memory.ts:208</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L208">memory.ts:208</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -194,7 +194,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L312">memory.ts:312</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L312">memory.ts:312</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -226,7 +226,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L284">memory.ts:284</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L284">memory.ts:284</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -262,7 +262,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L388">memory.ts:388</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L388">memory.ts:388</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -300,7 +300,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L376">memory.ts:376</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L376">memory.ts:376</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -340,7 +340,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L267">memory.ts:267</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L267">memory.ts:267</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -373,7 +373,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L243">memory.ts:243</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L243">memory.ts:243</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -390,7 +390,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L321">memory.ts:321</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L321">memory.ts:321</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -422,7 +422,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L252">memory.ts:252</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L252">memory.ts:252</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -444,7 +444,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L359">memory.ts:359</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L359">memory.ts:359</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -470,7 +470,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L342">memory.ts:342</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L342">memory.ts:342</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -496,7 +496,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L350">memory.ts:350</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L350">memory.ts:350</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -522,7 +522,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L326">memory.ts:326</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L326">memory.ts:326</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -548,7 +548,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L363">memory.ts:363</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L363">memory.ts:363</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -574,7 +574,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L346">memory.ts:346</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L346">memory.ts:346</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -600,7 +600,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L334">memory.ts:334</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L334">memory.ts:334</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
diff --git a/docs/reference/api/typedoc/classes/dldatatype.html b/docs/reference/api/typedoc/classes/dldatatype.html
index 3af59ed75..a808d14e9 100644
--- a/docs/reference/api/typedoc/classes/dldatatype.html
+++ b/docs/reference/api/typedoc/classes/dldatatype.html
@@ -119,7 +119,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L262">runtime.ts:262</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L262">runtime.ts:262</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -147,7 +147,7 @@
 					<div class="tsd-signature tsd-kind-icon">bits<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L260">runtime.ts:260</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L260">runtime.ts:260</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">code<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L258">runtime.ts:258</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L258">runtime.ts:258</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -177,7 +177,7 @@
 					<div class="tsd-signature tsd-kind-icon">lanes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L262">runtime.ts:262</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L262">runtime.ts:262</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -199,7 +199,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L279">runtime.ts:279</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L279">runtime.ts:279</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -216,7 +216,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L270">runtime.ts:270</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L270">runtime.ts:270</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">string</span></h4>
diff --git a/docs/reference/api/typedoc/classes/dldevice.html b/docs/reference/api/typedoc/classes/dldevice.html
index 552985d75..e683130b4 100644
--- a/docs/reference/api/typedoc/classes/dldevice.html
+++ b/docs/reference/api/typedoc/classes/dldevice.html
@@ -118,7 +118,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L202">runtime.ts:202</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L202">runtime.ts:202</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -146,7 +146,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<wbr>Id<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L200">runtime.ts:200</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L200">runtime.ts:200</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -161,7 +161,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L198">runtime.ts:198</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L198">runtime.ts:198</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -183,7 +183,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L223">runtime.ts:223</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L223">runtime.ts:223</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -205,7 +205,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L230">runtime.ts:230</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L230">runtime.ts:230</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">string</span></h4>
diff --git a/docs/reference/api/typedoc/classes/environment.html b/docs/reference/api/typedoc/classes/environment.html
index 967f99243..628ae3f54 100644
--- a/docs/reference/api/typedoc/classes/environment.html
+++ b/docs/reference/api/typedoc/classes/environment.html
@@ -125,7 +125,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/environment.ts#L86">environment.ts:86</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/environment.ts#L86">environment.ts:86</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -169,7 +169,7 @@
 					<aside class="tsd-sources">
 						<p>Implementation of <a href="../interfaces/libraryprovider.html">LibraryProvider</a>.<a href="../interfaces/libraryprovider.html#imports">imports</a></p>
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/environment.ts#L70">environment.ts:70</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/environment.ts#L70">environment.ts:70</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 					<div class="tsd-signature tsd-kind-icon">logger<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>msg<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/environment.ts#L69">environment.ts:69</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/environment.ts#L69">environment.ts:69</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -210,7 +210,7 @@
 					<div class="tsd-signature tsd-kind-icon">packedCFunc<wbr>Table<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">ctypes.FTVMWasmPackedCFunc</span><span class="tsd-signature-symbol"> | </span><span class="tsd-signature-type">undefined</span><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = [undefined,]</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/environment.ts#L78">environment.ts:78</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/environment.ts#L78">environment.ts:78</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -228,7 +228,7 @@
 					<div class="tsd-signature tsd-kind-icon">packedCFunc<wbr>Table<wbr>Free<wbr>Id<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">&gt;</span><span class="tsd-signature-symbol"> = []</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/environment.ts#L84">environment.ts:84</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/environment.ts#L84">environment.ts:84</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -250,7 +250,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/environment.ts#L105">environment.ts:105</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/environment.ts#L105">environment.ts:105</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/ffilibrary.html b/docs/reference/api/typedoc/classes/ffilibrary.html
index d7b26fda2..e5cba53d6 100644
--- a/docs/reference/api/typedoc/classes/ffilibrary.html
+++ b/docs/reference/api/typedoc/classes/ffilibrary.html
@@ -131,7 +131,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L49">runtime.ts:49</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L49">runtime.ts:49</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -156,7 +156,7 @@
 					<div class="tsd-signature tsd-kind-icon">exports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">Function</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L46">runtime.ts:46</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L46">runtime.ts:46</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -166,7 +166,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L45">runtime.ts:45</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L45">runtime.ts:45</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">wasm32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">boolean</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L44">runtime.ts:44</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L44">runtime.ts:44</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -186,7 +186,7 @@
 					<div class="tsd-signature tsd-kind-icon">webGPUContext<span class="tsd-signature-symbol">:</span> <a href="webgpucontext.html" class="tsd-signature-type">WebGPUContext</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L47">runtime.ts:47</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L47">runtime.ts:47</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -203,7 +203,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L76">runtime.ts:76</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L76">runtime.ts:76</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -226,7 +226,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L66">runtime.ts:66</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L66">runtime.ts:66</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -243,7 +243,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L84">runtime.ts:84</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L84">runtime.ts:84</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <a href="cachedcallstack.html" class="tsd-signature-type">CachedCallStack</a></h4>
@@ -260,7 +260,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L95">runtime.ts:95</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L95">runtime.ts:95</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -283,7 +283,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L72">runtime.ts:72</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L72">runtime.ts:72</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
diff --git a/docs/reference/api/typedoc/classes/graphexecutor.html b/docs/reference/api/typedoc/classes/graphexecutor.html
index 7e7766c43..aa3fc526e 100644
--- a/docs/reference/api/typedoc/classes/graphexecutor.html
+++ b/docs/reference/api/typedoc/classes/graphexecutor.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L583">runtime.ts:583</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L583">runtime.ts:583</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">module<span class="tsd-signature-symbol">:</span> <a href="module.html" class="tsd-signature-type">Module</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L579">runtime.ts:579</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L579">runtime.ts:579</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L654">runtime.ts:654</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L654">runtime.ts:654</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -224,7 +224,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L597">runtime.ts:597</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L597">runtime.ts:597</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -241,7 +241,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L631">runtime.ts:631</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L631">runtime.ts:631</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -279,7 +279,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L644">runtime.ts:644</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L644">runtime.ts:644</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -310,7 +310,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L621">runtime.ts:621</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L621">runtime.ts:621</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -332,7 +332,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L609">runtime.ts:609</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L609">runtime.ts:609</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/instance.html b/docs/reference/api/typedoc/classes/instance.html
index 062644bfa..1b812f2ed 100644
--- a/docs/reference/api/typedoc/classes/instance.html
+++ b/docs/reference/api/typedoc/classes/instance.html
@@ -139,7 +139,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L692">runtime.ts:692</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L692">runtime.ts:692</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -202,7 +202,7 @@
 					<div class="tsd-signature tsd-kind-icon">exports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">Function</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L684">runtime.ts:684</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L684">runtime.ts:684</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -212,7 +212,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L683">runtime.ts:683</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L683">runtime.ts:683</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -229,7 +229,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L932">runtime.ts:932</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L932">runtime.ts:932</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -260,7 +260,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L994">runtime.ts:994</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L994">runtime.ts:994</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -303,7 +303,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L924">runtime.ts:924</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L924">runtime.ts:924</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -341,7 +341,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L732">runtime.ts:732</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L732">runtime.ts:732</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -358,7 +358,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L952">runtime.ts:952</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L952">runtime.ts:952</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -402,7 +402,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L816">runtime.ts:816</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L816">runtime.ts:816</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -434,7 +434,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L1033">runtime.ts:1033</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L1033">runtime.ts:1033</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -465,7 +465,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L846">runtime.ts:846</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L846">runtime.ts:846</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -497,7 +497,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L750">runtime.ts:750</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L750">runtime.ts:750</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -520,7 +520,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L1013">runtime.ts:1013</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L1013">runtime.ts:1013</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -568,7 +568,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L789">runtime.ts:789</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L789">runtime.ts:789</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -608,7 +608,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L914">runtime.ts:914</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L914">runtime.ts:914</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -646,7 +646,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L1145">runtime.ts:1145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L1145">runtime.ts:1145</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -698,7 +698,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L740">runtime.ts:740</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L740">runtime.ts:740</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -722,7 +722,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L868">runtime.ts:868</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L868">runtime.ts:868</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -754,7 +754,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L857">runtime.ts:857</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L857">runtime.ts:857</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -786,7 +786,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L940">runtime.ts:940</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L940">runtime.ts:940</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/memory.html b/docs/reference/api/typedoc/classes/memory.html
index 36f65bd71..5b6c60e5a 100644
--- a/docs/reference/api/typedoc/classes/memory.html
+++ b/docs/reference/api/typedoc/classes/memory.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L40">memory.ts:40</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L40">memory.ts:40</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -152,7 +152,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Memory</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L32">memory.ts:32</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L32">memory.ts:32</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -162,7 +162,7 @@
 					<div class="tsd-signature tsd-kind-icon">wasm32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">boolean</span><span class="tsd-signature-symbol"> = true</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L33">memory.ts:33</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L33">memory.ts:33</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -179,7 +179,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L154">memory.ts:154</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L154">memory.ts:154</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -210,7 +210,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L90">memory.ts:90</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L90">memory.ts:90</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -233,7 +233,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L97">memory.ts:97</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L97">memory.ts:97</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -256,7 +256,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L74">memory.ts:74</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L74">memory.ts:74</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -279,7 +279,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L81">memory.ts:81</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L81">memory.ts:81</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -302,7 +302,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L104">memory.ts:104</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L104">memory.ts:104</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -325,7 +325,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L132">memory.ts:132</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L132">memory.ts:132</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -362,7 +362,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L145">memory.ts:145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L145">memory.ts:145</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -393,7 +393,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L60">memory.ts:60</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L60">memory.ts:60</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -416,7 +416,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L67">memory.ts:67</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L67">memory.ts:67</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -439,7 +439,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L53">memory.ts:53</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L53">memory.ts:53</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -462,7 +462,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L114">memory.ts:114</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L114">memory.ts:114</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -485,7 +485,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L124">memory.ts:124</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L124">memory.ts:124</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">number</span></h4>
@@ -502,7 +502,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/memory.ts#L175">memory.ts:175</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/memory.ts#L175">memory.ts:175</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/module.html b/docs/reference/api/typedoc/classes/module.html
index ae3d45556..dd4e46aa3 100644
--- a/docs/reference/api/typedoc/classes/module.html
+++ b/docs/reference/api/typedoc/classes/module.html
@@ -124,7 +124,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L504">runtime.ts:504</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L504">runtime.ts:504</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -170,7 +170,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L502">runtime.ts:502</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L502">runtime.ts:502</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -187,7 +187,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L516">runtime.ts:516</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L516">runtime.ts:516</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -204,7 +204,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L530">runtime.ts:530</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L530">runtime.ts:530</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -236,7 +236,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L561">runtime.ts:561</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L561">runtime.ts:561</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/ndarray.html b/docs/reference/api/typedoc/classes/ndarray.html
index 77529b9ad..9221dbf97 100644
--- a/docs/reference/api/typedoc/classes/ndarray.html
+++ b/docs/reference/api/typedoc/classes/ndarray.html
@@ -130,7 +130,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L304">runtime.ts:304</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L304">runtime.ts:304</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -158,7 +158,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<span class="tsd-signature-symbol">:</span> <a href="dldevice.html" class="tsd-signature-type">DLDevice</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L297">runtime.ts:297</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L297">runtime.ts:297</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -173,7 +173,7 @@
 					<div class="tsd-signature tsd-kind-icon">dtype<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L293">runtime.ts:293</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L293">runtime.ts:293</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -188,7 +188,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L289">runtime.ts:289</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L289">runtime.ts:289</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -203,7 +203,7 @@
 					<div class="tsd-signature tsd-kind-icon">ndim<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L291">runtime.ts:291</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L291">runtime.ts:291</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -218,7 +218,7 @@
 					<div class="tsd-signature tsd-kind-icon">shape<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L295">runtime.ts:295</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L295">runtime.ts:295</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -240,7 +240,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L370">runtime.ts:370</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L370">runtime.ts:370</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -273,7 +273,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L414">runtime.ts:414</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L414">runtime.ts:414</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -305,7 +305,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L355">runtime.ts:355</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L355">runtime.ts:355</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
@@ -322,7 +322,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L474">runtime.ts:474</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L474">runtime.ts:474</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -346,7 +346,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L443">runtime.ts:443</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L443">runtime.ts:443</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/packedfunccell.html b/docs/reference/api/typedoc/classes/packedfunccell.html
index b1d564936..e484e85d5 100644
--- a/docs/reference/api/typedoc/classes/packedfunccell.html
+++ b/docs/reference/api/typedoc/classes/packedfunccell.html
@@ -122,7 +122,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L158">runtime.ts:158</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L158">runtime.ts:158</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -147,7 +147,7 @@
 					<div class="tsd-signature tsd-kind-icon">handle<span class="tsd-signature-symbol">:</span> <a href="../index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L157">runtime.ts:157</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L157">runtime.ts:157</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -164,7 +164,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L165">runtime.ts:165</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L165">runtime.ts:165</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">void</span></h4>
diff --git a/docs/reference/api/typedoc/classes/rpcserver.html b/docs/reference/api/typedoc/classes/rpcserver.html
index 5cb4b3fc5..dc09f00d9 100644
--- a/docs/reference/api/typedoc/classes/rpcserver.html
+++ b/docs/reference/api/typedoc/classes/rpcserver.html
@@ -115,7 +115,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L92">rpc_server.ts:92</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L92">rpc_server.ts:92</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">get<wbr>Imports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">unknown</span><span class="tsd-signat [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L82">rpc_server.ts:82</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L82">rpc_server.ts:82</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -201,7 +201,7 @@
 					<div class="tsd-signature tsd-kind-icon">key<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L78">rpc_server.ts:78</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L78">rpc_server.ts:78</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -211,7 +211,7 @@
 					<div class="tsd-signature tsd-kind-icon">logger<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>msg<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L81">rpc_server.ts:81</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L81">rpc_server.ts:81</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-type-declaration">
@@ -242,7 +242,7 @@
 					<div class="tsd-signature tsd-kind-icon">socket<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">WebSocket</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L79">rpc_server.ts:79</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L79">rpc_server.ts:79</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -252,7 +252,7 @@
 					<div class="tsd-signature tsd-kind-icon">state<span class="tsd-signature-symbol">:</span> <a href="../enums/rpcserverstate.html" class="tsd-signature-type">RPCServerState</a><span class="tsd-signature-symbol"> = RPCServerState.InitHeader</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L80">rpc_server.ts:80</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L80">rpc_server.ts:80</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -262,7 +262,7 @@
 					<div class="tsd-signature tsd-kind-icon">url<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L77">rpc_server.ts:77</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L77">rpc_server.ts:77</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/classes/scalar.html b/docs/reference/api/typedoc/classes/scalar.html
index 7d128bba8..114f50e14 100644
--- a/docs/reference/api/typedoc/classes/scalar.html
+++ b/docs/reference/api/typedoc/classes/scalar.html
@@ -112,7 +112,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L145">runtime.ts:145</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L145">runtime.ts:145</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -137,7 +137,7 @@
 					<div class="tsd-signature tsd-kind-icon">dtype<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L145">runtime.ts:145</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L145">runtime.ts:145</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -152,7 +152,7 @@
 					<div class="tsd-signature tsd-kind-icon">value<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L143">runtime.ts:143</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L143">runtime.ts:143</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/classes/webgpucontext.html b/docs/reference/api/typedoc/classes/webgpucontext.html
index c1daf8524..892ee2d3e 100644
--- a/docs/reference/api/typedoc/classes/webgpucontext.html
+++ b/docs/reference/api/typedoc/classes/webgpucontext.html
@@ -120,7 +120,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L57">webgpu.ts:57</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L57">webgpu.ts:57</a></li>
 								</ul>
 							</aside>
 							<h4 class="tsd-parameters-title">Parameters</h4>
@@ -145,7 +145,7 @@
 					<div class="tsd-signature tsd-kind-icon">device<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">GPUDevice</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L50">webgpu.ts:50</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L50">webgpu.ts:50</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -155,7 +155,7 @@
 					<div class="tsd-signature tsd-kind-icon">memory<span class="tsd-signature-symbol">:</span> <a href="memory.html" class="tsd-signature-type">Memory</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L51">webgpu.ts:51</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L51">webgpu.ts:51</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -172,7 +172,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L84">webgpu.ts:84</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L84">webgpu.ts:84</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -209,7 +209,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L170">webgpu.ts:170</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L170">webgpu.ts:170</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -238,7 +238,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L67">webgpu.ts:67</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L67">webgpu.ts:67</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/enums/argtypecode.html b/docs/reference/api/typedoc/enums/argtypecode.html
index 9012c2df5..a3285b05d 100644
--- a/docs/reference/api/typedoc/enums/argtypecode.html
+++ b/docs/reference/api/typedoc/enums/argtypecode.html
@@ -106,7 +106,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLDevice<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 6</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L220">ctypes.ts:220</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L220">ctypes.ts:220</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -116,7 +116,7 @@
 					<div class="tsd-signature tsd-kind-icon">Float<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L216">ctypes.ts:216</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L216">ctypes.ts:216</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -126,7 +126,7 @@
 					<div class="tsd-signature tsd-kind-icon">Int<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L214">ctypes.ts:214</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L214">ctypes.ts:214</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -136,7 +136,7 @@
 					<div class="tsd-signature tsd-kind-icon">Null<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L218">ctypes.ts:218</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L218">ctypes.ts:218</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -146,7 +146,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMBytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 12</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L226">ctypes.ts:226</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L226">ctypes.ts:226</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -156,7 +156,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMDLTensor<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 7</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L221">ctypes.ts:221</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L221">ctypes.ts:221</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -166,7 +166,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMData<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 5</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L219">ctypes.ts:219</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L219">ctypes.ts:219</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -176,7 +176,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMModule<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 9</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L223">ctypes.ts:223</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L223">ctypes.ts:223</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -186,7 +186,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMNDArray<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 13</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L227">ctypes.ts:227</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L227">ctypes.ts:227</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -196,7 +196,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMObject<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L222">ctypes.ts:222</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L222">ctypes.ts:222</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -206,7 +206,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMObjectRValue<wbr>Ref<wbr>Arg<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 14</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L228">ctypes.ts:228</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L228">ctypes.ts:228</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -216,7 +216,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMOpaque<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 3</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L217">ctypes.ts:217</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L217">ctypes.ts:217</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -226,7 +226,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMPacked<wbr>Func<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 10</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L224">ctypes.ts:224</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L224">ctypes.ts:224</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -236,7 +236,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 11</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L225">ctypes.ts:225</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L225">ctypes.ts:225</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -246,7 +246,7 @@
 					<div class="tsd-signature tsd-kind-icon">UInt<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L215">ctypes.ts:215</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L215">ctypes.ts:215</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/aynccallbackcode.html b/docs/reference/api/typedoc/enums/aynccallbackcode.html
index c88ae3697..b2397b77d 100644
--- a/docs/reference/api/typedoc/enums/aynccallbackcode.html
+++ b/docs/reference/api/typedoc/enums/aynccallbackcode.html
@@ -93,7 +93,7 @@
 					<div class="tsd-signature tsd-kind-icon">k<wbr>Exception<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 5</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L676">runtime.ts:676</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L676">runtime.ts:676</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -103,7 +103,7 @@
 					<div class="tsd-signature tsd-kind-icon">k<wbr>Return<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L675">runtime.ts:675</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L675">runtime.ts:675</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/dldatatypecode.html b/docs/reference/api/typedoc/enums/dldatatypecode.html
index 2dcf0ae12..b30299ea5 100644
--- a/docs/reference/api/typedoc/enums/dldatatypecode.html
+++ b/docs/reference/api/typedoc/enums/dldatatypecode.html
@@ -95,7 +95,7 @@
 					<div class="tsd-signature tsd-kind-icon">Float<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L242">runtime.ts:242</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L242">runtime.ts:242</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -105,7 +105,7 @@
 					<div class="tsd-signature tsd-kind-icon">Int<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 0</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L240">runtime.ts:240</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L240">runtime.ts:240</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -115,7 +115,7 @@
 					<div class="tsd-signature tsd-kind-icon">Opaque<wbr>Handle<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 3</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L243">runtime.ts:243</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L243">runtime.ts:243</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -125,7 +125,7 @@
 					<div class="tsd-signature tsd-kind-icon">UInt<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L241">runtime.ts:241</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L241">runtime.ts:241</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/rpcserverstate.html b/docs/reference/api/typedoc/enums/rpcserverstate.html
index 9876f461b..429437afc 100644
--- a/docs/reference/api/typedoc/enums/rpcserverstate.html
+++ b/docs/reference/api/typedoc/enums/rpcserverstate.html
@@ -90,7 +90,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Header<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L27">rpc_server.ts:27</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L27">rpc_server.ts:27</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -100,7 +100,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Header<wbr>Key<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L28">rpc_server.ts:28</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L28">rpc_server.ts:28</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -110,7 +110,7 @@
 					<div class="tsd-signature tsd-kind-icon">Init<wbr>Server<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L29">rpc_server.ts:29</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L29">rpc_server.ts:29</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -120,7 +120,7 @@
 					<div class="tsd-signature tsd-kind-icon">Receive<wbr>Packet<wbr>Body<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L32">rpc_server.ts:32</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L32">rpc_server.ts:32</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -130,7 +130,7 @@
 					<div class="tsd-signature tsd-kind-icon">Receive<wbr>Packet<wbr>Header<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L31">rpc_server.ts:31</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L31">rpc_server.ts:31</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -140,7 +140,7 @@
 					<div class="tsd-signature tsd-kind-icon">Wait<wbr>For<wbr>Callback<span class="tsd-signature-symbol">:</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L30">rpc_server.ts:30</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L30">rpc_server.ts:30</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/enums/sizeof.html b/docs/reference/api/typedoc/enums/sizeof.html
index cff779bb3..22ed752f3 100644
--- a/docs/reference/api/typedoc/enums/sizeof.html
+++ b/docs/reference/api/typedoc/enums/sizeof.html
@@ -100,7 +100,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLData<wbr>Type<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = I32</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L206">ctypes.ts:206</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L206">ctypes.ts:206</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -110,7 +110,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLDevice<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = I32 + I32</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L207">ctypes.ts:207</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L207">ctypes.ts:207</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -120,7 +120,7 @@
 					<div class="tsd-signature tsd-kind-icon">F32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L203">ctypes.ts:203</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L203">ctypes.ts:203</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -130,7 +130,7 @@
 					<div class="tsd-signature tsd-kind-icon">F64<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L204">ctypes.ts:204</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L204">ctypes.ts:204</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -140,7 +140,7 @@
 					<div class="tsd-signature tsd-kind-icon">I32<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 4</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L201">ctypes.ts:201</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L201">ctypes.ts:201</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -150,7 +150,7 @@
 					<div class="tsd-signature tsd-kind-icon">I64<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L202">ctypes.ts:202</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L202">ctypes.ts:202</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -160,7 +160,7 @@
 					<div class="tsd-signature tsd-kind-icon">TVMValue<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 8</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L205">ctypes.ts:205</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L205">ctypes.ts:205</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -170,7 +170,7 @@
 					<div class="tsd-signature tsd-kind-icon">U16<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 2</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L200">ctypes.ts:200</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L200">ctypes.ts:200</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -180,7 +180,7 @@
 					<div class="tsd-signature tsd-kind-icon">U8<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol"> = 1</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L199">ctypes.ts:199</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L199">ctypes.ts:199</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/index.html b/docs/reference/api/typedoc/index.html
index de1858f68..dfc935976 100644
--- a/docs/reference/api/typedoc/index.html
+++ b/docs/reference/api/typedoc/index.html
@@ -174,7 +174,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Alloc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>shape<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, ndim<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, dtypeCode<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, dtypeBits<span class="tsd [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L112">ctypes.ts:112</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L112">ctypes.ts:112</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -238,7 +238,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>From<wbr>Bytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, data<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nbytes<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">num [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L128">ctypes.ts:128</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L128">ctypes.ts:128</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -282,7 +282,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>From<wbr>To<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>from<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, to<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, stream<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-sig [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L144">ctypes.ts:144</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L144">ctypes.ts:144</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -326,7 +326,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Copy<wbr>ToBytes<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, data<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nbytes<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</sp [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L136">ctypes.ts:136</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L136">ctypes.ts:136</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -370,7 +370,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMArray<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>handle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L121">ctypes.ts:121</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L121">ctypes.ts:121</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -406,7 +406,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMBackend<wbr>PackedCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>argValues<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, argCodes<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nargs<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number< [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L160">ctypes.ts:160</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L160">ctypes.ts:160</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -458,7 +458,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMCFunc<wbr>Set<wbr>Return<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>ret<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, value<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCode<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signa [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L77">ctypes.ts:77</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L77">ctypes.ts:77</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -506,7 +506,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMCb<wbr>Arg<wbr>ToReturn<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>value<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, code<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span c [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L83">ctypes.ts:83</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L83">ctypes.ts:83</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -545,7 +545,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Call<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>func<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, argValues<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCode<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-t [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L67">ctypes.ts:67</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L67">ctypes.ts:67</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -601,7 +601,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>func<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L57">ctypes.ts:57</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L57">ctypes.ts:57</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -637,7 +637,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Get<wbr>Global<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>name<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, out<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span cla [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L100">ctypes.ts:100</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L100">ctypes.ts:100</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -676,7 +676,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>List<wbr>Global<wbr>Names<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>outSize<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, outArray<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&g [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L88">ctypes.ts:88</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L88">ctypes.ts:88</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -715,7 +715,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMFunc<wbr>Register<wbr>Global<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>name<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, f<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, override<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</spa [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L94">ctypes.ts:94</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L94">ctypes.ts:94</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -758,7 +758,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMGet<wbr>Last<wbr>Error<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L34">ctypes.ts:34</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L34">ctypes.ts:34</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -788,7 +788,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Free<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L52">ctypes.ts:52</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L52">ctypes.ts:52</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -824,7 +824,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Get<wbr>Function<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, funcName<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, queryImports<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">numbe [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L42">ctypes.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L42">ctypes.ts:42</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -872,7 +872,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMMod<wbr>Import<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>mod<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, dep<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-si [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L48">ctypes.ts:48</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L48">ctypes.ts:48</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -912,7 +912,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMSynchronize<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>deviceType<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, deviceId<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, stream<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signatur [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L150">ctypes.ts:150</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L150">ctypes.ts:150</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -954,7 +954,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Alloc<wbr>Space<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>size<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L167">ctypes.ts:167</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L167">ctypes.ts:167</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -990,7 +990,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Free<wbr>Space<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>ptr<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L170">ctypes.ts:170</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L170">ctypes.ts:170</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1026,7 +1026,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>Func<wbr>Create<wbr>FromCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>resource<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, out<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&g [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L187">ctypes.ts:187</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L187">ctypes.ts:187</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1066,7 +1066,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>PackedCFunc<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>args<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, typeCodes<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a>, nargs<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span>, [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L179">ctypes.ts:179</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L179">ctypes.ts:179</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1118,7 +1118,7 @@
 					<div class="tsd-signature tsd-kind-icon">FTVMWasm<wbr>PackedCFunc<wbr>Finalizer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>resourceHandle<span class="tsd-signature-symbol">: </span><a href="index.html#pointer" class="tsd-signature-type">Pointer</a><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L193">ctypes.ts:193</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L193">ctypes.ts:193</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1154,7 +1154,7 @@
 					<div class="tsd-signature tsd-kind-icon">GPUPointer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L25">webgpu.ts:25</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L25">webgpu.ts:25</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1169,7 +1169,7 @@
 					<div class="tsd-signature tsd-kind-icon">Packed<wbr>Func<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">...</span>args<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol"> &amp; </span><a href="interfaces/disp [...]
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L36">runtime.ts:36</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L36">runtime.ts:36</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1184,7 +1184,7 @@
 					<div class="tsd-signature tsd-kind-icon">Pointer<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L25">ctypes.ts:25</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L25">ctypes.ts:25</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1199,7 +1199,7 @@
 					<div class="tsd-signature tsd-kind-icon">Ptr<wbr>Offset<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/ctypes.ts#L28">ctypes.ts:28</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/ctypes.ts#L28">ctypes.ts:28</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1217,7 +1217,7 @@
 					<div class="tsd-signature tsd-kind-icon">RPC_<wbr>MAGIC<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">1045105</span><span class="tsd-signature-symbol"> = 1045105</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/rpc_server.ts#L36">rpc_server.ts:36</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/rpc_server.ts#L36">rpc_server.ts:36</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -1239,7 +1239,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/support.ts#L25">support.ts:25</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/support.ts#L25">support.ts:25</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1271,7 +1271,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/support.ts#L39">support.ts:39</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/support.ts#L39">support.ts:39</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1300,7 +1300,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/support.ts#L52">support.ts:52</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/support.ts#L52">support.ts:52</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1337,7 +1337,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/compact.ts#L38">compact.ts:38</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/compact.ts#L38">compact.ts:38</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1368,7 +1368,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L30">webgpu.ts:30</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L30">webgpu.ts:30</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1390,7 +1390,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/environment.ts#L32">environment.ts:32</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/environment.ts#L32">environment.ts:32</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1421,7 +1421,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/compact.ts#L24">compact.ts:24</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/compact.ts#L24">compact.ts:24</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1443,7 +1443,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L1367">runtime.ts:1367</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L1367">runtime.ts:1367</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1508,7 +1508,7 @@
 						<li class="tsd-description">
 							<aside class="tsd-sources">
 								<ul>
-									<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/support.ts#L62">support.ts:62</a></li>
+									<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/support.ts#L62">support.ts:62</a></li>
 								</ul>
 							</aside>
 							<div class="tsd-comment tsd-typography">
@@ -1530,7 +1530,7 @@
 					<div class="tsd-signature tsd-kind-icon">DLData<wbr>Type<wbr>Code<wbr>ToStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L246">runtime.ts:246</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L246">runtime.ts:246</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1539,7 +1539,7 @@
 						<div class="tsd-signature tsd-kind-icon">0<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;int&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L247">runtime.ts:247</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L247">runtime.ts:247</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1549,7 +1549,7 @@
 						<div class="tsd-signature tsd-kind-icon">1<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;uint&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L248">runtime.ts:248</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L248">runtime.ts:248</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1559,7 +1559,7 @@
 						<div class="tsd-signature tsd-kind-icon">2<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;float&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L249">runtime.ts:249</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L249">runtime.ts:249</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1569,7 +1569,7 @@
 						<div class="tsd-signature tsd-kind-icon">3<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;handle&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L250">runtime.ts:250</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L250">runtime.ts:250</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1580,7 +1580,7 @@
 					<div class="tsd-signature tsd-kind-icon">Device<wbr>Enum<wbr>ToStr<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L175">runtime.ts:175</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L175">runtime.ts:175</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1589,7 +1589,7 @@
 						<div class="tsd-signature tsd-kind-icon">1<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;cpu&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L176">runtime.ts:176</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L176">runtime.ts:176</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1599,7 +1599,7 @@
 						<div class="tsd-signature tsd-kind-icon">15<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;webgpu&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L180">runtime.ts:180</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L180">runtime.ts:180</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1609,7 +1609,7 @@
 						<div class="tsd-signature tsd-kind-icon">2<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;cuda&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L177">runtime.ts:177</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L177">runtime.ts:177</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1619,7 +1619,7 @@
 						<div class="tsd-signature tsd-kind-icon">4<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;opencl&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L178">runtime.ts:178</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L178">runtime.ts:178</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1629,7 +1629,7 @@
 						<div class="tsd-signature tsd-kind-icon">8<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol"> = &quot;metal&quot;</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L179">runtime.ts:179</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L179">runtime.ts:179</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1640,7 +1640,7 @@
 					<div class="tsd-signature tsd-kind-icon">Device<wbr>Str<wbr>ToEnum<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">object</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L183">runtime.ts:183</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L183">runtime.ts:183</a></li>
 						</ul>
 					</aside>
 					<section class="tsd-panel tsd-member tsd-kind-variable tsd-parent-kind-object-literal">
@@ -1649,7 +1649,7 @@
 						<div class="tsd-signature tsd-kind-icon">cl<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 4</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L186">runtime.ts:186</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L186">runtime.ts:186</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1659,7 +1659,7 @@
 						<div class="tsd-signature tsd-kind-icon">cpu<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 1</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L184">runtime.ts:184</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L184">runtime.ts:184</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1669,7 +1669,7 @@
 						<div class="tsd-signature tsd-kind-icon">cuda<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 2</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L185">runtime.ts:185</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L185">runtime.ts:185</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1679,7 +1679,7 @@
 						<div class="tsd-signature tsd-kind-icon">metal<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 8</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L189">runtime.ts:189</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L189">runtime.ts:189</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1689,7 +1689,7 @@
 						<div class="tsd-signature tsd-kind-icon">opencl<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 4</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L187">runtime.ts:187</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L187">runtime.ts:187</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1699,7 +1699,7 @@
 						<div class="tsd-signature tsd-kind-icon">vulkan<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 7</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L188">runtime.ts:188</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L188">runtime.ts:188</a></li>
 							</ul>
 						</aside>
 					</section>
@@ -1709,7 +1709,7 @@
 						<div class="tsd-signature tsd-kind-icon">webgpu<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">number</span><span class="tsd-signature-symbol"> = 15</span></div>
 						<aside class="tsd-sources">
 							<ul>
-								<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/runtime.ts#L190">runtime.ts:190</a></li>
+								<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/runtime.ts#L190">runtime.ts:190</a></li>
 							</ul>
 						</aside>
 					</section>
diff --git a/docs/reference/api/typedoc/interfaces/disposable.html b/docs/reference/api/typedoc/interfaces/disposable.html
index 58b88a086..1e6b24759 100644
--- a/docs/reference/api/typedoc/interfaces/disposable.html
+++ b/docs/reference/api/typedoc/interfaces/disposable.html
@@ -113,7 +113,7 @@
 					<div class="tsd-signature tsd-kind-icon">dispose<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/types.ts#L52">types.ts:52</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/types.ts#L52">types.ts:52</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/api/typedoc/interfaces/functioninfo.html b/docs/reference/api/typedoc/interfaces/functioninfo.html
index 63989544c..5f5566791 100644
--- a/docs/reference/api/typedoc/interfaces/functioninfo.html
+++ b/docs/reference/api/typedoc/interfaces/functioninfo.html
@@ -95,7 +95,7 @@
 					<div class="tsd-signature tsd-kind-icon">arg_<wbr>types<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L41">webgpu.ts:41</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L41">webgpu.ts:41</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -105,7 +105,7 @@
 					<div class="tsd-signature tsd-kind-icon">launch_<wbr>param_<wbr>tags<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Array</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L42">webgpu.ts:42</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L42">webgpu.ts:42</a></li>
 						</ul>
 					</aside>
 				</section>
@@ -115,7 +115,7 @@
 					<div class="tsd-signature tsd-kind-icon">name<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">string</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/webgpu.ts#L40">webgpu.ts:40</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/webgpu.ts#L40">webgpu.ts:40</a></li>
 						</ul>
 					</aside>
 				</section>
diff --git a/docs/reference/api/typedoc/interfaces/libraryprovider.html b/docs/reference/api/typedoc/interfaces/libraryprovider.html
index a3b92c897..676fae29b 100644
--- a/docs/reference/api/typedoc/interfaces/libraryprovider.html
+++ b/docs/reference/api/typedoc/interfaces/libraryprovider.html
@@ -112,7 +112,7 @@
 					<div class="tsd-signature tsd-kind-icon">imports<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-type">Record</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">, </span><span class="tsd-signature-type">any</span><span class="tsd-signature-symbol">&gt;</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/types.ts#L34">types.ts:34</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/types.ts#L34">types.ts:34</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
@@ -127,7 +127,7 @@
 					<div class="tsd-signature tsd-kind-icon">start<span class="tsd-signature-symbol">:</span> <span class="tsd-signature-symbol">(</span>inst<span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">Instance</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol"> =&gt; </span><span class="tsd-signature-type">void</span></div>
 					<aside class="tsd-sources">
 						<ul>
-							<li>Defined in <a href="https://github.com/apache/tvm/blob/9d6039b87/web/src/types.ts#L39">types.ts:39</a></li>
+							<li>Defined in <a href="https://github.com/apache/tvm/blob/bdcfa01ea/web/src/types.ts#L39">types.ts:39</a></li>
 						</ul>
 					</aside>
 					<div class="tsd-comment tsd-typography">
diff --git a/docs/reference/langref/hybrid_script.html b/docs/reference/langref/hybrid_script.html
index 5f600c5d8..1fadd28c5 100644
--- a/docs/reference/langref/hybrid_script.html
+++ b/docs/reference/langref/hybrid_script.html
@@ -414,7 +414,7 @@ fusing and reorderding imperfect loops.</p>
 <p>In HalideIR, loops have in total 4 types: <code class="docutils literal notranslate"><span class="pre">serial</span></code>, <code class="docutils literal notranslate"><span class="pre">unrolled</span></code>, <code class="docutils literal notranslate"><span class="pre">parallel</span></code>, and <code class="docutils literal notranslate"><span class="pre">vectorized</span></code>.</p>
 <p>Here we use <code class="docutils literal notranslate"><span class="pre">range</span></code> aka <code class="docutils literal notranslate"><span class="pre">serial</span></code>, <code class="docutils literal notranslate"><span class="pre">unroll</span></code>, <code class="docutils literal notranslate"><span class="pre">parallel</span></code>, and <code class="docutils literal notranslate"><span class="pre">vectorize</span></code>,
 these <strong>4</strong> keywords to annotate the corresponding types of for loops.
-The the usage is roughly the same as Python standard <code class="docutils literal notranslate"><span class="pre">range</span></code>.</p>
+The usage is roughly the same as Python standard <code class="docutils literal notranslate"><span class="pre">range</span></code>.</p>
 <p>Besides all the loop types supported in Halide, <code class="docutils literal notranslate"><span class="pre">const_range</span></code> is supported for some specific conditions.
 Sometimes, <code class="docutils literal notranslate"><span class="pre">tvm.container.Array</span></code> is desired to pass as an argument, but in TVM-HalideIR, there is no
 such support that converts <code class="docutils literal notranslate"><span class="pre">tvm.container.Array</span></code> to an <code class="docutils literal notranslate"><span class="pre">Expr</span></code>. Thus, a limited feature is supported.
diff --git a/docs/reference/langref/relay_op.html b/docs/reference/langref/relay_op.html
index 2cfb774a3..b9b60c940 100644
--- a/docs/reference/langref/relay_op.html
+++ b/docs/reference/langref/relay_op.html
@@ -769,7 +769,7 @@ these operators in the python frontend.</p>
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><code class="xref py py-obj docutils literal notranslate"><span class="pre">tvm.relay.argsort</span></code></p></td>
-<td><p>Performs sorting along the given axis and returns an array of indicies having same shape as an input array that index data in sorted order.</p></td>
+<td><p>Performs sorting along the given axis and returns an array of indices having same shape as an input array that index data in sorted order.</p></td>
 </tr>
 <tr class="row-even"><td><p><code class="xref py py-obj docutils literal notranslate"><span class="pre">tvm.relay.topk</span></code></p></td>
 <td><p>Get the top k elements in an input tensor along the given axis.</p></td>
diff --git a/docs/searchindex.js b/docs/searchindex.js
index a60be579e..810c716ca 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({docnames:["arch/benchmark","arch/convert_layout","arch/debugger","arch/device_target_interactions","arch/frontend/tensorflow","arch/hybrid_script","arch/index","arch/inferbound","arch/introduction_to_module_serialization","arch/microtvm_design","arch/microtvm_project_api","arch/model_library_format","arch/pass_infra","arch/relay_intro","arch/relay_op_strategy","arch/runtime","arch/runtimes/vulkan","arch/security","arch/virtual_machine","contribute/ci","contribute/code_gu [...]
\ No newline at end of file
+Search.setIndex({docnames:["arch/benchmark","arch/convert_layout","arch/debugger","arch/device_target_interactions","arch/frontend/tensorflow","arch/hybrid_script","arch/index","arch/inferbound","arch/introduction_to_module_serialization","arch/microtvm_design","arch/microtvm_project_api","arch/model_library_format","arch/pass_infra","arch/relay_intro","arch/relay_op_strategy","arch/runtime","arch/runtimes/vulkan","arch/security","arch/virtual_machine","contribute/ci","contribute/code_gu [...]
\ No newline at end of file
diff --git a/docs/topic/vta/tutorials/autotvm/sg_execution_times.html b/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
index becfbd92e..473a1ce57 100644
--- a/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/autotvm/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-autotvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:22.073</strong> total execution time for <strong>topic_vta_tutorials_autotvm</strong> files:</p>
+<p><strong>00:21.515</strong> total execution time for <strong>topic_vta_tutorials_autotvm</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 82%" />
@@ -336,7 +336,7 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="tune_relay_vta.html#sphx-glr-topic-vta-tutorials-autotvm-tune-relay-vta-py"><span class="std std-ref">Auto-tuning a convolutional network on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_vta.py</span></code>)</p></td>
-<td><p>00:22.066</p></td>
+<td><p>00:21.508</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tune_alu_vta.html#sphx-glr-topic-vta-tutorials-autotvm-tune-alu-vta-py"><span class="std std-ref">Auto-tuning a ALU fused op on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_alu_vta.py</span></code>)</p></td>
diff --git a/docs/topic/vta/tutorials/frontend/deploy_classification.html b/docs/topic/vta/tutorials/frontend/deploy_classification.html
index 7887a2887..1e3b9ff76 100644
--- a/docs/topic/vta/tutorials/frontend/deploy_classification.html
+++ b/docs/topic/vta/tutorials/frontend/deploy_classification.html
@@ -571,7 +571,7 @@ and dense layer which will both be executed in fp32 on the CPU.</p></li>
   DeprecationWarning,
 /workspace/vta/tutorials/frontend/deploy_classification.py:213: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
   relay_prog, target=tvm.target.Target(target, host=env.target_host), params=params
-resnet18_v1 inference graph built in 24.29s!
+resnet18_v1 inference graph built in 23.34s!
 </pre></div>
 </div>
 </div>
diff --git a/docs/topic/vta/tutorials/frontend/deploy_detection.html b/docs/topic/vta/tutorials/frontend/deploy_detection.html
index 7c3c0c26f..4f1b19eea 100644
--- a/docs/topic/vta/tutorials/frontend/deploy_detection.html
+++ b/docs/topic/vta/tutorials/frontend/deploy_detection.html
@@ -463,7 +463,7 @@ Model Name
 <span class="c1"># The ``start_pack`` and ``stop_pack`` labels indicate where</span>
 <span class="c1"># to start and end the graph packing relay pass: in other words</span>
 <span class="c1"># where to start and finish offloading to VTA.</span>
-<span class="c1"># the number 4 indicate the the ``start_pack`` index is 4, the</span>
+<span class="c1"># the number 4 indicate the ``start_pack`` index is 4, the</span>
 <span class="c1"># number 186 indicate the ``stop_pack index`` is 186, by using</span>
 <span class="c1"># name and index number, here we can located to correct place</span>
 <span class="c1"># where to start/end when there are multiple ``nn.max_pool2d``</span>
@@ -589,7 +589,7 @@ and dense layer which will both be executed in fp32 on the CPU.</p></li>
   &quot;target_host parameter is going to be deprecated. &quot;
 /workspace/python/tvm/relay/build_module.py:411: DeprecationWarning: Please use input parameter mod (tvm.IRModule) instead of deprecated parameter mod (tvm.relay.function.Function)
   DeprecationWarning,
-yolov3-tiny inference graph built in 16.72s!
+yolov3-tiny inference graph built in 16.18s!
 </pre></div>
 </div>
 </div>
diff --git a/docs/topic/vta/tutorials/frontend/sg_execution_times.html b/docs/topic/vta/tutorials/frontend/sg_execution_times.html
index f9521d447..8329eca3e 100644
--- a/docs/topic/vta/tutorials/frontend/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/frontend/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-frontend-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>01:32.308</strong> total execution time for <strong>topic_vta_tutorials_frontend</strong> files:</p>
+<p><strong>01:32.674</strong> total execution time for <strong>topic_vta_tutorials_frontend</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="deploy_detection.html#sphx-glr-topic-vta-tutorials-frontend-deploy-detection-py"><span class="std std-ref">Deploy Pretrained Vision Detection Model from Darknet on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_detection.py</span></code>)</p></td>
-<td><p>00:48.157</p></td>
+<td><p>00:49.203</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="deploy_classification.html#sphx-glr-topic-vta-tutorials-frontend-deploy-classification-py"><span class="std std-ref">Deploy Pretrained Vision Model from MxNet on VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_classification.py</span></code>)</p></td>
-<td><p>00:44.151</p></td>
+<td><p>00:43.470</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/topic/vta/tutorials/optimize/sg_execution_times.html b/docs/topic/vta/tutorials/optimize/sg_execution_times.html
index bc4b303ca..657a7fa4b 100644
--- a/docs/topic/vta/tutorials/optimize/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/optimize/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-optimize-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:03.346</strong> total execution time for <strong>topic_vta_tutorials_optimize</strong> files:</p>
+<p><strong>00:03.307</strong> total execution time for <strong>topic_vta_tutorials_optimize</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 84%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="convolution_opt.html#sphx-glr-topic-vta-tutorials-optimize-convolution-opt-py"><span class="std std-ref">2D Convolution Optimization</span></a> (<code class="docutils literal notranslate"><span class="pre">convolution_opt.py</span></code>)</p></td>
-<td><p>00:02.938</p></td>
+<td><p>00:02.900</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="matrix_multiply_opt.html#sphx-glr-topic-vta-tutorials-optimize-matrix-multiply-opt-py"><span class="std std-ref">Matrix Multiply Blocking</span></a> (<code class="docutils literal notranslate"><span class="pre">matrix_multiply_opt.py</span></code>)</p></td>
-<td><p>00:00.408</p></td>
+<td><p>00:00.407</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/topic/vta/tutorials/sg_execution_times.html b/docs/topic/vta/tutorials/sg_execution_times.html
index 82e8feeee..64572e319 100644
--- a/docs/topic/vta/tutorials/sg_execution_times.html
+++ b/docs/topic/vta/tutorials/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-topic-vta-tutorials-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:00.768</strong> total execution time for <strong>topic_vta_tutorials</strong> files:</p>
+<p><strong>00:00.745</strong> total execution time for <strong>topic_vta_tutorials</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 81%" />
@@ -336,11 +336,11 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="matrix_multiply.html#sphx-glr-topic-vta-tutorials-matrix-multiply-py"><span class="std std-ref">Simple Matrix Multiply</span></a> (<code class="docutils literal notranslate"><span class="pre">matrix_multiply.py</span></code>)</p></td>
-<td><p>00:00.404</p></td>
+<td><p>00:00.398</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="vta_get_started.html#sphx-glr-topic-vta-tutorials-vta-get-started-py"><span class="std std-ref">Get Started with VTA</span></a> (<code class="docutils literal notranslate"><span class="pre">vta_get_started.py</span></code>)</p></td>
-<td><p>00:00.364</p></td>
+<td><p>00:00.347</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 </tbody>
diff --git a/docs/tutorial/auto_scheduler_matmul_x86.html b/docs/tutorial/auto_scheduler_matmul_x86.html
index 23b05ece0..a04e8699e 100644
--- a/docs/tutorial/auto_scheduler_matmul_x86.html
+++ b/docs/tutorial/auto_scheduler_matmul_x86.html
@@ -567,7 +567,7 @@ operator fusion.</p>
 <span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 93.504 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 93.750 ms
 </pre></div>
 </div>
 </div>
diff --git a/docs/tutorial/autotvm_matmul_x86.html b/docs/tutorial/autotvm_matmul_x86.html
index a059bb009..b4fc78754 100644
--- a/docs/tutorial/autotvm_matmul_x86.html
+++ b/docs/tutorial/autotvm_matmul_x86.html
@@ -669,16 +669,16 @@ reduce variance, we take 5 measurements and average them.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>waiting for device...
 device available
 Get devices for measurement successfully!
-No: 1   GFLOPS: 9.56/9.56       result: MeasureResult(costs=(0.028093525799999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5862655639648438, timestamp=1660954892.565883)        [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 256])],None,80
-No: 2   GFLOPS: 2.63/9.56       result: MeasureResult(costs=(0.10218028900000001,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.785834550857544, timestamp=1660954894.3652294) [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 8])],None,32
-No: 3   GFLOPS: 11.80/11.80     result: MeasureResult(costs=(0.0227458472,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5830347537994385, timestamp=1660954895.4353027)       [(&#39;tile_y&#39;, [-1, 64]), (&#39;tile_x&#39;, [-1, 32])],None,56
-No: 4   GFLOPS: 1.63/11.80      result: MeasureResult(costs=(0.1642617204,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.7438504695892334, timestamp=1660954898.7687416)       [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 4])],None,20
-No: 5   GFLOPS: 3.61/11.80      result: MeasureResult(costs=(0.07439890299999999,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.3300185203552246, timestamp=1660954900.2274308)        [(&#39;tile_y&#39;, [-1, 256]), (&#39;tile_x&#39;, [-1, 16])],None,48
-No: 6   GFLOPS: 1.88/11.80      result: MeasureResult(costs=(0.1430079806,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.416994333267212, timestamp=1660954903.2184362)        [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 4])],None,29
-No: 7   GFLOPS: 0.87/11.80      result: MeasureResult(costs=(0.30716909300000006,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.0313568115234375, timestamp=1660954908.295409) [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 2])],None,19
-No: 8   GFLOPS: 9.91/11.80      result: MeasureResult(costs=(0.027098215199999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5780808925628662, timestamp=1660954908.892745)        [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 64])],None,62
-No: 9   GFLOPS: 1.63/11.80      result: MeasureResult(costs=(0.16499144100000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.7602829933166504, timestamp=1660954911.7741477)        [(&#39;tile_y&#39;, [-1, 2]), (&#39;tile_x&#39;, [-1, 2])],None,11
-No: 10  GFLOPS: 2.46/11.80      result: MeasureResult(costs=(0.10900768639999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.874542474746704, timestamp=1660954913.6869636) [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 4])],None,22
+No: 1   GFLOPS: 9.89/9.89       result: MeasureResult(costs=(0.027148858000000005,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5685811042785645, timestamp=1660954524.2582011)       [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 256])],None,80
+No: 2   GFLOPS: 2.76/9.89       result: MeasureResult(costs=(0.0972901468,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.7027850151062012, timestamp=1660954525.977176)        [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 8])],None,32
+No: 3   GFLOPS: 11.85/11.85     result: MeasureResult(costs=(0.022652452400000002,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5598218441009521, timestamp=1660954527.0345967)       [(&#39;tile_y&#39;, [-1, 64]), (&#39;tile_x&#39;, [-1, 32])],None,56
+No: 4   GFLOPS: 1.86/11.85      result: MeasureResult(costs=(0.144490027,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.433370351791382, timestamp=1660954530.0399187) [(&#39;tile_y&#39;, [-1, 1]), (&#39;tile_x&#39;, [-1, 4])],None,20
+No: 5   GFLOPS: 3.64/11.85      result: MeasureResult(costs=(0.0736710072,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.319761037826538, timestamp=1660954531.487127) [(&#39;tile_y&#39;, [-1, 256]), (&#39;tile_x&#39;, [-1, 16])],None,48
+No: 6   GFLOPS: 1.75/11.85      result: MeasureResult(costs=(0.15376552059999998,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.5849740505218506, timestamp=1660954534.6411028)        [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 4])],None,29
+No: 7   GFLOPS: 0.87/11.85      result: MeasureResult(costs=(0.3082451514,), error_no=MeasureErrorNo.NO_ERROR, all_cost=5.0465192794799805, timestamp=1660954539.734188)        [(&#39;tile_y&#39;, [-1, 512]), (&#39;tile_x&#39;, [-1, 2])],None,19
+No: 8   GFLOPS: 10.52/11.85     result: MeasureResult(costs=(0.0255276866,), error_no=MeasureErrorNo.NO_ERROR, all_cost=0.5613396167755127, timestamp=1660954540.304363)        [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 64])],None,62
+No: 9   GFLOPS: 1.66/11.85      result: MeasureResult(costs=(0.1621344482,), error_no=MeasureErrorNo.NO_ERROR, all_cost=2.6930415630340576, timestamp=1660954543.1178157)       [(&#39;tile_y&#39;, [-1, 2]), (&#39;tile_x&#39;, [-1, 2])],None,11
+No: 10  GFLOPS: 2.71/11.85      result: MeasureResult(costs=(0.0989214036,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6874268054962158, timestamp=1660954544.8631723)       [(&#39;tile_y&#39;, [-1, 4]), (&#39;tile_x&#39;, [-1, 4])],None,22
 </pre></div>
 </div>
 <p>With tuning completed, we can choose the configuration from the log file that
diff --git a/docs/tutorial/autotvm_relay_x86.html b/docs/tutorial/autotvm_relay_x86.html
index 9681fafb7..20d777949 100644
--- a/docs/tutorial/autotvm_relay_x86.html
+++ b/docs/tutorial/autotvm_relay_x86.html
@@ -551,7 +551,7 @@ standard deviation.</p>
 <span class="nb">print</span><span class="p">(</span><a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">unoptimized</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>{&#39;mean&#39;: 497.01132035, &#39;median&#39;: 496.9583858500016, &#39;std&#39;: 0.719820825023517}
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>{&#39;mean&#39;: 496.51104268999006, &#39;median&#39;: 495.9818490999851, &#39;std&#39;: 1.709787608932748}
 </pre></div>
 </div>
 </div>
@@ -706,178 +706,178 @@ depending on the specifics of the model and the target platform.</p>
   &quot;target_host parameter is going to be deprecated. &quot;
 
 [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  1/25]  Current/Best:   17.50/  17.50 GFLOPS | Progress: (4/20) | 6.44 s
-[Task  1/25]  Current/Best:    6.16/  17.50 GFLOPS | Progress: (8/20) | 9.48 s
-[Task  1/25]  Current/Best:   11.53/  22.75 GFLOPS | Progress: (12/20) | 11.94 s
-[Task  1/25]  Current/Best:   16.38/  22.76 GFLOPS | Progress: (16/20) | 13.64 s
-[Task  1/25]  Current/Best:   11.59/  23.80 GFLOPS | Progress: (20/20) | 15.38 s Done.
+[Task  1/25]  Current/Best:   17.50/  17.50 GFLOPS | Progress: (4/20) | 6.33 s
+[Task  1/25]  Current/Best:    6.15/  17.50 GFLOPS | Progress: (8/20) | 9.37 s
+[Task  1/25]  Current/Best:   11.51/  22.82 GFLOPS | Progress: (12/20) | 11.83 s
+[Task  1/25]  Current/Best:   16.48/  22.82 GFLOPS | Progress: (16/20) | 13.52 s
+[Task  1/25]  Current/Best:   11.62/  23.82 GFLOPS | Progress: (20/20) | 15.25 s Done.
 
 [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  2/25]  Current/Best:   12.20/  13.28 GFLOPS | Progress: (4/20) | 3.75 s
-[Task  2/25]  Current/Best:   13.21/  18.32 GFLOPS | Progress: (8/20) | 5.05 s
-[Task  2/25]  Current/Best:   21.18/  21.18 GFLOPS | Progress: (12/20) | 6.37 s
-[Task  2/25]  Current/Best:   12.13/  21.18 GFLOPS | Progress: (16/20) | 7.65 s
-[Task  2/25]  Current/Best:   18.74/  21.18 GFLOPS | Progress: (20/20) | 9.28 s Done.
+[Task  2/25]  Current/Best:   12.21/  13.18 GFLOPS | Progress: (4/20) | 3.81 s
+[Task  2/25]  Current/Best:   13.94/  18.06 GFLOPS | Progress: (8/20) | 5.12 s
+[Task  2/25]  Current/Best:   20.99/  20.99 GFLOPS | Progress: (12/20) | 6.45 s
+[Task  2/25]  Current/Best:   12.36/  20.99 GFLOPS | Progress: (16/20) | 7.72 s
+[Task  2/25]  Current/Best:   18.89/  20.99 GFLOPS | Progress: (20/20) | 9.28 s Done.
 
 [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  3/25]  Current/Best:    1.63/  10.81 GFLOPS | Progress: (4/20) | 5.89 s
-[Task  3/25]  Current/Best:   15.28/  16.77 GFLOPS | Progress: (8/20) | 7.84 s
-[Task  3/25]  Current/Best:   14.94/  16.77 GFLOPS | Progress: (12/20) | 9.56 s
-[Task  3/25]  Current/Best:    7.23/  23.62 GFLOPS | Progress: (16/20) | 11.49 s
-[Task  3/25]  Current/Best:   12.62/  23.62 GFLOPS | Progress: (20/20) | 16.04 s Done.
+[Task  3/25]  Current/Best:    1.63/  10.79 GFLOPS | Progress: (4/20) | 5.89 s
+[Task  3/25]  Current/Best:   15.31/  16.87 GFLOPS | Progress: (8/20) | 7.83 s
+[Task  3/25]  Current/Best:   14.96/  16.87 GFLOPS | Progress: (12/20) | 9.55 s
+[Task  3/25]  Current/Best:    7.20/  23.69 GFLOPS | Progress: (16/20) | 11.47 s
+[Task  3/25]  Current/Best:   11.20/  23.69 GFLOPS | Progress: (20/20) | 16.02 s Done.
 
 [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  4/25]  Current/Best:    9.54/  20.42 GFLOPS | Progress: (4/20) | 2.42 s
-[Task  4/25]  Current/Best:    6.88/  20.42 GFLOPS | Progress: (8/20) | 6.77 s
-[Task  4/25]  Current/Best:   22.07/  22.07 GFLOPS | Progress: (12/20) | 11.37 s
-[Task  4/25]  Current/Best:   16.70/  22.07 GFLOPS | Progress: (16/20) | 13.63 s
-[Task  4/25]  Current/Best:   13.37/  22.07 GFLOPS | Progress: (20/20) | 15.65 s Done.
+[Task  4/25]  Current/Best:    9.55/  20.39 GFLOPS | Progress: (4/20) | 2.42 s
+[Task  4/25]  Current/Best:    6.86/  20.39 GFLOPS | Progress: (8/20) | 6.78 s
+[Task  4/25]  Current/Best:   22.22/  22.22 GFLOPS | Progress: (12/20) | 11.32 s
+[Task  4/25]  Current/Best:   17.44/  22.22 GFLOPS | Progress: (16/20) | 13.53 s
+[Task  4/25]  Current/Best:   13.17/  22.22 GFLOPS | Progress: (20/20) | 15.54 s Done.
 
 [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  5/25]  Current/Best:    9.68/  10.13 GFLOPS | Progress: (4/20) | 2.63 s
-[Task  5/25]  Current/Best:   11.81/  12.77 GFLOPS | Progress: (8/20) | 4.69 s
-[Task  5/25]  Current/Best:    9.76/  18.03 GFLOPS | Progress: (12/20) | 7.84 s
-[Task  5/25]  Current/Best:   11.57/  19.57 GFLOPS | Progress: (16/20) | 9.29 s
-[Task  5/25]  Current/Best:   11.92/  20.95 GFLOPS | Progress: (20/20) | 11.20 s Done.
+[Task  5/25]  Current/Best:    9.52/  10.25 GFLOPS | Progress: (4/20) | 2.61 s
+[Task  5/25]  Current/Best:   11.74/  12.69 GFLOPS | Progress: (8/20) | 4.68 s
+[Task  5/25]  Current/Best:   11.26/  18.10 GFLOPS | Progress: (12/20) | 7.81 s
+[Task  5/25]  Current/Best:   11.61/  22.45 GFLOPS | Progress: (16/20) | 9.29 s
+[Task  5/25]  Current/Best:   11.90/  22.45 GFLOPS | Progress: (20/20) | 11.17 s Done.
 
 [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  6/25]  Current/Best:   12.30/  20.58 GFLOPS | Progress: (4/20) | 4.03 s
-[Task  6/25]  Current/Best:   19.05/  20.58 GFLOPS | Progress: (8/20) | 5.81 s
-[Task  6/25]  Current/Best:   13.24/  20.58 GFLOPS | Progress: (12/20) | 7.74 s
-[Task  6/25]  Current/Best:   20.10/  20.58 GFLOPS | Progress: (16/20) | 10.02 s
-[Task  6/25]  Current/Best:    3.71/  20.58 GFLOPS | Progress: (20/20) | 12.55 s Done.
+[Task  6/25]  Current/Best:   12.17/  20.71 GFLOPS | Progress: (4/20) | 3.99 s
+[Task  6/25]  Current/Best:   18.96/  20.71 GFLOPS | Progress: (8/20) | 5.77 s
+[Task  6/25]  Current/Best:   13.31/  20.71 GFLOPS | Progress: (12/20) | 7.70 s
+[Task  6/25]  Current/Best:   19.97/  20.71 GFLOPS | Progress: (16/20) | 9.94 s
+[Task  6/25]  Current/Best:    3.72/  20.71 GFLOPS | Progress: (20/20) | 12.46 s Done.
 
 [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  7/25]  Current/Best:   11.17/  12.72 GFLOPS | Progress: (4/20) | 3.68 s
-[Task  7/25]  Current/Best:   20.24/  21.02 GFLOPS | Progress: (8/20) | 5.20 s
-[Task  7/25]  Current/Best:   15.89/  21.02 GFLOPS | Progress: (12/20) | 7.18 s
-[Task  7/25]  Current/Best:   12.22/  21.02 GFLOPS | Progress: (16/20) | 9.22 s
-[Task  7/25]  Current/Best:    6.46/  21.68 GFLOPS | Progress: (20/20) | 11.68 s Done.
+[Task  7/25]  Current/Best:   11.19/  12.94 GFLOPS | Progress: (4/20) | 3.67 s
+[Task  7/25]  Current/Best:   20.31/  21.15 GFLOPS | Progress: (8/20) | 5.20 s
+[Task  7/25]  Current/Best:   14.33/  21.15 GFLOPS | Progress: (12/20) | 7.16 s
+[Task  7/25]  Current/Best:   12.23/  21.15 GFLOPS | Progress: (16/20) | 9.19 s
+[Task  7/25]  Current/Best:    6.31/  21.63 GFLOPS | Progress: (20/20) | 11.66 s Done.
 
 [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  8/25]  Current/Best:   10.36/  14.62 GFLOPS | Progress: (4/20) | 2.91 s
-[Task  8/25]  Current/Best:    9.56/  14.62 GFLOPS | Progress: (8/20) | 7.73 s
-[Task  8/25]  Current/Best:   12.78/  14.62 GFLOPS | Progress: (12/20) | 13.98 s
-[Task  8/25]  Current/Best:   19.00/  19.00 GFLOPS | Progress: (16/20) | 16.07 s
-[Task  8/25]  Current/Best:   20.22/  20.22 GFLOPS | Progress: (20/20) | 22.57 s Done.
+[Task  8/25]  Current/Best:   10.14/  14.09 GFLOPS | Progress: (4/20) | 2.92 s
+[Task  8/25]  Current/Best:   10.09/  14.09 GFLOPS | Progress: (8/20) | 7.64 s
+[Task  8/25]  Current/Best:   12.60/  14.09 GFLOPS | Progress: (12/20) | 13.76 s
+[Task  8/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (16/20) | 15.86 s
+[Task  8/25]  Current/Best:   20.26/  20.26 GFLOPS | Progress: (20/20) | 22.38 s Done.
 
 [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task  9/25]  Current/Best:   14.19/  15.63 GFLOPS | Progress: (4/20) | 12.05 s
-[Task  9/25]  Current/Best:   23.25/  23.25 GFLOPS | Progress: (8/20) | 13.89 s
-[Task  9/25]  Current/Best:    8.24/  23.25 GFLOPS | Progress: (12/20) | 16.29 s
-[Task  9/25]  Current/Best:   17.92/  23.25 GFLOPS | Progress: (16/20) | 18.96 s
-[Task  9/25]  Current/Best:    9.09/  23.25 GFLOPS | Progress: (20/20) | 26.64 s
+[Task  9/25]  Current/Best:   14.31/  14.31 GFLOPS | Progress: (4/20) | 12.00 s
+[Task  9/25]  Current/Best:   23.35/  23.35 GFLOPS | Progress: (8/20) | 13.82 s
+[Task  9/25]  Current/Best:    8.23/  23.35 GFLOPS | Progress: (12/20) | 16.19 s
+[Task  9/25]  Current/Best:   17.84/  23.35 GFLOPS | Progress: (16/20) | 18.88 s
+[Task  9/25]  Current/Best:    9.21/  23.35 GFLOPS | Progress: (20/20) | 26.65 s
 [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 10/25]  Current/Best:   18.18/  18.18 GFLOPS | Progress: (4/20) | 2.64 s
-[Task 10/25]  Current/Best:   15.54/  18.18 GFLOPS | Progress: (8/20) | 4.22 s
-[Task 10/25]  Current/Best:   12.83/  18.98 GFLOPS | Progress: (12/20) | 5.75 s
-[Task 10/25]  Current/Best:   19.09/  20.48 GFLOPS | Progress: (16/20) | 6.88 s
-[Task 10/25]  Current/Best:    8.92/  20.48 GFLOPS | Progress: (20/20) | 8.44 s Done.
+[Task 10/25]  Current/Best:   18.24/  18.24 GFLOPS | Progress: (4/20) | 2.59 s
+[Task 10/25]  Current/Best:   15.42/  18.24 GFLOPS | Progress: (8/20) | 4.16 s
+[Task 10/25]  Current/Best:   12.49/  18.88 GFLOPS | Progress: (12/20) | 5.69 s
+[Task 10/25]  Current/Best:   19.14/  20.30 GFLOPS | Progress: (16/20) | 6.81 s
+[Task 10/25]  Current/Best:    8.81/  20.30 GFLOPS | Progress: (20/20) | 8.38 s Done.
 
 [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 11/25]  Current/Best:   12.32/  18.19 GFLOPS | Progress: (4/20) | 3.37 s
-[Task 11/25]  Current/Best:   16.86/  18.19 GFLOPS | Progress: (8/20) | 6.14 s
-[Task 11/25]  Current/Best:   18.20/  18.20 GFLOPS | Progress: (12/20) | 8.22 s
-[Task 11/25]  Current/Best:   13.49/  20.95 GFLOPS | Progress: (16/20) | 11.02 s
-[Task 11/25]  Current/Best:   19.44/  21.59 GFLOPS | Progress: (20/20) | 13.06 s Done.
+[Task 11/25]  Current/Best:   11.71/  18.26 GFLOPS | Progress: (4/20) | 3.34 s
+[Task 11/25]  Current/Best:   16.82/  18.26 GFLOPS | Progress: (8/20) | 6.11 s
+[Task 11/25]  Current/Best:   16.42/  18.26 GFLOPS | Progress: (12/20) | 8.21 s
+[Task 11/25]  Current/Best:   13.49/  20.96 GFLOPS | Progress: (16/20) | 10.92 s
+[Task 11/25]  Current/Best:   19.43/  21.55 GFLOPS | Progress: (20/20) | 12.97 s Done.
 
 [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 12/25]  Current/Best:    7.79/  18.19 GFLOPS | Progress: (4/20) | 5.46 s
-[Task 12/25]  Current/Best:    5.21/  18.19 GFLOPS | Progress: (8/20) | 9.15 s
-[Task 12/25]  Current/Best:   19.07/  19.07 GFLOPS | Progress: (12/20) | 11.14 s
-[Task 12/25]  Current/Best:   14.50/  19.07 GFLOPS | Progress: (16/20) | 13.96 s
-[Task 12/25]  Current/Best:   15.21/  19.07 GFLOPS | Progress: (20/20) | 15.94 s Done.
+[Task 12/25]  Current/Best:    7.72/  18.11 GFLOPS | Progress: (4/20) | 5.47 s
+[Task 12/25]  Current/Best:    5.20/  18.11 GFLOPS | Progress: (8/20) | 9.17 s
+[Task 12/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (12/20) | 11.16 s
+[Task 12/25]  Current/Best:   15.02/  18.80 GFLOPS | Progress: (16/20) | 13.95 s
+[Task 12/25]  Current/Best:   15.10/  18.86 GFLOPS | Progress: (20/20) | 15.87 s Done.
 
 [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 13/25]  Current/Best:    8.71/  16.96 GFLOPS | Progress: (4/20) | 3.75 s
-[Task 13/25]  Current/Best:   15.99/  20.76 GFLOPS | Progress: (8/20) | 6.21 s
-[Task 13/25]  Current/Best:   19.37/  21.08 GFLOPS | Progress: (12/20) | 9.17 s
-[Task 13/25]  Current/Best:   12.22/  21.08 GFLOPS | Progress: (16/20) | 12.61 s
-[Task 13/25]  Current/Best:   18.69/  21.08 GFLOPS | Progress: (20/20) | 14.86 s Done.
+[Task 13/25]  Current/Best:    8.75/  17.30 GFLOPS | Progress: (4/20) | 3.70 s
+[Task 13/25]  Current/Best:   16.07/  20.87 GFLOPS | Progress: (8/20) | 6.14 s
+[Task 13/25]  Current/Best:   19.54/  21.46 GFLOPS | Progress: (12/20) | 9.01 s
+[Task 13/25]  Current/Best:   12.23/  21.46 GFLOPS | Progress: (16/20) | 12.39 s
+[Task 13/25]  Current/Best:   18.66/  21.46 GFLOPS | Progress: (20/20) | 14.70 s Done.
 
 [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 14/25]  Current/Best:   13.60/  13.60 GFLOPS | Progress: (4/20) | 3.29 s
-[Task 14/25]  Current/Best:    6.10/  13.60 GFLOPS | Progress: (8/20) | 5.47 s
-[Task 14/25]  Current/Best:   20.81/  20.81 GFLOPS | Progress: (12/20) | 8.04 s
-[Task 14/25]  Current/Best:   17.10/  20.81 GFLOPS | Progress: (16/20) | 9.74 s Done.
+[Task 14/25]  Current/Best:   13.72/  13.72 GFLOPS | Progress: (4/20) | 3.27 s
+[Task 14/25]  Current/Best:    6.12/  13.72 GFLOPS | Progress: (8/20) | 5.45 s
+[Task 14/25]  Current/Best:   20.51/  20.51 GFLOPS | Progress: (12/20) | 7.98 s
+[Task 14/25]  Current/Best:   17.17/  20.51 GFLOPS | Progress: (16/20) | 9.64 s Done.
 
-[Task 14/25]  Current/Best:   17.54/  20.81 GFLOPS | Progress: (20/20) | 11.50 s
+[Task 14/25]  Current/Best:   17.53/  20.51 GFLOPS | Progress: (20/20) | 11.44 s
 [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 15/25]  Current/Best:   16.18/  17.55 GFLOPS | Progress: (4/20) | 2.77 s
-[Task 15/25]  Current/Best:   14.36/  18.12 GFLOPS | Progress: (8/20) | 4.12 s
-[Task 15/25]  Current/Best:   10.33/  22.22 GFLOPS | Progress: (12/20) | 6.25 s
-[Task 15/25]  Current/Best:   20.25/  22.22 GFLOPS | Progress: (16/20) | 9.22 s
-[Task 15/25]  Current/Best:    9.70/  22.22 GFLOPS | Progress: (20/20) | 10.19 s
+[Task 15/25]  Current/Best:   16.16/  17.42 GFLOPS | Progress: (4/20) | 2.76 s
+[Task 15/25]  Current/Best:   14.42/  17.88 GFLOPS | Progress: (8/20) | 4.10 s
+[Task 15/25]  Current/Best:   10.40/  22.36 GFLOPS | Progress: (12/20) | 6.17 s
+[Task 15/25]  Current/Best:   20.27/  22.36 GFLOPS | Progress: (16/20) | 9.17 s
+[Task 15/25]  Current/Best:    9.67/  22.36 GFLOPS | Progress: (20/20) | 10.15 s
 [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 16/25]  Current/Best:   20.25/  20.25 GFLOPS | Progress: (4/20) | 3.01 s
-[Task 16/25]  Current/Best:    3.04/  20.25 GFLOPS | Progress: (8/20) | 4.64 s
-[Task 16/25]  Current/Best:   19.44/  20.25 GFLOPS | Progress: (12/20) | 5.87 s
-[Task 16/25]  Current/Best:   17.89/  20.25 GFLOPS | Progress: (16/20) | 7.23 s
-[Task 16/25]  Current/Best:   10.04/  21.94 GFLOPS | Progress: (20/20) | 9.27 s Done.
+[Task 16/25]  Current/Best:   20.44/  20.44 GFLOPS | Progress: (4/20) | 2.99 s
+[Task 16/25]  Current/Best:    3.04/  20.44 GFLOPS | Progress: (8/20) | 4.61 s
+[Task 16/25]  Current/Best:   19.46/  20.44 GFLOPS | Progress: (12/20) | 5.83 s
+[Task 16/25]  Current/Best:   18.31/  20.44 GFLOPS | Progress: (16/20) | 7.18 s
+[Task 16/25]  Current/Best:   10.00/  21.59 GFLOPS | Progress: (20/20) | 9.23 s Done.
 
 [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 17/25]  Current/Best:   11.86/  18.10 GFLOPS | Progress: (4/20) | 4.80 s
-[Task 17/25]  Current/Best:   14.40/  22.93 GFLOPS | Progress: (8/20) | 7.67 s
-[Task 17/25]  Current/Best:   18.06/  22.93 GFLOPS | Progress: (12/20) | 9.72 s
-[Task 17/25]  Current/Best:   16.42/  22.93 GFLOPS | Progress: (16/20) | 11.86 s
-[Task 17/25]  Current/Best:   10.01/  22.93 GFLOPS | Progress: (20/20) | 14.01 s Done.
+[Task 17/25]  Current/Best:   13.37/  18.13 GFLOPS | Progress: (4/20) | 4.75 s
+[Task 17/25]  Current/Best:   14.43/  23.25 GFLOPS | Progress: (8/20) | 7.61 s
+[Task 17/25]  Current/Best:   18.46/  23.25 GFLOPS | Progress: (12/20) | 9.67 s
+[Task 17/25]  Current/Best:   16.40/  23.25 GFLOPS | Progress: (16/20) | 11.79 s
+[Task 17/25]  Current/Best:   10.05/  23.25 GFLOPS | Progress: (20/20) | 13.93 s Done.
 
 [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 18/25]  Current/Best:   11.40/  17.84 GFLOPS | Progress: (4/20) | 3.75 s
-[Task 18/25]  Current/Best:   10.63/  19.89 GFLOPS | Progress: (8/20) | 7.22 s
-[Task 18/25]  Current/Best:   19.28/  19.89 GFLOPS | Progress: (12/20) | 9.15 s
-[Task 18/25]  Current/Best:   10.00/  19.89 GFLOPS | Progress: (16/20) | 12.73 s
-[Task 18/25]  Current/Best:   20.53/  20.53 GFLOPS | Progress: (20/20) | 14.27 s Done.
+[Task 18/25]  Current/Best:   11.26/  18.00 GFLOPS | Progress: (4/20) | 3.73 s
+[Task 18/25]  Current/Best:   10.55/  20.01 GFLOPS | Progress: (8/20) | 7.15 s
+[Task 18/25]  Current/Best:   19.14/  20.01 GFLOPS | Progress: (12/20) | 9.09 s
+[Task 18/25]  Current/Best:    9.96/  20.01 GFLOPS | Progress: (16/20) | 12.67 s
+[Task 18/25]  Current/Best:   20.27/  20.27 GFLOPS | Progress: (20/20) | 14.21 s Done.
 
 [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 19/25]  Current/Best:    7.07/  20.06 GFLOPS | Progress: (4/20) | 6.09 s
-[Task 19/25]  Current/Best:    2.69/  20.06 GFLOPS | Progress: (8/20) | 9.32 s
-[Task 19/25]  Current/Best:   19.61/  20.82 GFLOPS | Progress: (12/20) | 12.09 s
-[Task 19/25]  Current/Best:   14.75/  21.28 GFLOPS | Progress: (16/20) | 14.91 s
-[Task 19/25]  Current/Best:    2.70/  22.62 GFLOPS | Progress: (20/20) | 17.70 s Done.
+[Task 19/25]  Current/Best:    7.03/  20.27 GFLOPS | Progress: (4/20) | 6.10 s
+[Task 19/25]  Current/Best:    2.69/  20.27 GFLOPS | Progress: (8/20) | 9.33 s
+[Task 19/25]  Current/Best:   19.84/  21.41 GFLOPS | Progress: (12/20) | 12.10 s
+[Task 19/25]  Current/Best:   15.48/  21.64 GFLOPS | Progress: (16/20) | 14.92 s
+[Task 19/25]  Current/Best:    2.70/  23.07 GFLOPS | Progress: (20/20) | 17.69 s Done.
 
 [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 20/25]  Current/Best:    9.01/  14.91 GFLOPS | Progress: (4/20) | 3.35 s Done.
+[Task 20/25]  Current/Best:    9.41/  15.46 GFLOPS | Progress: (4/20) | 3.32 s Done.
  Done.
 
-[Task 20/25]  Current/Best:   10.46/  14.91 GFLOPS | Progress: (8/20) | 6.74 s
-[Task 20/25]  Current/Best:    2.33/  16.54 GFLOPS | Progress: (12/20) | 10.70 s
-[Task 20/25]  Current/Best:   12.55/  16.54 GFLOPS | Progress: (16/20) | 14.29 s
-[Task 20/25]  Current/Best:   13.64/  21.51 GFLOPS | Progress: (20/20) | 16.38 s
+[Task 20/25]  Current/Best:   10.18/  15.46 GFLOPS | Progress: (8/20) | 6.61 s
+[Task 20/25]  Current/Best:    2.32/  16.65 GFLOPS | Progress: (12/20) | 10.67 s
+[Task 20/25]  Current/Best:   11.20/  16.65 GFLOPS | Progress: (16/20) | 14.46 s
+[Task 20/25]  Current/Best:   13.08/  22.14 GFLOPS | Progress: (20/20) | 16.54 s
 [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 21/25]  Current/Best:    6.38/  17.57 GFLOPS | Progress: (4/20) | 3.28 s
-[Task 21/25]  Current/Best:   14.48/  17.57 GFLOPS | Progress: (8/20) | 4.89 s
-[Task 21/25]  Current/Best:    1.61/  17.57 GFLOPS | Progress: (12/20) | 7.05 s
-[Task 21/25]  Current/Best:   17.95/  17.95 GFLOPS | Progress: (16/20) | 10.53 s
-[Task 21/25]  Current/Best:    4.45/  17.95 GFLOPS | Progress: (20/20) | 17.85 s
+[Task 21/25]  Current/Best:    6.40/  17.66 GFLOPS | Progress: (4/20) | 3.26 s
+[Task 21/25]  Current/Best:   14.63/  17.66 GFLOPS | Progress: (8/20) | 4.80 s
+[Task 21/25]  Current/Best:    1.61/  17.66 GFLOPS | Progress: (12/20) | 6.96 s
+[Task 21/25]  Current/Best:   18.12/  18.12 GFLOPS | Progress: (16/20) | 10.41 s
+[Task 21/25]  Current/Best:    4.47/  18.12 GFLOPS | Progress: (20/20) | 17.57 s
 [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 22/25]  Current/Best:    2.70/  17.00 GFLOPS | Progress: (4/20) | 2.74 s
-[Task 22/25]  Current/Best:    8.70/  21.91 GFLOPS | Progress: (8/20) | 4.71 s
-[Task 22/25]  Current/Best:   19.72/  21.91 GFLOPS | Progress: (12/20) | 7.05 s
-[Task 22/25]  Current/Best:   15.01/  21.91 GFLOPS | Progress: (16/20) | 9.11 s
-[Task 22/25]  Current/Best:   15.24/  21.91 GFLOPS | Progress: (20/20) | 10.84 s Done.
+[Task 22/25]  Current/Best:    2.70/  17.05 GFLOPS | Progress: (4/20) | 2.70 s
+[Task 22/25]  Current/Best:    8.73/  21.88 GFLOPS | Progress: (8/20) | 4.68 s
+[Task 22/25]  Current/Best:   20.00/  21.88 GFLOPS | Progress: (12/20) | 7.00 s
+[Task 22/25]  Current/Best:   14.92/  21.88 GFLOPS | Progress: (16/20) | 9.08 s
+[Task 22/25]  Current/Best:   14.24/  21.88 GFLOPS | Progress: (20/20) | 10.75 s Done.
 
 [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 23/25]  Current/Best:   17.31/  20.25 GFLOPS | Progress: (4/20) | 3.32 s
-[Task 23/25]  Current/Best:   16.15/  20.25 GFLOPS | Progress: (8/20) | 6.68 s
-[Task 23/25]  Current/Best:   20.75/  21.19 GFLOPS | Progress: (12/20) | 8.50 s
-[Task 23/25]  Current/Best:    6.23/  21.19 GFLOPS | Progress: (16/20) | 15.70 s
-[Task 23/25]  Current/Best:    7.64/  21.19 GFLOPS | Progress: (20/20) | 19.97 s Done.
+[Task 23/25]  Current/Best:   17.56/  20.54 GFLOPS | Progress: (4/20) | 3.29 s
+[Task 23/25]  Current/Best:   15.49/  20.54 GFLOPS | Progress: (8/20) | 6.54 s
+[Task 23/25]  Current/Best:   20.94/  21.51 GFLOPS | Progress: (12/20) | 8.35 s
+[Task 23/25]  Current/Best:    6.30/  21.51 GFLOPS | Progress: (16/20) | 15.24 s
+[Task 23/25]  Current/Best:    7.67/  21.51 GFLOPS | Progress: (20/20) | 19.46 s Done.
 
 [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 24/25]  Current/Best:    8.42/   8.42 GFLOPS | Progress: (4/20) | 11.84 s
-[Task 24/25]  Current/Best:    1.95/   8.42 GFLOPS | Progress: (8/20) | 22.88 s
-[Task 24/25]  Current/Best:    3.85/   8.42 GFLOPS | Progress: (12/20) | 34.46 s Done.
+[Task 24/25]  Current/Best:    8.10/   8.10 GFLOPS | Progress: (4/20) | 11.85 s
+[Task 24/25]  Current/Best:    3.61/   8.10 GFLOPS | Progress: (8/20) | 23.11 s
+[Task 24/25]  Current/Best:    4.68/   8.10 GFLOPS | Progress: (12/20) | 33.84 s Done.
 
-[Task 24/25]  Current/Best:    7.13/   8.78 GFLOPS | Progress: (16/20) | 39.92 s
-[Task 24/25]  Current/Best:    3.31/   8.86 GFLOPS | Progress: (20/20) | 45.85 s Done.
+[Task 24/25]  Current/Best:    7.16/   8.95 GFLOPS | Progress: (16/20) | 39.22 s
+[Task 24/25]  Current/Best:    3.28/   8.99 GFLOPS | Progress: (20/20) | 45.12 s Done.
 
 [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/20) | 0.00 s
-[Task 25/25]  Current/Best:    1.55/   2.88 GFLOPS | Progress: (4/20) | 11.63 s
-[Task 25/25]  Current/Best:    5.50/   7.86 GFLOPS | Progress: (8/20) | 22.94 s
-[Task 25/25]  Current/Best:    5.60/   7.86 GFLOPS | Progress: (12/20) | 34.44 s
-[Task 25/25]  Current/Best:    5.57/   9.25 GFLOPS | Progress: (16/20) | 36.25 s
-[Task 25/25]  Current/Best:    2.88/   9.25 GFLOPS | Progress: (20/20) | 46.92 s
+[Task 25/25]  Current/Best:    1.55/   2.89 GFLOPS | Progress: (4/20) | 11.64 s
+[Task 25/25]  Current/Best:    5.70/   7.89 GFLOPS | Progress: (8/20) | 22.96 s
+[Task 25/25]  Current/Best:    6.02/   7.89 GFLOPS | Progress: (12/20) | 34.45 s
+[Task 25/25]  Current/Best:    5.78/   8.93 GFLOPS | Progress: (16/20) | 36.33 s
+[Task 25/25]  Current/Best:    2.90/   9.05 GFLOPS | Progress: (20/20) | 46.99 s
 </pre></div>
 </div>
 <p>The output from this tuning process will look something like this:</p>
@@ -981,8 +981,8 @@ improvement in comparing the optimized model to the unoptimized model.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;unoptimized: </span><span class="si">%s</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="p">(</span><a href="https://docs.python.org/3/library/stdtypes.html#dict" title="builtins.dict" class="sphx-glr-backref-module-builtins sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">unoptimized</span></a><span class="p">))</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>optimized: {&#39;mean&#39;: 413.67374406999943, &#39;median&#39;: 413.19100769999295, &#39;std&#39;: 1.3671008475901267}
-unoptimized: {&#39;mean&#39;: 497.01132035, &#39;median&#39;: 496.9583858500016, &#39;std&#39;: 0.719820825023517}
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>optimized: {&#39;mean&#39;: 414.7075195700063, &#39;median&#39;: 414.6528698500106, &#39;std&#39;: 1.308842791249852}
+unoptimized: {&#39;mean&#39;: 496.51104268999006, &#39;median&#39;: 495.9818490999851, &#39;std&#39;: 1.709787608932748}
 </pre></div>
 </div>
 </div>
@@ -996,7 +996,7 @@ models.</p>
 <p>Here we presented a simple example using ResNet-50 v2 locally. However, TVM
 supports many more features including cross-compilation, remote execution and
 profiling/benchmarking.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 10 minutes  21.737 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 10 minutes  16.869 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-autotvm-relay-x86-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/57a45d9bef1af358191e7d50043e652c/autotvm_relay_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">autotvm_relay_x86.py</span></code></a></p>
diff --git a/docs/tutorial/cross_compilation_and_rpc.html b/docs/tutorial/cross_compilation_and_rpc.html
index d892c7257..6af5cec1f 100644
--- a/docs/tutorial/cross_compilation_and_rpc.html
+++ b/docs/tutorial/cross_compilation_and_rpc.html
@@ -527,7 +527,7 @@ device and returns the measured cost. Network overhead is excluded.</p>
 <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;</span><span class="si">%g</span><span class="s2"> secs/op&quot;</span> <span class="o">%</span> <span class="n">cost</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>1.246e-07 secs/op
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>1.279e-07 secs/op
 </pre></div>
 </div>
 </div>
diff --git a/docs/tutorial/intro_topi.html b/docs/tutorial/intro_topi.html
index 21d9fd521..f7b8f83e7 100644
--- a/docs/tutorial/intro_topi.html
+++ b/docs/tutorial/intro_topi.html
@@ -484,7 +484,7 @@ we can schedule the following series of operations ending with <code class="code
 <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><a href="../reference/api/python/ir.html#tvm.ir.Array" title="tvm.ir.Array" class="sphx-glr-backref-module-tvm-ir sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">sg</span><span class="o">.</span><span class="n">stages</span></a><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>[stage(a, placeholder(a, 0x1c281cb0)), stage(b, placeholder(b, 0x1c282b20)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[ [...]
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>[stage(a, placeholder(a, 0x124ca830)), stage(b, placeholder(b, 0x12482b70)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[ [...]
 </pre></div>
 </div>
 <p>We can test the correctness by comparing with <code class="code docutils literal notranslate"><span class="pre">numpy</span></code> result as follows</p>
diff --git a/docs/tutorial/sg_execution_times.html b/docs/tutorial/sg_execution_times.html
index 2848cab67..8157d08a9 100644
--- a/docs/tutorial/sg_execution_times.html
+++ b/docs/tutorial/sg_execution_times.html
@@ -327,7 +327,7 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-tutorial-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>13:09.031</strong> total execution time for <strong>tutorial</strong> files:</p>
+<p><strong>13:05.923</strong> total execution time for <strong>tutorial</strong> files:</p>
 <table class="docutils align-default">
 <colgroup>
 <col style="width: 83%" />
@@ -336,35 +336,35 @@
 </colgroup>
 <tbody>
 <tr class="row-odd"><td><p><a class="reference internal" href="autotvm_relay_x86.html#sphx-glr-tutorial-autotvm-relay-x86-py"><span class="std std-ref">Compiling and Optimizing a Model with the Python Interface (AutoTVM)</span></a> (<code class="docutils literal notranslate"><span class="pre">autotvm_relay_x86.py</span></code>)</p></td>
-<td><p>10:21.737</p></td>
+<td><p>10:16.869</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="tensor_expr_get_started.html#sphx-glr-tutorial-tensor-expr-get-started-py"><span class="std std-ref">Working with Operators Using Tensor Expression</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_expr_get_started.py</span></code>)</p></td>
-<td><p>01:00.306</p></td>
+<td><p>00:59.413</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="auto_scheduler_matmul_x86.html#sphx-glr-tutorial-auto-scheduler-matmul-x86-py"><span class="std std-ref">Optimizing Operators with Auto-scheduling</span></a> (<code class="docutils literal notranslate"><span class="pre">auto_scheduler_matmul_x86.py</span></code>)</p></td>
-<td><p>00:48.890</p></td>
+<td><p>00:53.320</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="relay_quick_start.html#sphx-glr-tutorial-relay-quick-start-py"><span class="std std-ref">Quick Start Tutorial for Compiling Deep Learning Models</span></a> (<code class="docutils literal notranslate"><span class="pre">relay_quick_start.py</span></code>)</p></td>
-<td><p>00:31.358</p></td>
+<td><p>00:30.823</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="autotvm_matmul_x86.html#sphx-glr-tutorial-autotvm-matmul-x86-py"><span class="std std-ref">Optimizing Operators with Schedule Templates and AutoTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">autotvm_matmul_x86.py</span></code>)</p></td>
-<td><p>00:24.696</p></td>
+<td><p>00:24.118</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="tensor_ir_blitz_course.html#sphx-glr-tutorial-tensor-ir-blitz-course-py"><span class="std std-ref">Blitz Course to TensorIR</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_ir_blitz_course.py</span></code>)</p></td>
-<td><p>00:01.172</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="intro_topi.html#sphx-glr-tutorial-intro-topi-py"><span class="std std-ref">Introduction to TOPI</span></a> (<code class="docutils literal notranslate"><span class="pre">intro_topi.py</span></code>)</p></td>
+<td><p>00:00.708</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="intro_topi.html#sphx-glr-tutorial-intro-topi-py"><span class="std std-ref">Introduction to TOPI</span></a> (<code class="docutils literal notranslate"><span class="pre">intro_topi.py</span></code>)</p></td>
-<td><p>00:00.706</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="tensor_ir_blitz_course.html#sphx-glr-tutorial-tensor-ir-blitz-course-py"><span class="std std-ref">Blitz Course to TensorIR</span></a> (<code class="docutils literal notranslate"><span class="pre">tensor_ir_blitz_course.py</span></code>)</p></td>
+<td><p>00:00.513</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" href="cross_compilation_and_rpc.html#sphx-glr-tutorial-cross-compilation-and-rpc-py"><span class="std std-ref">Cross Compilation and RPC</span></a> (<code class="docutils literal notranslate"><span class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.156</p></td>
+<td><p>00:00.151</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" href="introduction.html#sphx-glr-tutorial-introduction-py"><span class="std std-ref">Introduction</span></a> (<code class="docutils literal notranslate"><span class="pre">introduction.py</span></code>)</p></td>
@@ -375,15 +375,15 @@
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="tvmc_python.html#sphx-glr-tutorial-tvmc-python-py"><span class="std std-ref">Getting Starting using TVMC Python: a high-level API for TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_python.py</span></code>)</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="install.html#sphx-glr-tutorial-install-py"><span class="std std-ref">Installing TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">install.py</span></code>)</p></td>
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" href="install.html#sphx-glr-tutorial-install-py"><span class="std std-ref">Installing TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">install.py</span></code>)</p></td>
+<tr class="row-even"><td><p><a class="reference internal" href="tvmc_command_line_driver.html#sphx-glr-tutorial-tvmc-command-line-driver-py"><span class="std std-ref">Compiling and Optimizing a Model with TVMC</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_command_line_driver.py</span></code>)</p></td>
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" href="tvmc_command_line_driver.html#sphx-glr-tutorial-tvmc-command-line-driver-py"><span class="std std-ref">Compiling and Optimizing a Model with TVMC</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_command_line_driver.py</span></code>)</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" href="tvmc_python.html#sphx-glr-tutorial-tvmc-python-py"><span class="std std-ref">Getting Starting using TVMC Python: a high-level API for TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">tvmc_python.py</span></code>)</p></td>
 <td><p>00:00.001</p></td>
 <td><p>0.0 MB</p></td>
 </tr>
diff --git a/docs/tutorial/tensor_expr_get_started.html b/docs/tutorial/tensor_expr_get_started.html
index c7b42e886..43c2b872c 100644
--- a/docs/tutorial/tensor_expr_get_started.html
+++ b/docs/tutorial/tensor_expr_get_started.html
@@ -542,8 +542,8 @@ helper function to run a profile of the TVM generated code.</p>
 <span class="n">evaluate_addition</span><span class="p">(</span><span class="n">fadd</span><span class="p">,</span> <a href="../reference/api/python/target.html#tvm.target.Target" title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span class="n">tgt</span></a><span class="p">,</span> <span class="s2">&quot;naive&quot;</span><span class="p">,</span> <a href="https://docs.python.org/3/library/stdtypes.html#list" ti [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.000008
-naive: 0.000007
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.000012
+naive: 0.000010
 </pre></div>
 </div>
 </div>
@@ -594,7 +594,7 @@ compile and run this new schedule with the parallel operation applied:</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-parallel: 0.000006
+parallel: 0.000011
 </pre></div>
 </div>
 </div>
@@ -668,10 +668,10 @@ vector: 0.000025
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Operator                  Timing             Performance
-   numpy    8.439260000159265e-06                    1.0
-   naive    6.598399999999999e-06     0.7818695003916782
-parallel              5.9822e-06      0.7088536198537673
-  vector    2.4694700000000003e-05     2.926168882050555
+   numpy    1.2196850002510473e-05                   1.0
+   naive             1.02922e-05      0.8438408275810196
+parallel             1.10824e-05      0.9086280472186601
+  vector    2.4550700000000002e-05    2.0128721755983503
 </pre></div>
 </div>
 <div class="admonition-code-specialization admonition">
@@ -987,7 +987,7 @@ matrix multiplication.</p>
 <span class="n">answer</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">numpy</span><span class="p">(),</span> <span class="n">b</span><span class="o">.</span><span class="n">numpy</span><span class="p">())</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018766
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018257
 </pre></div>
 </div>
 <p>Now we write a basic matrix multiplication using TVM TE and verify that it
@@ -1030,7 +1030,7 @@ optimizations.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-none: 3.327939
+none: 3.290384
 </pre></div>
 </div>
 <p>Let’s take a look at the intermediate representation of the operator and
@@ -1097,7 +1097,7 @@ schedule.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-blocking: 0.311314
+blocking: 0.301687
 </pre></div>
 </div>
 <p>By reordering the computation to take advantage of caching, you should see a
@@ -1158,7 +1158,7 @@ already cache friendly from our previous optimizations.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-vectorization: 0.346457
+vectorization: 0.336929
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1215,7 +1215,7 @@ more cache friendly.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-loop permutation: 0.114644
+loop permutation: 0.118714
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1293,7 +1293,7 @@ optimized schedule.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-array packing: 0.107473
+array packing: 0.109632
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1369,7 +1369,7 @@ to `C</cite> when all the block results are ready.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-block caching: 0.110177
+block caching: 0.111129
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1438,7 +1438,7 @@ of thread-level parallelization.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>/workspace/python/tvm/driver/build_module.py:267: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
   &quot;target_host parameter is going to be deprecated. &quot;
-parallelization: 0.146244
+parallelization: 0.145083
 @main = primfn(A_1: handle, B_1: handle, C_1: handle) -&gt; ()
   attr = {&quot;from_legacy_te_schedule&quot;: True, &quot;global_symbol&quot;: &quot;main&quot;, &quot;tir.noalias&quot;: True}
   buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1500,13 +1500,13 @@ working, we can compare the results.</p>
 </pre></div>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>        Operator                  Timing             Performance
-            none            3.3279385334                     1.0
-        blocking            0.3113135968     0.09354547677956822
-   vectorization            0.3464572697      0.1041056696879677
-loop permutation             0.114643695    0.034448861915389366
-   array packing     0.10747286520000002     0.03229412566409392
-   block caching            0.1101772656     0.03310676098558738
- parallelization            0.1462436361    0.043944211899427627
+            none            3.2903840148                     1.0
+        blocking     0.30168669979999996     0.09168738312702307
+   vectorization             0.336928585     0.10239795217959677
+loop permutation     0.11871362569999999     0.03607895770403437
+   array packing     0.10963219760000001     0.03331896736273921
+   block caching            0.1111290892     0.03377389651181939
+ parallelization            0.1450830362     0.04409304067471243
 </pre></div>
 </div>
 <p>Note that the outputs on the web page reflect the running times on a
@@ -1538,7 +1538,6 @@ is</p>
 you can build generic templates of the matrix multiplication and other
 operations with tunable parameters that allows you to automatically optimize
 the computation for specific platforms.</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  0.306 seconds)</p>
 <div class="sphx-glr-footer sphx-glr-footer-example docutils container" id="sphx-glr-download-tutorial-tensor-expr-get-started-py">
 <div class="sphx-glr-download sphx-glr-download-python docutils container">
 <p><a class="reference download internal" download="" href="../_downloads/40a01cffb015a67aaec0fad7e27cf80d/tensor_expr_get_started.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tensor_expr_get_started.py</span></code></a></p>