Posted to commits@tvm.apache.org by tq...@apache.org on 2022/05/02 17:23:22 UTC

[tvm-site] branch asf-site updated: deploying docs (apache/tvm@8eae317d28622238c0a6c0f22c0d4a8f9e62f883)

This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new a1387872d deploying docs (apache/tvm@8eae317d28622238c0a6c0f22c0d4a8f9e62f883)
a1387872d is described below

commit a1387872d69a41cb55edcec49f0e90b1e6c736bf
Author: tvm-bot <95...@users.noreply.github.com>
AuthorDate: Mon May 2 17:23:15 2022 +0000

    deploying docs (apache/tvm@8eae317d28622238c0a6c0f22c0d4a8f9e62f883)
---
 .../how_to/compile_models/from_mxnet.rst.txt       |    2 +-
 .../how_to/compile_models/from_oneflow.rst.txt     |    2 +-
 .../how_to/compile_models/from_paddle.rst.txt      |    2 +-
 .../how_to/compile_models/from_pytorch.rst.txt     |    2 +-
 .../how_to/compile_models/from_tensorflow.rst.txt  |    2 +-
 .../compile_models/sg_execution_times.rst.txt      |   22 +-
 .../deploy_models/deploy_model_on_android.rst.txt  |    2 +-
 .../deploy_object_detection_pytorch.rst.txt        |    4 +-
 .../deploy_models/deploy_prequantized.rst.txt      |    6 +-
 .../deploy_prequantized_tflite.rst.txt             |    4 +-
 .../how_to/deploy_models/deploy_quantized.rst.txt  |    2 +-
 .../deploy_models/deploy_ssd_gluoncv.rst.txt       |    4 +-
 .../deploy_models/sg_execution_times.rst.txt       |   16 +-
 .../extend_tvm/bring_your_own_datatypes.rst.txt    |    4 +-
 .../how_to/extend_tvm/sg_execution_times.rst.txt   |   10 +-
 .../how_to/extend_tvm/use_pass_instrument.rst.txt  |   16 +-
 .../optimize_operators/opt_conv_cuda.rst.txt       |    2 +-
 .../optimize_operators/opt_conv_tensorcore.rst.txt |    2 +-
 .../how_to/optimize_operators/opt_gemm.rst.txt     |   16 +-
 .../optimize_operators/sg_execution_times.rst.txt  |    8 +-
 .../sg_execution_times.rst.txt                     |   16 +-
 .../tune_conv2d_layer_cuda.rst.txt                 | 1839 +++++----
 .../tune_network_cuda.rst.txt                      |    2 +-
 .../tune_network_x86.rst.txt                       |    4 +-
 .../tune_sparse_x86.rst.txt                        |    4 +-
 .../tune_with_autotvm/sg_execution_times.rst.txt   |   12 +-
 .../tune_with_autotvm/tune_conv2d_cuda.rst.txt     |   34 +-
 .../work_with_microtvm/micro_autotune.rst.txt      |   16 +-
 .../work_with_microtvm/sg_execution_times.rst.txt  |   10 +-
 .../work_with_relay/sg_execution_times.rst.txt     |    8 +-
 .../work_with_schedules/sg_execution_times.rst.txt |   18 +-
 .../how_to/work_with_schedules/tensorize.rst.txt   |    2 +-
 .../tutorials/autotvm/sg_execution_times.rst.txt   |    6 +-
 .../frontend/deploy_classification.rst.txt         |    2 +-
 .../tutorials/frontend/deploy_detection.rst.txt    |    2 +-
 .../tutorials/frontend/sg_execution_times.rst.txt  |    6 +-
 .../tutorials/optimize/sg_execution_times.rst.txt  |    6 +-
 .../topic/vta/tutorials/sg_execution_times.rst.txt |    6 +-
 .../tutorial/auto_scheduler_matmul_x86.rst.txt     |    2 +-
 docs/_sources/tutorial/autotvm_relay_x86.rst.txt   |   56 +-
 .../tutorial/cross_compilation_and_rpc.rst.txt     |    2 +-
 docs/_sources/tutorial/intro_topi.rst.txt          |    2 +-
 docs/_sources/tutorial/sg_execution_times.rst.txt  |   22 +-
 .../tutorial/tensor_expr_get_started.rst.txt       |   44 +-
 docs/commit_hash                                   |    2 +-
 docs/how_to/compile_models/from_mxnet.html         |    2 +-
 docs/how_to/compile_models/from_oneflow.html       |  141 +-
 docs/how_to/compile_models/from_paddle.html        |    2 +-
 docs/how_to/compile_models/from_pytorch.html       |    6 +-
 docs/how_to/compile_models/from_tensorflow.html    |    2 +-
 docs/how_to/compile_models/sg_execution_times.html |   22 +-
 .../deploy_models/deploy_model_on_android.html     |    2 +-
 .../deploy_object_detection_pytorch.html           |  119 +-
 docs/how_to/deploy_models/deploy_prequantized.html |   13 +-
 .../deploy_models/deploy_prequantized_tflite.html  |    4 +-
 docs/how_to/deploy_models/deploy_quantized.html    |    2 +-
 docs/how_to/deploy_models/deploy_ssd_gluoncv.html  |   36 +-
 docs/how_to/deploy_models/sg_execution_times.html  |   16 +-
 .../extend_tvm/bring_your_own_datatypes.html       |    4 +-
 docs/how_to/extend_tvm/sg_execution_times.html     |   10 +-
 docs/how_to/extend_tvm/use_pass_instrument.html    |   16 +-
 docs/how_to/optimize_operators/opt_conv_cuda.html  |    2 +-
 .../optimize_operators/opt_conv_tensorcore.html    |    2 +-
 docs/how_to/optimize_operators/opt_gemm.html       |   16 +-
 .../optimize_operators/sg_execution_times.html     |    8 +-
 .../sg_execution_times.html                        |   14 +-
 .../tune_conv2d_layer_cuda.html                    | 1839 +++++----
 .../tune_with_autoscheduler/tune_network_cuda.html |    2 +-
 .../tune_with_autoscheduler/tune_network_x86.html  |    4 +-
 .../tune_with_autoscheduler/tune_sparse_x86.html   |    4 +-
 .../tune_with_autotvm/sg_execution_times.html      |   12 +-
 .../how_to/tune_with_autotvm/tune_conv2d_cuda.html |   34 +-
 docs/how_to/work_with_microtvm/micro_autotune.html |   16 +-
 .../work_with_microtvm/sg_execution_times.html     |   10 +-
 .../how_to/work_with_relay/sg_execution_times.html |    8 +-
 .../work_with_schedules/sg_execution_times.html    |   18 +-
 docs/how_to/work_with_schedules/tensorize.html     |    2 +-
 docs/reference/api/doxygen/annotated.html          |  160 +-
 .../doxygen/apply__history__best_8h_source.html    |    2 +-
 docs/reference/api/doxygen/builder_8h_source.html  |    2 +-
 docs/reference/api/doxygen/classes.html            |  427 +-
 .../doxygen/classtvm_1_1LinkedParam-members.html   |  102 -
 .../api/doxygen/classtvm_1_1LinkedParam.html       |  256 --
 .../classtvm_1_1LinkedParamNode-members.html       |  115 -
 .../api/doxygen/classtvm_1_1LinkedParamNode.html   |  320 --
 .../classtvm_1_1LinkedParamNode__coll__graph.svg   |  185 -
 ...classtvm_1_1LinkedParamNode__inherit__graph.svg |   76 -
 .../classtvm_1_1LinkedParam__coll__graph.svg       |   92 -
 .../classtvm_1_1LinkedParam__inherit__graph.svg    |   62 -
 .../api/doxygen/classtvm_1_1runtime_1_1Object.html |    2 +-
 .../doxygen/classtvm_1_1runtime_1_1ObjectRef.html  |    2 +-
 ...asstvm_1_1runtime_1_1ObjectRef__coll__graph.svg |   12 +-
 .../classtvm_1_1runtime_1_1Object__coll__graph.svg |    8 +-
 docs/reference/api/doxygen/codegen_8h_source.html  |    2 +-
 docs/reference/api/doxygen/database_8h_source.html |    2 +-
 .../api/doxygen/dataflow__matcher_8h_source.html   |    2 +-
 .../api/doxygen/diagnostic_8h_source.html          |    2 +-
 docs/reference/api/doxygen/error_8h_source.html    |    2 +-
 .../api/doxygen/extracted__task_8h_source.html     |    2 +-
 docs/reference/api/doxygen/functions__.html        |    1 -
 docs/reference/api/doxygen/functions_func_l.html   |    5 +-
 docs/reference/api/doxygen/functions_func_s.html   |    4 +-
 docs/reference/api/doxygen/functions_func_t.html   |    7 +-
 docs/reference/api/doxygen/functions_func_v.html   |   31 +-
 docs/reference/api/doxygen/functions_i.html        |   11 +-
 docs/reference/api/doxygen/functions_l.html        |    5 +-
 docs/reference/api/doxygen/functions_p.html        |    7 +-
 docs/reference/api/doxygen/functions_s.html        |    2 +-
 docs/reference/api/doxygen/functions_t.html        |   15 +-
 docs/reference/api/doxygen/functions_v.html        |   35 +-
 docs/reference/api/doxygen/functions_vars.html     |    1 -
 docs/reference/api/doxygen/functions_vars_i.html   |    3 -
 docs/reference/api/doxygen/functions_vars_p.html   |    3 -
 docs/reference/api/doxygen/hierarchy.html          |  740 ++--
 docs/reference/api/doxygen/inherit_graph_10.svg    |   16 +-
 docs/reference/api/doxygen/inherit_graph_107.svg   | 3035 +++++++-------
 docs/reference/api/doxygen/inherit_graph_116.svg   | 4353 ++++++++++----------
 docs/reference/api/doxygen/inherit_graph_185.svg   |    8 +-
 docs/reference/api/doxygen/inherit_graph_199.svg   |   16 +-
 docs/reference/api/doxygen/inherit_graph_200.svg   |   16 +-
 docs/reference/api/doxygen/inherit_graph_39.svg    |   16 +-
 docs/reference/api/doxygen/inherit_graph_43.svg    |    8 +-
 docs/reference/api/doxygen/inherits.html           |    2 +-
 .../api/doxygen/instrument_8h_source.html          |    2 +-
 .../api/doxygen/interpreter_8h_source.html         |    2 +-
 docs/reference/api/doxygen/ir_2module_8h.html      |    6 -
 .../api/doxygen/ir_2module_8h_source.html          |   78 +-
 .../api/doxygen/ir_2transform_8h_source.html       |    2 +-
 .../api/doxygen/namespacemembers_func_s.html       |    6 +-
 docs/reference/api/doxygen/namespacemembers_k.html |    5 +-
 docs/reference/api/doxygen/namespacemembers_s.html |    6 +-
 .../api/doxygen/namespacemembers_vars.html         |    3 -
 docs/reference/api/doxygen/namespacetvm.html       |    6 -
 .../api/doxygen/namespacetvm_1_1tir_1_1attr.html   |   21 -
 docs/reference/api/doxygen/parser_8h_source.html   |    2 +-
 .../api/doxygen/relay_2feature_8h_source.html      |    2 +-
 .../api/doxygen/relay_2transform_8h_source.html    |    2 +-
 docs/reference/api/doxygen/search/all_1.js         |    2 +-
 docs/reference/api/doxygen/search/all_11.js        |    1 -
 docs/reference/api/doxygen/search/all_13.js        |    6 +-
 docs/reference/api/doxygen/search/all_14.js        |   14 +-
 docs/reference/api/doxygen/search/all_15.js        |   12 +-
 docs/reference/api/doxygen/search/all_16.js        |    4 +-
 docs/reference/api/doxygen/search/all_17.js        |    8 +-
 docs/reference/api/doxygen/search/all_a.js         |    2 +-
 docs/reference/api/doxygen/search/all_c.js         |    1 -
 docs/reference/api/doxygen/search/all_d.js         |    2 -
 docs/reference/api/doxygen/search/all_e.js         |    2 +-
 docs/reference/api/doxygen/search/classes_10.js    |    4 +-
 docs/reference/api/doxygen/search/classes_11.js    |    2 +-
 docs/reference/api/doxygen/search/classes_13.js    |    4 +-
 docs/reference/api/doxygen/search/classes_9.js     |    2 -
 docs/reference/api/doxygen/search/classes_f.js     |    2 +-
 docs/reference/api/doxygen/search/functions_12.js  |    2 +-
 docs/reference/api/doxygen/search/functions_13.js  |    4 +-
 docs/reference/api/doxygen/search/functions_14.js  |    8 +-
 docs/reference/api/doxygen/search/functions_15.js  |    2 +-
 docs/reference/api/doxygen/search/functions_16.js  |    4 +-
 docs/reference/api/doxygen/search/functions_c.js   |    1 -
 docs/reference/api/doxygen/search/functions_d.js   |    2 +-
 docs/reference/api/doxygen/search/variables_0.js   |    2 +-
 docs/reference/api/doxygen/search/variables_9.js   |    1 -
 docs/reference/api/doxygen/search/variables_a.js   |    1 -
 docs/reference/api/doxygen/search/variables_f.js   |    1 -
 .../api/doxygen/space__generator_8h_source.html    |    2 +-
 docs/reference/api/doxygen/state_8h_source.html    |    2 +-
 .../api/doxygen/tir_2analysis_8h_source.html       |    4 +-
 docs/reference/api/doxygen/tir_2function_8h.html   |    3 -
 .../api/doxygen/tir_2function_8h_source.html       |    5 +-
 .../doxygen/tir_2schedule_2schedule_8h_source.html |    2 +-
 .../api/doxygen/tir_2usmp_2utils_8h_source.html    |    2 +-
 .../api/doxygen/type__relation_8h_source.html      |    2 +-
 docs/reference/api/python/auto_scheduler.html      |    4 +-
 .../api/typedoc/classes/bytestreamreader.html      |   12 +-
 .../api/typedoc/classes/cachedcallstack.html       |   34 +-
 docs/reference/api/typedoc/classes/dldatatype.html |   12 +-
 docs/reference/api/typedoc/classes/dldevice.html   |   10 +-
 .../reference/api/typedoc/classes/environment.html |   12 +-
 docs/reference/api/typedoc/classes/ffilibrary.html |   20 +-
 .../api/typedoc/classes/graphexecutor.html         |   16 +-
 docs/reference/api/typedoc/classes/instance.html   |   40 +-
 docs/reference/api/typedoc/classes/memory.html     |   34 +-
 docs/reference/api/typedoc/classes/module.html     |   10 +-
 docs/reference/api/typedoc/classes/ndarray.html    |   22 +-
 .../api/typedoc/classes/packedfunccell.html        |    6 +-
 docs/reference/api/typedoc/classes/rpcserver.html  |   14 +-
 docs/reference/api/typedoc/classes/scalar.html     |    6 +-
 .../api/typedoc/classes/webgpucontext.html         |   12 +-
 docs/reference/api/typedoc/enums/argtypecode.html  |   30 +-
 .../api/typedoc/enums/aynccallbackcode.html        |    4 +-
 .../api/typedoc/enums/dldatatypecode.html          |    8 +-
 .../api/typedoc/enums/rpcserverstate.html          |   12 +-
 docs/reference/api/typedoc/enums/sizeof.html       |   18 +-
 docs/reference/api/typedoc/index.html              |  112 +-
 .../api/typedoc/interfaces/disposable.html         |    2 +-
 .../api/typedoc/interfaces/functioninfo.html       |    6 +-
 .../api/typedoc/interfaces/libraryprovider.html    |    4 +-
 docs/searchindex.js                                |    2 +-
 .../vta/tutorials/autotvm/sg_execution_times.html  |    6 +-
 .../tutorials/frontend/deploy_classification.html  |    2 +-
 .../vta/tutorials/frontend/deploy_detection.html   |    2 +-
 .../vta/tutorials/frontend/sg_execution_times.html |    6 +-
 .../vta/tutorials/optimize/sg_execution_times.html |    6 +-
 docs/topic/vta/tutorials/sg_execution_times.html   |    6 +-
 docs/tutorial/auto_scheduler_matmul_x86.html       |    2 +-
 docs/tutorial/autotvm_relay_x86.html               |  170 +-
 docs/tutorial/cross_compilation_and_rpc.html       |    2 +-
 docs/tutorial/intro_topi.html                      |    2 +-
 docs/tutorial/sg_execution_times.html              |   22 +-
 docs/tutorial/tensor_expr_get_started.html         |   44 +-
 210 files changed, 7278 insertions(+), 8529 deletions(-)
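
For anyone who prefers to inspect the full change locally rather than read the inline diff below, the following is a rough sketch using standard git commands; the repository URL, branch name, and commit hash are taken from the header above, and everything else is ordinary git usage rather than anything specific to this automation.

    git clone https://gitbox.apache.org/repos/asf/tvm-site.git
    cd tvm-site
    git fetch origin asf-site
    # summary of the 210 changed files in this push
    git show --stat a1387872d69a41cb55edcec49f0e90b1e6c736bf
    # or limit the diff to a single path, e.g. the tracked TVM commit hash
    git show a1387872d69a41cb55edcec49f0e90b1e6c736bf -- docs/commit_hash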

diff --git a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
index 33bef5582..dbc5c94e1 100644
--- a/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_mxnet.rst.txt
@@ -98,7 +98,7 @@ In this section, we download a pretrained imagenet model and classify an image.
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip1902edea-a301-4f5a-baf4-31e1d3057145 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+    Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip0eb6e8fc-3d27-4193-9d1a-e0aa100151b4 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
     x (1, 3, 224, 224)
 
 
diff --git a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
index 31a9b2b97..46146a4ad 100644
--- a/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_oneflow.rst.txt
@@ -100,7 +100,7 @@ Load a pretrained OneFlow model and save model
  .. code-block:: none
 
     Downloading: "https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip" to /workspace/.oneflow/flowvision_cache/resnet18.zip
-
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
      0%|          | 16.0k/41.5M [00:00<08:20, 86.8kB/s]
      0%|          | 48.0k/41.5M [00:00<05:16, 137kB/s] 
      0%|          | 96.0k/41.5M [00:00<03:44, 193kB/s]
      0%|          | 160k/41.5M [00:00<02:51, 253kB/s] 
      1%|          | 288k/41.5M [00:00<01:45, 411kB/s]
      1%|1         | 464k/41.5M [00:01<01:12, 594kB/s]
      1%|1         | 600k/41.5M [00:01<01:07, 640kB/s]
      2%|2         | 952k/41.5M [00:01<00:40, 1.04MB/s]
      3%|3         | 1.35M/41.5M [00:01<00:29, 1.44MB/s]
      5%|5         | 2.09M/41.5M [00:01<00:18, 2.26MB/s]
      9%|8         | 3.56M/41.5M [00:02<00:09, 4.06MB/s]
     12%|#2        | 5.03M/41.5M [00:02<00:06, 5.92MB/s]
     14%|#4        | 5.92M/41.5M [00:02<00:05, 6.37MB/s]
     16%|#5        | 6.59M/41.5M [00:02<00:06, 6.03MB/s]
     19%|#9        | 7.97M/41.5M [00:02<00:04, 7.16MB/s]
     23%|##2       | 9.39M/41.5M [00:02<00:03, 8.90MB/s]
     25%|##4       | 10.3M/41.5M [00:02<00:03, 8.32MB/s]
     27%|##6       | 11.2M/41.5M [00:03<00:04, 7.02MB/s]
     30%|##9       | 12.4M/41.5M [00:03<00:04, 6.96MB/s]
     33%|###3      | 13.9M/41.5M [00:03<00:03, 7.35MB/s]
     37%|###6      | 15.3M/41.5M [00:03<00:03, 8.89MB/s]
     39%|###9      | 16.3M/41.5M [00:03<00:03, 8.35MB/s]
     41%|####1     | 17.1M/41.5M [00:03<00:03, 7.82MB/s]
     44%|####4     | 18.3M/41.5M [00:03<00:02, 8.76MB/s]
     46%|####6     | 19.1M/41.5M [00:04<00:02, 8.10MB/s]
     48%|####8     | 20.0M/41.5M [00:04<00:03, 7.52MB/s]
     51%|#####1    | 21.2M/41.5M [00:04<00:02, 8.75MB/s]
     53%|#####3    | 22.1M/41.5M [00:04<00:02, 8.08MB/s]
     55%|#####5    | 22.9M/41.5M [00:04<00:02, 7.47MB/s]
     58%|#####8    | 24.1M/41.5M [00:04<00:02, 8.66MB/s]
     60%|######    | 25.0M/41.5M [00:04<00:02, 8.11MB/s]
     62%|######2   | 25.8M/41.5M [00:04<00:02, 7.49MB/s]
     65%|######5   | 27.0M/41.5M [00:05<00:01, 8.67MB/s]
     67%|######7   | 27.9M/41.5M [00:05<00:01, 8.09MB/s]
     69%|######9   | 28.7M/41.5M [00:05<00:01, 7.46MB/s]
     72%|#######2  | 30.0M/41.5M [00:05<00:01, 8.69MB/s]
     74%|#######4  | 30.9M/41.5M [00:05<00:01, 8.13MB/s]
     76%|#######6  | 31.7M/41.5M [00:05<00:01, 7.48MB/s]
     79%|#######9  | 32.9M/41.5M [00:05<00:01, 8.72MB/s]
     81%|########1 | 33.8M/41.5M [00:05<00:00, 8.14MB/s]
     83%|########3 | 34.6M/41.5M [00:06<00:00, 7.50MB/s]
     86%|########6 | 35.9M/41.5M [00:06<00:00, 8.75MB/s]
     89%|########8 | 36.7M/41.5M [00:06<00:00, 8.15MB/s]
     90%|######### | 37.5M/41.5M [00:06<00:00, 7.51MB/s]
     94%|#########3| 38.8M/41.5M [00:06<00:00, 8.71MB/s]
     96%|#########5| 39.7M/41.5M [00:06<00:00, 8.15MB/s]
     98%|#########7| 40.5M/41.5M [00:06<00:00, 7.51MB/s]
    100%|##########| 41.5M/41.5M [00:06<00:00, 6.32MB/s]
+
      0%|          | 0.00/41.5M [00:00<?, ?B/s]
      0%|          | 16.0k/41.5M [00:00<08:38, 83.8kB/s]
      0%|          | 48.0k/41.5M [00:00<05:45, 126kB/s] 
      0%|          | 96.0k/41.5M [00:00<04:13, 172kB/s]
      0%|          | 160k/41.5M [00:00<03:03, 236kB/s] 
      1%|          | 216k/41.5M [00:01<02:49, 256kB/s]
      1%|          | 280k/41.5M [00:01<02:38, 272kB/s]
      1%|          | 336k/41.5M [00:01<02:39, 270kB/s]
      1%|          | 408k/41.5M [00:01<02:20, 306kB/s]
      1%|1         | 480k/41.5M [00:01<02:11, 327kB/s]
      1%|1         | 552k/41.5M [00:02<02:09, 332kB/s]
      2%|1         | 640k/41.5M [00:02<01:59, 358kB/s]
      2%|1         | 720k/41.5M [00:02<01:52, 380kB/s]
      2%|1         | 808k/41.5M [00:02<01:46, 401kB/s]
      2%|2         | 904k/41.5M [00:02<01:42, 417kB/s]
      2%|2         | 0.98M/41.5M [00:03<01:36, 440kB/s]
      3%|2         | 1.09M/41.5M [00:03<01:29, 476kB/s]
      3%|2         | 1.19M/41.5M [00:03<01:25, 492kB/s]
      3%|3         | 1.30M/41.5M [00:03<01:23, 503kB/s]
      3%|3         | 1.41M/41.5M [00:03<01:19, 529kB/s]
      4%|3         | 1.54M/41.5M [00:04<01:12, 576kB/s]
      4%|4         | 1.67M/41.5M [00:04<01:08, 614kB/s]
      4%|4         | 1.80M/41.5M [00:04<01:06, 624kB/s]
      5%|4         | 1.95M/41.5M [00:04<01:04, 641kB/s]
      5%|5         | 2.09M/41.5M [00:04<00:59, 694kB/s]
      5%|5         | 2.26M/41.5M [00:05<00:55, 744kB/s]
      6%|5         | 2.41M/41.5M [00:05<00:54, 749kB/s]
      6%|6         | 2.59M/41.5M [00:05<00:51, 788kB/s]
      7%|6         | 2.77M/41.5M [00:05<00:47, 848kB/s]
      7%|7         | 2.97M/41.5M [00:05<00:44, 899kB/s]
      8%|7         | 3.16M/41.5M [00:06<00:43, 914kB/s]
      8%|8         | 3.37M/41.5M [00:06<00:42, 943kB/s]
      9%|8         | 3.59M/41.5M [00:06<00:38, 1.02MB/s]
      9%|9         | 3.82M/41.5M [00:06<00:36, 1.08MB/s]
     10%|9         | 4.05M/41.5M [00:06<00:35, 1.10MB/s]
     10%|#         | 4.31M/41.5M [00:07<00:33, 1.15MB/s]
     11%|#1        | 4.57M/41.5M [00:07<00:31, 1.23MB/s]
     12%|#1        | 4.85M/41.5M [00:07<00:29, 1.30MB/s]
     12%|#2        | 5.15M/41.5M [00:07<00:28, 1.34MB/s]
     13%|#3        | 5.45M/41.5M [00:07<00:27, 1.39MB/s]
     14%|#3        | 5.76M/41.5M [00:08<00:25, 1.48MB/s]
     15%|#4        | 6.08M/41.5M [00:08<00:24, 1.55MB/s]
     15%|#5        | 6.41M/41.5M [00:08<00:23, 1.57MB/s]
     16%|#6        | 6.77M/41.5M [00:08<00:18, 1.93MB/s]
     17%|#6        | 6.98M/41.5M [00:08<00:18, 1.93MB/s]
     17%|#7        | 7.18M/41.5M [00:09<00:21, 1.66MB/s]
     18%|#8        | 7.53M/41.5M [00:09<00:20, 1.73MB/s]
     19%|#9        | 7.95M/41.5M [00:09<00:19, 1.83MB/s]
     20%|##        | 8.38M/41.5M [00:09<00:18, 1.92MB/s]
     21%|##1       | 8.84M/41.5M [00:09<00:16, 2.11MB/s]
     22%|##2       | 9.32M/41.5M [00:09<00:13, 2.49MB/s]
     24%|##3       | 9.77M/41.5M [00:10<00:11, 2.84MB/s]
     24%|##4       | 10.1M/41.5M [00:10<00:12, 2.54MB/s]
     25%|##5       | 10.4M/41.5M [00:10<00:14, 2.27MB/s]
     26%|##6       | 11.0M/41.5M [00:10<00:10, 2.99MB/s]
     27%|##7       | 11.3M/41.5M [00:10<00:10, 3.01MB/s]
     28%|##7       | 11.6M/41.5M [00:10<00:12, 2.54MB/s]
     29%|##9       | 12.2M/41.5M [00:11<00:10, 2.82MB/s]
     31%|###1      | 12.9M/41.5M [00:11<00:08, 3.53MB/s]
     32%|###1      | 13.2M/41.5M [00:11<00:08, 3.38MB/s]
     33%|###2      | 13.6M/41.5M [00:11<00:08, 3.36MB/s]
     34%|###4      | 14.3M/41.5M [00:11<00:07, 4.06MB/s]
     35%|###5      | 14.7M/41.5M [00:11<00:07, 3.66MB/s]
     37%|###6      | 15.1M/41.5M [00:11<00:07, 3.69MB/s]
     38%|###7      | 15.7M/41.5M [00:11<00:06, 4.08MB/s]
     39%|###8      | 16.1M/41.5M [00:12<00:07, 3.36MB/s]
     41%|####      | 16.8M/41.5M [00:12<00:06, 4.18MB/s]
     42%|####2     | 17.6M/41.5M [00:12<00:05, 4.96MB/s]
     44%|####3     | 18.1M/41.5M [00:12<00:05, 4.44MB/s]
     45%|####5     | 18.7M/41.5M [00:12<00:05, 4.48MB/s]
     47%|####6     | 19.4M/41.5M [00:12<00:04, 4.87MB/s]
     48%|####7     | 19.8M/41.5M [00:12<00:05, 4.09MB/s]
     50%|#####     | 20.8M/41.5M [00:13<00:04, 5.27MB/s]
     52%|#####1    | 21.5M/41.5M [00:13<00:03, 5.80MB/s]
     53%|#####3    | 22.1M/41.5M [00:13<00:04, 4.94MB/s]
     56%|#####5    | 23.1M/41.5M [00:13<00:03, 5.83MB/s]
     57%|#####7    | 23.8M/41.5M [00:13<00:02, 6.28MB/s]
     59%|#####8    | 24.4M/41.5M [00:13<00:03, 5.10MB/s]
     62%|######1   | 25.6M/41.5M [00:13<00:02, 6.65MB/s]
     64%|######3   | 26.5M/41.5M [00:13<00:02, 7.27MB/s]
     66%|######5   | 27.3M/41.5M [00:14<00:02, 6.18MB/s]
     69%|######8   | 28.4M/41.5M [00:14<00:01, 7.41MB/s]
     70%|#######   | 29.2M/41.5M [00:14<00:01, 7.33MB/s]
     72%|#######2  | 30.0M/41.5M [00:14<00:02, 5.90MB/s]
     76%|#######5  | 31.4M/41.5M [00:14<00:01, 7.08MB/s]
     79%|#######9  | 32.8M/41.5M [00:14<00:01, 7.35MB/s]
     83%|########2 | 34.3M/41.5M [00:14<00:00, 8.86MB/s]
     85%|########4 | 35.2M/41.5M [00:15<00:00, 7.87MB/s]
     87%|########6 | 36.0M/41.5M [00:15<00:00, 6.53MB/s]
     90%|########9 | 37.2M/41.5M [00:15<00:00, 6.65MB/s]
     93%|#########3| 38.7M/41.5M [00:15<00:00, 8.34MB/s]
     95%|#########5| 39.6M/41.5M [00:15<00:00, 8.45MB/s]
     98%|#########7| 40.5M/41.5M [00:15<00:00, 6.84MB/s]
    100%|##########| 41.5M/41.5M [00:16<00:00, 2.72MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_paddle.rst.txt b/docs/_sources/how_to/compile_models/from_paddle.rst.txt
index 2e56e48e3..cc728cfa2 100644
--- a/docs/_sources/how_to/compile_models/from_paddle.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_paddle.rst.txt
@@ -201,7 +201,7 @@ Look up prediction top 1 index in 1000 class synset.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  13.822 seconds)
+   **Total running time of the script:** ( 1 minutes  5.545 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_paddle.py:
diff --git a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
index bbef14bec..1ee8db677 100644
--- a/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_pytorch.rst.txt
@@ -79,7 +79,7 @@ Load a pretrained PyTorch model
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
-
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
     15%|#5        | 6.88M/44.7M [00:00<00:00, 72.1MB/s]
     50%|####9     | 22.2M/44.7M [00:00<00:00, 124MB/s] 
     92%|#########1| 40.9M/44.7M [00:00<00:00, 149MB/s]
    100%|##########| 44.7M/44.7M [00:00<00:00, 124MB/s]
+
      0%|          | 0.00/44.7M [00:00<?, ?B/s]
     42%|####1     | 18.7M/44.7M [00:00<00:00, 197MB/s]
    100%|##########| 44.7M/44.7M [00:00<00:00, 242MB/s]
 
 
 
diff --git a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
index f5de718f3..7c2639e0a 100644
--- a/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
+++ b/docs/_sources/how_to/compile_models/from_tensorflow.rst.txt
@@ -372,7 +372,7 @@ Run the corresponding model on tensorflow
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  2.951 seconds)
+   **Total running time of the script:** ( 1 minutes  3.090 seconds)
 
 
 .. _sphx_glr_download_how_to_compile_models_from_tensorflow.py:
diff --git a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
index f90cc4ff9..3eca36330 100644
--- a/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/compile_models/sg_execution_times.rst.txt
@@ -5,15 +5,15 @@
 
 Computation times
 =================
-**05:29.219** total execution time for **how_to_compile_models** files:
+**05:32.921** total execution time for **how_to_compile_models** files:
 
-- **01:13.822**: :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)
-- **01:02.951**: :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``)
-- **00:57.140**: :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)
-- **00:31.855**: :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)
-- **00:25.538**: :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)
-- **00:21.532**: :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)
-- **00:21.205**: :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)
-- **00:18.939**: :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)
-- **00:13.521**: :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)
-- **00:02.716**: :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)
+- **01:05.545**: :ref:`sphx_glr_how_to_compile_models_from_paddle.py` (``from_paddle.py``)
+- **01:03.090**: :ref:`sphx_glr_how_to_compile_models_from_tensorflow.py` (``from_tensorflow.py``)
+- **00:57.015**: :ref:`sphx_glr_how_to_compile_models_from_darknet.py` (``from_darknet.py``)
+- **00:40.008**: :ref:`sphx_glr_how_to_compile_models_from_oneflow.py` (``from_oneflow.py``)
+- **00:27.136**: :ref:`sphx_glr_how_to_compile_models_from_tflite.py` (``from_tflite.py``)
+- **00:22.196**: :ref:`sphx_glr_how_to_compile_models_from_mxnet.py` (``from_mxnet.py``)
+- **00:21.117**: :ref:`sphx_glr_how_to_compile_models_from_coreml.py` (``from_coreml.py``)
+- **00:19.158**: :ref:`sphx_glr_how_to_compile_models_from_pytorch.py` (``from_pytorch.py``)
+- **00:14.886**: :ref:`sphx_glr_how_to_compile_models_from_keras.py` (``from_keras.py``)
+- **00:02.770**: :ref:`sphx_glr_how_to_compile_models_from_onnx.py` (``from_onnx.py``)
diff --git a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
index 1fa4b219d..27cc84f44 100644
--- a/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_model_on_android.rst.txt
@@ -393,7 +393,7 @@ Execute on TVM
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      16.1746      16.0475      16.9060      15.8896       0.3173   
+      16.2819      16.1866      16.7154      16.1016       0.2104   
                
 
 
diff --git a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
index 39efb2d4e..a4f2da192 100644
--- a/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_object_detection_pytorch.rst.txt
@@ -108,7 +108,7 @@ Load pre-trained maskrcnn from torchvision and do tracing
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth" to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
-
      0%|          | 0.00/170M [00:00<?, ?B/s]
      1%|          | 1.25M/170M [00:00<00:13, 12.9MB/s]
      3%|2         | 4.25M/170M [00:00<00:07, 23.5MB/s]
      4%|3         | 6.50M/170M [00:00<00:07, 22.2MB/s]
      6%|5         | 9.38M/170M [00:00<00:06, 25.1MB/s]
      7%|7         | 12.2M/170M [00:00<00:06, 26.5MB/s]
      9%|8         | 14.8M/170M [00:00<00:07, 22.6MB/s]
     11%|#         | 18.2M/170M [00:00<00:05, 26.5MB/s]
     12%|#2        | 20.9M/170M [00:00<00:06, 25.2MB/s]
     14%|#3        | 23.4M/170M [00:01<00:08, 18.9MB/s]
     15%|#4        | 25.4M/170M [00:01<00:08, 17.7MB/s]
     16%|#6        | 27.3M/170M [00:01<00:08, 16.8MB/s]
     18%|#7        | 29.8M/170M [00:01<00:07, 19.0MB/s]
     19%|#8        | 32.2M/170M [00:01<00:07, 20.4MB/s]
     20%|##        | 34.5M/170M [00:01<00:06, 21.3MB/s]
     22%|##1       | 36.6M/170M [00:01<00:08, 16.7MB/s]
     23%|##2       | 38.6M/170M [00:02<00:07, 17.6MB/s]
     24%|##3       | 40.5M/170M [00:02<00:07, 17.4MB/s]
     25%|##4       | 42.4M/170M [00:02<00:07, 18.2MB/s]
     27%|##6       | 45.3M/170M [00:02<00:06, 20.6MB/s]
     28%|##7       | 47.4M/170M [00:02<00:06, 18.8MB/s]
     29%|##9       | 49.5M/170M [00:02<00:06, 18.8MB/s]
     30%|###       | 51.3M/170M [00:02<00:07, 17.3MB/s]
     31%|###1      | 53.0M/170M [00:02<00:07, 16.7MB/s]
     32%|###2      | 54.6M/170M [00:02<00:07, 15.6MB/s]
     33%|###3      | 56.2M/170M [00:03<00:07, 15.7MB/s]
     34%|###3      | 57.7M/170M [00:03<00:07, 14.9MB/s]
     35%|###5      | 59.6M/170M [00:03<00:07, 15.9MB/s]
     37%|###6      | 62.6M/170M [00:03<00:05, 19.4MB/s]
     39%|###8      | 65.9M/170M [00:03<00:04, 23.5MB/s]
     40%|####      | 68.4M/170M [00:03<00:04, 23.7MB/s]
     42%|####1     | 70.7M/170M [00:03<00:04, 23.7MB/s]
     43%|####2     | 73.0M/170M [00:03<00:04, 21.3MB/s]
     44%|####4     | 75.1M/170M [00:04<00:06, 15.3MB/s]
     46%|####5     | 77.7M/170M [00:04<00:05, 17.8MB/s]
     47%|####6     | 79.7M/170M [00:04<00:05, 18.3MB/s]
     48%|####8     | 81.7M/170M [00:04<00:04, 18.8MB/s]
     49%|####9     | 83.6M/170M [00:04<00:05, 17.9MB/s]
     51%|#####     | 85.9M/170M [00:04<00:04, 19.4MB/s]
     52%|#####2    | 88.7M/170M [00:04<00:03, 22.2MB/s]
     54%|#####3    | 90.9M/170M [00:04<00:04, 18.4MB/s]
     55%|#####4    | 93.3M/170M [00:05<00:04, 19.9MB/s]
     56%|#####6    | 95.4M/170M [00:05<00:04, 18.5MB/s]
     57%|#####7    | 97.2M/170M [00:05<00:04, 17.1MB/s]
     58%|#####8    | 99.0M/170M [00:05<00:04, 17.1MB/s]
     59%|#####9    | 101M/170M [00:05<00:04, 17.1MB/s] 
     60%|######    | 102M/170M [00:05<00:04, 16.1MB/s]
     61%|######1   | 104M/170M [00:05<00:05, 13.7MB/s]
     62%|######1   | 105M/170M [00:05<00:05, 12.4MB/s]
     63%|######2   | 107M/170M [00:06<00:04, 13.5MB/s]
     64%|######4   | 109M/170M [00:06<00:04, 15.1MB/s]
     65%|######5   | 111M/170M [00:06<00:03, 16.3MB/s]
     67%|######7   | 115M/170M [00:06<00:02, 22.2MB/s]
     69%|######8   | 117M/170M [00:06<00:02, 21.9MB/s]
     71%|#######   | 120M/170M [00:06<00:02, 24.5MB/s]
     72%|#######1  | 122M/170M [00:06<00:02, 21.4MB/s]
     73%|#######3  | 124M/170M [00:06<00:02, 19.0MB/s]
     74%|#######4  | 126M/170M [00:07<00:02, 18.4MB/s]
     75%|#######5  | 128M/170M [00:07<00:02, 17.7MB/s]
     76%|#######6  | 130M/170M [00:07<00:02, 16.9MB/s]
     77%|#######7  | 131M/170M [00:07<00:02, 15.1MB/s]
     78%|#######8  | 133M/170M [00:07<00:02, 14.6MB/s]
     79%|#######9  | 134M/170M [00:07<00:02, 13.7MB/s]
     80%|########  | 136M/170M [00:07<00:02, 14.9MB/s]
     81%|########1 | 138M/170M [00:07<00:02, 15.8MB/s]
     82%|########2 | 140M/170M [00:07<00:02, 15.2MB/s]
     83%|########3 | 141M/170M [00:08<00:02, 13.7MB/s]
     84%|########3 | 143M/170M [00:08<00:02, 14.3MB/s]
     85%|########4 | 144M/170M [00:08<00:01, 14.7MB/s]
     86%|########5 | 146M/170M [00:08<00:01, 15.3MB/s]
     87%|########6 | 147M/170M [00:08<00:01, 15.5MB/s]
     88%|########7 | 149M/170M [00:08<00:01, 15.5MB/s]
     89%|########8 | 151M/170M [00:08<00:01, 17.2MB/s]
     90%|######### | 153M/170M [00:08<00:01, 17.7MB/s]
     91%|#########1| 155M/170M [00:08<00:00, 18.9MB/s]
     92%|#########2| 157M/170M [00:09<00:00, 16.8MB/s]
     94%|#########3| 159M/170M [00:09<00:00, 17.5MB/s]
     95%|#########4| 161M/170M [00:09<00:00, 17.1MB/s]
     96%|#########5| 162M/170M [00:09<00:00, 16.4MB/s]
     97%|#########6| 164M/170M [00:09<00:00, 16.8MB/s]
     98%|#########7| 166M/170M [00:09<00:00, 15.2MB/s]
     98%|#########8| 167M/170M [00:09<00:00, 13.4MB/s]
     99%|#########9| 169M/170M [00:09<00:00, 13.1MB/s]
    100%|##########| 170M/170M [00:10<00:00, 17.7MB/s]
+
      0%|          | 0.00/170M [00:00<?, ?B/s]
      3%|3         | 5.56M/170M [00:00<00:02, 58.3MB/s]
      7%|6         | 11.2M/170M [00:00<00:02, 59.0MB/s]
     10%|9         | 16.9M/170M [00:00<00:03, 49.8MB/s]
     13%|#2        | 21.8M/170M [00:00<00:03, 48.8MB/s]
     16%|#5        | 26.9M/170M [00:00<00:02, 50.5MB/s]
     19%|#9        | 32.9M/170M [00:00<00:02, 54.0MB/s]
     23%|##2       | 38.5M/170M [00:00<00:02, 55.6MB/s]
     26%|##5       | 44.0M/170M [00:00<00:02, 56.2MB/s]
     29%|##9       | 49.5M/170M [00:00<00:02, 56.6MB/s]
     32%|###2      | 54.9M/170M [00:01<00:02, 50.5MB/s]
     35%|###5      | 59.9M/170M [00:01<00:02, 49.9MB/s]
     39%|###8      | 65.8M/170M [00:01<00:02, 53.3MB/s]
     42%|####2     | 71.4M/170M [00:01<00:01, 54.6MB/s]
     45%|####5     | 76.7M/170M [00:01<00:01, 52.1MB/s]
     48%|####8     | 82.3M/170M [00:01<00:01, 54.2MB/s]
     52%|#####1    | 87.9M/170M [00:01<00:01, 55.5MB/s]
     55%|#####4    | 93.3M/170M [00:01<00:01, 42.7MB/s]
     59%|#####8    | 99.7M/170M [00:02<00:01, 48.7MB/s]
     62%|######1   | 105M/170M [00:02<00:01, 50.8MB/s] 
     65%|######5   | 110M/170M [00:02<00:01, 52.2MB/s]
     68%|######8   | 116M/170M [00:02<00:01, 45.5MB/s]
     71%|#######   | 120M/170M [00:02<00:01, 37.8MB/s]
     73%|#######3  | 124M/170M [00:02<00:01, 37.5MB/s]
     76%|#######5  | 128M/170M [00:02<00:01, 33.8MB/s]
     78%|#######7  | 132M/170M [00:02<00:01, 34.4MB/s]
     80%|#######9  | 136M/170M [00:03<00:00, 36.5MB/s]
     82%|########2 | 139M/170M [00:03<00:00, 32.9MB/s]
     84%|########4 | 143M/170M [00:03<00:00, 35.1MB/s]
     88%|########8 | 150M/170M [00:03<00:00, 42.7MB/s]
     91%|#########1| 155M/170M [00:03<00:00, 46.0MB/s]
     94%|#########3| 159M/170M [00:03<00:00, 42.6MB/s]
     97%|#########6| 164M/170M [00:03<00:00, 44.0MB/s]
    100%|#########9| 170M/170M [00:03<00:00, 48.1MB/s]
    100%|##########| 170M/170M [00:03<00:00, 46.5MB/s]
     /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
       for i in range(dim)
     /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
@@ -253,7 +253,7 @@ Get boxes with score larger than 0.9
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 3 minutes  14.259 seconds)
+   **Total running time of the script:** ( 3 minutes  7.991 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_object_detection_pytorch.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
index dd4e692ad..758590cb4 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized.rst.txt
@@ -187,7 +187,7 @@ training. Other models require a full post training calibration.
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/mobilenet_v2-b0353104.pth" to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
-
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
     14%|#3        | 1.88M/13.6M [00:00<00:00, 18.4MB/s]
     27%|##6       | 3.63M/13.6M [00:00<00:00, 16.2MB/s]
     40%|###9      | 5.39M/13.6M [00:00<00:00, 17.1MB/s]
     55%|#####4    | 7.44M/13.6M [00:00<00:00, 18.5MB/s]
     76%|#######6  | 10.3M/13.6M [00:00<00:00, 22.3MB/s]
     92%|#########1| 12.5M/13.6M [00:00<00:00, 21.8MB/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 20.2MB/s]
+
      0%|          | 0.00/13.6M [00:00<?, ?B/s]
     93%|#########3| 12.6M/13.6M [00:00<00:00, 132MB/s]
    100%|##########| 13.6M/13.6M [00:00<00:00, 135MB/s]
 
 
 
@@ -344,7 +344,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      90.3776      90.2127      95.2388      90.0487       0.5623   
+      90.4764      90.2552      98.0507      90.0430       0.8613   
                
 
 
@@ -384,7 +384,7 @@ TODO
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  5.547 seconds)
+   **Total running time of the script:** ( 1 minutes  4.715 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
index d21303788..eb60ad48e 100644
--- a/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_prequantized_tflite.rst.txt
@@ -351,7 +351,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      119.8114     119.8220     121.4090     118.3415      0.4469   
+      118.3109     118.2177     123.2838     117.5628      0.6162   
                
 
 
@@ -385,7 +385,7 @@ Here we give an example of how to measure performance of TVM compiled models.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  53.880 seconds)
+   **Total running time of the script:** ( 1 minutes  52.855 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_prequantized_tflite.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
index 35a31b003..d346f8023 100644
--- a/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_quantized.rst.txt
@@ -221,7 +221,7 @@ We create a Relay VM to build and execute the model.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  9.236 seconds)
+   **Total running time of the script:** ( 1 minutes  8.315 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_quantized.py:
diff --git a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
index 1d98709b8..b47f88485 100644
--- a/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
+++ b/docs/_sources/how_to/deploy_models/deploy_ssd_gluoncv.rst.txt
@@ -137,7 +137,7 @@ Convert and compile model for CPU.
             data: None
       input_sym_arg_type = in_param.infer_type()[0]
     Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
-
      0%|          | 0/132723 [00:00<?, ?KB/s]
      5%|4         | 6265/132723 [00:00<00:02, 62639.61KB/s]
     11%|#         | 14572/132723 [00:00<00:01, 74649.38KB/s]
     17%|#6        | 22119/132723 [00:00<00:01, 75021.64KB/s]
     23%|##2       | 30458/132723 [00:00<00:01, 78322.65KB/s]
     29%|##9       | 38846/132723 [00:00<00:01, 80324.08KB/s]
     36%|###5      | 47194/132723 [00:00<00:01, 81393.69KB/s]
     42%|####1     | 55558/132723 [00:00<00:00, 82125.15KB/s]
     48%|####8     | 63909/132723 [00:00<00:00, 82562.19KB/s]
     54%|#####4    | 72166/132723 [00:00<00:00, 82466.38KB/s]
     61%|######    | 80454/132723 [00:01<00:00, 82591.31KB/s]
     67%|######6   | 88716/132723 [00:01<00:00, 82599.03KB/s]
     73%|#######3  | 97062/132723 [00:01<00:00, 82854.96KB/s]
     79%|#######9  | 105408/132723 [00:01<00:00, 83036.17KB/s]
     86%|########5 | 113757/132723 [00:01<00:00, 83171.62KB/s]
     92%|#########2| 122116/132723 [00:01<00:00, 83294.74KB/s]
     98%|#########8| 130486/132723 [00:01<00:00, 83409.01KB/s]
    100%|##########| 132723/132723 [00:01<00:00, 81493.10KB/s]
+
      0%|          | 0/132723 [00:00<?, ?KB/s]
      4%|4         | 5876/132723 [00:00<00:02, 58755.15KB/s]
     10%|#         | 13601/132723 [00:00<00:01, 69628.87KB/s]
     16%|#6        | 21466/132723 [00:00<00:01, 73736.74KB/s]
     22%|##2       | 29391/132723 [00:00<00:01, 75909.78KB/s]
     28%|##8       | 37372/132723 [00:00<00:01, 77313.91KB/s]
     34%|###4      | 45435/132723 [00:00<00:01, 78440.03KB/s]
     40%|####      | 53373/132723 [00:00<00:01, 78743.71KB/s]
     46%|####6     | 61435/132723 [00:00<00:00, 79339.56KB/s]
     52%|#####2    | 69408/132723 [00:00<00:00, 79459.44KB/s]
     58%|#####8    | 77417/132723 [00:01<00:00, 79653.12KB/s]
     64%|######4   | 85450/132723 [00:01<00:00, 79858.90KB/s]
     70%|#######   | 93464/132723 [00:01<00:00, 79942.17KB/s]
     76%|#######6  | 101497/132723 [00:01<00:00, 80058.11KB/s]
     83%|########2 | 109526/132723 [00:01<00:00, 80127.19KB/s]
     89%|########8 | 117539/132723 [00:01<00:00, 80077.42KB/s]
     95%|#########4| 125649/132723 [00:01<00:00, 80383.88KB/s]
    100%|##########| 132723/132723 [00:01<00:00, 78565.74KB/s]
 
 
 
@@ -202,7 +202,7 @@ Display result
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  22.964 seconds)
+   **Total running time of the script:** ( 2 minutes  22.894 seconds)
 
 
 .. _sphx_glr_download_how_to_deploy_models_deploy_ssd_gluoncv.py:
diff --git a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
index 71eb5b80a..890a6ccec 100644
--- a/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/deploy_models/sg_execution_times.rst.txt
@@ -5,13 +5,13 @@
 
 Computation times
 =================
-**10:35.869** total execution time for **how_to_deploy_models** files:
+**10:26.488** total execution time for **how_to_deploy_models** files:
 
-- **03:14.259**: :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``)
-- **02:22.964**: :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)
-- **01:53.880**: :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)
-- **01:09.236**: :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)
-- **01:05.547**: :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)
-- **00:28.224**: :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)
-- **00:21.561**: :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)
+- **03:07.991**: :ref:`sphx_glr_how_to_deploy_models_deploy_object_detection_pytorch.py` (``deploy_object_detection_pytorch.py``)
+- **02:22.894**: :ref:`sphx_glr_how_to_deploy_models_deploy_ssd_gluoncv.py` (``deploy_ssd_gluoncv.py``)
+- **01:52.855**: :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized_tflite.py` (``deploy_prequantized_tflite.py``)
+- **01:08.315**: :ref:`sphx_glr_how_to_deploy_models_deploy_quantized.py` (``deploy_quantized.py``)
+- **01:04.715**: :ref:`sphx_glr_how_to_deploy_models_deploy_prequantized.py` (``deploy_prequantized.py``)
+- **00:28.055**: :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_android.py` (``deploy_model_on_android.py``)
+- **00:21.464**: :ref:`sphx_glr_how_to_deploy_models_deploy_model_on_rasp.py` (``deploy_model_on_rasp.py``)
 - **00:00.199**: :ref:`sphx_glr_how_to_deploy_models_deploy_sparse.py` (``deploy_sparse.py``)
diff --git a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
index 0f8e55fac..0577378c1 100644
--- a/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/bring_your_own_datatypes.rst.txt
@@ -423,7 +423,7 @@ First let us define two helper functions to get the mobilenet model and a cat im
 
  .. code-block:: none
 
-    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zipb93ec226-55dc-4bbf-a017-1b11d08f2501 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+    Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip80a2fdd9-51f0-4e24-801b-fedc14ead5c2 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 
 
 
@@ -525,7 +525,7 @@ Now, to actually convert the entire network, we have written `a pass in Relay <h
 
  .. code-block:: none
 
-      Check failed: (lower) is false: Intrinsic lowering function for target llvm, intrinsic name tir.sqrt, type 150 not found
+      Check failed: (lower) is false: FloatImm lowering function for target llvm type 150 not found
 
 
 
diff --git a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
index a67da3135..1df2e1c9a 100644
--- a/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/sg_execution_times.rst.txt
@@ -5,9 +5,9 @@
 
 Computation times
 =================
-**00:38.899** total execution time for **how_to_extend_tvm** files:
+**00:38.595** total execution time for **how_to_extend_tvm** files:
 
-- **00:35.345**: :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``)
-- **00:02.255**: :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)
-- **00:01.087**: :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)
-- **00:00.212**: :ref:`sphx_glr_how_to_extend_tvm_low_level_custom_pass.py` (``low_level_custom_pass.py``)
+- **00:35.043**: :ref:`sphx_glr_how_to_extend_tvm_bring_your_own_datatypes.py` (``bring_your_own_datatypes.py``)
+- **00:02.252**: :ref:`sphx_glr_how_to_extend_tvm_use_pass_instrument.py` (``use_pass_instrument.py``)
+- **00:01.093**: :ref:`sphx_glr_how_to_extend_tvm_use_pass_infra.py` (``use_pass_infra.py``)
+- **00:00.206**: :ref:`sphx_glr_how_to_extend_tvm_low_level_custom_pass.py` (``low_level_custom_pass.py``)
diff --git a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
index 4d1b3fb03..ae00aace5 100644
--- a/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
+++ b/docs/_sources/how_to/extend_tvm/use_pass_instrument.rst.txt
@@ -199,10 +199,10 @@ profile the execution time of each passes.
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 6080us [6080us] (45.57%; 45.57%)
-    FoldScaleAxis: 7262us [2us] (54.43%; 54.43%)
-            FoldConstant: 7260us [1500us] (54.41%; 99.97%)
-                    InferType: 5760us [5760us] (43.17%; 79.34%)
+    InferType: 6089us [6089us] (45.47%; 45.47%)
+    FoldScaleAxis: 7302us [3us] (54.53%; 54.53%)
+            FoldConstant: 7300us [1492us] (54.51%; 99.97%)
+                    InferType: 5808us [5808us] (43.37%; 79.56%)
 
 
 
@@ -239,10 +239,10 @@ Refer to following sections and :py:func:`tvm.instrument.pass_instrument` for th
  .. code-block:: none
 
     Printing results of timing profile...
-    InferType: 5814us [5814us] (44.58%; 44.58%)
-    FoldScaleAxis: 7228us [2us] (55.42%; 55.42%)
-            FoldConstant: 7226us [1506us] (55.40%; 99.97%)
-                    InferType: 5720us [5720us] (43.86%; 79.16%)
+    InferType: 5843us [5843us] (44.51%; 44.51%)
+    FoldScaleAxis: 7284us [2us] (55.49%; 55.49%)
+            FoldConstant: 7282us [1537us] (55.47%; 99.97%)
+                    InferType: 5745us [5745us] (43.76%; 78.89%)
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
index 7c5935d16..37c8a433f 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_cuda.rst.txt
@@ -295,7 +295,7 @@ latency of convolution.
 
  .. code-block:: none
 
-    Convolution: 35.935623 ms
+    Convolution: 54.175673 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
index d68426716..c033d0380 100644
--- a/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_conv_tensorcore.rst.txt
@@ -628,7 +628,7 @@ be able to run on our build server
 
  .. code-block:: none
 
-    conv2d with tensor core: 9.161962 ms
+    conv2d with tensor core: 7.100658 ms
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
index 61a136798..8036ccdd0 100644
--- a/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/opt_gemm.rst.txt
@@ -118,8 +118,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 
  .. code-block:: none
 
-    Numpy running time: 0.018871
-    Baseline: 3.346656
+    Numpy running time: 0.018666
+    Baseline: 3.344988
 
 
 
@@ -210,7 +210,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 
  .. code-block:: none
 
-    Opt1: 0.300723
+    Opt1: 0.298091
 
 
 
@@ -309,7 +309,7 @@ In this tutorial, we chose to vectorize the inner loop row data since it is cach
 
  .. code-block:: none
 
-    Opt2: 0.335491
+    Opt2: 0.334549
 
 
 
@@ -401,7 +401,7 @@ the access pattern for A matrix is more cache friendly.
 
  .. code-block:: none
 
-    Opt3: 0.118515
+    Opt3: 0.119140
 
 
 
@@ -520,7 +520,7 @@ flattening.
 
  .. code-block:: none
 
-    Opt4: 0.112324
+    Opt4: 0.111795
 
 
 
@@ -638,7 +638,7 @@ write to C when all the block results are ready.
 
  .. code-block:: none
 
-    Opt5: 0.111561
+    Opt5: 0.111665
 
 
 
@@ -759,7 +759,7 @@ Futhermore, we can also utilize multi-core processors to do the thread-level par
 
  .. code-block:: none
 
-    Opt6: 0.145279
+    Opt6: 0.144061
 
 
 
diff --git a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
index c90eacebc..1f49c3608 100644
--- a/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/optimize_operators/sg_execution_times.rst.txt
@@ -5,8 +5,8 @@
 
 Computation times
 =================
-**00:34.993** total execution time for **how_to_optimize_operators** files:
+**00:34.633** total execution time for **how_to_optimize_operators** files:
 
-- **00:32.258**: :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)
-- **00:01.509**: :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``)
-- **00:01.226**: :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)
+- **00:32.068**: :ref:`sphx_glr_how_to_optimize_operators_opt_gemm.py` (``opt_gemm.py``)
+- **00:01.352**: :ref:`sphx_glr_how_to_optimize_operators_opt_conv_tensorcore.py` (``opt_conv_tensorcore.py``)
+- **00:01.213**: :ref:`sphx_glr_how_to_optimize_operators_opt_conv_cuda.py` (``opt_conv_cuda.py``)
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
index d196ebaf5..974b17548 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/sg_execution_times.rst.txt
@@ -5,11 +5,11 @@
 
 Computation times
 =================
-**04:55.065** total execution time for **how_to_tune_with_autoscheduler** files:
-
-- **02:20.665**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``)
-- **01:20.512**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)
-- **00:40.577**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)
-- **00:15.818**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)
-- **00:08.855**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)
-- **00:08.639**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)
+**04:55.001** total execution time for **how_to_tune_with_autoscheduler** files:
+
+- **02:20.451**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py` (``tune_conv2d_layer_cuda.py``)
+- **01:20.072**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_x86.py` (``tune_network_x86.py``)
+- **00:40.650**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_cuda.py` (``tune_network_cuda.py``)
+- **00:16.595**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_sparse_x86.py` (``tune_sparse_x86.py``)
+- **00:08.727**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_mali.py` (``tune_network_mali.py``)
+- **00:08.506**: :ref:`sphx_glr_how_to_tune_with_autoscheduler_tune_network_arm.py` (``tune_network_arm.py``)
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
index ea5f3e5d2..220aa4065 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.rst.txt
@@ -222,483 +222,487 @@ cooperative fetching, unrolling and operator fusion.
                  compute: Buffer(compute_2: Pointer(float32), float32, [25088], [])}
       buffer_map = {data_1: data, kernel_1: kernel, bias_1: bias, compute_1: compute}
       preflattened_buffer_map = {data_1: data_3: Buffer(data_2, float32, [1, 512, 7, 7], []), kernel_1: kernel_3: Buffer(kernel_2, float32, [512, 512, 3, 3], []), bias_1: bias_3: Buffer(bias_2, float32, [1, 512, 1, 1], []), compute_1: compute_3: Buffer(compute_2, float32, [1, 512, 7, 7], [])} {
-      attr [IterVar(blockIdx.x: int32, (nullptr), "ThreadIndex", "blockIdx.x")] "thread_extent" = 28;
-      allocate(conv2d_nchw: Pointer(local float32), float32, [14]), storage_scope = local;
-      allocate(pad_temp.shared: Pointer(shared float32), float32, [72]), storage_scope = shared;
-      allocate(kernel.shared: Pointer(shared float32), float32, [3072]), storage_scope = shared;
-      attr [IterVar(threadIdx.x: int32, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64 {
-        conv2d_nchw_1: Buffer(conv2d_nchw, float32, [14], [], scope="local", align=32)[0] = 0f32
-        conv2d_nchw_1[1] = 0f32
-        conv2d_nchw_1[2] = 0f32
-        conv2d_nchw_1[3] = 0f32
+      attr [IterVar(blockIdx.x: int32, (nullptr), "ThreadIndex", "blockIdx.x")] "thread_extent" = 64;
+      allocate(conv2d_nchw: Pointer(local float32), float32, [28]), storage_scope = local;
+      allocate(pad_temp.shared: Pointer(shared float32), float32, [324]), storage_scope = shared;
+      allocate(kernel.shared: Pointer(shared float32), float32, [288]), storage_scope = shared;
+      attr [IterVar(threadIdx.x: int32, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14 {
+        conv2d_nchw_1: Buffer(conv2d_nchw, float32, [16], [], scope="local", align=16)[0] = 0f32
         conv2d_nchw_1[4] = 0f32
-        conv2d_nchw_1[5] = 0f32
-        conv2d_nchw_1[6] = 0f32
-        conv2d_nchw_1[7] = 0f32
         conv2d_nchw_1[8] = 0f32
+        conv2d_nchw_1[12] = 0f32
+        conv2d_nchw_1[16] = 0f32
+        conv2d_nchw_1[20] = 0f32
+        conv2d_nchw_1[24] = 0f32
+        conv2d_nchw_1[1] = 0f32
+        conv2d_nchw_1[5] = 0f32
         conv2d_nchw_1[9] = 0f32
+        conv2d_nchw_1[13] = 0f32
+        conv2d_nchw_1[17] = 0f32
+        conv2d_nchw_1[21] = 0f32
+        conv2d_nchw_1[25] = 0f32
+        conv2d_nchw_1[2] = 0f32
+        conv2d_nchw_1[6] = 0f32
         conv2d_nchw_1[10] = 0f32
+        conv2d_nchw_1[14] = 0f32
+        conv2d_nchw_1[18] = 0f32
+        conv2d_nchw_1[22] = 0f32
+        conv2d_nchw_1[26] = 0f32
+        conv2d_nchw_1[3] = 0f32
+        conv2d_nchw_1[7] = 0f32
         conv2d_nchw_1[11] = 0f32
-        conv2d_nchw_1[12] = 0f32
-        conv2d_nchw_1[13] = 0f32
-        for (rc.outer.outer: int32, 0, 64) {
-          for (ry.outer.outer: int32, 0, 3) {
-            let cse_var_2: int32 = (rc.outer.outer*72)
-            let cse_var_1: int32 = (ry.outer.outer*3)
-             {
-              attr [IterVar(threadIdx.x_1: int32, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64 {
-                if @tir.likely((threadIdx.x_1 < 18), dtype=bool) {
-                  pad_temp.shared_1: Buffer(pad_temp.shared, float32, [72], [], scope="shared")[(threadIdx.x_1*4)] = @tir.if_then_else(((((1 <= (ry.outer.outer + floormod(blockIdx.x, 7))) && ((ry.outer.outer + floormod(blockIdx.x, 7)) < 8)) && (1 <= floormod((threadIdx.x_1*4), 9))) && (floormod((threadIdx.x_1*4), 9) < 8)), data[((((((rc.outer.outer*392) + (floordiv((threadIdx.x_1*4), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) + floormod((threadIdx.x_1*4), 9)) - 8)], 0f32, dtype=float32)
-                }
-                if @tir.likely((threadIdx.x_1 < 18), dtype=bool) {
-                  pad_temp.shared_1[((threadIdx.x_1*4) + 1)] = @tir.if_then_else(((((1 <= (ry.outer.outer + floormod(blockIdx.x, 7))) && ((ry.outer.outer + floormod(blockIdx.x, 7)) < 8)) && (1 <= floormod(((threadIdx.x_1*4) + 1), 9))) && (floormod(((threadIdx.x_1*4) + 1), 9) < 8)), data[((((((rc.outer.outer*392) + (floordiv(((threadIdx.x_1*4) + 1), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) + floormod(((threadIdx.x_1*4) + 1), 9)) - 8)], 0f32, dtype=float32)
-                }
-                if @tir.likely((threadIdx.x_1 < 18), dtype=bool) {
-                  pad_temp.shared_1[((threadIdx.x_1*4) + 2)] = @tir.if_then_else(((((1 <= (ry.outer.outer + floormod(blockIdx.x, 7))) && ((ry.outer.outer + floormod(blockIdx.x, 7)) < 8)) && (1 <= floormod(((threadIdx.x_1*4) + 2), 9))) && (floormod(((threadIdx.x_1*4) + 2), 9) < 8)), data[((((((rc.outer.outer*392) + (floordiv(((threadIdx.x_1*4) + 2), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) + floormod(((threadIdx.x_1*4) + 2), 9)) - 8)], 0f32, dtype=float32)
-                }
-                if @tir.likely((threadIdx.x_1 < 18), dtype=bool) {
-                  pad_temp.shared_1[((threadIdx.x_1*4) + 3)] = @tir.if_then_else(((((1 <= (ry.outer.outer + floormod(blockIdx.x, 7))) && ((ry.outer.outer + floormod(blockIdx.x, 7)) < 8)) && (1 <= floormod(((threadIdx.x_1*4) + 3), 9))) && (floormod(((threadIdx.x_1*4) + 3), 9) < 8)), data[((((((rc.outer.outer*392) + (floordiv(((threadIdx.x_1*4) + 3), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) + floormod(((threadIdx.x_1*4) + 3), 9)) - 8)], 0f32, dtype=float32)
-                }
+        conv2d_nchw_1[15] = 0f32
+        conv2d_nchw_1[19] = 0f32
+        conv2d_nchw_1[23] = 0f32
+        conv2d_nchw_1[27] = 0f32
+        for (rc.outer.outer: int32, 0, 128) {
+          let cse_var_2: int32 = (rc.outer.outer*196)
+          let cse_var_1: int32 = (rc.outer.outer*36)
+           {
+            attr [IterVar(threadIdx.x_1: int32, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14 {
+              pad_temp.shared_1: Buffer(pad_temp.shared, float32, [324], [], scope="shared")[(threadIdx.x_1*12)] = @tir.if_then_else(((((3 <= floormod((threadIdx.x_1*4), 27)) && (floormod((threadIdx.x_1*12), 81) < 72)) && (1 <= floormod((threadIdx.x_1*12), 9))) && (floormod((threadIdx.x_1*12), 9) < 8)), data[((((cse_var_2 + (floordiv((threadIdx.x_1*4), 27)*49)) + (floordiv(floormod((threadIdx.x_1*4), 27), 3)*7)) + floormod((threadIdx.x_1*12), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 1)] = @tir.if_then_else(((((3 <= floormod((threadIdx.x_1*4), 27)) && (floormod(((threadIdx.x_1*12) + 1), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 1), 9))) && (floormod(((threadIdx.x_1*12) + 1), 9) < 8)), data[((((cse_var_2 + (floordiv((threadIdx.x_1*4), 27)*49)) + (floordiv(floormod((threadIdx.x_1*4), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 1), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 2)] = @tir.if_then_else(((((3 <= floormod((threadIdx.x_1*4), 27)) && (floormod(((threadIdx.x_1*12) + 2), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 2), 9))) && (floormod(((threadIdx.x_1*12) + 2), 9) < 8)), data[((((cse_var_2 + (floordiv((threadIdx.x_1*4), 27)*49)) + (floordiv(floormod((threadIdx.x_1*4), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 2), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 3)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 1), 27)) && (floormod(((threadIdx.x_1*12) + 3), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 3), 9))) && (floormod(((threadIdx.x_1*12) + 3), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 1), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 1), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 3), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 4)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 1), 27)) && (floormod(((threadIdx.x_1*12) + 4), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 4), 9))) && (floormod(((threadIdx.x_1*12) + 4), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 1), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 1), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 4), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 5)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 1), 27)) && (floormod(((threadIdx.x_1*12) + 5), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 5), 9))) && (floormod(((threadIdx.x_1*12) + 5), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 1), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 1), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 5), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 6)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 2), 27)) && (floormod(((threadIdx.x_1*12) + 6), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 6), 9))) && (floormod(((threadIdx.x_1*12) + 6), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 2), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 2), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 6), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 7)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 2), 27)) && (floormod(((threadIdx.x_1*12) + 7), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 7), 9))) && (floormod(((threadIdx.x_1*12) + 7), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 2), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 2), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 7), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 8)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 2), 27)) && (floormod(((threadIdx.x_1*12) + 8), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 8), 9))) && (floormod(((threadIdx.x_1*12) + 8), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 2), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 2), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 8), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 9)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 9), 81) < 72)) && (1 <= floormod((threadIdx.x_1*12), 9))) && (floormod((threadIdx.x_1*12), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 3), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod((threadIdx.x_1*12), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 10)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 10), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 1), 9))) && (floormod(((threadIdx.x_1*12) + 1), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 3), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 1), 9)) - 8)], 0f32, dtype=float32)
+              pad_temp.shared_1[((threadIdx.x_1*12) + 11)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 11), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 2), 9))) && (floormod(((threadIdx.x_1*12) + 2), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 3), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 2), 9)) - 8)], 0f32, dtype=float32)
+            }
+            attr [IterVar(threadIdx.x_1, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14 {
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 168)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 56), 27)) && (floormod(((threadIdx.x_1*12) + 6), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 6), 9))) && (floormod(((threadIdx.x_1*12) + 6), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 56), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 56), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 6), 9)) - 8)], 0f32, dtype=float32)
               }
-              attr [IterVar(threadIdx.x_2: int32, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1: Buffer(kernel.shared, float32, [3072], [], scope="shared")[threadIdx.x_2] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(threadIdx.x_2, 24)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 64)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 8), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 64), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 128)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 16), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 128), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 192)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 36864)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 256)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 32), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 256), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 320)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 40), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 320), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 384)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 73728)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 448)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 56), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 448), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 512)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 64), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 512), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 576)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 110592)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 640)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 80), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 640), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 704)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 88), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 704), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 768)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 147456)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 832)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 104), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 832), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 896)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 112), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 896), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 960)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 184320)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1024)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 128), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1024), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1088)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 136), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1088), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1152)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 221184)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1216)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 152), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1216), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1280)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 160), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1280), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1344)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 258048)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1408)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 176), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1408), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1472)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 184), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1472), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1536)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 294912)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1600)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 200), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1600), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1664)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 208), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1664), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1728)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 331776)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1792)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 224), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1792), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1856)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 232), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1856), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1920)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 368640)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 1984)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 248), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1984), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2048)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 256), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2048), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2112)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 405504)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2176)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 272), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2176), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2240)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 280), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2240), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2304)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 442368)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2368)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 296), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2368), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2432)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 304), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2432), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2496)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 479232)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2560)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 320), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2560), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2624)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 328), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2624), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2688)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 516096)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2752)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 344), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2752), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2816)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 352), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2816), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2880)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 552960)]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 2944)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 368), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2944), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-              attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-              kernel.shared_1[(threadIdx.x_2 + 3008)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 376), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 3008), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[0]*kernel.shared_1[(threadIdx.x*48)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[9]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[1]*kernel.shared_1[(threadIdx.x*48)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[2]*kernel.shared_1[(threadIdx.x*48)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[3]*kernel.shared_1[(threadIdx.x*48)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[4]*kernel.shared_1[(threadIdx.x*48)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[5]*kernel.shared_1[(threadIdx.x*48)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[6]*kernel.shared_1[(threadIdx.x*48)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[0]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[9]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[1]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[1]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[1]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[8]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[17]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[8]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[17]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[18]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[27]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[18]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[27]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[26]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[35]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[26]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[35]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[36]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[45]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[36]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[45]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[44]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[53]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[44]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[53]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[54]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[63]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[54]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[63]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[62]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[71]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[62]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[71]*kernel.shared_1[((threadIdx.x*48) + 47)]))
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 169)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 56), 27)) && (floormod(((threadIdx.x_1*12) + 7), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 7), 9))) && (floormod(((threadIdx.x_1*12) + 7), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 56), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 56), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 7), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 170)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 56), 27)) && (floormod(((threadIdx.x_1*12) + 8), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 8), 9))) && (floormod(((threadIdx.x_1*12) + 8), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 56), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 56), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 8), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 171)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 9), 81) < 72)) && (1 <= floormod((threadIdx.x_1*12), 9))) && (floormod((threadIdx.x_1*12), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 57), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod((threadIdx.x_1*12), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 172)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 10), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 1), 9))) && (floormod(((threadIdx.x_1*12) + 1), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 57), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 1), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 173)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 11), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 2), 9))) && (floormod(((threadIdx.x_1*12) + 2), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 57), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 2), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 174)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 58), 27)) && (floormod(((threadIdx.x_1*12) + 12), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 3), 9))) && (floormod(((threadIdx.x_1*12) + 3), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 58), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 58), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 3), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 175)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 58), 27)) && (floormod(((threadIdx.x_1*12) + 13), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 4), 9))) && (floormod(((threadIdx.x_1*12) + 4), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 58), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 58), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 4), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 176)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 58), 27)) && (floormod(((threadIdx.x_1*12) + 14), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 5), 9))) && (floormod(((threadIdx.x_1*12) + 5), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 58), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 58), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 5), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 177)] = @tir.if_then_else(((((1 <= floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 15), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 6), 9))) && (floormod(((threadIdx.x_1*12) + 6), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 59), 27)*49)) + (floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 6), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 178)] = @tir.if_then_else(((((1 <= floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 16), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 7), 9))) && (floormod(((threadIdx.x_1*12) + 7), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 59), 27)*49)) + (floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 7), 9)) - 8)], 0f32, dtype=float32)
+              }
+              if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+                pad_temp.shared_1[((threadIdx.x_1*12) + 179)] = @tir.if_then_else(((((1 <= floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 17), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 8), 9))) && (floormod(((threadIdx.x_1*12) + 8), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 59), 27)*49)) + (floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 8), 9)) - 8)], 0f32, dtype=float32)
+              }
+            }
+            attr [IterVar(threadIdx.x_2: int32, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1: Buffer(kernel.shared, float32, [288], [], scope="shared")[threadIdx.x_2] = kernel[(((blockIdx.x*36864) + cse_var_1) + threadIdx.x_2)]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 14)] = kernel[(((blockIdx.x*36864) + cse_var_1) + (threadIdx.x_2 + 14))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 28)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 14), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 28), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 42)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 21), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 6), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 56)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 28), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 20), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 70)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 35), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 34), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 84)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 42), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 12), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 98)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 49), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 26), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 112)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 56), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 4), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 126)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 63), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 18), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 140)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 70), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 32), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 154)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 77), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 10), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 168)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 84), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 24), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 182)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 91), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 196)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 98), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 16), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 210)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 105), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 30), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 224)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 112), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 8), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 238)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 119), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 22), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 252)] = kernel[((((blockIdx.x*36864) + cse_var_1) + threadIdx.x_2) + 32256)]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            kernel.shared_1[(threadIdx.x_2 + 266)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 133), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 14), 36))]
+            attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+            if @tir.likely((threadIdx.x_2 < 8), dtype=bool) {
+              kernel.shared_1[(threadIdx.x_2 + 280)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 140), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 28), 36))]
+            }
+            for (rx.outer.inner: int32, 0, 3) {
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+              conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+              conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+              conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+              conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+              conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+              conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+              conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+              conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+              conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+              conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+              conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+              conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+              conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+              conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+              conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+              conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+              conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+              conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+              conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+              conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+              conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+              conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+              conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+              conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+              conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+              conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+              conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+              conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
             }
           }
         }
-        for (i1.inner: int32, 0, 2) {
-          for (i3.inner: int32, 0, 7) {
-            compute[(((((floordiv(blockIdx.x, 7)*6272) + (threadIdx.x*98)) + (i1.inner*49)) + (floormod(blockIdx.x, 7)*7)) + i3.inner)] = max((conv2d_nchw_1[((i1.inner*7) + i3.inner)] + bias[(((floordiv(blockIdx.x, 7)*128) + (threadIdx.x*2)) + i1.inner)]), 0f32)
-          }
+        for (i1.inner: int32, 0, 4) {
+          compute[((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7))] = max((conv2d_nchw_1[i1.inner] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+          compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 1)] = max((conv2d_nchw_1[(i1.inner + 4)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+          compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 2)] = max((conv2d_nchw_1[(i1.inner + 8)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+          compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 3)] = max((conv2d_nchw_1[(i1.inner + 12)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+          compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 4)] = max((conv2d_nchw_1[(i1.inner + 16)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+          compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 5)] = max((conv2d_nchw_1[(i1.inner + 20)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+          compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 6)] = max((conv2d_nchw_1[(i1.inner + 24)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
         }
       }
     }
@@ -751,7 +755,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 0.367 ms
+    Execution time of this operator: 0.370 ms
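
For reference, the timing above is obtained by building the tuned schedule and running it through TVM's device time evaluator. A minimal sketch, assuming the ``sch``, ``args`` and ``target`` produced by ``task.apply_best(log_file)`` earlier in this tutorial and the (1, 512, 7, 7) conv2d workload it defines:

.. code-block:: python

    import numpy as np
    import tvm

    # Compile the tuned schedule to a CUDA kernel and pick the GPU device.
    func = tvm.build(sch, args, target)
    dev = tvm.cuda(0)

    # Random buffers matching the workload: NCHW data, OIHW weight, bias, output.
    data_np = np.random.uniform(size=(1, 512, 7, 7)).astype(np.float32)
    weight_np = np.random.uniform(size=(512, 512, 3, 3)).astype(np.float32)
    bias_np = np.random.uniform(size=(1, 512, 1, 1)).astype(np.float32)
    out_np = np.zeros((1, 512, 7, 7), dtype=np.float32)
    buffers = [tvm.nd.array(x, device=dev) for x in (data_np, weight_np, bias_np, out_np)]

    # Median of repeated runs, matching how the figure above is reported.
    evaluator = func.time_evaluator(func.entry_name, dev, min_repeat_ms=500)
    print("Execution time: %.3f ms" % (np.median(evaluator(*buffers).results) * 1000))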
 
 
 
@@ -795,21 +799,21 @@ They can be used for debugging and learning the behavior of the auto-scheduler.
     conv2d_nchw_nn_o_o_i, conv2d_nchw_nn_o_i = s[conv2d_nchw].split(conv2d_nchw_nn_o_i, factor=1)
     conv2d_nchw_nn_o_o_o_i, conv2d_nchw_nn_o_o_i = s[conv2d_nchw].split(conv2d_nchw_nn_o_o_i, factor=1)
     conv2d_nchw_nn_o_o_o_o, conv2d_nchw_nn_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_nn_o_o_o_i, factor=1)
-    conv2d_nchw_ff_o_i, conv2d_nchw_ff_i = s[conv2d_nchw].split(conv2d_nchw_ff, factor=1)
-    conv2d_nchw_ff_o_o_i, conv2d_nchw_ff_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_i, factor=2)
-    conv2d_nchw_ff_o_o_o_i, conv2d_nchw_ff_o_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_o_i, factor=64)
+    conv2d_nchw_ff_o_i, conv2d_nchw_ff_i = s[conv2d_nchw].split(conv2d_nchw_ff, factor=4)
+    conv2d_nchw_ff_o_o_i, conv2d_nchw_ff_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_i, factor=1)
+    conv2d_nchw_ff_o_o_o_i, conv2d_nchw_ff_o_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_o_i, factor=2)
     conv2d_nchw_ff_o_o_o_o, conv2d_nchw_ff_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_o_o_i, factor=1)
     conv2d_nchw_yy_o_i, conv2d_nchw_yy_i = s[conv2d_nchw].split(conv2d_nchw_yy, factor=1)
     conv2d_nchw_yy_o_o_i, conv2d_nchw_yy_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_i, factor=1)
-    conv2d_nchw_yy_o_o_o_i, conv2d_nchw_yy_o_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_o_i, factor=1)
+    conv2d_nchw_yy_o_o_o_i, conv2d_nchw_yy_o_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_o_i, factor=7)
     conv2d_nchw_yy_o_o_o_o, conv2d_nchw_yy_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_o_o_i, factor=1)
     conv2d_nchw_xx_o_i, conv2d_nchw_xx_i = s[conv2d_nchw].split(conv2d_nchw_xx, factor=1)
-    conv2d_nchw_xx_o_o_i, conv2d_nchw_xx_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_i, factor=7)
+    conv2d_nchw_xx_o_o_i, conv2d_nchw_xx_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_i, factor=1)
     conv2d_nchw_xx_o_o_o_i, conv2d_nchw_xx_o_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_o_i, factor=1)
-    conv2d_nchw_xx_o_o_o_o, conv2d_nchw_xx_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_o_o_i, factor=1)
-    conv2d_nchw_rc_o_i, conv2d_nchw_rc_i = s[conv2d_nchw].split(conv2d_nchw_rc, factor=2)
-    conv2d_nchw_rc_o_o, conv2d_nchw_rc_o_i = s[conv2d_nchw].split(conv2d_nchw_rc_o_i, factor=4)
-    conv2d_nchw_ry_o_i, conv2d_nchw_ry_i = s[conv2d_nchw].split(conv2d_nchw_ry, factor=1)
+    conv2d_nchw_xx_o_o_o_o, conv2d_nchw_xx_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_o_o_i, factor=7)
+    conv2d_nchw_rc_o_i, conv2d_nchw_rc_i = s[conv2d_nchw].split(conv2d_nchw_rc, factor=4)
+    conv2d_nchw_rc_o_o, conv2d_nchw_rc_o_i = s[conv2d_nchw].split(conv2d_nchw_rc_o_i, factor=1)
+    conv2d_nchw_ry_o_i, conv2d_nchw_ry_i = s[conv2d_nchw].split(conv2d_nchw_ry, factor=3)
     conv2d_nchw_ry_o_o, conv2d_nchw_ry_o_i = s[conv2d_nchw].split(conv2d_nchw_ry_o_i, factor=1)
     conv2d_nchw_rx_o_i, conv2d_nchw_rx_i = s[conv2d_nchw].split(conv2d_nchw_rx, factor=1)
     conv2d_nchw_rx_o_o, conv2d_nchw_rx_o_i = s[conv2d_nchw].split(conv2d_nchw_rx_o_i, factor=3)
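
Each ``split(..., factor=k)`` call above carves one loop axis into an outer loop and an inner loop of fixed length ``k``; the chain of several splits per axis is what yields the multi-level tiling seen in the generated kernel. A minimal standalone sketch of the primitive on a toy schedule, unrelated to the tuned factors above:

.. code-block:: python

    import tvm
    from tvm import te

    # Toy elementwise compute used only to demonstrate the split primitive.
    A = te.placeholder((1024,), name="A")
    B = te.compute((1024,), lambda i: A[i] * 2.0, name="B")

    s = te.create_schedule(B.op)
    # Split axis i into an outer loop and an inner loop of length 4,
    # mirroring the factor= arguments chosen by the auto-scheduler.
    outer, inner = s[B].split(B.op.axis[0], factor=4)
    print(tvm.lower(s, [A, B], simple_mode=True))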
@@ -817,15 +821,15 @@ They can be used for debugging and learning the behavior of the auto-scheduler.
     compute_i0_o_i, compute_i0_i = s[compute].split(compute_i0, factor=1)
     compute_i0_o_o_i, compute_i0_o_i = s[compute].split(compute_i0_o_i, factor=1)
     compute_i0_o_o_o, compute_i0_o_o_i = s[compute].split(compute_i0_o_o_i, factor=1)
-    compute_i1_o_i, compute_i1_i = s[compute].split(compute_i1, factor=2)
-    compute_i1_o_o_i, compute_i1_o_i = s[compute].split(compute_i1_o_i, factor=64)
+    compute_i1_o_i, compute_i1_i = s[compute].split(compute_i1, factor=4)
+    compute_i1_o_o_i, compute_i1_o_i = s[compute].split(compute_i1_o_i, factor=2)
     compute_i1_o_o_o, compute_i1_o_o_i = s[compute].split(compute_i1_o_o_i, factor=1)
     compute_i2_o_i, compute_i2_i = s[compute].split(compute_i2, factor=1)
-    compute_i2_o_o_i, compute_i2_o_i = s[compute].split(compute_i2_o_i, factor=1)
+    compute_i2_o_o_i, compute_i2_o_i = s[compute].split(compute_i2_o_i, factor=7)
     compute_i2_o_o_o, compute_i2_o_o_i = s[compute].split(compute_i2_o_o_i, factor=1)
-    compute_i3_o_i, compute_i3_i = s[compute].split(compute_i3, factor=7)
+    compute_i3_o_i, compute_i3_i = s[compute].split(compute_i3, factor=1)
     compute_i3_o_o_i, compute_i3_o_i = s[compute].split(compute_i3_o_i, factor=1)
-    compute_i3_o_o_o, compute_i3_o_o_i = s[compute].split(compute_i3_o_o_i, factor=1)
+    compute_i3_o_o_o, compute_i3_o_o_i = s[compute].split(compute_i3_o_o_i, factor=7)
     s[compute].reorder(compute_i0_o_o_o, compute_i1_o_o_o, compute_i2_o_o_o, compute_i3_o_o_o, compute_i0_o_o_i, compute_i1_o_o_i, compute_i2_o_o_i, compute_i3_o_o_i, compute_i0_o_i, compute_i1_o_i, compute_i2_o_i, compute_i3_o_i, compute_i0_i, compute_i1_i, compute_i2_i, compute_i3_i)
     s[conv2d_nchw].compute_at(s[compute], compute_i3_o_i)
     kernel_shared = s.cache_read(kernel, "shared", [conv2d_nchw])
@@ -844,12 +848,12 @@ They can be used for debugging and learning the behavior of the auto-scheduler.
     kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused = s[kernel_shared].fuse(kernel_shared_ax0, kernel_shared_ax1, kernel_shared_ax2, kernel_shared_ax3)
     kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i = s[kernel_shared].split(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused, factor=1)
     s[kernel_shared].vectorize(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i)
-    kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[kernel_shared].split(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=64)
+    kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[kernel_shared].split(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=14)
     s[kernel_shared].bind(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i, te.thread_axis("threadIdx.x"))
     pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused = s[pad_temp_shared].fuse(pad_temp_shared_ax0, pad_temp_shared_ax1, pad_temp_shared_ax2, pad_temp_shared_ax3)
-    pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused, factor=4)
+    pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused, factor=12)
     s[pad_temp_shared].vectorize(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i)
-    pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=64)
+    pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=14)
     s[pad_temp_shared].bind(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i, te.thread_axis("threadIdx.x"))
     s[conv2d_nchw].pragma(conv2d_nchw_nn_o_o_o_o, "auto_unroll_max_step", 512)
     s[conv2d_nchw].pragma(conv2d_nchw_nn_o_o_o_o, "unroll_explicit", True)
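
The equivalent python schedule shown above and the CUDA source that follows are both recovered from the tuning log for debugging and inspection. A short sketch of how they are printed, assuming the ``task`` and ``log_file`` names used earlier in this tutorial:

.. code-block:: python

    # Dump the best record from the log as a TE schedule and as CUDA source.
    print("Equivalent python schedule:")
    print(task.print_best(log_file, print_mode="schedule"))

    print("CUDA source code:")
    print(task.print_best(log_file, print_mode="cuda"))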
@@ -869,430 +873,459 @@ They can be used for debugging and learning the behavior of the auto-scheduler.
       #define int64_t long long
       #define uint64_t unsigned long long
     #endif
-    extern "C" __global__ void __launch_bounds__(64) default_function_kernel0(float* __restrict__ data, float* __restrict__ kernel, float* __restrict__ compute, float* __restrict__ bias) {
-      float conv2d_nchw[14];
-      __shared__ float pad_temp_shared[72];
-      __shared__ float kernel_shared[3072];
+    extern "C" __global__ void __launch_bounds__(14) default_function_kernel0(float* __restrict__ data, float* __restrict__ kernel, float* __restrict__ compute, float* __restrict__ bias) {
+      float conv2d_nchw[28];
+      __shared__ float pad_temp_shared[324];
+      __shared__ float kernel_shared[288];
       conv2d_nchw[0] = 0.000000e+00f;
-      conv2d_nchw[1] = 0.000000e+00f;
-      conv2d_nchw[2] = 0.000000e+00f;
-      conv2d_nchw[3] = 0.000000e+00f;
       conv2d_nchw[4] = 0.000000e+00f;
-      conv2d_nchw[5] = 0.000000e+00f;
-      conv2d_nchw[6] = 0.000000e+00f;
-      conv2d_nchw[7] = 0.000000e+00f;
       conv2d_nchw[8] = 0.000000e+00f;
+      conv2d_nchw[12] = 0.000000e+00f;
+      conv2d_nchw[16] = 0.000000e+00f;
+      conv2d_nchw[20] = 0.000000e+00f;
+      conv2d_nchw[24] = 0.000000e+00f;
+      conv2d_nchw[1] = 0.000000e+00f;
+      conv2d_nchw[5] = 0.000000e+00f;
       conv2d_nchw[9] = 0.000000e+00f;
+      conv2d_nchw[13] = 0.000000e+00f;
+      conv2d_nchw[17] = 0.000000e+00f;
+      conv2d_nchw[21] = 0.000000e+00f;
+      conv2d_nchw[25] = 0.000000e+00f;
+      conv2d_nchw[2] = 0.000000e+00f;
+      conv2d_nchw[6] = 0.000000e+00f;
       conv2d_nchw[10] = 0.000000e+00f;
+      conv2d_nchw[14] = 0.000000e+00f;
+      conv2d_nchw[18] = 0.000000e+00f;
+      conv2d_nchw[22] = 0.000000e+00f;
+      conv2d_nchw[26] = 0.000000e+00f;
+      conv2d_nchw[3] = 0.000000e+00f;
+      conv2d_nchw[7] = 0.000000e+00f;
       conv2d_nchw[11] = 0.000000e+00f;
-      conv2d_nchw[12] = 0.000000e+00f;
-      conv2d_nchw[13] = 0.000000e+00f;
-      for (int rc_outer_outer = 0; rc_outer_outer < 64; ++rc_outer_outer) {
-        for (int ry_outer_outer = 0; ry_outer_outer < 3; ++ry_outer_outer) {
-          __syncthreads();
-          if (((int)threadIdx.x) < 18) {
-            pad_temp_shared[(((int)threadIdx.x) * 4)] = (((((1 <= (ry_outer_outer + (((int)blockIdx.x) % 7))) && ((ry_outer_outer + (((int)blockIdx.x) % 7)) < 8)) && (1 <= ((((int)threadIdx.x) * 4) % 9))) && (((((int)threadIdx.x) * 4) % 9) < 8)) ? data[((((((rc_outer_outer * 392) + (((((int)threadIdx.x) * 4) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + ((((int)threadIdx.x) * 4) % 9)) - 8)] : 0.000000e+00f);
-          }
-          if (((int)threadIdx.x) < 18) {
-            pad_temp_shared[((((int)threadIdx.x) * 4) + 1)] = (((((1 <= (ry_outer_outer + (((int)blockIdx.x) % 7))) && ((ry_outer_outer + (((int)blockIdx.x) % 7)) < 8)) && (1 <= (((((int)threadIdx.x) * 4) + 1) % 9))) && ((((((int)threadIdx.x) * 4) + 1) % 9) < 8)) ? data[((((((rc_outer_outer * 392) + ((((((int)threadIdx.x) * 4) + 1) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + (((((int)threadIdx.x) * 4) + 1) % 9)) - 8)] : 0.000000e+00f);
-          }
-          if (((int)threadIdx.x) < 18) {
-            pad_temp_shared[((((int)threadIdx.x) * 4) + 2)] = (((((1 <= (ry_outer_outer + (((int)blockIdx.x) % 7))) && ((ry_outer_outer + (((int)blockIdx.x) % 7)) < 8)) && (1 <= (((((int)threadIdx.x) * 4) + 2) % 9))) && ((((((int)threadIdx.x) * 4) + 2) % 9) < 8)) ? data[((((((rc_outer_outer * 392) + ((((((int)threadIdx.x) * 4) + 2) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + (((((int)threadIdx.x) * 4) + 2) % 9)) - 8)] : 0.000000e+00f);
-          }
-          if (((int)threadIdx.x) < 18) {
-            pad_temp_shared[((((int)threadIdx.x) * 4) + 3)] = (((((1 <= (ry_outer_outer + (((int)blockIdx.x) % 7))) && ((ry_outer_outer + (((int)blockIdx.x) % 7)) < 8)) && (1 <= (((((int)threadIdx.x) * 4) + 3) % 9))) && ((((((int)threadIdx.x) * 4) + 3) % 9) < 8)) ? data[((((((rc_outer_outer * 392) + ((((((int)threadIdx.x) * 4) + 3) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + (((((int)threadIdx.x) * 4) + 3) % 9)) - 8)] : 0.000000e+00f);
-          }
-          kernel_shared[((int)threadIdx.x)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 64)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 64) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 128)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 128) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 192)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 36864)];
-          kernel_shared[(((int)threadIdx.x) + 256)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 256) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 320)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 320) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 384)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 73728)];
-          kernel_shared[(((int)threadIdx.x) + 448)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 448) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 512)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 512) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 576)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 110592)];
-          kernel_shared[(((int)threadIdx.x) + 640)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 640) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 704)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 704) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 768)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 147456)];
-          kernel_shared[(((int)threadIdx.x) + 832)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 832) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 896)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 896) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 960)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 184320)];
-          kernel_shared[(((int)threadIdx.x) + 1024)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1024) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1088)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1088) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1152)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 221184)];
-          kernel_shared[(((int)threadIdx.x) + 1216)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1216) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1280)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1280) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1344)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 258048)];
-          kernel_shared[(((int)threadIdx.x) + 1408)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1408) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1472)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1472) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1536)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 294912)];
-          kernel_shared[(((int)threadIdx.x) + 1600)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1600) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1664)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1664) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1728)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 331776)];
-          kernel_shared[(((int)threadIdx.x) + 1792)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1792) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1856)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1856) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 1920)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 368640)];
-          kernel_shared[(((int)threadIdx.x) + 1984)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1984) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2048)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2048) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2112)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 405504)];
-          kernel_shared[(((int)threadIdx.x) + 2176)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2176) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2240)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2240) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2304)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 442368)];
-          kernel_shared[(((int)threadIdx.x) + 2368)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2368) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2432)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2432) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2496)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 479232)];
-          kernel_shared[(((int)threadIdx.x) + 2560)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2560) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2624)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2624) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2688)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 516096)];
-          kernel_shared[(((int)threadIdx.x) + 2752)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2752) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2816)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2816) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 2880)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 552960)];
-          kernel_shared[(((int)threadIdx.x) + 2944)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2944) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-          kernel_shared[(((int)threadIdx.x) + 3008)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 3008) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-          __syncthreads();
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[0] * kernel_shared[(((int)threadIdx.x) * 48)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[9] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[1] * kernel_shared[(((int)threadIdx.x) * 48)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[2] * kernel_shared[(((int)threadIdx.x) * 48)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[3] * kernel_shared[(((int)threadIdx.x) * 48)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[4] * kernel_shared[(((int)threadIdx.x) * 48)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[5] * kernel_shared[(((int)threadIdx.x) * 48)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[6] * kernel_shared[(((int)threadIdx.x) * 48)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[0] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[9] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[1] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[1] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[1] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[8] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[17] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[8] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[17] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[18] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[27] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[18] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[27] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[26] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[35] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[26] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[35] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[36] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[45] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[36] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[45] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[44] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[53] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[44] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[53] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[54] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[63] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[54] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[63] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[62] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[71] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[62] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[71] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
+      conv2d_nchw[15] = 0.000000e+00f;
+      conv2d_nchw[19] = 0.000000e+00f;
+      conv2d_nchw[23] = 0.000000e+00f;
+      conv2d_nchw[27] = 0.000000e+00f;
+      for (int rc_outer_outer = 0; rc_outer_outer < 128; ++rc_outer_outer) {
+        __syncthreads();
+        pad_temp_shared[(((int)threadIdx.x) * 12)] = (((((3 <= ((((int)threadIdx.x) * 4) % 27)) && (((((int)threadIdx.x) * 12) % 81) < 72)) && (1 <= ((((int)threadIdx.x) * 12) % 9))) && (((((int)threadIdx.x) * 12) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + (((((int)threadIdx.x) * 4) / 27) * 49)) + ((((((int)threadIdx.x) * 4) % 27) / 3) * 7)) + ((((int)threadIdx.x) * 12) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 1)] = (((((3 <= ((((int)threadIdx.x) * 4) % 27)) && ((((((int)threadIdx.x) * 12) + 1) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 1) % 9))) && ((((((int)threadIdx.x) * 12) + 1) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + (((((int)threadIdx.x) * 4) / 27) * 49)) + ((((((int)threadIdx.x) * 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 1) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 2)] = (((((3 <= ((((int)threadIdx.x) * 4) % 27)) && ((((((int)threadIdx.x) * 12) + 2) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 2) % 9))) && ((((((int)threadIdx.x) * 12) + 2) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + (((((int)threadIdx.x) * 4) / 27) * 49)) + ((((((int)threadIdx.x) * 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 2) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 3)] = (((((3 <= (((((int)threadIdx.x) * 4) + 1) % 27)) && ((((((int)threadIdx.x) * 12) + 3) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 3) % 9))) && ((((((int)threadIdx.x) * 12) + 3) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 1) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 1) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 3) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 4)] = (((((3 <= (((((int)threadIdx.x) * 4) + 1) % 27)) && ((((((int)threadIdx.x) * 12) + 4) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 4) % 9))) && ((((((int)threadIdx.x) * 12) + 4) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 1) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 1) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 4) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 5)] = (((((3 <= (((((int)threadIdx.x) * 4) + 1) % 27)) && ((((((int)threadIdx.x) * 12) + 5) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 5) % 9))) && ((((((int)threadIdx.x) * 12) + 5) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 1) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 1) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 5) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 6)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 6) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 6) % 9))) && ((((((int)threadIdx.x) * 12) + 6) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 2) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 6) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 7)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 7) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 7) % 9))) && ((((((int)threadIdx.x) * 12) + 7) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 2) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 7) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 8)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 8) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 8) % 9))) && ((((((int)threadIdx.x) * 12) + 8) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 2) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 8) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 9)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 9) % 81) < 72)) && (1 <= ((((int)threadIdx.x) * 12) % 9))) && (((((int)threadIdx.x) * 12) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 3) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + ((((int)threadIdx.x) * 12) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 10)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 10) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 1) % 9))) && ((((((int)threadIdx.x) * 12) + 1) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 3) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 1) % 9)) - 8)] : 0.000000e+00f);
+        pad_temp_shared[((((int)threadIdx.x) * 12) + 11)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 11) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 2) % 9))) && ((((((int)threadIdx.x) * 12) + 2) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 3) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 2) % 9)) - 8)] : 0.000000e+00f);
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 168)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 6) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 6) % 9))) && ((((((int)threadIdx.x) * 12) + 6) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 56) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 6) % 9)) - 8)] : 0.000000e+00f);
         }
-      }
-      for (int i1_inner = 0; i1_inner < 2; ++i1_inner) {
-        for (int i3_inner = 0; i3_inner < 7; ++i3_inner) {
-          compute[((((((((int)blockIdx.x) / 7) * 6272) + (((int)threadIdx.x) * 98)) + (i1_inner * 49)) + ((((int)blockIdx.x) % 7) * 7)) + i3_inner)] = max((conv2d_nchw[((i1_inner * 7) + i3_inner)] + bias[((((((int)blockIdx.x) / 7) * 128) + (((int)threadIdx.x) * 2)) + i1_inner)]), 0.000000e+00f);
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 169)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 7) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 7) % 9))) && ((((((int)threadIdx.x) * 12) + 7) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 56) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 7) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 170)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 8) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 8) % 9))) && ((((((int)threadIdx.x) * 12) + 8) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 56) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 8) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 171)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 9) % 81) < 72)) && (1 <= ((((int)threadIdx.x) * 12) % 9))) && (((((int)threadIdx.x) * 12) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 57) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + ((((int)threadIdx.x) * 12) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 172)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 10) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 1) % 9))) && ((((((int)threadIdx.x) * 12) + 1) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 57) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 1) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 173)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 11) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 2) % 9))) && ((((((int)threadIdx.x) * 12) + 2) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 57) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 2) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 174)] = (((((3 <= (((((int)threadIdx.x) * 4) + 4) % 27)) && ((((((int)threadIdx.x) * 12) + 12) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 3) % 9))) && ((((((int)threadIdx.x) * 12) + 3) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 58) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 3) % 9)) - 8)] : 0.000000e+00f);
         }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 175)] = (((((3 <= (((((int)threadIdx.x) * 4) + 4) % 27)) && ((((((int)threadIdx.x) * 12) + 13) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 4) % 9))) && ((((((int)threadIdx.x) * 12) + 4) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 58) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 4) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 176)] = (((((3 <= (((((int)threadIdx.x) * 4) + 4) % 27)) && ((((((int)threadIdx.x) * 12) + 14) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 5) % 9))) && ((((((int)threadIdx.x) * 12) + 5) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 58) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 5) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 177)] = (((((1 <= (((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 15) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 6) % 9))) && ((((((int)threadIdx.x) * 12) + 6) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 59) / 27) * 49)) + ((((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 6) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 178)] = (((((1 <= (((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 16) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 7) % 9))) && ((((((int)threadIdx.x) * 12) + 7) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 59) / 27) * 49)) + ((((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 7) % 9)) - 8)] : 0.000000e+00f);
+        }
+        if (((int)threadIdx.x) < 13) {
+          pad_temp_shared[((((int)threadIdx.x) * 12) + 179)] = (((((1 <= (((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 17) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 8) % 9))) && ((((((int)threadIdx.x) * 12) + 8) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 59) / 27) * 49)) + ((((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 8) % 9)) - 8)] : 0.000000e+00f);
+        }
+        kernel_shared[((int)threadIdx.x)] = kernel[(((((int)blockIdx.x) * 36864) + (rc_outer_outer * 36)) + ((int)threadIdx.x))];
+        kernel_shared[(((int)threadIdx.x) + 14)] = kernel[((((((int)blockIdx.x) * 36864) + (rc_outer_outer * 36)) + ((int)threadIdx.x)) + 14)];
+        kernel_shared[(((int)threadIdx.x) + 28)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 28) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 28) % 36))];
+        kernel_shared[(((int)threadIdx.x) + 42)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 42) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 6))];
+        kernel_shared[(((int)threadIdx.x) + 56)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 56) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 20))];
+        kernel_shared[(((int)threadIdx.x) + 70)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 70) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 34) % 36))];
+        kernel_shared[(((int)threadIdx.x) + 84)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 84) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 12))];
+        kernel_shared[(((int)threadIdx.x) + 98)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 98) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 26) % 36))];
+        kernel_shared[(((int)threadIdx.x) + 112)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 112) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 4))];
+        kernel_shared[(((int)threadIdx.x) + 126)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 126) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 18))];
+        kernel_shared[(((int)threadIdx.x) + 140)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 140) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 32) % 36))];
+        kernel_shared[(((int)threadIdx.x) + 154)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 154) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 10))];
+        kernel_shared[(((int)threadIdx.x) + 168)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 168) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 24) % 36))];
+        kernel_shared[(((int)threadIdx.x) + 182)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 182) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 2))];
+        kernel_shared[(((int)threadIdx.x) + 196)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 196) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 16))];
+        kernel_shared[(((int)threadIdx.x) + 210)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 210) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 30) % 36))];
+        kernel_shared[(((int)threadIdx.x) + 224)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 224) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 8))];
+        kernel_shared[(((int)threadIdx.x) + 238)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 238) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 22))];
+        kernel_shared[(((int)threadIdx.x) + 252)] = kernel[((((((int)blockIdx.x) * 36864) + (rc_outer_outer * 36)) + ((int)threadIdx.x)) + 32256)];
+        kernel_shared[(((int)threadIdx.x) + 266)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 266) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 14))];
+        if (((int)threadIdx.x) < 8) {
+          kernel_shared[(((int)threadIdx.x) + 280)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 280) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 28))];
+        }
+        __syncthreads();
+        for (int rx_outer_inner = 0; rx_outer_inner < 3; ++rx_outer_inner) {
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+          conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+          conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+          conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+          conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+          conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+          conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+          conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+          conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+          conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+          conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+          conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+          conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+          conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+          conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+          conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+          conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+          conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+          conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+          conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+          conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+          conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+          conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+          conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+          conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+          conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+          conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+          conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+          conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+        }
+      }
+      for (int i1_inner = 0; i1_inner < 4; ++i1_inner) {
+        compute[((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7))] = max((conv2d_nchw[i1_inner] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+        compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 1)] = max((conv2d_nchw[(i1_inner + 4)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+        compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 2)] = max((conv2d_nchw[(i1_inner + 8)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+        compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 3)] = max((conv2d_nchw[(i1_inner + 12)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+        compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 4)] = max((conv2d_nchw[(i1_inner + 16)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+        compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 5)] = max((conv2d_nchw[(i1_inner + 20)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+        compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 6)] = max((conv2d_nchw[(i1_inner + 24)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
       }
     }
 
@@ -1351,7 +1384,7 @@ In the example below we resume the status and do 5 more trials.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 2 minutes  20.665 seconds)
+   **Total running time of the script:** ( 2 minutes  20.451 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_conv2d_layer_cuda.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
index edf8eb30d..0cbae2e94 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_cuda.rst.txt
@@ -614,7 +614,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      10.0707      10.0897      10.0959      10.0265       0.0314   
+      10.1876      10.1918      10.2093      10.1617       0.0197   
                
 
 
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
index e1ff6f2b1..727ffef22 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_network_x86.rst.txt
@@ -633,7 +633,7 @@ so we can read the log file and load the best schedules.
     Evaluate inference time cost...
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-      767.2840     767.8188     770.8792     763.1539      3.1765   
+      759.3154     758.3050     761.5435     758.0978      1.5778   
                
 
 
@@ -658,7 +658,7 @@ Other Tips
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  20.512 seconds)
+   **Total running time of the script:** ( 1 minutes  20.072 seconds)
 
 
 .. _sphx_glr_download_how_to_tune_with_autoscheduler_tune_network_x86.py:
diff --git a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
index 675ffe6c8..db282b172 100644
--- a/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
+++ b/docs/_sources/how_to/tune_with_autoscheduler/tune_sparse_x86.rst.txt
@@ -362,7 +362,7 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
                  placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
                  compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
       buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-      preflattened_buffer_map = {placeholder_5: placeholder_15: Buffer(placeholder_10, float32, [128, 256], []), placeholder_9: placeholder_16: Buffer(placeholder_14, float32, [128, 512], []), placeholder_6: placeholder_17: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_7: placeholder_18: Buffer(placeholder_12, int32, [4916], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_8: placeholder_19: Buffer(placeholder_13, int32, [33], [])} {
+      preflattened_buffer_map = {placeholder_7: placeholder_15: Buffer(placeholder_12, int32, [4916], []), placeholder_9: placeholder_16: Buffer(placeholder_14, float32, [128, 512], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_6: placeholder_17: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_8: placeholder_19: Buffer(placeholder_13, int32, [33], [])} {
       for (i0.outer.i1.outer.fused: int32, 0, 32) "parallel" {
         allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global {
           for (i.outer.inner: int32, 0, 2) {
@@ -439,7 +439,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 1.625 ms
+    Execution time of this operator: 1.621 ms
 
 
 
diff --git a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
index b59946f8b..b34a51cdd 100644
--- a/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:45.248** total execution time for **how_to_tune_with_autotvm** files:
+**00:43.947** total execution time for **how_to_tune_with_autotvm** files:
 
-- **00:44.386**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)
-- **00:00.232**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_x86.py` (``tune_relay_x86.py``)
-- **00:00.211**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_arm.py` (``tune_relay_arm.py``)
-- **00:00.210**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_mobile_gpu.py` (``tune_relay_mobile_gpu.py``)
-- **00:00.209**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_cuda.py` (``tune_relay_cuda.py``)
+- **00:43.053**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py` (``tune_conv2d_cuda.py``)
+- **00:00.239**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_x86.py` (``tune_relay_x86.py``)
+- **00:00.222**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_mobile_gpu.py` (``tune_relay_mobile_gpu.py``)
+- **00:00.220**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_cuda.py` (``tune_relay_cuda.py``)
+- **00:00.213**: :ref:`sphx_glr_how_to_tune_with_autotvm_tune_relay_arm.py` (``tune_relay_arm.py``)
diff --git a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
index 0569b85b7..d19563b01 100644
--- a/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
+++ b/docs/_sources/how_to/tune_with_autotvm/tune_conv2d_cuda.rst.txt
@@ -859,8 +859,8 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 4, 32]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 1, 128]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2885496
-    No: 6   GFLOPS: 63.26/63.26     result: MeasureResult(costs=(0.0036596923,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6318109035491943, timestamp=1651374043.883832)        [('tile_f', [-1, 1, 1, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 4, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,3754080
-    No: 7   GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 6   GFLOPS: 92.58/92.58     result: MeasureResult(costs=(0.0025004747916666666,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6190400123596191, timestamp=1651508826.0159192)      [('tile_f', [-1, 1, 1, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 4, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,3754080
+    No: 7   GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -983,7 +983,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 1, 16, 32]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 256, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6225319
-    No: 8   GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 8   GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1106,7 +1106,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 1, 32]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 8, 64]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,943546
-    No: 9   GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 9   GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1229,7 +1229,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 16, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 16, 32]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2868708
-    No: 10  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 10  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 142, in build
         res = future.result()
       File "/usr/lib/python3.7/concurrent/futures/_base.py", line 435, in result
@@ -1247,7 +1247,7 @@ for this template
     TimeoutError
 
             [('tile_f', [-1, 32, 2, 4]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 2]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4691833
-    No: 11  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 11  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1370,7 +1370,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 1, 2, 64]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 4, 4]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,1042124
-    No: 12  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 12  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1493,7 +1493,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 32, 1, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 32, 16]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10013405
-    No: 13  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 13  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1616,7 +1616,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 8, 8, 2]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 4, 32]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6732082
-    No: 14  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 14  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1739,7 +1739,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 4, 32]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 4, 128]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 1)],None,7536735
-    No: 15  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 15  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1862,7 +1862,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 1, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 128, 4]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,482121
-    No: 16  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 16  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -1985,7 +1985,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 1, 16]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 32, 8]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2824525
-    No: 17  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 17  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -2108,7 +2108,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 64, 1, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 8, 8]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4559286
-    No: 18  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 18  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 571, in __call__
         func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 523, in _build_func_common
@@ -2231,7 +2231,7 @@ for this template
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 854, in verify_pass
         raise InstantiationError("Skipped because of invalid gpu kernel")
     tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 1, 32, 16]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 1, 512]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9677544
-    No: 19  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+    No: 19  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 721, in __call__
         yield remote, remote.load_module(os.path.split(build_result.filename)[1])
       File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 685, in run_through_rpc
@@ -2319,7 +2319,7 @@ for this template
       15: _PyEval_EvalFrameDefault
       14: 0x0000000000537c30
       13: _PyObject_FastCallKeywords
-      12: 0x00007fc64b807fa2
+      12: 0x00007f3a3047dfa2
       11: _ctypes_callproc
       10: ffi_call
       9: ffi_call_unix64
@@ -2384,7 +2384,7 @@ for this template
       21: _PyFunction_FastCallKeywords
       20: _PyEval_EvalFrameDefault
       19: _PyFunction_FastCall      [('tile_f', [-1, 8, 2, 16]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 1, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6390073
-    No: 20  GFLOPS: 142.48/142.48   result: MeasureResult(costs=(0.00162477076,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.4184842109680176, timestamp=1651374070.365538)       [('tile_f', [-1, 1, 4, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9881539
+    No: 20  GFLOPS: 141.66/141.66   result: MeasureResult(costs=(0.0016342177903225807,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.1537678241729736, timestamp=1651508852.1777332)      [('tile_f', [-1, 1, 4, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9881539
 
 
 
@@ -2437,7 +2437,7 @@ and measure running time.
 
     Best config:
     [('tile_f', [-1, 1, 4, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9881539
-    Time cost of this operator: 0.002004
+    Time cost of this operator: 0.001984
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
index 136ee3212..9f2888682 100644
--- a/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/micro_autotune.rst.txt
@@ -292,10 +292,10 @@ Timing the untuned program
     ########## Build without Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  
     ---------                                     ---                                           --------  -------  -----              ------  -------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  314.0     98.74    (1, 2, 10, 10, 3)  2       1        
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.073     0.966    (1, 6, 10, 10)     1       1        
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.934     0.294    (1, 1, 10, 10, 3)  1       1        
-    Total_time                                    -                                             318.007   -        -                  -       -        
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  315.2     98.742   (1, 2, 10, 10, 3)  2       1        
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.097     0.97     (1, 6, 10, 10)     1       1        
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.918     0.288    (1, 1, 10, 10, 3)  1       1        
+    Total_time                                    -                                             319.215   -        -                  -       -        
 
 
 
@@ -357,10 +357,10 @@ Timing the tuned program
     ########## Build with Autotuning ##########
     Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  
     ---------                                     ---                                           --------  -------  -----              ------  -------  
-    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  81.4      96.877   (1, 6, 10, 10, 1)  2       1        
-    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.702     2.025    (1, 6, 10, 10)     1       1        
-    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.922     1.097    (1, 1, 10, 10, 3)  1       1        
-    Total_time                                    -                                             84.024    -        -                  -       -        
+    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  81.15     96.765   (1, 6, 10, 10, 1)  2       1        
+    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.812     2.161    (1, 6, 10, 10)     1       1        
+    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.901     1.074    (1, 1, 10, 10, 3)  1       1        
+    Total_time                                    -                                             83.863    -        -                  -       -        
 
 
 
diff --git a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
index 9711f2210..8440e1e9b 100644
--- a/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_microtvm/sg_execution_times.rst.txt
@@ -5,10 +5,10 @@
 
 Computation times
 =================
-**00:45.186** total execution time for **how_to_work_with_microtvm** files:
+**00:44.150** total execution time for **how_to_work_with_microtvm** files:
 
-- **00:40.959**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)
-- **00:03.619**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)
+- **00:40.057**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_autotune.py` (``micro_autotune.py``)
+- **00:03.492**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_tflite.py` (``micro_tflite.py``)
 - **00:00.208**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_tvmc.py` (``micro_tvmc.py``)
-- **00:00.202**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_ethosu.py` (``micro_ethosu.py``)
-- **00:00.199**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_reference_vm.py` (``micro_reference_vm.py``)
+- **00:00.197**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_ethosu.py` (``micro_ethosu.py``)
+- **00:00.197**: :ref:`sphx_glr_how_to_work_with_microtvm_micro_reference_vm.py` (``micro_reference_vm.py``)
diff --git a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
index ab9e75745..102437037 100644
--- a/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_relay/sg_execution_times.rst.txt
@@ -5,8 +5,8 @@
 
 Computation times
 =================
-**00:09.226** total execution time for **how_to_work_with_relay** files:
+**00:06.390** total execution time for **how_to_work_with_relay** files:
 
-- **00:07.184**: :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)
-- **00:01.829**: :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)
-- **00:00.214**: :ref:`sphx_glr_how_to_work_with_relay_using_relay_viz.py` (``using_relay_viz.py``)
+- **00:04.488**: :ref:`sphx_glr_how_to_work_with_relay_using_external_lib.py` (``using_external_lib.py``)
+- **00:01.691**: :ref:`sphx_glr_how_to_work_with_relay_build_gcn.py` (``build_gcn.py``)
+- **00:00.212**: :ref:`sphx_glr_how_to_work_with_relay_using_relay_viz.py` (``using_relay_viz.py``)
diff --git a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
index d8762fedd..204634643 100644
--- a/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/sg_execution_times.rst.txt
@@ -5,13 +5,13 @@
 
 Computation times
 =================
-**00:05.920** total execution time for **how_to_work_with_schedules** files:
+**00:05.572** total execution time for **how_to_work_with_schedules** files:
 
-- **00:02.180**: :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)
-- **00:01.234**: :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)
-- **00:00.770**: :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)
-- **00:00.745**: :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)
-- **00:00.315**: :ref:`sphx_glr_how_to_work_with_schedules_extern_op.py` (``extern_op.py``)
-- **00:00.231**: :ref:`sphx_glr_how_to_work_with_schedules_schedule_primitives.py` (``schedule_primitives.py``)
-- **00:00.227**: :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)
-- **00:00.218**: :ref:`sphx_glr_how_to_work_with_schedules_tuple_inputs.py` (``tuple_inputs.py``)
+- **00:02.035**: :ref:`sphx_glr_how_to_work_with_schedules_intrin_math.py` (``intrin_math.py``)
+- **00:01.097**: :ref:`sphx_glr_how_to_work_with_schedules_tensorize.py` (``tensorize.py``)
+- **00:00.724**: :ref:`sphx_glr_how_to_work_with_schedules_reduction.py` (``reduction.py``)
+- **00:00.699**: :ref:`sphx_glr_how_to_work_with_schedules_scan.py` (``scan.py``)
+- **00:00.310**: :ref:`sphx_glr_how_to_work_with_schedules_extern_op.py` (``extern_op.py``)
+- **00:00.246**: :ref:`sphx_glr_how_to_work_with_schedules_schedule_primitives.py` (``schedule_primitives.py``)
+- **00:00.238**: :ref:`sphx_glr_how_to_work_with_schedules_tedd.py` (``tedd.py``)
+- **00:00.223**: :ref:`sphx_glr_how_to_work_with_schedules_tuple_inputs.py` (``tuple_inputs.py``)
diff --git a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
index d498ff933..ec428717c 100644
--- a/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
+++ b/docs/_sources/how_to/work_with_schedules/tensorize.rst.txt
@@ -318,7 +318,7 @@ The importing needs to happen before the tensorized GEMV is executed.
                  C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
       buffer_map = {A_1: A, B_1: B, C_1: C}
       preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmp8plnll6e/input0.cc'\nsource_filename = \"/tmp/tmp8plnll6e/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
+      attr [IterVar(i: int32, (nullptr), "DataPar", "")] "pragma_import_llvm" = "; ModuleID = '/tmp/tmppb0827hi/input0.cc'\nsource_filename = \"/tmp/tmppb0827hi/input0.cc\"\ntarget datalayout = \"e-m:e-i64:64-f80:128-n8:16:32:64-S128\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = alloca float*, align 8\n  %8 = alloca float*, align 8\n  %9 = alloca floa [...]
       for (i, 0, 1024) {
         for (j.outer: int32, 0, 32) {
           @tir.call_extern("gemv_update", @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
diff --git a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
index df79c5d03..1093f8f9a 100644
--- a/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/autotvm/sg_execution_times.rst.txt
@@ -5,7 +5,7 @@
 
 Computation times
 =================
-**00:20.809** total execution time for **topic_vta_tutorials_autotvm** files:
+**00:20.762** total execution time for **topic_vta_tutorials_autotvm** files:
 
-- **00:20.610**: :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``)
-- **00:00.198**: :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_alu_vta.py` (``tune_alu_vta.py``)
+- **00:20.551**: :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_relay_vta.py` (``tune_relay_vta.py``)
+- **00:00.210**: :ref:`sphx_glr_topic_vta_tutorials_autotvm_tune_alu_vta.py` (``tune_alu_vta.py``)
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
index 14f8d3d82..7f2e5af0e 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_classification.rst.txt
@@ -265,7 +265,7 @@ The compilation steps are:
       DeprecationWarning,
     /workspace/vta/tutorials/frontend/deploy_classification.py:213: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
       relay_prog, target=tvm.target.Target(target, host=env.target_host), params=params
-    resnet18_v1 inference graph built in 21.55s!
+    resnet18_v1 inference graph built in 21.65s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
index dfffdf70f..01275e66b 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/deploy_detection.rst.txt
@@ -301,7 +301,7 @@ The compilation steps are:
 
     /workspace/python/tvm/relay/build_module.py:439: DeprecationWarning: Please use input parameter mod (tvm.IRModule) instead of deprecated parameter mod (tvm.relay.function.Function)
       DeprecationWarning,
-    yolov3-tiny inference graph built in 14.99s!
+    yolov3-tiny inference graph built in 15.02s!
 
 
 
diff --git a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
index e51eacb74..ca95db4e8 100644
--- a/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/frontend/sg_execution_times.rst.txt
@@ -5,7 +5,7 @@
 
 Computation times
 =================
-**01:28.931** total execution time for **topic_vta_tutorials_frontend** files:
+**01:29.076** total execution time for **topic_vta_tutorials_frontend** files:
 
-- **00:47.303**: :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)
-- **00:41.628**: :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``)
+- **00:47.242**: :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_detection.py` (``deploy_detection.py``)
+- **00:41.834**: :ref:`sphx_glr_topic_vta_tutorials_frontend_deploy_classification.py` (``deploy_classification.py``)
diff --git a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
index 87fa41552..332dd4246 100644
--- a/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/optimize/sg_execution_times.rst.txt
@@ -5,7 +5,7 @@
 
 Computation times
 =================
-**00:03.529** total execution time for **topic_vta_tutorials_optimize** files:
+**00:03.539** total execution time for **topic_vta_tutorials_optimize** files:
 
-- **00:02.963**: :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)
-- **00:00.565**: :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``)
+- **00:03.000**: :ref:`sphx_glr_topic_vta_tutorials_optimize_convolution_opt.py` (``convolution_opt.py``)
+- **00:00.538**: :ref:`sphx_glr_topic_vta_tutorials_optimize_matrix_multiply_opt.py` (``matrix_multiply_opt.py``)
diff --git a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
index f3fcca6c5..8e01e660f 100644
--- a/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/topic/vta/tutorials/sg_execution_times.rst.txt
@@ -5,7 +5,7 @@
 
 Computation times
 =================
-**00:01.056** total execution time for **topic_vta_tutorials** files:
+**00:00.992** total execution time for **topic_vta_tutorials** files:
 
-- **00:00.536**: :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``)
-- **00:00.520**: :ref:`sphx_glr_topic_vta_tutorials_vta_get_started.py` (``vta_get_started.py``)
+- **00:00.502**: :ref:`sphx_glr_topic_vta_tutorials_matrix_multiply.py` (``matrix_multiply.py``)
+- **00:00.490**: :ref:`sphx_glr_topic_vta_tutorials_vta_get_started.py` (``vta_get_started.py``)
diff --git a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
index 2164c7e0c..dfd3ef5f3 100644
--- a/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
+++ b/docs/_sources/tutorial/auto_scheduler_matmul_x86.rst.txt
@@ -306,7 +306,7 @@ We build the binary and check its correctness and performance.
 
  .. code-block:: none
 
-    Execution time of this operator: 92.952 ms
+    Execution time of this operator: 93.667 ms
 
 
 
diff --git a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
index f2ac0476f..82ea9eda9 100644
--- a/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
+++ b/docs/_sources/tutorial/autotvm_relay_x86.rst.txt
@@ -268,7 +268,7 @@ standard deviation.
 
  .. code-block:: none
 
-    {'mean': 495.78402514001937, 'median': 495.88523714919575, 'std': 0.6694210681957663}
+    {'mean': 492.9064424504759, 'median': 492.7712094009621, 'std': 1.1670846093769405}
 
 
 
@@ -482,30 +482,30 @@ the tuning data to.
 
  .. code-block:: none
 
-
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  1/25]  Current/Best:   17.95/  23.31 GFLOPS | Progress: (4/10) | 8.62 s
    [Task  1/25]  Current/Best:    6.42/  23.59 GFLOPS | Progress: (8/10) | 12.15 s
    [Task  1/25]  Current/Best:   17.49/  23.59 GFLOPS | Progress: (10/10) | 13.07 s Done.
-
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  2/25]  Current/Best:   14.01/  16.53 GFLOPS | Progress: (4/10) | 2.40 s
    [Task  2/25]  Current/Best:    3.56/  16.53 GFLOPS | Progress: (8/10) | 3.86 s
    [Task  2/25]  Current/Best:   13.29/  16.53 GFLOPS | Progress: (10/10) | 5.01 s Done.
-
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  3/25]  Current/Best:   13.50/  20.29 GFLOPS | Progress: (4/10) | 2.87 s
    [Task  3/25]  Current/Best:   16.74/  20.29 GFLOPS | Progress: (8/10) | 4.78 s
    [Task  3/25]  Current/Best:   13.27/  20.29 GFLOPS | Progress: (10/10) | 7.04 s Done.
-
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  4/25]  Current/Best:   21.99/  21.99 GFLOPS | Progress: (4/10) | 7.52 s
    [Task  4/25]  Current/Best:    6.21/  21.99 GFLOPS | Progress: (8/10) | 11.96 s
    [Task  4/25]  Current/Best:   18.78/  21.99 GFLOPS | Progress: (10/10) | 14.00 s Done.
-
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  5/25]  Current/Best:   15.98/  15.98 GFLOPS | Progress: (4/10) | 2.93 s
    [Task  5/25]  Current/Best:    2.74/  17.68 GFLOPS | Progress: (8/10) | 5.35 s
    [Task  5/25]  Current/Best:   15.15/  17.68 GFLOPS | Progress: (10/10) | 6.12 s Done.
-
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  6/25]  Current/Best:    9.58/  14.43 GFLOPS | Progress: (4/10) | 3.54 s
    [Task  6/25]  Current/Best:    9.84/  19.20 GFLOPS | Progress: (8/10) | 5.50 s
    [Task  6/25]  Current/Best:   21.06/  21.06 GFLOPS | Progress: (10/10) | 6.83 s Done.
-
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  7/25]  Current/Best:   10.49/  13.62 GFLOPS | Progress: (4/10) | 2.87 s
    [Task  7/25]  Current/Best:   12.22/  13.62 GFLOPS | Progress: (8/10) | 4.98 s
    [Task  7/25]  Current/Best:    6.50/  14.78 GFLOPS | Progress: (10/10) | 6.10 s Done.
-
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  8/25]  Current/Best:   10.60/  20.47 GFLOPS | Progress: (4/10) | 5.06 s
    [Task  8/25]  Current/Best:   10.78/  20.47 GFLOPS | Progress: (8/10) | 6.95 s
    [Task  8/25]  Current/Best:    6.40/  20.47 GFLOPS | Progress: (10/10) | 14.18 s Done.
-
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  9/25]  Current/Best:   18.80/  18.80 GFLOPS | Progress: (4/10) | 4.57 s
    [Task  9/25]  Current/Best:   15.28/  18.80 GFLOPS | Progress: (8/10) | 8.21 s
    [Task  9/25]  Current/Best:   10.48/  18.80 GFLOPS | Progress: (10/10) | 12.43 s Done.
-
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 10/25]  Current/Best:   12.52/  16.66 GFLOPS | Progress: (4/10) | 4.49 s
    [Task 10/25]  Current/Best:   11.86/  19.57 GFLOPS | Progress: (8/10) | 7.51 s
    [Task 10/25]  Current/Best:    3.06/  19.57 GFLOPS | Progress: (10/10) | 8.49 s Done.
-
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 11/25]  Current/Best:    3.13/  22.08 GFLOPS | Progress: (4/10) | 3.29 s
    [Task 11/25]  Current/Best:   13.96/  22.08 GFLOPS | Progress: (8/10) | 6.75 s
    [Task 11/25]  Current/Best:   12.38/  22.08 GFLOPS | Progress: (10/10) | 7.63 s Done.
-
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 12/25]  Current/Best:    3.16/  18.04 GFLOPS | Progress: (4/10) | 3.78 s
    [Task 12/25]  Current/Best:   20.15/  20.15 GFLOPS | Progress: (8/10) | 5.66 s
    [Task 12/25]  Current/Best:   12.13/  20.15 GFLOPS | Progress: (10/10) | 6.46 s Done.
-
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 13/25]  Current/Best:   18.10/  20.42 GFLOPS | Progress: (4/10) | 4.99 s
    [Task 13/25]  Current/Best:   20.28/  22.91 GFLOPS | Progress: (8/10) | 7.05 s
    [Task 13/25]  Current/Best:   17.77/  22.91 GFLOPS | Progress: (10/10) | 9.33 s Done.
-
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 14/25]  Current/Best:   13.03/  20.95 GFLOPS | Progress: (4/10) | 2.93 s
    [Task 14/25]  Current/Best:    9.42/  20.95 GFLOPS | Progress: (8/10) | 5.12 s
    [Task 14/25]  Current/Best:    1.60/  20.95 GFLOPS | Progress: (10/10) | 7.55 s
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 15/25]  Current/Best:   21.18/  22.25 GFLOPS | Progress: (4/10) | 2.62 s
    [Task 15/25]  Current/Best:    7.57/  22.25 GFLOPS | Progress: (8/10) | 3.91 s
    [Task 15/25]  Current/Best:    9.40/  22.25 GFLOPS | Progress: (10/10) | 8.71 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s Done.
+
    [Task  1/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  1/25]  Current/Best:    9.22/  23.00 GFLOPS | Progress: (4/10) | 5.50 s
    [Task  1/25]  Current/Best:    3.52/  23.30 GFLOPS | Progress: (8/10) | 8.10 s
    [Task  1/25]  Current/Best:   23.50/  23.50 GFLOPS | Progress: (10/10) | 8.98 s Done.
+
    [Task  2/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  2/25]  Current/Best:    5.56/  13.76 GFLOPS | Progress: (4/10) | 2.71 s
    [Task  2/25]  Current/Best:   15.88/  16.27 GFLOPS | Progress: (8/10) | 4.17 s
    [Task  2/25]  Current/Best:   17.83/  17.83 GFLOPS | Progress: (10/10) | 4.76 s Done.
+
    [Task  3/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  3/25]  Current/Best:   11.39/  24.13 GFLOPS | Progress: (4/10) | 2.76 s
    [Task  3/25]  Current/Best:   24.29/  24.29 GFLOPS | Progress: (8/10) | 4.73 s
    [Task  3/25]  Current/Best:   17.52/  24.29 GFLOPS | Progress: (10/10) | 5.61 s Done.
+
    [Task  4/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  4/25]  Current/Best:    6.70/  16.91 GFLOPS | Progress: (4/10) | 7.43 s
    [Task  4/25]  Current/Best:   13.37/  17.26 GFLOPS | Progress: (8/10) | 8.99 s
    [Task  4/25]  Current/Best:   21.04/  21.04 GFLOPS | Progress: (10/10) | 9.88 s Done.
+
    [Task  5/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  5/25]  Current/Best:    3.25/  13.38 GFLOPS | Progress: (4/10) | 3.86 s
    [Task  5/25]  Current/Best:    9.00/  14.99 GFLOPS | Progress: (8/10) | 6.01 s
    [Task  5/25]  Current/Best:   18.27/  18.27 GFLOPS | Progress: (10/10) | 6.85 s Done.
+
    [Task  6/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  6/25]  Current/Best:   14.75/  14.75 GFLOPS | Progress: (4/10) | 4.02 s
    [Task  6/25]  Current/Best:   18.20/  18.20 GFLOPS | Progress: (8/10) | 6.55 s
    [Task  6/25]  Current/Best:   15.85/  18.20 GFLOPS | Progress: (10/10) | 7.68 s Done.
+
    [Task  7/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  7/25]  Current/Best:   19.09/  19.09 GFLOPS | Progress: (4/10) | 3.31 s
    [Task  7/25]  Current/Best:    5.28/  19.61 GFLOPS | Progress: (8/10) | 5.20 s
    [Task  7/25]  Current/Best:   17.48/  19.61 GFLOPS | Progress: (10/10) | 5.96 s Done.
+
    [Task  8/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  8/25]  Current/Best:    9.29/  11.09 GFLOPS | Progress: (4/10) | 9.03 s
    [Task  8/25]  Current/Best:   16.18/  16.18 GFLOPS | Progress: (8/10) | 13.51 s
    [Task  8/25]  Current/Best:    9.29/  16.18 GFLOPS | Progress: (10/10) | 21.29 s Done.
+
    [Task  9/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task  9/25]  Current/Best:   17.28/  19.69 GFLOPS | Progress: (4/10) | 3.86 s
    [Task  9/25]  Current/Best:    8.50/  21.24 GFLOPS | Progress: (8/10) | 5.19 s
    [Task  9/25]  Current/Best:   11.75/  21.24 GFLOPS | Progress: (10/10) | 6.17 s Done.
+
    [Task 10/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 10/25]  Current/Best:   14.43/  19.49 GFLOPS | Progress: (4/10) | 2.45 s
    [Task 10/25]  Current/Best:   13.85/  19.49 GFLOPS | Progress: (8/10) | 5.06 s
    [Task 10/25]  Current/Best:    9.57/  19.49 GFLOPS | Progress: (10/10) | 6.63 s Done.
+
    [Task 11/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 11/25]  Current/Best:   15.21/  18.56 GFLOPS | Progress: (4/10) | 3.05 s
    [Task 11/25]  Current/Best:   14.93/  22.50 GFLOPS | Progress: (8/10) | 4.60 s
    [Task 11/25]  Current/Best:    6.29/  22.50 GFLOPS | Progress: (10/10) | 6.02 s Done.
+
    [Task 12/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 12/25]  Current/Best:   11.93/  21.78 GFLOPS | Progress: (4/10) | 3.37 s
    [Task 12/25]  Current/Best:   13.74/  21.78 GFLOPS | Progress: (8/10) | 6.10 s
    [Task 12/25]  Current/Best:   13.84/  21.78 GFLOPS | Progress: (10/10) | 7.87 s Done.
+
    [Task 13/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 13/25]  Current/Best:   19.87/  22.20 GFLOPS | Progress: (4/10) | 3.62 s
    [Task 13/25]  Current/Best:   13.50/  22.20 GFLOPS | Progress: (8/10) | 6.31 s
    [Task 13/25]  Current/Best:   17.84/  22.20 GFLOPS | Progress: (10/10) | 7.19 s Done.
+
    [Task 14/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 14/25]  Current/Best:   14.94/  18.87 GFLOPS | Progress: (4/10) | 4.83 s
    [Task 14/25]  Current/Best:   16.50/  18.87 GFLOPS | Progress: (8/10) | 8.15 s
    [Task 14/25]  Current/Best:    8.95/  18.87 GFLOPS | Progress: (10/10) | 9.50 s Done.
+
    [Task 15/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 15/25]  Current/Best:   10.41/  16.43 GFLOPS | Progress: (4/10) | 5.84 s
    [Task 15/25]  Current/Best:   11.46/  16.43 GFLOPS | Progress: (8/10) | 11.00 s
    [Task 15/25]  Current/Best:   22.23/  22.23 GFLOPS | Progress: (10/10) | 15.61 s
    [Task 16/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 16/25]  Current/Best:    5.31/  21.46 GFLOPS | Progress: (4/10) | 2.60 s
    [Task 16/25]  Current/Best:    7.50/  21.46 GFLOPS | Progress: (8/10) | 5.23 s
    [Task 16/25]  Current/Best:   17.22/  21.46 GFLOPS | Progress: (10/10) | 6.05 s Done.
+
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 17/25]  Current/Best:   10.14/  12.40 GFLOPS | Progress: (4/10) | 4.25 s
    [Task 17/25]  Current/Best:   11.59/  24.05 GFLOPS | Progress: (8/10) | 7.97 s
    [Task 17/25]  Current/Best:   14.14/  24.05 GFLOPS | Progress: (10/10) | 8.94 s Done.
+
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 18/25]  Current/Best:   10.34/  17.15 GFLOPS | Progress: (4/10) | 2.63 s
    [Task 18/25]  Current/Best:   14.70/  17.15 GFLOPS | Progress: (8/10) | 6.43 s
    [Task 18/25]  Current/Best:    4.33/  17.15 GFLOPS | Progress: (10/10) | 7.61 s Done.
+
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 19/25]  Current/Best:   10.64/  16.91 GFLOPS | Progress: (4/10) | 3.87 s
    [Task 19/25]  Current/Best:   16.01/  16.91 GFLOPS | Progress: (8/10) | 6.96 s
    [Task 19/25]  Current/Best:   19.34/  19.34 GFLOPS | Progress: (10/10) | 8.86 s Done.
+
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 20/25]  Current/Best:    1.57/  15.88 GFLOPS | Progress: (4/10) | 5.44 s
    [Task 20/25]  Current/Best:   16.54/  16.54 GFLOPS | Progress: (8/10) | 6.99 s
    [Task 20/25]  Current/Best:   17.63/  17.63 GFLOPS | Progress: (10/10) | 8.12 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s Done.
      Done.
-
    [Task 16/25]  Current/Best:   10.44/  18.40 GFLOPS | Progress: (4/10) | 2.78 s
    [Task 16/25]  Current/Best:    5.35/  18.40 GFLOPS | Progress: (8/10) | 4.35 s
    [Task 16/25]  Current/Best:   14.58/  18.40 GFLOPS | Progress: (10/10) | 5.10 s Done.
-
    [Task 17/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 17/25]  Current/Best:   11.06/  17.71 GFLOPS | Progress: (4/10) | 3.07 s
    [Task 17/25]  Current/Best:   11.06/  18.73 GFLOPS | Progress: (8/10) | 5.04 s
    [Task 17/25]  Current/Best:   10.06/  18.73 GFLOPS | Progress: (10/10) | 5.91 s Done.
-
    [Task 18/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 18/25]  Current/Best:   12.12/  17.96 GFLOPS | Progress: (4/10) | 2.95 s
    [Task 18/25]  Current/Best:   18.55/  18.55 GFLOPS | Progress: (8/10) | 8.64 s
    [Task 18/25]  Current/Best:   21.89/  21.89 GFLOPS | Progress: (10/10) | 9.55 s Done.
-
    [Task 19/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 19/25]  Current/Best:   11.16/  23.97 GFLOPS | Progress: (4/10) | 3.88 s
    [Task 19/25]  Current/Best:   10.55/  23.97 GFLOPS | Progress: (8/10) | 7.33 s
    [Task 19/25]  Current/Best:   11.10/  23.97 GFLOPS | Progress: (10/10) | 8.36 s Done.
-
    [Task 20/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 20/25]  Current/Best:   15.12/  20.74 GFLOPS | Progress: (4/10) | 3.26 s
    [Task 20/25]  Current/Best:   10.47/  20.74 GFLOPS | Progress: (8/10) | 5.11 s
    [Task 20/25]  Current/Best:   11.90/  20.74 GFLOPS | Progress: (10/10) | 6.05 s
    [Task 21/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 21/25]  Current/Best:   17.87/  17.87 GFLOPS | Progress: (4/10) | 2.56 s
    [Task 21/25]  Current/Best:    9.47/  17.87 GFLOPS | Progress: (8/10) | 6.30 s
    [Task 21/25]  Current/Best:   18.88/  18.88 GFLOPS | Progress: (10/10) | 7.78 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s Done.
-     Done.
-
    [Task 22/25]  Current/Best:   10.36/  13.84 GFLOPS | Progress: (4/10) | 3.22 s
    [Task 22/25]  Current/Best:   11.28/  17.90 GFLOPS | Progress: (8/10) | 5.34 s
    [Task 22/25]  Current/Best:    9.89/  17.90 GFLOPS | Progress: (10/10) | 6.64 s Done.
-
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 23/25]  Current/Best:    6.16/  11.71 GFLOPS | Progress: (4/10) | 3.75 s
    [Task 23/25]  Current/Best:    9.03/  12.26 GFLOPS | Progress: (8/10) | 6.86 s
    [Task 23/25]  Current/Best:    8.68/  12.26 GFLOPS | Progress: (10/10) | 8.19 s Done.
-
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 24/25]  Current/Best:    3.98/  10.06 GFLOPS | Progress: (4/10) | 3.16 s
    [Task 24/25]  Current/Best:    4.25/  10.06 GFLOPS | Progress: (8/10) | 5.14 s
    [Task 24/25]  Current/Best:   10.12/  10.12 GFLOPS | Progress: (10/10) | 1459.94 s
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 25/25]  Current/Best:    3.61/   5.83 GFLOPS | Progress: (4/10) | 14.04 s
    [Task 25/25]  Current/Best:    1.55/   5.83 GFLOPS | Progress: (8/10) | 33.46 s
    [Task 25/25]  Current/Best:    2.76/   5.83 GFLOPS | Progress: (10/10) | 38.20 s
+
    [Task 21/25]  Current/Best:   18.24/  22.50 GFLOPS | Progress: (4/10) | 2.67 s
    [Task 21/25]  Current/Best:    8.74/  22.50 GFLOPS | Progress: (8/10) | 4.03 s
    [Task 21/25]  Current/Best:    1.63/  22.50 GFLOPS | Progress: (10/10) | 5.25 s
    [Task 22/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 22/25]  Current/Best:   19.44/  19.94 GFLOPS | Progress: (4/10) | 2.61 s
    [Task 22/25]  Current/Best:   13.40/  19.94 GFLOPS | Progress: (8/10) | 4.45 s
    [Task 22/25]  Current/Best:   16.63/  19.94 GFLOPS | Progress: (10/10) | 5.18 s Done.
+
    [Task 23/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 23/25]  Current/Best:   17.12/  19.40 GFLOPS | Progress: (4/10) | 3.87 s
    [Task 23/25]  Current/Best:    9.56/  19.40 GFLOPS | Progress: (8/10) | 6.53 s
    [Task 23/25]  Current/Best:   18.24/  19.40 GFLOPS | Progress: (10/10) | 8.11 s Done.
+
    [Task 24/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 24/25]  Current/Best:    1.07/   7.00 GFLOPS | Progress: (4/10) | 4.12 s
    [Task 24/25]  Current/Best:    3.39/   7.44 GFLOPS | Progress: (8/10) | 11.70 s
    [Task 24/25]  Current/Best:    5.07/   8.95 GFLOPS | Progress: (10/10) | 12.49 s Done.
+
    [Task 25/25]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/10) | 0.00 s
    [Task 25/25]  Current/Best:    8.32/   8.32 GFLOPS | Progress: (4/10) | 5.50 s
    [Task 25/25]  Current/Best:    8.80/   8.80 GFLOPS | Progress: (8/10) | 31.01 s
    [Task 25/25]  Current/Best:    3.04/   8.80 GFLOPS | Progress: (10/10) | 31.83 s
 
 
 The output from this tuning process will look something like this:
@@ -593,7 +593,7 @@ Verify that the optimized model runs and produces the same results:
 
  .. code-block:: none
 
-    class='n02123045 tabby, tabby cat' with probability=0.621104
+    class='n02123045 tabby, tabby cat' with probability=0.621103
     class='n02123159 tiger cat' with probability=0.356378
     class='n02124075 Egyptian cat' with probability=0.019712
     class='n02129604 tiger, Panthera tigris' with probability=0.001215
@@ -647,8 +647,8 @@ improvement in comparing the optimized model to the unoptimized model.
 
  .. code-block:: none
 
-    optimized: {'mean': 456.5580205198785, 'median': 456.21353815004113, 'std': 1.01731385316094}
-    unoptimized: {'mean': 495.78402514001937, 'median': 495.88523714919575, 'std': 0.6694210681957663}
+    optimized: {'mean': 423.66693938995013, 'median': 423.7311506498372, 'std': 0.9946109582292081}
+    unoptimized: {'mean': 492.9064424504759, 'median': 492.7712094009621, 'std': 1.1670846093769405}
 
 
 
@@ -668,7 +668,7 @@ profiling/benchmarking.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 31 minutes  24.128 seconds)
+   **Total running time of the script:** ( 7 minutes  1.585 seconds)
 
 
 .. _sphx_glr_download_tutorial_autotvm_relay_x86.py:
diff --git a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
index 528707a0e..564dd9b7a 100644
--- a/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/tutorial/cross_compilation_and_rpc.rst.txt
@@ -235,7 +235,7 @@ device and returns the measured cost. Network overhead is excluded.
 
  .. code-block:: none
 
-    1.306e-07 secs/op
+    1.268e-07 secs/op
 
 
 
diff --git a/docs/_sources/tutorial/intro_topi.rst.txt b/docs/_sources/tutorial/intro_topi.rst.txt
index 6863d6e3d..6566d6ba6 100644
--- a/docs/_sources/tutorial/intro_topi.rst.txt
+++ b/docs/_sources/tutorial/intro_topi.rst.txt
@@ -233,7 +233,7 @@ As you can see, scheduled stages of computation have been accumulated and we can
 
  .. code-block:: none
 
-    [stage(a, placeholder(a, 0xe6e4430)), stage(b, placeholder(b, 0xc515230)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min= [...]
+    [stage(a, placeholder(a, 0x51115a0)), stage(b, placeholder(b, 0x94499d0)), stage(T_add, compute(T_add, body=[(a[ax0, ax1, ax2] + b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min=0, ext=10))], reduce_axis=[], tag=broadcast, attrs={})), stage(T_multiply, compute(T_multiply, body=[(a[ax0, ax1, ax2]*b[ax1, ax2])], axis=[iter_var(ax0, range(min=0, ext=100)), iter_var(ax1, range(min=0, ext=10)), iter_var(ax2, range(min= [...]
 
 
 
diff --git a/docs/_sources/tutorial/sg_execution_times.rst.txt b/docs/_sources/tutorial/sg_execution_times.rst.txt
index 5241fd6de..797e33b46 100644
--- a/docs/_sources/tutorial/sg_execution_times.rst.txt
+++ b/docs/_sources/tutorial/sg_execution_times.rst.txt
@@ -5,17 +5,17 @@
 
 Computation times
 =================
-**34:07.347** total execution time for **tutorial** files:
+**09:37.106** total execution time for **tutorial** files:
 
-- **31:24.128**: :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)
-- **01:01.829**: :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)
-- **00:53.038**: :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``)
-- **00:26.385**: :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)
-- **00:20.287**: :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)
-- **00:00.715**: :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)
-- **00:00.606**: :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)
-- **00:00.215**: :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``)
-- **00:00.042**: :ref:`sphx_glr_tutorial_introduction.py` (``introduction.py``)
+- **07:01.585**: :ref:`sphx_glr_tutorial_autotvm_relay_x86.py` (``autotvm_relay_x86.py``)
+- **01:01.938**: :ref:`sphx_glr_tutorial_tensor_expr_get_started.py` (``tensor_expr_get_started.py``)
+- **00:46.121**: :ref:`sphx_glr_tutorial_auto_scheduler_matmul_x86.py` (``auto_scheduler_matmul_x86.py``)
+- **00:26.324**: :ref:`sphx_glr_tutorial_relay_quick_start.py` (``relay_quick_start.py``)
+- **00:19.482**: :ref:`sphx_glr_tutorial_autotvm_matmul_x86.py` (``autotvm_matmul_x86.py``)
+- **00:00.716**: :ref:`sphx_glr_tutorial_intro_topi.py` (``intro_topi.py``)
+- **00:00.571**: :ref:`sphx_glr_tutorial_tensor_ir_blitz_course.py` (``tensor_ir_blitz_course.py``)
+- **00:00.228**: :ref:`sphx_glr_tutorial_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``)
+- **00:00.036**: :ref:`sphx_glr_tutorial_introduction.py` (``introduction.py``)
 - **00:00.036**: :ref:`sphx_glr_tutorial_tvmc_python.py` (``tvmc_python.py``)
+- **00:00.035**: :ref:`sphx_glr_tutorial_install.py` (``install.py``)
 - **00:00.033**: :ref:`sphx_glr_tutorial_tvmc_command_line_driver.py` (``tvmc_command_line_driver.py``)
-- **00:00.033**: :ref:`sphx_glr_tutorial_install.py` (``install.py``)
diff --git a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
index 7819cf564..c89a521e5 100644
--- a/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
+++ b/docs/_sources/tutorial/tensor_expr_get_started.rst.txt
@@ -335,7 +335,7 @@ compile and run this new schedule with the parallel operation applied:
 
  .. code-block:: none
 
-    parallel: 0.000011
+    parallel: 0.000006
 
 
 
@@ -388,7 +388,7 @@ factor to be the number of threads on your CPU.
 
  .. code-block:: none
 
-    vector: 0.000045
+    vector: 0.000025
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [(stride: int32*n: int32)], [], type="auto"),
@@ -438,10 +438,10 @@ We can now compare the different schedules
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                   numpy    8.241979958256707e-06                    1.0
-                   naive               5.824e-06      0.7066263239533351
-                parallel             1.14998e-05      1.3952715316275006
-                  vector             4.51475e-05       5.477749306435989
+                   numpy    7.707279874011874e-06                    1.0
+                   naive    5.8588000000000006e-06    0.7601644284068688
+                parallel              6.1357e-06      0.7960915005420949
+                  vector    2.4720800000000002e-05      3.20746105034487
 
 
 
@@ -830,7 +830,7 @@ matrix multiplication.
 
  .. code-block:: none
 
-    Numpy running time: 0.018290
+    Numpy running time: 0.018506
 
 
 
@@ -886,7 +886,7 @@ optimizations.
 
  .. code-block:: none
 
-    none: 3.494150
+    none: 3.481866
 
 
 
@@ -985,7 +985,7 @@ schedule.
 
  .. code-block:: none
 
-    blocking: 0.293640
+    blocking: 0.304842
 
 
 
@@ -1077,7 +1077,7 @@ already cache friendly from our previous optimizations.
 
  .. code-block:: none
 
-    vectorization: 0.330230
+    vectorization: 0.341577
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1149,7 +1149,7 @@ more cache friendly.
 
  .. code-block:: none
 
-    loop permutation: 0.115239
+    loop permutation: 0.116889
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1246,7 +1246,7 @@ optimized schedule.
 
  .. code-block:: none
 
-    array packing: 0.109536
+    array packing: 0.110665
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1337,7 +1337,7 @@ to `C` when all the block results are ready.
 
  .. code-block:: none
 
-    block caching: 0.110827
+    block caching: 0.110986
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1421,7 +1421,7 @@ of thread-level parallelization.
 
  .. code-block:: none
 
-    parallelization: 0.144832
+    parallelization: 0.144117
     @main = primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
       attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
       buffers = {A: Buffer(A_2: Pointer(float32), float32, [1048576], []),
@@ -1500,13 +1500,13 @@ working, we can compare the results.
  .. code-block:: none
 
                 Operator                  Timing             Performance
-                    none            3.4941498621                     1.0
-                blocking             0.293639645     0.08403750743064044
-           vectorization     0.33023015850000004     0.09450944336472454
-        loop permutation            0.1152391511     0.03298059775568434
-           array packing     0.10953616340000001     0.03134844460682871
-           block caching            0.1108266685     0.03171777767808524
-         parallelization     0.14483214490000001     0.04144989500048382
+                    none      3.4818661379999996                     1.0
+                blocking             0.304842499     0.08755147007894536
+           vectorization     0.34157665649999996     0.09810160499053626
+        loop permutation     0.11688866210000001     0.03357069383693809
+           array packing            0.1106652892    0.031783326760392534
+           block caching            0.1109861249     0.03187547151475242
+         parallelization             0.144116947     0.04139072017363122
 
 
 
@@ -1543,7 +1543,7 @@ the computation for specific platforms.
 
 .. rst-class:: sphx-glr-timing
 
-   **Total running time of the script:** ( 1 minutes  1.829 seconds)
+   **Total running time of the script:** ( 1 minutes  1.938 seconds)
 
 
 .. _sphx_glr_download_tutorial_tensor_expr_get_started.py:
diff --git a/docs/commit_hash b/docs/commit_hash
index b2d511267..b56def866 100644
--- a/docs/commit_hash
+++ b/docs/commit_hash
@@ -1 +1 @@
-b6b0bafdef15bb5491c38770668ddf73ddd02af2
+8eae317d28622238c0a6c0f22c0d4a8f9e62f883
diff --git a/docs/how_to/compile_models/from_mxnet.html b/docs/how_to/compile_models/from_mxnet.html
index 4a37424b7..3df4a7486 100644
--- a/docs/how_to/compile_models/from_mxnet.html
+++ b/docs/how_to/compile_models/from_mxnet.html
@@ -401,7 +401,7 @@
 </div>
 <img alt="../../_images/sphx_glr_from_mxnet_001.png" class="sphx-glr-single-img" src="../../_images/sphx_glr_from_mxnet_001.png" />
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip1902edea-a301-4f5a-baf4-31e1d3057145 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/resnet18_v1-a0666292.zip0eb6e8fc-3d27-4193-9d1a-e0aa100151b4 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1-a0666292.zip...
 x (1, 3, 224, 224)
 </pre></div>
 </div>
diff --git a/docs/how_to/compile_models/from_oneflow.html b/docs/how_to/compile_models/from_oneflow.html
index bd484f609..6c49b5768 100644
--- a/docs/how_to/compile_models/from_oneflow.html
+++ b/docs/how_to/compile_models/from_oneflow.html
@@ -406,54 +406,99 @@ python3 -m pip install -f https://release.oneflow.info <span class="nv">oneflow<
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/flowvision/classification/ResNet/resnet18.zip&quot; to /workspace/.oneflow/flowvision_cache/resnet18.zip
 
   0%|          | 0.00/41.5M [00:00&lt;?, ?B/s]
-  0%|          | 16.0k/41.5M [00:00&lt;08:20, 86.8kB/s]
-  0%|          | 48.0k/41.5M [00:00&lt;05:16, 137kB/s]
-  0%|          | 96.0k/41.5M [00:00&lt;03:44, 193kB/s]
-  0%|          | 160k/41.5M [00:00&lt;02:51, 253kB/s]
-  1%|          | 288k/41.5M [00:00&lt;01:45, 411kB/s]
-  1%|1         | 464k/41.5M [00:01&lt;01:12, 594kB/s]
-  1%|1         | 600k/41.5M [00:01&lt;01:07, 640kB/s]
-  2%|2         | 952k/41.5M [00:01&lt;00:40, 1.04MB/s]
-  3%|3         | 1.35M/41.5M [00:01&lt;00:29, 1.44MB/s]
-  5%|5         | 2.09M/41.5M [00:01&lt;00:18, 2.26MB/s]
-  9%|8         | 3.56M/41.5M [00:02&lt;00:09, 4.06MB/s]
- 12%|#2        | 5.03M/41.5M [00:02&lt;00:06, 5.92MB/s]
- 14%|#4        | 5.92M/41.5M [00:02&lt;00:05, 6.37MB/s]
- 16%|#5        | 6.59M/41.5M [00:02&lt;00:06, 6.03MB/s]
- 19%|#9        | 7.97M/41.5M [00:02&lt;00:04, 7.16MB/s]
- 23%|##2       | 9.39M/41.5M [00:02&lt;00:03, 8.90MB/s]
- 25%|##4       | 10.3M/41.5M [00:02&lt;00:03, 8.32MB/s]
- 27%|##6       | 11.2M/41.5M [00:03&lt;00:04, 7.02MB/s]
- 30%|##9       | 12.4M/41.5M [00:03&lt;00:04, 6.96MB/s]
- 33%|###3      | 13.9M/41.5M [00:03&lt;00:03, 7.35MB/s]
- 37%|###6      | 15.3M/41.5M [00:03&lt;00:03, 8.89MB/s]
- 39%|###9      | 16.3M/41.5M [00:03&lt;00:03, 8.35MB/s]
- 41%|####1     | 17.1M/41.5M [00:03&lt;00:03, 7.82MB/s]
- 44%|####4     | 18.3M/41.5M [00:03&lt;00:02, 8.76MB/s]
- 46%|####6     | 19.1M/41.5M [00:04&lt;00:02, 8.10MB/s]
- 48%|####8     | 20.0M/41.5M [00:04&lt;00:03, 7.52MB/s]
- 51%|#####1    | 21.2M/41.5M [00:04&lt;00:02, 8.75MB/s]
- 53%|#####3    | 22.1M/41.5M [00:04&lt;00:02, 8.08MB/s]
- 55%|#####5    | 22.9M/41.5M [00:04&lt;00:02, 7.47MB/s]
- 58%|#####8    | 24.1M/41.5M [00:04&lt;00:02, 8.66MB/s]
- 60%|######    | 25.0M/41.5M [00:04&lt;00:02, 8.11MB/s]
- 62%|######2   | 25.8M/41.5M [00:04&lt;00:02, 7.49MB/s]
- 65%|######5   | 27.0M/41.5M [00:05&lt;00:01, 8.67MB/s]
- 67%|######7   | 27.9M/41.5M [00:05&lt;00:01, 8.09MB/s]
- 69%|######9   | 28.7M/41.5M [00:05&lt;00:01, 7.46MB/s]
- 72%|#######2  | 30.0M/41.5M [00:05&lt;00:01, 8.69MB/s]
- 74%|#######4  | 30.9M/41.5M [00:05&lt;00:01, 8.13MB/s]
- 76%|#######6  | 31.7M/41.5M [00:05&lt;00:01, 7.48MB/s]
- 79%|#######9  | 32.9M/41.5M [00:05&lt;00:01, 8.72MB/s]
- 81%|########1 | 33.8M/41.5M [00:05&lt;00:00, 8.14MB/s]
- 83%|########3 | 34.6M/41.5M [00:06&lt;00:00, 7.50MB/s]
- 86%|########6 | 35.9M/41.5M [00:06&lt;00:00, 8.75MB/s]
- 89%|########8 | 36.7M/41.5M [00:06&lt;00:00, 8.15MB/s]
- 90%|######### | 37.5M/41.5M [00:06&lt;00:00, 7.51MB/s]
- 94%|#########3| 38.8M/41.5M [00:06&lt;00:00, 8.71MB/s]
- 96%|#########5| 39.7M/41.5M [00:06&lt;00:00, 8.15MB/s]
- 98%|#########7| 40.5M/41.5M [00:06&lt;00:00, 7.51MB/s]
-100%|##########| 41.5M/41.5M [00:06&lt;00:00, 6.32MB/s]
+  0%|          | 16.0k/41.5M [00:00&lt;08:38, 83.8kB/s]
+  0%|          | 48.0k/41.5M [00:00&lt;05:45, 126kB/s]
+  0%|          | 96.0k/41.5M [00:00&lt;04:13, 172kB/s]
+  0%|          | 160k/41.5M [00:00&lt;03:03, 236kB/s]
+  1%|          | 216k/41.5M [00:01&lt;02:49, 256kB/s]
+  1%|          | 280k/41.5M [00:01&lt;02:38, 272kB/s]
+  1%|          | 336k/41.5M [00:01&lt;02:39, 270kB/s]
+  1%|          | 408k/41.5M [00:01&lt;02:20, 306kB/s]
+  1%|1         | 480k/41.5M [00:01&lt;02:11, 327kB/s]
+  1%|1         | 552k/41.5M [00:02&lt;02:09, 332kB/s]
+  2%|1         | 640k/41.5M [00:02&lt;01:59, 358kB/s]
+  2%|1         | 720k/41.5M [00:02&lt;01:52, 380kB/s]
+  2%|1         | 808k/41.5M [00:02&lt;01:46, 401kB/s]
+  2%|2         | 904k/41.5M [00:02&lt;01:42, 417kB/s]
+  2%|2         | 0.98M/41.5M [00:03&lt;01:36, 440kB/s]
+  3%|2         | 1.09M/41.5M [00:03&lt;01:29, 476kB/s]
+  3%|2         | 1.19M/41.5M [00:03&lt;01:25, 492kB/s]
+  3%|3         | 1.30M/41.5M [00:03&lt;01:23, 503kB/s]
+  3%|3         | 1.41M/41.5M [00:03&lt;01:19, 529kB/s]
+  4%|3         | 1.54M/41.5M [00:04&lt;01:12, 576kB/s]
+  4%|4         | 1.67M/41.5M [00:04&lt;01:08, 614kB/s]
+  4%|4         | 1.80M/41.5M [00:04&lt;01:06, 624kB/s]
+  5%|4         | 1.95M/41.5M [00:04&lt;01:04, 641kB/s]
+  5%|5         | 2.09M/41.5M [00:04&lt;00:59, 694kB/s]
+  5%|5         | 2.26M/41.5M [00:05&lt;00:55, 744kB/s]
+  6%|5         | 2.41M/41.5M [00:05&lt;00:54, 749kB/s]
+  6%|6         | 2.59M/41.5M [00:05&lt;00:51, 788kB/s]
+  7%|6         | 2.77M/41.5M [00:05&lt;00:47, 848kB/s]
+  7%|7         | 2.97M/41.5M [00:05&lt;00:44, 899kB/s]
+  8%|7         | 3.16M/41.5M [00:06&lt;00:43, 914kB/s]
+  8%|8         | 3.37M/41.5M [00:06&lt;00:42, 943kB/s]
+  9%|8         | 3.59M/41.5M [00:06&lt;00:38, 1.02MB/s]
+  9%|9         | 3.82M/41.5M [00:06&lt;00:36, 1.08MB/s]
+ 10%|9         | 4.05M/41.5M [00:06&lt;00:35, 1.10MB/s]
+ 10%|#         | 4.31M/41.5M [00:07&lt;00:33, 1.15MB/s]
+ 11%|#1        | 4.57M/41.5M [00:07&lt;00:31, 1.23MB/s]
+ 12%|#1        | 4.85M/41.5M [00:07&lt;00:29, 1.30MB/s]
+ 12%|#2        | 5.15M/41.5M [00:07&lt;00:28, 1.34MB/s]
+ 13%|#3        | 5.45M/41.5M [00:07&lt;00:27, 1.39MB/s]
+ 14%|#3        | 5.76M/41.5M [00:08&lt;00:25, 1.48MB/s]
+ 15%|#4        | 6.08M/41.5M [00:08&lt;00:24, 1.55MB/s]
+ 15%|#5        | 6.41M/41.5M [00:08&lt;00:23, 1.57MB/s]
+ 16%|#6        | 6.77M/41.5M [00:08&lt;00:18, 1.93MB/s]
+ 17%|#6        | 6.98M/41.5M [00:08&lt;00:18, 1.93MB/s]
+ 17%|#7        | 7.18M/41.5M [00:09&lt;00:21, 1.66MB/s]
+ 18%|#8        | 7.53M/41.5M [00:09&lt;00:20, 1.73MB/s]
+ 19%|#9        | 7.95M/41.5M [00:09&lt;00:19, 1.83MB/s]
+ 20%|##        | 8.38M/41.5M [00:09&lt;00:18, 1.92MB/s]
+ 21%|##1       | 8.84M/41.5M [00:09&lt;00:16, 2.11MB/s]
+ 22%|##2       | 9.32M/41.5M [00:09&lt;00:13, 2.49MB/s]
+ 24%|##3       | 9.77M/41.5M [00:10&lt;00:11, 2.84MB/s]
+ 24%|##4       | 10.1M/41.5M [00:10&lt;00:12, 2.54MB/s]
+ 25%|##5       | 10.4M/41.5M [00:10&lt;00:14, 2.27MB/s]
+ 26%|##6       | 11.0M/41.5M [00:10&lt;00:10, 2.99MB/s]
+ 27%|##7       | 11.3M/41.5M [00:10&lt;00:10, 3.01MB/s]
+ 28%|##7       | 11.6M/41.5M [00:10&lt;00:12, 2.54MB/s]
+ 29%|##9       | 12.2M/41.5M [00:11&lt;00:10, 2.82MB/s]
+ 31%|###1      | 12.9M/41.5M [00:11&lt;00:08, 3.53MB/s]
+ 32%|###1      | 13.2M/41.5M [00:11&lt;00:08, 3.38MB/s]
+ 33%|###2      | 13.6M/41.5M [00:11&lt;00:08, 3.36MB/s]
+ 34%|###4      | 14.3M/41.5M [00:11&lt;00:07, 4.06MB/s]
+ 35%|###5      | 14.7M/41.5M [00:11&lt;00:07, 3.66MB/s]
+ 37%|###6      | 15.1M/41.5M [00:11&lt;00:07, 3.69MB/s]
+ 38%|###7      | 15.7M/41.5M [00:11&lt;00:06, 4.08MB/s]
+ 39%|###8      | 16.1M/41.5M [00:12&lt;00:07, 3.36MB/s]
+ 41%|####      | 16.8M/41.5M [00:12&lt;00:06, 4.18MB/s]
+ 42%|####2     | 17.6M/41.5M [00:12&lt;00:05, 4.96MB/s]
+ 44%|####3     | 18.1M/41.5M [00:12&lt;00:05, 4.44MB/s]
+ 45%|####5     | 18.7M/41.5M [00:12&lt;00:05, 4.48MB/s]
+ 47%|####6     | 19.4M/41.5M [00:12&lt;00:04, 4.87MB/s]
+ 48%|####7     | 19.8M/41.5M [00:12&lt;00:05, 4.09MB/s]
+ 50%|#####     | 20.8M/41.5M [00:13&lt;00:04, 5.27MB/s]
+ 52%|#####1    | 21.5M/41.5M [00:13&lt;00:03, 5.80MB/s]
+ 53%|#####3    | 22.1M/41.5M [00:13&lt;00:04, 4.94MB/s]
+ 56%|#####5    | 23.1M/41.5M [00:13&lt;00:03, 5.83MB/s]
+ 57%|#####7    | 23.8M/41.5M [00:13&lt;00:02, 6.28MB/s]
+ 59%|#####8    | 24.4M/41.5M [00:13&lt;00:03, 5.10MB/s]
+ 62%|######1   | 25.6M/41.5M [00:13&lt;00:02, 6.65MB/s]
+ 64%|######3   | 26.5M/41.5M [00:13&lt;00:02, 7.27MB/s]
+ 66%|######5   | 27.3M/41.5M [00:14&lt;00:02, 6.18MB/s]
+ 69%|######8   | 28.4M/41.5M [00:14&lt;00:01, 7.41MB/s]
+ 70%|#######   | 29.2M/41.5M [00:14&lt;00:01, 7.33MB/s]
+ 72%|#######2  | 30.0M/41.5M [00:14&lt;00:02, 5.90MB/s]
+ 76%|#######5  | 31.4M/41.5M [00:14&lt;00:01, 7.08MB/s]
+ 79%|#######9  | 32.8M/41.5M [00:14&lt;00:01, 7.35MB/s]
+ 83%|########2 | 34.3M/41.5M [00:14&lt;00:00, 8.86MB/s]
+ 85%|########4 | 35.2M/41.5M [00:15&lt;00:00, 7.87MB/s]
+ 87%|########6 | 36.0M/41.5M [00:15&lt;00:00, 6.53MB/s]
+ 90%|########9 | 37.2M/41.5M [00:15&lt;00:00, 6.65MB/s]
+ 93%|#########3| 38.7M/41.5M [00:15&lt;00:00, 8.34MB/s]
+ 95%|#########5| 39.6M/41.5M [00:15&lt;00:00, 8.45MB/s]
+ 98%|#########7| 40.5M/41.5M [00:15&lt;00:00, 6.84MB/s]
+100%|##########| 41.5M/41.5M [00:16&lt;00:00, 2.72MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_paddle.html b/docs/how_to/compile_models/from_paddle.html
index fe0296855..e080e4d9d 100644
--- a/docs/how_to/compile_models/from_paddle.html
+++ b/docs/how_to/compile_models/from_paddle.html
@@ -464,7 +464,7 @@ A quick solution is</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>TVM prediction top-1 id: 282, class name:  282: &#39;tiger cat&#39;,
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  13.822 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  5.545 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-paddle-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/16269b77359771348d507395692524cf/from_paddle.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_paddle.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/from_pytorch.html b/docs/how_to/compile_models/from_pytorch.html
index 120c3d0a2..fa5db7a5e 100644
--- a/docs/how_to/compile_models/from_pytorch.html
+++ b/docs/how_to/compile_models/from_pytorch.html
@@ -387,10 +387,8 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/resnet18-f37072fd.pth&quot; to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
 
   0%|          | 0.00/44.7M [00:00&lt;?, ?B/s]
- 15%|#5        | 6.88M/44.7M [00:00&lt;00:00, 72.1MB/s]
- 50%|####9     | 22.2M/44.7M [00:00&lt;00:00, 124MB/s]
- 92%|#########1| 40.9M/44.7M [00:00&lt;00:00, 149MB/s]
-100%|##########| 44.7M/44.7M [00:00&lt;00:00, 124MB/s]
+ 42%|####1     | 18.7M/44.7M [00:00&lt;00:00, 197MB/s]
+100%|##########| 44.7M/44.7M [00:00&lt;00:00, 242MB/s]
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/compile_models/from_tensorflow.html b/docs/how_to/compile_models/from_tensorflow.html
index 465220e85..5151fb6d4 100644
--- a/docs/how_to/compile_models/from_tensorflow.html
+++ b/docs/how_to/compile_models/from_tensorflow.html
@@ -607,7 +607,7 @@ banana (score = 0.00022)
 desk (score = 0.00019)
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  2.951 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  3.090 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-compile-models-from-tensorflow-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7f1d3d1b878694c201c614c807cdebc8/from_tensorflow.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">from_tensorflow.py</span></code></a></p>
diff --git a/docs/how_to/compile_models/sg_execution_times.html b/docs/how_to/compile_models/sg_execution_times.html
index b8342b4a7..ec4a8867d 100644
--- a/docs/how_to/compile_models/sg_execution_times.html
+++ b/docs/how_to/compile_models/sg_execution_times.html
@@ -300,18 +300,18 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-compile-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>05:29.219</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
+<p><strong>05:32.921</strong> total execution time for <strong>how_to_compile_models</strong> files:</p>
 <ul class="simple">
-<li><p><strong>01:13.822</strong>: <a class="reference internal" href="from_paddle.html#sphx-glr-how-to-compile-models-from-paddle-py"><span class="std std-ref">Compile PaddlePaddle Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_paddle.py</span></code>)</p></li>
-<li><p><strong>01:02.951</strong>: <a class="reference internal" href="from_tensorflow.html#sphx-glr-how-to-compile-models-from-tensorflow-py"><span class="std std-ref">Compile Tensorflow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tensorflow.py</span></code>)</p></li>
-<li><p><strong>00:57.140</strong>: <a class="reference internal" href="from_darknet.html#sphx-glr-how-to-compile-models-from-darknet-py"><span class="std std-ref">Compile YOLO-V2 and YOLO-V3 in DarkNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_darknet.py</span></code>)</p></li>
-<li><p><strong>00:31.855</strong>: <a class="reference internal" href="from_oneflow.html#sphx-glr-how-to-compile-models-from-oneflow-py"><span class="std std-ref">Compile OneFlow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_oneflow.py</span></code>)</p></li>
-<li><p><strong>00:25.538</strong>: <a class="reference internal" href="from_tflite.html#sphx-glr-how-to-compile-models-from-tflite-py"><span class="std std-ref">Compile TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tflite.py</span></code>)</p></li>
-<li><p><strong>00:21.532</strong>: <a class="reference internal" href="from_coreml.html#sphx-glr-how-to-compile-models-from-coreml-py"><span class="std std-ref">Compile CoreML Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_coreml.py</span></code>)</p></li>
-<li><p><strong>00:21.205</strong>: <a class="reference internal" href="from_mxnet.html#sphx-glr-how-to-compile-models-from-mxnet-py"><span class="std std-ref">Compile MXNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_mxnet.py</span></code>)</p></li>
-<li><p><strong>00:18.939</strong>: <a class="reference internal" href="from_pytorch.html#sphx-glr-how-to-compile-models-from-pytorch-py"><span class="std std-ref">Compile PyTorch Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_pytorch.py</span></code>)</p></li>
-<li><p><strong>00:13.521</strong>: <a class="reference internal" href="from_keras.html#sphx-glr-how-to-compile-models-from-keras-py"><span class="std std-ref">Compile Keras Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_keras.py</span></code>)</p></li>
-<li><p><strong>00:02.716</strong>: <a class="reference internal" href="from_onnx.html#sphx-glr-how-to-compile-models-from-onnx-py"><span class="std std-ref">Compile ONNX Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_onnx.py</span></code>)</p></li>
+<li><p><strong>01:05.545</strong>: <a class="reference internal" href="from_paddle.html#sphx-glr-how-to-compile-models-from-paddle-py"><span class="std std-ref">Compile PaddlePaddle Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_paddle.py</span></code>)</p></li>
+<li><p><strong>01:03.090</strong>: <a class="reference internal" href="from_tensorflow.html#sphx-glr-how-to-compile-models-from-tensorflow-py"><span class="std std-ref">Compile Tensorflow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tensorflow.py</span></code>)</p></li>
+<li><p><strong>00:57.015</strong>: <a class="reference internal" href="from_darknet.html#sphx-glr-how-to-compile-models-from-darknet-py"><span class="std std-ref">Compile YOLO-V2 and YOLO-V3 in DarkNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_darknet.py</span></code>)</p></li>
+<li><p><strong>00:40.008</strong>: <a class="reference internal" href="from_oneflow.html#sphx-glr-how-to-compile-models-from-oneflow-py"><span class="std std-ref">Compile OneFlow Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_oneflow.py</span></code>)</p></li>
+<li><p><strong>00:27.136</strong>: <a class="reference internal" href="from_tflite.html#sphx-glr-how-to-compile-models-from-tflite-py"><span class="std std-ref">Compile TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_tflite.py</span></code>)</p></li>
+<li><p><strong>00:22.196</strong>: <a class="reference internal" href="from_mxnet.html#sphx-glr-how-to-compile-models-from-mxnet-py"><span class="std std-ref">Compile MXNet Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_mxnet.py</span></code>)</p></li>
+<li><p><strong>00:21.117</strong>: <a class="reference internal" href="from_coreml.html#sphx-glr-how-to-compile-models-from-coreml-py"><span class="std std-ref">Compile CoreML Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_coreml.py</span></code>)</p></li>
+<li><p><strong>00:19.158</strong>: <a class="reference internal" href="from_pytorch.html#sphx-glr-how-to-compile-models-from-pytorch-py"><span class="std std-ref">Compile PyTorch Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_pytorch.py</span></code>)</p></li>
+<li><p><strong>00:14.886</strong>: <a class="reference internal" href="from_keras.html#sphx-glr-how-to-compile-models-from-keras-py"><span class="std std-ref">Compile Keras Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_keras.py</span></code>)</p></li>
+<li><p><strong>00:02.770</strong>: <a class="reference internal" href="from_onnx.html#sphx-glr-how-to-compile-models-from-onnx-py"><span class="std std-ref">Compile ONNX Models</span></a> (<code class="docutils literal notranslate"><span class="pre">from_onnx.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/deploy_models/deploy_model_on_android.html b/docs/how_to/deploy_models/deploy_model_on_android.html
index a1d7eb459..b9465a62a 100644
--- a/docs/how_to/deploy_models/deploy_model_on_android.html
+++ b/docs/how_to/deploy_models/deploy_model_on_android.html
@@ -622,7 +622,7 @@ to the remote android device.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  16.1746      16.0475      16.9060      15.8896       0.3173
+  16.2819      16.1866      16.7154      16.1016       0.2104
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
index 9edb862af..a0bd9dd91 100644
--- a/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
+++ b/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
@@ -409,89 +409,40 @@ be unstable.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth&quot; to /workspace/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
 
   0%|          | 0.00/170M [00:00&lt;?, ?B/s]
-  1%|          | 1.25M/170M [00:00&lt;00:13, 12.9MB/s]
-  3%|2         | 4.25M/170M [00:00&lt;00:07, 23.5MB/s]
-  4%|3         | 6.50M/170M [00:00&lt;00:07, 22.2MB/s]
-  6%|5         | 9.38M/170M [00:00&lt;00:06, 25.1MB/s]
-  7%|7         | 12.2M/170M [00:00&lt;00:06, 26.5MB/s]
-  9%|8         | 14.8M/170M [00:00&lt;00:07, 22.6MB/s]
- 11%|#         | 18.2M/170M [00:00&lt;00:05, 26.5MB/s]
- 12%|#2        | 20.9M/170M [00:00&lt;00:06, 25.2MB/s]
- 14%|#3        | 23.4M/170M [00:01&lt;00:08, 18.9MB/s]
- 15%|#4        | 25.4M/170M [00:01&lt;00:08, 17.7MB/s]
- 16%|#6        | 27.3M/170M [00:01&lt;00:08, 16.8MB/s]
- 18%|#7        | 29.8M/170M [00:01&lt;00:07, 19.0MB/s]
- 19%|#8        | 32.2M/170M [00:01&lt;00:07, 20.4MB/s]
- 20%|##        | 34.5M/170M [00:01&lt;00:06, 21.3MB/s]
- 22%|##1       | 36.6M/170M [00:01&lt;00:08, 16.7MB/s]
- 23%|##2       | 38.6M/170M [00:02&lt;00:07, 17.6MB/s]
- 24%|##3       | 40.5M/170M [00:02&lt;00:07, 17.4MB/s]
- 25%|##4       | 42.4M/170M [00:02&lt;00:07, 18.2MB/s]
- 27%|##6       | 45.3M/170M [00:02&lt;00:06, 20.6MB/s]
- 28%|##7       | 47.4M/170M [00:02&lt;00:06, 18.8MB/s]
- 29%|##9       | 49.5M/170M [00:02&lt;00:06, 18.8MB/s]
- 30%|###       | 51.3M/170M [00:02&lt;00:07, 17.3MB/s]
- 31%|###1      | 53.0M/170M [00:02&lt;00:07, 16.7MB/s]
- 32%|###2      | 54.6M/170M [00:02&lt;00:07, 15.6MB/s]
- 33%|###3      | 56.2M/170M [00:03&lt;00:07, 15.7MB/s]
- 34%|###3      | 57.7M/170M [00:03&lt;00:07, 14.9MB/s]
- 35%|###5      | 59.6M/170M [00:03&lt;00:07, 15.9MB/s]
- 37%|###6      | 62.6M/170M [00:03&lt;00:05, 19.4MB/s]
- 39%|###8      | 65.9M/170M [00:03&lt;00:04, 23.5MB/s]
- 40%|####      | 68.4M/170M [00:03&lt;00:04, 23.7MB/s]
- 42%|####1     | 70.7M/170M [00:03&lt;00:04, 23.7MB/s]
- 43%|####2     | 73.0M/170M [00:03&lt;00:04, 21.3MB/s]
- 44%|####4     | 75.1M/170M [00:04&lt;00:06, 15.3MB/s]
- 46%|####5     | 77.7M/170M [00:04&lt;00:05, 17.8MB/s]
- 47%|####6     | 79.7M/170M [00:04&lt;00:05, 18.3MB/s]
- 48%|####8     | 81.7M/170M [00:04&lt;00:04, 18.8MB/s]
- 49%|####9     | 83.6M/170M [00:04&lt;00:05, 17.9MB/s]
- 51%|#####     | 85.9M/170M [00:04&lt;00:04, 19.4MB/s]
- 52%|#####2    | 88.7M/170M [00:04&lt;00:03, 22.2MB/s]
- 54%|#####3    | 90.9M/170M [00:04&lt;00:04, 18.4MB/s]
- 55%|#####4    | 93.3M/170M [00:05&lt;00:04, 19.9MB/s]
- 56%|#####6    | 95.4M/170M [00:05&lt;00:04, 18.5MB/s]
- 57%|#####7    | 97.2M/170M [00:05&lt;00:04, 17.1MB/s]
- 58%|#####8    | 99.0M/170M [00:05&lt;00:04, 17.1MB/s]
- 59%|#####9    | 101M/170M [00:05&lt;00:04, 17.1MB/s]
- 60%|######    | 102M/170M [00:05&lt;00:04, 16.1MB/s]
- 61%|######1   | 104M/170M [00:05&lt;00:05, 13.7MB/s]
- 62%|######1   | 105M/170M [00:05&lt;00:05, 12.4MB/s]
- 63%|######2   | 107M/170M [00:06&lt;00:04, 13.5MB/s]
- 64%|######4   | 109M/170M [00:06&lt;00:04, 15.1MB/s]
- 65%|######5   | 111M/170M [00:06&lt;00:03, 16.3MB/s]
- 67%|######7   | 115M/170M [00:06&lt;00:02, 22.2MB/s]
- 69%|######8   | 117M/170M [00:06&lt;00:02, 21.9MB/s]
- 71%|#######   | 120M/170M [00:06&lt;00:02, 24.5MB/s]
- 72%|#######1  | 122M/170M [00:06&lt;00:02, 21.4MB/s]
- 73%|#######3  | 124M/170M [00:06&lt;00:02, 19.0MB/s]
- 74%|#######4  | 126M/170M [00:07&lt;00:02, 18.4MB/s]
- 75%|#######5  | 128M/170M [00:07&lt;00:02, 17.7MB/s]
- 76%|#######6  | 130M/170M [00:07&lt;00:02, 16.9MB/s]
- 77%|#######7  | 131M/170M [00:07&lt;00:02, 15.1MB/s]
- 78%|#######8  | 133M/170M [00:07&lt;00:02, 14.6MB/s]
- 79%|#######9  | 134M/170M [00:07&lt;00:02, 13.7MB/s]
- 80%|########  | 136M/170M [00:07&lt;00:02, 14.9MB/s]
- 81%|########1 | 138M/170M [00:07&lt;00:02, 15.8MB/s]
- 82%|########2 | 140M/170M [00:07&lt;00:02, 15.2MB/s]
- 83%|########3 | 141M/170M [00:08&lt;00:02, 13.7MB/s]
- 84%|########3 | 143M/170M [00:08&lt;00:02, 14.3MB/s]
- 85%|########4 | 144M/170M [00:08&lt;00:01, 14.7MB/s]
- 86%|########5 | 146M/170M [00:08&lt;00:01, 15.3MB/s]
- 87%|########6 | 147M/170M [00:08&lt;00:01, 15.5MB/s]
- 88%|########7 | 149M/170M [00:08&lt;00:01, 15.5MB/s]
- 89%|########8 | 151M/170M [00:08&lt;00:01, 17.2MB/s]
- 90%|######### | 153M/170M [00:08&lt;00:01, 17.7MB/s]
- 91%|#########1| 155M/170M [00:08&lt;00:00, 18.9MB/s]
- 92%|#########2| 157M/170M [00:09&lt;00:00, 16.8MB/s]
- 94%|#########3| 159M/170M [00:09&lt;00:00, 17.5MB/s]
- 95%|#########4| 161M/170M [00:09&lt;00:00, 17.1MB/s]
- 96%|#########5| 162M/170M [00:09&lt;00:00, 16.4MB/s]
- 97%|#########6| 164M/170M [00:09&lt;00:00, 16.8MB/s]
- 98%|#########7| 166M/170M [00:09&lt;00:00, 15.2MB/s]
- 98%|#########8| 167M/170M [00:09&lt;00:00, 13.4MB/s]
- 99%|#########9| 169M/170M [00:09&lt;00:00, 13.1MB/s]
-100%|##########| 170M/170M [00:10&lt;00:00, 17.7MB/s]
+  3%|3         | 5.56M/170M [00:00&lt;00:02, 58.3MB/s]
+  7%|6         | 11.2M/170M [00:00&lt;00:02, 59.0MB/s]
+ 10%|9         | 16.9M/170M [00:00&lt;00:03, 49.8MB/s]
+ 13%|#2        | 21.8M/170M [00:00&lt;00:03, 48.8MB/s]
+ 16%|#5        | 26.9M/170M [00:00&lt;00:02, 50.5MB/s]
+ 19%|#9        | 32.9M/170M [00:00&lt;00:02, 54.0MB/s]
+ 23%|##2       | 38.5M/170M [00:00&lt;00:02, 55.6MB/s]
+ 26%|##5       | 44.0M/170M [00:00&lt;00:02, 56.2MB/s]
+ 29%|##9       | 49.5M/170M [00:00&lt;00:02, 56.6MB/s]
+ 32%|###2      | 54.9M/170M [00:01&lt;00:02, 50.5MB/s]
+ 35%|###5      | 59.9M/170M [00:01&lt;00:02, 49.9MB/s]
+ 39%|###8      | 65.8M/170M [00:01&lt;00:02, 53.3MB/s]
+ 42%|####2     | 71.4M/170M [00:01&lt;00:01, 54.6MB/s]
+ 45%|####5     | 76.7M/170M [00:01&lt;00:01, 52.1MB/s]
+ 48%|####8     | 82.3M/170M [00:01&lt;00:01, 54.2MB/s]
+ 52%|#####1    | 87.9M/170M [00:01&lt;00:01, 55.5MB/s]
+ 55%|#####4    | 93.3M/170M [00:01&lt;00:01, 42.7MB/s]
+ 59%|#####8    | 99.7M/170M [00:02&lt;00:01, 48.7MB/s]
+ 62%|######1   | 105M/170M [00:02&lt;00:01, 50.8MB/s]
+ 65%|######5   | 110M/170M [00:02&lt;00:01, 52.2MB/s]
+ 68%|######8   | 116M/170M [00:02&lt;00:01, 45.5MB/s]
+ 71%|#######   | 120M/170M [00:02&lt;00:01, 37.8MB/s]
+ 73%|#######3  | 124M/170M [00:02&lt;00:01, 37.5MB/s]
+ 76%|#######5  | 128M/170M [00:02&lt;00:01, 33.8MB/s]
+ 78%|#######7  | 132M/170M [00:02&lt;00:01, 34.4MB/s]
+ 80%|#######9  | 136M/170M [00:03&lt;00:00, 36.5MB/s]
+ 82%|########2 | 139M/170M [00:03&lt;00:00, 32.9MB/s]
+ 84%|########4 | 143M/170M [00:03&lt;00:00, 35.1MB/s]
+ 88%|########8 | 150M/170M [00:03&lt;00:00, 42.7MB/s]
+ 91%|#########1| 155M/170M [00:03&lt;00:00, 46.0MB/s]
+ 94%|#########3| 159M/170M [00:03&lt;00:00, 42.6MB/s]
+ 97%|#########6| 164M/170M [00:03&lt;00:00, 44.0MB/s]
+100%|#########9| 170M/170M [00:03&lt;00:00, 48.1MB/s]
+100%|##########| 170M/170M [00:03&lt;00:00, 46.5MB/s]
 /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3878: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
   for i in range(dim)
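The first warning above spells out the recommended way to copy-construct from an existing tensor. A minimal sketch of the two spellings (the tensor name src is illustrative, not from the tutorial):

    import torch

    src = torch.rand(3)
    copy = src.clone().detach()    # recommended: explicit copy, detached from the autograd graph
    # copy = torch.tensor(src)     # also copies, but emits the UserWarning shown above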
 /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/anchor_utils.py:127: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the &#39;trunc&#39; function NOT &#39;floor&#39;). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode=&#39;trunc&#39;), or for actual floor division, use torch.div(a, b, rounding_mode=&#39;floor&#39;).
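The second warning concerns the deprecated __floordiv__ rounding. A minimal sketch of the two replacements it suggests (the values are illustrative):

    import torch

    a, b = torch.tensor([-7]), torch.tensor([2])
    torch.div(a, b, rounding_mode='trunc')   # tensor([-3]): rounds toward zero, matches the current // behaviour
    torch.div(a, b, rounding_mode='floor')   # tensor([-4]): true floor division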
@@ -584,7 +535,7 @@ torchvision rcnn models.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Get 9 valid boxes
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  14.259 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 3 minutes  7.991 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-object-detection-pytorch-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7795da4b258c8feff986668b95ef57ad/deploy_object_detection_pytorch.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_object_detection_pytorch.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized.html b/docs/how_to/deploy_models/deploy_prequantized.html
index 2a3d30d76..3adbde136 100644
--- a/docs/how_to/deploy_models/deploy_prequantized.html
+++ b/docs/how_to/deploy_models/deploy_prequantized.html
@@ -450,13 +450,8 @@ training. Other models require a full post training calibration.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading: &quot;https://download.pytorch.org/models/mobilenet_v2-b0353104.pth&quot; to /workspace/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
 
   0%|          | 0.00/13.6M [00:00&lt;?, ?B/s]
- 14%|#3        | 1.88M/13.6M [00:00&lt;00:00, 18.4MB/s]
- 27%|##6       | 3.63M/13.6M [00:00&lt;00:00, 16.2MB/s]
- 40%|###9      | 5.39M/13.6M [00:00&lt;00:00, 17.1MB/s]
- 55%|#####4    | 7.44M/13.6M [00:00&lt;00:00, 18.5MB/s]
- 76%|#######6  | 10.3M/13.6M [00:00&lt;00:00, 22.3MB/s]
- 92%|#########1| 12.5M/13.6M [00:00&lt;00:00, 21.8MB/s]
-100%|##########| 13.6M/13.6M [00:00&lt;00:00, 20.2MB/s]
+ 93%|#########3| 12.6M/13.6M [00:00&lt;00:00, 132MB/s]
+100%|##########| 13.6M/13.6M [00:00&lt;00:00, 135MB/s]
 </pre></div>
 </div>
 </div>
@@ -545,7 +540,7 @@ output values are identical out of 1000 outputs from mobilenet v2.</p>
 <p class="sphx-glr-script-out">Out:</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  90.3776      90.2127      95.2388      90.0487       0.5623
+  90.4764      90.2552      98.0507      90.0430       0.8613
 </pre></div>
 </div>
 <div class="admonition note">
@@ -584,7 +579,7 @@ This includes support for the VNNI 8 bit dot product instruction (CascadeLake or
 <div class="section" id="deploy-a-quantized-tflite-model">
 <h2>Deploy a quantized TFLite Model<a class="headerlink" href="#deploy-a-quantized-tflite-model" title="Permalink to this headline">¶</a></h2>
 <p>TODO</p>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  5.547 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  4.715 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/fb8217c13f4351224c6cf3aacf1a87fc/deploy_prequantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_prequantized_tflite.html b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
index f898865fb..c3c58b4a0 100644
--- a/docs/how_to/deploy_models/deploy_prequantized_tflite.html
+++ b/docs/how_to/deploy_models/deploy_prequantized_tflite.html
@@ -540,7 +540,7 @@ TFLite Top-5 labels: [387 102 386 341 349]
 <p class="sphx-glr-script-out">Out:</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  119.8114     119.8220     121.4090     118.3415      0.4469
+  118.3109     118.2177     123.2838     117.5628      0.6162
 </pre></div>
 </div>
 <div class="admonition note">
@@ -568,7 +568,7 @@ network for ARM CPU</span></a>.</p></li>
 </ul>
 </div></blockquote>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  53.880 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  52.855 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-prequantized-tflite-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/56691c7a27d45da61d112276334640d3/deploy_prequantized_tflite.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_prequantized_tflite.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_quantized.html b/docs/how_to/deploy_models/deploy_quantized.html
index 39f05e4bf..9fe62c149 100644
--- a/docs/how_to/deploy_models/deploy_quantized.html
+++ b/docs/how_to/deploy_models/deploy_quantized.html
@@ -480,7 +480,7 @@ for calibration. But the accuracy might be impacted.</p>
   DeprecationWarning,
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  9.236 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  8.315 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-quantized-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/7810ecf51bfc05f7d5e8a400ac3e815d/deploy_quantized.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_quantized.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
index 95622a94b..d924f8b40 100644
--- a/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
+++ b/docs/how_to/deploy_models/deploy_ssd_gluoncv.html
@@ -415,23 +415,23 @@ to your device.</p>
 Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_voc-9c8b225a.zip...
 
   0%|          | 0/132723 [00:00&lt;?, ?KB/s]
-  5%|4         | 6265/132723 [00:00&lt;00:02, 62639.61KB/s]
- 11%|#         | 14572/132723 [00:00&lt;00:01, 74649.38KB/s]
- 17%|#6        | 22119/132723 [00:00&lt;00:01, 75021.64KB/s]
- 23%|##2       | 30458/132723 [00:00&lt;00:01, 78322.65KB/s]
- 29%|##9       | 38846/132723 [00:00&lt;00:01, 80324.08KB/s]
- 36%|###5      | 47194/132723 [00:00&lt;00:01, 81393.69KB/s]
- 42%|####1     | 55558/132723 [00:00&lt;00:00, 82125.15KB/s]
- 48%|####8     | 63909/132723 [00:00&lt;00:00, 82562.19KB/s]
- 54%|#####4    | 72166/132723 [00:00&lt;00:00, 82466.38KB/s]
- 61%|######    | 80454/132723 [00:01&lt;00:00, 82591.31KB/s]
- 67%|######6   | 88716/132723 [00:01&lt;00:00, 82599.03KB/s]
- 73%|#######3  | 97062/132723 [00:01&lt;00:00, 82854.96KB/s]
- 79%|#######9  | 105408/132723 [00:01&lt;00:00, 83036.17KB/s]
- 86%|########5 | 113757/132723 [00:01&lt;00:00, 83171.62KB/s]
- 92%|#########2| 122116/132723 [00:01&lt;00:00, 83294.74KB/s]
- 98%|#########8| 130486/132723 [00:01&lt;00:00, 83409.01KB/s]
-100%|##########| 132723/132723 [00:01&lt;00:00, 81493.10KB/s]
+  4%|4         | 5876/132723 [00:00&lt;00:02, 58755.15KB/s]
+ 10%|#         | 13601/132723 [00:00&lt;00:01, 69628.87KB/s]
+ 16%|#6        | 21466/132723 [00:00&lt;00:01, 73736.74KB/s]
+ 22%|##2       | 29391/132723 [00:00&lt;00:01, 75909.78KB/s]
+ 28%|##8       | 37372/132723 [00:00&lt;00:01, 77313.91KB/s]
+ 34%|###4      | 45435/132723 [00:00&lt;00:01, 78440.03KB/s]
+ 40%|####      | 53373/132723 [00:00&lt;00:01, 78743.71KB/s]
+ 46%|####6     | 61435/132723 [00:00&lt;00:00, 79339.56KB/s]
+ 52%|#####2    | 69408/132723 [00:00&lt;00:00, 79459.44KB/s]
+ 58%|#####8    | 77417/132723 [00:01&lt;00:00, 79653.12KB/s]
+ 64%|######4   | 85450/132723 [00:01&lt;00:00, 79858.90KB/s]
+ 70%|#######   | 93464/132723 [00:01&lt;00:00, 79942.17KB/s]
+ 76%|#######6  | 101497/132723 [00:01&lt;00:00, 80058.11KB/s]
+ 83%|########2 | 109526/132723 [00:01&lt;00:00, 80127.19KB/s]
+ 89%|########8 | 117539/132723 [00:01&lt;00:00, 80077.42KB/s]
+ 95%|#########4| 125649/132723 [00:01&lt;00:00, 80383.88KB/s]
+100%|##########| 132723/132723 [00:01&lt;00:00, 78565.74KB/s]
 </pre></div>
 </div>
 <p>Create TVM runtime and do inference
@@ -471,7 +471,7 @@ Downloading /workspace/.mxnet/models/ssd_512_resnet50_v1_voc-9c8b225a.zip from h
 </pre></div>
 </div>
 <img alt="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" class="sphx-glr-single-img" src="../../_images/sphx_glr_deploy_ssd_gluoncv_001.png" />
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  22.964 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  22.894 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-deploy-models-deploy-ssd-gluoncv-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/cccb17d28e5e8b2e94ea8cd5ec59f6ed/deploy_ssd_gluoncv.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">deploy_ssd_gluoncv.py</span></code></a></p>
diff --git a/docs/how_to/deploy_models/sg_execution_times.html b/docs/how_to/deploy_models/sg_execution_times.html
index 03b5fa17e..c4b41ac06 100644
--- a/docs/how_to/deploy_models/sg_execution_times.html
+++ b/docs/how_to/deploy_models/sg_execution_times.html
@@ -300,15 +300,15 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-deploy-models-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>10:35.869</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
+<p><strong>10:26.488</strong> total execution time for <strong>how_to_deploy_models</strong> files:</p>
 <ul class="simple">
-<li><p><strong>03:14.259</strong>: <a class="reference internal" href="deploy_object_detection_pytorch.html#sphx-glr-how-to-deploy-models-deploy-object-detection-pytorch-py"><span class="std std-ref">Compile PyTorch Object Detection Models</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_object_detection_pytorch.py</span></code>)</p></li>
-<li><p><strong>02:22.964</strong>: <a class="reference internal" href="deploy_ssd_gluoncv.html#sphx-glr-how-to-deploy-models-deploy-ssd-gluoncv-py"><span class="std std-ref">Deploy Single Shot Multibox Detector(SSD) model</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_ssd_gluoncv.py</span></code>)</p></li>
-<li><p><strong>01:53.880</strong>: <a class="reference internal" href="deploy_prequantized_tflite.html#sphx-glr-how-to-deploy-models-deploy-prequantized-tflite-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM - Part 3 (TFLite)</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized_tflite.py</span></code>)</p></li>
-<li><p><strong>01:09.236</strong>: <a class="reference internal" href="deploy_quantized.html#sphx-glr-how-to-deploy-models-deploy-quantized-py"><span class="std std-ref">Deploy a Quantized Model on Cuda</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_quantized.py</span></code>)</p></li>
-<li><p><strong>01:05.547</strong>: <a class="reference internal" href="deploy_prequantized.html#sphx-glr-how-to-deploy-models-deploy-prequantized-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized.py</span></code>)</p></li>
-<li><p><strong>00:28.224</strong>: <a class="reference internal" href="deploy_model_on_android.html#sphx-glr-how-to-deploy-models-deploy-model-on-android-py"><span class="std std-ref">Deploy the Pretrained Model on Android</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_android.py</span></code>)</p></li>
-<li><p><strong>00:21.561</strong>: <a class="reference internal" href="deploy_model_on_rasp.html#sphx-glr-how-to-deploy-models-deploy-model-on-rasp-py"><span class="std std-ref">Deploy the Pretrained Model on Raspberry Pi</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_rasp.py</span></code>)</p></li>
+<li><p><strong>03:07.991</strong>: <a class="reference internal" href="deploy_object_detection_pytorch.html#sphx-glr-how-to-deploy-models-deploy-object-detection-pytorch-py"><span class="std std-ref">Compile PyTorch Object Detection Models</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_object_detection_pytorch.py</span></code>)</p></li>
+<li><p><strong>02:22.894</strong>: <a class="reference internal" href="deploy_ssd_gluoncv.html#sphx-glr-how-to-deploy-models-deploy-ssd-gluoncv-py"><span class="std std-ref">Deploy Single Shot Multibox Detector(SSD) model</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_ssd_gluoncv.py</span></code>)</p></li>
+<li><p><strong>01:52.855</strong>: <a class="reference internal" href="deploy_prequantized_tflite.html#sphx-glr-how-to-deploy-models-deploy-prequantized-tflite-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM - Part 3 (TFLite)</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized_tflite.py</span></code>)</p></li>
+<li><p><strong>01:08.315</strong>: <a class="reference internal" href="deploy_quantized.html#sphx-glr-how-to-deploy-models-deploy-quantized-py"><span class="std std-ref">Deploy a Quantized Model on Cuda</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_quantized.py</span></code>)</p></li>
+<li><p><strong>01:04.715</strong>: <a class="reference internal" href="deploy_prequantized.html#sphx-glr-how-to-deploy-models-deploy-prequantized-py"><span class="std std-ref">Deploy a Framework-prequantized Model with TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_prequantized.py</span></code>)</p></li>
+<li><p><strong>00:28.055</strong>: <a class="reference internal" href="deploy_model_on_android.html#sphx-glr-how-to-deploy-models-deploy-model-on-android-py"><span class="std std-ref">Deploy the Pretrained Model on Android</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_android.py</span></code>)</p></li>
+<li><p><strong>00:21.464</strong>: <a class="reference internal" href="deploy_model_on_rasp.html#sphx-glr-how-to-deploy-models-deploy-model-on-rasp-py"><span class="std std-ref">Deploy the Pretrained Model on Raspberry Pi</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_model_on_rasp.py</span></code>)</p></li>
 <li><p><strong>00:00.199</strong>: <a class="reference internal" href="deploy_sparse.html#sphx-glr-how-to-deploy-models-deploy-sparse-py"><span class="std std-ref">Deploy a Hugging Face Pruned Model on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">deploy_sparse.py</span></code>)</p></li>
 </ul>
 </div>
diff --git a/docs/how_to/extend_tvm/bring_your_own_datatypes.html b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
index 0643b602f..ff0595464 100644
--- a/docs/how_to/extend_tvm/bring_your_own_datatypes.html
+++ b/docs/how_to/extend_tvm/bring_your_own_datatypes.html
@@ -588,7 +588,7 @@ In this alpha state of the Bring Your Own Datatypes framework, we have not imple
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zipb93ec226-55dc-4bbf-a017-1b11d08f2501 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Downloading /workspace/.mxnet/models/mobilenet0.25-9f83e440.zip80a2fdd9-51f0-4e24-801b-fedc14ead5c2 from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/mobilenet0.25-9f83e440.zip...
 </pre></div>
 </div>
 <p>It’s easy to execute MobileNet with native TVM:</p>
@@ -650,7 +650,7 @@ In this alpha state of the Bring Your Own Datatypes framework, we have not imple
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Check failed: (lower) is false: Intrinsic lowering function for target llvm, intrinsic name tir.sqrt, type 150 not found
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Check failed: (lower) is false: FloatImm lowering function for target llvm type 150 not found
 </pre></div>
 </div>
 <p>When we attempt to run the model, we get a familiar error telling us that more functions need to be registerd for myfloat.</p>
diff --git a/docs/how_to/extend_tvm/sg_execution_times.html b/docs/how_to/extend_tvm/sg_execution_times.html
index 5b122e7bc..a1153686a 100644
--- a/docs/how_to/extend_tvm/sg_execution_times.html
+++ b/docs/how_to/extend_tvm/sg_execution_times.html
@@ -300,12 +300,12 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-extend-tvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:38.899</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
+<p><strong>00:38.595</strong> total execution time for <strong>how_to_extend_tvm</strong> files:</p>
 <ul class="simple">
-<li><p><strong>00:35.345</strong>: <a class="reference internal" href="bring_your_own_datatypes.html#sphx-glr-how-to-extend-tvm-bring-your-own-datatypes-py"><span class="std std-ref">Bring Your Own Datatypes to TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">bring_your_own_datatypes.py</span></code>)</p></li>
-<li><p><strong>00:02.255</strong>: <a class="reference internal" href="use_pass_instrument.html#sphx-glr-how-to-extend-tvm-use-pass-instrument-py"><span class="std std-ref">How to Use TVM Pass Instrument</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_instrument.py</span></code>)</p></li>
-<li><p><strong>00:01.087</strong>: <a class="reference internal" href="use_pass_infra.html#sphx-glr-how-to-extend-tvm-use-pass-infra-py"><span class="std std-ref">How to Use TVM Pass Infra</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_infra.py</span></code>)</p></li>
-<li><p><strong>00:00.212</strong>: <a class="reference internal" href="low_level_custom_pass.html#sphx-glr-how-to-extend-tvm-low-level-custom-pass-py"><span class="std std-ref">Writing a Customized Pass</span></a> (<code class="docutils literal notranslate"><span class="pre">low_level_custom_pass.py</span></code>)</p></li>
+<li><p><strong>00:35.043</strong>: <a class="reference internal" href="bring_your_own_datatypes.html#sphx-glr-how-to-extend-tvm-bring-your-own-datatypes-py"><span class="std std-ref">Bring Your Own Datatypes to TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">bring_your_own_datatypes.py</span></code>)</p></li>
+<li><p><strong>00:02.252</strong>: <a class="reference internal" href="use_pass_instrument.html#sphx-glr-how-to-extend-tvm-use-pass-instrument-py"><span class="std std-ref">How to Use TVM Pass Instrument</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_instrument.py</span></code>)</p></li>
+<li><p><strong>00:01.093</strong>: <a class="reference internal" href="use_pass_infra.html#sphx-glr-how-to-extend-tvm-use-pass-infra-py"><span class="std std-ref">How to Use TVM Pass Infra</span></a> (<code class="docutils literal notranslate"><span class="pre">use_pass_infra.py</span></code>)</p></li>
+<li><p><strong>00:00.206</strong>: <a class="reference internal" href="low_level_custom_pass.html#sphx-glr-how-to-extend-tvm-low-level-custom-pass-py"><span class="std std-ref">Writing a Customized Pass</span></a> (<code class="docutils literal notranslate"><span class="pre">low_level_custom_pass.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/extend_tvm/use_pass_instrument.html b/docs/how_to/extend_tvm/use_pass_instrument.html
index 7812189f1..7f0c7e9e0 100644
--- a/docs/how_to/extend_tvm/use_pass_instrument.html
+++ b/docs/how_to/extend_tvm/use_pass_instrument.html
@@ -486,10 +486,10 @@ profile the execution time of each pass.</p>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 6080us [6080us] (45.57%; 45.57%)
-FoldScaleAxis: 7262us [2us] (54.43%; 54.43%)
-        FoldConstant: 7260us [1500us] (54.41%; 99.97%)
-                InferType: 5760us [5760us] (43.17%; 79.34%)
+InferType: 6089us [6089us] (45.47%; 45.47%)
+FoldScaleAxis: 7302us [3us] (54.53%; 54.53%)
+        FoldConstant: 7300us [1492us] (54.51%; 99.97%)
+                InferType: 5808us [5808us] (43.37%; 79.56%)
 </pre></div>
 </div>
 </div>
@@ -512,10 +512,10 @@ Refer to following sections and <a class="reference internal" href="../../refere
 </div>
 <p class="sphx-glr-script-out">Out:</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Printing results of timing profile...
-InferType: 5814us [5814us] (44.58%; 44.58%)
-FoldScaleAxis: 7228us [2us] (55.42%; 55.42%)
-        FoldConstant: 7226us [1506us] (55.40%; 99.97%)
-                InferType: 5720us [5720us] (43.86%; 79.16%)
+InferType: 5843us [5843us] (44.51%; 44.51%)
+FoldScaleAxis: 7284us [2us] (55.49%; 55.49%)
+        FoldConstant: 7282us [1537us] (55.47%; 99.97%)
+                InferType: 5745us [5745us] (43.76%; 78.89%)
 </pre></div>
 </div>
 <p>Register empty list to clear existing instruments.</p>
diff --git a/docs/how_to/optimize_operators/opt_conv_cuda.html b/docs/how_to/optimize_operators/opt_conv_cuda.html
index 474be1165..1b6f0904c 100644
--- a/docs/how_to/optimize_operators/opt_conv_cuda.html
+++ b/docs/how_to/optimize_operators/opt_conv_cuda.html
@@ -534,7 +534,7 @@ latency of convolution.</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 35.935623 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Convolution: 54.175673 ms
 </pre></div>
 </div>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-optimize-operators-opt-conv-cuda-py">
diff --git a/docs/how_to/optimize_operators/opt_conv_tensorcore.html b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
index 937ef6fa6..24577776e 100644
--- a/docs/how_to/optimize_operators/opt_conv_tensorcore.html
+++ b/docs/how_to/optimize_operators/opt_conv_tensorcore.html
@@ -878,7 +878,7 @@ be able to run on our build server</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 9.161962 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>conv2d with tensor core: 7.100658 ms
 </pre></div>
 </div>
 </div>
diff --git a/docs/how_to/optimize_operators/opt_gemm.html b/docs/how_to/optimize_operators/opt_gemm.html
index d3c11a629..5d6841380 100644
--- a/docs/how_to/optimize_operators/opt_gemm.html
+++ b/docs/how_to/optimize_operators/opt_gemm.html
@@ -431,8 +431,8 @@ Then we write a baseline implementation, the simplest way to write a matrix mult
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018871
-Baseline: 3.346656
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Numpy running time: 0.018666
+Baseline: 3.344988
 </pre></div>
 </div>
 <p>In TVM, we can always inspect lower level IR to debug or optimize our schedule.
@@ -494,7 +494,7 @@ fill 32 * 32 * sizeof(float) which is 4KB in the cache whose total size is 32KB
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.300723
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt1: 0.298091
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -563,7 +563,7 @@ vastly.</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.335491
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt2: 0.334549
 </pre></div>
 </div>
 <p>Here is the generated IR after vectorization.</p>
@@ -626,7 +626,7 @@ the access pattern for A matrix is more cache friendly.</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.118515
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt3: 0.119140
 </pre></div>
 </div>
 <p>Here is the generated IR after loop permutation.</p>
@@ -711,7 +711,7 @@ flattening.</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.112324
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt4: 0.111795
 </pre></div>
 </div>
 <p>Here is the generated IR after array packing.</p>
@@ -799,7 +799,7 @@ write to C when all the block results are ready.</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.111561
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt5: 0.111665
 </pre></div>
 </div>
 <p>Here is the generated IR after blocking.</p>
@@ -891,7 +891,7 @@ write to C when all the block results are ready.</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.145279
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Opt6: 0.144061
 </pre></div>
 </div>
 <p>Here is the generated IR after parallelization.</p>
diff --git a/docs/how_to/optimize_operators/sg_execution_times.html b/docs/how_to/optimize_operators/sg_execution_times.html
index d10d21e16..55d5dbdac 100644
--- a/docs/how_to/optimize_operators/sg_execution_times.html
+++ b/docs/how_to/optimize_operators/sg_execution_times.html
@@ -300,11 +300,11 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-optimize-operators-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:34.993</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
+<p><strong>00:34.633</strong> total execution time for <strong>how_to_optimize_operators</strong> files:</p>
 <ul class="simple">
-<li><p><strong>00:32.258</strong>: <a class="reference internal" href="opt_gemm.html#sphx-glr-how-to-optimize-operators-opt-gemm-py"><span class="std std-ref">How to optimize GEMM on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_gemm.py</span></code>)</p></li>
-<li><p><strong>00:01.509</strong>: <a class="reference internal" href="opt_conv_tensorcore.html#sphx-glr-how-to-optimize-operators-opt-conv-tensorcore-py"><span class="std std-ref">How to optimize convolution using TensorCores</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_tensorcore.py</span></code>)</p></li>
-<li><p><strong>00:01.226</strong>: <a class="reference internal" href="opt_conv_cuda.html#sphx-glr-how-to-optimize-operators-opt-conv-cuda-py"><span class="std std-ref">How to optimize convolution on GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_cuda.py</span></code>)</p></li>
+<li><p><strong>00:32.068</strong>: <a class="reference internal" href="opt_gemm.html#sphx-glr-how-to-optimize-operators-opt-gemm-py"><span class="std std-ref">How to optimize GEMM on CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_gemm.py</span></code>)</p></li>
+<li><p><strong>00:01.352</strong>: <a class="reference internal" href="opt_conv_tensorcore.html#sphx-glr-how-to-optimize-operators-opt-conv-tensorcore-py"><span class="std std-ref">How to optimize convolution using TensorCores</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_tensorcore.py</span></code>)</p></li>
+<li><p><strong>00:01.213</strong>: <a class="reference internal" href="opt_conv_cuda.html#sphx-glr-how-to-optimize-operators-opt-conv-cuda-py"><span class="std std-ref">How to optimize convolution on GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">opt_conv_cuda.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
index b922586dd..0e362caac 100644
--- a/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
+++ b/docs/how_to/tune_with_autoscheduler/sg_execution_times.html
@@ -300,14 +300,14 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autoscheduler-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>04:55.065</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
+<p><strong>04:55.001</strong> total execution time for <strong>how_to_tune_with_autoscheduler</strong> files:</p>
 <ul class="simple">
-<li><p><strong>02:20.665</strong>: <a class="reference internal" href="tune_conv2d_layer_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py"><span class="std std-ref">Auto-scheduling a Convolution Layer for GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_layer_cuda.py</span></code>)</p></li>
-<li><p><strong>01:20.512</strong>: <a class="reference internal" href="tune_network_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-x86-py"><span class="std std-ref">Auto-scheduling a Neural Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_x86.py</span></code>)</p></li>
-<li><p><strong>00:40.577</strong>: <a class="reference internal" href="tune_network_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-cuda-py"><span class="std std-ref">Auto-scheduling a Neural Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_cuda.py</span></code>)</p></li>
-<li><p><strong>00:15.818</strong>: <a class="reference internal" href="tune_sparse_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-sparse-x86-py"><span class="std std-ref">Auto-scheduling Sparse Matrix Multiplication on CPU with Custom Sketch Rule</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_sparse_x86.py</span></code>)</p></li>
-<li><p><strong>00:08.855</strong>: <a class="reference internal" href="tune_network_mali.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-mali-py"><span class="std std-ref">Auto-scheduling a Neural Network for mali GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_mali.py</span></code>)</p></li>
-<li><p><strong>00:08.639</strong>: <a class="reference internal" href="tune_network_arm.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-arm-py"><span class="std std-ref">Auto-scheduling a Neural Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_arm.py</span></code>)</p></li>
+<li><p><strong>02:20.451</strong>: <a class="reference internal" href="tune_conv2d_layer_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py"><span class="std std-ref">Auto-scheduling a Convolution Layer for GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_layer_cuda.py</span></code>)</p></li>
+<li><p><strong>01:20.072</strong>: <a class="reference internal" href="tune_network_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-x86-py"><span class="std std-ref">Auto-scheduling a Neural Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_x86.py</span></code>)</p></li>
+<li><p><strong>00:40.650</strong>: <a class="reference internal" href="tune_network_cuda.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-cuda-py"><span class="std std-ref">Auto-scheduling a Neural Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_cuda.py</span></code>)</p></li>
+<li><p><strong>00:16.595</strong>: <a class="reference internal" href="tune_sparse_x86.html#sphx-glr-how-to-tune-with-autoscheduler-tune-sparse-x86-py"><span class="std std-ref">Auto-scheduling Sparse Matrix Multiplication on CPU with Custom Sketch Rule</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_sparse_x86.py</span></code>)</p></li>
+<li><p><strong>00:08.727</strong>: <a class="reference internal" href="tune_network_mali.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-mali-py"><span class="std std-ref">Auto-scheduling a Neural Network for mali GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_mali.py</span></code>)</p></li>
+<li><p><strong>00:08.506</strong>: <a class="reference internal" href="tune_network_arm.html#sphx-glr-how-to-tune-with-autoscheduler-tune-network-arm-py"><span class="std std-ref">Auto-scheduling a Neural Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_network_arm.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
index e5cebaf7b..c1958c4c5 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_conv2d_layer_cuda.html
@@ -470,483 +470,487 @@ cooperative fetching, unrolling and operator fusion.</p>
              compute: Buffer(compute_2: Pointer(float32), float32, [25088], [])}
   buffer_map = {data_1: data, kernel_1: kernel, bias_1: bias, compute_1: compute}
   preflattened_buffer_map = {data_1: data_3: Buffer(data_2, float32, [1, 512, 7, 7], []), kernel_1: kernel_3: Buffer(kernel_2, float32, [512, 512, 3, 3], []), bias_1: bias_3: Buffer(bias_2, float32, [1, 512, 1, 1], []), compute_1: compute_3: Buffer(compute_2, float32, [1, 512, 7, 7], [])} {
-  attr [IterVar(blockIdx.x: int32, (nullptr), &quot;ThreadIndex&quot;, &quot;blockIdx.x&quot;)] &quot;thread_extent&quot; = 28;
-  allocate(conv2d_nchw: Pointer(local float32), float32, [14]), storage_scope = local;
-  allocate(pad_temp.shared: Pointer(shared float32), float32, [72]), storage_scope = shared;
-  allocate(kernel.shared: Pointer(shared float32), float32, [3072]), storage_scope = shared;
-  attr [IterVar(threadIdx.x: int32, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64 {
-    conv2d_nchw_1: Buffer(conv2d_nchw, float32, [14], [], scope=&quot;local&quot;, align=32)[0] = 0f32
-    conv2d_nchw_1[1] = 0f32
-    conv2d_nchw_1[2] = 0f32
-    conv2d_nchw_1[3] = 0f32
+  attr [IterVar(blockIdx.x: int32, (nullptr), &quot;ThreadIndex&quot;, &quot;blockIdx.x&quot;)] &quot;thread_extent&quot; = 64;
+  allocate(conv2d_nchw: Pointer(local float32), float32, [28]), storage_scope = local;
+  allocate(pad_temp.shared: Pointer(shared float32), float32, [324]), storage_scope = shared;
+  allocate(kernel.shared: Pointer(shared float32), float32, [288]), storage_scope = shared;
+  attr [IterVar(threadIdx.x: int32, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 14 {
+    conv2d_nchw_1: Buffer(conv2d_nchw, float32, [16], [], scope=&quot;local&quot;, align=16)[0] = 0f32
     conv2d_nchw_1[4] = 0f32
-    conv2d_nchw_1[5] = 0f32
-    conv2d_nchw_1[6] = 0f32
-    conv2d_nchw_1[7] = 0f32
     conv2d_nchw_1[8] = 0f32
+    conv2d_nchw_1[12] = 0f32
+    conv2d_nchw_1[16] = 0f32
+    conv2d_nchw_1[20] = 0f32
+    conv2d_nchw_1[24] = 0f32
+    conv2d_nchw_1[1] = 0f32
+    conv2d_nchw_1[5] = 0f32
     conv2d_nchw_1[9] = 0f32
+    conv2d_nchw_1[13] = 0f32
+    conv2d_nchw_1[17] = 0f32
+    conv2d_nchw_1[21] = 0f32
+    conv2d_nchw_1[25] = 0f32
+    conv2d_nchw_1[2] = 0f32
+    conv2d_nchw_1[6] = 0f32
     conv2d_nchw_1[10] = 0f32
+    conv2d_nchw_1[14] = 0f32
+    conv2d_nchw_1[18] = 0f32
+    conv2d_nchw_1[22] = 0f32
+    conv2d_nchw_1[26] = 0f32
+    conv2d_nchw_1[3] = 0f32
+    conv2d_nchw_1[7] = 0f32
     conv2d_nchw_1[11] = 0f32
-    conv2d_nchw_1[12] = 0f32
-    conv2d_nchw_1[13] = 0f32
-    for (rc.outer.outer: int32, 0, 64) {
-      for (ry.outer.outer: int32, 0, 3) {
-        let cse_var_2: int32 = (rc.outer.outer*72)
-        let cse_var_1: int32 = (ry.outer.outer*3)
-         {
-          attr [IterVar(threadIdx.x_1: int32, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64 {
-            if @tir.likely((threadIdx.x_1 &lt; 18), dtype=bool) {
-              pad_temp.shared_1: Buffer(pad_temp.shared, float32, [72], [], scope=&quot;shared&quot;)[(threadIdx.x_1*4)] = @tir.if_then_else(((((1 &lt;= (ry.outer.outer + floormod(blockIdx.x, 7))) &amp;&amp; ((ry.outer.outer + floormod(blockIdx.x, 7)) &lt; 8)) &amp;&amp; (1 &lt;= floormod((threadIdx.x_1*4), 9))) &amp;&amp; (floormod((threadIdx.x_1*4), 9) &lt; 8)), data[((((((rc.outer.outer*392) + (floordiv((threadIdx.x_1*4), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) +  [...]
-            }
-            if @tir.likely((threadIdx.x_1 &lt; 18), dtype=bool) {
-              pad_temp.shared_1[((threadIdx.x_1*4) + 1)] = @tir.if_then_else(((((1 &lt;= (ry.outer.outer + floormod(blockIdx.x, 7))) &amp;&amp; ((ry.outer.outer + floormod(blockIdx.x, 7)) &lt; 8)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*4) + 1), 9))) &amp;&amp; (floormod(((threadIdx.x_1*4) + 1), 9) &lt; 8)), data[((((((rc.outer.outer*392) + (floordiv(((threadIdx.x_1*4) + 1), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) + floormod(((threadIdx.x_1*4) + 1), 9)) - 8)], 0 [...]
-            }
-            if @tir.likely((threadIdx.x_1 &lt; 18), dtype=bool) {
-              pad_temp.shared_1[((threadIdx.x_1*4) + 2)] = @tir.if_then_else(((((1 &lt;= (ry.outer.outer + floormod(blockIdx.x, 7))) &amp;&amp; ((ry.outer.outer + floormod(blockIdx.x, 7)) &lt; 8)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*4) + 2), 9))) &amp;&amp; (floormod(((threadIdx.x_1*4) + 2), 9) &lt; 8)), data[((((((rc.outer.outer*392) + (floordiv(((threadIdx.x_1*4) + 2), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) + floormod(((threadIdx.x_1*4) + 2), 9)) - 8)], 0 [...]
-            }
-            if @tir.likely((threadIdx.x_1 &lt; 18), dtype=bool) {
-              pad_temp.shared_1[((threadIdx.x_1*4) + 3)] = @tir.if_then_else(((((1 &lt;= (ry.outer.outer + floormod(blockIdx.x, 7))) &amp;&amp; ((ry.outer.outer + floormod(blockIdx.x, 7)) &lt; 8)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*4) + 3), 9))) &amp;&amp; (floormod(((threadIdx.x_1*4) + 3), 9) &lt; 8)), data[((((((rc.outer.outer*392) + (floordiv(((threadIdx.x_1*4) + 3), 9)*49)) + (ry.outer.outer*7)) + (floormod(blockIdx.x, 7)*7)) + floormod(((threadIdx.x_1*4) + 3), 9)) - 8)], 0 [...]
-            }
+    conv2d_nchw_1[15] = 0f32
+    conv2d_nchw_1[19] = 0f32
+    conv2d_nchw_1[23] = 0f32
+    conv2d_nchw_1[27] = 0f32
+    for (rc.outer.outer: int32, 0, 128) {
+      let cse_var_2: int32 = (rc.outer.outer*196)
+      let cse_var_1: int32 = (rc.outer.outer*36)
+       {
+        attr [IterVar(threadIdx.x_1: int32, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 14 {
+          pad_temp.shared_1: Buffer(pad_temp.shared, float32, [324], [], scope=&quot;shared&quot;)[(threadIdx.x_1*12)] = @tir.if_then_else(((((3 &lt;= floormod((threadIdx.x_1*4), 27)) &amp;&amp; (floormod((threadIdx.x_1*12), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod((threadIdx.x_1*12), 9))) &amp;&amp; (floormod((threadIdx.x_1*12), 9) &lt; 8)), data[((((cse_var_2 + (floordiv((threadIdx.x_1*4), 27)*49)) + (floordiv(floormod((threadIdx.x_1*4), 27), 3)*7)) + floormod((threadIdx.x_1*12), 9)) [...]
+          pad_temp.shared_1[((threadIdx.x_1*12) + 1)] = @tir.if_then_else(((((3 &lt;= floormod((threadIdx.x_1*4), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 1), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 1), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 1), 9) &lt; 8)), data[((((cse_var_2 + (floordiv((threadIdx.x_1*4), 27)*49)) + (floordiv(floormod((threadIdx.x_1*4), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 1), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 2)] = @tir.if_then_else(((((3 &lt;= floormod((threadIdx.x_1*4), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 2), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 2), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 2), 9) &lt; 8)), data[((((cse_var_2 + (floordiv((threadIdx.x_1*4), 27)*49)) + (floordiv(floormod((threadIdx.x_1*4), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 2), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 3)] = @tir.if_then_else(((((3 &lt;= floormod(((threadIdx.x_1*4) + 1), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 3), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 3), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 3), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 1), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 1), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 3), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 4)] = @tir.if_then_else(((((3 &lt;= floormod(((threadIdx.x_1*4) + 1), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 4), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 4), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 4), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 1), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 1), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 4), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 5)] = @tir.if_then_else(((((3 &lt;= floormod(((threadIdx.x_1*4) + 1), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 5), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 5), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 5), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 1), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 1), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 5), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 6)] = @tir.if_then_else(((((3 &lt;= floormod(((threadIdx.x_1*4) + 2), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 6), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 6), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 6), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 2), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 2), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 6), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 7)] = @tir.if_then_else(((((3 &lt;= floormod(((threadIdx.x_1*4) + 2), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 7), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 7), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 7), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 2), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 2), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 7), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 8)] = @tir.if_then_else(((((3 &lt;= floormod(((threadIdx.x_1*4) + 2), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 8), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 8), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 8), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 2), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 2), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 8), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 9)] = @tir.if_then_else(((((1 &lt;= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 9), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod((threadIdx.x_1*12), 9))) &amp;&amp; (floormod((threadIdx.x_1*12), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 3), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod((threadIdx.x_1*12), 9)) - 8)], 0f32, dtype=float32)
+          pad_temp.shared_1[((threadIdx.x_1*12) + 10)] = @tir.if_then_else(((((1 &lt;= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 10), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 1), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 1), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 3), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 1), 9)) - 8)], 0f [...]
+          pad_temp.shared_1[((threadIdx.x_1*12) + 11)] = @tir.if_then_else(((((1 &lt;= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 11), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 2), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 2), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 3), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 2), 9)) - 8)], 0f [...]
+        }
+        attr [IterVar(threadIdx.x_1, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 14 {
+          if @tir.likely((threadIdx.x_1 &lt; 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 168)] = @tir.if_then_else(((((3 &lt;= floormod(((threadIdx.x_1*4) + 56), 27)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 6), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 6), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 6), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 56), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 56), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 6), 9)) - 8)], 0f32, dt [...]
           }
-          attr [IterVar(threadIdx.x_2: int32, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1: Buffer(kernel.shared, float32, [3072], [], scope=&quot;shared&quot;)[threadIdx.x_2] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(threadIdx.x_2, 24)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 64)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 8), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 64), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 128)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 16), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 128), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 192)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 36864)]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 256)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 32), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 256), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 320)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 40), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 320), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 384)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 73728)]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 448)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 56), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 448), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 512)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 64), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 512), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 576)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 110592)]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 640)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 80), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 640), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 704)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 88), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 704), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 768)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 147456)]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 832)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 104), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 832), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 896)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 112), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 896), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), &quot;ThreadIndex&quot;, &quot;threadIdx.x&quot;)] &quot;thread_extent&quot; = 64;
-          kernel.shared_1[(threadIdx.x_2 + 960)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 184320)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1024)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 128), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1024), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1088)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 136), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1088), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1152)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 221184)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1216)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 152), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1216), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1280)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 160), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1280), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1344)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 258048)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1408)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 176), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1408), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1472)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 184), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1472), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1536)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 294912)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1600)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 200), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1600), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1664)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 208), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1664), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1728)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 331776)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1792)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 224), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1792), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1856)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 232), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1856), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1920)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 368640)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 1984)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 248), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 1984), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2048)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 256), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2048), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2112)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 405504)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2176)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 272), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2176), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2240)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 280), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2240), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2304)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 442368)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2368)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 296), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2368), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2432)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 304), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2432), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2496)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 479232)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2560)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 320), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2560), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2624)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 328), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2624), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2688)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 516096)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2752)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 344), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2752), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2816)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 352), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2816), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2880)] = kernel[(((((((floordiv(blockIdx.x, 7)*589824) + (floordiv(floordiv(threadIdx.x_2, 8), 3)*4608)) + cse_var_2) + (floordiv(floormod(threadIdx.x_2, 24), 3)*9)) + cse_var_1) + floormod(threadIdx.x_2, 3)) + 552960)]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 2944)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 368), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 2944), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 1), 3))]
-          attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 64;
-          kernel.shared_1[(threadIdx.x_2 + 3008)] = kernel[((((((floordiv(blockIdx.x, 7)*589824) + (floordiv((floordiv(threadIdx.x_2, 8) + 376), 3)*4608)) + cse_var_2) + (floordiv(floormod((threadIdx.x_2 + 3008), 24), 3)*9)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 3))]
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[0]*kernel.shared_1[(threadIdx.x*48)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[9]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[1]*kernel.shared_1[(threadIdx.x*48)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[2]*kernel.shared_1[(threadIdx.x*48)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[3]*kernel.shared_1[(threadIdx.x*48)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[4]*kernel.shared_1[(threadIdx.x*48)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[5]*kernel.shared_1[(threadIdx.x*48)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[6]*kernel.shared_1[(threadIdx.x*48)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 3)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[0]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[9]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[1]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 24)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 27)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[1]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 1)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 4)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[1]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[10]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 25)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 28)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[8]*kernel.shared_1[((threadIdx.x*48) + 2)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[17]*kernel.shared_1[((threadIdx.x*48) + 5)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[2]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[11]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[3]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[12]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[4]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[13]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[5]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[14]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[6]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[15]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[7]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[16]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[8]*kernel.shared_1[((threadIdx.x*48) + 26)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[17]*kernel.shared_1[((threadIdx.x*48) + 29)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[18]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[27]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 6)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 9)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[18]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[27]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 30)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 33)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 7)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 10)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[19]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[28]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 31)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 34)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[26]*kernel.shared_1[((threadIdx.x*48) + 8)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[35]*kernel.shared_1[((threadIdx.x*48) + 11)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[20]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[29]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[21]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[30]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[22]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[31]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[23]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[32]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[24]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[33]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[25]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[34]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[26]*kernel.shared_1[((threadIdx.x*48) + 32)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[35]*kernel.shared_1[((threadIdx.x*48) + 35)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[36]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[45]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 12)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 15)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[36]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[45]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 36)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 39)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 13)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 16)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[37]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[46]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 37)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 40)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[44]*kernel.shared_1[((threadIdx.x*48) + 14)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[53]*kernel.shared_1[((threadIdx.x*48) + 17)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[38]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[47]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[39]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[48]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[40]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[49]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[41]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[50]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[42]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[51]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[43]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[52]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[44]*kernel.shared_1[((threadIdx.x*48) + 38)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[53]*kernel.shared_1[((threadIdx.x*48) + 41)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[54]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[63]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 18)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 21)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[54]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[63]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 42)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 45)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 19)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 22)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[55]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[64]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 43)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 46)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[62]*kernel.shared_1[((threadIdx.x*48) + 20)]))
-          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[71]*kernel.shared_1[((threadIdx.x*48) + 23)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[56]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[65]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[57]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[66]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[58]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[67]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[59]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[68]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[60]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[69]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[61]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[70]*kernel.shared_1[((threadIdx.x*48) + 47)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[62]*kernel.shared_1[((threadIdx.x*48) + 44)]))
-          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[71]*kernel.shared_1[((threadIdx.x*48) + 47)]))
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 169)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 56), 27)) && (floormod(((threadIdx.x_1*12) + 7), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 7), 9))) && (floormod(((threadIdx.x_1*12) + 7), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 56), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 56), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 7), 9)) - 8)], 0f32, dt [...]
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 170)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 56), 27)) && (floormod(((threadIdx.x_1*12) + 8), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 8), 9))) && (floormod(((threadIdx.x_1*12) + 8), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 56), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 56), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 8), 9)) - 8)], 0f32, dt [...]
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 171)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 9), 81) < 72)) && (1 <= floormod((threadIdx.x_1*12), 9))) && (floormod((threadIdx.x_1*12), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 57), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod((threadIdx.x_1*12), 9)) - 8)], 0f32, dtype=float32)
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 172)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 10), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 1), 9))) && (floormod(((threadIdx.x_1*12) + 1), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 57), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 1), 9)) - 8)] [...]
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 173)] = @tir.if_then_else(((((1 <= floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 11), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 2), 9))) && (floormod(((threadIdx.x_1*12) + 2), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 57), 27)*49)) + (floormod((floordiv((threadIdx.x_1*4), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + 2), 9)) - 8)] [...]
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 174)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 58), 27)) && (floormod(((threadIdx.x_1*12) + 12), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 3), 9))) && (floormod(((threadIdx.x_1*12) + 3), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 58), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 58), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 3), 9)) - 8)], 0f32, d [...]
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 175)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 58), 27)) && (floormod(((threadIdx.x_1*12) + 13), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 4), 9))) && (floormod(((threadIdx.x_1*12) + 4), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 58), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 58), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 4), 9)) - 8)], 0f32, d [...]
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 176)] = @tir.if_then_else(((((3 <= floormod(((threadIdx.x_1*4) + 58), 27)) && (floormod(((threadIdx.x_1*12) + 14), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 5), 9))) && (floormod(((threadIdx.x_1*12) + 5), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 58), 27)*49)) + (floordiv(floormod(((threadIdx.x_1*4) + 58), 27), 3)*7)) + floormod(((threadIdx.x_1*12) + 5), 9)) - 8)], 0f32, d [...]
+          }
+          if @tir.likely((threadIdx.x_1 < 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 177)] = @tir.if_then_else(((((1 <= floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)) && (floormod(((threadIdx.x_1*12) + 15), 81) < 72)) && (1 <= floormod(((threadIdx.x_1*12) + 6), 9))) && (floormod(((threadIdx.x_1*12) + 6), 9) < 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 59), 27)*49)) + (floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + [...]
+          }
+          if @tir.likely((threadIdx.x_1 &lt; 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 178)] = @tir.if_then_else(((((1 &lt;= floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 16), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 7), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 7), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 59), 27)*49)) + (floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + [...]
+          }
+          if @tir.likely((threadIdx.x_1 &lt; 13), dtype=bool) {
+            pad_temp.shared_1[((threadIdx.x_1*12) + 179)] = @tir.if_then_else(((((1 &lt;= floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)) &amp;&amp; (floormod(((threadIdx.x_1*12) + 17), 81) &lt; 72)) &amp;&amp; (1 &lt;= floormod(((threadIdx.x_1*12) + 8), 9))) &amp;&amp; (floormod(((threadIdx.x_1*12) + 8), 9) &lt; 8)), data[((((cse_var_2 + (floordiv(((threadIdx.x_1*4) + 59), 27)*49)) + (floormod((floordiv(((threadIdx.x_1*4) + 56), 3) + 1), 9)*7)) + floormod(((threadIdx.x_1*12) + [...]
+          }
+        }
+        attr [IterVar(threadIdx.x_2: int32, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1: Buffer(kernel.shared, float32, [288], [], scope="shared")[threadIdx.x_2] = kernel[(((blockIdx.x*36864) + cse_var_1) + threadIdx.x_2)]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 14)] = kernel[(((blockIdx.x*36864) + cse_var_1) + (threadIdx.x_2 + 14))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 28)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 14), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 28), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 42)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 21), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 6), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 56)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 28), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 20), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 70)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 35), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 34), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 84)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 42), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 12), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 98)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 49), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 26), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 112)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 56), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 4), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 126)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 63), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 18), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 140)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 70), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 32), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 154)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 77), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 10), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 168)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 84), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 24), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 182)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 91), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 2), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 196)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 98), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 16), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 210)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 105), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 30), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 224)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 112), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 8), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 238)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 119), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 22), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 252)] = kernel[((((blockIdx.x*36864) + cse_var_1) + threadIdx.x_2) + 32256)]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        kernel.shared_1[(threadIdx.x_2 + 266)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 133), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 14), 36))]
+        attr [IterVar(threadIdx.x_2, (nullptr), "ThreadIndex", "threadIdx.x")] "thread_extent" = 14;
+        if @tir.likely((threadIdx.x_2 < 8), dtype=bool) {
+          kernel.shared_1[(threadIdx.x_2 + 280)] = kernel[((((blockIdx.x*36864) + (floordiv((floordiv(threadIdx.x_2, 2) + 140), 18)*4608)) + cse_var_1) + floormod((threadIdx.x_2 + 28), 36))]
+        }
+        for (rx.outer.inner: int32, 0, 3) {
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[((floordiv(threadIdx.x, 7)*144) + rx.outer.inner)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 36)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 72)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[((floormod(threadIdx.x, 7)*9) + rx.outer.inner)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 1)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 2)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 3)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 4)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 5)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 6)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 108)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 3)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 39)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 75)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 9)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 10)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 11)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 12)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 13)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 14)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 15)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 111)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 6)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 42)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 78)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 18)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 19)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 20)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 21)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 22)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 23)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 24)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 114)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 9)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 45)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 81)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 81)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 82)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 83)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 84)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 85)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 86)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 87)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 117)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 12)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 48)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 84)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 90)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 91)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 92)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 93)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 94)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 95)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 96)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 120)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 15)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 51)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 87)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 99)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 100)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 101)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 102)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 103)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 104)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 105)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 123)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 18)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 54)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 90)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 162)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 163)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 164)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 165)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 166)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 167)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 168)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 126)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 21)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 57)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 93)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 171)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 172)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 173)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 174)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 175)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 176)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 177)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 129)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 24)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 60)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 96)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 180)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 181)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 182)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 183)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 184)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 185)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 186)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 132)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 27)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 63)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 99)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 243)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 244)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 245)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 246)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 247)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 248)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 249)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 135)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 30)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 66)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 102)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 252)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 253)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 254)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 255)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 256)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 257)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 258)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 138)]))
+          conv2d_nchw_1[0] = (conv2d_nchw_1[0] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+          conv2d_nchw_1[4] = (conv2d_nchw_1[4] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+          conv2d_nchw_1[8] = (conv2d_nchw_1[8] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+          conv2d_nchw_1[12] = (conv2d_nchw_1[12] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+          conv2d_nchw_1[16] = (conv2d_nchw_1[16] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+          conv2d_nchw_1[20] = (conv2d_nchw_1[20] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+          conv2d_nchw_1[24] = (conv2d_nchw_1[24] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 33)]))
+          conv2d_nchw_1[1] = (conv2d_nchw_1[1] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+          conv2d_nchw_1[5] = (conv2d_nchw_1[5] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+          conv2d_nchw_1[9] = (conv2d_nchw_1[9] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+          conv2d_nchw_1[13] = (conv2d_nchw_1[13] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+          conv2d_nchw_1[17] = (conv2d_nchw_1[17] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+          conv2d_nchw_1[21] = (conv2d_nchw_1[21] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+          conv2d_nchw_1[25] = (conv2d_nchw_1[25] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 69)]))
+          conv2d_nchw_1[2] = (conv2d_nchw_1[2] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+          conv2d_nchw_1[6] = (conv2d_nchw_1[6] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+          conv2d_nchw_1[10] = (conv2d_nchw_1[10] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+          conv2d_nchw_1[14] = (conv2d_nchw_1[14] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+          conv2d_nchw_1[18] = (conv2d_nchw_1[18] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+          conv2d_nchw_1[22] = (conv2d_nchw_1[22] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+          conv2d_nchw_1[26] = (conv2d_nchw_1[26] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 105)]))
+          conv2d_nchw_1[3] = (conv2d_nchw_1[3] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 261)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+          conv2d_nchw_1[7] = (conv2d_nchw_1[7] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 262)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+          conv2d_nchw_1[11] = (conv2d_nchw_1[11] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 263)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+          conv2d_nchw_1[15] = (conv2d_nchw_1[15] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 264)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+          conv2d_nchw_1[19] = (conv2d_nchw_1[19] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 265)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+          conv2d_nchw_1[23] = (conv2d_nchw_1[23] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 266)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
+          conv2d_nchw_1[27] = (conv2d_nchw_1[27] + (pad_temp.shared_1[(((floormod(threadIdx.x, 7)*9) + rx.outer.inner) + 267)]*kernel.shared_1[(((floordiv(threadIdx.x, 7)*144) + rx.outer.inner) + 141)]))
         }
       }
     }
-    for (i1.inner: int32, 0, 2) {
-      for (i3.inner: int32, 0, 7) {
-        compute[(((((floordiv(blockIdx.x, 7)*6272) + (threadIdx.x*98)) + (i1.inner*49)) + (floormod(blockIdx.x, 7)*7)) + i3.inner)] = max((conv2d_nchw_1[((i1.inner*7) + i3.inner)] + bias[(((floordiv(blockIdx.x, 7)*128) + (threadIdx.x*2)) + i1.inner)]), 0f32)
-      }
+    for (i1.inner: int32, 0, 4) {
+      compute[((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7))] = max((conv2d_nchw_1[i1.inner] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+      compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 1)] = max((conv2d_nchw_1[(i1.inner + 4)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+      compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 2)] = max((conv2d_nchw_1[(i1.inner + 8)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+      compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 3)] = max((conv2d_nchw_1[(i1.inner + 12)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+      compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 4)] = max((conv2d_nchw_1[(i1.inner + 16)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+      compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 5)] = max((conv2d_nchw_1[(i1.inner + 20)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
+      compute[(((((blockIdx.x*392) + (floordiv(threadIdx.x, 7)*196)) + (i1.inner*49)) + (floormod(threadIdx.x, 7)*7)) + 6)] = max((conv2d_nchw_1[(i1.inner + 24)] + bias[(((blockIdx.x*8) + (floordiv(threadIdx.x, 7)*4)) + i1.inner)]), 0f32)
     }
   }
 }
@@ -984,7 +988,7 @@ cooperative fetching, unrolling and operator fusion.</p>
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.367 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 0.370 ms
 </pre></div>
 </div>
 </div>
@@ -1014,21 +1018,21 @@ conv2d_nchw_nn_o_i, conv2d_nchw_nn_i = s[conv2d_nchw].split(conv2d_nchw_nn, fact
 conv2d_nchw_nn_o_o_i, conv2d_nchw_nn_o_i = s[conv2d_nchw].split(conv2d_nchw_nn_o_i, factor=1)
 conv2d_nchw_nn_o_o_o_i, conv2d_nchw_nn_o_o_i = s[conv2d_nchw].split(conv2d_nchw_nn_o_o_i, factor=1)
 conv2d_nchw_nn_o_o_o_o, conv2d_nchw_nn_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_nn_o_o_o_i, factor=1)
-conv2d_nchw_ff_o_i, conv2d_nchw_ff_i = s[conv2d_nchw].split(conv2d_nchw_ff, factor=1)
-conv2d_nchw_ff_o_o_i, conv2d_nchw_ff_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_i, factor=2)
-conv2d_nchw_ff_o_o_o_i, conv2d_nchw_ff_o_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_o_i, factor=64)
+conv2d_nchw_ff_o_i, conv2d_nchw_ff_i = s[conv2d_nchw].split(conv2d_nchw_ff, factor=4)
+conv2d_nchw_ff_o_o_i, conv2d_nchw_ff_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_i, factor=1)
+conv2d_nchw_ff_o_o_o_i, conv2d_nchw_ff_o_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_o_i, factor=2)
 conv2d_nchw_ff_o_o_o_o, conv2d_nchw_ff_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_ff_o_o_o_i, factor=1)
 conv2d_nchw_yy_o_i, conv2d_nchw_yy_i = s[conv2d_nchw].split(conv2d_nchw_yy, factor=1)
 conv2d_nchw_yy_o_o_i, conv2d_nchw_yy_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_i, factor=1)
-conv2d_nchw_yy_o_o_o_i, conv2d_nchw_yy_o_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_o_i, factor=1)
+conv2d_nchw_yy_o_o_o_i, conv2d_nchw_yy_o_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_o_i, factor=7)
 conv2d_nchw_yy_o_o_o_o, conv2d_nchw_yy_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_yy_o_o_o_i, factor=1)
 conv2d_nchw_xx_o_i, conv2d_nchw_xx_i = s[conv2d_nchw].split(conv2d_nchw_xx, factor=1)
-conv2d_nchw_xx_o_o_i, conv2d_nchw_xx_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_i, factor=7)
+conv2d_nchw_xx_o_o_i, conv2d_nchw_xx_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_i, factor=1)
 conv2d_nchw_xx_o_o_o_i, conv2d_nchw_xx_o_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_o_i, factor=1)
-conv2d_nchw_xx_o_o_o_o, conv2d_nchw_xx_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_o_o_i, factor=1)
-conv2d_nchw_rc_o_i, conv2d_nchw_rc_i = s[conv2d_nchw].split(conv2d_nchw_rc, factor=2)
-conv2d_nchw_rc_o_o, conv2d_nchw_rc_o_i = s[conv2d_nchw].split(conv2d_nchw_rc_o_i, factor=4)
-conv2d_nchw_ry_o_i, conv2d_nchw_ry_i = s[conv2d_nchw].split(conv2d_nchw_ry, factor=1)
+conv2d_nchw_xx_o_o_o_o, conv2d_nchw_xx_o_o_o_i = s[conv2d_nchw].split(conv2d_nchw_xx_o_o_o_i, factor=7)
+conv2d_nchw_rc_o_i, conv2d_nchw_rc_i = s[conv2d_nchw].split(conv2d_nchw_rc, factor=4)
+conv2d_nchw_rc_o_o, conv2d_nchw_rc_o_i = s[conv2d_nchw].split(conv2d_nchw_rc_o_i, factor=1)
+conv2d_nchw_ry_o_i, conv2d_nchw_ry_i = s[conv2d_nchw].split(conv2d_nchw_ry, factor=3)
 conv2d_nchw_ry_o_o, conv2d_nchw_ry_o_i = s[conv2d_nchw].split(conv2d_nchw_ry_o_i, factor=1)
 conv2d_nchw_rx_o_i, conv2d_nchw_rx_i = s[conv2d_nchw].split(conv2d_nchw_rx, factor=1)
 conv2d_nchw_rx_o_o, conv2d_nchw_rx_o_i = s[conv2d_nchw].split(conv2d_nchw_rx_o_i, factor=3)
@@ -1036,15 +1040,15 @@ s[conv2d_nchw].reorder(conv2d_nchw_nn_o_o_o_o, conv2d_nchw_ff_o_o_o_o, conv2d_nc
 compute_i0_o_i, compute_i0_i = s[compute].split(compute_i0, factor=1)
 compute_i0_o_o_i, compute_i0_o_i = s[compute].split(compute_i0_o_i, factor=1)
 compute_i0_o_o_o, compute_i0_o_o_i = s[compute].split(compute_i0_o_o_i, factor=1)
-compute_i1_o_i, compute_i1_i = s[compute].split(compute_i1, factor=2)
-compute_i1_o_o_i, compute_i1_o_i = s[compute].split(compute_i1_o_i, factor=64)
+compute_i1_o_i, compute_i1_i = s[compute].split(compute_i1, factor=4)
+compute_i1_o_o_i, compute_i1_o_i = s[compute].split(compute_i1_o_i, factor=2)
 compute_i1_o_o_o, compute_i1_o_o_i = s[compute].split(compute_i1_o_o_i, factor=1)
 compute_i2_o_i, compute_i2_i = s[compute].split(compute_i2, factor=1)
-compute_i2_o_o_i, compute_i2_o_i = s[compute].split(compute_i2_o_i, factor=1)
+compute_i2_o_o_i, compute_i2_o_i = s[compute].split(compute_i2_o_i, factor=7)
 compute_i2_o_o_o, compute_i2_o_o_i = s[compute].split(compute_i2_o_o_i, factor=1)
-compute_i3_o_i, compute_i3_i = s[compute].split(compute_i3, factor=7)
+compute_i3_o_i, compute_i3_i = s[compute].split(compute_i3, factor=1)
 compute_i3_o_o_i, compute_i3_o_i = s[compute].split(compute_i3_o_i, factor=1)
-compute_i3_o_o_o, compute_i3_o_o_i = s[compute].split(compute_i3_o_o_i, factor=1)
+compute_i3_o_o_o, compute_i3_o_o_i = s[compute].split(compute_i3_o_o_i, factor=7)
 s[compute].reorder(compute_i0_o_o_o, compute_i1_o_o_o, compute_i2_o_o_o, compute_i3_o_o_o, compute_i0_o_o_i, compute_i1_o_o_i, compute_i2_o_o_i, compute_i3_o_o_i, compute_i0_o_i, compute_i1_o_i, compute_i2_o_i, compute_i3_o_i, compute_i0_i, compute_i1_i, compute_i2_i, compute_i3_i)
 s[conv2d_nchw].compute_at(s[compute], compute_i3_o_i)
 kernel_shared = s.cache_read(kernel, &quot;shared&quot;, [conv2d_nchw])
@@ -1063,12 +1067,12 @@ s[compute].bind(compute_i0_o_i_i1_o_i_fused_i2_o_i_fused_i3_o_i_fused, te.thread
 kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused = s[kernel_shared].fuse(kernel_shared_ax0, kernel_shared_ax1, kernel_shared_ax2, kernel_shared_ax3)
 kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i = s[kernel_shared].split(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused, factor=1)
 s[kernel_shared].vectorize(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i)
-kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[kernel_shared].split(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=64)
+kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[kernel_shared].split(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=14)
 s[kernel_shared].bind(kernel_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i, te.thread_axis(&quot;threadIdx.x&quot;))
 pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused = s[pad_temp_shared].fuse(pad_temp_shared_ax0, pad_temp_shared_ax1, pad_temp_shared_ax2, pad_temp_shared_ax3)
-pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused, factor=4)
+pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused, factor=12)
 s[pad_temp_shared].vectorize(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_i)
-pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=64)
+pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_o, pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i = s[pad_temp_shared].split(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o, factor=14)
 s[pad_temp_shared].bind(pad_temp_shared_ax0_ax1_fused_ax2_fused_ax3_fused_o_i, te.thread_axis(&quot;threadIdx.x&quot;))
 s[conv2d_nchw].pragma(conv2d_nchw_nn_o_o_o_o, &quot;auto_unroll_max_step&quot;, 512)
 s[conv2d_nchw].pragma(conv2d_nchw_nn_o_o_o_o, &quot;unroll_explicit&quot;, True)
@@ -1088,430 +1092,459 @@ CUDA source code:
   #define int64_t long long
   #define uint64_t unsigned long long
 #endif
-extern &quot;C&quot; __global__ void __launch_bounds__(64) default_function_kernel0(float* __restrict__ data, float* __restrict__ kernel, float* __restrict__ compute, float* __restrict__ bias) {
-  float conv2d_nchw[14];
-  __shared__ float pad_temp_shared[72];
-  __shared__ float kernel_shared[3072];
+extern &quot;C&quot; __global__ void __launch_bounds__(14) default_function_kernel0(float* __restrict__ data, float* __restrict__ kernel, float* __restrict__ compute, float* __restrict__ bias) {
+  float conv2d_nchw[28];
+  __shared__ float pad_temp_shared[324];
+  __shared__ float kernel_shared[288];
   conv2d_nchw[0] = 0.000000e+00f;
-  conv2d_nchw[1] = 0.000000e+00f;
-  conv2d_nchw[2] = 0.000000e+00f;
-  conv2d_nchw[3] = 0.000000e+00f;
   conv2d_nchw[4] = 0.000000e+00f;
-  conv2d_nchw[5] = 0.000000e+00f;
-  conv2d_nchw[6] = 0.000000e+00f;
-  conv2d_nchw[7] = 0.000000e+00f;
   conv2d_nchw[8] = 0.000000e+00f;
+  conv2d_nchw[12] = 0.000000e+00f;
+  conv2d_nchw[16] = 0.000000e+00f;
+  conv2d_nchw[20] = 0.000000e+00f;
+  conv2d_nchw[24] = 0.000000e+00f;
+  conv2d_nchw[1] = 0.000000e+00f;
+  conv2d_nchw[5] = 0.000000e+00f;
   conv2d_nchw[9] = 0.000000e+00f;
+  conv2d_nchw[13] = 0.000000e+00f;
+  conv2d_nchw[17] = 0.000000e+00f;
+  conv2d_nchw[21] = 0.000000e+00f;
+  conv2d_nchw[25] = 0.000000e+00f;
+  conv2d_nchw[2] = 0.000000e+00f;
+  conv2d_nchw[6] = 0.000000e+00f;
   conv2d_nchw[10] = 0.000000e+00f;
+  conv2d_nchw[14] = 0.000000e+00f;
+  conv2d_nchw[18] = 0.000000e+00f;
+  conv2d_nchw[22] = 0.000000e+00f;
+  conv2d_nchw[26] = 0.000000e+00f;
+  conv2d_nchw[3] = 0.000000e+00f;
+  conv2d_nchw[7] = 0.000000e+00f;
   conv2d_nchw[11] = 0.000000e+00f;
-  conv2d_nchw[12] = 0.000000e+00f;
-  conv2d_nchw[13] = 0.000000e+00f;
-  for (int rc_outer_outer = 0; rc_outer_outer &lt; 64; ++rc_outer_outer) {
-    for (int ry_outer_outer = 0; ry_outer_outer &lt; 3; ++ry_outer_outer) {
-      __syncthreads();
-      if (((int)threadIdx.x) &lt; 18) {
-        pad_temp_shared[(((int)threadIdx.x) * 4)] = (((((1 &lt;= (ry_outer_outer + (((int)blockIdx.x) % 7))) &amp;&amp; ((ry_outer_outer + (((int)blockIdx.x) % 7)) &lt; 8)) &amp;&amp; (1 &lt;= ((((int)threadIdx.x) * 4) % 9))) &amp;&amp; (((((int)threadIdx.x) * 4) % 9) &lt; 8)) ? data[((((((rc_outer_outer * 392) + (((((int)threadIdx.x) * 4) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + ((((int)threadIdx.x) * 4) % 9)) - 8)] : 0.000000e+00f);
-      }
-      if (((int)threadIdx.x) &lt; 18) {
-        pad_temp_shared[((((int)threadIdx.x) * 4) + 1)] = (((((1 &lt;= (ry_outer_outer + (((int)blockIdx.x) % 7))) &amp;&amp; ((ry_outer_outer + (((int)blockIdx.x) % 7)) &lt; 8)) &amp;&amp; (1 &lt;= (((((int)threadIdx.x) * 4) + 1) % 9))) &amp;&amp; ((((((int)threadIdx.x) * 4) + 1) % 9) &lt; 8)) ? data[((((((rc_outer_outer * 392) + ((((((int)threadIdx.x) * 4) + 1) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + (((((int)threadIdx.x) * 4) + 1) % 9)) - 8)] : 0.000000e+00f);
-      }
-      if (((int)threadIdx.x) &lt; 18) {
-        pad_temp_shared[((((int)threadIdx.x) * 4) + 2)] = (((((1 &lt;= (ry_outer_outer + (((int)blockIdx.x) % 7))) &amp;&amp; ((ry_outer_outer + (((int)blockIdx.x) % 7)) &lt; 8)) &amp;&amp; (1 &lt;= (((((int)threadIdx.x) * 4) + 2) % 9))) &amp;&amp; ((((((int)threadIdx.x) * 4) + 2) % 9) &lt; 8)) ? data[((((((rc_outer_outer * 392) + ((((((int)threadIdx.x) * 4) + 2) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + (((((int)threadIdx.x) * 4) + 2) % 9)) - 8)] : 0.000000e+00f);
-      }
-      if (((int)threadIdx.x) &lt; 18) {
-        pad_temp_shared[((((int)threadIdx.x) * 4) + 3)] = (((((1 &lt;= (ry_outer_outer + (((int)blockIdx.x) % 7))) &amp;&amp; ((ry_outer_outer + (((int)blockIdx.x) % 7)) &lt; 8)) &amp;&amp; (1 &lt;= (((((int)threadIdx.x) * 4) + 3) % 9))) &amp;&amp; ((((((int)threadIdx.x) * 4) + 3) % 9) &lt; 8)) ? data[((((((rc_outer_outer * 392) + ((((((int)threadIdx.x) * 4) + 3) / 9) * 49)) + (ry_outer_outer * 7)) + ((((int)blockIdx.x) % 7) * 7)) + (((((int)threadIdx.x) * 4) + 3) % 9)) - 8)] : 0.000000e+00f);
-      }
-      kernel_shared[((int)threadIdx.x)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 64)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 64) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 128)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 128) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 192)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 36864)];
-      kernel_shared[(((int)threadIdx.x) + 256)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 256) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 320)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 320) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 384)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 73728)];
-      kernel_shared[(((int)threadIdx.x) + 448)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 448) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 512)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 512) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 576)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 110592)];
-      kernel_shared[(((int)threadIdx.x) + 640)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 640) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 704)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 704) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 768)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 147456)];
-      kernel_shared[(((int)threadIdx.x) + 832)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 832) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 896)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 896) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 960)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 184320)];
-      kernel_shared[(((int)threadIdx.x) + 1024)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1024) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1088)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1088) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1152)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 221184)];
-      kernel_shared[(((int)threadIdx.x) + 1216)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1216) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1280)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1280) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1344)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 258048)];
-      kernel_shared[(((int)threadIdx.x) + 1408)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1408) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1472)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1472) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1536)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 294912)];
-      kernel_shared[(((int)threadIdx.x) + 1600)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1600) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1664)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1664) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1728)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 331776)];
-      kernel_shared[(((int)threadIdx.x) + 1792)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1792) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1856)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1856) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 1920)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 368640)];
-      kernel_shared[(((int)threadIdx.x) + 1984)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 1984) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2048)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2048) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2112)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 405504)];
-      kernel_shared[(((int)threadIdx.x) + 2176)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2176) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2240)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2240) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2304)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 442368)];
-      kernel_shared[(((int)threadIdx.x) + 2368)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2368) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2432)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2432) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2496)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 479232)];
-      kernel_shared[(((int)threadIdx.x) + 2560)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2560) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2624)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2624) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2688)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 516096)];
-      kernel_shared[(((int)threadIdx.x) + 2752)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2752) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2816)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2816) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 2880)] = kernel[((((((((((int)blockIdx.x) / 7) * 589824) + ((((int)threadIdx.x) / 24) * 4608)) + (rc_outer_outer * 72)) + (((((int)threadIdx.x) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + (((int)threadIdx.x) % 3)) + 552960)];
-      kernel_shared[(((int)threadIdx.x) + 2944)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 2944) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 16) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 1) % 3))];
-      kernel_shared[(((int)threadIdx.x) + 3008)] = kernel[(((((((((int)blockIdx.x) / 7) * 589824) + (((((int)threadIdx.x) + 3008) / 24) * 4608)) + (rc_outer_outer * 72)) + ((((((int)threadIdx.x) + 8) % 24) / 3) * 9)) + (ry_outer_outer * 3)) + ((((int)threadIdx.x) + 2) % 3))];
-      __syncthreads();
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[0] * kernel_shared[(((int)threadIdx.x) * 48)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[9] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[1] * kernel_shared[(((int)threadIdx.x) * 48)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[2] * kernel_shared[(((int)threadIdx.x) * 48)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[3] * kernel_shared[(((int)threadIdx.x) * 48)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[4] * kernel_shared[(((int)threadIdx.x) * 48)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[5] * kernel_shared[(((int)threadIdx.x) * 48)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[6] * kernel_shared[(((int)threadIdx.x) * 48)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 3)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[0] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[9] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[1] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 24)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 27)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[1] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 1)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 4)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[1] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[10] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 25)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 28)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[8] * kernel_shared[((((int)threadIdx.x) * 48) + 2)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[17] * kernel_shared[((((int)threadIdx.x) * 48) + 5)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[2] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[11] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[3] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[12] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[4] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[13] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[5] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[14] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[6] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[15] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[7] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[16] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[8] * kernel_shared[((((int)threadIdx.x) * 48) + 26)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[17] * kernel_shared[((((int)threadIdx.x) * 48) + 29)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[18] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[27] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 6)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 9)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[18] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[27] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 30)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 33)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 7)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 10)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[19] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[28] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 31)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 34)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[26] * kernel_shared[((((int)threadIdx.x) * 48) + 8)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[35] * kernel_shared[((((int)threadIdx.x) * 48) + 11)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[20] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[29] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[21] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[30] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[22] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[31] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[23] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[32] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[24] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[33] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[25] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[34] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[26] * kernel_shared[((((int)threadIdx.x) * 48) + 32)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[35] * kernel_shared[((((int)threadIdx.x) * 48) + 35)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[36] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[45] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 12)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 15)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[36] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[45] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 36)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 39)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 13)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 16)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[37] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[46] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 37)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 40)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[44] * kernel_shared[((((int)threadIdx.x) * 48) + 14)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[53] * kernel_shared[((((int)threadIdx.x) * 48) + 17)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[38] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[47] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[39] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[48] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[40] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[49] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[41] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[50] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[42] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[51] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[43] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[52] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[44] * kernel_shared[((((int)threadIdx.x) * 48) + 38)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[53] * kernel_shared[((((int)threadIdx.x) * 48) + 41)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[54] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[63] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 18)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 21)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[54] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[63] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 42)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 45)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 19)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 22)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[55] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[64] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 43)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 46)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[62] * kernel_shared[((((int)threadIdx.x) * 48) + 20)]));
-      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[71] * kernel_shared[((((int)threadIdx.x) * 48) + 23)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[56] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[65] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[57] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[66] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[58] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[67] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[59] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[68] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[60] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[69] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[61] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[70] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[62] * kernel_shared[((((int)threadIdx.x) * 48) + 44)]));
-      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[71] * kernel_shared[((((int)threadIdx.x) * 48) + 47)]));
+  conv2d_nchw[15] = 0.000000e+00f;
+  conv2d_nchw[19] = 0.000000e+00f;
+  conv2d_nchw[23] = 0.000000e+00f;
+  conv2d_nchw[27] = 0.000000e+00f;
+  for (int rc_outer_outer = 0; rc_outer_outer < 128; ++rc_outer_outer) {
+    __syncthreads();
+    pad_temp_shared[(((int)threadIdx.x) * 12)] = (((((3 <= ((((int)threadIdx.x) * 4) % 27)) && (((((int)threadIdx.x) * 12) % 81) < 72)) && (1 <= ((((int)threadIdx.x) * 12) % 9))) && (((((int)threadIdx.x) * 12) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + (((((int)threadIdx.x) * 4) / 27) * 49)) + ((((((int)threadIdx.x) * 4) % 27) / 3) * 7)) + ((((int)threadIdx.x) * 12) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 1)] = (((((3 <= ((((int)threadIdx.x) * 4) % 27)) && ((((((int)threadIdx.x) * 12) + 1) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 1) % 9))) && ((((((int)threadIdx.x) * 12) + 1) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + (((((int)threadIdx.x) * 4) / 27) * 49)) + ((((((int)threadIdx.x) * 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 1) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 2)] = (((((3 <= ((((int)threadIdx.x) * 4) % 27)) && ((((((int)threadIdx.x) * 12) + 2) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 2) % 9))) && ((((((int)threadIdx.x) * 12) + 2) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + (((((int)threadIdx.x) * 4) / 27) * 49)) + ((((((int)threadIdx.x) * 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 2) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 3)] = (((((3 <= (((((int)threadIdx.x) * 4) + 1) % 27)) && ((((((int)threadIdx.x) * 12) + 3) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 3) % 9))) && ((((((int)threadIdx.x) * 12) + 3) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 1) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 1) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 3) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 4)] = (((((3 <= (((((int)threadIdx.x) * 4) + 1) % 27)) && ((((((int)threadIdx.x) * 12) + 4) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 4) % 9))) && ((((((int)threadIdx.x) * 12) + 4) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 1) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 1) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 4) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 5)] = (((((3 <= (((((int)threadIdx.x) * 4) + 1) % 27)) && ((((((int)threadIdx.x) * 12) + 5) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 5) % 9))) && ((((((int)threadIdx.x) * 12) + 5) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 1) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 1) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 5) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 6)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 6) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 6) % 9))) && ((((((int)threadIdx.x) * 12) + 6) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 2) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 6) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 7)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 7) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 7) % 9))) && ((((((int)threadIdx.x) * 12) + 7) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 2) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 7) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 8)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 8) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 8) % 9))) && ((((((int)threadIdx.x) * 12) + 8) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 2) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 8) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 9)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 9) % 81) < 72)) && (1 <= ((((int)threadIdx.x) * 12) % 9))) && (((((int)threadIdx.x) * 12) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 3) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + ((((int)threadIdx.x) * 12) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 10)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 10) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 1) % 9))) && ((((((int)threadIdx.x) * 12) + 1) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 3) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 1) % 9)) - 8)] : 0.000000e+00f);
+    pad_temp_shared[((((int)threadIdx.x) * 12) + 11)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 11) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 2) % 9))) && ((((((int)threadIdx.x) * 12) + 2) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 3) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 2) % 9)) - 8)] : 0.000000e+00f);
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 168)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 6) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 6) % 9))) && ((((((int)threadIdx.x) * 12) + 6) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 56) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 6) % 9)) - 8)] : 0.000000e+00f);
     }
-  }
-  for (int i1_inner = 0; i1_inner < 2; ++i1_inner) {
-    for (int i3_inner = 0; i3_inner < 7; ++i3_inner) {
-      compute[((((((((int)blockIdx.x) / 7) * 6272) + (((int)threadIdx.x) * 98)) + (i1_inner * 49)) + ((((int)blockIdx.x) % 7) * 7)) + i3_inner)] = max((conv2d_nchw[((i1_inner * 7) + i3_inner)] + bias[((((((int)blockIdx.x) / 7) * 128) + (((int)threadIdx.x) * 2)) + i1_inner)]), 0.000000e+00f);
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 169)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 7) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 7) % 9))) && ((((((int)threadIdx.x) * 12) + 7) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 56) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 7) % 9)) - 8)] : 0.000000e+00f);
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 170)] = (((((3 <= (((((int)threadIdx.x) * 4) + 2) % 27)) && ((((((int)threadIdx.x) * 12) + 8) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 8) % 9))) && ((((((int)threadIdx.x) * 12) + 8) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 56) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 2) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 8) % 9)) - 8)] : 0.000000e+00f);
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 171)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 9) % 81) < 72)) && (1 <= ((((int)threadIdx.x) * 12) % 9))) && (((((int)threadIdx.x) * 12) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 57) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + ((((int)threadIdx.x) * 12) % 9)) - 8)] : 0.000000e+00f);
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 172)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 10) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 1) % 9))) && ((((((int)threadIdx.x) * 12) + 1) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 57) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 1) % 9)) - 8)] : 0.000000e+00f);
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 173)] = (((((1 <= ((((((int)threadIdx.x) * 4) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 11) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 2) % 9))) && ((((((int)threadIdx.x) * 12) + 2) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 57) / 27) * 49)) + (((((((int)threadIdx.x) * 4) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 2) % 9)) - 8)] : 0.000000e+00f);
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 174)] = (((((3 <= (((((int)threadIdx.x) * 4) + 4) % 27)) && ((((((int)threadIdx.x) * 12) + 12) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 3) % 9))) && ((((((int)threadIdx.x) * 12) + 3) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 58) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 3) % 9)) - 8)] : 0.000000e+00f);
     }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 175)] = (((((3 <= (((((int)threadIdx.x) * 4) + 4) % 27)) && ((((((int)threadIdx.x) * 12) + 13) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 4) % 9))) && ((((((int)threadIdx.x) * 12) + 4) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 58) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 4) % 9)) - 8)] : 0.000000e+00f);
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 176)] = (((((3 <= (((((int)threadIdx.x) * 4) + 4) % 27)) && ((((((int)threadIdx.x) * 12) + 14) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 5) % 9))) && ((((((int)threadIdx.x) * 12) + 5) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 58) / 27) * 49)) + (((((((int)threadIdx.x) * 4) + 4) % 27) / 3) * 7)) + (((((int)threadIdx.x) * 12) + 5) % 9)) - 8)] : 0.000000e+00f);
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 177)] = (((((1 <= (((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 15) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 6) % 9))) && ((((((int)threadIdx.x) * 12) + 6) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 59) / 27) * 49)) + ((((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 6) % 9)) - 8)] [...]
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 178)] = (((((1 <= (((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 16) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 7) % 9))) && ((((((int)threadIdx.x) * 12) + 7) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 59) / 27) * 49)) + ((((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 7) % 9)) - 8)] [...]
+    }
+    if (((int)threadIdx.x) < 13) {
+      pad_temp_shared[((((int)threadIdx.x) * 12) + 179)] = (((((1 <= (((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9)) && ((((((int)threadIdx.x) * 12) + 17) % 81) < 72)) && (1 <= (((((int)threadIdx.x) * 12) + 8) % 9))) && ((((((int)threadIdx.x) * 12) + 8) % 9) < 8)) ? data[(((((rc_outer_outer * 196) + ((((((int)threadIdx.x) * 4) + 59) / 27) * 49)) + ((((((((int)threadIdx.x) * 4) + 56) / 3) + 1) % 9) * 7)) + (((((int)threadIdx.x) * 12) + 8) % 9)) - 8)] [...]
+    }
+    kernel_shared[((int)threadIdx.x)] = kernel[(((((int)blockIdx.x) * 36864) + (rc_outer_outer * 36)) + ((int)threadIdx.x))];
+    kernel_shared[(((int)threadIdx.x) + 14)] = kernel[((((((int)blockIdx.x) * 36864) + (rc_outer_outer * 36)) + ((int)threadIdx.x)) + 14)];
+    kernel_shared[(((int)threadIdx.x) + 28)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 28) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 28) % 36))];
+    kernel_shared[(((int)threadIdx.x) + 42)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 42) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 6))];
+    kernel_shared[(((int)threadIdx.x) + 56)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 56) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 20))];
+    kernel_shared[(((int)threadIdx.x) + 70)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 70) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 34) % 36))];
+    kernel_shared[(((int)threadIdx.x) + 84)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 84) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 12))];
+    kernel_shared[(((int)threadIdx.x) + 98)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 98) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 26) % 36))];
+    kernel_shared[(((int)threadIdx.x) + 112)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 112) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 4))];
+    kernel_shared[(((int)threadIdx.x) + 126)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 126) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 18))];
+    kernel_shared[(((int)threadIdx.x) + 140)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 140) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 32) % 36))];
+    kernel_shared[(((int)threadIdx.x) + 154)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 154) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 10))];
+    kernel_shared[(((int)threadIdx.x) + 168)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 168) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 24) % 36))];
+    kernel_shared[(((int)threadIdx.x) + 182)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 182) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 2))];
+    kernel_shared[(((int)threadIdx.x) + 196)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 196) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 16))];
+    kernel_shared[(((int)threadIdx.x) + 210)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 210) / 36) * 4608)) + (rc_outer_outer * 36)) + ((((int)threadIdx.x) + 30) % 36))];
+    kernel_shared[(((int)threadIdx.x) + 224)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 224) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 8))];
+    kernel_shared[(((int)threadIdx.x) + 238)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 238) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 22))];
+    kernel_shared[(((int)threadIdx.x) + 252)] = kernel[((((((int)blockIdx.x) * 36864) + (rc_outer_outer * 36)) + ((int)threadIdx.x)) + 32256)];
+    kernel_shared[(((int)threadIdx.x) + 266)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 266) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 14))];
+    if (((int)threadIdx.x) < 8) {
+      kernel_shared[(((int)threadIdx.x) + 280)] = kernel[((((((int)blockIdx.x) * 36864) + (((((int)threadIdx.x) + 280) / 36) * 4608)) + (rc_outer_outer * 36)) + (((int)threadIdx.x) + 28))];
+    }
+    __syncthreads();
+    for (int rx_outer_inner = 0; rx_outer_inner < 3; ++rx_outer_inner) {
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[(((((int)threadIdx.x) / 7) * 144) + rx_outer_inner)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 36)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 72)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[(((((int)threadIdx.x) % 7) * 9) + rx_outer_inner)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 1)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 2)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 3)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 4)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 5)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 6)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 108)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 3)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 39)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 75)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 9)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 10)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 11)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 12)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 13)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 14)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 15)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 111)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 6)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 42)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 78)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 18)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 19)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 20)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 21)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 22)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 23)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 24)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 114)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 9)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 45)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 81)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 81)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 82)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 83)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 84)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 85)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 86)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 87)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 117)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 12)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 48)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 84)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 90)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 91)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 92)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 93)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 94)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 95)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 96)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 120)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 15)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 51)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 87)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 99)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 100)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 101)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 102)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 103)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 104)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 105)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 123)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 18)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 54)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 90)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 162)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 163)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 164)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 165)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 166)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 167)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 168)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 126)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 21)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 57)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 93)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 171)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 172)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 173)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 174)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 175)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 176)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 177)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 129)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 24)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 60)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 96)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 180)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 181)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 182)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 183)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 184)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 185)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 186)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 132)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 27)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 63)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 99)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 243)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 244)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 245)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 246)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 247)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 248)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 249)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 135)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 30)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 66)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 102)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 252)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 253)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 254)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 255)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 256)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 257)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 258)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 138)]));
+      conv2d_nchw[0] = (conv2d_nchw[0] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+      conv2d_nchw[4] = (conv2d_nchw[4] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+      conv2d_nchw[8] = (conv2d_nchw[8] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+      conv2d_nchw[12] = (conv2d_nchw[12] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+      conv2d_nchw[16] = (conv2d_nchw[16] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+      conv2d_nchw[20] = (conv2d_nchw[20] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+      conv2d_nchw[24] = (conv2d_nchw[24] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 33)]));
+      conv2d_nchw[1] = (conv2d_nchw[1] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+      conv2d_nchw[5] = (conv2d_nchw[5] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+      conv2d_nchw[9] = (conv2d_nchw[9] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+      conv2d_nchw[13] = (conv2d_nchw[13] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+      conv2d_nchw[17] = (conv2d_nchw[17] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+      conv2d_nchw[21] = (conv2d_nchw[21] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+      conv2d_nchw[25] = (conv2d_nchw[25] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 69)]));
+      conv2d_nchw[2] = (conv2d_nchw[2] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+      conv2d_nchw[6] = (conv2d_nchw[6] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+      conv2d_nchw[10] = (conv2d_nchw[10] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+      conv2d_nchw[14] = (conv2d_nchw[14] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+      conv2d_nchw[18] = (conv2d_nchw[18] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+      conv2d_nchw[22] = (conv2d_nchw[22] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+      conv2d_nchw[26] = (conv2d_nchw[26] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 105)]));
+      conv2d_nchw[3] = (conv2d_nchw[3] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 261)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+      conv2d_nchw[7] = (conv2d_nchw[7] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 262)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+      conv2d_nchw[11] = (conv2d_nchw[11] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 263)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+      conv2d_nchw[15] = (conv2d_nchw[15] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 264)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+      conv2d_nchw[19] = (conv2d_nchw[19] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 265)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+      conv2d_nchw[23] = (conv2d_nchw[23] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 266)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+      conv2d_nchw[27] = (conv2d_nchw[27] + (pad_temp_shared[((((((int)threadIdx.x) % 7) * 9) + rx_outer_inner) + 267)] * kernel_shared[((((((int)threadIdx.x) / 7) * 144) + rx_outer_inner) + 141)]));
+    }
+  }
+  for (int i1_inner = 0; i1_inner &lt; 4; ++i1_inner) {
+    compute[((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7))] = max((conv2d_nchw[i1_inner] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+    compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 1)] = max((conv2d_nchw[(i1_inner + 4)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+    compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 2)] = max((conv2d_nchw[(i1_inner + 8)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+    compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 3)] = max((conv2d_nchw[(i1_inner + 12)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+    compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 4)] = max((conv2d_nchw[(i1_inner + 16)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+    compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 5)] = max((conv2d_nchw[(i1_inner + 20)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
+    compute[(((((((int)blockIdx.x) * 392) + ((((int)threadIdx.x) / 7) * 196)) + (i1_inner * 49)) + ((((int)threadIdx.x) % 7) * 7)) + 6)] = max((conv2d_nchw[(i1_inner + 24)] + bias[(((((int)blockIdx.x) * 8) + ((((int)threadIdx.x) / 7) * 4)) + i1_inner)]), 0.000000e+00f);
   }
 }
 </pre></div>
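For orientation, the long unrolled CUDA kernel shown above is the auto-scheduler's lowering of a stride-1, padded conv2d in NCHW layout with a fused bias-add and ReLU epilogue — the fusion is visible in the closing loop's max(conv2d_nchw[...] + bias[...], 0) stores. The following is only a minimal NumPy sketch of that reference computation, with illustrative function and argument names and assuming stride 1; it is not code from the tutorial.

# Rough NumPy reference for what the generated kernel computes:
# conv2d (NCHW), then per-output-channel bias add and ReLU.
import numpy as np

def conv2d_nchw_bias_relu(data, weight, bias, pad=1):
    """data: (N, C, H, W), weight: (O, C, KH, KW), bias: (O,); stride 1."""
    n, c, h, w = data.shape
    o, _, kh, kw = weight.shape
    padded = np.pad(data, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    out_h, out_w = h + 2 * pad - kh + 1, w + 2 * pad - kw + 1
    out = np.zeros((n, o, out_h, out_w), dtype=data.dtype)
    for oc in range(o):
        for y in range(out_h):
            for x in range(out_w):
                window = padded[:, :, y:y + kh, x:x + kw]            # (N, C, KH, KW)
                out[:, oc, y, x] = (window * weight[oc]).sum(axis=(1, 2, 3))
    return np.maximum(out + bias.reshape(1, o, 1, 1), 0.0)           # bias + ReLU epilogue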
@@ -1549,7 +1582,7 @@ In the example below we resume the status and do 5 more trials.</p>
 Get devices for measurement successfully!
 </pre></div>
 </div>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  20.665 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes  20.451 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-conv2d-layer-cuda-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e3e540f3b477c0c52d8eb73e674e8ffd/tune_conv2d_layer_cuda.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_conv2d_layer_cuda.py</span></code></a></p>
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
index 8bde4dbbb..dde7a3728 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_cuda.html
@@ -876,7 +876,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  10.0707      10.0897      10.0959      10.0265       0.0314
+  10.1876      10.1918      10.2093      10.1617       0.0197
 </pre></div>
 </div>
 </div>
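The summary line above (mean/median/max/min/std, all in milliseconds) is a reduction over repeated end-to-end runs; the tutorial collects per-repeat latencies from the built module's time_evaluator and reduces them along these lines (the sample latencies below are made up purely for illustration):

    import numpy as np

    # per-repeat latencies in seconds, e.g. np.array(ftimer().results)
    # as returned by the module's time_evaluator in the tutorial
    results_s = np.array([0.0102, 0.0101, 0.0103, 0.0102])
    res_ms = results_s * 1000.0
    print("mean %.4f  median %.4f  max %.4f  min %.4f  std %.4f" % (
        np.mean(res_ms), np.median(res_ms), np.max(res_ms),
        np.min(res_ms), np.std(res_ms)))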
diff --git a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
index 52247bcbf..5e544ada1 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_network_x86.html
@@ -895,7 +895,7 @@ so we can read the log file and load the best schedules.</p>
 Evaluate inference time cost...
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-  767.2840     767.8188     770.8792     763.1539      3.1765
+  759.3154     758.3050     761.5435     758.0978      1.5778
 </pre></div>
 </div>
 </div>
@@ -917,7 +917,7 @@ to learn how to use the RPC Tracker and RPC Server.
 To use the RPC Tracker in auto-scheduler, replace the runner in <code class="code docutils literal notranslate"><span class="pre">TuningOptions</span></code>
 with <a class="reference internal" href="../../reference/api/python/auto_scheduler.html#tvm.auto_scheduler.RPCRunner" title="tvm.auto_scheduler.RPCRunner"><code class="xref any py py-class docutils literal notranslate"><span class="pre">auto_scheduler.RPCRunner</span></code></a>.</p></li>
 </ol>
-<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  20.512 seconds)</p>
+<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes  20.072 seconds)</p>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autoscheduler-tune-network-x86-py">
 <div class="sphx-glr-download docutils container">
 <p><a class="reference download internal" download="" href="../../_downloads/e416b94ca1090b0897c0f6e0df95b911/tune_network_x86.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tune_network_x86.py</span></code></a></p>
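The last item in the list above describes replacing the runner in TuningOptions with auto_scheduler.RPCRunner so measurements go through an RPC Tracker. A rough sketch of what that change can look like, with a made-up tracker address, device key and log file name (none of these come from the output above):

    from tvm import auto_scheduler

    # hypothetical tracker coordinates and device key
    runner = auto_scheduler.RPCRunner(
        key="rasp4b-64", host="127.0.0.1", port=9190,
        timeout=30, repeat=1, min_repeat_ms=200,
    )
    tune_option = auto_scheduler.TuningOptions(
        num_measure_trials=200,
        runner=runner,  # replaces the default LocalRunner
        measure_callbacks=[auto_scheduler.RecordToFile("network.json")],
    )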
diff --git a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
index ef23efac2..f54eb5439 100644
--- a/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
+++ b/docs/how_to/tune_with_autoscheduler/tune_sparse_x86.html
@@ -600,7 +600,7 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
              placeholder_4: Buffer(placeholder_14: Pointer(float32), float32, [65536], []),
              compute: Buffer(compute_2: Pointer(float32), float32, [65536], [])}
   buffer_map = {placeholder_5: placeholder, placeholder_6: placeholder_1, placeholder_7: placeholder_2, placeholder_8: placeholder_3, placeholder_9: placeholder_4, compute_1: compute}
-  preflattened_buffer_map = {placeholder_5: placeholder_15: Buffer(placeholder_10, float32, [128, 256], []), placeholder_9: placeholder_16: Buffer(placeholder_14, float32, [128, 512], []), placeholder_6: placeholder_17: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_7: placeholder_18: Buffer(placeholder_12, int32, [4916], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_8: placeholder_19: Buffer(placeholder_13, int32, [33], [])} {
+  preflattened_buffer_map = {placeholder_7: placeholder_15: Buffer(placeholder_12, int32, [4916], []), placeholder_9: placeholder_16: Buffer(placeholder_14, float32, [128, 512], []), compute_1: compute_3: Buffer(compute_2, float32, [128, 512], []), placeholder_6: placeholder_17: Buffer(placeholder_11, float32, [4916, 16, 1], []), placeholder_5: placeholder_18: Buffer(placeholder_10, float32, [128, 256], []), placeholder_8: placeholder_19: Buffer(placeholder_13, int32, [33], [])} {
   for (i0.outer.i1.outer.fused: int32, 0, 32) &quot;parallel&quot; {
     allocate(compute_4: Pointer(global float32), float32, [2048]), storage_scope = global {
       for (i.outer.inner: int32, 0, 2) {
@@ -662,7 +662,7 @@ layout transformation, parallelization, vectorization, unrolling, and operator f
 </pre></div>
 </div>
 <p class="sphx-glr-script-out">Out:</p>
-<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 1.625 ms
+<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Execution time of this operator: 1.621 ms
 </pre></div>
 </div>
 <div class="admonition note">
diff --git a/docs/how_to/tune_with_autotvm/sg_execution_times.html b/docs/how_to/tune_with_autotvm/sg_execution_times.html
index 9d0148a8b..409b03393 100644
--- a/docs/how_to/tune_with_autotvm/sg_execution_times.html
+++ b/docs/how_to/tune_with_autotvm/sg_execution_times.html
@@ -300,13 +300,13 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-tune-with-autotvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:45.248</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
+<p><strong>00:43.947</strong> total execution time for <strong>how_to_tune_with_autotvm</strong> files:</p>
 <ul class="simple">
-<li><p><strong>00:44.386</strong>: <a class="reference internal" href="tune_conv2d_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-conv2d-cuda-py"><span class="std std-ref">Tuning High Performance Convolution on NVIDIA GPUs</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_cuda.py</span></code>)</p></li>
-<li><p><strong>00:00.232</strong>: <a class="reference internal" href="tune_relay_x86.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-x86-py"><span class="std std-ref">Auto-tuning a Convolutional Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_x86.py</span></code>)</p></li>
-<li><p><strong>00:00.211</strong>: <a class="reference internal" href="tune_relay_arm.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-arm-py"><span class="std std-ref">Auto-tuning a Convolutional Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_arm.py</span></code>)</p></li>
-<li><p><strong>00:00.210</strong>: <a class="reference internal" href="tune_relay_mobile_gpu.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-mobile-gpu-py"><span class="std std-ref">Auto-tuning a Convolutional Network for Mobile GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_mobile_gpu.py</span></code>)</p></li>
-<li><p><strong>00:00.209</strong>: <a class="reference internal" href="tune_relay_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-cuda-py"><span class="std std-ref">Auto-tuning a Convolutional Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_cuda.py</span></code>)</p></li>
+<li><p><strong>00:43.053</strong>: <a class="reference internal" href="tune_conv2d_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-conv2d-cuda-py"><span class="std std-ref">Tuning High Performance Convolution on NVIDIA GPUs</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_conv2d_cuda.py</span></code>)</p></li>
+<li><p><strong>00:00.239</strong>: <a class="reference internal" href="tune_relay_x86.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-x86-py"><span class="std std-ref">Auto-tuning a Convolutional Network for x86 CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_x86.py</span></code>)</p></li>
+<li><p><strong>00:00.222</strong>: <a class="reference internal" href="tune_relay_mobile_gpu.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-mobile-gpu-py"><span class="std std-ref">Auto-tuning a Convolutional Network for Mobile GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_mobile_gpu.py</span></code>)</p></li>
+<li><p><strong>00:00.220</strong>: <a class="reference internal" href="tune_relay_cuda.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-cuda-py"><span class="std std-ref">Auto-tuning a Convolutional Network for NVIDIA GPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_cuda.py</span></code>)</p></li>
+<li><p><strong>00:00.213</strong>: <a class="reference internal" href="tune_relay_arm.html#sphx-glr-how-to-tune-with-autotvm-tune-relay-arm-py"><span class="std std-ref">Auto-tuning a Convolutional Network for ARM CPU</span></a> (<code class="docutils literal notranslate"><span class="pre">tune_relay_arm.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
index c8e0b9235..e01509d3f 100644
--- a/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
+++ b/docs/how_to/tune_with_autotvm/tune_conv2d_cuda.html
@@ -1142,8 +1142,8 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 4, 32]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 1, 128]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2885496
-No: 6   GFLOPS: 63.26/63.26     result: MeasureResult(costs=(0.0036596923,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6318109035491943, timestamp=1651374043.883832)        [(&#39;tile_f&#39;, [-1, 1, 1, 1]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 4, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,3754080
-No: 7   GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 6   GFLOPS: 92.58/92.58     result: MeasureResult(costs=(0.0025004747916666666,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.6190400123596191, timestamp=1651508826.0159192)      [(&#39;tile_f&#39;, [-1, 1, 1, 1]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 4, 4]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,3754080
+No: 7   GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -1266,7 +1266,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 1, 16, 32]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 256, 1]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6225319
-No: 8   GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 8   GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -1389,7 +1389,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 2, 1, 32]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 8, 64]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 0)],None,943546
-No: 9   GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 9   GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -1512,7 +1512,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 4, 16, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 16, 32]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2868708
-No: 10  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 10  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 142, in build
     res = future.result()
   File &quot;/usr/lib/python3.7/concurrent/futures/_base.py&quot;, line 435, in result
@@ -1530,7 +1530,7 @@ No: 10  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
 TimeoutError
 
         [(&#39;tile_f&#39;, [-1, 32, 2, 4]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 4, 2]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4691833
-No: 11  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 11  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -1653,7 +1653,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 1, 2, 64]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 4]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 0)],None,1042124
-No: 12  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 12  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -1776,7 +1776,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 32, 1, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 32, 16]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,10013405
-No: 13  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 13  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -1899,7 +1899,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 8, 8, 2]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 7, 1]), (&#39;tile_rc&#39;, [-1, 4, 32]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6732082
-No: 14  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 14  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -2022,7 +2022,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 2, 4, 32]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 1, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 128]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 1)],None,7536735
-No: 15  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 15  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -2145,7 +2145,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 2, 1, 4]), (&#39;tile_y&#39;, [-1, 1, 1, 7]), (&#39;tile_x&#39;, [-1, 1, 1, 7]), (&#39;tile_rc&#39;, [-1, 128, 4]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 1, 1]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 0)],None,482121
-No: 16  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 16  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -2268,7 +2268,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 2, 1, 16]), (&#39;tile_y&#39;, [-1, 1, 7, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 32, 8]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 512), (&#39;unroll_explicit&#39;, 0)],None,2824525
-No: 17  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 17  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -2391,7 +2391,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 64, 1, 1]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 8, 8]), (&#39;tile_ry&#39;, [-1, 1, 3]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 0)],None,4559286
-No: 18  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 18  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 571, in __call__
     func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 523, in _build_func_common
@@ -2514,7 +2514,7 @@ Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 854, in verify_pass
     raise InstantiationError(&quot;Skipped because of invalid gpu kernel&quot;)
 tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [(&#39;tile_f&#39;, [-1, 1, 32, 16]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 512]), (&#39;tile_ry&#39;, [-1, 3, 1]), (&#39;tile_rx&#39;, [-1, 3, 1]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,9677544
-No: 19  GFLOPS: 0.00/63.26      result: Traceback (most recent call last):
+No: 19  GFLOPS: 0.00/92.58      result: Traceback (most recent call last):
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 721, in __call__
     yield remote, remote.load_module(os.path.split(build_result.filename)[1])
   File &quot;/workspace/python/tvm/autotvm/measure/measure_methods.py&quot;, line 685, in run_through_rpc
@@ -2602,7 +2602,7 @@ tvm._ffi.base.TVMError: Traceback (most recent call last):
   15: _PyEval_EvalFrameDefault
   14: 0x0000000000537c30
   13: _PyObject_FastCallKeywords
-  12: 0x00007fc64b807fa2
+  12: 0x00007f3a3047dfa2
   11: _ctypes_callproc
   10: ffi_call
   9: ffi_call_unix64
@@ -2667,7 +2667,7 @@ Traceback (most recent call last):
   21: _PyFunction_FastCallKeywords
   20: _PyEval_EvalFrameDefault
   19: _PyFunction_FastCall      [(&#39;tile_f&#39;, [-1, 8, 2, 16]), (&#39;tile_y&#39;, [-1, 7, 1, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 1, 1]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 0), (&#39;unroll_explicit&#39;, 1)],None,6390073
-No: 20  GFLOPS: 142.48/142.48   result: MeasureResult(costs=(0.00162477076,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.4184842109680176, timestamp=1651374070.365538)       [(&#39;tile_f&#39;, [-1, 1, 4, 1]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,9881539
+No: 20  GFLOPS: 141.66/141.66   result: MeasureResult(costs=(0.0016342177903225807,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.1537678241729736, timestamp=1651508852.1777332)      [(&#39;tile_f&#39;, [-1, 1, 4, 1]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,9881539
 </pre></div>
 </div>
 <p>Finally we can inspect the best config from log file, check correctness,
@@ -2706,7 +2706,7 @@ and measure running time.</p>
 <p class="sphx-glr-script-out">Out:</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>Best config:
 [(&#39;tile_f&#39;, [-1, 1, 4, 1]), (&#39;tile_y&#39;, [-1, 1, 1, 1]), (&#39;tile_x&#39;, [-1, 7, 1, 1]), (&#39;tile_rc&#39;, [-1, 4, 1]), (&#39;tile_ry&#39;, [-1, 1, 1]), (&#39;tile_rx&#39;, [-1, 1, 3]), (&#39;auto_unroll_max_step&#39;, 1500), (&#39;unroll_explicit&#39;, 1)],None,9881539
-Time cost of this operator: 0.002004
+Time cost of this operator: 0.001984
 </pre></div>
 </div>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-tune-with-autotvm-tune-conv2d-cuda-py">
diff --git a/docs/how_to/work_with_microtvm/micro_autotune.html b/docs/how_to/work_with_microtvm/micro_autotune.html
index c3de52b30..936162be9 100644
--- a/docs/how_to/work_with_microtvm/micro_autotune.html
+++ b/docs/how_to/work_with_microtvm/micro_autotune.html
@@ -553,10 +553,10 @@ the tuned operator.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>########## Build without Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs
 ---------                                     ---                                           --------  -------  -----              ------  -------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  314.0     98.74    (1, 2, 10, 10, 3)  2       1
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.073     0.966    (1, 6, 10, 10)     1       1
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.934     0.294    (1, 1, 10, 10, 3)  1       1
-Total_time                                    -                                             318.007   -        -                  -       -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  315.2     98.742   (1, 2, 10, 10, 3)  2       1
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       3.097     0.97     (1, 6, 10, 10)     1       1
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.918     0.288    (1, 1, 10, 10, 3)  1       1
+Total_time                                    -                                             319.215   -        -                  -       -
 </pre></div>
 </div>
 </div>
@@ -608,10 +608,10 @@ Total_time                                    -
 <div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>########## Build with Autotuning ##########
 Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs
 ---------                                     ---                                           --------  -------  -----              ------  -------
-tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  81.4      96.877   (1, 6, 10, 10, 1)  2       1
-tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.702     2.025    (1, 6, 10, 10)     1       1
-tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.922     1.097    (1, 1, 10, 10, 3)  1       1
-Total_time                                    -                                             84.024    -        -                  -       -
+tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  81.15     96.765   (1, 6, 10, 10, 1)  2       1
+tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.812     2.161    (1, 6, 10, 10)     1       1
+tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.901     1.074    (1, 1, 10, 10, 3)  1       1
+Total_time                                    -                                             83.863    -        -                  -       -
 </pre></div>
 </div>
 <div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-how-to-work-with-microtvm-micro-autotune-py">
diff --git a/docs/how_to/work_with_microtvm/sg_execution_times.html b/docs/how_to/work_with_microtvm/sg_execution_times.html
index 08bb11857..68be84414 100644
--- a/docs/how_to/work_with_microtvm/sg_execution_times.html
+++ b/docs/how_to/work_with_microtvm/sg_execution_times.html
@@ -300,13 +300,13 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-microtvm-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:45.186</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
+<p><strong>00:44.150</strong> total execution time for <strong>how_to_work_with_microtvm</strong> files:</p>
 <ul class="simple">
-<li><p><strong>00:40.959</strong>: <a class="reference internal" href="micro_autotune.html#sphx-glr-how-to-work-with-microtvm-micro-autotune-py"><span class="std std-ref">Autotuning with microTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_autotune.py</span></code>)</p></li>
-<li><p><strong>00:03.619</strong>: <a class="reference internal" href="micro_tflite.html#sphx-glr-how-to-work-with-microtvm-micro-tflite-py"><span class="std std-ref">microTVM with TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_tflite.py</span></code>)</p></li>
+<li><p><strong>00:40.057</strong>: <a class="reference internal" href="micro_autotune.html#sphx-glr-how-to-work-with-microtvm-micro-autotune-py"><span class="std std-ref">Autotuning with microTVM</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_autotune.py</span></code>)</p></li>
+<li><p><strong>00:03.492</strong>: <a class="reference internal" href="micro_tflite.html#sphx-glr-how-to-work-with-microtvm-micro-tflite-py"><span class="std std-ref">microTVM with TFLite Models</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_tflite.py</span></code>)</p></li>
 <li><p><strong>00:00.208</strong>: <a class="reference internal" href="micro_tvmc.html#sphx-glr-how-to-work-with-microtvm-micro-tvmc-py"><span class="std std-ref">Executing a Tiny Model with TVMC Micro</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_tvmc.py</span></code>)</p></li>
-<li><p><strong>00:00.202</strong>: <a class="reference internal" href="micro_ethosu.html#sphx-glr-how-to-work-with-microtvm-micro-ethosu-py"><span class="std std-ref">Running TVM on bare metal Arm(R) Cortex(R)-M55 CPU and Ethos(TM)-U55 NPU</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_ethosu.py</span></code>)</p></li>
-<li><p><strong>00:00.199</strong>: <a class="reference internal" href="micro_reference_vm.html#sphx-glr-how-to-work-with-microtvm-micro-reference-vm-py"><span class="std std-ref">microTVM Reference Virtual Machines</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_reference_vm.py</span></code>)</p></li>
+<li><p><strong>00:00.197</strong>: <a class="reference internal" href="micro_ethosu.html#sphx-glr-how-to-work-with-microtvm-micro-ethosu-py"><span class="std std-ref">Running TVM on bare metal Arm(R) Cortex(R)-M55 CPU and Ethos(TM)-U55 NPU</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_ethosu.py</span></code>)</p></li>
+<li><p><strong>00:00.197</strong>: <a class="reference internal" href="micro_reference_vm.html#sphx-glr-how-to-work-with-microtvm-micro-reference-vm-py"><span class="std std-ref">microTVM Reference Virtual Machines</span></a> (<code class="docutils literal notranslate"><span class="pre">micro_reference_vm.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/work_with_relay/sg_execution_times.html b/docs/how_to/work_with_relay/sg_execution_times.html
index aa9a44697..2809ea9b5 100644
--- a/docs/how_to/work_with_relay/sg_execution_times.html
+++ b/docs/how_to/work_with_relay/sg_execution_times.html
@@ -300,11 +300,11 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-relay-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:09.226</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
+<p><strong>00:06.390</strong> total execution time for <strong>how_to_work_with_relay</strong> files:</p>
 <ul class="simple">
-<li><p><strong>00:07.184</strong>: <a class="reference internal" href="using_external_lib.html#sphx-glr-how-to-work-with-relay-using-external-lib-py"><span class="std std-ref">Using External Libraries in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_external_lib.py</span></code>)</p></li>
-<li><p><strong>00:01.829</strong>: <a class="reference internal" href="build_gcn.html#sphx-glr-how-to-work-with-relay-build-gcn-py"><span class="std std-ref">Building a Graph Convolutional Network</span></a> (<code class="docutils literal notranslate"><span class="pre">build_gcn.py</span></code>)</p></li>
-<li><p><strong>00:00.214</strong>: <a class="reference internal" href="using_relay_viz.html#sphx-glr-how-to-work-with-relay-using-relay-viz-py"><span class="std std-ref">Use Relay Visualizer to Visualize Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_relay_viz.py</span></code>)</p></li>
+<li><p><strong>00:04.488</strong>: <a class="reference internal" href="using_external_lib.html#sphx-glr-how-to-work-with-relay-using-external-lib-py"><span class="std std-ref">Using External Libraries in Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_external_lib.py</span></code>)</p></li>
+<li><p><strong>00:01.691</strong>: <a class="reference internal" href="build_gcn.html#sphx-glr-how-to-work-with-relay-build-gcn-py"><span class="std std-ref">Building a Graph Convolutional Network</span></a> (<code class="docutils literal notranslate"><span class="pre">build_gcn.py</span></code>)</p></li>
+<li><p><strong>00:00.212</strong>: <a class="reference internal" href="using_relay_viz.html#sphx-glr-how-to-work-with-relay-using-relay-viz-py"><span class="std std-ref">Use Relay Visualizer to Visualize Relay</span></a> (<code class="docutils literal notranslate"><span class="pre">using_relay_viz.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/work_with_schedules/sg_execution_times.html b/docs/how_to/work_with_schedules/sg_execution_times.html
index 3e4de3964..6ff311930 100644
--- a/docs/how_to/work_with_schedules/sg_execution_times.html
+++ b/docs/how_to/work_with_schedules/sg_execution_times.html
@@ -300,16 +300,16 @@
             
   <div class="section" id="computation-times">
 <span id="sphx-glr-how-to-work-with-schedules-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
-<p><strong>00:05.920</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
+<p><strong>00:05.572</strong> total execution time for <strong>how_to_work_with_schedules</strong> files:</p>
 <ul class="simple">
-<li><p><strong>00:02.180</strong>: <a class="reference internal" href="intrin_math.html#sphx-glr-how-to-work-with-schedules-intrin-math-py"><span class="std std-ref">Intrinsics and Math Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">intrin_math.py</span></code>)</p></li>
-<li><p><strong>00:01.234</strong>: <a class="reference internal" href="tensorize.html#sphx-glr-how-to-work-with-schedules-tensorize-py"><span class="std std-ref">Use Tensorize to Leverage Hardware Intrinsics</span></a> (<code class="docutils literal notranslate"><span class="pre">tensorize.py</span></code>)</p></li>
-<li><p><strong>00:00.770</strong>: <a class="reference internal" href="scan.html#sphx-glr-how-to-work-with-schedules-scan-py"><span class="std std-ref">Scan and Recurrent Kernel</span></a> (<code class="docutils literal notranslate"><span class="pre">scan.py</span></code>)</p></li>
-<li><p><strong>00:00.745</strong>: <a class="reference internal" href="reduction.html#sphx-glr-how-to-work-with-schedules-reduction-py"><span class="std std-ref">Reduction</span></a> (<code class="docutils literal notranslate"><span class="pre">reduction.py</span></code>)</p></li>
-<li><p><strong>00:00.315</strong>: <a class="reference internal" href="extern_op.html#sphx-glr-how-to-work-with-schedules-extern-op-py"><span class="std std-ref">External Tensor Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">extern_op.py</span></code>)</p></li>
-<li><p><strong>00:00.231</strong>: <a class="reference internal" href="schedule_primitives.html#sphx-glr-how-to-work-with-schedules-schedule-primitives-py"><span class="std std-ref">Schedule Primitives in TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">schedule_primitives.py</span></code>)</p></li>
-<li><p><strong>00:00.227</strong>: <a class="reference internal" href="tedd.html#sphx-glr-how-to-work-with-schedules-tedd-py"><span class="std std-ref">Use Tensor Expression Debug Display (TEDD) for Visualization</span></a> (<code class="docutils literal notranslate"><span class="pre">tedd.py</span></code>)</p></li>
-<li><p><strong>00:00.218</strong>: <a class="reference internal" href="tuple_inputs.html#sphx-glr-how-to-work-with-schedules-tuple-inputs-py"><span class="std std-ref">Compute and Reduce with Tuple Inputs</span></a> (<code class="docutils literal notranslate"><span class="pre">tuple_inputs.py</span></code>)</p></li>
+<li><p><strong>00:02.035</strong>: <a class="reference internal" href="intrin_math.html#sphx-glr-how-to-work-with-schedules-intrin-math-py"><span class="std std-ref">Intrinsics and Math Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">intrin_math.py</span></code>)</p></li>
+<li><p><strong>00:01.097</strong>: <a class="reference internal" href="tensorize.html#sphx-glr-how-to-work-with-schedules-tensorize-py"><span class="std std-ref">Use Tensorize to Leverage Hardware Intrinsics</span></a> (<code class="docutils literal notranslate"><span class="pre">tensorize.py</span></code>)</p></li>
+<li><p><strong>00:00.724</strong>: <a class="reference internal" href="reduction.html#sphx-glr-how-to-work-with-schedules-reduction-py"><span class="std std-ref">Reduction</span></a> (<code class="docutils literal notranslate"><span class="pre">reduction.py</span></code>)</p></li>
+<li><p><strong>00:00.699</strong>: <a class="reference internal" href="scan.html#sphx-glr-how-to-work-with-schedules-scan-py"><span class="std std-ref">Scan and Recurrent Kernel</span></a> (<code class="docutils literal notranslate"><span class="pre">scan.py</span></code>)</p></li>
+<li><p><strong>00:00.310</strong>: <a class="reference internal" href="extern_op.html#sphx-glr-how-to-work-with-schedules-extern-op-py"><span class="std std-ref">External Tensor Functions</span></a> (<code class="docutils literal notranslate"><span class="pre">extern_op.py</span></code>)</p></li>
+<li><p><strong>00:00.246</strong>: <a class="reference internal" href="schedule_primitives.html#sphx-glr-how-to-work-with-schedules-schedule-primitives-py"><span class="std std-ref">Schedule Primitives in TVM</span></a> (<code class="docutils literal notranslate"><span class="pre">schedule_primitives.py</span></code>)</p></li>
+<li><p><strong>00:00.238</strong>: <a class="reference internal" href="tedd.html#sphx-glr-how-to-work-with-schedules-tedd-py"><span class="std std-ref">Use Tensor Expression Debug Display (TEDD) for Visualization</span></a> (<code class="docutils literal notranslate"><span class="pre">tedd.py</span></code>)</p></li>
+<li><p><strong>00:00.223</strong>: <a class="reference internal" href="tuple_inputs.html#sphx-glr-how-to-work-with-schedules-tuple-inputs-py"><span class="std std-ref">Compute and Reduce with Tuple Inputs</span></a> (<code class="docutils literal notranslate"><span class="pre">tuple_inputs.py</span></code>)</p></li>
 </ul>
 </div>
 
diff --git a/docs/how_to/work_with_schedules/tensorize.html b/docs/how_to/work_with_schedules/tensorize.html
index fe2a5d070..f051c3e61 100644
--- a/docs/how_to/work_with_schedules/tensorize.html
+++ b/docs/how_to/work_with_schedules/tensorize.html
@@ -552,7 +552,7 @@ The importing needs to happen before the tensorized GEMV being executed.</p>
              C: Buffer(C_2: Pointer(float32), float32, [524288], [])}
   buffer_map = {A_1: A, B_1: B, C_1: C}
   preflattened_buffer_map = {A_1: A_3: Buffer(A_2, float32, [1024, 64], []), B_1: B_3: Buffer(B_2, float32, [512, 64], []), C_1: C_3: Buffer(C_2, float32, [1024, 512], [])} {
-  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmp8plnll6e/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmp8plnll6e/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
+  attr [IterVar(i: int32, (nullptr), &quot;DataPar&quot;, &quot;&quot;)] &quot;pragma_import_llvm&quot; = &quot;; ModuleID = &#39;/tmp/tmppb0827hi/input0.cc&#39;\nsource_filename = \&quot;/tmp/tmppb0827hi/input0.cc\&quot;\ntarget datalayout = \&quot;e-m:e-i64:64-f80:128-n8:16:32:64-S128\&quot;\ntarget triple = \&quot;x86_64-pc-linux-gnu\&quot;\n\n; Function Attrs: noinline nounwind optnone uwtable\ndefine dso_local i32 @gemv_update(float*, float*, float*, i32, i32, i32) #0 {\n  %7 = allo [...]
   for (i, 0, 1024) {
     for (j.outer: int32, 0, 32) {
       @tir.call_extern(&quot;gemv_update&quot;, @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), C_2, ((i*512) + (j.outer*16)), 16, 2, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), A_2, (i*64), 64, 1, dtype=handle), @tir.tvm_access_ptr(@tir.type_annotation(, dtype=float32), B_2, (j.outer*1024), 1024, 1, dtype=handle), 16, 64, 64, dtype=int32)
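Each gemv_update call above hands the external kernel a 16-element slice of C, one 64-element row of A and a 16x64 tile of B, the trailing 16, 64, 64 arguments presumably being the tile height, the reduction length and B's row stride. In NumPy terms the tensorized inner tile is roughly (names are illustrative, not taken from the generated code):

    import numpy as np

    def gemv_update_tile(c_tile, a_row, b_tile):
        # c_tile: (16,) slice of C, a_row: (64,) row of A, b_tile: (16, 64) tile of B
        c_tile += b_tile @ a_row   # cc[i] += sum over j of aa[j] * bb[i, j]
        return c_tile

    c = gemv_update_tile(np.zeros(16, "float32"),
                         np.random.rand(64).astype("float32"),
                         np.random.rand(16, 64).astype("float32"))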
diff --git a/docs/reference/api/doxygen/annotated.html b/docs/reference/api/doxygen/annotated.html
index 90cdeab54..d36691635 100644
--- a/docs/reference/api/doxygen/annotated.html
+++ b/docs/reference/api/doxygen/annotated.html
@@ -955,87 +955,85 @@ $(function() {
 <tr id="row_1_66_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1IntImmNode.html" target="_self">IntImmNode</a></td><td class="desc">Constant integer literals in the program </td></tr>
 <tr id="row_1_67_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1IRModule.html" target="_self">IRModule</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1IRModuleNode.html" title="IRModule that holds functions and type definitions. ">IRModuleNode</a> </td></tr>
 <tr id="row_1_68_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1IRModuleNode.html" target="_self">IRModuleNode</a></td><td class="desc"><a class="el" href="classtvm_1_1IRModule.html" title="Managed reference class to IRModuleNode. ">IRModule</a> that holds functions and type definitions </td></tr>
-<tr id="row_1_69_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1LinkedParam.html" target="_self">LinkedParam</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1LinkedParamNode.html" title="Describes one parameter that should be linked into the generated module. ">LinkedParamNode</a> </td></tr>
-<tr id="row_1_70_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1LinkedParamNode.html" target="_self">LinkedParamNode</a></td><td class="desc">Describes one parameter that should be linked into the generated module </td></tr>
-<tr id="row_1_71_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1MemoryInfo.html" target="_self">MemoryInfo</a></td><td class="desc">Defines memory info </td></tr>
-<tr id="row_1_72_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1MemoryInfoNode.html" target="_self">MemoryInfoNode</a></td><td class="desc">Memory information of special memory region. Use <a class="el" href="classtvm_1_1MemoryInfo.html" title="Defines memory info. ">MemoryInfo</a> as its container type </td></tr>
-<tr id="row_1_73_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1NDArrayContainerTrait.html" target="_self">NDArrayContainerTrait</a></td><td class="desc"></td></tr>
-<tr id="row_1_74_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1NodeFunctor.html" target="_self">NodeFunctor</a></td><td class="desc">A dynamically dispatched functor on the type of the first argument </td></tr>
-<tr id="row_1_75_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1NodeFunctor_3_01R_07const_01ObjectRef_01_6n_00_01Args_8_8_8_08_4.html" target="_self">NodeFunctor&lt; R(const ObjectRef &amp;n, Args...)&gt;</a></td><td class="desc"></td></tr>
-<tr id="row_1_76_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Op.html" target="_self">Op</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1OpNode.html" title="Primitive Op(builtin intrinsics) ">OpNode</a> </td></tr>
-<tr id="row_1_77_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1OpAttrMap.html" target="_self">OpAttrMap</a></td><td class="desc">Map&lt;Op,ValueType&gt; used to store meta-information about <a class="el" href="classtvm_1_1Op.html" title="Managed reference class to OpNode. ">Op</a> </td></tr>
-<tr id="row_1_78_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1OpNode.html" target="_self">OpNode</a></td><td class="desc">Primitive Op(builtin intrinsics) </td></tr>
-<tr id="row_1_79_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1OpRegEntry.html" target="_self">OpRegEntry</a></td><td class="desc">Helper structure to register operators </td></tr>
-<tr id="row_1_80_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PointerType.html" target="_self">PointerType</a></td><td class="desc"></td></tr>
-<tr id="row_1_81_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PointerTypeNode.html" target="_self">PointerTypeNode</a></td><td class="desc">Low-level raw pointer type </td></tr>
-<tr id="row_1_82_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PoolInfo.html" target="_self">PoolInfo</a></td><td class="desc"></td></tr>
-<tr id="row_1_83_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1PoolInfoNode.html" target="_self">PoolInfoNode</a></td><td class="desc">Describes a pool of memory accessible by one or more targets </td></tr>
-<tr id="row_1_84_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimExpr.html" target="_self">PrimExpr</a></td><td class="desc">Reference to <a class="el" href="classtvm_1_1PrimExprNode.html" title="Base node of all primitive expressions. ">PrimExprNode</a> </td></tr>
-<tr id="row_1_85_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimExprNode.html" target="_self">PrimExprNode</a></td><td class="desc">Base node of all primitive expressions </td></tr>
-<tr id="row_1_86_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimType.html" target="_self">PrimType</a></td><td class="desc"></td></tr>
-<tr id="row_1_87_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimTypeNode.html" target="_self">PrimTypeNode</a></td><td class="desc">Primitive data types used in the low-level IR </td></tr>
-<tr id="row_1_88_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Range.html" target="_self">Range</a></td><td class="desc"><a class="el" href="classtvm_1_1Range.html" title="Range constainer. ">Range</a> constainer </td></tr>
-<tr id="row_1_89_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RangeNode.html" target="_self">RangeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Range.html" title="Range constainer. ">Range</a> over one dimension </td></tr>
-<tr id="row_1_90_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_1_90_" class="arrow" onclick="toggleFolder('1_90_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1ReflectionVTable.html" target="_self">ReflectionVTable</a></td><td class="desc">Virtual function table to support IR/AST node reflection </td></tr>
-<tr id="row_1_90_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1ReflectionVTable_1_1Registry.html" target="_self">Registry</a></td><td class="desc"><a class="el" href="classtvm_1_1ReflectionVTable_1_1Registry.html" title="Registry of a reflection table. ">Registry</a> of a reflection table </td></tr>
-<tr id="row_1_91_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayExpr.html" target="_self">RelayExpr</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1RelayExprNode.html" title="Base node of all non-primitive expressions. ">RelayExprNode</a> </td></tr>
-<tr id="row_1_92_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayExprNode.html" target="_self">RelayExprNode</a></td><td class="desc">Base node of all non-primitive expressions </td></tr>
-<tr id="row_1_93_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayRefType.html" target="_self">RelayRefType</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1RelayRefTypeNode.html" title="Reference Type High-level Relay IR. ">RelayRefTypeNode</a> </td></tr>
-<tr id="row_1_94_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayRefTypeNode.html" target="_self">RelayRefTypeNode</a></td><td class="desc">Reference <a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> High-level Relay IR </td></tr>
-<tr id="row_1_95_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1ReprPrinter.html" target="_self">ReprPrinter</a></td><td class="desc">A printer class to print the AST/IR nodes </td></tr>
-<tr id="row_1_96_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_1_96_" class="arrow" onclick="toggleFolder('1_96_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SEqualReducer.html" target="_self">SEqualReducer</a></td><td class="desc">A Reducer class to reduce the structural equality result of two objects </td></tr>
-<tr id="row_1_96_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SEqualReducer_1_1Handler.html" target="_self">Handler</a></td><td class="desc">Internal handler that defines custom behaviors. </td></tr>
-<tr id="row_1_97_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_1_97_" class="arrow" onclick="toggleFolder('1_97_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SHashReducer.html" target="_self">SHashReducer</a></td><td class="desc">A Reducer class to reduce the structural hash value </td></tr>
-<tr id="row_1_97_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SHashReducer_1_1Handler.html" target="_self">Handler</a></td><td class="desc">Internal handler that defines custom behaviors </td></tr>
-<tr id="row_1_98_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SourceName.html" target="_self">SourceName</a></td><td class="desc">The source name of a file span </td></tr>
-<tr id="row_1_99_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SourceNameNode.html" target="_self">SourceNameNode</a></td><td class="desc">The name of a source fragment </td></tr>
-<tr id="row_1_100_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Span.html" target="_self">Span</a></td><td class="desc"></td></tr>
-<tr id="row_1_101_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SpanNode.html" target="_self">SpanNode</a></td><td class="desc">Stores locations in frontend source that generated a node </td></tr>
-<tr id="row_1_102_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1StructuralEqual.html" target="_self">StructuralEqual</a></td><td class="desc">Content-aware structural equality comparator for objects </td></tr>
-<tr id="row_1_103_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1StructuralHash.html" target="_self">StructuralHash</a></td><td class="desc">Content-aware structural hasing </td></tr>
-<tr id="row_1_104_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Target.html" target="_self">Target</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetNode.html" title="Compilation target. ">TargetNode</a> </td></tr>
-<tr id="row_1_105_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKind.html" target="_self">TargetKind</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetKindNode.html" title="Target kind, specifies the kind of the target. ">TargetKindNode</a> </td></tr>
-<tr id="row_1_106_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindAttrMap.html" target="_self">TargetKindAttrMap</a></td><td class="desc">Map&lt;TargetKind, ValueType&gt; used to store meta-information about <a class="el" href="classtvm_1_1TargetKind.html" title="Managed reference class to TargetKindNode. ">TargetKind</a> </td></tr>
-<tr id="row_1_107_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindNode.html" target="_self">TargetKindNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Target.html" title="Managed reference class to TargetNode. ">Target</a> kind, specifies the kind of the target </td></tr>
-<tr id="row_1_108_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindRegEntry.html" target="_self">TargetKindRegEntry</a></td><td class="desc">Helper structure to register <a class="el" href="classtvm_1_1TargetKind.html" title="Managed reference class to TargetKindNode. ">TargetKind</a> </td></tr>
-<tr id="row_1_109_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetNode.html" target="_self">TargetNode</a></td><td class="desc">Compilation target </td></tr>
-<tr id="row_1_110_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTag.html" target="_self">TargetTag</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetTagNode.html" title="A target tag. ">TargetTagNode</a> </td></tr>
-<tr id="row_1_111_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTagNode.html" target="_self">TargetTagNode</a></td><td class="desc">A target tag </td></tr>
-<tr id="row_1_112_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTagRegEntry.html" target="_self">TargetTagRegEntry</a></td><td class="desc"></td></tr>
-<tr id="row_1_113_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorAffineType.html" target="_self">TensorAffineType</a></td><td class="desc">Managed reference to AffineTypes </td></tr>
-<tr id="row_1_114_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorAffineTypeNode.html" target="_self">TensorAffineTypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TensorAffineType.html" title="Managed reference to AffineTypes. ">TensorAffineType</a> representation </td></tr>
-<tr id="row_1_115_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorType.html" target="_self">TensorType</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TensorTypeNode.html" title="This is the most commonly used type in relay. TensorType have a fixed dimension, data type...">TensorTypeNode</a> </td></tr>
-<tr id="row_1_116_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorTypeNode.html" target="_self">TensorTypeNode</a></td><td class="desc">This is the most commonly used type in relay. <a class="el" href="classtvm_1_1TensorType.html" title="Managed reference to TensorTypeNode. ">TensorType</a> have a fixed dimension, data type </td></tr>
-<tr id="row_1_117_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleAffineType.html" target="_self">TupleAffineType</a></td><td class="desc">Managed reference to TupleAffineTypes </td></tr>
-<tr id="row_1_118_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleAffineTypeNode.html" target="_self">TupleAffineTypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TupleAffineType.html" title="Managed reference to TupleAffineTypes. ">TupleAffineType</a> representation </td></tr>
-<tr id="row_1_119_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleType.html" target="_self">TupleType</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TupleTypeNode.html" title="The type of tuple values. ">TupleTypeNode</a> </td></tr>
-<tr id="row_1_120_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleTypeNode.html" target="_self">TupleTypeNode</a></td><td class="desc">The type of tuple values </td></tr>
-<tr id="row_1_121_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Type.html" target="_self">Type</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeNode.html" title="Type is the base type of all types. ">TypeNode</a> </td></tr>
-<tr id="row_1_122_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeCall.html" target="_self">TypeCall</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeCallNode.html" title="Type function application. ">TypeCallNode</a> </td></tr>
-<tr id="row_1_123_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeCallNode.html" target="_self">TypeCallNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> function application </td></tr>
-<tr id="row_1_124_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeConstraint.html" target="_self">TypeConstraint</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeConstraintNode.html" title="Potential Constraints in a function. ">TypeConstraintNode</a> </td></tr>
-<tr id="row_1_125_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeConstraintNode.html" target="_self">TypeConstraintNode</a></td><td class="desc">Potential Constraints in a function </td></tr>
-<tr id="row_1_126_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeData.html" target="_self">TypeData</a></td><td class="desc">Stores all data for an Algebraic Data <a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> (ADT) </td></tr>
-<tr id="row_1_127_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeDataNode.html" target="_self">TypeDataNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TypeData.html" title="Stores all data for an Algebraic Data Type (ADT). ">TypeData</a> container node </td></tr>
-<tr id="row_1_128_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypedEnvFunc.html" target="_self">TypedEnvFunc</a></td><td class="desc">Please refer to <a class="el" href="classtvm_1_1TypedEnvFunc_3_01R_07Args_8_8_8_08_4.html#TypedEnvFuncAnchor">TypedEnvFunc&lt;R(Args..)&gt;</a> </td></tr>
-<tr id="row_1_129_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypedEnvFunc_3_01R_07Args_8_8_8_08_4.html" target="_self">TypedEnvFunc&lt; R(Args...)&gt;</a></td><td class="desc">A typed version of <a class="el" href="classtvm_1_1EnvFunc.html" title="Managed reference to EnvFuncNode. ">EnvFunc</a>. It is backed by a GlobalFuncNode inte [...]
-<tr id="row_1_130_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeFunctor.html" target="_self">TypeFunctor</a></td><td class="desc"></td></tr>
-<tr id="row_1_131_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html" target="_self">TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a></td><td class="desc"></td></tr>
-<tr id="row_1_132_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeMutator.html" target="_self">TypeMutator</a></td><td class="desc"><a class="el" href="classtvm_1_1TypeMutator.html" title="TypeMutator that mutates expressions. ">TypeMutator</a> that mutates expressions </td></tr>
-<tr id="row_1_133_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeNode.html" target="_self">TypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> is the base type of all types </td></tr>
-<tr id="row_1_134_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeRelation.html" target="_self">TypeRelation</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeRelationNode.html" title="User defined type relation, it is an input-output relation on types. ">TypeRelationNode</a> </td></tr>
-<tr id="row_1_135_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeRelationNode.html" target="_self">TypeRelationNode</a></td><td class="desc">User defined type relation, it is an input-output relation on types </td></tr>
-<tr id="row_1_136_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeReporter.html" target="_self">TypeReporter</a></td><td class="desc">Container class of <a class="el" href="classtvm_1_1TypeReporter.html" title="Container class of TypeReporter. ">TypeReporter</a> </td></tr>
-<tr id="row_1_137_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeReporterNode.html" target="_self">TypeReporterNode</a></td><td class="desc">Reporter that reports back to the type resolution information </td></tr>
-<tr id="row_1_138_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVar.html" target="_self">TypeVar</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeVarNode.html" title="Type parameter in functions. ">TypeVarNode</a> </td></tr>
-<tr id="row_1_139_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVarNode.html" target="_self">TypeVarNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> parameter in functions </td></tr>
-<tr id="row_1_140_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVisitor.html" target="_self">TypeVisitor</a></td><td class="desc">A type visitor that recursively visit types </td></tr>
-<tr id="row_1_141_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1VirtualDevice.html" target="_self">VirtualDevice</a></td><td class="desc">Managed reference class to <code><a class="el" href="classtvm_1_1VirtualDeviceNode.html" title="Describes at compile time the constraints on where data is to be stored at runtime down to the (virtu.. [...]
-<tr id="row_1_142_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1VirtualDeviceCache.html" target="_self">VirtualDeviceCache</a></td><td class="desc">A cache of <code>VirtualDevices</code>. This can be used: </td></tr>
-<tr id="row_1_143_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1VirtualDeviceNode.html" target="_self">VirtualDeviceNode</a></td><td class="desc">Describes at compile time the constraints on where data is to be stored at runtime down to the (virtual) device and memory scope level, and how to compile code to compute that data. Used by t [...]
-<tr id="row_1_144_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1With.html" target="_self">With</a></td><td class="desc">RAII wrapper function to enter and exit a context object similar to python's with syntax </td></tr>
-<tr id="row_1_145_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1WorkspaceMemoryPools.html" target="_self">WorkspaceMemoryPools</a></td><td class="desc"></td></tr>
-<tr id="row_1_146_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1WorkspaceMemoryPoolsNode.html" target="_self">WorkspaceMemoryPoolsNode</a></td><td class="desc"></td></tr>
+<tr id="row_1_69_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1MemoryInfo.html" target="_self">MemoryInfo</a></td><td class="desc">Defines memory info </td></tr>
+<tr id="row_1_70_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1MemoryInfoNode.html" target="_self">MemoryInfoNode</a></td><td class="desc">Memory information of special memory region. Use <a class="el" href="classtvm_1_1MemoryInfo.html" title="Defines memory info. ">MemoryInfo</a> as its container type </td></tr>
+<tr id="row_1_71_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1NDArrayContainerTrait.html" target="_self">NDArrayContainerTrait</a></td><td class="desc"></td></tr>
+<tr id="row_1_72_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1NodeFunctor.html" target="_self">NodeFunctor</a></td><td class="desc">A dynamically dispatched functor on the type of the first argument </td></tr>
+<tr id="row_1_73_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1NodeFunctor_3_01R_07const_01ObjectRef_01_6n_00_01Args_8_8_8_08_4.html" target="_self">NodeFunctor&lt; R(const ObjectRef &amp;n, Args...)&gt;</a></td><td class="desc"></td></tr>
+<tr id="row_1_74_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Op.html" target="_self">Op</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1OpNode.html" title="Primitive Op(builtin intrinsics) ">OpNode</a> </td></tr>
+<tr id="row_1_75_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1OpAttrMap.html" target="_self">OpAttrMap</a></td><td class="desc">Map&lt;Op,ValueType&gt; used to store meta-information about <a class="el" href="classtvm_1_1Op.html" title="Managed reference class to OpNode. ">Op</a> </td></tr>
+<tr id="row_1_76_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1OpNode.html" target="_self">OpNode</a></td><td class="desc">Primitive Op(builtin intrinsics) </td></tr>
+<tr id="row_1_77_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1OpRegEntry.html" target="_self">OpRegEntry</a></td><td class="desc">Helper structure to register operators </td></tr>
+<tr id="row_1_78_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PointerType.html" target="_self">PointerType</a></td><td class="desc"></td></tr>
+<tr id="row_1_79_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PointerTypeNode.html" target="_self">PointerTypeNode</a></td><td class="desc">Low-level raw pointer type </td></tr>
+<tr id="row_1_80_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PoolInfo.html" target="_self">PoolInfo</a></td><td class="desc"></td></tr>
+<tr id="row_1_81_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1PoolInfoNode.html" target="_self">PoolInfoNode</a></td><td class="desc">Describes a pool of memory accessible by one or more targets </td></tr>
+<tr id="row_1_82_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimExpr.html" target="_self">PrimExpr</a></td><td class="desc">Reference to <a class="el" href="classtvm_1_1PrimExprNode.html" title="Base node of all primitive expressions. ">PrimExprNode</a> </td></tr>
+<tr id="row_1_83_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimExprNode.html" target="_self">PrimExprNode</a></td><td class="desc">Base node of all primitive expressions </td></tr>
+<tr id="row_1_84_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimType.html" target="_self">PrimType</a></td><td class="desc"></td></tr>
+<tr id="row_1_85_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimTypeNode.html" target="_self">PrimTypeNode</a></td><td class="desc">Primitive data types used in the low-level IR </td></tr>
+<tr id="row_1_86_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Range.html" target="_self">Range</a></td><td class="desc"><a class="el" href="classtvm_1_1Range.html" title="Range constainer. ">Range</a> constainer </td></tr>
+<tr id="row_1_87_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RangeNode.html" target="_self">RangeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Range.html" title="Range constainer. ">Range</a> over one dimension </td></tr>
+<tr id="row_1_88_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_1_88_" class="arrow" onclick="toggleFolder('1_88_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1ReflectionVTable.html" target="_self">ReflectionVTable</a></td><td class="desc">Virtual function table to support IR/AST node reflection </td></tr>
+<tr id="row_1_88_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1ReflectionVTable_1_1Registry.html" target="_self">Registry</a></td><td class="desc"><a class="el" href="classtvm_1_1ReflectionVTable_1_1Registry.html" title="Registry of a reflection table. ">Registry</a> of a reflection table </td></tr>
+<tr id="row_1_89_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayExpr.html" target="_self">RelayExpr</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1RelayExprNode.html" title="Base node of all non-primitive expressions. ">RelayExprNode</a> </td></tr>
+<tr id="row_1_90_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayExprNode.html" target="_self">RelayExprNode</a></td><td class="desc">Base node of all non-primitive expressions </td></tr>
+<tr id="row_1_91_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayRefType.html" target="_self">RelayRefType</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1RelayRefTypeNode.html" title="Reference Type High-level Relay IR. ">RelayRefTypeNode</a> </td></tr>
+<tr id="row_1_92_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayRefTypeNode.html" target="_self">RelayRefTypeNode</a></td><td class="desc">Reference <a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> High-level Relay IR </td></tr>
+<tr id="row_1_93_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1ReprPrinter.html" target="_self">ReprPrinter</a></td><td class="desc">A printer class to print the AST/IR nodes </td></tr>
+<tr id="row_1_94_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_1_94_" class="arrow" onclick="toggleFolder('1_94_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SEqualReducer.html" target="_self">SEqualReducer</a></td><td class="desc">A Reducer class to reduce the structural equality result of two objects </td></tr>
+<tr id="row_1_94_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SEqualReducer_1_1Handler.html" target="_self">Handler</a></td><td class="desc">Internal handler that defines custom behaviors. </td></tr>
+<tr id="row_1_95_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_1_95_" class="arrow" onclick="toggleFolder('1_95_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SHashReducer.html" target="_self">SHashReducer</a></td><td class="desc">A Reducer class to reduce the structural hash value </td></tr>
+<tr id="row_1_95_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SHashReducer_1_1Handler.html" target="_self">Handler</a></td><td class="desc">Internal handler that defines custom behaviors </td></tr>
+<tr id="row_1_96_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SourceName.html" target="_self">SourceName</a></td><td class="desc">The source name of a file span </td></tr>
+<tr id="row_1_97_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SourceNameNode.html" target="_self">SourceNameNode</a></td><td class="desc">The name of a source fragment </td></tr>
+<tr id="row_1_98_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Span.html" target="_self">Span</a></td><td class="desc"></td></tr>
+<tr id="row_1_99_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SpanNode.html" target="_self">SpanNode</a></td><td class="desc">Stores locations in frontend source that generated a node </td></tr>
+<tr id="row_1_100_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1StructuralEqual.html" target="_self">StructuralEqual</a></td><td class="desc">Content-aware structural equality comparator for objects </td></tr>
+<tr id="row_1_101_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1StructuralHash.html" target="_self">StructuralHash</a></td><td class="desc">Content-aware structural hasing </td></tr>
+<tr id="row_1_102_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Target.html" target="_self">Target</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetNode.html" title="Compilation target. ">TargetNode</a> </td></tr>
+<tr id="row_1_103_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKind.html" target="_self">TargetKind</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetKindNode.html" title="Target kind, specifies the kind of the target. ">TargetKindNode</a> </td></tr>
+<tr id="row_1_104_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindAttrMap.html" target="_self">TargetKindAttrMap</a></td><td class="desc">Map&lt;TargetKind, ValueType&gt; used to store meta-information about <a class="el" href="classtvm_1_1TargetKind.html" title="Managed reference class to TargetKindNode. ">TargetKind</a> </td></tr>
+<tr id="row_1_105_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindNode.html" target="_self">TargetKindNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Target.html" title="Managed reference class to TargetNode. ">Target</a> kind, specifies the kind of the target </td></tr>
+<tr id="row_1_106_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindRegEntry.html" target="_self">TargetKindRegEntry</a></td><td class="desc">Helper structure to register <a class="el" href="classtvm_1_1TargetKind.html" title="Managed reference class to TargetKindNode. ">TargetKind</a> </td></tr>
+<tr id="row_1_107_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetNode.html" target="_self">TargetNode</a></td><td class="desc">Compilation target </td></tr>
+<tr id="row_1_108_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTag.html" target="_self">TargetTag</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetTagNode.html" title="A target tag. ">TargetTagNode</a> </td></tr>
+<tr id="row_1_109_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTagNode.html" target="_self">TargetTagNode</a></td><td class="desc">A target tag </td></tr>
+<tr id="row_1_110_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTagRegEntry.html" target="_self">TargetTagRegEntry</a></td><td class="desc"></td></tr>
+<tr id="row_1_111_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorAffineType.html" target="_self">TensorAffineType</a></td><td class="desc">Managed reference to AffineTypes </td></tr>
+<tr id="row_1_112_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorAffineTypeNode.html" target="_self">TensorAffineTypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TensorAffineType.html" title="Managed reference to AffineTypes. ">TensorAffineType</a> representation </td></tr>
+<tr id="row_1_113_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorType.html" target="_self">TensorType</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TensorTypeNode.html" title="This is the most commonly used type in relay. TensorType have a fixed dimension, data type...">TensorTypeNode</a> </td></tr>
+<tr id="row_1_114_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorTypeNode.html" target="_self">TensorTypeNode</a></td><td class="desc">This is the most commonly used type in relay. <a class="el" href="classtvm_1_1TensorType.html" title="Managed reference to TensorTypeNode. ">TensorType</a> have a fixed dimension, data type </td></tr>
+<tr id="row_1_115_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleAffineType.html" target="_self">TupleAffineType</a></td><td class="desc">Managed reference to TupleAffineTypes </td></tr>
+<tr id="row_1_116_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleAffineTypeNode.html" target="_self">TupleAffineTypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TupleAffineType.html" title="Managed reference to TupleAffineTypes. ">TupleAffineType</a> representation </td></tr>
+<tr id="row_1_117_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleType.html" target="_self">TupleType</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TupleTypeNode.html" title="The type of tuple values. ">TupleTypeNode</a> </td></tr>
+<tr id="row_1_118_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleTypeNode.html" target="_self">TupleTypeNode</a></td><td class="desc">The type of tuple values </td></tr>
+<tr id="row_1_119_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Type.html" target="_self">Type</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeNode.html" title="Type is the base type of all types. ">TypeNode</a> </td></tr>
+<tr id="row_1_120_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeCall.html" target="_self">TypeCall</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeCallNode.html" title="Type function application. ">TypeCallNode</a> </td></tr>
+<tr id="row_1_121_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeCallNode.html" target="_self">TypeCallNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> function application </td></tr>
+<tr id="row_1_122_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeConstraint.html" target="_self">TypeConstraint</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeConstraintNode.html" title="Potential Constraints in a function. ">TypeConstraintNode</a> </td></tr>
+<tr id="row_1_123_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeConstraintNode.html" target="_self">TypeConstraintNode</a></td><td class="desc">Potential Constraints in a function </td></tr>
+<tr id="row_1_124_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeData.html" target="_self">TypeData</a></td><td class="desc">Stores all data for an Algebraic Data <a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> (ADT) </td></tr>
+<tr id="row_1_125_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeDataNode.html" target="_self">TypeDataNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TypeData.html" title="Stores all data for an Algebraic Data Type (ADT). ">TypeData</a> container node </td></tr>
+<tr id="row_1_126_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypedEnvFunc.html" target="_self">TypedEnvFunc</a></td><td class="desc">Please refer to <a class="el" href="classtvm_1_1TypedEnvFunc_3_01R_07Args_8_8_8_08_4.html#TypedEnvFuncAnchor">TypedEnvFunc&lt;R(Args..)&gt;</a> </td></tr>
+<tr id="row_1_127_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypedEnvFunc_3_01R_07Args_8_8_8_08_4.html" target="_self">TypedEnvFunc&lt; R(Args...)&gt;</a></td><td class="desc">A typed version of <a class="el" href="classtvm_1_1EnvFunc.html" title="Managed reference to EnvFuncNode. ">EnvFunc</a>. It is backed by a GlobalFuncNode inte [...]
+<tr id="row_1_128_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeFunctor.html" target="_self">TypeFunctor</a></td><td class="desc"></td></tr>
+<tr id="row_1_129_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html" target="_self">TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a></td><td class="desc"></td></tr>
+<tr id="row_1_130_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeMutator.html" target="_self">TypeMutator</a></td><td class="desc"><a class="el" href="classtvm_1_1TypeMutator.html" title="TypeMutator that mutates expressions. ">TypeMutator</a> that mutates expressions </td></tr>
+<tr id="row_1_131_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeNode.html" target="_self">TypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> is the base type of all types </td></tr>
+<tr id="row_1_132_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeRelation.html" target="_self">TypeRelation</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeRelationNode.html" title="User defined type relation, it is an input-output relation on types. ">TypeRelationNode</a> </td></tr>
+<tr id="row_1_133_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeRelationNode.html" target="_self">TypeRelationNode</a></td><td class="desc">User defined type relation, it is an input-output relation on types </td></tr>
+<tr id="row_1_134_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeReporter.html" target="_self">TypeReporter</a></td><td class="desc">Container class of <a class="el" href="classtvm_1_1TypeReporter.html" title="Container class of TypeReporter. ">TypeReporter</a> </td></tr>
+<tr id="row_1_135_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeReporterNode.html" target="_self">TypeReporterNode</a></td><td class="desc">Reporter that reports back to the type resolution information </td></tr>
+<tr id="row_1_136_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVar.html" target="_self">TypeVar</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1TypeVarNode.html" title="Type parameter in functions. ">TypeVarNode</a> </td></tr>
+<tr id="row_1_137_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVarNode.html" target="_self">TypeVarNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> parameter in functions </td></tr>
+<tr id="row_1_138_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVisitor.html" target="_self">TypeVisitor</a></td><td class="desc">A type visitor that recursively visit types </td></tr>
+<tr id="row_1_139_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1VirtualDevice.html" target="_self">VirtualDevice</a></td><td class="desc">Managed reference class to <code><a class="el" href="classtvm_1_1VirtualDeviceNode.html" title="Describes at compile time the constraints on where data is to be stored at runtime down to the (virtu.. [...]
+<tr id="row_1_140_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1VirtualDeviceCache.html" target="_self">VirtualDeviceCache</a></td><td class="desc">A cache of <code>VirtualDevices</code>. This can be used: </td></tr>
+<tr id="row_1_141_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1VirtualDeviceNode.html" target="_self">VirtualDeviceNode</a></td><td class="desc">Describes at compile time the constraints on where data is to be stored at runtime down to the (virtual) device and memory scope level, and how to compile code to compute that data. Used by t [...]
+<tr id="row_1_142_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1With.html" target="_self">With</a></td><td class="desc">RAII wrapper function to enter and exit a context object similar to python's with syntax </td></tr>
+<tr id="row_1_143_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1WorkspaceMemoryPools.html" target="_self">WorkspaceMemoryPools</a></td><td class="desc"></td></tr>
+<tr id="row_1_144_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1WorkspaceMemoryPoolsNode.html" target="_self">WorkspaceMemoryPoolsNode</a></td><td class="desc"></td></tr>
 <tr id="row_2_" class="even"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structMemoryManagerInterface.html" target="_self">MemoryManagerInterface</a></td><td class="desc"></td></tr>
 <tr id="row_3_"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm__workspace__t.html" target="_self">tvm_workspace_t</a></td><td class="desc"></td></tr>
 <tr id="row_4_" class="even"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structTVMArgs.html" target="_self">TVMArgs</a></td><td class="desc"></td></tr>
diff --git a/docs/reference/api/doxygen/apply__history__best_8h_source.html b/docs/reference/api/doxygen/apply__history__best_8h_source.html
index a10565833..5df9e49de 100644
--- a/docs/reference/api/doxygen/apply__history__best_8h_source.html
+++ b/docs/reference/api/doxygen/apply__history__best_8h_source.html
@@ -79,7 +79,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1Target_html"><div class="ttname"><a href="classtvm_1_1Target.html">tvm::Target</a></div><div class="ttdoc">Managed reference class to TargetNode. </div><div class="ttdef"><b>Definition:</b> target.h:141</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1ObjectRef_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></div><div class="ttdoc">Base class of all object reference. </div><div class="ttdef"><b>Definition:</b> object.h:511</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode_html_a755012568d85aa7cba250c5f8be766cc"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html#a755012568d85aa7cba250c5f8be766cc">tvm::meta_schedule::ApplyHistoryBestNode::_type_key</a></div><div class="ttdeci">static constexpr const char * _type_key</div><div class="ttdef"><b>Definition:</b> apply_history_best.h:48</div></div>
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="target_8h_html"><div class="ttname"><a href="target_8h.html">target.h</a></div><div class="ttdoc">Compilation target object. </div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html">tvm::meta_schedule::ApplyHistoryBestNode</a></div><div class="ttdoc">An integration context that allows application of historically best records from a database...</div><div class="ttdef"><b>Definition:</b> apply_history_best.h:32</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Optional_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Optional.html">tvm::runtime::Optional</a></div><div class="ttdoc">Optional container that to represent to a Nullable variant of T. </div><div class="ttdef"><b>Definition:</b> optional.h:51</div></div>
diff --git a/docs/reference/api/doxygen/builder_8h_source.html b/docs/reference/api/doxygen/builder_8h_source.html
index a07d195d0..7300027ca 100644
--- a/docs/reference/api/doxygen/builder_8h_source.html
+++ b/docs/reference/api/doxygen/builder_8h_source.html
@@ -89,7 +89,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1BuilderInput_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1BuilderInput.html">tvm::meta_schedule::BuilderInput</a></div><div class="ttdoc">Managed reference to BuilderInputNode. </div><div class="ttdef"><b>Definition:</b> builder.h:52</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1ObjectRef_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></div><div class="ttdoc">Base class of all object reference. </div><div class="ttdef"><b>Definition:</b> object.h:511</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyBuilderNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyBuilderNode.html">tvm::meta_schedule::PyBuilderNode</a></div><div class="ttdoc">An abstract builder with customized build method on the python-side. </div><div class="ttdef"><b>Definition:</b> builder.h:135</div></div>
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1BuilderResultNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1BuilderResultNode.html">tvm::meta_schedule::BuilderResultNode</a></div><div class="ttdoc">The builder&amp;#39;s output, containing the artifact path or error message if any. </div><div class="ttdef"><b>Definition:</b> builder.h:66</div></div>
 <div class="ttc" id="namespacetvm_1_1codegen_html_a0d6322c2dda54a66a3b82022f5f3632c"><div class="ttname"><a href="namespacetvm_1_1codegen.html#a0d6322c2dda54a66a3b82022f5f3632c">tvm::codegen::Build</a></div><div class="ttdeci">runtime::Module Build(IRModule mod, Target target)</div><div class="ttdoc">Build a module from array of lowered function. </div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1BuilderInputNode_html_af640877ef243c29d4845977c62f1e12d"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html#af640877ef243c29d4845977c62f1e12d">tvm::meta_schedule::BuilderInputNode::VisitAttrs</a></div><div class="ttdeci">void VisitAttrs(tvm::AttrVisitor *v)</div><div class="ttdef"><b>Definition:</b> builder.h:38</div></div>
diff --git a/docs/reference/api/doxygen/classes.html b/docs/reference/api/doxygen/classes.html
index 94f199385..51e8c626d 100644
--- a/docs/reference/api/doxygen/classes.html
+++ b/docs/reference/api/doxygen/classes.html
@@ -65,227 +65,226 @@ $(function() {
 <div class="qindex"><a class="qindex" href="#letter_a">a</a>&#160;|&#160;<a class="qindex" href="#letter_b">b</a>&#160;|&#160;<a class="qindex" href="#letter_c">c</a>&#160;|&#160;<a class="qindex" href="#letter_d">d</a>&#160;|&#160;<a class="qindex" href="#letter_e">e</a>&#160;|&#160;<a class="qindex" href="#letter_f">f</a>&#160;|&#160;<a class="qindex" href="#letter_g">g</a>&#160;|&#160;<a class="qindex" href="#letter_h">h</a>&#160;|&#160;<a class="qindex" href="#letter_i">i</a>&#160;|& [...]
 <table class="classindex">
 <tr><td rowspan="2" valign="bottom"><a name="letter_a"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;a&#160;&#160;</div></td></tr></table>
-</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv3DWinogradAttrs.html">Conv3DWinogradAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1IterAdapter.html">IterAdapter</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1PragmaStepNode.html">PragmaStepNode</a [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConvGemmWeightTransformAttrs.html">ConvGemmWeightTransformAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1MapNode_1_1iterator.html">MapNode::iterator</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Prefetch.html">Pref [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AccessAnalyzer.html">AccessAnalyzer</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConvWinogradWeightTransformAttrs.html">ConvWinogradWeightTransformAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1run [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AccessAnalyzerNode.html">AccessAnalyzerNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CorrelationAttrs.html">CorrelationAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1Iterator [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AdaptivePool1DAttrs.html">AdaptivePool1DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1CostModel.html">CostModel</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1support_1_1Span_1_1iterator__base.html">Sp [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AdaptivePool2DAttrs.html">AdaptivePool2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CostModel.html">CostModel</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1IteratorNode.html">I [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AdaptivePool3DAttrs.html">AdaptivePool3DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1CostModelNode.html">CostModelNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1auto__scheduler_1_1AttachMapNode_ [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Add.html">Add</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CostModelNode.html">CostModelNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMapExpr.html">IterMapExpr</a> (<a class="el" href="namesp [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AddNode.html">AddNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1CountNode.html">CountNode</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMapExprNode.html">IterMapExprNode</a> (<a c [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ADT.html">ADT</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CropAndResizeAttrs.html">CropAndResizeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMark.html">IterMark</a> (<a class="el" href="namespacetvm_1_1ar [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ADTObj.html">ADTObj</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_d"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;d&#160;&#160;</div></td></tr></table>
-</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMarkNode.html">IterMarkNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PrimType.html">PrimType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1Step.html">Step</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_s [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AffineGridAttrs.html">AffineGridAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterSplitExpr.html">IterSplitExpr</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PrimTypeNode.html">PrimTypeNode</a> (<a class="el" href="namespacetv [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1AffineType.html">AffineType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1Database.html">Database</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterSplitExprNode.html">IterSplitExprNode</a> (<a class="el" href="namespacetvm [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1AffineTypeNode.html">AffineTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1DatabaseNode.html">DatabaseNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterSumExpr.html">IterSumExpr</a> (<a class="el" href="namespac [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AllClassNonMaximumSuppressionAttrs.html">AllClassNonMaximumSuppressionAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1DataProducer.html">DataProducer</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterSumExprNode.html">IterSum [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Allocate.html">Allocate</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1DataProducerNode.html">DataProducerNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IterVar.html">IterVar</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir< [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AllocateConst.html">AllocateConst</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1DataType.html">DataType</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarAttr.html">IterVarAttr</a> (<a class="el" href="namespacetvm_1_1te.htm [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AllocateConstNode.html">AllocateConstNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DataTypePattern.html">DataTypePattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarAttrNode.html">IterVarAttrNode</a> (<a class="el" href [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1AllocatedPoolInfo.html">AllocatedPoolInfo</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DataTypePatternNode.html">DataTypePatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IterVarNode.html">IterVarNode< [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1tir_1_1usmp_1_1AllocatedPoolInfoNode.html">AllocatedPoolInfoNode</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DebugAttrs.html">DebugAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarRelation.html">IterVarRelation</ [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AllocateNode.html">AllocateNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DeformableConv2DAttrs.html">DeformableConv2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarRelationNode.html">IterVarRelationNode</a> (<a clas [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1vm_1_1Allocator.html">Allocator</a> (<a class="el" href="namespacetvm_1_1runtime_1_1vm.html">tvm::runtime::vm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DenseAttrs.html">DenseAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_l"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td> [...]
-</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ProgramMeasurer.html">ProgramMeasurer</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html">StmtVisitor</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td></tr>
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AllocStorageAttrs.html">AllocStorageAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1DenseMapNode.html">DenseMapNode</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ProgramMeasurerNode.html">ProgramMeasurer [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AllocTensorAttrs.html">AllocTensorAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DensePackAttrs.html">DensePackAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1L2NormalizeAttrs.html">L2NormalizeAttrs</a> (<a class [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1AltPattern.html">AltPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Dependency.html">Dependency</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1LayerNormAttrs.html">LayerNormAttrs</a> (<a class="el" href="namespacetvm_1_1rela [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1AltPatternNode.html">AltPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1DependencyNode.html">DependencyNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Layout.html">Layout</a> (<a class="el" href="namespacetvm_1_1tir.htm [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1Analyzer.html">Analyzer</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1qnn_1_1DequantizeAttrs.html">DequantizeAttrs</a> (<a class="el" href="namespacetvm_1_1relay_1_1qnn.html">tvm::relay::qnn</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LayoutAxis.html">LayoutAxis</a> (<a class="el" href= [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1And.html">And</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1DeviceAPI.html">DeviceAPI</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LayoutNode.html">LayoutNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&# [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AndNode.html">AndNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DeviceCopyAttrs.html">DeviceCopyAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1LayoutTransformAttrs.html">LayoutTransformAttrs</a> (<a class="el" href="nam [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AnnotationStep.html">AnnotationStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1DeviceWrapper.html">DeviceWrapper</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_ [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AnnotationStepNode.html">AnnotationStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1profiling_1_1DeviceWrapperNode.html">DeviceWrapperNode</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el"  [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Any.html">Any</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPattern.html">DFPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1tir_1_1LENode.html">LENode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#1 [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AnyNode.html">AnyNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternCallback.html">DFPatternCallback</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Let.html">Let</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::re [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBest.html">ApplyHistoryBest</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternCallbackNode.html">DFPatternCallbackNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Let.html">Let</a>  [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html">ApplyHistoryBestNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor.html">DFPatternFunctor</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1LetNode.html">LetN [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ArangeAttrs.html">ArangeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor_3_01R_07const_01DFPattern_01_6n_00_01Args_8_8_8_08_4.html">DFPatternFunctor&lt; R(const DFPattern &amp;n, Args...)&gt;</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a cla [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfo.html">ArgInfo</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternNode.html">DFPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1LetPattern.html">LetPattern</a> (<a class="el" hre [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfoNode.html">ArgInfoNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternVisitor.html">DFPatternVisitor</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1LetPatternNode.html">LetPatternNode< [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ArgReduceAttrs.html">ArgReduceAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1Diagnostic.html">Diagnostic</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LetStmt.html">LetStmt</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&# [...]
+</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv3DTransposeAttrs.html">Conv3DTransposeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1is__specialized.html">is_specialized</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1PragmaStep.html">PragmaStep</a> [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv3DWinogradAttrs.html">Conv3DWinogradAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1is__specialized_3_01Container_3_01Args_8_8_8_01_4_00_01Container_01_4.html">is_specialized&lt; Container&lt; Args... &gt;, Container &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td>< [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AccessAnalyzer.html">AccessAnalyzer</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConvGemmWeightTransformAttrs.html">ConvGemmWeightTransformAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AccessAnalyzerNode.html">AccessAnalyzerNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConvWinogradWeightTransformAttrs.html">ConvWinogradWeightTransformAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtv [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AdaptivePool1DAttrs.html">AdaptivePool1DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CorrelationAttrs.html">CorrelationAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Map_1_1iterator.html">Map::iterator</a> (< [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AdaptivePool2DAttrs.html">AdaptivePool2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1CostModel.html">CostModel</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1Iterator.html">Iterator [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AdaptivePool3DAttrs.html">AdaptivePool3DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CostModel.html">CostModel</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1support_1_1Span_1_1iterator__base.html" [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Add.html">Add</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1CostModelNode.html">CostModelNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1IteratorNode.html">IteratorNode</a> (<a class="el" hre [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AddNode.html">AddNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CostModelNode.html">CostModelNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1auto__scheduler_1_1AttachMapNode_1_1IterKeyHash.html">AttachM [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ADT.html">ADT</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1CountNode.html">CountNode</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMapExpr.html">IterMapExpr</a> (<a class [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ADTObj.html">ADTObj</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CropAndResizeAttrs.html">CropAndResizeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMapExprNode.html">IterMapExprNode</a> (<a class="el" href [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AffineGridAttrs.html">AffineGridAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_d"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;d&#160;&#160;</div></td></tr></table>
+</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMark.html">IterMark</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PrimType.html">PrimType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1StateNode.html">StateNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1AffineType.html">AffineType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterMarkNode.html">IterMarkNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PrimTypeNode.html">PrimTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1AffineTypeNode.html">AffineTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1Database.html">Database</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterSplitExpr.html">IterSplitExpr</a> (<a class="el" href="namespacetvm [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AllClassNonMaximumSuppressionAttrs.html">AllClassNonMaximumSuppressionAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1DatabaseNode.html">DatabaseNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_ [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Allocate.html">Allocate</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1DataProducer.html">DataProducer</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterSumExpr.html">IterSumExpr</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm:: [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AllocateConst.html">AllocateConst</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1DataProducerNode.html">DataProducerNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IterSumExprNode.html">IterSumExprNode</a> (<a class="el" href="namespa [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AllocateConstNode.html">AllocateConstNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1DataType.html">DataType</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IterVar.html">IterVar</a> (<a class="el" href="namespacetvm_1_1tir.h [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1AllocatedPoolInfo.html">AllocatedPoolInfo</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DataTypePattern.html">DataTypePattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarAttr.html">IterVarAttr</a> (<a c [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1tir_1_1usmp_1_1AllocatedPoolInfoNode.html">AllocatedPoolInfoNode</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DataTypePatternNode.html">DataTypePatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarAttrNode.html"> [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AllocateNode.html">AllocateNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DebugAttrs.html">DebugAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IterVarNode.html">IterVarNode</a> (<a class="el" href="namespacetvm_1_1tir.html [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1vm_1_1Allocator.html">Allocator</a> (<a class="el" href="namespacetvm_1_1runtime_1_1vm.html">tvm::runtime::vm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DeformableConv2DAttrs.html">DeformableConv2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarRelation.html">IterVarRelatio [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AllocStorageAttrs.html">AllocStorageAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DenseAttrs.html">DenseAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1IterVarRelationNode.html">IterVarRelationNode</a> (<a class="el [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1AllocTensorAttrs.html">AllocTensorAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1DenseMapNode.html">DenseMapNode</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_l"></a><table border="0" cellspacing="0" cellpadding="0"><tr><t [...]
+</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ProgramMeasurerNode.html">ProgramMeasurerNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html">StmtVisitor</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td></tr>
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1AltPattern.html">AltPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DensePackAttrs.html">DensePackAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ProgramRunner.html">ProgramRunner</a> (<a class="el" href [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1AltPatternNode.html">AltPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Dependency.html">Dependency</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1L2NormalizeAttrs.html">L2NormalizeAttrs</a> (<a class="el" href="namespac [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1Analyzer.html">Analyzer</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1DependencyNode.html">DependencyNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1LayerNormAttrs.html">LayerNormAttrs</a> (<a class="el" href="namespacetvm_1_1 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1And.html">And</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1qnn_1_1DequantizeAttrs.html">DequantizeAttrs</a> (<a class="el" href="namespacetvm_1_1relay_1_1qnn.html">tvm::relay::qnn</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Layout.html">Layout</a> (<a class="el" href="namespacetvm_1_1tir.htm [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AndNode.html">AndNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1DeviceAPI.html">DeviceAPI</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LayoutAxis.html">LayoutAxis</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::ti [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AnnotationStep.html">AnnotationStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DeviceCopyAttrs.html">DeviceCopyAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LayoutNode.html">LayoutNode</a [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AnnotationStepNode.html">AnnotationStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1DeviceWrapper.html">DeviceWrapper</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="str [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Any.html">Any</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1profiling_1_1DeviceWrapperNode.html">DeviceWrapperNode</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LE.html">LE</a> (<a class="el" href="nam [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AnyNode.html">AnyNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPattern.html">DFPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1LeakyReluAttrs.html">LeakyReluAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html"> [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBest.html">ApplyHistoryBest</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternCallback.html">DFPatternCallback</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1tir_1_1LENode.html">LENode</a> ( [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html">ApplyHistoryBestNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternCallbackNode.html">DFPatternCallbackNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Let.html [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ArangeAttrs.html">ArangeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor.html">DFPatternFunctor</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Let.html">Let</a> (<a class="el" href="namespacetvm_1_1tir.html [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfo.html">ArgInfo</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor_3_01R_07const_01DFPattern_01_6n_00_01Args_8_8_8_08_4.html">DFPatternFunctor&lt; R(const DFPattern &amp;n, Args...)&gt;</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td va [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfoNode.html">ArgInfoNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternNode.html">DFPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LetNode.html">LetNode</a> (<a class="el" hre [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ArgReduceAttrs.html">ArgReduceAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DFPatternVisitor.html">DFPatternVisitor</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1LetPattern.html">LetPattern</a> (<a class="el" href="na [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ArgsortAttrs.html">ArgsortAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1Diagnostic.html">Diagnostic</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1LetPatternNode.html">LetPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::re [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DiagnosticBuilder.html">DiagnosticBuilder</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LetStmt.html">LetStmt</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;& [...]
 </td></tr>
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ArgsortAttrs.html">ArgsortAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DiagnosticBuilder.html">DiagnosticBuilder</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LetStmtNode.html">LetStmtNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Array.html">Array</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DiagnosticContext.html">DiagnosticContext</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1support_1_1LinearCongruentialEngine.html">LinearCongruentialEngine</a> (<a class="el" href="namespac [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1ArrayAccessor.html">ArrayAccessor</a> (<a class="el" href="namespacetvm_1_1runtime_1_1metadata.html">tvm::runtime::metadata</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DiagnosticContextNode.html">DiagnosticContextNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1LinkedParam.html">LinkedParam</a> (<a clas [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1ArrayAccessor_3_01const_01char_01_5_00_01_1_1tvm_1_1runtime_1_1String_01_4.html">ArrayAccessor&lt; const char *, ::tvm::runtime::String &gt;</a> (<a class="el" href="namespacetvm_1_1runtime_1_1metadata.html">tvm::runtime::metadata</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DiagnosticNode.html">DiagnosticNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</t [...]
-</td><td valign="top"><a class="el" href="classtvm_1_1TargetKind.html">TargetKind</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td></tr>
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1SimpleObjAllocator_1_1ArrayHandler.html">SimpleObjAllocator::ArrayHandler</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DiagnosticRenderer.html">DiagnosticRenderer</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Load.html">Load</a> (<a class="el"  [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1ArrayIterator.html">ArrayIterator</a> (<a class="el" href="namespacetvm_1_1runtime_1_1metadata.html">tvm::runtime::metadata</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DiagnosticRendererNode.html">DiagnosticRendererNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LoadNode.html">LoadNode</a> (<a c [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ArrayNode.html">ArrayNode</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DictAttrs.html">DictAttrs</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1LocalBuilder.html">LocalBuilder</a> (<a class="el" href="namespacetvm_1_1auto__scheduler. [...]
-</td><td valign="top"><a class="el" href="classtvm_1_1TargetKindRegEntry.html">TargetKindRegEntry</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td></tr>
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AssertStmt.html">AssertStmt</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1DictAttrsNode.html">DictAttrsNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1LocalBuilderNode.html">LocalBuilderNode</a> (<a class="el" href="namespacetvm_1_1auto__sche [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1AssertStmtNode.html">AssertStmtNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DilateAttrs.html">DilateAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1LocalRunner.html">LocalRunner</a> (<a class="el" href="namesp [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AttachMap.html">AttachMap</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Dilation2DAttrs.html">Dilation2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1LocalRunnerNode.html">LocalR [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1AttachMapNode.html">AttachMapNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Div.html">Div</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LoopRV.html">LoopRV</a> (<a class="el" href="namespacetvm_1_1ti [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1detail_1_1AttrDocEntry.html">AttrDocEntry</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1DivNode.html">DivNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LoopRVNode.html">LoopRVNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm: [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1detail_1_1AttrDocVisitor.html">AttrDocVisitor</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DominatorPattern.html">DominatorPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1LRNAttrs.html">LRNAttrs</a> (<a class="el" href="nam [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1AttrError.html">AttrError</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1DominatorPatternNode.html">DominatorPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LT.html">LT</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&# [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1detail_1_1AttrExistVisitor.html">AttrExistVisitor</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DropoutAttrs.html">DropoutAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1LTNode.html">LTNode</a> (<a class="el" href="namespacetvm_ [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1AttrFieldInfo.html">AttrFieldInfo</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1DurationNode.html">DurationNode</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_m"></a><table border="0" cellspacing="0" cellpadding="0">< [...]
-</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1RebaseNode.html">RebaseNode</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1Tensor.html">Tensor</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td></tr>
-<tr><td valign="top"><a class="el" href="classtvm_1_1AttrFieldInfoNode.html">AttrFieldInfoNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1DynExpandDimsAttrs.html">DynExpandDimsAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1RecClosure.html">RecClosure</a> (<a class="el" href="namespacetvm_1_1r [...]
    [diff hunk: regenerated Doxygen class-index table (alphabetical letter sections linking to class/struct pages across the tvm, tvm::tir, tvm::relay, tvm::runtime, tvm::auto_scheduler, and tvm::meta_schedule namespaces); the long HTML table rows were truncated to " [...]" by the mailing-list plain-text renderer and are not reproduced here]
+<tr><td valign="top"><a class="el" href="classtvm_1_1BaseFuncNode.html">BaseFuncNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1ExternOp.html">ExternOp</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataArrayNode.html">MetadataArrayNode</a> (<a class="el" href="namespacetvm_1_1runtime_1_1me [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1BaseTensorType.html">BaseTensorType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1ExternOpNode.html">ExternOpNode</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataBase.html">MetadataBase</a> (<a class="el" href="namespacetvm_1_1runtime_1_1 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1BaseTensorTypeNode.html">BaseTensorTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ExtractedTask.html">ExtractedTask</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataBaseNode.html">MetadataBaseNode [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1BaseValueEqual.html">BaseValueEqual</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html">ExtractedTaskNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1MetricCollector.html">MetricCollector< [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1BaseValueHash.html">BaseValueHash</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncObj_1_1Extractor.html">PackedFuncObj::Extractor</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1MetricCollectorNode.html">MetricCollectorNo [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1BatchMatmulAttrs.html">BatchMatmulAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_f"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;f&#160;&#160;</div></td></tr></table>
+</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Min.html">Min</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ReshapeLikeAttrs.html">ReshapeLikeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Tuple.html">Tuple</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay< [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1BatchNormAttrs.html">BatchNormAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1MinNode.html">MinNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ReshapeTensorAttrs.html">ReshapeTensorAttrs</a> (<a class="el" href="namespace [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1BatchToSpaceNDAttrs.html">BatchToSpaceNDAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1FeatureExtractor.html">FeatureExtractor</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1MirrorPadAttrs.htm [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1BiasAddAttrs.html">BiasAddAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1FeatureExtractorNode.html">FeatureExtractorNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1MixedModeMutator.html">Mi [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BijectiveLayout.html">BijectiveLayout</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1FeatureSet.html">FeatureSet</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1MixedModeVisitor.html">MixedModeVisitor</a> (<a class="el" href="namespa [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BijectiveLayoutNode.html">BijectiveLayoutNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1FIFOBufferAttrs.html">FIFOBufferAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Mod.html">Mod</a> (<a class="el" href="namespacetvm_1_1 [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1BinaryConv2DAttrs.html">BinaryConv2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1FixedPointMultiplyAttrs.html">FixedPointMultiplyAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1ModNode.html">ModNode</a> (<a class= [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1BinaryDenseAttrs.html">BinaryDenseAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1SeqStmt_1_1Flattener.html">SeqStmt::Flattener</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ModularSet.html">ModularSet</a> (<a class="el" href [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BinaryOpNode.html">BinaryOpNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1FloatImm.html">FloatImm</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ModularSetAnalyzer.html">ModularSetAnalyzer</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1BitPackAttrs.html">BitPackAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1FloatImmNode.html">FloatImmNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ModularSetNode.html">ModularSetNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Block.html">Block</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1FloorDiv.html">FloorDiv</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Module.html">Module</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&# [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1tir_1_1BlockInfo.html">BlockInfo</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1FloorDivNode.html">FloorDivNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ModuleNode.html">ModuleNode</a> (<a class="el" href="namespacetvm_1_1runtime.html"> [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BlockNode.html">BlockNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1FloorMod.html">FloorMod</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Mul.html">Mul</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160; [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BlockRealize.html">BlockRealize</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1FloorModNode.html">FloorModNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1MulNode.html">MulNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir< [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BlockRealizeNode.html">BlockRealizeNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1FollowFusedSplitStep.html">FollowFusedSplitStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1MultiBoxPriorAttrs.h [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BlockRV.html">BlockRV</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1FollowFusedSplitStepNode.html">FollowFusedSplitStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1MultiBoxTransformLocAttrs.html [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BlockRVNode.html">BlockRVNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1FollowSplitStep.html">FollowSplitStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1Mutator.html">Mutator</a> (<a cl [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BlockScope.html">BlockScope</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1FollowSplitStepNode.html">FollowSplitStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1MutatorNode.html">MutatorN [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BlockScopeNode.html">BlockScopeNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1For.html">For</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_n"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;n&#160;&#16 [...]
+</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerInput.html">RunnerInput</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structTVMFuncRegistry.html">TVMFuncRegistry</a>&#160;&#160;&#160;</td></tr>
+<tr><td valign="top"><a class="el" href="classtvm_1_1Bool.html">Bool</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1ForNode.html">ForNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerInputNode.html">RunnerInputNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedu [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Broadcast.html">Broadcast</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1micro__rpc_1_1FrameBuffer.html">FrameBuffer</a> (<a class="el" href="namespacetvm_1_1runtime_1_1micro__rpc.html">tvm::runtime::micro_rpc</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1NDArray.html">NDArray</a> (<a class [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1qnn_1_1BroadcastAttrs.html">BroadcastAttrs</a> (<a class="el" href="namespacetvm_1_1relay_1_1qnn.html">tvm::relay::qnn</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1micro__rpc_1_1Framer.html">Framer</a> (<a class="el" href="namespacetvm_1_1runtime_1_1micro__rpc.html">tvm::runtime::micro_rpc</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1NDArrayContainerTra [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BroadcastNode.html">BroadcastNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ShapeTupleObj_1_1FromStd.html">ShapeTupleObj::FromStd</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1NdarraySizeAttrs.html">NdarraySizeAttrs</a> [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1vm_1_1Buffer.html">Buffer</a> (<a class="el" href="namespacetvm_1_1runtime_1_1vm.html">tvm::runtime::vm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1StringObj_1_1FromStd.html">StringObj::FromStd</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1NE.html">NE</a> (<a class="el" href="na [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Buffer.html">Buffer</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Function.html">Function</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1NENode.html">NENode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160; [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1BufferInfo.html">BufferInfo</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1FunctionNode.html">FunctionNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1NLLLossAttrs.html">NLLLossAttrs</a> (<a class="el" href [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1BufferInfoAnalysis.html">BufferInfoAnalysis</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1FunctionPattern.html">FunctionPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1NodeFunctor.html">NodeFunctor</a> (<a class [...]
+</td><td valign="top"><a class="el" href="structTVMPackedFunc.html">TVMPackedFunc</a>&#160;&#160;&#160;</td></tr>
+<tr><td valign="top"><a class="el" href="structtvm_1_1tir_1_1usmp_1_1BufferInfoAnalysisNode.html">BufferInfoAnalysisNode</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1FunctionPatternNode.html">FunctionPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1NodeFunctor_3_01R_07const_ [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1tir_1_1usmp_1_1BufferInfoNode.html">BufferInfoNode</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp.html">tvm::tir::usmp</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1FuncType.html">FuncType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1NonMaximumSuppressionAttrs.html">NonMaximumSuppressionAttrs</a> (<a class="el"  [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferLoad.html">BufferLoad</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1FuncTypeNode.html">FuncTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1NormalAttrs.html">NormalAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#16 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferLoadNode.html">BufferLoadNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1Fuse.html">Fuse</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Not.html">Not</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;< [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferNode.html">BufferNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1FuseNode.html">FuseNode</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1NotNode.html">NotNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferRealize.html">BufferRealize</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1FuseStep.html">FuseStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1NullOptType.html">NullOptType</a> (<a class="el" [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferRealizeNode.html">BufferRealizeNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1FuseStepNode.html">FuseStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_o"></a><table border="0" cellspacing="0" cell [...]
+</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ScatterNDAttrs.html">ScatterNDAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1TypeCall.html">TypeCall</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td></tr>
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferRegion.html">BufferRegion</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_g"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;g&#160;&#160;</div></td></tr></table>
+</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Schedule.html">Schedule</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1TypeCallNode.html">TypeCallNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td></tr>
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferRegionNode.html">BufferRegionNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ObjAllocatorBase.html">ObjAllocatorBase</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1Schedule.html">Schedule</a> (<a class="el" href="namesp [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferStore.html">BufferStore</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GatherAttrs.html">GatherAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> (<a class="el" href="namespacetvm_1_1runtime.html"> [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1BufferStoreNode.html">BufferStoreNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GatherNDAttrs.html">GatherNDAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1ObjectEqual.html">ObjectEqual</a> (<a class="el" href="namespa [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1Builder.html">Builder</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1GE.html">GE</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1ObjectHash.html">ObjectHash</a> (<a class="el" href="namespacetvm_1_1runtim [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInput.html">BuilderInput</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1GenericFunc.html">GenericFunc</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html">ObjectPtr</a> (<a class="el" href="namespacetvm_1_1 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html">BuilderInputNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1GenericFuncNode.html">GenericFuncNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1ObjectPtrEqual.html">ObjectPtrEqual</a> (<a class= [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderNode.html">BuilderNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1GENode.html">GENode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1ObjectPtrHash.html">ObjectPtrHash</a> (<a class="el" href=" [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderResult.html">BuilderResult</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GetValidCountsAttrs.html">GetValidCountsAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectR [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderResultNode.html">BuilderResultNode</a> (<a class="el" href="namespacetvm_1_1meta__schedule.html">tvm::meta_schedule</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GlobalPool2DAttrs.html">GlobalPool2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1ObjectTypeChecker. [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1BuildResult.html">BuildResult</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1GlobalTypeVar.html">GlobalTypeVar</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1ObjectTypeChecker_3_01Array_3_01T_01_4_01_4.html">ObjectTypeC [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1BuildResultNode.html">BuildResultNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1GlobalTypeVarNode.html">GlobalTypeVarNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1ObjectTypeChecker_3_01Map_3_01K_00_01V_01_4_0 [...]
 <tr><td rowspan="2" valign="bottom"><a name="letter_c"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;c&#160;&#160;</div></td></tr></table>
-</td><td valign="top"><a class="el" href="classtvm_1_1GlobalVarNode.html">GlobalVarNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1OnDeviceAttrs.html">OnDeviceAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1SearchSortedAttrs.html">SearchSortedAttrs</a> (<a class="el" href="namespacetvm_1_1rel [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1algo_1_1GreedyBase.html">GreedyBase</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp_1_1algo.html">tvm::tir::usmp::algo</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1OneHotAttrs.html">OneHotAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1SearchStrategy.html">Searc [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStep.html">CacheReadStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GridSampleAttrs.html">GridSampleAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1Op.html">Op</a> (<a class="el" href="na [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStepNode.html">CacheReadStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GroupNormAttrs.html">GroupNormAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1OpAttrMap.html">OpAttrMap</a> (<a [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStep.html">CacheWriteStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1GT.html">GT</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1Operation.html">Operation</a> (<a class="el" href="namespacetvm_ [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html">CacheWriteStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1GTNode.html">GTNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1OperationNode.html">OperationNode</a> (<a class= [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Call.html">Call</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_h"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;h&#160;&#160;</div></td></tr></table>
-</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpImplementation.html">OpImplementation</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1SelectNode.html">SelectNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1TypeName_3_01void_01_5_01_4.html">TypeName&lt; void * &gt;</a> (<a  [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Call.html">Call</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpImplementationNode.html">OpImplementationNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1SelectSEqualReduce.html">SelectSEqualReduce</a> (<a class="el" href="nam [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1profiling_1_1CallFrame.html">CallFrame</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1SimpleObjAllocator_1_1Handler.html">SimpleObjAllocator::Handler</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1O [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CallLoweredAttrs.html">CallLoweredAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1SEqualReducer_1_1Handler.html">SEqualReducer::Handler</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1OpRegEntry.html">OpRegEntry</a> (<a class="el" href="namespacetvm.html" [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1CallNode.html">CallNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1SHashReducer_1_1Handler.html">SHashReducer::Handler</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpSpecialization.html">OpSpecialization</a> (<a class="el" href="namespacetvm_1_1 [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CallNode.html">CallNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structdmlc_1_1serializer_1_1Handler_3_01DLDataType_01_4.html">Handler&lt; DLDataType &gt;</a> (<a class="el" href="namespacedmlc_1_1serializer.html">dmlc::serializer</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpSpecializationNode.html">OpSpec [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1CallPattern.html">CallPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structdmlc_1_1serializer_1_1Handler_3_01DLDevice_01_4.html">Handler&lt; DLDevice &gt;</a> (<a class="el" href="namespacedmlc_1_1serializer.html">dmlc::serializer</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpStrategy.html">OpStrate [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1CallPatternNode.html">CallPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1HardwareParams.html">HardwareParams</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpStrategyNode.html">OpStrate [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1CanonicalSimplifier.html">CanonicalSimplifier</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1HardwareParamsNode.html">HardwareParamsNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Optional.ht [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Cast.html">Cast</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1HybridOp.html">HybridOp</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Or.html">Or</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CastAttrs.html">CastAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1HybridOpNode.html">HybridOpNode</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1OrNode.html">OrNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>) [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CastHintAttrs.html">CastHintAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_i"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;i&#160;&#160;</div></td></tr></table>
-</td><td rowspan="2" valign="bottom"><a name="letter_p"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;p&#160;&#160;</div></td></tr></table>
-</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1Sequential.html">Sequential</a> (<a class="el" href="namespacetvm_1_1transform.html">tvm::transform</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_u"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;u&#160;&#160;</div></td></tr></table>
+</td><td valign="top"><a class="el" href="classtvm_1_1GlobalVar.html">GlobalVar</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1OnDeviceAttrs.html">OnDeviceAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1SearchPolicyNode.html">SearchPolicyNode</a> (<a class="el" href="namespacetvm_1_1auto [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1GlobalVarNode.html">GlobalVarNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1OneHotAttrs.html">OneHotAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1SearchSortedAttrs.html">SearchSortedAttrs</a> (<a class="el" href="namespacetvm_1_1relay.ht [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStep.html">CacheReadStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1algo_1_1GreedyBase.html">GreedyBase</a> (<a class="el" href="namespacetvm_1_1tir_1_1usmp_1_1algo.html">tvm::tir::usmp::algo</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1Op.html">Op< [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheReadStepNode.html">CacheReadStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GridSampleAttrs.html">GridSampleAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1OpAttrMap.html">OpAttrMap</a> ( [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStep.html">CacheWriteStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1GroupNormAttrs.html">GroupNormAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1Operation.html">Operation</a> (<a [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1CacheWriteStepNode.html">CacheWriteStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1GT.html">GT</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1OperationNode.html">OperationNode</a> (<a class="el" hre [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Call.html">Call</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1GTNode.html">GTNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpImplementation.html">OpImplementation</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::rela [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Call.html">Call</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_h"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;h&#160;&#160;</div></td></tr></table>
+</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpImplementationNode.html">OpImplementationNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1SelectNode.html">SelectNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1TypeName_3_01uint64__t_01_4.html">TypeName&lt; uint64_t &gt [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1profiling_1_1CallFrame.html">CallFrame</a> (<a class="el" href="namespacetvm_1_1runtime_1_1profiling.html">tvm::runtime::profiling</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1OpNode.html">OpNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1SelectSEqualReduce.html">SelectSEqualReduce</a> (<a class="el" hr [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CallLoweredAttrs.html">CallLoweredAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1SimpleObjAllocator_1_1Handler.html">SimpleObjAllocator::Handler</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1OpRegEntry.html">OpRegEntry</a [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1CallNode.html">CallNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1SEqualReducer_1_1Handler.html">SEqualReducer::Handler</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpSpecialization.html">OpSpecialization</a> (<a class="el" href="namespacetvm_1 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CallNode.html">CallNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1SHashReducer_1_1Handler.html">SHashReducer::Handler</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpSpecializationNode.html">OpSpecializationNode</a> (<a class="el" href="namespacetvm_1 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1CallPattern.html">CallPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structdmlc_1_1serializer_1_1Handler_3_01DLDataType_01_4.html">Handler&lt; DLDataType &gt;</a> (<a class="el" href="namespacedmlc_1_1serializer.html">dmlc::serializer</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpStrategy.html">OpSt [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1CallPatternNode.html">CallPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structdmlc_1_1serializer_1_1Handler_3_01DLDevice_01_4.html">Handler&lt; DLDevice &gt;</a> (<a class="el" href="namespacedmlc_1_1serializer.html">dmlc::serializer</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1OpStrategyNode.ht [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1CanonicalSimplifier.html">CanonicalSimplifier</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1HardwareParams.html">HardwareParams</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Optional.html">Opti [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Cast.html">Cast</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1HardwareParamsNode.html">HardwareParamsNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Or.html">Or</a> (<a class="el" href="namespacetvm_1 [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CastAttrs.html">CastAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1HybridOp.html">HybridOp</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1OrNode.html">OrNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&# [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CastHintAttrs.html">CastHintAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1HybridOpNode.html">HybridOpNode</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_p"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#1 [...]
+</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1SequenceMaskAttrs.html">SequenceMaskAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1TypeVarNode.html">TypeVarNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td></tr>
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CastNode.html">CastNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td rowspan="2" valign="bottom"><a name="letter_i"></a><table border="0" cellspacing="0" cellpadding="0"><tr><td><div class="ah">&#160;&#160;i&#160;&#160;</div></td></tr></table>
+</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1Sequential.html">Sequential</a> (<a class="el" href="namespacetvm_1_1transform.html">tvm::transform</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1TypeVisitor.html">TypeVisitor</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td></tr>
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Clause.html">Clause</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1PackedFunc.html">PackedFunc</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1SequentialNode.html">SequentialNode</a> (<a class="el" href="namespacetvm_ [...]
 </td></tr>
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CastNode.html">CastNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1SequentialNode.html">SequentialNode</a> (<a class="el" href="namespacetvm_1_1transform.html">tvm::transform</a>)&#160;&#160;&#160;</td></tr>
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Clause.html">Clause</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Id.html">Id</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1PackedFunc.html">PackedFunc</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ClauseNode.html">ClauseNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IdNode.html">IdNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1PackedFuncObj.html">PackedFuncObj</a> (<a class="el" href="namespacetvm_1_1runtime [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ClipAttrs.html">ClipAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1If.html">If</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1PackedFuncSubObj.html">PackedFuncSubObj</a> (<a class="el" href="namespacetvm_1_1runtime.ht [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Closure.html">Closure</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IfNode.html">IfNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter.html">PackedFuncValueConverter</a> (<a class="el" href=" [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ClosureObj.html">ClosureObj</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IfPattern.html">IfPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01Optional_3_01T_01_4_01_4.html">PackedFun [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CmpOpNode.html">CmpOpNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IfPatternNode.html">IfPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01PrimExpr_01_4.html">PackedFuncValueConverter&l [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ClauseNode.html">ClauseNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Id.html">Id</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1PackedFuncObj.html">PackedFuncObj</a> (<a class="el" href="namespacetvm_1_1runtime.html">t [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ClipAttrs.html">ClipAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IdNode.html">IdNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1PackedFuncSubObj.html">PackedFuncSubObj</a> (<a class="el" href="namespacetvm_1_1ru [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1Closure.html">Closure</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1If.html">If</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter.html">PackedFuncValueConverter</a> (<a class="el" href="namespac [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1ClosureObj.html">ClosureObj</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IfNode.html">IfNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01Optional_3_01T_01_4_01_4.html">PackedFuncValue [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CmpOpNode.html">CmpOpNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IfPattern.html">IfPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01PrimExpr_01_4.html">PackedFuncValueConverter&lt; PrimE [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CommReducer.html">CommReducer</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1IfPatternNode.html">IfPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01tvm_1_1Bool_01_4.html">PackedFuncValueConv [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CommReducerNode.html">CommReducerNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IfThenElse.html">IfThenElse</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01tvm_1_1Integer_01_4.html">PackedFuncValueConve [...]
 </td></tr>
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CommReducer.html">CommReducer</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IfThenElse.html">IfThenElse</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01tvm_1_1Bool_01_4.html">PackedFuncValueConverter&lt; tv [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1tir_1_1CommReducerNode.html">CommReducerNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IfThenElseNode.html">IfThenElseNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_01tvm_1_1Integer_01_4.html">PackedFuncVa [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1CompilationConfig.html">CompilationConfig</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSEqualReduce.html">ImplSEqualReduce</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_1_1tvm_1_1runtime_1_1String_01_4.html">Pa [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1CompilationConfigNode.html">CompilationConfigNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSEqualReduce_3_01T_00_01true_01_4.html">ImplSEqualReduce&lt; T, true &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1micro__rpc_1_1PacketFie [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1CompileError.html">CompileError</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSHashReduce.html">ImplSHashReduce</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1PadAttrs.html">PadAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm:: [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CompilerAttrs.html">CompilerAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSHashReduce_3_01T_00_01true_01_4.html">ImplSHashReduce&lt; T, true &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1Pass.html">Pa [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStep.html">ComputeAtStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplVisitAttrs.html">ImplVisitAttrs</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1PassContext.html">PassCon [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStepNode.html">ComputeAtStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplVisitAttrs_3_01T_00_01true_01_4.html">ImplVisitAttrs&lt; T, true &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="clas [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAG.html">ComputeDAG</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IncompleteType.html">IncompleteType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1PassInfo.html">PassInfo</a> (<a class="el" href="namespacetv [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAGNode.html">ComputeDAGNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IncompleteTypeNode.html">IncompleteTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1PassInfoNode.html">PassInfoNode</a> (<a clas [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeInlineStep.html">ComputeInlineStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1instrument_1_1PassInstrument.html">PassInstrument</a [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeInlineStepNode.html">ComputeInlineStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IndexMapNode.html">IndexMapNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.htm [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1te_1_1ComputeOp.html">ComputeOp</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1InitOpAttrs.html">InitOpAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1PassNode.html">PassNode</a> (<a class="el" href="namespacetvm_1_1transform.html" [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1te_1_1ComputeOpNode.html">ComputeOpNode</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1InplaceArrayBase.html">InplaceArrayBase</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Pattern.html">Pattern</a> (<a class="el" href="namespacetvm_1 [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeRootStep.html">ComputeRootStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1InstanceNormAttrs.html">InstanceNormAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternConstructor.htm [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeRootStepNode.html">ComputeRootStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1vm_1_1Instruction.html">Instruction</a> (<a class="el" href="namespacetvm_1_1runtime_1_1vm.html">tvm::runtime::vm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Patt [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConcatenateAttrs.html">ConcatenateAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Instruction.html">Instruction</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternFunctor.html">PatternFunctor</a> (<a class="el" href="namesp [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Constant.html">Constant</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionKind.html">InstructionKind</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html">Patt [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstantNode.html">ConstantNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionKindNode.html">InstructionKindNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternMutator.html">PatternMutator</a> (<a class="el" href= [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstantPattern.html">ConstantPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionKindRegEntry.html">InstructionKindRegEntry</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternNode.html">PatternNode</a> (<a class="e [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1CompilationConfig.html">CompilationConfig</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IfThenElseNode.html">IfThenElseNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1PackedFuncValueConverter_3_1_1tvm_1_1runtime_1_1String_01_4.html">PackedFuncValueC [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1CompilationConfigNode.html">CompilationConfigNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSEqualReduce.html">ImplSEqualReduce</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1micro__rpc_1_1PacketFieldSizeBytes.html">PacketFieldSizeBytes [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1CompileError.html">CompileError</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSEqualReduce_3_01T_00_01true_01_4.html">ImplSEqualReduce&lt; T, true &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1PadAttrs.html">PadAttrs</a> (<a class="el" [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1CompilerAttrs.html">CompilerAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSHashReduce.html">ImplSHashReduce</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1Pass.html">Pass</a> (<a class="el" href="namespacet [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStep.html">ComputeAtStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplSHashReduce_3_01T_00_01true_01_4.html">ImplSHashReduce&lt; T, true &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1 [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeAtStepNode.html">ComputeAtStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplVisitAttrs.html">ImplVisitAttrs</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1PassContextNode.h [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAG.html">ComputeDAG</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1ImplVisitAttrs_3_01T_00_01true_01_4.html">ImplVisitAttrs&lt; T, true &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transf [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeDAGNode.html">ComputeDAGNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IncompleteType.html">IncompleteType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1PassInfoNode.html">PassInfoNode</a> (<a class="el" h [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeInlineStep.html">ComputeInlineStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IncompleteTypeNode.html">IncompleteTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1instrument_1_1PassInstrument.html">PassInstrument</ [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeInlineStepNode.html">ComputeInlineStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IndexMap.html">IndexMap</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html">PassI [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1te_1_1ComputeOp.html">ComputeOp</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1IndexMapNode.html">IndexMapNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1transform_1_1PassNode.html">PassNode</a> (<a class="el" href="namespacetvm_1_1transform.html">tvm: [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1te_1_1ComputeOpNode.html">ComputeOpNode</a> (<a class="el" href="namespacetvm_1_1te.html">tvm::te</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1InitOpAttrs.html">InitOpAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Pattern.html">Pattern</a> (<a class="el" href="namespacetvm_1_1relay.html">t [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeRootStep.html">ComputeRootStep</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1InplaceArrayBase.html">InplaceArrayBase</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternConstructor. [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1auto__scheduler_1_1ComputeRootStepNode.html">ComputeRootStepNode</a> (<a class="el" href="namespacetvm_1_1auto__scheduler.html">tvm::auto_scheduler</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1relay_1_1InstanceNormAttrs.html">InstanceNormAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternConstru [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConcatenateAttrs.html">ConcatenateAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1runtime_1_1vm_1_1Instruction.html">Instruction</a> (<a class="el" href="namespacetvm_1_1runtime_1_1vm.html">tvm::runtime::vm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternFunctor.html">PatternFunctor</a [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1Constant.html">Constant</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1Instruction.html">Instruction</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html">PatternFunct [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstantNode.html">ConstantNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionKind.html">InstructionKind</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternMutator.html">PatternMutator</a> (<a class="el" href="namespa [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstantPattern.html">ConstantPattern</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionKindNode.html">InstructionKindNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternNode.html">PatternNode</a> (<a class="el" href= [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstantPatternNode.html">ConstantPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionKindRegEntry.html">InstructionKindRegEntry</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternTuple.html">PatternTuple</a> (< [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstIntBound.html">ConstIntBound</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionNode.html">InstructionNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternTupleNode.html">PatternTupleNode</a> (<a class="el" href="n [...]
 </td></tr>
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstantPatternNode.html">ConstantPatternNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1InstructionNode.html">InstructionNode</a> (<a class="el" href="namespacetvm_1_1tir.html">tvm::tir</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternTuple.html">PatternTuple</a> (<a class="el" hre [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstIntBound.html">ConstIntBound</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraints.html">IntConstraints</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternTupleNode.html">PatternTupleNode</a> (<a class="el" hre [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstIntBoundAnalyzer.html">ConstIntBoundAnalyzer</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraintsNode.html">IntConstraintsNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> (<a cl [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstIntBoundNode.html">ConstIntBoundNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraintsTransform.html">IntConstraintsTransform</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternVarNode.html">PatternVarNode< [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstraintContext.html">ConstraintContext</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraintsTransformNode.html">IntConstraintsTransformNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternVisitor.html">Pattern [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1Constructor.html">Constructor</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1Integer.html">Integer</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternWildcard.html">PatternWildcard</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td>< [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1ConstructorNode.html">ConstructorNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1InterpreterClosure.html">InterpreterClosure</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternWildcardNode.html">PatternWildcardNode</a> (<a class="el" href="name [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstructorValue.html">ConstructorValue</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1InterpreterClosureObj.html">InterpreterClosureObj</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1PercentNode.html">PercentNo [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConstructorValueObj.html">ConstructorValueObj</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntGroupBounds.html">IntGroupBounds</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1PlaceholderOp.html">PlaceholderOp</a> (<a class="el" [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1NDArray_1_1Container.html">NDArray::Container</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntGroupBoundsNode.html">IntGroupBoundsNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1PlaceholderOpNode.html">PlaceholderOpNo [...]
-<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1NDArray_1_1ContainerBase.html">NDArray::ContainerBase</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IntImm.html">IntImm</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PointerType.html">PointerType</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#16 [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv1DAttrs.html">Conv1DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IntImmNode.html">IntImmNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PointerTypeNode.html">PointerTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160 [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv1DTransposeAttrs.html">Conv1DTransposeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntSet.html">IntSet</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1PoolAllocation.html">PoolAllocation</a> (<a class="el" hr [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DAttrs.html">Conv2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntSetAnalyzer.html">IntSetAnalyzer</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1tir_1_1usmp_1_1PoolAllocationNode.html">PoolAllocationNode</a> (<a class= [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstIntBoundAnalyzer.html">ConstIntBoundAnalyzer</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraints.html">IntConstraints</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> (<a class="el" [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstIntBoundNode.html">ConstIntBoundNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraintsNode.html">IntConstraintsNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternVarNode.html">PatternVarNode</a> (<a cl [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1arith_1_1ConstraintContext.html">ConstraintContext</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraintsTransform.html">IntConstraintsTransform</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternVisitor.html">PatternVisitor< [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1Constructor.html">Constructor</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntConstraintsTransformNode.html">IntConstraintsTransformNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternWildcard.html">PatternWildcard</a> (<a class="el" href="na [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1ConstructorNode.html">ConstructorNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1Integer.html">Integer</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1PatternWildcardNode.html">PatternWildcardNode</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&# [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1relay_1_1ConstructorValue.html">ConstructorValue</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1InterpreterClosure.html">InterpreterClosure</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1PercentNode.html">PercentNode</a> [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1ConstructorValueObj.html">ConstructorValueObj</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1relay_1_1InterpreterClosureObj.html">InterpreterClosureObj</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1PlaceholderOp.html">PlaceholderOp</a>  [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1NDArray_1_1Container.html">NDArray::Container</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntGroupBounds.html">IntGroupBounds</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1te_1_1PlaceholderOpNode.html">PlaceholderOpNode</a> ( [...]
+<tr><td valign="top"><a class="el" href="classtvm_1_1runtime_1_1NDArray_1_1ContainerBase.html">NDArray::ContainerBase</a> (<a class="el" href="namespacetvm_1_1runtime.html">tvm::runtime</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntGroupBoundsNode.html">IntGroupBoundsNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PointerType.html">PointerType</a> (<a [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv1DAttrs.html">Conv1DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IntImm.html">IntImm</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PointerTypeNode.html">PointerTypeNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><t [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv1DTransposeAttrs.html">Conv1DTransposeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IntImmNode.html">IntImmNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1tir_1_1usmp_1_1PoolAllocation.html">PoolAllocation</a> (<a class="el" href="namespacetvm_ [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DAttrs.html">Conv2DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntSet.html">IntSet</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1tir_1_1usmp_1_1PoolAllocationNode.html">PoolAllocationNode</a> (<a class="el" href="names [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DTransposeAttrs.html">Conv2DTransposeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntSetAnalyzer.html">IntSetAnalyzer</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PoolInfo.html">PoolInfo</a> (<a class="el" href="namespa [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DWinogradAttrs.html">Conv2DWinogradAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntSetNode.html">IntSetNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1PoolInfoNode.html">PoolInfoNode</a> (<a class="el" href="namespac [...]
 </td></tr>
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DTransposeAttrs.html">Conv2DTransposeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1arith_1_1IntSetNode.html">IntSetNode</a> (<a class="el" href="namespacetvm_1_1arith.html">tvm::arith</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1PoolInfo.html">PoolInfo</a> (<a class="el" href="namespacetvm.ht [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DWinogradAttrs.html">Conv2DWinogradAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IRModule.html">IRModule</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1PoolInfoNode.html">PoolInfoNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#16 [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DWinogradNNPACKWeightTransformAttrs.html">Conv2DWinogradNNPACKWeightTransformAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IRModuleNode.html">IRModuleNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1Postproc.html">Postproc</a> [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv3DAttrs.html">Conv3DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1is__specialized.html">is_specialized</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1PostprocNode.html">PostprocNode</a> (<a class="el"  [...]
-<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv3DTransposeAttrs.html">Conv3DTransposeAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="structtvm_1_1detail_1_1is__specialized_3_01Container_3_01Args_8_8_8_01_4_00_01Container_01_4.html">is_specialized&lt; Container&lt; Args... &gt;, Container &gt;</a> (<a class="el" href="namespacetvm_1_1detail.html">tvm::detail</a>)&#160;&#160;&#160;</td [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv2DWinogradNNPACKWeightTransformAttrs.html">Conv2DWinogradNNPACKWeightTransformAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IRModule.html">IRModule</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1Postproc.html">Postproc</a> (<a cla [...]
+<tr><td valign="top"><a class="el" href="structtvm_1_1relay_1_1Conv3DAttrs.html">Conv3DAttrs</a> (<a class="el" href="namespacetvm_1_1relay.html">tvm::relay</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1IRModuleNode.html">IRModuleNode</a> (<a class="el" href="namespacetvm.html">tvm</a>)&#160;&#160;&#160;</td><td valign="top"><a class="el" href="classtvm_1_1meta__schedule_1_1PostprocNode.html">PostprocNode</a> (<a class="el" href="namespacetvm_1_1meta__schedu [...]
 <tr><td></td><td></td><td></td><td></td><td></td></tr>
 </table>
 <div class="qindex"><a class="qindex" href="#letter_a">a</a>&#160;|&#160;<a class="qindex" href="#letter_b">b</a>&#160;|&#160;<a class="qindex" href="#letter_c">c</a>&#160;|&#160;<a class="qindex" href="#letter_d">d</a>&#160;|&#160;<a class="qindex" href="#letter_e">e</a>&#160;|&#160;<a class="qindex" href="#letter_f">f</a>&#160;|&#160;<a class="qindex" href="#letter_g">g</a>&#160;|&#160;<a class="qindex" href="#letter_h">h</a>&#160;|&#160;<a class="qindex" href="#letter_i">i</a>&#160;|& [...]
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParam-members.html b/docs/reference/api/doxygen/classtvm_1_1LinkedParam-members.html
deleted file mode 100644
index da218aff6..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParam-members.html
+++ /dev/null
@@ -1,102 +0,0 @@
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml">
-<head>
-<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
-<meta http-equiv="X-UA-Compatible" content="IE=9"/>
-<meta name="generator" content="Doxygen 1.8.13"/>
-<meta name="viewport" content="width=device-width, initial-scale=1"/>
-<title>tvm: Member List</title>
-<link href="tabs.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="jquery.js"></script>
-<script type="text/javascript" src="dynsections.js"></script>
-<link href="search/search.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="search/searchdata.js"></script>
-<script type="text/javascript" src="search/search.js"></script>
-<link href="doxygen.css" rel="stylesheet" type="text/css" />
-</head>
-<body>
-<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
-<div id="titlearea">
-<table cellspacing="0" cellpadding="0">
- <tbody>
- <tr style="height: 56px;">
-  <td id="projectalign" style="padding-left: 0.5em;">
-   <div id="projectname">tvm
-   </div>
-  </td>
- </tr>
- </tbody>
-</table>
-</div>
-<!-- end header part -->
-<!-- Generated by Doxygen 1.8.13 -->
-<script type="text/javascript">
-var searchBox = new SearchBox("searchBox", "search",false,'Search');
-</script>
-<script type="text/javascript" src="menudata.js"></script>
-<script type="text/javascript" src="menu.js"></script>
-<script type="text/javascript">
-$(function() {
-  initMenu('',true,false,'search.php','Search');
-  $(document).ready(function() { init_search(); });
-});
-</script>
-<div id="main-nav"></div>
-<!-- window showing the filter options -->
-<div id="MSearchSelectWindow"
-     onmouseover="return searchBox.OnSearchSelectShow()"
-     onmouseout="return searchBox.OnSearchSelectHide()"
-     onkeydown="return searchBox.OnSearchSelectKey(event)">
-</div>
-
-<!-- iframe showing the search results (closed by default) -->
-<div id="MSearchResultsWindow">
-<iframe src="javascript:void(0)" frameborder="0" 
-        name="MSearchResults" id="MSearchResults">
-</iframe>
-</div>
-
-<div id="nav-path" class="navpath">
-  <ul>
-<li class="navelem"><a class="el" href="namespacetvm.html">tvm</a></li><li class="navelem"><a class="el" href="classtvm_1_1LinkedParam.html">LinkedParam</a></li>  </ul>
-</div>
-</div><!-- top -->
-<div class="header">
-  <div class="headertitle">
-<div class="title">tvm::LinkedParam Member List</div>  </div>
-</div><!--header-->
-<div class="contents">
-
-<p>This is the complete list of members for <a class="el" href="classtvm_1_1LinkedParam.html">tvm::LinkedParam</a>, including all inherited members.</p>
-<table class="directory">
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a3e9b0901b6e01257b060a45e159cc37e">_type_is_nullable</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a2d76fa1fb628ff276a284e61123589c5">as</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aa5c355fbb7d2f7402ee360dba8a52cdd">ContainerType</a> typedef</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#ac261cdb80487fb29ac42b28678f8cbef">data_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a17d8d5ad92691f9e18e3e0ae8ef69e4f">defined</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#acd04bb22a6861e9952c344ee8547411f">DowncastNoCheck</a>(ObjectRef ref)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">protected</span><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a22e5bb9d64dbc773bb9263b70882239e">FFIClearAfterMove</a>(ObjectRef *ref)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">protected</span><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aadbc0886ffa80162ff31eefd0431ba09">get</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#ae423057ecf93c18714d17f53cd1d318f">get_mutable</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">protected</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aed593996e4076632450de8fde776707c">GetDataPtr</a>(const ObjectRef &amp;ref)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">protected</span><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1LinkedParam.html#a12aed83524087bde67a8e2eb4cfc5d97">LinkedParam</a>(int64_t id, tvm::runtime::NDArray param)</td><td class="entry"><a class="el" href="classtvm_1_1LinkedParam.html">tvm::LinkedParam</a></td><td class="entry"></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aa07c1f6d66a438ea950637d13ed09471">ObjectRef</a>()=default</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a6a7dd7404edf1c26f8dbd9bd92d03a02">ObjectRef</a>(ObjectPtr&lt; Object &gt; data)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">explicit</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aa1bd13a7185cb4b2b6bdde49416e8aa4">operator!=</a>(const ObjectRef &amp;other) const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a3deeeac5827a88f375b8c6ae1039c219">operator-&gt;</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a4744bf4a1b48f202d41b51dc5e08e6ee">operator&lt;</a>(const ObjectRef &amp;other) const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#affdf1b8cdb36e140de7b3ad7064e4617">operator==</a>(const ObjectRef &amp;other) const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#ae31a5b9f40781d60a2901994ead700e8">same_as</a>(const ObjectRef &amp;other) const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1LinkedParam.html#a17df7ce77a67396945de4e185174e4b5">TVM_DEFINE_OBJECT_REF_COW_METHOD</a>(LinkedParamNode)</td><td class="entry"><a class="el" href="classtvm_1_1LinkedParam.html">tvm::LinkedParam</a></td><td class="entry"></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1LinkedParam.html#af123bcc5f3ef0c0c5089f07e91fa3c19">TVM_DEFINE_OBJECT_REF_METHODS</a>(LinkedParam, ObjectRef, LinkedParamNode)</td><td class="entry"><a class="el" href="classtvm_1_1LinkedParam.html">tvm::LinkedParam</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a4e7cdb1574b93a59e784d70aa47b8da7">unique</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a0ae0da21d247cd87ea94fe3777c4405e">use_count</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-</table></div><!-- contents -->
-<!-- start footer part -->
-<hr class="footer"/><address class="footer"><small>
-Generated by &#160;<a href="http://www.doxygen.org/index.html">
-<img class="footer" src="doxygen.png" alt="doxygen"/>
-</a> 1.8.13
-</small></address>
-</body>
-</html>
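
(For context on the two Doxygen pages deleted above and below: they documented the tvm::LinkedParam reference class. The following is a minimal C++ usage sketch based only on the signatures visible in the deleted member list, namely the LinkedParam(int64_t id, tvm::runtime::NDArray param) constructor and the inherited ObjectRef helpers. The include paths and the NDArray setup are assumptions for illustration, and the class itself may no longer exist after the commit this email describes.)

    // Minimal sketch, assuming tvm::LinkedParam is declared in <tvm/ir/module.h>
    // as shown by the documentation pages this commit removes.
    #include <tvm/ir/module.h>
    #include <tvm/runtime/ndarray.h>

    void LinkedParamSketch() {
      // Build a small fp32 parameter tensor on CPU (assumed setup, not from the diff).
      tvm::runtime::NDArray param = tvm::runtime::NDArray::Empty(
          {4}, DLDataType{kDLFloat, 32, 1}, DLDevice{kDLCPU, 0});
      // Wrap it with id 0, matching the LinkedParam(int64_t id,
      // tvm::runtime::NDArray param) constructor in the removed member list.
      tvm::LinkedParam lp(0, param);
      // LinkedParam is an ObjectRef, so the usual reference helpers apply.
      bool ok = lp.defined();
      (void)ok;
    }
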
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParam.html b/docs/reference/api/doxygen/classtvm_1_1LinkedParam.html
deleted file mode 100644
index 0e9742d2f..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParam.html
+++ /dev/null
@@ -1,256 +0,0 @@
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml">
-<head>
-<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
-<meta http-equiv="X-UA-Compatible" content="IE=9"/>
-<meta name="generator" content="Doxygen 1.8.13"/>
-<meta name="viewport" content="width=device-width, initial-scale=1"/>
-<title>tvm: tvm::LinkedParam Class Reference</title>
-<link href="tabs.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="jquery.js"></script>
-<script type="text/javascript" src="dynsections.js"></script>
-<link href="search/search.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="search/searchdata.js"></script>
-<script type="text/javascript" src="search/search.js"></script>
-<link href="doxygen.css" rel="stylesheet" type="text/css" />
-</head>
-<body>
-<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
-<div id="titlearea">
-<table cellspacing="0" cellpadding="0">
- <tbody>
- <tr style="height: 56px;">
-  <td id="projectalign" style="padding-left: 0.5em;">
-   <div id="projectname">tvm
-   </div>
-  </td>
- </tr>
- </tbody>
-</table>
-</div>
-<!-- end header part -->
-<!-- Generated by Doxygen 1.8.13 -->
-<script type="text/javascript">
-var searchBox = new SearchBox("searchBox", "search",false,'Search');
-</script>
-<script type="text/javascript" src="menudata.js"></script>
-<script type="text/javascript" src="menu.js"></script>
-<script type="text/javascript">
-$(function() {
-  initMenu('',true,false,'search.php','Search');
-  $(document).ready(function() { init_search(); });
-});
-</script>
-<div id="main-nav"></div>
-<!-- window showing the filter options -->
-<div id="MSearchSelectWindow"
-     onmouseover="return searchBox.OnSearchSelectShow()"
-     onmouseout="return searchBox.OnSearchSelectHide()"
-     onkeydown="return searchBox.OnSearchSelectKey(event)">
-</div>
-
-<!-- iframe showing the search results (closed by default) -->
-<div id="MSearchResultsWindow">
-<iframe src="javascript:void(0)" frameborder="0" 
-        name="MSearchResults" id="MSearchResults">
-</iframe>
-</div>
-
-<div id="nav-path" class="navpath">
-  <ul>
-<li class="navelem"><a class="el" href="namespacetvm.html">tvm</a></li><li class="navelem"><a class="el" href="classtvm_1_1LinkedParam.html">LinkedParam</a></li>  </ul>
-</div>
-</div><!-- top -->
-<div class="header">
-  <div class="summary">
-<a href="#pub-methods">Public Member Functions</a> &#124;
-<a href="classtvm_1_1LinkedParam-members.html">List of all members</a>  </div>
-  <div class="headertitle">
-<div class="title">tvm::LinkedParam Class Reference</div>  </div>
-</div><!--header-->
-<div class="contents">
-
-<p>Managed reference to <a class="el" href="classtvm_1_1LinkedParamNode.html" title="Describes one parameter that should be linked into the generated module. ">LinkedParamNode</a>.  
- <a href="classtvm_1_1LinkedParam.html#details">More...</a></p>
-
-<p><code>#include &lt;<a class="el" href="ir_2module_8h_source.html">module.h</a>&gt;</code></p>
-<div class="dynheader">
-Inheritance diagram for tvm::LinkedParam:</div>
-<div class="dyncontent">
-<div class="center"><iframe scrolling="no" frameborder="0" src="classtvm_1_1LinkedParam__inherit__graph.svg" width="216" height="507"><p><b>This browser is not able to show SVG: try Firefox, Chrome, Safari, or Opera instead.</b></p></iframe>
-</div>
-</div>
-<div class="dynheader">
-Collaboration diagram for tvm::LinkedParam:</div>
-<div class="dyncontent">
-<div class="center"><iframe scrolling="no" frameborder="0" src="classtvm_1_1LinkedParam__coll__graph.svg" width="216" height="795"><p><b>This browser is not able to show SVG: try Firefox, Chrome, Safari, or Opera instead.</b></p></iframe>
-</div>
-</div>
-<table class="memberdecls">
-<tr class="heading"><td colspan="2"><h2 class="groupheader"><a name="pub-methods"></a>
-Public Member Functions</h2></td></tr>
-<tr class="memitem:a12aed83524087bde67a8e2eb4cfc5d97"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParam.html#a12aed83524087bde67a8e2eb4cfc5d97">LinkedParam</a> (int64_t id, <a class="el" href="classtvm_1_1runtime_1_1NDArray.html">tvm::runtime::NDArray</a> param)</td></tr>
-<tr class="separator:a12aed83524087bde67a8e2eb4cfc5d97"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:af123bcc5f3ef0c0c5089f07e91fa3c19"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParam.html#af123bcc5f3ef0c0c5089f07e91fa3c19">TVM_DEFINE_OBJECT_REF_METHODS</a> (<a class="el" href="classtvm_1_1LinkedParam.html">LinkedParam</a>, <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a>, <a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a>)< [...]
-<tr class="separator:af123bcc5f3ef0c0c5089f07e91fa3c19"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a17df7ce77a67396945de4e185174e4b5"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParam.html#a17df7ce77a67396945de4e185174e4b5">TVM_DEFINE_OBJECT_REF_COW_METHOD</a> (<a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a>)</td></tr>
-<tr class="separator:a17df7ce77a67396945de4e185174e4b5"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td colspan="2" onclick="javascript:toggleInherit('pub_methods_classtvm_1_1runtime_1_1ObjectRef')"><img src="closed.png" alt="-"/>&#160;Public Member Functions inherited from <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td></tr>
-<tr class="memitem:aa07c1f6d66a438ea950637d13ed09471 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aa07c1f6d66a438ea950637d13ed09471">ObjectRef</a> ()=default</td></tr>
-<tr class="memdesc:aa07c1f6d66a438ea950637d13ed09471 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">default constructor  <a href="classtvm_1_1runtime_1_1ObjectRef.html#aa07c1f6d66a438ea950637d13ed09471">More...</a><br /></td></tr>
-<tr class="separator:aa07c1f6d66a438ea950637d13ed09471 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a6a7dd7404edf1c26f8dbd9bd92d03a02 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a6a7dd7404edf1c26f8dbd9bd92d03a02">ObjectRef</a> (<a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html">ObjectPtr</a>&lt; <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &gt; data)</td></tr>
-<tr class="memdesc:a6a7dd7404edf1c26f8dbd9bd92d03a02 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight"><a class="el" href="classtvm_1_1Constructor.html" title="Managed reference to ConstructorNode. ">Constructor</a> from existing object ptr.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#a6a7dd7404edf1c26f8dbd9bd92d03a02">More...</a><br /></td></tr>
-<tr class="separator:a6a7dd7404edf1c26f8dbd9bd92d03a02 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:ae31a5b9f40781d60a2901994ead700e8 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#ae31a5b9f40781d60a2901994ead700e8">same_as</a> (const <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a> &amp;other) const</td></tr>
-<tr class="memdesc:ae31a5b9f40781d60a2901994ead700e8 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Comparator.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#ae31a5b9f40781d60a2901994ead700e8">More...</a><br /></td></tr>
-<tr class="separator:ae31a5b9f40781d60a2901994ead700e8 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:affdf1b8cdb36e140de7b3ad7064e4617 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#affdf1b8cdb36e140de7b3ad7064e4617">operator==</a> (const <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a> &amp;other) const</td></tr>
-<tr class="memdesc:affdf1b8cdb36e140de7b3ad7064e4617 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Comparator.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#affdf1b8cdb36e140de7b3ad7064e4617">More...</a><br /></td></tr>
-<tr class="separator:affdf1b8cdb36e140de7b3ad7064e4617 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:aa1bd13a7185cb4b2b6bdde49416e8aa4 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aa1bd13a7185cb4b2b6bdde49416e8aa4">operator!=</a> (const <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a> &amp;other) const</td></tr>
-<tr class="memdesc:aa1bd13a7185cb4b2b6bdde49416e8aa4 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Comparator.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#aa1bd13a7185cb4b2b6bdde49416e8aa4">More...</a><br /></td></tr>
-<tr class="separator:aa1bd13a7185cb4b2b6bdde49416e8aa4 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a4744bf4a1b48f202d41b51dc5e08e6ee inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a4744bf4a1b48f202d41b51dc5e08e6ee">operator&lt;</a> (const <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a> &amp;other) const</td></tr>
-<tr class="memdesc:a4744bf4a1b48f202d41b51dc5e08e6ee inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Comparator.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#a4744bf4a1b48f202d41b51dc5e08e6ee">More...</a><br /></td></tr>
-<tr class="separator:a4744bf4a1b48f202d41b51dc5e08e6ee inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a17d8d5ad92691f9e18e3e0ae8ef69e4f inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a17d8d5ad92691f9e18e3e0ae8ef69e4f">defined</a> () const</td></tr>
-<tr class="separator:a17d8d5ad92691f9e18e3e0ae8ef69e4f inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:aadbc0886ffa80162ff31eefd0431ba09 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">const <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> *&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aadbc0886ffa80162ff31eefd0431ba09">get</a> () const</td></tr>
-<tr class="separator:aadbc0886ffa80162ff31eefd0431ba09 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a3deeeac5827a88f375b8c6ae1039c219 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">const <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> *&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a3deeeac5827a88f375b8c6ae1039c219">operator-&gt;</a> () const</td></tr>
-<tr class="separator:a3deeeac5827a88f375b8c6ae1039c219 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a4e7cdb1574b93a59e784d70aa47b8da7 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a4e7cdb1574b93a59e784d70aa47b8da7">unique</a> () const</td></tr>
-<tr class="separator:a4e7cdb1574b93a59e784d70aa47b8da7 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a0ae0da21d247cd87ea94fe3777c4405e inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">int&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a0ae0da21d247cd87ea94fe3777c4405e">use_count</a> () const</td></tr>
-<tr class="separator:a0ae0da21d247cd87ea94fe3777c4405e inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a2d76fa1fb628ff276a284e61123589c5 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memTemplParams" colspan="2">template&lt;typename ObjectType &gt; </td></tr>
-<tr class="memitem:a2d76fa1fb628ff276a284e61123589c5 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memTemplItemLeft" align="right" valign="top">const ObjectType *&#160;</td><td class="memTemplItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a2d76fa1fb628ff276a284e61123589c5">as</a> () const</td></tr>
-<tr class="memdesc:a2d76fa1fb628ff276a284e61123589c5 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Try to downcast the internal <a class="el" href="classtvm_1_1runtime_1_1Object.html" title="base class of all object containers. ">Object</a> to a raw pointer of a corresponding type.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#a2d76fa1fb628ff276a284e61123589c5">More...</a><br /></td></tr>
-<tr class="separator:a2d76fa1fb628ff276a284e61123589c5 inherit pub_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-</table><table class="memberdecls">
-<tr class="heading"><td colspan="2"><h2 class="groupheader"><a name="inherited"></a>
-Additional Inherited Members</h2></td></tr>
-<tr class="inherit_header pub_types_classtvm_1_1runtime_1_1ObjectRef"><td colspan="2" onclick="javascript:toggleInherit('pub_types_classtvm_1_1runtime_1_1ObjectRef')"><img src="closed.png" alt="-"/>&#160;Public Types inherited from <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td></tr>
-<tr class="memitem:aa5c355fbb7d2f7402ee360dba8a52cdd inherit pub_types_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">using&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aa5c355fbb7d2f7402ee360dba8a52cdd">ContainerType</a> = <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a></td></tr>
-<tr class="memdesc:aa5c355fbb7d2f7402ee360dba8a52cdd inherit pub_types_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">type indicate the container type.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#aa5c355fbb7d2f7402ee360dba8a52cdd">More...</a><br /></td></tr>
-<tr class="separator:aa5c355fbb7d2f7402ee360dba8a52cdd inherit pub_types_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pub_static_attribs_classtvm_1_1runtime_1_1ObjectRef"><td colspan="2" onclick="javascript:toggleInherit('pub_static_attribs_classtvm_1_1runtime_1_1ObjectRef')"><img src="closed.png" alt="-"/>&#160;Static Public Attributes inherited from <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td></tr>
-<tr class="memitem:a3e9b0901b6e01257b060a45e159cc37e inherit pub_static_attribs_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">static constexpr bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a3e9b0901b6e01257b060a45e159cc37e">_type_is_nullable</a> = true</td></tr>
-<tr class="separator:a3e9b0901b6e01257b060a45e159cc37e inherit pub_static_attribs_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pro_methods_classtvm_1_1runtime_1_1ObjectRef"><td colspan="2" onclick="javascript:toggleInherit('pro_methods_classtvm_1_1runtime_1_1ObjectRef')"><img src="closed.png" alt="-"/>&#160;Protected Member Functions inherited from <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td></tr>
-<tr class="memitem:ae423057ecf93c18714d17f53cd1d318f inherit pro_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> *&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#ae423057ecf93c18714d17f53cd1d318f">get_mutable</a> () const</td></tr>
-<tr class="separator:ae423057ecf93c18714d17f53cd1d318f inherit pro_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td colspan="2" onclick="javascript:toggleInherit('pro_static_methods_classtvm_1_1runtime_1_1ObjectRef')"><img src="closed.png" alt="-"/>&#160;Static Protected Member Functions inherited from <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td></tr>
-<tr class="memitem:acd04bb22a6861e9952c344ee8547411f inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memTemplParams" colspan="2">template&lt;typename T &gt; </td></tr>
-<tr class="memitem:acd04bb22a6861e9952c344ee8547411f inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memTemplItemLeft" align="right" valign="top">static T&#160;</td><td class="memTemplItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#acd04bb22a6861e9952c344ee8547411f">DowncastNoCheck</a> (<a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a> ref)</td></tr>
-<tr class="memdesc:acd04bb22a6861e9952c344ee8547411f inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Internal helper function downcast a ref without check.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#acd04bb22a6861e9952c344ee8547411f">More...</a><br /></td></tr>
-<tr class="separator:acd04bb22a6861e9952c344ee8547411f inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a22e5bb9d64dbc773bb9263b70882239e inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top">static void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#a22e5bb9d64dbc773bb9263b70882239e">FFIClearAfterMove</a> (<a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a> *ref)</td></tr>
-<tr class="memdesc:a22e5bb9d64dbc773bb9263b70882239e inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Clear the object ref data field without DecRef after we successfully moved the field.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#a22e5bb9d64dbc773bb9263b70882239e">More...</a><br /></td></tr>
-<tr class="separator:a22e5bb9d64dbc773bb9263b70882239e inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:aed593996e4076632450de8fde776707c inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memTemplParams" colspan="2">template&lt;typename ObjectType &gt; </td></tr>
-<tr class="memitem:aed593996e4076632450de8fde776707c inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memTemplItemLeft" align="right" valign="top">static <a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html">ObjectPtr</a>&lt; ObjectType &gt;&#160;</td><td class="memTemplItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#aed593996e4076632450de8fde776707c">GetDataPtr</a> (const <a class="el" href="classtvm_1_1runtime_1_1ObjectRe [...]
-<tr class="memdesc:aed593996e4076632450de8fde776707c inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Internal helper function get data_ as <a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html" title="A custom smart pointer for Object. ">ObjectPtr</a> of ObjectType.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#aed593996e4076632450de8fde776707c">More...</a><br /></td></tr>
-<tr class="separator:aed593996e4076632450de8fde776707c inherit pro_static_methods_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pro_attribs_classtvm_1_1runtime_1_1ObjectRef"><td colspan="2" onclick="javascript:toggleInherit('pro_attribs_classtvm_1_1runtime_1_1ObjectRef')"><img src="closed.png" alt="-"/>&#160;Protected Attributes inherited from <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></td></tr>
-<tr class="memitem:ac261cdb80487fb29ac42b28678f8cbef inherit pro_attribs_classtvm_1_1runtime_1_1ObjectRef"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html">ObjectPtr</a>&lt; <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &gt;&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#ac261cdb80487fb29ac42b28678f8cbef">data_</a></td></tr>
-<tr class="memdesc:ac261cdb80487fb29ac42b28678f8cbef inherit pro_attribs_classtvm_1_1runtime_1_1ObjectRef"><td class="mdescLeft">&#160;</td><td class="mdescRight">Internal pointer that backs the reference.  <a href="classtvm_1_1runtime_1_1ObjectRef.html#ac261cdb80487fb29ac42b28678f8cbef">More...</a><br /></td></tr>
-<tr class="separator:ac261cdb80487fb29ac42b28678f8cbef inherit pro_attribs_classtvm_1_1runtime_1_1ObjectRef"><td class="memSeparator" colspan="2">&#160;</td></tr>
-</table>
-<a name="details" id="details"></a><h2 class="groupheader">Detailed Description</h2>
-<div class="textblock"><p>Managed reference to <a class="el" href="classtvm_1_1LinkedParamNode.html" title="Describes one parameter that should be linked into the generated module. ">LinkedParamNode</a>. </p>
-</div><h2 class="groupheader">Constructor &amp; Destructor Documentation</h2>
-<a id="a12aed83524087bde67a8e2eb4cfc5d97"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a12aed83524087bde67a8e2eb4cfc5d97">&#9670;&nbsp;</a></span>LinkedParam()</h2>
-
-<div class="memitem">
-<div class="memproto">
-      <table class="memname">
-        <tr>
-          <td class="memname">tvm::LinkedParam::LinkedParam </td>
-          <td>(</td>
-          <td class="paramtype">int64_t&#160;</td>
-          <td class="paramname"><em>id</em>, </td>
-        </tr>
-        <tr>
-          <td class="paramkey"></td>
-          <td></td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1runtime_1_1NDArray.html">tvm::runtime::NDArray</a>&#160;</td>
-          <td class="paramname"><em>param</em>&#160;</td>
-        </tr>
-        <tr>
-          <td></td>
-          <td>)</td>
-          <td></td><td></td>
-        </tr>
-      </table>
-</div><div class="memdoc">
-
-</div>
-</div>
-<h2 class="groupheader">Member Function Documentation</h2>
-<a id="a17df7ce77a67396945de4e185174e4b5"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a17df7ce77a67396945de4e185174e4b5">&#9670;&nbsp;</a></span>TVM_DEFINE_OBJECT_REF_COW_METHOD()</h2>
-
-<div class="memitem">
-<div class="memproto">
-      <table class="memname">
-        <tr>
-          <td class="memname">tvm::LinkedParam::TVM_DEFINE_OBJECT_REF_COW_METHOD </td>
-          <td>(</td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a>&#160;</td>
-          <td class="paramname"></td><td>)</td>
-          <td></td>
-        </tr>
-      </table>
-</div><div class="memdoc">
-
-</div>
-</div>
-<a id="af123bcc5f3ef0c0c5089f07e91fa3c19"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#af123bcc5f3ef0c0c5089f07e91fa3c19">&#9670;&nbsp;</a></span>TVM_DEFINE_OBJECT_REF_METHODS()</h2>
-
-<div class="memitem">
-<div class="memproto">
-      <table class="memname">
-        <tr>
-          <td class="memname">tvm::LinkedParam::TVM_DEFINE_OBJECT_REF_METHODS </td>
-          <td>(</td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1LinkedParam.html">LinkedParam</a>&#160;</td>
-          <td class="paramname">, </td>
-        </tr>
-        <tr>
-          <td class="paramkey"></td>
-          <td></td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html">ObjectRef</a>&#160;</td>
-          <td class="paramname">, </td>
-        </tr>
-        <tr>
-          <td class="paramkey"></td>
-          <td></td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a>&#160;</td>
-          <td class="paramname">&#160;</td>
-        </tr>
-        <tr>
-          <td></td>
-          <td>)</td>
-          <td></td><td></td>
-        </tr>
-      </table>
-</div><div class="memdoc">
-
-</div>
-</div>
-<hr/>The documentation for this class was generated from the following file:<ul>
-<li>include/tvm/ir/<a class="el" href="ir_2module_8h_source.html">module.h</a></li>
-</ul>
-</div><!-- contents -->
-<!-- start footer part -->
-<hr class="footer"/><address class="footer"><small>
-Generated by &#160;<a href="http://www.doxygen.org/index.html">
-<img class="footer" src="doxygen.png" alt="doxygen"/>
-</a> 1.8.13
-</small></address>
-</body>
-</html>
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode-members.html b/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode-members.html
deleted file mode 100644
index d6eea0f2a..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode-members.html
+++ /dev/null
@@ -1,115 +0,0 @@
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml">
-<head>
-<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
-<meta http-equiv="X-UA-Compatible" content="IE=9"/>
-<meta name="generator" content="Doxygen 1.8.13"/>
-<meta name="viewport" content="width=device-width, initial-scale=1"/>
-<title>tvm: Member List</title>
-<link href="tabs.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="jquery.js"></script>
-<script type="text/javascript" src="dynsections.js"></script>
-<link href="search/search.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="search/searchdata.js"></script>
-<script type="text/javascript" src="search/search.js"></script>
-<link href="doxygen.css" rel="stylesheet" type="text/css" />
-</head>
-<body>
-<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
-<div id="titlearea">
-<table cellspacing="0" cellpadding="0">
- <tbody>
- <tr style="height: 56px;">
-  <td id="projectalign" style="padding-left: 0.5em;">
-   <div id="projectname">tvm
-   </div>
-  </td>
- </tr>
- </tbody>
-</table>
-</div>
-<!-- end header part -->
-<!-- Generated by Doxygen 1.8.13 -->
-<script type="text/javascript">
-var searchBox = new SearchBox("searchBox", "search",false,'Search');
-</script>
-<script type="text/javascript" src="menudata.js"></script>
-<script type="text/javascript" src="menu.js"></script>
-<script type="text/javascript">
-$(function() {
-  initMenu('',true,false,'search.php','Search');
-  $(document).ready(function() { init_search(); });
-});
-</script>
-<div id="main-nav"></div>
-<!-- window showing the filter options -->
-<div id="MSearchSelectWindow"
-     onmouseover="return searchBox.OnSearchSelectShow()"
-     onmouseout="return searchBox.OnSearchSelectHide()"
-     onkeydown="return searchBox.OnSearchSelectKey(event)">
-</div>
-
-<!-- iframe showing the search results (closed by default) -->
-<div id="MSearchResultsWindow">
-<iframe src="javascript:void(0)" frameborder="0" 
-        name="MSearchResults" id="MSearchResults">
-</iframe>
-</div>
-
-<div id="nav-path" class="navpath">
-  <ul>
-<li class="navelem"><a class="el" href="namespacetvm.html">tvm</a></li><li class="navelem"><a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a></li>  </ul>
-</div>
-</div><!-- top -->
-<div class="header">
-  <div class="headertitle">
-<div class="title">tvm::LinkedParamNode Member List</div>  </div>
-</div><!--header-->
-<div class="contents">
-
-<p>This is the complete list of members for <a class="el" href="classtvm_1_1LinkedParamNode.html">tvm::LinkedParamNode</a>, including all inherited members.</p>
-<table class="directory">
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a5fbebc47be111ecc1d5869bcc0476e21">_GetOrAllocRuntimeTypeIndex</a>()</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a14b234a745215da158b2386bbb34bd70">_type_child_slots</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a05ece7bcb6bf73e88765c1f193a489ce">_type_child_slots_can_overflow</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a55cb618bd4bbcd49317b35ea8e2996be">_type_final</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a92fe62494027b70af1f7696d611c21b6">_type_has_method_sequal_reduce</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ac97054694d03dc5eac58315fb569ef88">_type_has_method_shash_reduce</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a74e9f076b50b8b335b4a321e9b0bf03c">_type_has_method_visit_attrs</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#af6aed95d70af7e44ce376a8d7be6c5f1">_type_index</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html#a0fcaf48a2f8251d405730bd59fa16f4b">_type_key</a></td><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html">tvm::LinkedParamNode</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a70fb5361147634605d6595bb89381f03">DecRef</a>()</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">protected</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#af4407d2b59132e803ff791482dbe0145">deleter_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a9e84841ca982bff376a978ade0132631">FDeleter</a> typedef</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a726972ff315c446192df94027ddea032">GetOrAllocRuntimeTypeIndex</a>(const std::string &amp;key, uint32_t static_tindex, uint32_t parent_tindex, uint32_t type_child_slots, bool type_child_slots_can_overflow)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span><span class="mlabel">static</span [...]
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a4d951e51832081b85875669eac90e940">GetTypeKey</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a5693cbadcc1168b96db7b1cc5c200b86">GetTypeKeyHash</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html#a6000f7f468b8db072935053a1ac1fbf4">id</a></td><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html">tvm::LinkedParamNode</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ac9e5eed7719e322117bde996a171e33a">IncRef</a>()</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">protected</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a90e90b3f4ba8a590baff78c75807bbc7">IsInstance</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a133436a9ec5c4a768b94102bf95a660b">Object</a>()</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ab7968feb6ad38ecaffc320e13819d826">Object</a>(const Object &amp;other)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#aa1612f69ea5b4225d4cda759cd517323">Object</a>(Object &amp;&amp;other)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a69c32fbd96181f5c21d2c878ab285e4f">operator=</a>(const Object &amp;other)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ae341e561272ff43cdcbc927bc29ac50d">operator=</a>(Object &amp;&amp;other)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html#a96d730a027c9e169786f3aaea2e4cc10">param</a></td><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html">tvm::LinkedParamNode</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a0d492efee331e2239a093f4b2017c10f">ref_counter_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a55549a6c23987890246248682560a03d">RefCounterType</a> typedef</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ad94d79729ac85aa7c976e23d39066383">RuntimeTypeIndex</a>()</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html#a2a92d26184de1c29f2ec0c1c373af3b5">TVM_DECLARE_FINAL_OBJECT_INFO</a>(LinkedParamNode, Object)</td><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html">tvm::LinkedParamNode</a></td><td class="entry"></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a481f01923b14e1851ebd38506e9c66ea">type_index</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a4bfc2586cb55f2af47728187b3256255">type_index_</a></td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">protected</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">TypeIndex2Key</a>(uint32_t tindex)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6ee32a02dd44257da105fbbe5d9c8622">TypeIndex2KeyHash</a>(uint32_t tindex)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6841f97e06e6614dd7e82c6dd41b818a">TypeKey2Index</a>(const std::string &amp;key)</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">static</span></td></tr>
-  <tr><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html#afd548730a6139d19fe24473ad66026d7">unique</a>() const</td><td class="entry"><a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-  <tr class="even"><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html#a16df477a0c00bd0423cf2d46de60bfe3">VisitAttrs</a>(tvm::AttrVisitor *v)</td><td class="entry"><a class="el" href="classtvm_1_1LinkedParamNode.html">tvm::LinkedParamNode</a></td><td class="entry"><span class="mlabel">inline</span></td></tr>
-</table></div><!-- contents -->
-<!-- start footer part -->
-<hr class="footer"/><address class="footer"><small>
-Generated by &#160;<a href="http://www.doxygen.org/index.html">
-<img class="footer" src="doxygen.png" alt="doxygen"/>
-</a> 1.8.13
-</small></address>
-</body>
-</html>
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode.html b/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode.html
deleted file mode 100644
index d2127f5a6..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode.html
+++ /dev/null
@@ -1,320 +0,0 @@
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml">
-<head>
-<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
-<meta http-equiv="X-UA-Compatible" content="IE=9"/>
-<meta name="generator" content="Doxygen 1.8.13"/>
-<meta name="viewport" content="width=device-width, initial-scale=1"/>
-<title>tvm: tvm::LinkedParamNode Class Reference</title>
-<link href="tabs.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="jquery.js"></script>
-<script type="text/javascript" src="dynsections.js"></script>
-<link href="search/search.css" rel="stylesheet" type="text/css"/>
-<script type="text/javascript" src="search/searchdata.js"></script>
-<script type="text/javascript" src="search/search.js"></script>
-<link href="doxygen.css" rel="stylesheet" type="text/css" />
-</head>
-<body>
-<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
-<div id="titlearea">
-<table cellspacing="0" cellpadding="0">
- <tbody>
- <tr style="height: 56px;">
-  <td id="projectalign" style="padding-left: 0.5em;">
-   <div id="projectname">tvm
-   </div>
-  </td>
- </tr>
- </tbody>
-</table>
-</div>
-<!-- end header part -->
-<!-- Generated by Doxygen 1.8.13 -->
-<script type="text/javascript">
-var searchBox = new SearchBox("searchBox", "search",false,'Search');
-</script>
-<script type="text/javascript" src="menudata.js"></script>
-<script type="text/javascript" src="menu.js"></script>
-<script type="text/javascript">
-$(function() {
-  initMenu('',true,false,'search.php','Search');
-  $(document).ready(function() { init_search(); });
-});
-</script>
-<div id="main-nav"></div>
-<!-- window showing the filter options -->
-<div id="MSearchSelectWindow"
-     onmouseover="return searchBox.OnSearchSelectShow()"
-     onmouseout="return searchBox.OnSearchSelectHide()"
-     onkeydown="return searchBox.OnSearchSelectKey(event)">
-</div>
-
-<!-- iframe showing the search results (closed by default) -->
-<div id="MSearchResultsWindow">
-<iframe src="javascript:void(0)" frameborder="0" 
-        name="MSearchResults" id="MSearchResults">
-</iframe>
-</div>
-
-<div id="nav-path" class="navpath">
-  <ul>
-<li class="navelem"><a class="el" href="namespacetvm.html">tvm</a></li><li class="navelem"><a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a></li>  </ul>
-</div>
-</div><!-- top -->
-<div class="header">
-  <div class="summary">
-<a href="#pub-methods">Public Member Functions</a> &#124;
-<a href="#pub-attribs">Public Attributes</a> &#124;
-<a href="#pub-static-attribs">Static Public Attributes</a> &#124;
-<a href="classtvm_1_1LinkedParamNode-members.html">List of all members</a>  </div>
-  <div class="headertitle">
-<div class="title">tvm::LinkedParamNode Class Reference</div>  </div>
-</div><!--header-->
-<div class="contents">
-
-<p>Describes one parameter that should be linked into the generated module.  
- <a href="classtvm_1_1LinkedParamNode.html#details">More...</a></p>
-
-<p><code>#include &lt;<a class="el" href="ir_2module_8h_source.html">module.h</a>&gt;</code></p>
-<div class="dynheader">
-Inheritance diagram for tvm::LinkedParamNode:</div>
-<div class="dyncontent">
-<div class="center"><iframe scrolling="no" frameborder="0" src="classtvm_1_1LinkedParamNode__inherit__graph.svg" width="290" height="712"><p><b>This browser is not able to show SVG: try Firefox, Chrome, Safari, or Opera instead.</b></p></iframe>
-</div>
-</div>
-<div class="dynheader">
-Collaboration diagram for tvm::LinkedParamNode:</div>
-<div class="dyncontent">
-<div class="center"><iframe scrolling="no" frameborder="0" src="classtvm_1_1LinkedParamNode__coll__graph.svg" width="563" height="1346"><p><b>This browser is not able to show SVG: try Firefox, Chrome, Safari, or Opera instead.</b></p></iframe>
-</div>
-</div>
-<table class="memberdecls">
-<tr class="heading"><td colspan="2"><h2 class="groupheader"><a name="pub-methods"></a>
-Public Member Functions</h2></td></tr>
-<tr class="memitem:a16df477a0c00bd0423cf2d46de60bfe3"><td class="memItemLeft" align="right" valign="top">void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParamNode.html#a16df477a0c00bd0423cf2d46de60bfe3">VisitAttrs</a> (<a class="el" href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a> *v)</td></tr>
-<tr class="separator:a16df477a0c00bd0423cf2d46de60bfe3"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a2a92d26184de1c29f2ec0c1c373af3b5"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParamNode.html#a2a92d26184de1c29f2ec0c1c373af3b5">TVM_DECLARE_FINAL_OBJECT_INFO</a> (<a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a>, <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a>)</td></tr>
-<tr class="separator:a2a92d26184de1c29f2ec0c1c373af3b5"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pub_methods_classtvm_1_1runtime_1_1Object"><td colspan="2" onclick="javascript:toggleInherit('pub_methods_classtvm_1_1runtime_1_1Object')"><img src="closed.png" alt="-"/>&#160;Public Member Functions inherited from <a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td></tr>
-<tr class="memitem:a481f01923b14e1851ebd38506e9c66ea inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a481f01923b14e1851ebd38506e9c66ea">type_index</a> () const</td></tr>
-<tr class="separator:a481f01923b14e1851ebd38506e9c66ea inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a4d951e51832081b85875669eac90e940 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">std::string&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a4d951e51832081b85875669eac90e940">GetTypeKey</a> () const</td></tr>
-<tr class="separator:a4d951e51832081b85875669eac90e940 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a5693cbadcc1168b96db7b1cc5c200b86 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">size_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a5693cbadcc1168b96db7b1cc5c200b86">GetTypeKeyHash</a> () const</td></tr>
-<tr class="separator:a5693cbadcc1168b96db7b1cc5c200b86 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a90e90b3f4ba8a590baff78c75807bbc7 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memTemplParams" colspan="2">template&lt;typename TargetType &gt; </td></tr>
-<tr class="memitem:a90e90b3f4ba8a590baff78c75807bbc7 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memTemplItemLeft" align="right" valign="top">bool&#160;</td><td class="memTemplItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a90e90b3f4ba8a590baff78c75807bbc7">IsInstance</a> () const</td></tr>
-<tr class="separator:a90e90b3f4ba8a590baff78c75807bbc7 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:afd548730a6139d19fe24473ad66026d7 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#afd548730a6139d19fe24473ad66026d7">unique</a> () const</td></tr>
-<tr class="separator:afd548730a6139d19fe24473ad66026d7 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a133436a9ec5c4a768b94102bf95a660b inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a133436a9ec5c4a768b94102bf95a660b">Object</a> ()</td></tr>
-<tr class="separator:a133436a9ec5c4a768b94102bf95a660b inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:ab7968feb6ad38ecaffc320e13819d826 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ab7968feb6ad38ecaffc320e13819d826">Object</a> (const <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &amp;other)</td></tr>
-<tr class="separator:ab7968feb6ad38ecaffc320e13819d826 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:aa1612f69ea5b4225d4cda759cd517323 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#aa1612f69ea5b4225d4cda759cd517323">Object</a> (<a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &amp;&amp;other)</td></tr>
-<tr class="separator:aa1612f69ea5b4225d4cda759cd517323 inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a69c32fbd96181f5c21d2c878ab285e4f inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &amp;&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a69c32fbd96181f5c21d2c878ab285e4f">operator=</a> (const <a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &amp;other)</td></tr>
-<tr class="separator:a69c32fbd96181f5c21d2c878ab285e4f inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:ae341e561272ff43cdcbc927bc29ac50d inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &amp;&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ae341e561272ff43cdcbc927bc29ac50d">operator=</a> (<a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> &amp;&amp;other)</td></tr>
-<tr class="separator:ae341e561272ff43cdcbc927bc29ac50d inherit pub_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-</table><table class="memberdecls">
-<tr class="heading"><td colspan="2"><h2 class="groupheader"><a name="pub-attribs"></a>
-Public Attributes</h2></td></tr>
-<tr class="memitem:a6000f7f468b8db072935053a1ac1fbf4"><td class="memItemLeft" align="right" valign="top">int64_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParamNode.html#a6000f7f468b8db072935053a1ac1fbf4">id</a></td></tr>
-<tr class="memdesc:a6000f7f468b8db072935053a1ac1fbf4"><td class="mdescLeft">&#160;</td><td class="mdescRight">Unique numeric identifier used by runtimes to lookup this parameter.  <a href="#a6000f7f468b8db072935053a1ac1fbf4">More...</a><br /></td></tr>
-<tr class="separator:a6000f7f468b8db072935053a1ac1fbf4"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a96d730a027c9e169786f3aaea2e4cc10"><td class="memItemLeft" align="right" valign="top">::<a class="el" href="classtvm_1_1runtime_1_1NDArray.html">tvm::runtime::NDArray</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParamNode.html#a96d730a027c9e169786f3aaea2e4cc10">param</a></td></tr>
-<tr class="memdesc:a96d730a027c9e169786f3aaea2e4cc10"><td class="mdescLeft">&#160;</td><td class="mdescRight">Parameter data which should get linked into the final module.  <a href="#a96d730a027c9e169786f3aaea2e4cc10">More...</a><br /></td></tr>
-<tr class="separator:a96d730a027c9e169786f3aaea2e4cc10"><td class="memSeparator" colspan="2">&#160;</td></tr>
-</table><table class="memberdecls">
-<tr class="heading"><td colspan="2"><h2 class="groupheader"><a name="pub-static-attribs"></a>
-Static Public Attributes</h2></td></tr>
-<tr class="memitem:a0fcaf48a2f8251d405730bd59fa16f4b"><td class="memItemLeft" align="right" valign="top">static constexpr const char *&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1LinkedParamNode.html#a0fcaf48a2f8251d405730bd59fa16f4b">_type_key</a> = &quot;tir.LinkedParam&quot;</td></tr>
-<tr class="separator:a0fcaf48a2f8251d405730bd59fa16f4b"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pub_static_attribs_classtvm_1_1runtime_1_1Object"><td colspan="2" onclick="javascript:toggleInherit('pub_static_attribs_classtvm_1_1runtime_1_1Object')"><img src="closed.png" alt="-"/>&#160;Static Public Attributes inherited from <a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td></tr>
-<tr class="memitem:a43d6bf3191bebb805eced0744d859c1e inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr const char *&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a43d6bf3191bebb805eced0744d859c1e">_type_key</a> = &quot;runtime.Object&quot;</td></tr>
-<tr class="separator:a43d6bf3191bebb805eced0744d859c1e inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a55cb618bd4bbcd49317b35ea8e2996be inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a55cb618bd4bbcd49317b35ea8e2996be">_type_final</a> = false</td></tr>
-<tr class="separator:a55cb618bd4bbcd49317b35ea8e2996be inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a14b234a745215da158b2386bbb34bd70 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a14b234a745215da158b2386bbb34bd70">_type_child_slots</a> = 0</td></tr>
-<tr class="separator:a14b234a745215da158b2386bbb34bd70 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a05ece7bcb6bf73e88765c1f193a489ce inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a05ece7bcb6bf73e88765c1f193a489ce">_type_child_slots_can_overflow</a> = true</td></tr>
-<tr class="separator:a05ece7bcb6bf73e88765c1f193a489ce inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a74e9f076b50b8b335b4a321e9b0bf03c inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a74e9f076b50b8b335b4a321e9b0bf03c">_type_has_method_visit_attrs</a> = true</td></tr>
-<tr class="separator:a74e9f076b50b8b335b4a321e9b0bf03c inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a92fe62494027b70af1f7696d611c21b6 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a92fe62494027b70af1f7696d611c21b6">_type_has_method_sequal_reduce</a> = false</td></tr>
-<tr class="separator:a92fe62494027b70af1f7696d611c21b6 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:ac97054694d03dc5eac58315fb569ef88 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr bool&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ac97054694d03dc5eac58315fb569ef88">_type_has_method_shash_reduce</a> = false</td></tr>
-<tr class="separator:ac97054694d03dc5eac58315fb569ef88 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:af6aed95d70af7e44ce376a8d7be6c5f1 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static constexpr uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#af6aed95d70af7e44ce376a8d7be6c5f1">_type_index</a> = <a class="el" href="structtvm_1_1runtime_1_1TypeIndex.html#aed93c7318efc8052201d4c404b21a40da83fed6b80a5bcb3247430922fd85ea47">TypeIndex::kDynami [...]
-<tr class="separator:af6aed95d70af7e44ce376a8d7be6c5f1 inherit pub_static_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-</table><table class="memberdecls">
-<tr class="heading"><td colspan="2"><h2 class="groupheader"><a name="inherited"></a>
-Additional Inherited Members</h2></td></tr>
-<tr class="inherit_header pub_types_classtvm_1_1runtime_1_1Object"><td colspan="2" onclick="javascript:toggleInherit('pub_types_classtvm_1_1runtime_1_1Object')"><img src="closed.png" alt="-"/>&#160;Public Types inherited from <a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td></tr>
-<tr class="memitem:a9e84841ca982bff376a978ade0132631 inherit pub_types_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">typedef void(*&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a9e84841ca982bff376a978ade0132631">FDeleter</a>) (<a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a> *self)</td></tr>
-<tr class="memdesc:a9e84841ca982bff376a978ade0132631 inherit pub_types_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight"><a class="el" href="classtvm_1_1runtime_1_1Object.html" title="base class of all object containers. ">Object</a> deleter.  <a href="classtvm_1_1runtime_1_1Object.html#a9e84841ca982bff376a978ade0132631">More...</a><br /></td></tr>
-<tr class="separator:a9e84841ca982bff376a978ade0132631 inherit pub_types_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a55549a6c23987890246248682560a03d inherit pub_types_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">using&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a55549a6c23987890246248682560a03d">RefCounterType</a> = std::atomic&lt; int32_t &gt;</td></tr>
-<tr class="separator:a55549a6c23987890246248682560a03d inherit pub_types_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pub_static_methods_classtvm_1_1runtime_1_1Object"><td colspan="2" onclick="javascript:toggleInherit('pub_static_methods_classtvm_1_1runtime_1_1Object')"><img src="closed.png" alt="-"/>&#160;Static Public Member Functions inherited from <a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td></tr>
-<tr class="memitem:a817ba6c23b7ee1821c48a75edf255a30 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static std::string&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">TypeIndex2Key</a> (uint32_t tindex)</td></tr>
-<tr class="memdesc:a817ba6c23b7ee1821c48a75edf255a30 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">Get the type key of the corresponding index from runtime.  <a href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">More...</a><br /></td></tr>
-<tr class="separator:a817ba6c23b7ee1821c48a75edf255a30 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a6ee32a02dd44257da105fbbe5d9c8622 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static size_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6ee32a02dd44257da105fbbe5d9c8622">TypeIndex2KeyHash</a> (uint32_t tindex)</td></tr>
-<tr class="memdesc:a6ee32a02dd44257da105fbbe5d9c8622 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">Get the type key hash of the corresponding index from runtime.  <a href="classtvm_1_1runtime_1_1Object.html#a6ee32a02dd44257da105fbbe5d9c8622">More...</a><br /></td></tr>
-<tr class="separator:a6ee32a02dd44257da105fbbe5d9c8622 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a6841f97e06e6614dd7e82c6dd41b818a inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a6841f97e06e6614dd7e82c6dd41b818a">TypeKey2Index</a> (const std::string &amp;key)</td></tr>
-<tr class="memdesc:a6841f97e06e6614dd7e82c6dd41b818a inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">Get the type index of the corresponding key from runtime.  <a href="classtvm_1_1runtime_1_1Object.html#a6841f97e06e6614dd7e82c6dd41b818a">More...</a><br /></td></tr>
-<tr class="separator:a6841f97e06e6614dd7e82c6dd41b818a inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a5fbebc47be111ecc1d5869bcc0476e21 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a5fbebc47be111ecc1d5869bcc0476e21">_GetOrAllocRuntimeTypeIndex</a> ()</td></tr>
-<tr class="separator:a5fbebc47be111ecc1d5869bcc0476e21 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:ad94d79729ac85aa7c976e23d39066383 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ad94d79729ac85aa7c976e23d39066383">RuntimeTypeIndex</a> ()</td></tr>
-<tr class="separator:ad94d79729ac85aa7c976e23d39066383 inherit pub_static_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pro_methods_classtvm_1_1runtime_1_1Object"><td colspan="2" onclick="javascript:toggleInherit('pro_methods_classtvm_1_1runtime_1_1Object')"><img src="closed.png" alt="-"/>&#160;Protected Member Functions inherited from <a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td></tr>
-<tr class="memitem:ac9e5eed7719e322117bde996a171e33a inherit pro_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#ac9e5eed7719e322117bde996a171e33a">IncRef</a> ()</td></tr>
-<tr class="memdesc:ac9e5eed7719e322117bde996a171e33a inherit pro_methods_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">developer function, increases reference counter.  <a href="classtvm_1_1runtime_1_1Object.html#ac9e5eed7719e322117bde996a171e33a">More...</a><br /></td></tr>
-<tr class="separator:ac9e5eed7719e322117bde996a171e33a inherit pro_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a70fb5361147634605d6595bb89381f03 inherit pro_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">void&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a70fb5361147634605d6595bb89381f03">DecRef</a> ()</td></tr>
-<tr class="memdesc:a70fb5361147634605d6595bb89381f03 inherit pro_methods_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">developer function, decrease reference counter.  <a href="classtvm_1_1runtime_1_1Object.html#a70fb5361147634605d6595bb89381f03">More...</a><br /></td></tr>
-<tr class="separator:a70fb5361147634605d6595bb89381f03 inherit pro_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pro_static_methods_classtvm_1_1runtime_1_1Object"><td colspan="2" onclick="javascript:toggleInherit('pro_static_methods_classtvm_1_1runtime_1_1Object')"><img src="closed.png" alt="-"/>&#160;Static Protected Member Functions inherited from <a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td></tr>
-<tr class="memitem:a726972ff315c446192df94027ddea032 inherit pro_static_methods_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">static uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a726972ff315c446192df94027ddea032">GetOrAllocRuntimeTypeIndex</a> (const std::string &amp;key, uint32_t static_tindex, uint32_t parent_tindex, uint32_t type_child_slots, bool type_child_slots_can_overflow)</td></tr>
-<tr class="memdesc:a726972ff315c446192df94027ddea032 inherit pro_static_methods_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">Get the type index using type key.  <a href="classtvm_1_1runtime_1_1Object.html#a726972ff315c446192df94027ddea032">More...</a><br /></td></tr>
-<tr class="separator:a726972ff315c446192df94027ddea032 inherit pro_static_methods_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="inherit_header pro_attribs_classtvm_1_1runtime_1_1Object"><td colspan="2" onclick="javascript:toggleInherit('pro_attribs_classtvm_1_1runtime_1_1Object')"><img src="closed.png" alt="-"/>&#160;Protected Attributes inherited from <a class="el" href="classtvm_1_1runtime_1_1Object.html">tvm::runtime::Object</a></td></tr>
-<tr class="memitem:a4bfc2586cb55f2af47728187b3256255 inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top">uint32_t&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a4bfc2586cb55f2af47728187b3256255">type_index_</a> {0}</td></tr>
-<tr class="memdesc:a4bfc2586cb55f2af47728187b3256255 inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> index(tag) that indicates the type of the object.  <a href="classtvm_1_1runtime_1_1Object.html#a4bfc2586cb55f2af47728187b3256255">More...</a><br /></td></tr>
-<tr class="separator:a4bfc2586cb55f2af47728187b3256255 inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:a0d492efee331e2239a093f4b2017c10f inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a55549a6c23987890246248682560a03d">RefCounterType</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a0d492efee331e2239a093f4b2017c10f">ref_counter_</a> {0}</td></tr>
-<tr class="memdesc:a0d492efee331e2239a093f4b2017c10f inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">The internal reference counter.  <a href="classtvm_1_1runtime_1_1Object.html#a0d492efee331e2239a093f4b2017c10f">More...</a><br /></td></tr>
-<tr class="separator:a0d492efee331e2239a093f4b2017c10f inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-<tr class="memitem:af4407d2b59132e803ff791482dbe0145 inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="memItemLeft" align="right" valign="top"><a class="el" href="classtvm_1_1runtime_1_1Object.html#a9e84841ca982bff376a978ade0132631">FDeleter</a>&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classtvm_1_1runtime_1_1Object.html#af4407d2b59132e803ff791482dbe0145">deleter_</a> = nullptr</td></tr>
-<tr class="memdesc:af4407d2b59132e803ff791482dbe0145 inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="mdescLeft">&#160;</td><td class="mdescRight">deleter of this object to enable customized allocation. If the deleter is nullptr, no deletion will be performed. The creator of the object must always set the deleter field properly.  <a href="classtvm_1_1runtime_1_1Object.html#af4407d2b59132e803ff791482dbe0145">More...</a><br /></td></tr>
-<tr class="separator:af4407d2b59132e803ff791482dbe0145 inherit pro_attribs_classtvm_1_1runtime_1_1Object"><td class="memSeparator" colspan="2">&#160;</td></tr>
-</table>
-<a name="details" id="details"></a><h2 class="groupheader">Detailed Description</h2>
-<div class="textblock"><p>Describes one parameter that should be linked into the generated module. </p>
-<p>When parameters are to be linked in with generated code (i.e. on target_host-compatible backends), Relay attaches instances of this object to a global TIR function. Code-generators use the information contained in this node to include the parameter data in the generated module. </p>
-</div><h2 class="groupheader">Member Function Documentation</h2>
-<a id="a2a92d26184de1c29f2ec0c1c373af3b5"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a2a92d26184de1c29f2ec0c1c373af3b5">&#9670;&nbsp;</a></span>TVM_DECLARE_FINAL_OBJECT_INFO()</h2>
-
-<div class="memitem">
-<div class="memproto">
-      <table class="memname">
-        <tr>
-          <td class="memname">tvm::LinkedParamNode::TVM_DECLARE_FINAL_OBJECT_INFO </td>
-          <td>(</td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1LinkedParamNode.html">LinkedParamNode</a>&#160;</td>
-          <td class="paramname">, </td>
-        </tr>
-        <tr>
-          <td class="paramkey"></td>
-          <td></td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1runtime_1_1Object.html">Object</a>&#160;</td>
-          <td class="paramname">&#160;</td>
-        </tr>
-        <tr>
-          <td></td>
-          <td>)</td>
-          <td></td><td></td>
-        </tr>
-      </table>
-</div><div class="memdoc">
-
-</div>
-</div>
-<a id="a16df477a0c00bd0423cf2d46de60bfe3"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a16df477a0c00bd0423cf2d46de60bfe3">&#9670;&nbsp;</a></span>VisitAttrs()</h2>
-
-<div class="memitem">
-<div class="memproto">
-<table class="mlabels">
-  <tr>
-  <td class="mlabels-left">
-      <table class="memname">
-        <tr>
-          <td class="memname">void tvm::LinkedParamNode::VisitAttrs </td>
-          <td>(</td>
-          <td class="paramtype"><a class="el" href="classtvm_1_1AttrVisitor.html">tvm::AttrVisitor</a> *&#160;</td>
-          <td class="paramname"><em>v</em></td><td>)</td>
-          <td></td>
-        </tr>
-      </table>
-  </td>
-  <td class="mlabels-right">
-<span class="mlabels"><span class="mlabel">inline</span></span>  </td>
-  </tr>
-</table>
-</div><div class="memdoc">
-
-</div>
-</div>
-<h2 class="groupheader">Member Data Documentation</h2>
-<a id="a0fcaf48a2f8251d405730bd59fa16f4b"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a0fcaf48a2f8251d405730bd59fa16f4b">&#9670;&nbsp;</a></span>_type_key</h2>
-
-<div class="memitem">
-<div class="memproto">
-<table class="mlabels">
-  <tr>
-  <td class="mlabels-left">
-      <table class="memname">
-        <tr>
-          <td class="memname">constexpr const char* tvm::LinkedParamNode::_type_key = &quot;tir.LinkedParam&quot;</td>
-        </tr>
-      </table>
-  </td>
-  <td class="mlabels-right">
-<span class="mlabels"><span class="mlabel">static</span></span>  </td>
-  </tr>
-</table>
-</div><div class="memdoc">
-
-</div>
-</div>
-<a id="a6000f7f468b8db072935053a1ac1fbf4"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a6000f7f468b8db072935053a1ac1fbf4">&#9670;&nbsp;</a></span>id</h2>
-
-<div class="memitem">
-<div class="memproto">
-      <table class="memname">
-        <tr>
-          <td class="memname">int64_t tvm::LinkedParamNode::id</td>
-        </tr>
-      </table>
-</div><div class="memdoc">
-
-<p>Unique numeric identifier used by runtimes to lookup this parameter. </p>
-
-</div>
-</div>
-<a id="a96d730a027c9e169786f3aaea2e4cc10"></a>
-<h2 class="memtitle"><span class="permalink"><a href="#a96d730a027c9e169786f3aaea2e4cc10">&#9670;&nbsp;</a></span>param</h2>
-
-<div class="memitem">
-<div class="memproto">
-      <table class="memname">
-        <tr>
-          <td class="memname">::<a class="el" href="classtvm_1_1runtime_1_1NDArray.html">tvm::runtime::NDArray</a> tvm::LinkedParamNode::param</td>
-        </tr>
-      </table>
-</div><div class="memdoc">
-
-<p>Parameter data which should get linked into the final module. </p>
-
-</div>
-</div>
-<hr/>The documentation for this class was generated from the following file:<ul>
-<li>include/tvm/ir/<a class="el" href="ir_2module_8h_source.html">module.h</a></li>
-</ul>
-</div><!-- contents -->
-<!-- start footer part -->
-<hr class="footer"/><address class="footer"><small>
-Generated by &#160;<a href="http://www.doxygen.org/index.html">
-<img class="footer" src="doxygen.png" alt="doxygen"/>
-</a> 1.8.13
-</small></address>
-</body>
-</html>
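
The page deleted above documented tvm::LinkedParamNode, the TIR node that carries one parameter to be linked into the generated module: runtimes look the parameter up by its numeric id, and code generators embed the NDArray data it holds. Read together, the member tables in the removed HTML correspond roughly to the C++ sketch below. This is a reconstruction for illustration only, not a verbatim copy of include/tvm/ir/module.h; in particular the VisitAttrs body and the exact set of headers are assumptions.

    // Hypothetical reconstruction of the class described by the removed page.
    #include <tvm/node/reflection.h>   // tvm::AttrVisitor
    #include <tvm/runtime/ndarray.h>
    #include <tvm/runtime/object.h>

    namespace tvm {

    class LinkedParamNode : public runtime::Object {
     public:
      /*! \brief Unique numeric identifier used by runtimes to look up this parameter. */
      int64_t id;
      /*! \brief Parameter data which should get linked into the final module. */
      ::tvm::runtime::NDArray param;

      // Assumed body: expose the two documented fields to TVM's reflection machinery.
      void VisitAttrs(tvm::AttrVisitor* v) {
        v->Visit("id", &id);
        v->Visit("param", &param);
      }

      static constexpr const char* _type_key = "tir.LinkedParam";
      TVM_DECLARE_FINAL_OBJECT_INFO(LinkedParamNode, runtime::Object);
    };

    }  // namespace tvm
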
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode__coll__graph.svg b/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode__coll__graph.svg
deleted file mode 100644
index 9881f3cca..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode__coll__graph.svg
+++ /dev/null
@@ -1,185 +0,0 @@
-<?xml version="1.0" encoding="UTF-8" standalone="no"?>
-<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
- "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
-<!-- Generated by graphviz version 2.40.1 (20161225.0304)
- -->
-<!-- Title: tvm::LinkedParamNode Pages: 1 -->
-<svg width="422pt" height="1009pt"
- viewBox="0.00 0.00 422.00 1009.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
-<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 1005)">
-<title>tvm::LinkedParamNode</title>
-<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-1005 418,-1005 418,4 -4,4"/>
-<!-- Node2 -->
-<g id="node1" class="node">
-<title>Node2</title>
-<polygon fill="#bfbfbf" stroke="#000000" points="112,-.5 112,-79.5 321,-79.5 321,-.5 112,-.5"/>
-<text text-anchor="middle" x="216.5" y="-67.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::LinkedParamNode</text>
-<polyline fill="none" stroke="#000000" points="112,-60.5 321,-60.5 "/>
-<text text-anchor="start" x="120" y="-48.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ id</text>
-<text text-anchor="start" x="120" y="-37.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_key</text>
-<polyline fill="none" stroke="#000000" points="112,-30.5 321,-30.5 "/>
-<text text-anchor="start" x="120" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ VisitAttrs()</text>
-<text text-anchor="start" x="120" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TVM_DECLARE_FINAL_OBJECT_INFO()</text>
-</g>
-<!-- Node3 -->
-<g id="node2" class="node">
-<title>Node3</title>
-<g id="a_node2"><a xlink:href="classtvm_1_1runtime_1_1Object.html" target="_top" xlink:title="base class of all object containers. ">
-<polygon fill="#ffffff" stroke="#000000" points="0,-127.5 0,-514.5 183,-514.5 183,-127.5 0,-127.5"/>
-<text text-anchor="middle" x="91.5" y="-502.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::Object</text>
-<polyline fill="none" stroke="#000000" points="0,-495.5 183,-495.5 "/>
-<text text-anchor="start" x="8" y="-483.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_key</text>
-<text text-anchor="start" x="8" y="-472.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_final</text>
-<text text-anchor="start" x="8" y="-461.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_child_slots</text>
-<text text-anchor="start" x="8" y="-450.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_child_slots_can</text>
-<text text-anchor="start" x="8" y="-439.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_overflow</text>
-<text text-anchor="start" x="8" y="-428.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_has_method_visit</text>
-<text text-anchor="start" x="8" y="-417.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_attrs</text>
-<text text-anchor="start" x="8" y="-406.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_has_method_sequal</text>
-<text text-anchor="start" x="8" y="-395.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_reduce</text>
-<text text-anchor="start" x="8" y="-384.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_has_method_shash</text>
-<text text-anchor="start" x="8" y="-373.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_reduce</text>
-<text text-anchor="start" x="8" y="-362.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_index</text>
-<text text-anchor="start" x="8" y="-351.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># type_index_</text>
-<text text-anchor="start" x="8" y="-340.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># ref_counter_</text>
-<polyline fill="none" stroke="#000000" points="0,-333.5 183,-333.5 "/>
-<text text-anchor="start" x="8" y="-321.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ type_index()</text>
-<text text-anchor="start" x="8" y="-310.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ GetTypeKey()</text>
-<text text-anchor="start" x="8" y="-299.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ GetTypeKeyHash()</text>
-<text text-anchor="start" x="8" y="-288.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ IsInstance()</text>
-<text text-anchor="start" x="8" y="-277.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ unique()</text>
-<text text-anchor="start" x="8" y="-266.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Object()</text>
-<text text-anchor="start" x="8" y="-255.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Object()</text>
-<text text-anchor="start" x="8" y="-244.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Object()</text>
-<text text-anchor="start" x="8" y="-233.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator=()</text>
-<text text-anchor="start" x="8" y="-222.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator=()</text>
-<text text-anchor="start" x="8" y="-211.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TypeIndex2Key()</text>
-<text text-anchor="start" x="8" y="-200.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TypeIndex2KeyHash()</text>
-<text text-anchor="start" x="8" y="-189.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TypeKey2Index()</text>
-<text text-anchor="start" x="8" y="-178.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _GetOrAllocRuntimeTypeIndex()</text>
-<text text-anchor="start" x="8" y="-167.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ RuntimeTypeIndex()</text>
-<text text-anchor="start" x="8" y="-156.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># IncRef()</text>
-<text text-anchor="start" x="8" y="-145.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># DecRef()</text>
-<text text-anchor="start" x="8" y="-134.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># GetOrAllocRuntimeTypeIndex()</text>
-</a>
-</g>
-</g>
-<!-- Node3&#45;&gt;Node2 -->
-<g id="edge1" class="edge">
-<title>Node3&#45;&gt;Node2</title>
-<path fill="none" stroke="#191970" d="M181.8142,-117.9736C188.0777,-103.8934 193.8729,-90.8657 198.8619,-79.6505"/>
-<polygon fill="none" stroke="#191970" points="178.501,-116.8105 177.6344,-127.3698 184.8968,-119.6556 178.501,-116.8105"/>
-</g>
-<!-- Node3&#45;&gt;Node3 -->
-<g id="edge2" class="edge">
-<title>Node3&#45;&gt;Node3</title>
-<path fill="none" stroke="#404040" d="M183.3625,-354.9248C194.0482,-348.6637 201,-337.3555 201,-321 201,-310.0112 197.8618,-301.3007 192.5615,-294.8687"/>
-<polygon fill="none" stroke="#404040" points="192.5184,-294.8322 185.3548,-294.0056 183.3625,-287.0752 190.5261,-287.9017 192.5184,-294.8322"/>
-<text text-anchor="middle" x="227" y="-318.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> #deleter_</text>
-</g>
-<!-- Node4 -->
-<g id="node3" class="node">
-<title>Node4</title>
-<g id="a_node3"><a xlink:href="classtvm_1_1runtime_1_1NDArray.html" target="_top" xlink:title="Managed NDArray. The array is backed by reference counted blocks. ">
-<polygon fill="#ffffff" stroke="#000000" points="271,-188 271,-454 414,-454 414,-188 271,-188"/>
-<text text-anchor="middle" x="342.5" y="-442" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::NDArray</text>
-<polyline fill="none" stroke="#000000" points="271,-435 414,-435 "/>
-<text text-anchor="middle" x="342.5" y="-423" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> </text>
-<polyline fill="none" stroke="#000000" points="271,-416 414,-416 "/>
-<text text-anchor="start" x="279" y="-404" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ NDArray()</text>
-<text text-anchor="start" x="279" y="-393" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ NDArray()</text>
-<text text-anchor="start" x="279" y="-382" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ reset()</text>
-<text text-anchor="start" x="279" y="-371" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ use_count()</text>
-<text text-anchor="start" x="279" y="-360" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&#45;&gt;()</text>
-<text text-anchor="start" x="279" y="-349" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ IsContiguous()</text>
-<text text-anchor="start" x="279" y="-338" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ CopyFrom()</text>
-<text text-anchor="start" x="279" y="-327" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ CopyFrom()</text>
-<text text-anchor="start" x="279" y="-316" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ CopyFromBytes()</text>
-<text text-anchor="start" x="279" y="-305" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ CopyTo()</text>
-<text text-anchor="start" x="279" y="-294" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">and 9 more...</text>
-<text text-anchor="start" x="279" y="-283" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Empty()</text>
-<text text-anchor="start" x="279" y="-272" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ FromExternalDLTensor()</text>
-<text text-anchor="start" x="279" y="-261" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ NewFromDLTensor()</text>
-<text text-anchor="start" x="279" y="-250" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ FromDLPack()</text>
-<text text-anchor="start" x="279" y="-239" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ CopyFromTo()</text>
-<text text-anchor="start" x="279" y="-228" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># get_mutable()</text>
-<text text-anchor="start" x="279" y="-217" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># FFIDataFromHandle()</text>
-<text text-anchor="start" x="279" y="-206" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># FFIDecRef()</text>
-<text text-anchor="start" x="279" y="-195" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># FFIGetHandle()</text>
-</a>
-</g>
-</g>
-<!-- Node4&#45;&gt;Node2 -->
-<g id="edge3" class="edge">
-<title>Node4&#45;&gt;Node2</title>
-<path fill="none" stroke="#404040" d="M289.6216,-187.9935C280.8804,-167.4292 271.6698,-146.5414 262.5,-127 256.8875,-115.0395 250.4685,-102.3666 244.2424,-90.5029"/>
-<polygon fill="none" stroke="#404040" points="244.2303,-90.4801 237.8851,-87.0538 238.6074,-79.8789 244.9525,-83.3052 244.2303,-90.4801"/>
-<text text-anchor="middle" x="274" y="-101" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> +param</text>
-</g>
-<!-- Node5 -->
-<g id="node4" class="node">
-<title>Node5</title>
-<g id="a_node4"><a xlink:href="classtvm_1_1runtime_1_1ObjectRef.html" target="_top" xlink:title="Base class of all object reference. ">
-<polygon fill="#ffffff" stroke="#000000" points="275.5,-552.5 275.5,-774.5 409.5,-774.5 409.5,-552.5 275.5,-552.5"/>
-<text text-anchor="middle" x="342.5" y="-762.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::ObjectRef</text>
-<polyline fill="none" stroke="#000000" points="275.5,-755.5 409.5,-755.5 "/>
-<text text-anchor="start" x="283.5" y="-743.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_is_nullable</text>
-<polyline fill="none" stroke="#000000" points="275.5,-736.5 409.5,-736.5 "/>
-<text text-anchor="start" x="283.5" y="-724.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectRef()</text>
-<text text-anchor="start" x="283.5" y="-713.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectRef()</text>
-<text text-anchor="start" x="283.5" y="-702.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ same_as()</text>
-<text text-anchor="start" x="283.5" y="-691.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator==()</text>
-<text text-anchor="start" x="283.5" y="-680.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator!=()</text>
-<text text-anchor="start" x="283.5" y="-669.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&lt;()</text>
-<text text-anchor="start" x="283.5" y="-658.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ defined()</text>
-<text text-anchor="start" x="283.5" y="-647.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ get()</text>
-<text text-anchor="start" x="283.5" y="-636.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&#45;&gt;()</text>
-<text text-anchor="start" x="283.5" y="-625.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ unique()</text>
-<text text-anchor="start" x="283.5" y="-614.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ use_count()</text>
-<text text-anchor="start" x="283.5" y="-603.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ as()</text>
-<text text-anchor="start" x="283.5" y="-592.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># get_mutable()</text>
-<text text-anchor="start" x="283.5" y="-581.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># DowncastNoCheck()</text>
-<text text-anchor="start" x="283.5" y="-570.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># FFIClearAfterMove()</text>
-<text text-anchor="start" x="283.5" y="-559.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># GetDataPtr()</text>
-</a>
-</g>
-</g>
-<!-- Node5&#45;&gt;Node4 -->
-<g id="edge4" class="edge">
-<title>Node5&#45;&gt;Node4</title>
-<path fill="none" stroke="#191970" d="M342.5,-542.284C342.5,-513.7101 342.5,-483.0989 342.5,-454.0351"/>
-<polygon fill="none" stroke="#191970" points="339.0001,-542.3 342.5,-552.3001 346.0001,-542.3001 339.0001,-542.3"/>
-</g>
-<!-- Node6 -->
-<g id="node5" class="node">
-<title>Node6</title>
-<g id="a_node5"><a xlink:href="classtvm_1_1runtime_1_1ObjectPtr.html" target="_top" xlink:title="{tvm::runtime::ObjectPtr\l\&lt; tvm::runtime::Object \&gt;\n||+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ~ObjectPtr()\l+ swap()\l+ get()\l+ operator&#45;\&gt;()\land 11 more...\l}">
-<polygon fill="#ffffff" stroke="#000000" points="272.5,-822.5 272.5,-1000.5 412.5,-1000.5 412.5,-822.5 272.5,-822.5"/>
-<text text-anchor="start" x="280.5" y="-988.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::ObjectPtr</text>
-<text text-anchor="middle" x="342.5" y="-977.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">&lt; tvm::runtime::Object &gt;</text>
-<polyline fill="none" stroke="#000000" points="272.5,-970.5 412.5,-970.5 "/>
-<text text-anchor="middle" x="342.5" y="-958.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> </text>
-<polyline fill="none" stroke="#000000" points="272.5,-951.5 412.5,-951.5 "/>
-<text text-anchor="start" x="280.5" y="-939.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="280.5" y="-928.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="280.5" y="-917.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="280.5" y="-906.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="280.5" y="-895.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="280.5" y="-884.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="280.5" y="-873.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ~ObjectPtr()</text>
-<text text-anchor="start" x="280.5" y="-862.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ swap()</text>
-<text text-anchor="start" x="280.5" y="-851.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ get()</text>
-<text text-anchor="start" x="280.5" y="-840.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&#45;&gt;()</text>
-<text text-anchor="start" x="280.5" y="-829.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">and 11 more...</text>
-</a>
-</g>
-</g>
-<!-- Node6&#45;&gt;Node5 -->
-<g id="edge5" class="edge">
-<title>Node6&#45;&gt;Node5</title>
-<path fill="none" stroke="#404040" d="M342.5,-822.3167C342.5,-810.8765 342.5,-799.0062 342.5,-787.1402"/>
-<polygon fill="none" stroke="#404040" points="342.5001,-786.7944 338.5,-780.7944 342.5,-774.7944 346.5,-780.7943 342.5001,-786.7944"/>
-<text text-anchor="middle" x="362" y="-796" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> #data_</text>
-</g>
-</g>
-</svg>
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode__inherit__graph.svg b/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode__inherit__graph.svg
deleted file mode 100644
index 22e1ef1d5..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParamNode__inherit__graph.svg
+++ /dev/null
@@ -1,76 +0,0 @@
-<?xml version="1.0" encoding="UTF-8" standalone="no"?>
-<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
- "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
-<!-- Generated by graphviz version 2.40.1 (20161225.0304)
- -->
-<!-- Title: tvm::LinkedParamNode Pages: 1 -->
-<svg width="217pt" height="534pt"
- viewBox="0.00 0.00 217.00 534.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
-<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 530)">
-<title>tvm::LinkedParamNode</title>
-<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-530 213,-530 213,4 -4,4"/>
-<!-- Node0 -->
-<g id="node1" class="node">
-<title>Node0</title>
-<polygon fill="#bfbfbf" stroke="#000000" points="0,-.5 0,-90.5 209,-90.5 209,-.5 0,-.5"/>
-<text text-anchor="middle" x="104.5" y="-78.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::LinkedParamNode</text>
-<polyline fill="none" stroke="#000000" points="0,-71.5 209,-71.5 "/>
-<text text-anchor="start" x="8" y="-59.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ id</text>
-<text text-anchor="start" x="8" y="-48.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ param</text>
-<text text-anchor="start" x="8" y="-37.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_key</text>
-<polyline fill="none" stroke="#000000" points="0,-30.5 209,-30.5 "/>
-<text text-anchor="start" x="8" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ VisitAttrs()</text>
-<text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TVM_DECLARE_FINAL_OBJECT_INFO()</text>
-</g>
-<!-- Node1 -->
-<g id="node2" class="node">
-<title>Node1</title>
-<g id="a_node2"><a xlink:href="classtvm_1_1runtime_1_1Object.html" target="_top" xlink:title="base class of all object containers. ">
-<polygon fill="#ffffff" stroke="#000000" points="13,-127.5 13,-525.5 196,-525.5 196,-127.5 13,-127.5"/>
-<text text-anchor="middle" x="104.5" y="-513.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::Object</text>
-<polyline fill="none" stroke="#000000" points="13,-506.5 196,-506.5 "/>
-<text text-anchor="start" x="21" y="-494.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_key</text>
-<text text-anchor="start" x="21" y="-483.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_final</text>
-<text text-anchor="start" x="21" y="-472.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_child_slots</text>
-<text text-anchor="start" x="21" y="-461.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_child_slots_can</text>
-<text text-anchor="start" x="21" y="-450.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_overflow</text>
-<text text-anchor="start" x="21" y="-439.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_has_method_visit</text>
-<text text-anchor="start" x="21" y="-428.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_attrs</text>
-<text text-anchor="start" x="21" y="-417.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_has_method_sequal</text>
-<text text-anchor="start" x="21" y="-406.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_reduce</text>
-<text text-anchor="start" x="21" y="-395.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_has_method_shash</text>
-<text text-anchor="start" x="21" y="-384.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_reduce</text>
-<text text-anchor="start" x="21" y="-373.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_index</text>
-<text text-anchor="start" x="21" y="-362.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># type_index_</text>
-<text text-anchor="start" x="21" y="-351.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># ref_counter_</text>
-<text text-anchor="start" x="21" y="-340.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># deleter_</text>
-<polyline fill="none" stroke="#000000" points="13,-333.5 196,-333.5 "/>
-<text text-anchor="start" x="21" y="-321.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ type_index()</text>
-<text text-anchor="start" x="21" y="-310.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ GetTypeKey()</text>
-<text text-anchor="start" x="21" y="-299.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ GetTypeKeyHash()</text>
-<text text-anchor="start" x="21" y="-288.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ IsInstance()</text>
-<text text-anchor="start" x="21" y="-277.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ unique()</text>
-<text text-anchor="start" x="21" y="-266.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Object()</text>
-<text text-anchor="start" x="21" y="-255.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Object()</text>
-<text text-anchor="start" x="21" y="-244.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ Object()</text>
-<text text-anchor="start" x="21" y="-233.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator=()</text>
-<text text-anchor="start" x="21" y="-222.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator=()</text>
-<text text-anchor="start" x="21" y="-211.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TypeIndex2Key()</text>
-<text text-anchor="start" x="21" y="-200.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TypeIndex2KeyHash()</text>
-<text text-anchor="start" x="21" y="-189.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TypeKey2Index()</text>
-<text text-anchor="start" x="21" y="-178.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _GetOrAllocRuntimeTypeIndex()</text>
-<text text-anchor="start" x="21" y="-167.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ RuntimeTypeIndex()</text>
-<text text-anchor="start" x="21" y="-156.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># IncRef()</text>
-<text text-anchor="start" x="21" y="-145.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># DecRef()</text>
-<text text-anchor="start" x="21" y="-134.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># GetOrAllocRuntimeTypeIndex()</text>
-</a>
-</g>
-</g>
-<!-- Node1&#45;&gt;Node0 -->
-<g id="edge1" class="edge">
-<title>Node1&#45;&gt;Node0</title>
-<path fill="none" stroke="#191970" d="M104.5,-116.8999C104.5,-107.6327 104.5,-98.8995 104.5,-90.9342"/>
-<polygon fill="none" stroke="#191970" points="101.0001,-117.1543 104.5,-127.1543 108.0001,-117.1543 101.0001,-117.1543"/>
-</g>
-</g>
-</svg>
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParam__coll__graph.svg b/docs/reference/api/doxygen/classtvm_1_1LinkedParam__coll__graph.svg
deleted file mode 100644
index f0f1a69ba..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParam__coll__graph.svg
+++ /dev/null
@@ -1,92 +0,0 @@
-<?xml version="1.0" encoding="UTF-8" standalone="no"?>
-<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
- "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
-<!-- Generated by graphviz version 2.40.1 (20161225.0304)
- -->
-<!-- Title: tvm::LinkedParam Pages: 1 -->
-<svg width="162pt" height="596pt"
- viewBox="0.00 0.00 162.00 596.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
-<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 592)">
-<title>tvm::LinkedParam</title>
-<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-592 158,-592 158,4 -4,4"/>
-<!-- Node2 -->
-<g id="node1" class="node">
-<title>Node2</title>
-<polygon fill="#bfbfbf" stroke="#000000" points="0,-.5 0,-101.5 154,-101.5 154,-.5 0,-.5"/>
-<text text-anchor="middle" x="77" y="-89.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::LinkedParam</text>
-<polyline fill="none" stroke="#000000" points="0,-82.5 154,-82.5 "/>
-<text text-anchor="middle" x="77" y="-70.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> </text>
-<polyline fill="none" stroke="#000000" points="0,-63.5 154,-63.5 "/>
-<text text-anchor="start" x="8" y="-51.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ LinkedParam()</text>
-<text text-anchor="start" x="8" y="-40.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TVM_DEFINE_OBJECT_REF</text>
-<text text-anchor="start" x="8" y="-29.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_METHODS()</text>
-<text text-anchor="start" x="8" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TVM_DEFINE_OBJECT_REF</text>
-<text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_COW_METHOD()</text>
-</g>
-<!-- Node3 -->
-<g id="node2" class="node">
-<title>Node3</title>
-<g id="a_node2"><a xlink:href="classtvm_1_1runtime_1_1ObjectRef.html" target="_top" xlink:title="Base class of all object reference. ">
-<polygon fill="#ffffff" stroke="#000000" points="10,-139.5 10,-361.5 144,-361.5 144,-139.5 10,-139.5"/>
-<text text-anchor="middle" x="77" y="-349.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::ObjectRef</text>
-<polyline fill="none" stroke="#000000" points="10,-342.5 144,-342.5 "/>
-<text text-anchor="start" x="18" y="-330.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_is_nullable</text>
-<polyline fill="none" stroke="#000000" points="10,-323.5 144,-323.5 "/>
-<text text-anchor="start" x="18" y="-311.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectRef()</text>
-<text text-anchor="start" x="18" y="-300.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectRef()</text>
-<text text-anchor="start" x="18" y="-289.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ same_as()</text>
-<text text-anchor="start" x="18" y="-278.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator==()</text>
-<text text-anchor="start" x="18" y="-267.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator!=()</text>
-<text text-anchor="start" x="18" y="-256.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&lt;()</text>
-<text text-anchor="start" x="18" y="-245.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ defined()</text>
-<text text-anchor="start" x="18" y="-234.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ get()</text>
-<text text-anchor="start" x="18" y="-223.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&#45;&gt;()</text>
-<text text-anchor="start" x="18" y="-212.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ unique()</text>
-<text text-anchor="start" x="18" y="-201.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ use_count()</text>
-<text text-anchor="start" x="18" y="-190.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ as()</text>
-<text text-anchor="start" x="18" y="-179.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># get_mutable()</text>
-<text text-anchor="start" x="18" y="-168.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># DowncastNoCheck()</text>
-<text text-anchor="start" x="18" y="-157.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># FFIClearAfterMove()</text>
-<text text-anchor="start" x="18" y="-146.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># GetDataPtr()</text>
-</a>
-</g>
-</g>
-<!-- Node3&#45;&gt;Node2 -->
-<g id="edge1" class="edge">
-<title>Node3&#45;&gt;Node2</title>
-<path fill="none" stroke="#191970" d="M77,-129.3399C77,-119.7909 77,-110.5166 77,-101.8898"/>
-<polygon fill="none" stroke="#191970" points="73.5001,-129.3748 77,-139.3748 80.5001,-129.3749 73.5001,-129.3748"/>
-</g>
-<!-- Node4 -->
-<g id="node3" class="node">
-<title>Node4</title>
-<g id="a_node3"><a xlink:href="classtvm_1_1runtime_1_1ObjectPtr.html" target="_top" xlink:title="{tvm::runtime::ObjectPtr\l\&lt; tvm::runtime::Object \&gt;\n||+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ~ObjectPtr()\l+ swap()\l+ get()\l+ operator&#45;\&gt;()\land 11 more...\l}">
-<polygon fill="#ffffff" stroke="#000000" points="7,-409.5 7,-587.5 147,-587.5 147,-409.5 7,-409.5"/>
-<text text-anchor="start" x="15" y="-575.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::ObjectPtr</text>
-<text text-anchor="middle" x="77" y="-564.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">&lt; tvm::runtime::Object &gt;</text>
-<polyline fill="none" stroke="#000000" points="7,-557.5 147,-557.5 "/>
-<text text-anchor="middle" x="77" y="-545.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> </text>
-<polyline fill="none" stroke="#000000" points="7,-538.5 147,-538.5 "/>
-<text text-anchor="start" x="15" y="-526.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="15" y="-515.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="15" y="-504.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="15" y="-493.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="15" y="-482.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="15" y="-471.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectPtr()</text>
-<text text-anchor="start" x="15" y="-460.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ~ObjectPtr()</text>
-<text text-anchor="start" x="15" y="-449.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ swap()</text>
-<text text-anchor="start" x="15" y="-438.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ get()</text>
-<text text-anchor="start" x="15" y="-427.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&#45;&gt;()</text>
-<text text-anchor="start" x="15" y="-416.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">and 11 more...</text>
-</a>
-</g>
-</g>
-<!-- Node4&#45;&gt;Node3 -->
-<g id="edge2" class="edge">
-<title>Node4&#45;&gt;Node3</title>
-<path fill="none" stroke="#404040" d="M77,-409.3167C77,-397.8765 77,-386.0062 77,-374.1402"/>
-<polygon fill="none" stroke="#404040" points="77.0001,-373.7944 73,-367.7944 77,-361.7944 81,-367.7943 77.0001,-373.7944"/>
-<text text-anchor="middle" x="96.5" y="-383" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> #data_</text>
-</g>
-</g>
-</svg>
diff --git a/docs/reference/api/doxygen/classtvm_1_1LinkedParam__inherit__graph.svg b/docs/reference/api/doxygen/classtvm_1_1LinkedParam__inherit__graph.svg
deleted file mode 100644
index 2b248f33d..000000000
--- a/docs/reference/api/doxygen/classtvm_1_1LinkedParam__inherit__graph.svg
+++ /dev/null
@@ -1,62 +0,0 @@
-<?xml version="1.0" encoding="UTF-8" standalone="no"?>
-<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
- "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
-<!-- Generated by graphviz version 2.40.1 (20161225.0304)
- -->
-<!-- Title: tvm::LinkedParam Pages: 1 -->
-<svg width="162pt" height="380pt"
- viewBox="0.00 0.00 162.00 380.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
-<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 376)">
-<title>tvm::LinkedParam</title>
-<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-376 158,-376 158,4 -4,4"/>
-<!-- Node0 -->
-<g id="node1" class="node">
-<title>Node0</title>
-<polygon fill="#bfbfbf" stroke="#000000" points="0,-.5 0,-101.5 154,-101.5 154,-.5 0,-.5"/>
-<text text-anchor="middle" x="77" y="-89.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::LinkedParam</text>
-<polyline fill="none" stroke="#000000" points="0,-82.5 154,-82.5 "/>
-<text text-anchor="middle" x="77" y="-70.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> </text>
-<polyline fill="none" stroke="#000000" points="0,-63.5 154,-63.5 "/>
-<text text-anchor="start" x="8" y="-51.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ LinkedParam()</text>
-<text text-anchor="start" x="8" y="-40.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TVM_DEFINE_OBJECT_REF</text>
-<text text-anchor="start" x="8" y="-29.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_METHODS()</text>
-<text text-anchor="start" x="8" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ TVM_DEFINE_OBJECT_REF</text>
-<text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">_COW_METHOD()</text>
-</g>
-<!-- Node1 -->
-<g id="node2" class="node">
-<title>Node1</title>
-<g id="a_node2"><a xlink:href="classtvm_1_1runtime_1_1ObjectRef.html" target="_top" xlink:title="Base class of all object reference. ">
-<polygon fill="#ffffff" stroke="#000000" points="10,-138.5 10,-371.5 144,-371.5 144,-138.5 10,-138.5"/>
-<text text-anchor="middle" x="77" y="-359.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::ObjectRef</text>
-<polyline fill="none" stroke="#000000" points="10,-352.5 144,-352.5 "/>
-<text text-anchor="start" x="18" y="-340.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ _type_is_nullable</text>
-<text text-anchor="start" x="18" y="-329.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># data_</text>
-<polyline fill="none" stroke="#000000" points="10,-322.5 144,-322.5 "/>
-<text text-anchor="start" x="18" y="-310.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectRef()</text>
-<text text-anchor="start" x="18" y="-299.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ ObjectRef()</text>
-<text text-anchor="start" x="18" y="-288.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ same_as()</text>
-<text text-anchor="start" x="18" y="-277.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator==()</text>
-<text text-anchor="start" x="18" y="-266.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator!=()</text>
-<text text-anchor="start" x="18" y="-255.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&lt;()</text>
-<text text-anchor="start" x="18" y="-244.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ defined()</text>
-<text text-anchor="start" x="18" y="-233.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ get()</text>
-<text text-anchor="start" x="18" y="-222.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ operator&#45;&gt;()</text>
-<text text-anchor="start" x="18" y="-211.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ unique()</text>
-<text text-anchor="start" x="18" y="-200.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ use_count()</text>
-<text text-anchor="start" x="18" y="-189.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">+ as()</text>
-<text text-anchor="start" x="18" y="-178.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># get_mutable()</text>
-<text text-anchor="start" x="18" y="-167.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># DowncastNoCheck()</text>
-<text text-anchor="start" x="18" y="-156.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># FFIClearAfterMove()</text>
-<text text-anchor="start" x="18" y="-145.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># GetDataPtr()</text>
-</a>
-</g>
-</g>
-<!-- Node1&#45;&gt;Node0 -->
-<g id="edge1" class="edge">
-<title>Node1&#45;&gt;Node0</title>
-<path fill="none" stroke="#191970" d="M77,-128.2431C77,-118.9714 77,-109.9902 77,-101.6317"/>
-<polygon fill="none" stroke="#191970" points="73.5001,-128.4021 77,-138.4021 80.5001,-128.4022 73.5001,-128.4021"/>
-</g>
-</g>
-</svg>
diff --git a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object.html b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object.html
index 7015bfe5e..233bffc5d 100644
--- a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object.html
+++ b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object.html
@@ -82,7 +82,7 @@ $(function() {
 
 <p><code>#include &lt;<a class="el" href="object_8h_source.html">object.h</a>&gt;</code></p>
 
-<p>Inherited by <a class="el" href="classtvm_1_1AffineTypeNode.html">tvm::AffineTypeNode</a>, <a class="el" href="classtvm_1_1arith_1_1ConstIntBoundNode.html">tvm::arith::ConstIntBoundNode</a>, <a class="el" href="classtvm_1_1arith_1_1IntConstraintsNode.html">tvm::arith::IntConstraintsNode</a>, <a class="el" href="classtvm_1_1arith_1_1IntConstraintsTransformNode.html">tvm::arith::IntConstraintsTransformNode</a>, <a class="el" href="classtvm_1_1arith_1_1IntGroupBoundsNode.html">tvm::arith [...]
+<p>Inherited by <a class="el" href="classtvm_1_1AffineTypeNode.html">tvm::AffineTypeNode</a>, <a class="el" href="classtvm_1_1arith_1_1ConstIntBoundNode.html">tvm::arith::ConstIntBoundNode</a>, <a class="el" href="classtvm_1_1arith_1_1IntConstraintsNode.html">tvm::arith::IntConstraintsNode</a>, <a class="el" href="classtvm_1_1arith_1_1IntConstraintsTransformNode.html">tvm::arith::IntConstraintsTransformNode</a>, <a class="el" href="classtvm_1_1arith_1_1IntGroupBoundsNode.html">tvm::arith [...]
 <div class="dynheader">
 Collaboration diagram for tvm::runtime::Object:</div>
 <div class="dyncontent">
diff --git a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef.html b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef.html
index e45e8a8e6..0ebae7e81 100644
--- a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef.html
+++ b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef.html
@@ -81,7 +81,7 @@ $(function() {
 
 <p><code>#include &lt;<a class="el" href="object_8h_source.html">object.h</a>&gt;</code></p>
 
-<p>Inherited by <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; Range &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; Region &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; tvm::arith::IterSplitExpr &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; tvm::AttrFieldInfo &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::ru [...]
+<p>Inherited by <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; Range &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; Region &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; tvm::arith::IterSplitExpr &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::runtime::Array&lt; tvm::AttrFieldInfo &gt;</a>, <a class="el" href="classtvm_1_1runtime_1_1Array.html">tvm::ru [...]
 <div class="dynheader">
 Collaboration diagram for tvm::runtime::ObjectRef:</div>
 <div class="dyncontent">
diff --git a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef__coll__graph.svg b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef__coll__graph.svg
index 46e398b26..4df6e1448 100644
--- a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef__coll__graph.svg
+++ b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1ObjectRef__coll__graph.svg
@@ -9,9 +9,9 @@
 <g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 453)">
 <title>tvm::runtime::ObjectRef</title>
 <polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-453 144,-453 144,4 -4,4"/>
-<!-- Node393 -->
+<!-- Node392 -->
 <g id="node1" class="node">
-<title>Node393</title>
+<title>Node392</title>
 <polygon fill="#bfbfbf" stroke="#000000" points="3,-.5 3,-222.5 137,-222.5 137,-.5 3,-.5"/>
 <text text-anchor="middle" x="70" y="-210.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::ObjectRef</text>
 <polyline fill="none" stroke="#000000" points="3,-203.5 137,-203.5 "/>
@@ -34,9 +34,9 @@
 <text text-anchor="start" x="11" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># FFIClearAfterMove()</text>
 <text text-anchor="start" x="11" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># GetDataPtr()</text>
 </g>
-<!-- Node394 -->
+<!-- Node393 -->
 <g id="node2" class="node">
-<title>Node394</title>
+<title>Node393</title>
 <g id="a_node2"><a xlink:href="classtvm_1_1runtime_1_1ObjectPtr.html" target="_top" xlink:title="{tvm::runtime::ObjectPtr\l\&lt; tvm::runtime::Object \&gt;\n||+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ObjectPtr()\l+ ~ObjectPtr()\l+ swap()\l+ get()\l+ operator&#45;\&gt;()\land 11 more...\l}">
 <polygon fill="#ffffff" stroke="#000000" points="0,-270.5 0,-448.5 140,-448.5 140,-270.5 0,-270.5"/>
 <text text-anchor="start" x="8" y="-436.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::ObjectPtr</text>
@@ -58,9 +58,9 @@
 </a>
 </g>
 </g>
-<!-- Node394&#45;&gt;Node393 -->
+<!-- Node393&#45;&gt;Node392 -->
 <g id="edge1" class="edge">
-<title>Node394&#45;&gt;Node393</title>
+<title>Node393&#45;&gt;Node392</title>
 <path fill="none" stroke="#404040" d="M70,-270.3167C70,-258.8765 70,-247.0062 70,-235.1402"/>
 <polygon fill="none" stroke="#404040" points="70.0001,-234.7944 66,-228.7944 70,-222.7944 74,-228.7943 70.0001,-234.7944"/>
 <text text-anchor="middle" x="89.5" y="-244" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> #data_</text>
diff --git a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object__coll__graph.svg b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object__coll__graph.svg
index 041b91094..720280aaf 100644
--- a/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object__coll__graph.svg
+++ b/docs/reference/api/doxygen/classtvm_1_1runtime_1_1Object__coll__graph.svg
@@ -9,9 +9,9 @@
 <g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 392)">
 <title>tvm::runtime::Object</title>
 <polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-392 257,-392 257,4 -4,4"/>
-<!-- Node605 -->
+<!-- Node604 -->
 <g id="node1" class="node">
-<title>Node605</title>
+<title>Node604</title>
 <polygon fill="#bfbfbf" stroke="#000000" points="0,-.5 0,-387.5 183,-387.5 183,-.5 0,-.5"/>
 <text text-anchor="middle" x="91.5" y="-375.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000">tvm::runtime::Object</text>
 <polyline fill="none" stroke="#000000" points="0,-368.5 183,-368.5 "/>
@@ -49,9 +49,9 @@
 <text text-anchor="start" x="8" y="-18.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># DecRef()</text>
 <text text-anchor="start" x="8" y="-7.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"># GetOrAllocRuntimeTypeIndex()</text>
 </g>
-<!-- Node605&#45;&gt;Node605 -->
+<!-- Node604&#45;&gt;Node604 -->
 <g id="edge1" class="edge">
-<title>Node605&#45;&gt;Node605</title>
+<title>Node604&#45;&gt;Node604</title>
 <path fill="none" stroke="#404040" d="M183.3625,-256.0888C194.0482,-244.6299 201,-223.9336 201,-194 201,-171.3159 197.0077,-153.9367 190.4236,-141.8623"/>
 <polygon fill="none" stroke="#404040" points="190.3069,-141.6977 183.5725,-139.1192 183.3625,-131.9112 190.0969,-134.4897 190.3069,-141.6977"/>
 <text text-anchor="middle" x="227" y="-191.5" font-family="Helvetica,sans-Serif" font-size="10.00" fill="#000000"> #deleter_</text>
diff --git a/docs/reference/api/doxygen/codegen_8h_source.html b/docs/reference/api/doxygen/codegen_8h_source.html
index a643638ad..73cd329ac 100644
--- a/docs/reference/api/doxygen/codegen_8h_source.html
+++ b/docs/reference/api/doxygen/codegen_8h_source.html
@@ -74,7 +74,7 @@ $(function() {
 <div class="ttc" id="namespacetvm_1_1codegen_html_ab2cd2a65bac4b26427a8ca0abe4e0bd6"><div class="ttname"><a href="namespacetvm_1_1codegen.html#ab2cd2a65bac4b26427a8ca0abe4e0bd6">tvm::codegen::PackImportsToLLVM</a></div><div class="ttdeci">runtime::Module PackImportsToLLVM(const runtime::Module &amp;m, bool system_lib, const std::string &amp;target_triple)</div><div class="ttdoc">Pack imported device library to a LLVM module. Compile the LLVM module and link with the host library...</div></div>
 <div class="ttc" id="classtvm_1_1Target_html"><div class="ttname"><a href="classtvm_1_1Target.html">tvm::Target</a></div><div class="ttdoc">Managed reference class to TargetNode. </div><div class="ttdef"><b>Definition:</b> target.h:141</div></div>
 <div class="ttc" id="namespacetvm_1_1codegen_html_abf02059ebadcdb8bbbe5c840b646d67b"><div class="ttname"><a href="namespacetvm_1_1codegen.html#abf02059ebadcdb8bbbe5c840b646d67b">tvm::codegen::PackImportsToC</a></div><div class="ttdeci">std::string PackImportsToC(const runtime::Module &amp;m, bool system_lib)</div><div class="ttdoc">Pack imported device library to a C file. Compile the C file and link with the host library will allo...</div></div>
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="namespacetvm_1_1codegen_html_a0d6322c2dda54a66a3b82022f5f3632c"><div class="ttname"><a href="namespacetvm_1_1codegen.html#a0d6322c2dda54a66a3b82022f5f3632c">tvm::codegen::Build</a></div><div class="ttdeci">runtime::Module Build(IRModule mod, Target target)</div><div class="ttdoc">Build a module from array of lowered function. </div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Module_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Module.html">tvm::runtime::Module</a></div><div class="ttdoc">Module container of TVM. </div><div class="ttdef"><b>Definition:</b> module.h:48</div></div>
 <div class="ttc" id="target_8h_html"><div class="ttname"><a href="target_8h.html">target.h</a></div><div class="ttdoc">Compilation target object. </div></div>
diff --git a/docs/reference/api/doxygen/database_8h_source.html b/docs/reference/api/doxygen/database_8h_source.html
index b8df6b52b..7b7cd84cb 100644
--- a/docs/reference/api/doxygen/database_8h_source.html
+++ b/docs/reference/api/doxygen/database_8h_source.html
@@ -99,7 +99,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1runtime_1_1ObjectRef_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></div><div class="ttdoc">Base class of all object reference. </div><div class="ttdef"><b>Definition:</b> object.h:511</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1TuningRecordNode_html"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1TuningRecordNode.html">tvm::meta_schedule::TuningRecordNode</a></div><div class="ttdoc">The class of tuning records. </div><div class="ttdef"><b>Definition:</b> database.h:95</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1WorkloadNode_html_ad6ac6e9052f30846d72489ced86a7690"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1WorkloadNode.html#ad6ac6e9052f30846d72489ced86a7690">tvm::meta_schedule::WorkloadNode::_type_key</a></div><div class="ttdeci">static constexpr const char * _type_key</div><div class="ttdef"><b>Definition:</b> database.h:44</div></div>
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1WorkloadNode_html_aa0ee452b287813b9509e649397846a3c"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1WorkloadNode.html#aa0ee452b287813b9509e649397846a3c">tvm::meta_schedule::WorkloadNode::AsJSON</a></div><div class="ttdeci">ObjectRef AsJSON() const</div><div class="ttdoc">Export the workload to a JSON string. </div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyDatabaseNode_html_add146bf1e2006f72ed1534b2004bcb06"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyDatabaseNode.html#add146bf1e2006f72ed1534b2004bcb06">tvm::meta_schedule::PyDatabaseNode::f_has_workload</a></div><div class="ttdeci">FHasWorkload f_has_workload</div><div class="ttdoc">The packed function to the HasWorkload function. </div><div class="ttdef"><b>Definition:</b> database.h:226</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1PyDatabaseNode_html_a00614b22bd1cae147dc0d83cdd071187"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1PyDatabaseNode.html#a00614b22bd1cae147dc0d83cdd071187">tvm::meta_schedule::PyDatabaseNode::CommitTuningRecord</a></div><div class="ttdeci">void CommitTuningRecord(const TuningRecord &amp;record) final</div><div class="ttdoc">Add a tuning record to the database. </div><div class="ttdef"><b>Definition:</b> database.h:257</div></div>
diff --git a/docs/reference/api/doxygen/dataflow__matcher_8h_source.html b/docs/reference/api/doxygen/dataflow__matcher_8h_source.html
index bb8a3d9ef..8ae6b0f65 100644
--- a/docs/reference/api/doxygen/dataflow__matcher_8h_source.html
+++ b/docs/reference/api/doxygen/dataflow__matcher_8h_source.html
@@ -82,7 +82,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1relay_1_1DFPatternCallbackNode_html_acfe3e9170d4c05ff59f3056b79ae58bb"><div class="ttname"><a href="classtvm_1_1relay_1_1DFPatternCallbackNode.html#acfe3e9170d4c05ff59f3056b79ae58bb">tvm::relay::DFPatternCallbackNode::pattern</a></div><div class="ttdeci">DFPattern pattern</div><div class="ttdoc">Pattern this callback matches. </div><div class="ttdef"><b>Definition:</b> dataflow_matcher.h:45</div></div>
 <div class="ttc" id="namespacetvm_1_1relay_html_a6d491e8dfcb3098241f6d77c3aa5efe2"><div class="ttname"><a href="namespacetvm_1_1relay.html#a6d491e8dfcb3098241f6d77c3aa5efe2">tvm::relay::InferType</a></div><div class="ttdeci">Expr InferType(const Expr &amp;expr)</div><div class="ttdoc">Infer the type of an expression. </div></div>
 <div class="ttc" id="dataflow__pattern_8h_html"><div class="ttname"><a href="dataflow__pattern_8h.html">dataflow_pattern.h</a></div><div class="ttdoc">A pattern language for matching dataflow properties. </div></div>
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="classtvm_1_1relay_1_1DFPatternCallbackNode_html"><div class="ttname"><a href="classtvm_1_1relay_1_1DFPatternCallbackNode.html">tvm::relay::DFPatternCallbackNode</a></div><div class="ttdoc">Base type of all dataflow pattern callbacks. </div><div class="ttdef"><b>Definition:</b> dataflow_matcher.h:42</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1Map_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1Map.html">tvm::runtime::Map</a></div><div class="ttdoc">Map container of NodeRef-&gt;NodeRef in DSL graph. Map implements copy on write semantics, which means map is mutable but copy will happen when array is referenced in more than two places. </div><div class="ttdef"><b>Definition:</b> map.h:1268</div></div>
 <div class="ttc" id="dataflow__pattern__functor_8h_html"><div class="ttname"><a href="dataflow__pattern__functor_8h.html">dataflow_pattern_functor.h</a></div><div class="ttdoc">A set of passes for operating on pattern graphs. </div></div>
diff --git a/docs/reference/api/doxygen/diagnostic_8h_source.html b/docs/reference/api/doxygen/diagnostic_8h_source.html
index 27c739e0a..3771723ea 100644
--- a/docs/reference/api/doxygen/diagnostic_8h_source.html
+++ b/docs/reference/api/doxygen/diagnostic_8h_source.html
@@ -107,7 +107,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1DiagnosticContextNode_html_aea5532b73702d459a53ee0c358607284"><div class="ttname"><a href="classtvm_1_1DiagnosticContextNode.html#aea5532b73702d459a53ee0c358607284">tvm::DiagnosticContextNode::renderer</a></div><div class="ttdeci">DiagnosticRenderer renderer</div><div class="ttdoc">The renderer set for the context. </div><div class="ttdef"><b>Definition:</b> diagnostic.h:177</div></div>
 <div class="ttc" id="object_8h_html_a3aea9b3f65aeb9150c0fa7800e5573c6"><div class="ttname"><a href="object_8h.html#a3aea9b3f65aeb9150c0fa7800e5573c6">TVM_DECLARE_FINAL_OBJECT_INFO</a></div><div class="ttdeci">#define TVM_DECLARE_FINAL_OBJECT_INFO(TypeName, ParentType)</div><div class="ttdoc">helper macro to declare type information in a final class. </div><div class="ttdef"><b>Definition:</b> object.h:671</div></div>
 <div class="ttc" id="namespacetvm_html_a908c332516a33fdc106cd9ee2ebc2b9ea244ce4b6c7f56eaa446d64fc2d068bbb"><div class="ttname"><a href="namespacetvm.html#a908c332516a33fdc106cd9ee2ebc2b9ea244ce4b6c7f56eaa446d64fc2d068bbb">tvm::DiagnosticLevel::kHelp</a></div></div>
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="classtvm_1_1DiagnosticBuilder_html_a52d9cc3cb33e655c5d82af47daa74c66"><div class="ttname"><a href="classtvm_1_1DiagnosticBuilder.html#a52d9cc3cb33e655c5d82af47daa74c66">tvm::DiagnosticBuilder::span</a></div><div class="ttdeci">Span span</div><div class="ttdoc">The span of the diagnostic. </div><div class="ttdef"><b>Definition:</b> diagnostic.h:105</div></div>
 <div class="ttc" id="classtvm_1_1DiagnosticBuilder_html_a3204dda7b9a0625027f3d7cba87558f7"><div class="ttname"><a href="classtvm_1_1DiagnosticBuilder.html#a3204dda7b9a0625027f3d7cba87558f7">tvm::DiagnosticBuilder::DiagnosticBuilder</a></div><div class="ttdeci">DiagnosticBuilder(const DiagnosticBuilder &amp;builder)</div><div class="ttdef"><b>Definition:</b> diagnostic.h:115</div></div>
 <div class="ttc" id="classtvm_1_1DiagnosticContextNode_html_a358426a0415010f8136fd8db0b632a77"><div class="ttname"><a href="classtvm_1_1DiagnosticContextNode.html#a358426a0415010f8136fd8db0b632a77">tvm::DiagnosticContextNode::SEqualReduce</a></div><div class="ttdeci">bool SEqualReduce(const DiagnosticContextNode *other, SEqualReducer equal) const</div><div class="ttdef"><b>Definition:</b> diagnostic.h:184</div></div>
diff --git a/docs/reference/api/doxygen/error_8h_source.html b/docs/reference/api/doxygen/error_8h_source.html
index 6063dbd49..de8097aa9 100644
--- a/docs/reference/api/doxygen/error_8h_source.html
+++ b/docs/reference/api/doxygen/error_8h_source.html
@@ -86,7 +86,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1CompileError_html_ac603e8927fd05acb728fe44ecbe9b6a5"><div class="ttname"><a href="classtvm_1_1CompileError.html#ac603e8927fd05acb728fe44ecbe9b6a5">tvm::CompileError::CompileError</a></div><div class="ttdeci">CompileError(const std::string &amp;msg)</div><div class="ttdoc">construct error from message. </div><div class="ttdef"><b>Definition:</b> error.h:76</div></div>
 <div class="ttc" id="classtvm_1_1ErrorReporter_html_a3e1c300e60077c38bc9540dddcd1a019"><div class="ttname"><a href="classtvm_1_1ErrorReporter.html#a3e1c300e60077c38bc9540dddcd1a019">tvm::ErrorReporter::ReportAt</a></div><div class="ttdeci">void ReportAt(const GlobalVar &amp;global, const ObjectRef &amp;node, std::stringstream &amp;err)</div><div class="ttdoc">Report an error against a program, using the full program error reporting strategy. </div><div class="ttdef"><b>Definition:</b> er [...]
 <div class="ttc" id="classtvm_1_1CompileError_html_a5f3349d1fad25a7753eae3853aea6e9e"><div class="ttname"><a href="classtvm_1_1CompileError.html#a5f3349d1fad25a7753eae3853aea6e9e">tvm::CompileError::CompileError</a></div><div class="ttdeci">CompileError(const CompileError &amp;other)</div><div class="ttdoc">copy constructor. </div><div class="ttdef"><b>Definition:</b> error.h:86</div></div>
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="classtvm_1_1CompileError_html_a0c10257ac3e83a751ff34723e832f1ca"><div class="ttname"><a href="classtvm_1_1CompileError.html#a0c10257ac3e83a751ff34723e832f1ca">tvm::CompileError::CompileError</a></div><div class="ttdeci">CompileError()</div><div class="ttdoc">default constructor. </div><div class="ttdef"><b>Definition:</b> error.h:89</div></div>
 <div class="ttc" id="classtvm_1_1ErrorReporter_html_aed0af73c114daa93db994ce2cfdc3fda"><div class="ttname"><a href="classtvm_1_1ErrorReporter.html#aed0af73c114daa93db994ce2cfdc3fda">tvm::ErrorReporter::ErrorReporter</a></div><div class="ttdeci">ErrorReporter()</div><div class="ttdoc">default constructor. </div><div class="ttdef"><b>Definition:</b> error.h:115</div></div>
 <div class="ttc" id="classtvm_1_1ErrorReporter_html_a7ec11efb5e9680cfd57e05d573fc0927"><div class="ttname"><a href="classtvm_1_1ErrorReporter.html#a7ec11efb5e9680cfd57e05d573fc0927">tvm::ErrorReporter::AnyErrors</a></div><div class="ttdeci">bool AnyErrors()</div><div class="ttdef"><b>Definition:</b> error.h:176</div></div>
diff --git a/docs/reference/api/doxygen/extracted__task_8h_source.html b/docs/reference/api/doxygen/extracted__task_8h_source.html
index 7eecf4c0b..1a8f0725e 100644
--- a/docs/reference/api/doxygen/extracted__task_8h_source.html
+++ b/docs/reference/api/doxygen/extracted__task_8h_source.html
@@ -83,7 +83,7 @@ $(function() {
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1ExtractedTaskNode_html_a50c40aa8beb57d0f31c36ef360042be6"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html#a50c40aa8beb57d0f31c36ef360042be6">tvm::meta_schedule::ExtractedTaskNode::mod</a></div><div class="ttdeci">IRModule mod</div><div class="ttdoc">The high-level IR. </div><div class="ttdef"><b>Definition:</b> extracted_task.h:33</div></div>
 <div class="ttc" id="classtvm_1_1runtime_1_1ObjectRef_html"><div class="ttname"><a href="classtvm_1_1runtime_1_1ObjectRef.html">tvm::runtime::ObjectRef</a></div><div class="ttdoc">Base class of all object reference. </div><div class="ttdef"><b>Definition:</b> object.h:511</div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1ExtractedTaskNode_html_a89729717843a9ea91a4535bafee8b14f"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html#a89729717843a9ea91a4535bafee8b14f">tvm::meta_schedule::ExtractedTaskNode::dispatched</a></div><div class="ttdeci">Array&lt; IRModule &gt; dispatched</div><div class="ttdoc">A list of low-level IRs that the high-level IR could potentially dispatch to. </div><div class="ttdef"><b>Definition:</b> extrac [...]
-<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:395</div></div>
+<div class="ttc" id="classtvm_1_1IRModule_html"><div class="ttname"><a href="classtvm_1_1IRModule.html">tvm::IRModule</a></div><div class="ttdoc">Managed reference class to IRModuleNode. </div><div class="ttdef"><b>Definition:</b> module.h:360</div></div>
 <div class="ttc" id="target_8h_html"><div class="ttname"><a href="target_8h.html">target.h</a></div><div class="ttdoc">Compilation target object. </div></div>
 <div class="ttc" id="classtvm_1_1meta__schedule_1_1ExtractedTaskNode_html_af0bff60f6a1950cd199cda30f38ea47d"><div class="ttname"><a href="classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html#af0bff60f6a1950cd199cda30f38ea47d">tvm::meta_schedule::ExtractedTaskNode::TVM_DECLARE_FINAL_OBJECT_INFO</a></div><div class="ttdeci">TVM_DECLARE_FINAL_OBJECT_INFO(ExtractedTaskNode, runtime::Object)</div></div>
 </div><!-- fragment --></div><!-- contents -->
diff --git a/docs/reference/api/doxygen/functions__.html b/docs/reference/api/doxygen/functions__.html
index 6914c2a23..7f1b9b503 100644
--- a/docs/reference/api/doxygen/functions__.html
+++ b/docs/reference/api/doxygen/functions__.html
@@ -248,7 +248,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html#a28e948c2dd9aa4d4809fabdd953dc33b">tvm::instrument::PassInstrumentNode</a>
 , <a class="el" href="classtvm_1_1IntImmNode.html#a4839f79367838a2baf94f33e6a98f072">tvm::IntImmNode</a>
 , <a class="el" href="classtvm_1_1IRModuleNode.html#a6437f77d18cf9a45f2c183d050605d15">tvm::IRModuleNode</a>
-, <a class="el" href="classtvm_1_1LinkedParamNode.html#a0fcaf48a2f8251d405730bd59fa16f4b">tvm::LinkedParamNode</a>
 , <a class="el" href="classtvm_1_1MemoryInfoNode.html#a679e07b35ff1da70063e3146e7cbb9dd">tvm::MemoryInfoNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html#a755012568d85aa7cba250c5f8be766cc">tvm::meta_schedule::ApplyHistoryBestNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfoNode.html#abde212d9062947bdbbdff5905dca87a3">tvm::meta_schedule::ArgInfoNode</a>
diff --git a/docs/reference/api/doxygen/functions_func_l.html b/docs/reference/api/doxygen/functions_func_l.html
index 526460e7a..fa93b1c85 100644
--- a/docs/reference/api/doxygen/functions_func_l.html
+++ b/docs/reference/api/doxygen/functions_func_l.html
@@ -86,9 +86,6 @@ $(function() {
 <li>LinearCongruentialEngine()
 : <a class="el" href="classtvm_1_1support_1_1LinearCongruentialEngine.html#af1286194e2b9e315bf4174f2dd759ecc">tvm::support::LinearCongruentialEngine</a>
 </li>
-<li>LinkedParam()
-: <a class="el" href="classtvm_1_1LinkedParam.html#a12aed83524087bde67a8e2eb4cfc5d97">tvm::LinkedParam</a>
-</li>
 <li>ListAttrNames()
 : <a class="el" href="classtvm_1_1ReflectionVTable.html#afa0fc95d88bc58c02f73eef524c299cc">tvm::ReflectionVTable</a>
 </li>
@@ -168,7 +165,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1IRModuleNode.html#ae078ad8def39579701d144578c787bcf">tvm::IRModuleNode</a>
 </li>
 <li>LookupTypeDef()
-: <a class="el" href="classtvm_1_1IRModuleNode.html#ae095c1fd87642bd417224668c5b4d910">tvm::IRModuleNode</a>
+: <a class="el" href="classtvm_1_1IRModuleNode.html#a23f3769fe60b3b06c9d163650ea7caaf">tvm::IRModuleNode</a>
 </li>
 <li>LoopRV()
 : <a class="el" href="classtvm_1_1tir_1_1LoopRV.html#ad47c4e83701875b84c9efd36ee3dc323">tvm::tir::LoopRV</a>
diff --git a/docs/reference/api/doxygen/functions_func_s.html b/docs/reference/api/doxygen/functions_func_s.html
index c20507ffa..7173eec4e 100644
--- a/docs/reference/api/doxygen/functions_func_s.html
+++ b/docs/reference/api/doxygen/functions_func_s.html
@@ -688,7 +688,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html#a93d1d23f24d903db844f75f51fe09a36">tvm::tir::ScheduleNode</a>
 </li>
 <li>StorageAlignStep()
-: <a class="el" href="classtvm_1_1auto__scheduler_1_1StorageAlignStep.html#af50b7c2f020f8e0a80f5bcc8e559b394">tvm::auto_scheduler::StorageAlignStep</a>
+: <a class="el" href="classtvm_1_1auto__scheduler_1_1StorageAlignStep.html#a99dbb8c55d9e7d78268b6d43fd348bc7">tvm::auto_scheduler::StorageAlignStep</a>
 </li>
 <li>Store()
 : <a class="el" href="classtvm_1_1tir_1_1Store.html#a2c4278b8bcdae57ada2022ecc7c290c3">tvm::tir::Store</a>
@@ -700,7 +700,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1runtime_1_1DeviceAPI.html#ac29b9295c432a87658392872c644864f">tvm::runtime::DeviceAPI</a>
 </li>
 <li>String()
-: <a class="el" href="classtvm_1_1runtime_1_1String.html#acf549b3c43142639879e0fc31ea5cd77">tvm::runtime::String</a>
+: <a class="el" href="classtvm_1_1runtime_1_1String.html#a68df7bab89fca339e3918438dd80300d">tvm::runtime::String</a>
 </li>
 <li>StringImm()
 : <a class="el" href="classtvm_1_1tir_1_1StringImm.html#a0f2830290e055f677c5d5dea98aab726">tvm::tir::StringImm</a>
diff --git a/docs/reference/api/doxygen/functions_func_t.html b/docs/reference/api/doxygen/functions_func_t.html
index 07b41b291..030f88ed9 100644
--- a/docs/reference/api/doxygen/functions_func_t.html
+++ b/docs/reference/api/doxygen/functions_func_t.html
@@ -466,7 +466,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1IncompleteTypeNode.html#afbd1522c6361c9476286344b7bae329c">tvm::IncompleteTypeNode</a>
 , <a class="el" href="classtvm_1_1IntImmNode.html#a222e06d4d1a79d26ee122ba57871eb10">tvm::IntImmNode</a>
 , <a class="el" href="classtvm_1_1IRModuleNode.html#a4840f698deaffe0e96317a436dfd079f">tvm::IRModuleNode</a>
-, <a class="el" href="classtvm_1_1LinkedParamNode.html#a2a92d26184de1c29f2ec0c1c373af3b5">tvm::LinkedParamNode</a>
 , <a class="el" href="classtvm_1_1MemoryInfoNode.html#a76cbbfb90ec0f5deeeb10171430a0ccb">tvm::MemoryInfoNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html#a124bdf490b05d2534053b09299db18dd">tvm::meta_schedule::ApplyHistoryBestNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html#aefeb9bc39e17443d9f4cb6683e5d2af6">tvm::meta_schedule::BuilderInputNode</a>
@@ -747,7 +746,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1State.html#aea500365cb7a964a21ac1677fc29fc99">tvm::auto_scheduler::State</a>
 , <a class="el" href="classtvm_1_1DictAttrs.html#adcb7e5ecede9d976bda30fb3f762c953">tvm::DictAttrs</a>
 , <a class="el" href="classtvm_1_1IRModule.html#ac3d7b217437ecefbd9096a57325ae29a">tvm::IRModule</a>
-, <a class="el" href="classtvm_1_1LinkedParam.html#a17df7ce77a67396945de4e185174e4b5">tvm::LinkedParam</a>
 , <a class="el" href="classtvm_1_1relay_1_1Call.html#ab9eee004a05e13a319c9f1db05602754">tvm::relay::Call</a>
 , <a class="el" href="classtvm_1_1relay_1_1Clause.html#a53074960bfc52dd8fbccfd543758f005">tvm::relay::Clause</a>
 , <a class="el" href="classtvm_1_1relay_1_1Function.html#ac085d821f02ee1e2a4927f85f72f6862">tvm::relay::Function</a>
@@ -829,7 +827,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1IncompleteType.html#a5956b02607eb9f4a61eeb3b250d41154">tvm::IncompleteType</a>
 , <a class="el" href="classtvm_1_1instrument_1_1PassInstrument.html#af3c70646089d0590beddec155bd04e6d">tvm::instrument::PassInstrument</a>
 , <a class="el" href="classtvm_1_1IntImm.html#a9ce59db1a112fb10b7f384b68a3afc9f">tvm::IntImm</a>
-, <a class="el" href="classtvm_1_1LinkedParam.html#af123bcc5f3ef0c0c5089f07e91fa3c19">tvm::LinkedParam</a>
 , <a class="el" href="classtvm_1_1MemoryInfo.html#aa913c570d467dda458f5ba5f5e7795be">tvm::MemoryInfo</a>
 , <a class="el" href="classtvm_1_1PointerType.html#abd62ac0b63821a91e4ed4946a1c7f941">tvm::PointerType</a>
 , <a class="el" href="classtvm_1_1PrimExpr.html#a3ad47a31c4ce693077a93f154b2b1e12">tvm::PrimExpr</a>
@@ -1022,7 +1019,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1TypedEnvFunc_3_01R_07Args_8_8_8_08_4.html#a0d72a6fa7263821c14bcd37837998ed9">tvm::TypedEnvFunc&lt; R(Args...)&gt;</a>
 </li>
 <li>TypedPackedFunc()
-: <a class="el" href="classtvm_1_1runtime_1_1TypedPackedFunc_3_01R_07Args_8_8_8_08_4.html#a36ca0d1876544463ee848766e70e5e96">tvm::runtime::TypedPackedFunc&lt; R(Args...)&gt;</a>
+: <a class="el" href="classtvm_1_1runtime_1_1TypedPackedFunc_3_01R_07Args_8_8_8_08_4.html#afd8ee9dd9648c19b468bb4b0b00e8e4e">tvm::runtime::TypedPackedFunc&lt; R(Args...)&gt;</a>
 </li>
 <li>TypeIndex2Key()
 : <a class="el" href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">tvm::runtime::Object</a>
@@ -1045,7 +1042,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1TypeRelation.html#ac26b1897eab8197ed26606ab81b7403b">tvm::TypeRelation</a>
 </li>
 <li>TypeReporter()
-: <a class="el" href="classtvm_1_1TypeReporter.html#a8e7e05a07f9f7ad9bea91f27afac9051">tvm::TypeReporter</a>
+: <a class="el" href="classtvm_1_1TypeReporter.html#aa3dc38a3c84d324d0b3a9f358460a091">tvm::TypeReporter</a>
 </li>
 <li>TypeVar()
 : <a class="el" href="classtvm_1_1TypeVar.html#adf5ef8e89d162735519b5d125c89e3e3">tvm::TypeVar</a>
diff --git a/docs/reference/api/doxygen/functions_func_v.html b/docs/reference/api/doxygen/functions_func_v.html
index f4e5ceb1f..79ca228cd 100644
--- a/docs/reference/api/doxygen/functions_func_v.html
+++ b/docs/reference/api/doxygen/functions_func_v.html
@@ -143,7 +143,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html#a89dddfdff1613182e02214ad88f78cce">tvm::instrument::PassInstrumentNode</a>
 , <a class="el" href="classtvm_1_1IntImmNode.html#a39ccfd3964e6d132ad8d4e4d544b5949">tvm::IntImmNode</a>
 , <a class="el" href="classtvm_1_1IRModuleNode.html#affbad8fa2513bd33cf8ac7d95aee132e">tvm::IRModuleNode</a>
-, <a class="el" href="classtvm_1_1LinkedParamNode.html#a16df477a0c00bd0423cf2d46de60bfe3">tvm::LinkedParamNode</a>
 , <a class="el" href="classtvm_1_1MemoryInfoNode.html#a93ff1429c6382dc4d17bf91d8dad5e81">tvm::MemoryInfoNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html#a1ef1c1f1d65ff784abf4c7e064d54637">tvm::meta_schedule::ApplyHistoryBestNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html#af640877ef243c29d4845977c62f1e12d">tvm::meta_schedule::BuilderInputNode</a>
@@ -340,7 +339,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1relay_1_1DFPatternVisitor.html#ae7e67d3a1709b0a180572417698ffaa8">tvm::relay::DFPatternVisitor</a>
 </li>
 <li>VisitDFPattern_()
-: <a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor_3_01R_07const_01DFPattern_01_6n_00_01Args_8_8_8_08_4.html#a1161e6f7e0591539407d6843edad039b">tvm::relay::DFPatternFunctor&lt; R(const DFPattern &amp;n, Args...)&gt;</a>
+: <a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor_3_01R_07const_01DFPattern_01_6n_00_01Args_8_8_8_08_4.html#a4c61918b0ba22edf08576f16adb09a9d">tvm::relay::DFPatternFunctor&lt; R(const DFPattern &amp;n, Args...)&gt;</a>
 , <a class="el" href="classtvm_1_1relay_1_1DFPatternVisitor.html#af6cb65b48220b7f937c751f9bfc18e91">tvm::relay::DFPatternVisitor</a>
 </li>
 <li>VisitDFPatternDefault_()
@@ -359,14 +358,14 @@ $(function() {
 , <a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html#a6d35a6081ee7dbc440e5a980f70795c6">tvm::tir::StmtVisitor</a>
 </li>
 <li>VisitExpr_()
-: <a class="el" href="classtvm_1_1relay_1_1ExprFunctor_3_01R_07const_01Expr_01_6n_00_01Args_8_8_8_08_4.html#a106fe266717d897c6c633745ea484c98">tvm::relay::ExprFunctor&lt; R(const Expr &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1relay_1_1ExprMutator.html#a139d2a36b9a3b4cb3c2d80ec09ff645b">tvm::relay::ExprMutator</a>
-, <a class="el" href="classtvm_1_1relay_1_1ExprVisitor.html#a52cec47d5e4792dd1cf0f5635ab14fa8">tvm::relay::ExprVisitor</a>
+: <a class="el" href="classtvm_1_1relay_1_1ExprFunctor_3_01R_07const_01Expr_01_6n_00_01Args_8_8_8_08_4.html#ab067e6279e3e73817d41a17eef030726">tvm::relay::ExprFunctor&lt; R(const Expr &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1relay_1_1ExprMutator.html#a03a4b48a2cdcf642f4cf3b9d55064f53">tvm::relay::ExprMutator</a>
+, <a class="el" href="classtvm_1_1relay_1_1ExprVisitor.html#a8b1ef43d965026767385fb3ee5791928">tvm::relay::ExprVisitor</a>
 , <a class="el" href="classtvm_1_1relay_1_1MixedModeMutator.html#a86656f533b4961437f53d1dbe30ae1fb">tvm::relay::MixedModeMutator</a>
 , <a class="el" href="classtvm_1_1relay_1_1MixedModeVisitor.html#a109b942e0299536851a3dc53a02b1ddc">tvm::relay::MixedModeVisitor</a>
-, <a class="el" href="classtvm_1_1tir_1_1ExprFunctor_3_01R_07const_01PrimExpr_01_6n_00_01Args_8_8_8_08_4.html#a22b383c5c332c23aca9f9248f4fedfd1">tvm::tir::ExprFunctor&lt; R(const PrimExpr &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1tir_1_1ExprMutator.html#a44047f3394527b92a7b9b2c09c3d1383">tvm::tir::ExprMutator</a>
-, <a class="el" href="classtvm_1_1tir_1_1ExprVisitor.html#a786dcfcef511795b23359a3c60c74477">tvm::tir::ExprVisitor</a>
+, <a class="el" href="classtvm_1_1tir_1_1ExprFunctor_3_01R_07const_01PrimExpr_01_6n_00_01Args_8_8_8_08_4.html#a9f18d0dac340380dfd22737e3aee6aee">tvm::tir::ExprFunctor&lt; R(const PrimExpr &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1tir_1_1ExprMutator.html#aabb9f6232b3ab527bd34293bb2c9d047">tvm::tir::ExprMutator</a>
+, <a class="el" href="classtvm_1_1tir_1_1ExprVisitor.html#abf1ea11bdeb9df050bc73155ffb50a8a">tvm::tir::ExprVisitor</a>
 </li>
 <li>VisitExprDefault_()
 : <a class="el" href="classtvm_1_1relay_1_1ExprFunctor_3_01R_07const_01Expr_01_6n_00_01Args_8_8_8_08_4.html#ab35a37c57578e32a8c873cdfe9e31a0f">tvm::relay::ExprFunctor&lt; R(const Expr &amp;n, Args...)&gt;</a>
@@ -387,9 +386,9 @@ $(function() {
 , <a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html#ad6692c86b749bb0d93042aa2a0425a74">tvm::relay::PatternFunctor&lt; R(const Pattern &amp;n, Args...)&gt;</a>
 </li>
 <li>VisitPattern_()
-: <a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html#a11370205d1de851e817d40f031ad4811">tvm::relay::PatternFunctor&lt; R(const Pattern &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1relay_1_1PatternMutator.html#af8ea941a20a51cba2dc5e9e21f0ffc88">tvm::relay::PatternMutator</a>
-, <a class="el" href="classtvm_1_1relay_1_1PatternVisitor.html#a2d9a35bc9be4f5d0badb0c1bb5b86847">tvm::relay::PatternVisitor</a>
+: <a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html#afe53bd4de34ab8dda2ea3c46a91ea6a8">tvm::relay::PatternFunctor&lt; R(const Pattern &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1relay_1_1PatternMutator.html#aedeb370baf4bca6018153d01d2594a84">tvm::relay::PatternMutator</a>
+, <a class="el" href="classtvm_1_1relay_1_1PatternVisitor.html#a615c586aebfe563c7dfee3ff99e8ecb5">tvm::relay::PatternVisitor</a>
 </li>
 <li>VisitPatternDefault_()
 : <a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html#ad71efcd0b9a937b35f7fd4e2b6131773">tvm::relay::PatternFunctor&lt; R(const Pattern &amp;n, Args...)&gt;</a>
@@ -406,8 +405,8 @@ $(function() {
 </li>
 <li>VisitStmt_()
 : <a class="el" href="classtvm_1_1tir_1_1StmtFunctor_3_01R_07const_01Stmt_01_6n_00_01Args_8_8_8_01args_08_4.html#afb4abf8cb69c4a9105eb38e262e96bc7">tvm::tir::StmtFunctor&lt; R(const Stmt &amp;n, Args... args)&gt;</a>
-, <a class="el" href="classtvm_1_1tir_1_1StmtMutator.html#a3b116212aaf79bc898f3446a35f7fd3e">tvm::tir::StmtMutator</a>
-, <a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html#a38488c0f8137e12bc195fa2e0a0524c9">tvm::tir::StmtVisitor</a>
+, <a class="el" href="classtvm_1_1tir_1_1StmtMutator.html#aecd16bf1a6715ea36f6c30e5dc2ceae7">tvm::tir::StmtMutator</a>
+, <a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html#aff2335e1aea1de67bdfb92271c8c0e10">tvm::tir::StmtVisitor</a>
 </li>
 <li>VisitStmtDefault_()
 : <a class="el" href="classtvm_1_1tir_1_1StmtFunctor_3_01R_07const_01Stmt_01_6n_00_01Args_8_8_8_01args_08_4.html#ae51b328e2b59a50bed7112a93dba1aae">tvm::tir::StmtFunctor&lt; R(const Stmt &amp;n, Args... args)&gt;</a>
@@ -421,9 +420,9 @@ $(function() {
 , <a class="el" href="classtvm_1_1TypeMutator.html#a84e824911927d98e20a338eab8b75a45">tvm::TypeMutator</a>
 </li>
 <li>VisitType_()
-: <a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html#ae3a258acfcf5fe3ef0c7e291908d72ff">tvm::TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1TypeMutator.html#a4c7667d35d0a9a28c957165b65536c93">tvm::TypeMutator</a>
-, <a class="el" href="classtvm_1_1TypeVisitor.html#a8f548b8def48ea4f11a3eafa04d74d96">tvm::TypeVisitor</a>
+: <a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html#ac94cab8aea5c2a9afb439d7417f30a20">tvm::TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1TypeMutator.html#a2a78bda75555650a37a80e1e074d562a">tvm::TypeMutator</a>
+, <a class="el" href="classtvm_1_1TypeVisitor.html#a292b19b578526ea74b1434dc50514a18">tvm::TypeVisitor</a>
 </li>
 <li>VisitTypeDefault_()
 : <a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html#a91553f9e04c39b3821a70ae4f7b0c597">tvm::TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a>
diff --git a/docs/reference/api/doxygen/functions_i.html b/docs/reference/api/doxygen/functions_i.html
index 9915eb3f1..5f27a4e24 100644
--- a/docs/reference/api/doxygen/functions_i.html
+++ b/docs/reference/api/doxygen/functions_i.html
@@ -61,9 +61,6 @@ $(function() {
 <div class="textblock">Here is a list of all class members with links to the classes they belong to:</div>
 
 <h3><a id="index_i"></a>- i -</h3><ul>
-<li>id
-: <a class="el" href="classtvm_1_1LinkedParamNode.html#a6000f7f468b8db072935053a1ac1fbf4">tvm::LinkedParamNode</a>
-</li>
 <li>Id()
 : <a class="el" href="classtvm_1_1relay_1_1Id.html#a3448f7c12dde98c58cda1ea224321555">tvm::relay::Id</a>
 </li>
@@ -312,7 +309,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1arith_1_1IntConstraintsTransform.html#a1c5ea6cf05289065ee8967cbfb897181">tvm::arith::IntConstraintsTransform</a>
 </li>
 <li>Integer()
-: <a class="el" href="classtvm_1_1Integer.html#a2d3969d98441b5b2ee5d8a986a56c410">tvm::Integer</a>
+: <a class="el" href="classtvm_1_1Integer.html#a4bdb4edd6acf99ecfca13bf34da04fae">tvm::Integer</a>
 </li>
 <li>Internal
 : <a class="el" href="classtvm_1_1te_1_1SpecializedCondition.html#a8bde6eb35df6b3a9f53810e0bc79fdfd">tvm::te::SpecializedCondition</a>
@@ -351,7 +348,7 @@ $(function() {
 </li>
 <li>Invoke()
 : <a class="el" href="structtvm_1_1runtime_1_1vm_1_1Instruction.html#acb19406a24fa95bf39a29d15ad6be256">tvm::runtime::vm::Instruction</a>
-, <a class="el" href="classtvm_1_1runtime_1_1vm_1_1VirtualMachine.html#a1094291352e07e4c827a88b1167b89ad">tvm::runtime::vm::VirtualMachine</a>
+, <a class="el" href="classtvm_1_1runtime_1_1vm_1_1VirtualMachine.html#aa5f4724e2e702ef9d5c34e85dec53b02">tvm::runtime::vm::VirtualMachine</a>
 </li>
 <li>invoke_args_registers
 : <a class="el" href="structtvm_1_1runtime_1_1vm_1_1Instruction.html#a6fc678bca0e215303087981a79f23b7f">tvm::runtime::vm::Instruction</a>
@@ -584,10 +581,10 @@ $(function() {
 : <a class="el" href="classtvm_1_1tir_1_1IterVar.html#a1c0d6998203092c953b7da00f16c5c31">tvm::tir::IterVar</a>
 </li>
 <li>IterVarAttr()
-: <a class="el" href="classtvm_1_1te_1_1IterVarAttr.html#a5549479b7e3ce243d89b219b0dd7ef71">tvm::te::IterVarAttr</a>
+: <a class="el" href="classtvm_1_1te_1_1IterVarAttr.html#aa20680587a1c880b659063cd37ba4763">tvm::te::IterVarAttr</a>
 </li>
 <li>IterVarRelation()
-: <a class="el" href="classtvm_1_1te_1_1IterVarRelation.html#a3e611ee0870d9a542b8deb79575dbf66">tvm::te::IterVarRelation</a>
+: <a class="el" href="classtvm_1_1te_1_1IterVarRelation.html#a4b50caede957f1cb50587ce15a87109f">tvm::te::IterVarRelation</a>
 </li>
 </ul>
 </div><!-- contents -->
diff --git a/docs/reference/api/doxygen/functions_l.html b/docs/reference/api/doxygen/functions_l.html
index 13f61f0ae..793396492 100644
--- a/docs/reference/api/doxygen/functions_l.html
+++ b/docs/reference/api/doxygen/functions_l.html
@@ -161,9 +161,6 @@ $(function() {
 <li>LinearCongruentialEngine()
 : <a class="el" href="classtvm_1_1support_1_1LinearCongruentialEngine.html#af1286194e2b9e315bf4174f2dd759ecc">tvm::support::LinearCongruentialEngine</a>
 </li>
-<li>LinkedParam()
-: <a class="el" href="classtvm_1_1LinkedParam.html#a12aed83524087bde67a8e2eb4cfc5d97">tvm::LinkedParam</a>
-</li>
 <li>ListAttrNames()
 : <a class="el" href="classtvm_1_1ReflectionVTable.html#afa0fc95d88bc58c02f73eef524c299cc">tvm::ReflectionVTable</a>
 </li>
@@ -246,7 +243,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1IRModuleNode.html#ae078ad8def39579701d144578c787bcf">tvm::IRModuleNode</a>
 </li>
 <li>LookupTypeDef()
-: <a class="el" href="classtvm_1_1IRModuleNode.html#a23f3769fe60b3b06c9d163650ea7caaf">tvm::IRModuleNode</a>
+: <a class="el" href="classtvm_1_1IRModuleNode.html#ae095c1fd87642bd417224668c5b4d910">tvm::IRModuleNode</a>
 </li>
 <li>loop_var
 : <a class="el" href="classtvm_1_1tir_1_1ForNode.html#a7dbf66bdcf8ed397321517f0915a0946">tvm::tir::ForNode</a>
diff --git a/docs/reference/api/doxygen/functions_p.html b/docs/reference/api/doxygen/functions_p.html
index 05fb10265..395bec45a 100644
--- a/docs/reference/api/doxygen/functions_p.html
+++ b/docs/reference/api/doxygen/functions_p.html
@@ -141,9 +141,6 @@ $(function() {
 <li>ParallelizeVectorizeUnroll()
 : <a class="el" href="classtvm_1_1meta__schedule_1_1ScheduleRule.html#a0ef9b604081db7a8bf960f3fbfd3a804">tvm::meta_schedule::ScheduleRule</a>
 </li>
-<li>param
-: <a class="el" href="classtvm_1_1LinkedParamNode.html#a96d730a027c9e169786f3aaea2e4cc10">tvm::LinkedParamNode</a>
-</li>
 <li>param_device_indexes
 : <a class="el" href="structtvm_1_1runtime_1_1vm_1_1VMFunction.html#afff8cae6bf6100376c4275b301a11828">tvm::runtime::vm::VMFunction</a>
 </li>
@@ -328,7 +325,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1te_1_1IterVarAttrNode.html#aea7a6bc44a7ddca46c76c666eba37b7f">tvm::te::IterVarAttrNode</a>
 </li>
 <li>PragmaStep()
-: <a class="el" href="classtvm_1_1auto__scheduler_1_1PragmaStep.html#a7692c2a9934af1f36b218840034a88d5">tvm::auto_scheduler::PragmaStep</a>
+: <a class="el" href="classtvm_1_1auto__scheduler_1_1PragmaStep.html#a9f3ec96f3e561a14d8d9235c4d46e2eb">tvm::auto_scheduler::PragmaStep</a>
 </li>
 <li>pre_
 : <a class="el" href="classtvm_1_1relay_1_1MixedModeMutator.html#a81d6c2593e361659ed2d0bea78a8f58a">tvm::relay::MixedModeMutator</a>
@@ -385,7 +382,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1meta__schedule_1_1SearchStrategyNode.html#abd1485c82a7df42a54904de7822f0fbf">tvm::meta_schedule::SearchStrategyNode</a>
 </li>
 <li>PrimExpr()
-: <a class="el" href="classtvm_1_1PrimExpr.html#a756d3f8b17b019560946524951ae6118">tvm::PrimExpr</a>
+: <a class="el" href="classtvm_1_1PrimExpr.html#a7f0ca30e951608a0b36a77a66d4d19e0">tvm::PrimExpr</a>
 </li>
 <li>PrimFunc()
 : <a class="el" href="classtvm_1_1tir_1_1PrimFunc.html#ab01a529fafaf9fabdfca170605f7b0f8">tvm::tir::PrimFunc</a>
diff --git a/docs/reference/api/doxygen/functions_s.html b/docs/reference/api/doxygen/functions_s.html
index d725fc15f..83cf3fa59 100644
--- a/docs/reference/api/doxygen/functions_s.html
+++ b/docs/reference/api/doxygen/functions_s.html
@@ -1046,7 +1046,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1tir_1_1BufferNode.html#ac18ddd10b79a30ae57d3a8283686259d">tvm::tir::BufferNode</a>
 </li>
 <li>String()
-: <a class="el" href="classtvm_1_1runtime_1_1String.html#ac5d930b522e9fef9c07e51819d96d2f3">tvm::runtime::String</a>
+: <a class="el" href="classtvm_1_1runtime_1_1String.html#acf549b3c43142639879e0fc31ea5cd77">tvm::runtime::String</a>
 , <a class="el" href="classtvm_1_1runtime_1_1StringObj_1_1FromStd.html#a7fb804f7dc96dd9f705c84095f37f1ca">tvm::runtime::StringObj::FromStd</a>
 , <a class="el" href="classtvm_1_1runtime_1_1StringObj.html#a7fb804f7dc96dd9f705c84095f37f1ca">tvm::runtime::StringObj</a>
 </li>
diff --git a/docs/reference/api/doxygen/functions_t.html b/docs/reference/api/doxygen/functions_t.html
index ec9a2f4cc..f5ec6894e 100644
--- a/docs/reference/api/doxygen/functions_t.html
+++ b/docs/reference/api/doxygen/functions_t.html
@@ -78,7 +78,7 @@ $(function() {
 , <a class="el" href="structtvm_1_1runtime_1_1vm_1_1Instruction.html#a46879dbe84105fb621a6167f8d73b223">tvm::runtime::vm::Instruction</a>
 </li>
 <li>Target()
-: <a class="el" href="classtvm_1_1Target.html#a77f3d7cc97d8cfd7172af58b4e784d89">tvm::Target</a>
+: <a class="el" href="classtvm_1_1Target.html#ab825b350cf478bf948d807b6fdf636a0">tvm::Target</a>
 </li>
 <li>target
 : <a class="el" href="classtvm_1_1VirtualDeviceNode.html#a8b2d427d9e21886ccaeaae5e9cc55aaf">tvm::VirtualDeviceNode</a>
@@ -648,7 +648,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1IncompleteTypeNode.html#afbd1522c6361c9476286344b7bae329c">tvm::IncompleteTypeNode</a>
 , <a class="el" href="classtvm_1_1IntImmNode.html#a222e06d4d1a79d26ee122ba57871eb10">tvm::IntImmNode</a>
 , <a class="el" href="classtvm_1_1IRModuleNode.html#a4840f698deaffe0e96317a436dfd079f">tvm::IRModuleNode</a>
-, <a class="el" href="classtvm_1_1LinkedParamNode.html#a2a92d26184de1c29f2ec0c1c373af3b5">tvm::LinkedParamNode</a>
 , <a class="el" href="classtvm_1_1MemoryInfoNode.html#a76cbbfb90ec0f5deeeb10171430a0ccb">tvm::MemoryInfoNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html#a124bdf490b05d2534053b09299db18dd">tvm::meta_schedule::ApplyHistoryBestNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html#aefeb9bc39e17443d9f4cb6683e5d2af6">tvm::meta_schedule::BuilderInputNode</a>
@@ -929,7 +928,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1auto__scheduler_1_1State.html#aea500365cb7a964a21ac1677fc29fc99">tvm::auto_scheduler::State</a>
 , <a class="el" href="classtvm_1_1DictAttrs.html#adcb7e5ecede9d976bda30fb3f762c953">tvm::DictAttrs</a>
 , <a class="el" href="classtvm_1_1IRModule.html#ac3d7b217437ecefbd9096a57325ae29a">tvm::IRModule</a>
-, <a class="el" href="classtvm_1_1LinkedParam.html#a17df7ce77a67396945de4e185174e4b5">tvm::LinkedParam</a>
 , <a class="el" href="classtvm_1_1relay_1_1Call.html#ab9eee004a05e13a319c9f1db05602754">tvm::relay::Call</a>
 , <a class="el" href="classtvm_1_1relay_1_1Clause.html#a53074960bfc52dd8fbccfd543758f005">tvm::relay::Clause</a>
 , <a class="el" href="classtvm_1_1relay_1_1Function.html#ac085d821f02ee1e2a4927f85f72f6862">tvm::relay::Function</a>
@@ -1011,7 +1009,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1IncompleteType.html#a5956b02607eb9f4a61eeb3b250d41154">tvm::IncompleteType</a>
 , <a class="el" href="classtvm_1_1instrument_1_1PassInstrument.html#af3c70646089d0590beddec155bd04e6d">tvm::instrument::PassInstrument</a>
 , <a class="el" href="classtvm_1_1IntImm.html#a9ce59db1a112fb10b7f384b68a3afc9f">tvm::IntImm</a>
-, <a class="el" href="classtvm_1_1LinkedParam.html#af123bcc5f3ef0c0c5089f07e91fa3c19">tvm::LinkedParam</a>
 , <a class="el" href="classtvm_1_1MemoryInfo.html#aa913c570d467dda458f5ba5f5e7795be">tvm::MemoryInfo</a>
 , <a class="el" href="classtvm_1_1PointerType.html#abd62ac0b63821a91e4ed4946a1c7f941">tvm::PointerType</a>
 , <a class="el" href="classtvm_1_1PrimExpr.html#a3ad47a31c4ce693077a93f154b2b1e12">tvm::PrimExpr</a>
@@ -1169,7 +1166,7 @@ $(function() {
 </li>
 <li>TVMArgValue
 : <a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html#a7e8b2c6a4fde079ee813c425d2eb6b24">tvm::runtime::ObjectPtr&lt; T &gt;</a>
-, <a class="el" href="classtvm_1_1runtime_1_1TVMArgValue.html#a987b2fb283cea5484d4655e3f711c046">tvm::runtime::TVMArgValue</a>
+, <a class="el" href="classtvm_1_1runtime_1_1TVMArgValue.html#a5fbd71750e5bbba6edc9094178af9276">tvm::runtime::TVMArgValue</a>
 </li>
 <li>TVMMovableArgValue_
 : <a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html#acd985550cba6cf8509122cbd996c1557">tvm::runtime::ObjectPtr&lt; T &gt;</a>
@@ -1182,7 +1179,7 @@ $(function() {
 <li>TVMPODValue_
 : <a class="el" href="classtvm_1_1runtime_1_1NDArray.html#a9a9fd94393cfd7d4b6e6029348e3e19a">tvm::runtime::NDArray</a>
 , <a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html#a9a9fd94393cfd7d4b6e6029348e3e19a">tvm::runtime::ObjectPtr&lt; T &gt;</a>
-, <a class="el" href="classtvm_1_1runtime_1_1TVMPODValue__.html#a2f46b59a6c1d5eb4575d7f583b5f1a0c">tvm::runtime::TVMPODValue_</a>
+, <a class="el" href="classtvm_1_1runtime_1_1TVMPODValue__.html#afe1837bdbafe8341c2031c5cebcf6e74">tvm::runtime::TVMPODValue_</a>
 </li>
 <li>TVMRetValue
 : <a class="el" href="classtvm_1_1BaseAttrsNode.html#a1f56f080d0c1fab79d9469029aef8ebb">tvm::BaseAttrsNode</a>
@@ -1191,7 +1188,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html#ae0ea8b4adc6dab8c74086bceaef6b3e1">tvm::runtime::ObjectPtr&lt; T &gt;</a>
 , <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html#ae0ea8b4adc6dab8c74086bceaef6b3e1">tvm::runtime::ObjectRef</a>
 , <a class="el" href="classtvm_1_1runtime_1_1TVMPODValue__.html#ae0ea8b4adc6dab8c74086bceaef6b3e1">tvm::runtime::TVMPODValue_</a>
-, <a class="el" href="classtvm_1_1runtime_1_1TVMRetValue.html#ab86bf21f214fca72e73a7f6e20ffab8d">tvm::runtime::TVMRetValue</a>
+, <a class="el" href="classtvm_1_1runtime_1_1TVMRetValue.html#ac4a3850c0989e7c2d5cd8e0f096d0997">tvm::runtime::TVMRetValue</a>
 , <a class="el" href="classtvm_1_1runtime_1_1TypedPackedFunc_3_01R_07Args_8_8_8_08_4.html#ae0ea8b4adc6dab8c74086bceaef6b3e1">tvm::runtime::TypedPackedFunc&lt; R(Args...)&gt;</a>
 </li>
 <li>type
@@ -1263,7 +1260,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1TypedEnvFunc_3_01R_07Args_8_8_8_08_4.html#a41a6b9014d0feeb628ca7edfd0d26f0b">tvm::TypedEnvFunc&lt; R(Args...)&gt;</a>
 </li>
 <li>TypedPackedFunc()
-: <a class="el" href="classtvm_1_1runtime_1_1TypedPackedFunc_3_01R_07Args_8_8_8_08_4.html#a0161d426f9ca366c860ad48c384f7192">tvm::runtime::TypedPackedFunc&lt; R(Args...)&gt;</a>
+: <a class="el" href="classtvm_1_1runtime_1_1TypedPackedFunc_3_01R_07Args_8_8_8_08_4.html#afd8ee9dd9648c19b468bb4b0b00e8e4e">tvm::runtime::TypedPackedFunc&lt; R(Args...)&gt;</a>
 </li>
 <li>TypeIndex2Key()
 : <a class="el" href="classtvm_1_1runtime_1_1Object.html#a817ba6c23b7ee1821c48a75edf255a30">tvm::runtime::Object</a>
@@ -1286,7 +1283,7 @@ $(function() {
 : <a class="el" href="classtvm_1_1TypeRelation.html#ac26b1897eab8197ed26606ab81b7403b">tvm::TypeRelation</a>
 </li>
 <li>TypeReporter()
-: <a class="el" href="classtvm_1_1TypeReporter.html#a8e7e05a07f9f7ad9bea91f27afac9051">tvm::TypeReporter</a>
+: <a class="el" href="classtvm_1_1TypeReporter.html#aa3dc38a3c84d324d0b3a9f358460a091">tvm::TypeReporter</a>
 </li>
 <li>types
 : <a class="el" href="classtvm_1_1TupleAffineTypeNode.html#a30c834b7e1cb64467e6587ac16ebb187">tvm::TupleAffineTypeNode</a>
diff --git a/docs/reference/api/doxygen/functions_v.html b/docs/reference/api/doxygen/functions_v.html
index dcbd6760d..5f0558912 100644
--- a/docs/reference/api/doxygen/functions_v.html
+++ b/docs/reference/api/doxygen/functions_v.html
@@ -154,7 +154,7 @@ $(function() {
 , <a class="el" href="classtvm_1_1relay_1_1PatternVarNode.html#acfa1269806fbf19e7badd424c19c64bf">tvm::relay::PatternVarNode</a>
 </li>
 <li>Var()
-: <a class="el" href="classtvm_1_1relay_1_1Var.html#a06ef8ae1d07a5b8a3c25ca7775d17762">tvm::relay::Var</a>
+: <a class="el" href="classtvm_1_1relay_1_1Var.html#a45372a62057ee9332a391e29845505ff">tvm::relay::Var</a>
 </li>
 <li>var
 : <a class="el" href="classtvm_1_1tir_1_1IterVarNode.html#a09036ef2df09e7caf21e66dcb62675a6">tvm::tir::IterVarNode</a>
@@ -283,7 +283,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html#a89dddfdff1613182e02214ad88f78cce">tvm::instrument::PassInstrumentNode</a>
 , <a class="el" href="classtvm_1_1IntImmNode.html#a39ccfd3964e6d132ad8d4e4d544b5949">tvm::IntImmNode</a>
 , <a class="el" href="classtvm_1_1IRModuleNode.html#affbad8fa2513bd33cf8ac7d95aee132e">tvm::IRModuleNode</a>
-, <a class="el" href="classtvm_1_1LinkedParamNode.html#a16df477a0c00bd0423cf2d46de60bfe3">tvm::LinkedParamNode</a>
 , <a class="el" href="classtvm_1_1MemoryInfoNode.html#a93ff1429c6382dc4d17bf91d8dad5e81">tvm::MemoryInfoNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html#a1ef1c1f1d65ff784abf4c7e064d54637">tvm::meta_schedule::ApplyHistoryBestNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html#af640877ef243c29d4845977c62f1e12d">tvm::meta_schedule::BuilderInputNode</a>
@@ -481,8 +480,8 @@ $(function() {
 , <a class="el" href="classtvm_1_1relay_1_1DFPatternVisitor.html#ae7e67d3a1709b0a180572417698ffaa8">tvm::relay::DFPatternVisitor</a>
 </li>
 <li>VisitDFPattern_()
-: <a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor_3_01R_07const_01DFPattern_01_6n_00_01Args_8_8_8_08_4.html#aec22935453c78417c3acc1cd16947cd6">tvm::relay::DFPatternFunctor&lt; R(const DFPattern &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1relay_1_1DFPatternVisitor.html#aa211e582d65045611b52dd1bf79f3e18">tvm::relay::DFPatternVisitor</a>
+: <a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor_3_01R_07const_01DFPattern_01_6n_00_01Args_8_8_8_08_4.html#a3508d4f2f172303005aa3563c3f31646">tvm::relay::DFPatternFunctor&lt; R(const DFPattern &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1relay_1_1DFPatternVisitor.html#a054e15a4234d52df211a62c190b675fc">tvm::relay::DFPatternVisitor</a>
 </li>
 <li>VisitDFPatternDefault_()
 : <a class="el" href="classtvm_1_1relay_1_1DFPatternFunctor_3_01R_07const_01DFPattern_01_6n_00_01Args_8_8_8_08_4.html#a5b505cf396e6efcd18aeacb0177eeb2a">tvm::relay::DFPatternFunctor&lt; R(const DFPattern &amp;n, Args...)&gt;</a>
@@ -503,14 +502,14 @@ $(function() {
 , <a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html#a6d35a6081ee7dbc440e5a980f70795c6">tvm::tir::StmtVisitor</a>
 </li>
 <li>VisitExpr_()
-: <a class="el" href="classtvm_1_1relay_1_1ExprFunctor_3_01R_07const_01Expr_01_6n_00_01Args_8_8_8_08_4.html#aa2d375485eefe5e7a9b3964bcec214f6">tvm::relay::ExprFunctor&lt; R(const Expr &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1relay_1_1ExprMutator.html#a139d2a36b9a3b4cb3c2d80ec09ff645b">tvm::relay::ExprMutator</a>
-, <a class="el" href="classtvm_1_1relay_1_1ExprVisitor.html#a52cec47d5e4792dd1cf0f5635ab14fa8">tvm::relay::ExprVisitor</a>
+: <a class="el" href="classtvm_1_1relay_1_1ExprFunctor_3_01R_07const_01Expr_01_6n_00_01Args_8_8_8_08_4.html#a9af46513cb9fbb62a718adf40ddbb950">tvm::relay::ExprFunctor&lt; R(const Expr &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1relay_1_1ExprMutator.html#a03a4b48a2cdcf642f4cf3b9d55064f53">tvm::relay::ExprMutator</a>
+, <a class="el" href="classtvm_1_1relay_1_1ExprVisitor.html#a8b1ef43d965026767385fb3ee5791928">tvm::relay::ExprVisitor</a>
 , <a class="el" href="classtvm_1_1relay_1_1MixedModeMutator.html#a86656f533b4961437f53d1dbe30ae1fb">tvm::relay::MixedModeMutator</a>
-, <a class="el" href="classtvm_1_1relay_1_1MixedModeVisitor.html#a55a146580dac0de6220db966a9ac1fa5">tvm::relay::MixedModeVisitor</a>
-, <a class="el" href="classtvm_1_1tir_1_1ExprFunctor_3_01R_07const_01PrimExpr_01_6n_00_01Args_8_8_8_08_4.html#a5f29e35fa0bb60e57bbcb0dec2205f8a">tvm::tir::ExprFunctor&lt; R(const PrimExpr &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1tir_1_1ExprMutator.html#aa587c243decbe1667b93050e7e6128ff">tvm::tir::ExprMutator</a>
-, <a class="el" href="classtvm_1_1tir_1_1ExprVisitor.html#a7d7bd095902563e3e4c239bf322fe325">tvm::tir::ExprVisitor</a>
+, <a class="el" href="classtvm_1_1relay_1_1MixedModeVisitor.html#a109b942e0299536851a3dc53a02b1ddc">tvm::relay::MixedModeVisitor</a>
+, <a class="el" href="classtvm_1_1tir_1_1ExprFunctor_3_01R_07const_01PrimExpr_01_6n_00_01Args_8_8_8_08_4.html#a20601a5bc2e67867c54c3078cd7bca60">tvm::tir::ExprFunctor&lt; R(const PrimExpr &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1tir_1_1ExprMutator.html#aa0156d62a4bf1cf57ffab381125bfd6b">tvm::tir::ExprMutator</a>
+, <a class="el" href="classtvm_1_1tir_1_1ExprVisitor.html#a4490d95fc014da418769ac27589ea51b">tvm::tir::ExprVisitor</a>
 </li>
 <li>VisitExprDefault_()
 : <a class="el" href="classtvm_1_1relay_1_1ExprFunctor_3_01R_07const_01Expr_01_6n_00_01Args_8_8_8_08_4.html#ab35a37c57578e32a8c873cdfe9e31a0f">tvm::relay::ExprFunctor&lt; R(const Expr &amp;n, Args...)&gt;</a>
@@ -531,8 +530,8 @@ $(function() {
 , <a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html#ad6692c86b749bb0d93042aa2a0425a74">tvm::relay::PatternFunctor&lt; R(const Pattern &amp;n, Args...)&gt;</a>
 </li>
 <li>VisitPattern_()
-: <a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html#aa1bf3196b98aedddc028f14f5dcf5384">tvm::relay::PatternFunctor&lt; R(const Pattern &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1relay_1_1PatternMutator.html#a5c4cdc5bd1b1929edf9afa3cf85b9857">tvm::relay::PatternMutator</a>
+: <a class="el" href="classtvm_1_1relay_1_1PatternFunctor_3_01R_07const_01Pattern_01_6n_00_01Args_8_8_8_08_4.html#a11370205d1de851e817d40f031ad4811">tvm::relay::PatternFunctor&lt; R(const Pattern &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1relay_1_1PatternMutator.html#a45f7cdfa9d72a3ab0ce2cb4ea04fec5b">tvm::relay::PatternMutator</a>
 , <a class="el" href="classtvm_1_1relay_1_1PatternVisitor.html#ad5ed2a5c3b88ec027df9e4269dff4b80">tvm::relay::PatternVisitor</a>
 </li>
 <li>VisitPatternDefault_()
@@ -550,8 +549,8 @@ $(function() {
 </li>
 <li>VisitStmt_()
 : <a class="el" href="classtvm_1_1tir_1_1StmtFunctor_3_01R_07const_01Stmt_01_6n_00_01Args_8_8_8_01args_08_4.html#a313eb657690ea8e3bc724252285b9d4e">tvm::tir::StmtFunctor&lt; R(const Stmt &amp;n, Args... args)&gt;</a>
-, <a class="el" href="classtvm_1_1tir_1_1StmtMutator.html#a60b18d6d6bfcb692ab4a369465a175a3">tvm::tir::StmtMutator</a>
-, <a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html#a54d994b1b7deb653e908ff5e59bb691e">tvm::tir::StmtVisitor</a>
+, <a class="el" href="classtvm_1_1tir_1_1StmtMutator.html#a7fefa4227a37b988b91141f746181394">tvm::tir::StmtMutator</a>
+, <a class="el" href="classtvm_1_1tir_1_1StmtVisitor.html#afcb1a0ec03b7a7da4304c5b790b27210">tvm::tir::StmtVisitor</a>
 </li>
 <li>VisitStmtDefault_()
 : <a class="el" href="classtvm_1_1tir_1_1StmtFunctor_3_01R_07const_01Stmt_01_6n_00_01Args_8_8_8_01args_08_4.html#ae51b328e2b59a50bed7112a93dba1aae">tvm::tir::StmtFunctor&lt; R(const Stmt &amp;n, Args... args)&gt;</a>
@@ -565,9 +564,9 @@ $(function() {
 , <a class="el" href="classtvm_1_1TypeMutator.html#a84e824911927d98e20a338eab8b75a45">tvm::TypeMutator</a>
 </li>
 <li>VisitType_()
-: <a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html#a949688cb9b4a8fa16ddc3e5fbbf13580">tvm::TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a>
-, <a class="el" href="classtvm_1_1TypeMutator.html#a18a04668d3fb464d957f3a26a4274104">tvm::TypeMutator</a>
-, <a class="el" href="classtvm_1_1TypeVisitor.html#a82c83b1524502579f56d194138badd3e">tvm::TypeVisitor</a>
+: <a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html#a2dfb9ba3f052f560506c12ddcb6040d8">tvm::TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a>
+, <a class="el" href="classtvm_1_1TypeMutator.html#a76915eced75e531e0ca73ad6882bbaae">tvm::TypeMutator</a>
+, <a class="el" href="classtvm_1_1TypeVisitor.html#a2d6a319537d4d3dba04054f3ef8f32f9">tvm::TypeVisitor</a>
 </li>
 <li>VisitTypeDefault_()
 : <a class="el" href="classtvm_1_1TypeFunctor_3_01R_07const_01Type_01_6n_00_01Args_8_8_8_08_4.html#a91553f9e04c39b3821a70ae4f7b0c597">tvm::TypeFunctor&lt; R(const Type &amp;n, Args...)&gt;</a>
diff --git a/docs/reference/api/doxygen/functions_vars.html b/docs/reference/api/doxygen/functions_vars.html
index 26b90f985..30f917ecd 100644
--- a/docs/reference/api/doxygen/functions_vars.html
+++ b/docs/reference/api/doxygen/functions_vars.html
@@ -245,7 +245,6 @@ $(function() {
 , <a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html#a28e948c2dd9aa4d4809fabdd953dc33b">tvm::instrument::PassInstrumentNode</a>
 , <a class="el" href="classtvm_1_1IntImmNode.html#a4839f79367838a2baf94f33e6a98f072">tvm::IntImmNode</a>
 , <a class="el" href="classtvm_1_1IRModuleNode.html#a6437f77d18cf9a45f2c183d050605d15">tvm::IRModuleNode</a>
-, <a class="el" href="classtvm_1_1LinkedParamNode.html#a0fcaf48a2f8251d405730bd59fa16f4b">tvm::LinkedParamNode</a>
 , <a class="el" href="classtvm_1_1MemoryInfoNode.html#a679e07b35ff1da70063e3146e7cbb9dd">tvm::MemoryInfoNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html#a755012568d85aa7cba250c5f8be766cc">tvm::meta_schedule::ApplyHistoryBestNode</a>
 , <a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfoNode.html#abde212d9062947bdbbdff5905dca87a3">tvm::meta_schedule::ArgInfoNode</a>
diff --git a/docs/reference/api/doxygen/functions_vars_i.html b/docs/reference/api/doxygen/functions_vars_i.html
index 14405db51..3281647ac 100644
--- a/docs/reference/api/doxygen/functions_vars_i.html
+++ b/docs/reference/api/doxygen/functions_vars_i.html
@@ -61,9 +61,6 @@ $(function() {
 &#160;
 
 <h3><a id="index_i"></a>- i -</h3><ul>
-<li>id
-: <a class="el" href="classtvm_1_1LinkedParamNode.html#a6000f7f468b8db072935053a1ac1fbf4">tvm::LinkedParamNode</a>
-</li>
 <li>id_index
 : <a class="el" href="structtvm_1_1relay_1_1GetValidCountsAttrs.html#ac389b60b8ef5e90becba282516860c8e">tvm::relay::GetValidCountsAttrs</a>
 , <a class="el" href="structtvm_1_1relay_1_1NonMaximumSuppressionAttrs.html#a30440265e31b996c01f9732be77156cb">tvm::relay::NonMaximumSuppressionAttrs</a>
diff --git a/docs/reference/api/doxygen/functions_vars_p.html b/docs/reference/api/doxygen/functions_vars_p.html
index e2eb1d8f5..59a4f8e5e 100644
--- a/docs/reference/api/doxygen/functions_vars_p.html
+++ b/docs/reference/api/doxygen/functions_vars_p.html
@@ -116,9 +116,6 @@ $(function() {
 <li>paddings
 : <a class="el" href="structtvm_1_1relay_1_1SpaceToBatchNDAttrs.html#aabc579d65229d49279a1c3a903a99095">tvm::relay::SpaceToBatchNDAttrs</a>
 </li>
-<li>param
-: <a class="el" href="classtvm_1_1LinkedParamNode.html#a96d730a027c9e169786f3aaea2e4cc10">tvm::LinkedParamNode</a>
-</li>
 <li>param_device_indexes
 : <a class="el" href="structtvm_1_1runtime_1_1vm_1_1VMFunction.html#afff8cae6bf6100376c4275b301a11828">tvm::runtime::vm::VMFunction</a>
 </li>
diff --git a/docs/reference/api/doxygen/hierarchy.html b/docs/reference/api/doxygen/hierarchy.html
index c037274db..f75506105 100644
--- a/docs/reference/api/doxygen/hierarchy.html
+++ b/docs/reference/api/doxygen/hierarchy.html
@@ -609,201 +609,200 @@ This inheritance list is sorted roughly, but not completely, alphabetically:</di
 <tr id="row_98_37_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1GenericFuncNode.html" target="_self">tvm::GenericFuncNode</a></td><td class="desc">Represents a generic function that can be specialized on a per-target basis </td></tr>
 <tr id="row_98_38_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html" target="_self">tvm::instrument::PassInstrumentNode</a></td><td class="desc"><a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html" title="PassInstrumentNode forms an instrument implementation. It provides API for users to register  [...]
 <tr id="row_98_39_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1IRModuleNode.html" target="_self">tvm::IRModuleNode</a></td><td class="desc"><a class="el" href="classtvm_1_1IRModule.html" title="Managed reference class to IRModuleNode. ">IRModule</a> that holds functions and type definitions </td></tr>
-<tr id="row_98_40_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1LinkedParamNode.html" target="_self">tvm::LinkedParamNode</a></td><td class="desc">Describes one parameter that should be linked into the generated module </td></tr>
-<tr id="row_98_41_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1MemoryInfoNode.html" target="_self">tvm::MemoryInfoNode</a></td><td class="desc">Memory information of special memory region. Use <a class="el" href="classtvm_1_1MemoryInfo.html" title="Defines memory info. ">MemoryInfo</a> as its container type </td></tr>
-<tr id="row_98_42_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html" target="_self">tvm::meta_schedule::ApplyHistoryBestNode</a></td><td class="desc">An integration context that allows application of historically best records from a database </td></tr>
-<tr id="row_98_43_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_43_" class="arrow" onclick="toggleFolder('98_43_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfoNode.html" target="_self">tvm::meta_schedule::ArgInfoNode</a></td><td class="desc">The argument information </td></tr>
-<tr id="row_98_43_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TensorInfoNode.html" target="_self">tvm::meta_schedule::TensorInfoNode</a></td><td class="desc">The tensor argument information </td></tr>
-<tr id="row_98_44_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html" target="_self">tvm::meta_schedule::BuilderInputNode</a></td><td class="desc">The builder's input, containing an <a class="el" href="classtvm_1_1IRModule.html" title="Managed reference class to IRModuleNode. ">IRModule</a> and the target </td></tr>
-<tr id="row_98_45_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_45_" class="arrow" onclick="toggleFolder('98_45_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderNode.html" target="_self">tvm::meta_schedule::BuilderNode</a></td><td class="desc">The abstract builder interface </td></tr>
-<tr id="row_98_45_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyBuilderNode.html" target="_self">tvm::meta_schedule::PyBuilderNode</a></td><td class="desc">An abstract builder with customized build method on the python-side </td></tr>
-<tr id="row_98_46_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderResultNode.html" target="_self">tvm::meta_schedule::BuilderResultNode</a></td><td class="desc">The builder's output, containing the artifact path or error message if any </td></tr>
-<tr id="row_98_47_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_47_" class="arrow" onclick="toggleFolder('98_47_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1CostModelNode.html" target="_self">tvm::meta_schedule::CostModelNode</a></td><td class="desc">Cost model </td></tr>
-<tr id="row_98_47_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyCostModelNode.html" target="_self">tvm::meta_schedule::PyCostModelNode</a></td><td class="desc">The cost model with customized methods on the python-side </td></tr>
-<tr id="row_98_48_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_48_" class="arrow" onclick="toggleFolder('98_48_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1DatabaseNode.html" target="_self">tvm::meta_schedule::DatabaseNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_48_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyDatabaseNode.html" target="_self">tvm::meta_schedule::PyDatabaseNode</a></td><td class="desc">The database with customized methods on the python-side </td></tr>
-<tr id="row_98_49_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html" target="_self">tvm::meta_schedule::ExtractedTaskNode</a></td><td class="desc">A tuning task extracted from the high-level IR </td></tr>
-<tr id="row_98_50_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_50_" class="arrow" onclick="toggleFolder('98_50_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1FeatureExtractorNode.html" target="_self">tvm::meta_schedule::FeatureExtractorNode</a></td><td class="desc">Extractor for features from measure candidates for use in cost model </td></tr>
-<tr id="row_98_50_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyFeatureExtractorNode.html" target="_self">tvm::meta_schedule::PyFeatureExtractorNode</a></td><td class="desc">The feature extractor with customized methods on the python-side </td></tr>
-<tr id="row_98_51_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_51_" class="arrow" onclick="toggleFolder('98_51_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCallbackNode.html" target="_self">tvm::meta_schedule::MeasureCallbackNode</a></td><td class="desc">Rules to apply after measure results is available </td></tr>
-<tr id="row_98_51_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyMeasureCallbackNode.html" target="_self">tvm::meta_schedule::PyMeasureCallbackNode</a></td><td class="desc">The measure callback with customized methods on the python-side </td></tr>
-<tr id="row_98_52_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html" target="_self">tvm::meta_schedule::MeasureCandidateNode</a></td><td class="desc">The schedule (with input shapes) to be measured </td></tr>
-<tr id="row_98_53_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_53_" class="arrow" onclick="toggleFolder('98_53_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MutatorNode.html" target="_self">tvm::meta_schedule::MutatorNode</a></td><td class="desc"><a class="el" href="classtvm_1_1meta__schedule_1_1Mutator.html" title="Managed reference to Mut [...]
-<tr id="row_98_53_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyMutatorNode.html" target="_self">tvm::meta_schedule::PyMutatorNode</a></td><td class="desc">The mutator with customized methods on the python-side </td></tr>
-<tr id="row_98_54_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_54_" class="arrow" onclick="toggleFolder('98_54_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PostprocNode.html" target="_self">tvm::meta_schedule::PostprocNode</a></td><td class="desc">Rules to apply a postprocessor to a schedule </td></tr>
-<tr id="row_98_54_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyPostprocNode.html" target="_self">tvm::meta_schedule::PyPostprocNode</a></td><td class="desc">The postprocessor with customized methods on the python-side </td></tr>
-<tr id="row_98_55_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerFutureNode.html" target="_self">tvm::meta_schedule::RunnerFutureNode</a></td><td class="desc">A class to asynchronously fetch runner's output </td></tr>
-<tr id="row_98_56_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerInputNode.html" target="_self">tvm::meta_schedule::RunnerInputNode</a></td><td class="desc"><a class="el" href="classtvm_1_1meta__schedule_1_1Runner.html" title="Managed reference to RunnerNode. ">Runner</a>'s input containing path of artifact, type of device an [...]
-<tr id="row_98_57_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_57_" class="arrow" onclick="toggleFolder('98_57_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerNode.html" target="_self">tvm::meta_schedule::RunnerNode</a></td><td class="desc">The abstract runner interface </td></tr>
-<tr id="row_98_57_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyRunnerNode.html" target="_self">tvm::meta_schedule::PyRunnerNode</a></td><td class="desc">An abstract runner with customized build method on the python-side </td></tr>
-<tr id="row_98_58_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerResultNode.html" target="_self">tvm::meta_schedule::RunnerResultNode</a></td><td class="desc"><a class="el" href="classtvm_1_1meta__schedule_1_1Runner.html" title="Managed reference to RunnerNode. ">Runner</a>'s output containing measurement result of <a class=" [...]
-<tr id="row_98_59_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_59_" class="arrow" onclick="toggleFolder('98_59_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ScheduleRuleNode.html" target="_self">tvm::meta_schedule::ScheduleRuleNode</a></td><td class="desc">Rules to modify a block in a schedule </td></tr>
-<tr id="row_98_59_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyScheduleRuleNode.html" target="_self">tvm::meta_schedule::PyScheduleRuleNode</a></td><td class="desc">The schedule rule with customized methods on the python-side </td></tr>
-<tr id="row_98_60_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_60_" class="arrow" onclick="toggleFolder('98_60_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1SearchStrategyNode.html" target="_self">tvm::meta_schedule::SearchStrategyNode</a></td><td class="desc">The search strategy for measure candidates generation </td></tr>
-<tr id="row_98_60_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PySearchStrategyNode.html" target="_self">tvm::meta_schedule::PySearchStrategyNode</a></td><td class="desc">The python side customizable class for measure candidate generation </td></tr>
-<tr id="row_98_61_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_61_" class="arrow" onclick="toggleFolder('98_61_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1SpaceGeneratorNode.html" target="_self">tvm::meta_schedule::SpaceGeneratorNode</a></td><td class="desc">The abstract class for design space generation </td></tr>
-<tr id="row_98_61_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PySpaceGeneratorNode.html" target="_self">tvm::meta_schedule::PySpaceGeneratorNode</a></td><td class="desc">The design space generator with customized methods on the python-side </td></tr>
-<tr id="row_98_62_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_62_" class="arrow" onclick="toggleFolder('98_62_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TaskSchedulerNode.html" target="_self">tvm::meta_schedule::TaskSchedulerNode</a></td><td class="desc">The abstract interface of task schedulers </td></tr>
-<tr id="row_98_62_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyTaskSchedulerNode.html" target="_self">tvm::meta_schedule::PyTaskSchedulerNode</a></td><td class="desc">The task scheduler with customized methods on the python-side </td></tr>
-<tr id="row_98_63_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TuneContextNode.html" target="_self">tvm::meta_schedule::TuneContextNode</a></td><td class="desc">The auto tuning context </td></tr>
-<tr id="row_98_64_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TuningRecordNode.html" target="_self">tvm::meta_schedule::TuningRecordNode</a></td><td class="desc">The class of tuning records </td></tr>
-<tr id="row_98_65_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1WorkloadNode.html" target="_self">tvm::meta_schedule::WorkloadNode</a></td><td class="desc">A workload, i.e. an <a class="el" href="classtvm_1_1IRModule.html" title="Managed reference class to IRModuleNode. ">IRModule</a> and its structural hash </td></tr>
-<tr id="row_98_66_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1parser_1_1SourceMapNode.html" target="_self">tvm::parser::SourceMapNode</a></td><td class="desc">Stores locations in frontend source that generated a node </td></tr>
-<tr id="row_98_67_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1parser_1_1SourceNode.html" target="_self">tvm::parser::SourceNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_68_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1PoolInfoNode.html" target="_self">tvm::PoolInfoNode</a></td><td class="desc">Describes a pool of memory accessible by one or more targets </td></tr>
-<tr id="row_98_69_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RangeNode.html" target="_self">tvm::RangeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Range.html" title="Range constainer. ">Range</a> over one dimension </td></tr>
-<tr id="row_98_70_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ClauseNode.html" target="_self">tvm::relay::ClauseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Clause.html">Clause</a> container node </td></tr>
-<tr id="row_98_71_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1relay_1_1ConstructorValueObj.html" target="_self">tvm::relay::ConstructorValueObj</a></td><td class="desc"></td></tr>
-<tr id="row_98_72_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DFPatternCallbackNode.html" target="_self">tvm::relay::DFPatternCallbackNode</a></td><td class="desc">Base type of all dataflow pattern callbacks </td></tr>
-<tr id="row_98_73_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_73_" class="arrow" onclick="toggleFolder('98_73_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DFPatternNode.html" target="_self">tvm::relay::DFPatternNode</a></td><td class="desc">Base type of all dataflow patterns </td></tr>
-<tr id="row_98_73_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1AltPatternNode.html" target="_self">tvm::relay::AltPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Alternate Expressions </td></tr>
-<tr id="row_98_73_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1AttrPatternNode.html" target="_self">tvm::relay::AttrPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Attributes </td></tr>
-<tr id="row_98_73_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1CallPatternNode.html" target="_self">tvm::relay::CallPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1CallPattern.html">CallPattern</a> container </td></tr>
-<tr id="row_98_73_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ConstantPatternNode.html" target="_self">tvm::relay::ConstantPatternNode</a></td><td class="desc">Container for <a class="el" href="classtvm_1_1relay_1_1Constant.html">Constant</a> </td></tr>
-<tr id="row_98_73_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DataTypePatternNode.html" target="_self">tvm::relay::DataTypePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Types </td></tr>
-<tr id="row_98_73_5_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DominatorPatternNode.html" target="_self">tvm::relay::DominatorPatternNode</a></td><td class="desc">Dominated Graph <a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> <a class="el" href="cla [...]
-<tr id="row_98_73_6_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ExprPatternNode.html" target="_self">tvm::relay::ExprPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Relay Expression </td></tr>
-<tr id="row_98_73_7_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1FunctionPatternNode.html" target="_self">tvm::relay::FunctionPatternNode</a></td><td class="desc">Relay <a class="el" href="classtvm_1_1relay_1_1Function.html" title="Managed reference to FunctionNode. ">Function</a> container </td></tr>
-<tr id="row_98_73_8_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1IfPatternNode.html" target="_self">tvm::relay::IfPatternNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_73_9_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1LetPatternNode.html" target="_self">tvm::relay::LetPatternNode</a></td><td class="desc">A binding of a sub-network </td></tr>
-<tr id="row_98_73_10_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ShapePatternNode.html" target="_self">tvm::relay::ShapePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Shapes </td></tr>
-<tr id="row_98_73_11_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TupleGetItemPatternNode.html" target="_self">tvm::relay::TupleGetItemPatternNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_73_12_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TuplePatternNode.html" target="_self">tvm::relay::TuplePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Tuple.html">Tuple</a> container </td></tr>
-<tr id="row_98_73_13_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TypePatternNode.html" target="_self">tvm::relay::TypePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Types </td></tr>
-<tr id="row_98_73_14_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1VarPatternNode.html" target="_self">tvm::relay::VarPatternNode</a></td><td class="desc">Container for <a class="el" href="classtvm_1_1relay_1_1Var.html">Var</a> </td></tr>
-<tr id="row_98_73_15_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1WildcardPatternNode.html" target="_self">tvm::relay::WildcardPatternNode</a></td><td class="desc">Wildcard <a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> </td></tr>
-<tr id="row_98_74_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ExecutorNode.html" target="_self">tvm::relay::ExecutorNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Executor.html" title="Managed reference class to ExecutorNode. ">Executor</a> information </td></tr>
-<tr id="row_98_75_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1IdNode.html" target="_self">tvm::relay::IdNode</a></td><td class="desc">The unique identifier of variables </td></tr>
-<tr id="row_98_76_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpImplementationNode.html" target="_self">tvm::relay::OpImplementationNode</a></td><td class="desc">Operator implementation that includes compute and schedule function </td></tr>
-<tr id="row_98_77_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpSpecializationNode.html" target="_self">tvm::relay::OpSpecializationNode</a></td><td class="desc">Specialized implementations for operators under certain conditions </td></tr>
-<tr id="row_98_78_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpStrategyNode.html" target="_self">tvm::relay::OpStrategyNode</a></td><td class="desc">Operator strategy to choose implementation </td></tr>
-<tr id="row_98_79_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RecClosureObj.html" target="_self">tvm::relay::RecClosureObj</a></td><td class="desc">The container type of <a class="el" href="classtvm_1_1relay_1_1RecClosure.html">RecClosure</a> </td></tr>
-<tr id="row_98_80_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1relay_1_1RefValueObj.html" target="_self">tvm::relay::RefValueObj</a></td><td class="desc"></td></tr>
-<tr id="row_98_81_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_81_" class="arrow" onclick="toggleFolder('98_81_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RelayNode.html" target="_self">tvm::relay::RelayNode</a></td><td class="desc">This is the base node container of all relay structures </td></tr>
-<tr id="row_98_81_0_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_81_0_" class="arrow" onclick="toggleFolder('98_81_0_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternNode.html" target="_self">tvm::relay::PatternNode</a></td><td class="desc">Base type for declaring relay pattern </td></tr>
-<tr id="row_98_81_0_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternConstructorNode.html" target="_self">tvm::relay::PatternConstructorNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> container node </td></tr>
-<tr id="row_98_81_0_1_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternTupleNode.html" target="_self">tvm::relay::PatternTupleNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> container node </td></tr>
-<tr id="row_98_81_0_2_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternVarNode.html" target="_self">tvm::relay::PatternVarNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> container node </td></tr>
-<tr id="row_98_81_0_3_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternWildcardNode.html" target="_self">tvm::relay::PatternWildcardNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternWildcard.html">PatternWildcard</a> container node </td></tr>
-<tr id="row_98_82_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RuntimeNode.html" target="_self">tvm::relay::RuntimeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Runtime.html" title="Managed reference class to RuntimeNode. ">Runtime</a> information </td></tr>
-<tr id="row_98_83_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ADTObj.html" target="_self">tvm::runtime::ADTObj</a></td><td class="desc">An object representing a structure or enumeration </td></tr>
-<tr id="row_98_84_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ArrayNode.html" target="_self">tvm::runtime::ArrayNode</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Array.html" title="Array, container representing a contiguous sequence of ObjectRefs. ">Array</a> node content in array </td></tr>
-<tr id="row_98_85_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_85_" class="arrow" onclick="toggleFolder('98_85_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ClosureObj.html" target="_self">tvm::runtime::ClosureObj</a></td><td class="desc">An object representing a closure. This object is used by both the Relay VM and interpreter </td></tr>
-<tr id="row_98_85_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1InterpreterClosureObj.html" target="_self">tvm::relay::InterpreterClosureObj</a></td><td class="desc">The container type of Closures used by the interpreter </td></tr>
-<tr id="row_98_85_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1VMClosureObj.html" target="_self">tvm::runtime::vm::VMClosureObj</a></td><td class="desc">An object representing a vm closure </td></tr>
-<tr id="row_98_86_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_86_" class="arrow" onclick="toggleFolder('98_86_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1MapNode.html" target="_self">tvm::runtime::MapNode</a></td><td class="desc">Shared content of all specializations of hash map </td></tr>
-<tr id="row_98_86_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1DenseMapNode.html" target="_self">tvm::runtime::DenseMapNode</a></td><td class="desc">A specialization of hash map that implements the idea of array-based hash map. Another reference implementation can be found [1] </td></tr>
-<tr id="row_98_86_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1SmallMapNode.html" target="_self">tvm::runtime::SmallMapNode</a></td><td class="desc">A specialization of small-sized hash map </td></tr>
-<tr id="row_98_87_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_87_" class="arrow" onclick="toggleFolder('98_87_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataBaseNode.html" target="_self">tvm::runtime::metadata::MetadataBaseNode</a></td><td class="desc">Common base class for all Metadata </td></tr>
-<tr id="row_98_87_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataArrayNode.html" target="_self">tvm::runtime::metadata::MetadataArrayNode</a></td><td class="desc">Container for arrays in the metadata </td></tr>
-<tr id="row_98_88_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_88_" class="arrow" onclick="toggleFolder('98_88_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ModuleNode.html" target="_self">tvm::runtime::ModuleNode</a></td><td class="desc">Base container of module </td></tr>
-<tr id="row_98_88_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1Executable.html" target="_self">tvm::runtime::vm::Executable</a></td><td class="desc">The executable emitted by the VM compiler </td></tr>
-<tr id="row_98_88_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1VirtualMachine.html" target="_self">tvm::runtime::vm::VirtualMachine</a></td><td class="desc">The virtual machine </td></tr>
-<tr id="row_98_89_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1NDArray_1_1Container.html" target="_self">tvm::runtime::NDArray::Container</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Object.html" title="base class of all object containers. ">Object</a> container class that backs <a class="el" href="classtvm_1_1run [...]
-<tr id="row_98_90_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_90_" class="arrow" onclick="toggleFolder('98_90_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1PackedFuncObj.html" target="_self">tvm::runtime::PackedFuncObj</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Object.html" title="base class of all object containers. ">Ob [...]
-<tr id="row_98_90_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1PackedFuncSubObj.html" target="_self">tvm::runtime::PackedFuncSubObj&lt; TCallable &gt;</a></td><td class="desc">Derived object class for constructing <a class="el" href="classtvm_1_1runtime_1_1PackedFuncObj.html" title="Object container class that backs PackedFunc. ">Pack [...]
-<tr id="row_98_91_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1CountNode.html" target="_self">tvm::runtime::profiling::CountNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_92_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1runtime_1_1profiling_1_1DeviceWrapperNode.html" target="_self">tvm::runtime::profiling::DeviceWrapperNode</a></td><td class="desc">Wrapper for <code>Device</code> because <code>Device</code> is not passable across the <a class="el" href="classtvm_1_1runtime_1_1PackedFunc.html" title=" [...]
-<tr id="row_98_93_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1DurationNode.html" target="_self">tvm::runtime::profiling::DurationNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_94_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1MetricCollectorNode.html" target="_self">tvm::runtime::profiling::MetricCollectorNode</a></td><td class="desc">Interface for user defined profiling metric collection </td></tr>
-<tr id="row_98_95_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1PercentNode.html" target="_self">tvm::runtime::profiling::PercentNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_96_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1ReportNode.html" target="_self">tvm::runtime::profiling::ReportNode</a></td><td class="desc">Data collected from a profiling run. Includes per-call metrics and per-device metrics </td></tr>
-<tr id="row_98_97_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_97_" class="arrow" onclick="toggleFolder('98_97_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ShapeTupleObj.html" target="_self">tvm::runtime::ShapeTupleObj</a></td><td class="desc">An object representing a shape tuple </td></tr>
-<tr id="row_98_97_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ShapeTupleObj_1_1FromStd.html" target="_self">tvm::runtime::ShapeTupleObj::FromStd</a></td><td class="desc">An object representing shape tuple moved from std::vector </td></tr>
-<tr id="row_98_98_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_98_" class="arrow" onclick="toggleFolder('98_98_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1StringObj.html" target="_self">tvm::runtime::StringObj</a></td><td class="desc">An object representing string. It's POD type </td></tr>
-<tr id="row_98_98_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1StringObj_1_1FromStd.html" target="_self">tvm::runtime::StringObj::FromStd</a></td><td class="desc">An object representing string moved from std::string </td></tr>
-<tr id="row_98_99_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1TimerNode.html" target="_self">tvm::runtime::TimerNode</a></td><td class="desc">Base class for all implementations </td></tr>
-<tr id="row_98_100_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1StorageObj.html" target="_self">tvm::runtime::vm::StorageObj</a></td><td class="desc">An object representing a storage allocation </td></tr>
-<tr id="row_98_101_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SourceNameNode.html" target="_self">tvm::SourceNameNode</a></td><td class="desc">The name of a source fragment </td></tr>
-<tr id="row_98_102_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SpanNode.html" target="_self">tvm::SpanNode</a></td><td class="desc">Stores locations in frontend source that generated a node </td></tr>
-<tr id="row_98_103_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindNode.html" target="_self">tvm::TargetKindNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Target.html" title="Managed reference class to TargetNode. ">Target</a> kind, specifies the kind of the target </td></tr>
-<tr id="row_98_104_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetNode.html" target="_self">tvm::TargetNode</a></td><td class="desc">Compilation target </td></tr>
-<tr id="row_98_105_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTagNode.html" target="_self">tvm::TargetTagNode</a></td><td class="desc">A target tag </td></tr>
-<tr id="row_98_106_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1IterVarAttrNode.html" target="_self">tvm::te::IterVarAttrNode</a></td><td class="desc">Node container for IterVar attr </td></tr>
-<tr id="row_98_107_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_107_" class="arrow" onclick="toggleFolder('98_107_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1IterVarRelationNode.html" target="_self">tvm::te::IterVarRelationNode</a></td><td class="desc">Base node of iteration var </td></tr>
-<tr id="row_98_107_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1FuseNode.html" target="_self">tvm::te::FuseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Fuse.html" title="Managed reference to FuseNode. ">Fuse</a> two domains into one domain </td></tr>
-<tr id="row_98_107_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1RebaseNode.html" target="_self">tvm::te::RebaseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Rebase.html" title="Managed reference to RebaseNode. ">Rebase</a> the iteration to make min to be 0. This is useful to normalize the <a class="el" href="classtvm_ [...]
-<tr id="row_98_107_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1SingletonNode.html" target="_self">tvm::te::SingletonNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Singleton.html" title="Managed reference to SingletonNode. ">Singleton</a> iterator [0, 1) </td></tr>
-<tr id="row_98_107_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1SplitNode.html" target="_self">tvm::te::SplitNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Split.html" title="Managed reference to SplitNode. ">Split</a> the parent domain into product of outer and iter </td></tr>
-<tr id="row_98_107_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TransformNode.html" target="_self">tvm::te::TransformNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Transform.html">Transform</a> iterator according to some arbitrary expression </td></tr>
-<tr id="row_98_108_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_108_" class="arrow" onclick="toggleFolder('98_108_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1OperationNode.html" target="_self">tvm::te::OperationNode</a></td><td class="desc">Base class of all operation nodes </td></tr>
-<tr id="row_98_108_0_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_108_0_" class="arrow" onclick="toggleFolder('98_108_0_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1BaseComputeOpNode.html" target="_self">tvm::te::BaseComputeOpNode</a></td><td class="desc">A Compute op that compute a tensor on certain domain. This is the base class for <a class="el" hr [...]
-<tr id="row_98_108_0_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ComputeOpNode.html" target="_self">tvm::te::ComputeOpNode</a></td><td class="desc">A Compute op that compute a tensor on certain domain </td></tr>
-<tr id="row_98_108_0_1_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorComputeOpNode.html" target="_self">tvm::te::TensorComputeOpNode</a></td><td class="desc">A TenorCompute op that compute a tensor with an tensor intrinsic </td></tr>
-<tr id="row_98_108_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ExternOpNode.html" target="_self">tvm::te::ExternOpNode</a></td><td class="desc">External computation that cannot be splitted </td></tr>
-<tr id="row_98_108_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1HybridOpNode.html" target="_self">tvm::te::HybridOpNode</a></td><td class="desc">A computation operator that generated by hybrid script </td></tr>
-<tr id="row_98_108_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1PlaceholderOpNode.html" target="_self">tvm::te::PlaceholderOpNode</a></td><td class="desc">A placeholder op represents an input placeholder </td></tr>
-<tr id="row_98_108_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ScanOpNode.html" target="_self">tvm::te::ScanOpNode</a></td><td class="desc">Symbolic scan </td></tr>
-<tr id="row_98_109_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ScheduleNode.html" target="_self">tvm::te::ScheduleNode</a></td><td class="desc">Node container for schedule </td></tr>
-<tr id="row_98_110_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1SpecializedConditionNode.html" target="_self">tvm::te::SpecializedConditionNode</a></td><td class="desc">Container for specialization conditions </td></tr>
-<tr id="row_98_111_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1StageNode.html" target="_self">tvm::te::StageNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Stage.html" title="Stage, contains scheduling for a stage of computation. ">Stage</a> </td></tr>
-<tr id="row_98_112_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorIntrinCallNode.html" target="_self">tvm::te::TensorIntrinCallNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_113_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorIntrinNode.html" target="_self">tvm::te::TensorIntrinNode</a></td><td class="desc">Node to represent a <a class="el" href="classtvm_1_1te_1_1Tensor.html" title="Tensor structure representing a possible input, or intermediate computation result. ">Tensor</a> intrinsic opera [...]
-<tr id="row_98_114_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BijectiveLayoutNode.html" target="_self">tvm::tir::BijectiveLayoutNode</a></td><td class="desc"></td></tr>
-<tr id="row_98_115_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockRVNode.html" target="_self">tvm::tir::BlockRVNode</a></td><td class="desc">A random variable that evaluates to a TensorIR block </td></tr>
-<tr id="row_98_116_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockScopeNode.html" target="_self">tvm::tir::BlockScopeNode</a></td><td class="desc">An object with 1-to-1 correspondence with each block reference in the sref tree. This data structure is used to track the producer-consumer dependencies between blocks. <a class="el" href="cla [...]
-<tr id="row_98_117_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferNode.html" target="_self">tvm::tir::BufferNode</a></td><td class="desc">Node to represent a buffer </td></tr>
-<tr id="row_98_118_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferRegionNode.html" target="_self">tvm::tir::BufferRegionNode</a></td><td class="desc">Representing the region of multi-dimensional buffer access </td></tr>
-<tr id="row_98_119_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1CommReducerNode.html" target="_self">tvm::tir::CommReducerNode</a></td><td class="desc">A commutative reducer node to represent a commutative binary operator with identity element </td></tr>
-<tr id="row_98_120_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_120_" class="arrow" onclick="toggleFolder('98_120_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1DataProducerNode.html" target="_self">tvm::tir::DataProducerNode</a></td><td class="desc">Base node for data producers </td></tr>
-<tr id="row_98_120_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorNode.html" target="_self">tvm::te::TensorNode</a></td><td class="desc">Node to represent a tensor </td></tr>
-<tr id="row_98_121_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1DependencyNode.html" target="_self">tvm::tir::DependencyNode</a></td><td class="desc">A tuple (src, dst, kind) representing certain types of dependency. <a class="el" href="classtvm_1_1tir_1_1For.html" title="Managed reference to ForNode. ">For</a> example, (A, B, kRAW) means b [...]
-<tr id="row_98_122_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IndexMapNode.html" target="_self">tvm::tir::IndexMapNode</a></td><td class="desc">Defines a mapping between two representations of indices into a buffer </td></tr>
-<tr id="row_98_123_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1InstructionKindNode.html" target="_self">tvm::tir::InstructionKindNode</a></td><td class="desc">Kind of an instruction, e.g. Split, Reorder, etc. Besides the name, every kind of instruction has its own properties, including: 1) A boolean indicating if the instruction is pure, i [...]
-<tr id="row_98_124_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1InstructionNode.html" target="_self">tvm::tir::InstructionNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Schedule.html" title="Managed reference to ScheduleNode. ">Schedule</a> instructions each corresponds to a schedule primitive </td></tr>
-<tr id="row_98_125_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IterVarNode.html" target="_self">tvm::tir::IterVarNode</a></td><td class="desc">An iteration variable representing an iteration over a one dimensional interval </td></tr>
-<tr id="row_98_126_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1LayoutNode.html" target="_self">tvm::tir::LayoutNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Layout.html" title="Managed reference to LayoutNode. ">Layout</a> is to describe how data is organized within an N-dimention tensor. It is composed of upper case [...]
-<tr id="row_98_127_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1LoopRVNode.html" target="_self">tvm::tir::LoopRVNode</a></td><td class="desc">A random variable that evaluates to a TensorIR for loop </td></tr>
-<tr id="row_98_128_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1MatchBufferRegionNode.html" target="_self">tvm::tir::MatchBufferRegionNode</a></td><td class="desc">Match introduces a constraint that the source buffer region can be remapped to the data layout specified by the buffer field. The constraint can be checked in later part of lower [...]
-<tr id="row_98_129_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html" target="_self">tvm::tir::ScheduleNode</a></td><td class="desc">The user-facing schedule class </td></tr>
-<tr id="row_98_130_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ScheduleStateNode.html" target="_self">tvm::tir::ScheduleStateNode</a></td><td class="desc">The state of scheduling, which exposes a <code>Replace</code> method as the primary interface for all the scheduling primitives to manipulate the TensorIR </td></tr>
-<tr id="row_98_131_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_131_" class="arrow" onclick="toggleFolder('98_131_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1StmtNode.html" target="_self">tvm::tir::StmtNode</a></td><td class="desc">Base node of all statements </td></tr>
-<tr id="row_98_131_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AllocateConstNode.html" target="_self">tvm::tir::AllocateConstNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Allocate.html" title="Managed reference to AllocateNode. ">Allocate</a> a buffer that can be used in body </td></tr>
-<tr id="row_98_131_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AllocateNode.html" target="_self">tvm::tir::AllocateNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Allocate.html" title="Managed reference to AllocateNode. ">Allocate</a> a buffer that can be used in body </td></tr>
-<tr id="row_98_131_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AssertStmtNode.html" target="_self">tvm::tir::AssertStmtNode</a></td><td class="desc">Assert condition, if an error occurs, return the error message </td></tr>
-<tr id="row_98_131_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AttrStmtNode.html" target="_self">tvm::tir::AttrStmtNode</a></td><td class="desc">Define certain auxiliary attribute for the body to be a symbolic value. This provide auxiliary information for IR passes that transforms body </td></tr>
-<tr id="row_98_131_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockNode.html" target="_self">tvm::tir::BlockNode</a></td><td class="desc">A block is a basic schedule unit in TIR </td></tr>
-<tr id="row_98_131_5_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockRealizeNode.html" target="_self">tvm::tir::BlockRealizeNode</a></td><td class="desc">A block realization node represents execution of the block at the binding values </td></tr>
-<tr id="row_98_131_6_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferRealizeNode.html" target="_self">tvm::tir::BufferRealizeNode</a></td><td class="desc">Annotate the region where the buffer need to be read and write in the body. We only need to allocate the space for the corresponding region </td></tr>
-<tr id="row_98_131_7_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferStoreNode.html" target="_self">tvm::tir::BufferStoreNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Store.html" title="Managed reference to StoreNode. ">Store</a> value to the high dimension buffer </td></tr>
-<tr id="row_98_131_8_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1EvaluateNode.html" target="_self">tvm::tir::EvaluateNode</a></td><td class="desc">Evaluates an expression. This is mostly used for putting a <a class="el" href="classtvm_1_1tir_1_1Call.html" title="Managed reference to CallNode. ">Call</a> node into <a class="el" href="classt [...]
-<tr id="row_98_131_9_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ForNode.html" target="_self">tvm::tir::ForNode</a></td><td class="desc">A for loop, with poissible type annotations </td></tr>
-<tr id="row_98_131_10_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IfThenElseNode.html" target="_self">tvm::tir::IfThenElseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1IfThenElse.html" title="Managed reference to IfThenElseNode. ">IfThenElse</a> statment </td></tr>
-<tr id="row_98_131_11_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1LetStmtNode.html" target="_self">tvm::tir::LetStmtNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Let.html" title="Managed reference to LetNode. ">Let</a> binding, bind var to value, then run body </td></tr>
-<tr id="row_98_131_12_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1PrefetchNode.html" target="_self">tvm::tir::PrefetchNode</a></td><td class="desc">A prefetch hint for a buffer </td></tr>
-<tr id="row_98_131_13_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ProducerRealizeNode.html" target="_self">tvm::tir::ProducerRealizeNode</a></td><td class="desc">Annotate the bounds where the data produced by the producer need to be written and read in body. We will need to allocate space for the corresponding regions </td></tr>
-<tr id="row_98_131_14_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ProducerStoreNode.html" target="_self">tvm::tir::ProducerStoreNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Store.html" title="Managed reference to StoreNode. ">Store</a> value into mult-dimensional array that will be read by the consumer of the produc [...]
-<tr id="row_98_131_15_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1SeqStmtNode.html" target="_self">tvm::tir::SeqStmtNode</a></td><td class="desc">The container of seq statement. Represent a sequence of statements </td></tr>
-<tr id="row_98_131_16_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1StoreNode.html" target="_self">tvm::tir::StoreNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Store.html" title="Managed reference to StoreNode. ">Store</a> value to the buffer </td></tr>
-<tr id="row_98_131_17_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1WhileNode.html" target="_self">tvm::tir::WhileNode</a></td><td class="desc">A <a class="el" href="classtvm_1_1tir_1_1While.html" title="Managed reference to WhileNode. ">While</a> loop </td></tr>
-<tr id="row_98_132_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1StmtSRefNode.html" target="_self">tvm::tir::StmtSRefNode</a></td><td class="desc">An object that refers to schedulable elements (block/for-loop) in TensorIR, aka "sref" </td></tr>
-<tr id="row_98_133_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1TensorIntrinNode.html" target="_self">tvm::tir::TensorIntrinNode</a></td><td class="desc">Tensor intrinsics for tensorization </td></tr>
-<tr id="row_98_134_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1TraceNode.html" target="_self">tvm::tir::TraceNode</a></td><td class="desc">An execution trace of a scheduling program </td></tr>
-<tr id="row_98_135_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1AllocatedPoolInfoNode.html" target="_self">tvm::tir::usmp::AllocatedPoolInfoNode</a></td><td class="desc">This object contains information post-allocation for <a class="el" href="classtvm_1_1PoolInfo.html">PoolInfo</a> objects </td></tr>
-<tr id="row_98_136_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1BufferInfoAnalysisNode.html" target="_self">tvm::tir::usmp::BufferInfoAnalysisNode</a></td><td class="desc">This is a composite node that is produced by extract_buffer_info analysis pass that contains useful global information that could be useful for memory planning a [...]
-<tr id="row_98_137_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1BufferInfoNode.html" target="_self">tvm::tir::usmp::BufferInfoNode</a></td><td class="desc">Describes an abstract memory buffer that will get allocated inside a pool. The actual memory buffer in represented by <a class="el" href="structtvm_1_1tir_1_1usmp_1_1PoolAllocat [...]
-<tr id="row_98_138_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1PoolAllocationNode.html" target="_self">tvm::tir::usmp::PoolAllocationNode</a></td><td class="desc">The pool allocation produced after the USMP algorithm </td></tr>
-<tr id="row_98_139_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1PassContextNode.html" target="_self">tvm::transform::PassContextNode</a></td><td class="desc"><a class="el" href="classtvm_1_1transform_1_1PassContextNode.html" title="PassContextNode contains the information that a pass can rely on, such as analysis results...">PassConte [...]
-<tr id="row_98_140_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1PassInfoNode.html" target="_self">tvm::transform::PassInfoNode</a></td><td class="desc">Meta data that will be used to help optimization and analysis </td></tr>
-<tr id="row_98_141_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_141_" class="arrow" onclick="toggleFolder('98_141_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1PassNode.html" target="_self">tvm::transform::PassNode</a></td><td class="desc"><a class="el" href="classtvm_1_1transform_1_1PassNode.html" title="PassNode is the base type of differnt ty [...]
-<tr id="row_98_141_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1SequentialNode.html" target="_self">tvm::transform::SequentialNode</a></td><td class="desc">The <a class="el" href="classtvm_1_1transform_1_1SequentialNode.html" title="The SequentialNode contains a set of passes that transform Relay programs from one AST to another sem [...]
-<tr id="row_98_142_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_142_" class="arrow" onclick="toggleFolder('98_142_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeNode.html" target="_self">tvm::TypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> is the base type of all types </td></tr>
-<tr id="row_98_142_0_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_142_0_" class="arrow" onclick="toggleFolder('98_142_0_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1BaseTensorTypeNode.html" target="_self">tvm::BaseTensorTypeNode</a></td><td class="desc">Base of all Tensor types This container can hold <a class="el" href="classtvm_1_1TensorType.html" title=" [...]
-<tr id="row_98_142_0_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorTypeNode.html" target="_self">tvm::TensorTypeNode</a></td><td class="desc">This is the most commonly used type in relay. <a class="el" href="classtvm_1_1TensorType.html" title="Managed reference to TensorTypeNode. ">TensorType</a> have a fixed dimension, data type </td></tr>
-<tr id="row_98_142_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1FuncTypeNode.html" target="_self">tvm::FuncTypeNode</a></td><td class="desc">Function type </td></tr>
-<tr id="row_98_142_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1GlobalTypeVarNode.html" target="_self">tvm::GlobalTypeVarNode</a></td><td class="desc">A global type variable that is used for defining new types or type aliases </td></tr>
-<tr id="row_98_142_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1IncompleteTypeNode.html" target="_self">tvm::IncompleteTypeNode</a></td><td class="desc">Intermediate values that is used to indicate incomplete type during type inference </td></tr>
-<tr id="row_98_142_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PointerTypeNode.html" target="_self">tvm::PointerTypeNode</a></td><td class="desc">Low-level raw pointer type </td></tr>
-<tr id="row_98_142_5_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimTypeNode.html" target="_self">tvm::PrimTypeNode</a></td><td class="desc">Primitive data types used in the low-level IR </td></tr>
-<tr id="row_98_142_6_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayRefTypeNode.html" target="_self">tvm::RelayRefTypeNode</a></td><td class="desc">Reference <a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> High-level Relay IR </td></tr>
-<tr id="row_98_142_7_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleTypeNode.html" target="_self">tvm::TupleTypeNode</a></td><td class="desc">The type of tuple values </td></tr>
-<tr id="row_98_142_8_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeCallNode.html" target="_self">tvm::TypeCallNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> function application </td></tr>
-<tr id="row_98_142_9_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_142_9_" class="arrow" onclick="toggleFolder('98_142_9_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeConstraintNode.html" target="_self">tvm::TypeConstraintNode</a></td><td class="desc">Potential Constraints in a function </td></tr>
-<tr id="row_98_142_9_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeRelationNode.html" target="_self">tvm::TypeRelationNode</a></td><td class="desc">User defined type relation, it is an input-output relation on types </td></tr>
-<tr id="row_98_142_10_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeDataNode.html" target="_self">tvm::TypeDataNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TypeData.html" title="Stores all data for an Algebraic Data Type (ADT). ">TypeData</a> container node </td></tr>
-<tr id="row_98_142_11_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVarNode.html" target="_self">tvm::TypeVarNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> parameter in functions </td></tr>
-<tr id="row_98_143_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeReporterNode.html" target="_self">tvm::TypeReporterNode</a></td><td class="desc">Reporter that reports back to the type resolution information </td></tr>
-<tr id="row_98_144_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1WorkspaceMemoryPoolsNode.html" target="_self">tvm::WorkspaceMemoryPoolsNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_40_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1MemoryInfoNode.html" target="_self">tvm::MemoryInfoNode</a></td><td class="desc">Memory information of special memory region. Use <a class="el" href="classtvm_1_1MemoryInfo.html" title="Defines memory info. ">MemoryInfo</a> as its container type </td></tr>
+<tr id="row_98_41_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html" target="_self">tvm::meta_schedule::ApplyHistoryBestNode</a></td><td class="desc">An integration context that allows application of historically best records from a database </td></tr>
+<tr id="row_98_42_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_42_" class="arrow" onclick="toggleFolder('98_42_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfoNode.html" target="_self">tvm::meta_schedule::ArgInfoNode</a></td><td class="desc">The argument information </td></tr>
+<tr id="row_98_42_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TensorInfoNode.html" target="_self">tvm::meta_schedule::TensorInfoNode</a></td><td class="desc">The tensor argument information </td></tr>
+<tr id="row_98_43_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html" target="_self">tvm::meta_schedule::BuilderInputNode</a></td><td class="desc">The builder's input, containing an <a class="el" href="classtvm_1_1IRModule.html" title="Managed reference class to IRModuleNode. ">IRModule</a> and the target </td></tr>
+<tr id="row_98_44_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_44_" class="arrow" onclick="toggleFolder('98_44_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderNode.html" target="_self">tvm::meta_schedule::BuilderNode</a></td><td class="desc">The abstract builder interface </td></tr>
+<tr id="row_98_44_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyBuilderNode.html" target="_self">tvm::meta_schedule::PyBuilderNode</a></td><td class="desc">An abstract builder with customized build method on the python-side </td></tr>
+<tr id="row_98_45_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderResultNode.html" target="_self">tvm::meta_schedule::BuilderResultNode</a></td><td class="desc">The builder's output, containing the artifact path or error message if any </td></tr>
+<tr id="row_98_46_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_46_" class="arrow" onclick="toggleFolder('98_46_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1CostModelNode.html" target="_self">tvm::meta_schedule::CostModelNode</a></td><td class="desc">Cost model </td></tr>
+<tr id="row_98_46_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyCostModelNode.html" target="_self">tvm::meta_schedule::PyCostModelNode</a></td><td class="desc">The cost model with customized methods on the python-side </td></tr>
+<tr id="row_98_47_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_47_" class="arrow" onclick="toggleFolder('98_47_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1DatabaseNode.html" target="_self">tvm::meta_schedule::DatabaseNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_47_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyDatabaseNode.html" target="_self">tvm::meta_schedule::PyDatabaseNode</a></td><td class="desc">The database with customized methods on the python-side </td></tr>
+<tr id="row_98_48_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html" target="_self">tvm::meta_schedule::ExtractedTaskNode</a></td><td class="desc">A tuning task extracted from the high-level IR </td></tr>
+<tr id="row_98_49_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_49_" class="arrow" onclick="toggleFolder('98_49_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1FeatureExtractorNode.html" target="_self">tvm::meta_schedule::FeatureExtractorNode</a></td><td class="desc">Extractor for features from measure candidates for use in cost model </td></tr>
+<tr id="row_98_49_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyFeatureExtractorNode.html" target="_self">tvm::meta_schedule::PyFeatureExtractorNode</a></td><td class="desc">The feature extractor with customized methods on the python-side </td></tr>
+<tr id="row_98_50_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_50_" class="arrow" onclick="toggleFolder('98_50_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCallbackNode.html" target="_self">tvm::meta_schedule::MeasureCallbackNode</a></td><td class="desc">Rules to apply after measure results is available </td></tr>
+<tr id="row_98_50_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyMeasureCallbackNode.html" target="_self">tvm::meta_schedule::PyMeasureCallbackNode</a></td><td class="desc">The measure callback with customized methods on the python-side </td></tr>
+<tr id="row_98_51_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html" target="_self">tvm::meta_schedule::MeasureCandidateNode</a></td><td class="desc">The schedule (with input shapes) to be measured </td></tr>
+<tr id="row_98_52_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_52_" class="arrow" onclick="toggleFolder('98_52_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MutatorNode.html" target="_self">tvm::meta_schedule::MutatorNode</a></td><td class="desc"><a class="el" href="classtvm_1_1meta__schedule_1_1Mutator.html" title="Managed reference to Mut [...]
+<tr id="row_98_52_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyMutatorNode.html" target="_self">tvm::meta_schedule::PyMutatorNode</a></td><td class="desc">The mutator with customized methods on the python-side </td></tr>
+<tr id="row_98_53_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_53_" class="arrow" onclick="toggleFolder('98_53_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PostprocNode.html" target="_self">tvm::meta_schedule::PostprocNode</a></td><td class="desc">Rules to apply a postprocessor to a schedule </td></tr>
+<tr id="row_98_53_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyPostprocNode.html" target="_self">tvm::meta_schedule::PyPostprocNode</a></td><td class="desc">The postprocessor with customized methods on the python-side </td></tr>
+<tr id="row_98_54_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerFutureNode.html" target="_self">tvm::meta_schedule::RunnerFutureNode</a></td><td class="desc">A class to asynchronously fetch runner's output </td></tr>
+<tr id="row_98_55_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerInputNode.html" target="_self">tvm::meta_schedule::RunnerInputNode</a></td><td class="desc"><a class="el" href="classtvm_1_1meta__schedule_1_1Runner.html" title="Managed reference to RunnerNode. ">Runner</a>'s input containing path of artifact, type of device an [...]
+<tr id="row_98_56_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_56_" class="arrow" onclick="toggleFolder('98_56_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerNode.html" target="_self">tvm::meta_schedule::RunnerNode</a></td><td class="desc">The abstract runner interface </td></tr>
+<tr id="row_98_56_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyRunnerNode.html" target="_self">tvm::meta_schedule::PyRunnerNode</a></td><td class="desc">An abstract runner with customized build method on the python-side </td></tr>
+<tr id="row_98_57_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerResultNode.html" target="_self">tvm::meta_schedule::RunnerResultNode</a></td><td class="desc"><a class="el" href="classtvm_1_1meta__schedule_1_1Runner.html" title="Managed reference to RunnerNode. ">Runner</a>'s output containing measurement result of <a class=" [...]
+<tr id="row_98_58_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_58_" class="arrow" onclick="toggleFolder('98_58_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ScheduleRuleNode.html" target="_self">tvm::meta_schedule::ScheduleRuleNode</a></td><td class="desc">Rules to modify a block in a schedule </td></tr>
+<tr id="row_98_58_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyScheduleRuleNode.html" target="_self">tvm::meta_schedule::PyScheduleRuleNode</a></td><td class="desc">The schedule rule with customized methods on the python-side </td></tr>
+<tr id="row_98_59_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_59_" class="arrow" onclick="toggleFolder('98_59_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1SearchStrategyNode.html" target="_self">tvm::meta_schedule::SearchStrategyNode</a></td><td class="desc">The search strategy for measure candidates generation </td></tr>
+<tr id="row_98_59_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PySearchStrategyNode.html" target="_self">tvm::meta_schedule::PySearchStrategyNode</a></td><td class="desc">The python side customizable class for measure candidate generation </td></tr>
+<tr id="row_98_60_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_60_" class="arrow" onclick="toggleFolder('98_60_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1SpaceGeneratorNode.html" target="_self">tvm::meta_schedule::SpaceGeneratorNode</a></td><td class="desc">The abstract class for design space generation </td></tr>
+<tr id="row_98_60_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PySpaceGeneratorNode.html" target="_self">tvm::meta_schedule::PySpaceGeneratorNode</a></td><td class="desc">The design space generator with customized methods on the python-side </td></tr>
+<tr id="row_98_61_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_61_" class="arrow" onclick="toggleFolder('98_61_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TaskSchedulerNode.html" target="_self">tvm::meta_schedule::TaskSchedulerNode</a></td><td class="desc">The abstract interface of task schedulers </td></tr>
+<tr id="row_98_61_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1PyTaskSchedulerNode.html" target="_self">tvm::meta_schedule::PyTaskSchedulerNode</a></td><td class="desc">The task scheduler with customized methods on the python-side </td></tr>
+<tr id="row_98_62_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TuneContextNode.html" target="_self">tvm::meta_schedule::TuneContextNode</a></td><td class="desc">The auto tuning context </td></tr>
+<tr id="row_98_63_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TuningRecordNode.html" target="_self">tvm::meta_schedule::TuningRecordNode</a></td><td class="desc">The class of tuning records </td></tr>
+<tr id="row_98_64_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1WorkloadNode.html" target="_self">tvm::meta_schedule::WorkloadNode</a></td><td class="desc">A workload, i.e. an <a class="el" href="classtvm_1_1IRModule.html" title="Managed reference class to IRModuleNode. ">IRModule</a> and its structural hash </td></tr>
+<tr id="row_98_65_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1parser_1_1SourceMapNode.html" target="_self">tvm::parser::SourceMapNode</a></td><td class="desc">Stores locations in frontend source that generated a node </td></tr>
+<tr id="row_98_66_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1parser_1_1SourceNode.html" target="_self">tvm::parser::SourceNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_67_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1PoolInfoNode.html" target="_self">tvm::PoolInfoNode</a></td><td class="desc">Describes a pool of memory accessible by one or more targets </td></tr>
+<tr id="row_98_68_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RangeNode.html" target="_self">tvm::RangeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Range.html" title="Range constainer. ">Range</a> over one dimension </td></tr>
+<tr id="row_98_69_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ClauseNode.html" target="_self">tvm::relay::ClauseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Clause.html">Clause</a> container node </td></tr>
+<tr id="row_98_70_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1relay_1_1ConstructorValueObj.html" target="_self">tvm::relay::ConstructorValueObj</a></td><td class="desc"></td></tr>
+<tr id="row_98_71_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DFPatternCallbackNode.html" target="_self">tvm::relay::DFPatternCallbackNode</a></td><td class="desc">Base type of all dataflow pattern callbacks </td></tr>
+<tr id="row_98_72_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_72_" class="arrow" onclick="toggleFolder('98_72_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DFPatternNode.html" target="_self">tvm::relay::DFPatternNode</a></td><td class="desc">Base type of all dataflow patterns </td></tr>
+<tr id="row_98_72_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1AltPatternNode.html" target="_self">tvm::relay::AltPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Alternate Expressions </td></tr>
+<tr id="row_98_72_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1AttrPatternNode.html" target="_self">tvm::relay::AttrPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Attributes </td></tr>
+<tr id="row_98_72_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1CallPatternNode.html" target="_self">tvm::relay::CallPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1CallPattern.html">CallPattern</a> container </td></tr>
+<tr id="row_98_72_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ConstantPatternNode.html" target="_self">tvm::relay::ConstantPatternNode</a></td><td class="desc">Container for <a class="el" href="classtvm_1_1relay_1_1Constant.html">Constant</a> </td></tr>
+<tr id="row_98_72_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DataTypePatternNode.html" target="_self">tvm::relay::DataTypePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Types </td></tr>
+<tr id="row_98_72_5_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DominatorPatternNode.html" target="_self">tvm::relay::DominatorPatternNode</a></td><td class="desc">Dominated Graph <a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> <a class="el" href="cla [...]
+<tr id="row_98_72_6_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ExprPatternNode.html" target="_self">tvm::relay::ExprPatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Relay Expression </td></tr>
+<tr id="row_98_72_7_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1FunctionPatternNode.html" target="_self">tvm::relay::FunctionPatternNode</a></td><td class="desc">Relay <a class="el" href="classtvm_1_1relay_1_1Function.html" title="Managed reference to FunctionNode. ">Function</a> container </td></tr>
+<tr id="row_98_72_8_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1IfPatternNode.html" target="_self">tvm::relay::IfPatternNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_72_9_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1LetPatternNode.html" target="_self">tvm::relay::LetPatternNode</a></td><td class="desc">A binding of a sub-network </td></tr>
+<tr id="row_98_72_10_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ShapePatternNode.html" target="_self">tvm::relay::ShapePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Shapes </td></tr>
+<tr id="row_98_72_11_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TupleGetItemPatternNode.html" target="_self">tvm::relay::TupleGetItemPatternNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_72_12_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TuplePatternNode.html" target="_self">tvm::relay::TuplePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Tuple.html">Tuple</a> container </td></tr>
+<tr id="row_98_72_13_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TypePatternNode.html" target="_self">tvm::relay::TypePatternNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> for Types </td></tr>
+<tr id="row_98_72_14_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1VarPatternNode.html" target="_self">tvm::relay::VarPatternNode</a></td><td class="desc">Container for <a class="el" href="classtvm_1_1relay_1_1Var.html">Var</a> </td></tr>
+<tr id="row_98_72_15_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1WildcardPatternNode.html" target="_self">tvm::relay::WildcardPatternNode</a></td><td class="desc">Wildcard <a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT match pattern in Relay. ">Pattern</a> </td></tr>
+<tr id="row_98_73_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ExecutorNode.html" target="_self">tvm::relay::ExecutorNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Executor.html" title="Managed reference class to ExecutorNode. ">Executor</a> information </td></tr>
+<tr id="row_98_74_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1IdNode.html" target="_self">tvm::relay::IdNode</a></td><td class="desc">The unique identifier of variables </td></tr>
+<tr id="row_98_75_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpImplementationNode.html" target="_self">tvm::relay::OpImplementationNode</a></td><td class="desc">Operator implementation that includes compute and schedule function </td></tr>
+<tr id="row_98_76_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpSpecializationNode.html" target="_self">tvm::relay::OpSpecializationNode</a></td><td class="desc">Specialized implementations for operators under certain conditions </td></tr>
+<tr id="row_98_77_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpStrategyNode.html" target="_self">tvm::relay::OpStrategyNode</a></td><td class="desc">Operator strategy to choose implementation </td></tr>
+<tr id="row_98_78_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RecClosureObj.html" target="_self">tvm::relay::RecClosureObj</a></td><td class="desc">The container type of <a class="el" href="classtvm_1_1relay_1_1RecClosure.html">RecClosure</a> </td></tr>
+<tr id="row_98_79_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1relay_1_1RefValueObj.html" target="_self">tvm::relay::RefValueObj</a></td><td class="desc"></td></tr>
+<tr id="row_98_80_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_80_" class="arrow" onclick="toggleFolder('98_80_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RelayNode.html" target="_self">tvm::relay::RelayNode</a></td><td class="desc">This is the base node container of all relay structures </td></tr>
+<tr id="row_98_80_0_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_80_0_" class="arrow" onclick="toggleFolder('98_80_0_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternNode.html" target="_self">tvm::relay::PatternNode</a></td><td class="desc">Base type for declaring relay pattern </td></tr>
+<tr id="row_98_80_0_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternConstructorNode.html" target="_self">tvm::relay::PatternConstructorNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> container node </td></tr>
+<tr id="row_98_80_0_1_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternTupleNode.html" target="_self">tvm::relay::PatternTupleNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> container node </td></tr>
+<tr id="row_98_80_0_2_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternVarNode.html" target="_self">tvm::relay::PatternVarNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternVar.html">PatternVar</a> container node </td></tr>
+<tr id="row_98_80_0_3_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternWildcardNode.html" target="_self">tvm::relay::PatternWildcardNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1PatternWildcard.html">PatternWildcard</a> container node </td></tr>
+<tr id="row_98_81_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RuntimeNode.html" target="_self">tvm::relay::RuntimeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Runtime.html" title="Managed reference class to RuntimeNode. ">Runtime</a> information </td></tr>
+<tr id="row_98_82_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ADTObj.html" target="_self">tvm::runtime::ADTObj</a></td><td class="desc">An object representing a structure or enumeration </td></tr>
+<tr id="row_98_83_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ArrayNode.html" target="_self">tvm::runtime::ArrayNode</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Array.html" title="Array, container representing a contiguous sequence of ObjectRefs. ">Array</a> node content in array </td></tr>
+<tr id="row_98_84_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_84_" class="arrow" onclick="toggleFolder('98_84_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ClosureObj.html" target="_self">tvm::runtime::ClosureObj</a></td><td class="desc">An object representing a closure. This object is used by both the Relay VM and interpreter </td></tr>
+<tr id="row_98_84_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1InterpreterClosureObj.html" target="_self">tvm::relay::InterpreterClosureObj</a></td><td class="desc">The container type of Closures used by the interpreter </td></tr>
+<tr id="row_98_84_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1VMClosureObj.html" target="_self">tvm::runtime::vm::VMClosureObj</a></td><td class="desc">An object representing a vm closure </td></tr>
+<tr id="row_98_85_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_85_" class="arrow" onclick="toggleFolder('98_85_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1MapNode.html" target="_self">tvm::runtime::MapNode</a></td><td class="desc">Shared content of all specializations of hash map </td></tr>
+<tr id="row_98_85_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1DenseMapNode.html" target="_self">tvm::runtime::DenseMapNode</a></td><td class="desc">A specialization of hash map that implements the idea of array-based hash map. Another reference implementation can be found [1] </td></tr>
+<tr id="row_98_85_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1SmallMapNode.html" target="_self">tvm::runtime::SmallMapNode</a></td><td class="desc">A specialization of small-sized hash map </td></tr>
+<tr id="row_98_86_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_86_" class="arrow" onclick="toggleFolder('98_86_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataBaseNode.html" target="_self">tvm::runtime::metadata::MetadataBaseNode</a></td><td class="desc">Common base class for all Metadata </td></tr>
+<tr id="row_98_86_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataArrayNode.html" target="_self">tvm::runtime::metadata::MetadataArrayNode</a></td><td class="desc">Container for arrays in the metadata </td></tr>
+<tr id="row_98_87_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_87_" class="arrow" onclick="toggleFolder('98_87_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ModuleNode.html" target="_self">tvm::runtime::ModuleNode</a></td><td class="desc">Base container of module </td></tr>
+<tr id="row_98_87_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1Executable.html" target="_self">tvm::runtime::vm::Executable</a></td><td class="desc">The executable emitted by the VM compiler </td></tr>
+<tr id="row_98_87_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1VirtualMachine.html" target="_self">tvm::runtime::vm::VirtualMachine</a></td><td class="desc">The virtual machine </td></tr>
+<tr id="row_98_88_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1NDArray_1_1Container.html" target="_self">tvm::runtime::NDArray::Container</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Object.html" title="base class of all object containers. ">Object</a> container class that backs <a class="el" href="classtvm_1_1run [...]
+<tr id="row_98_89_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_89_" class="arrow" onclick="toggleFolder('98_89_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1PackedFuncObj.html" target="_self">tvm::runtime::PackedFuncObj</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Object.html" title="base class of all object containers. ">Ob [...]
+<tr id="row_98_89_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1PackedFuncSubObj.html" target="_self">tvm::runtime::PackedFuncSubObj&lt; TCallable &gt;</a></td><td class="desc">Derived object class for constructing <a class="el" href="classtvm_1_1runtime_1_1PackedFuncObj.html" title="Object container class that backs PackedFunc. ">Pack [...]
+<tr id="row_98_90_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1CountNode.html" target="_self">tvm::runtime::profiling::CountNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_91_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1runtime_1_1profiling_1_1DeviceWrapperNode.html" target="_self">tvm::runtime::profiling::DeviceWrapperNode</a></td><td class="desc">Wrapper for <code>Device</code> because <code>Device</code> is not passable across the <a class="el" href="classtvm_1_1runtime_1_1PackedFunc.html" title=" [...]
+<tr id="row_98_92_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1DurationNode.html" target="_self">tvm::runtime::profiling::DurationNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_93_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1MetricCollectorNode.html" target="_self">tvm::runtime::profiling::MetricCollectorNode</a></td><td class="desc">Interface for user defined profiling metric collection </td></tr>
+<tr id="row_98_94_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1PercentNode.html" target="_self">tvm::runtime::profiling::PercentNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_95_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1ReportNode.html" target="_self">tvm::runtime::profiling::ReportNode</a></td><td class="desc">Data collected from a profiling run. Includes per-call metrics and per-device metrics </td></tr>
+<tr id="row_98_96_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_96_" class="arrow" onclick="toggleFolder('98_96_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ShapeTupleObj.html" target="_self">tvm::runtime::ShapeTupleObj</a></td><td class="desc">An object representing a shape tuple </td></tr>
+<tr id="row_98_96_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ShapeTupleObj_1_1FromStd.html" target="_self">tvm::runtime::ShapeTupleObj::FromStd</a></td><td class="desc">An object representing shape tuple moved from std::vector </td></tr>
+<tr id="row_98_97_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_97_" class="arrow" onclick="toggleFolder('98_97_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1StringObj.html" target="_self">tvm::runtime::StringObj</a></td><td class="desc">An object representing string. It's POD type </td></tr>
+<tr id="row_98_97_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1StringObj_1_1FromStd.html" target="_self">tvm::runtime::StringObj::FromStd</a></td><td class="desc">An object representing string moved from std::string </td></tr>
+<tr id="row_98_98_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1TimerNode.html" target="_self">tvm::runtime::TimerNode</a></td><td class="desc">Base class for all implementations </td></tr>
+<tr id="row_98_99_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1StorageObj.html" target="_self">tvm::runtime::vm::StorageObj</a></td><td class="desc">An object representing a storage allocation </td></tr>
+<tr id="row_98_100_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SourceNameNode.html" target="_self">tvm::SourceNameNode</a></td><td class="desc">The name of a source fragment </td></tr>
+<tr id="row_98_101_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SpanNode.html" target="_self">tvm::SpanNode</a></td><td class="desc">Stores locations in frontend source that generated a node </td></tr>
+<tr id="row_98_102_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKindNode.html" target="_self">tvm::TargetKindNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Target.html" title="Managed reference class to TargetNode. ">Target</a> kind, specifies the kind of the target </td></tr>
+<tr id="row_98_103_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetNode.html" target="_self">tvm::TargetNode</a></td><td class="desc">Compilation target </td></tr>
+<tr id="row_98_104_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTagNode.html" target="_self">tvm::TargetTagNode</a></td><td class="desc">A target tag </td></tr>
+<tr id="row_98_105_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1IterVarAttrNode.html" target="_self">tvm::te::IterVarAttrNode</a></td><td class="desc">Node container for IterVar attr </td></tr>
+<tr id="row_98_106_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_106_" class="arrow" onclick="toggleFolder('98_106_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1IterVarRelationNode.html" target="_self">tvm::te::IterVarRelationNode</a></td><td class="desc">Base node of iteration var </td></tr>
+<tr id="row_98_106_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1FuseNode.html" target="_self">tvm::te::FuseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Fuse.html" title="Managed reference to FuseNode. ">Fuse</a> two domains into one domain </td></tr>
+<tr id="row_98_106_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1RebaseNode.html" target="_self">tvm::te::RebaseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Rebase.html" title="Managed reference to RebaseNode. ">Rebase</a> the iteration to make min to be 0. This is useful to normalize the <a class="el" href="classtvm_ [...]
+<tr id="row_98_106_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1SingletonNode.html" target="_self">tvm::te::SingletonNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Singleton.html" title="Managed reference to SingletonNode. ">Singleton</a> iterator [0, 1) </td></tr>
+<tr id="row_98_106_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1SplitNode.html" target="_self">tvm::te::SplitNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Split.html" title="Managed reference to SplitNode. ">Split</a> the parent domain into product of outer and iter </td></tr>
+<tr id="row_98_106_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TransformNode.html" target="_self">tvm::te::TransformNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Transform.html">Transform</a> iterator according to some arbitrary expression </td></tr>
+<tr id="row_98_107_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_107_" class="arrow" onclick="toggleFolder('98_107_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1OperationNode.html" target="_self">tvm::te::OperationNode</a></td><td class="desc">Base class of all operation nodes </td></tr>
+<tr id="row_98_107_0_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_107_0_" class="arrow" onclick="toggleFolder('98_107_0_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1BaseComputeOpNode.html" target="_self">tvm::te::BaseComputeOpNode</a></td><td class="desc">A Compute op that compute a tensor on certain domain. This is the base class for <a class="el" hr [...]
+<tr id="row_98_107_0_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ComputeOpNode.html" target="_self">tvm::te::ComputeOpNode</a></td><td class="desc">A Compute op that compute a tensor on certain domain </td></tr>
+<tr id="row_98_107_0_1_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorComputeOpNode.html" target="_self">tvm::te::TensorComputeOpNode</a></td><td class="desc">A TenorCompute op that compute a tensor with an tensor intrinsic </td></tr>
+<tr id="row_98_107_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ExternOpNode.html" target="_self">tvm::te::ExternOpNode</a></td><td class="desc">External computation that cannot be splitted </td></tr>
+<tr id="row_98_107_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1HybridOpNode.html" target="_self">tvm::te::HybridOpNode</a></td><td class="desc">A computation operator that generated by hybrid script </td></tr>
+<tr id="row_98_107_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1PlaceholderOpNode.html" target="_self">tvm::te::PlaceholderOpNode</a></td><td class="desc">A placeholder op represents an input placeholder </td></tr>
+<tr id="row_98_107_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ScanOpNode.html" target="_self">tvm::te::ScanOpNode</a></td><td class="desc">Symbolic scan </td></tr>
+<tr id="row_98_108_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ScheduleNode.html" target="_self">tvm::te::ScheduleNode</a></td><td class="desc">Node container for schedule </td></tr>
+<tr id="row_98_109_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1SpecializedConditionNode.html" target="_self">tvm::te::SpecializedConditionNode</a></td><td class="desc">Container for specialization conditions </td></tr>
+<tr id="row_98_110_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1StageNode.html" target="_self">tvm::te::StageNode</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Stage.html" title="Stage, contains scheduling for a stage of computation. ">Stage</a> </td></tr>
+<tr id="row_98_111_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorIntrinCallNode.html" target="_self">tvm::te::TensorIntrinCallNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_112_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorIntrinNode.html" target="_self">tvm::te::TensorIntrinNode</a></td><td class="desc">Node to represent a <a class="el" href="classtvm_1_1te_1_1Tensor.html" title="Tensor structure representing a possible input, or intermediate computation result. ">Tensor</a> intrinsic opera [...]
+<tr id="row_98_113_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BijectiveLayoutNode.html" target="_self">tvm::tir::BijectiveLayoutNode</a></td><td class="desc"></td></tr>
+<tr id="row_98_114_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockRVNode.html" target="_self">tvm::tir::BlockRVNode</a></td><td class="desc">A random variable that evaluates to a TensorIR block </td></tr>
+<tr id="row_98_115_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockScopeNode.html" target="_self">tvm::tir::BlockScopeNode</a></td><td class="desc">An object with 1-to-1 correspondence with each block reference in the sref tree. This data structure is used to track the producer-consumer dependencies between blocks. <a class="el" href="cla [...]
+<tr id="row_98_116_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferNode.html" target="_self">tvm::tir::BufferNode</a></td><td class="desc">Node to represent a buffer </td></tr>
+<tr id="row_98_117_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferRegionNode.html" target="_self">tvm::tir::BufferRegionNode</a></td><td class="desc">Representing the region of multi-dimensional buffer access </td></tr>
+<tr id="row_98_118_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1CommReducerNode.html" target="_self">tvm::tir::CommReducerNode</a></td><td class="desc">A commutative reducer node to represent a commutative binary operator with identity element </td></tr>
+<tr id="row_98_119_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_119_" class="arrow" onclick="toggleFolder('98_119_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1DataProducerNode.html" target="_self">tvm::tir::DataProducerNode</a></td><td class="desc">Base node for data producers </td></tr>
+<tr id="row_98_119_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorNode.html" target="_self">tvm::te::TensorNode</a></td><td class="desc">Node to represent a tensor </td></tr>
+<tr id="row_98_120_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1DependencyNode.html" target="_self">tvm::tir::DependencyNode</a></td><td class="desc">A tuple (src, dst, kind) representing certain types of dependency. <a class="el" href="classtvm_1_1tir_1_1For.html" title="Managed reference to ForNode. ">For</a> example, (A, B, kRAW) means b [...]
+<tr id="row_98_121_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IndexMapNode.html" target="_self">tvm::tir::IndexMapNode</a></td><td class="desc">Defines a mapping between two representations of indices into a buffer </td></tr>
+<tr id="row_98_122_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1InstructionKindNode.html" target="_self">tvm::tir::InstructionKindNode</a></td><td class="desc">Kind of an instruction, e.g. Split, Reorder, etc. Besides the name, every kind of instruction has its own properties, including: 1) A boolean indicating if the instruction is pure, i [...]
+<tr id="row_98_123_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1InstructionNode.html" target="_self">tvm::tir::InstructionNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Schedule.html" title="Managed reference to ScheduleNode. ">Schedule</a> instructions each corresponds to a schedule primitive </td></tr>
+<tr id="row_98_124_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IterVarNode.html" target="_self">tvm::tir::IterVarNode</a></td><td class="desc">An iteration variable representing an iteration over a one dimensional interval </td></tr>
+<tr id="row_98_125_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1LayoutNode.html" target="_self">tvm::tir::LayoutNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Layout.html" title="Managed reference to LayoutNode. ">Layout</a> is to describe how data is organized within an N-dimention tensor. It is composed of upper case [...]
+<tr id="row_98_126_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1LoopRVNode.html" target="_self">tvm::tir::LoopRVNode</a></td><td class="desc">A random variable that evaluates to a TensorIR for loop </td></tr>
+<tr id="row_98_127_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1MatchBufferRegionNode.html" target="_self">tvm::tir::MatchBufferRegionNode</a></td><td class="desc">Match introduces a constraint that the source buffer region can be remapped to the data layout specified by the buffer field. The constraint can be checked in later part of lower [...]
+<tr id="row_98_128_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html" target="_self">tvm::tir::ScheduleNode</a></td><td class="desc">The user-facing schedule class </td></tr>
+<tr id="row_98_129_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ScheduleStateNode.html" target="_self">tvm::tir::ScheduleStateNode</a></td><td class="desc">The state of scheduling, which exposes a <code>Replace</code> method as the primary interface for all the scheduling primitives to manipulate the TensorIR </td></tr>
+<tr id="row_98_130_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_130_" class="arrow" onclick="toggleFolder('98_130_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1StmtNode.html" target="_self">tvm::tir::StmtNode</a></td><td class="desc">Base node of all statements </td></tr>
+<tr id="row_98_130_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AllocateConstNode.html" target="_self">tvm::tir::AllocateConstNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Allocate.html" title="Managed reference to AllocateNode. ">Allocate</a> a buffer that can be used in body </td></tr>
+<tr id="row_98_130_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AllocateNode.html" target="_self">tvm::tir::AllocateNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Allocate.html" title="Managed reference to AllocateNode. ">Allocate</a> a buffer that can be used in body </td></tr>
+<tr id="row_98_130_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AssertStmtNode.html" target="_self">tvm::tir::AssertStmtNode</a></td><td class="desc">Assert condition, if an error occurs, return the error message </td></tr>
+<tr id="row_98_130_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1AttrStmtNode.html" target="_self">tvm::tir::AttrStmtNode</a></td><td class="desc">Define certain auxiliary attribute for the body to be a symbolic value. This provide auxiliary information for IR passes that transforms body </td></tr>
+<tr id="row_98_130_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockNode.html" target="_self">tvm::tir::BlockNode</a></td><td class="desc">A block is a basic schedule unit in TIR </td></tr>
+<tr id="row_98_130_5_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockRealizeNode.html" target="_self">tvm::tir::BlockRealizeNode</a></td><td class="desc">A block realization node represents execution of the block at the binding values </td></tr>
+<tr id="row_98_130_6_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferRealizeNode.html" target="_self">tvm::tir::BufferRealizeNode</a></td><td class="desc">Annotate the region where the buffer need to be read and write in the body. We only need to allocate the space for the corresponding region </td></tr>
+<tr id="row_98_130_7_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferStoreNode.html" target="_self">tvm::tir::BufferStoreNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Store.html" title="Managed reference to StoreNode. ">Store</a> value to the high dimension buffer </td></tr>
+<tr id="row_98_130_8_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1EvaluateNode.html" target="_self">tvm::tir::EvaluateNode</a></td><td class="desc">Evaluates an expression. This is mostly used for putting a <a class="el" href="classtvm_1_1tir_1_1Call.html" title="Managed reference to CallNode. ">Call</a> node into <a class="el" href="classt [...]
+<tr id="row_98_130_9_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ForNode.html" target="_self">tvm::tir::ForNode</a></td><td class="desc">A for loop, with poissible type annotations </td></tr>
+<tr id="row_98_130_10_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IfThenElseNode.html" target="_self">tvm::tir::IfThenElseNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1IfThenElse.html" title="Managed reference to IfThenElseNode. ">IfThenElse</a> statment </td></tr>
+<tr id="row_98_130_11_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1LetStmtNode.html" target="_self">tvm::tir::LetStmtNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Let.html" title="Managed reference to LetNode. ">Let</a> binding, bind var to value, then run body </td></tr>
+<tr id="row_98_130_12_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1PrefetchNode.html" target="_self">tvm::tir::PrefetchNode</a></td><td class="desc">A prefetch hint for a buffer </td></tr>
+<tr id="row_98_130_13_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ProducerRealizeNode.html" target="_self">tvm::tir::ProducerRealizeNode</a></td><td class="desc">Annotate the bounds where the data produced by the producer need to be written and read in body. We will need to allocate space for the corresponding regions </td></tr>
+<tr id="row_98_130_14_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ProducerStoreNode.html" target="_self">tvm::tir::ProducerStoreNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Store.html" title="Managed reference to StoreNode. ">Store</a> value into mult-dimensional array that will be read by the consumer of the produc [...]
+<tr id="row_98_130_15_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1SeqStmtNode.html" target="_self">tvm::tir::SeqStmtNode</a></td><td class="desc">The container of seq statement. Represent a sequence of statements </td></tr>
+<tr id="row_98_130_16_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1StoreNode.html" target="_self">tvm::tir::StoreNode</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Store.html" title="Managed reference to StoreNode. ">Store</a> value to the buffer </td></tr>
+<tr id="row_98_130_17_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1WhileNode.html" target="_self">tvm::tir::WhileNode</a></td><td class="desc">A <a class="el" href="classtvm_1_1tir_1_1While.html" title="Managed reference to WhileNode. ">While</a> loop </td></tr>
+<tr id="row_98_131_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1StmtSRefNode.html" target="_self">tvm::tir::StmtSRefNode</a></td><td class="desc">An object that refers to schedulable elements (block/for-loop) in TensorIR, aka "sref" </td></tr>
+<tr id="row_98_132_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1TensorIntrinNode.html" target="_self">tvm::tir::TensorIntrinNode</a></td><td class="desc">Tensor intrinsics for tensorization </td></tr>
+<tr id="row_98_133_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1TraceNode.html" target="_self">tvm::tir::TraceNode</a></td><td class="desc">An execution trace of a scheduling program </td></tr>
+<tr id="row_98_134_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1AllocatedPoolInfoNode.html" target="_self">tvm::tir::usmp::AllocatedPoolInfoNode</a></td><td class="desc">This object contains information post-allocation for <a class="el" href="classtvm_1_1PoolInfo.html">PoolInfo</a> objects </td></tr>
+<tr id="row_98_135_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1BufferInfoAnalysisNode.html" target="_self">tvm::tir::usmp::BufferInfoAnalysisNode</a></td><td class="desc">This is a composite node that is produced by extract_buffer_info analysis pass that contains useful global information that could be useful for memory planning a [...]
+<tr id="row_98_136_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1BufferInfoNode.html" target="_self">tvm::tir::usmp::BufferInfoNode</a></td><td class="desc">Describes an abstract memory buffer that will get allocated inside a pool. The actual memory buffer in represented by <a class="el" href="structtvm_1_1tir_1_1usmp_1_1PoolAllocat [...]
+<tr id="row_98_137_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1tir_1_1usmp_1_1PoolAllocationNode.html" target="_self">tvm::tir::usmp::PoolAllocationNode</a></td><td class="desc">The pool allocation produced after the USMP algorithm </td></tr>
+<tr id="row_98_138_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1PassContextNode.html" target="_self">tvm::transform::PassContextNode</a></td><td class="desc"><a class="el" href="classtvm_1_1transform_1_1PassContextNode.html" title="PassContextNode contains the information that a pass can rely on, such as analysis results...">PassConte [...]
+<tr id="row_98_139_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1PassInfoNode.html" target="_self">tvm::transform::PassInfoNode</a></td><td class="desc">Meta data that will be used to help optimization and analysis </td></tr>
+<tr id="row_98_140_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_140_" class="arrow" onclick="toggleFolder('98_140_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1PassNode.html" target="_self">tvm::transform::PassNode</a></td><td class="desc"><a class="el" href="classtvm_1_1transform_1_1PassNode.html" title="PassNode is the base type of differnt ty [...]
+<tr id="row_98_140_0_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1transform_1_1SequentialNode.html" target="_self">tvm::transform::SequentialNode</a></td><td class="desc">The <a class="el" href="classtvm_1_1transform_1_1SequentialNode.html" title="The SequentialNode contains a set of passes that transform Relay programs from one AST to another sem [...]
+<tr id="row_98_141_" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_98_141_" class="arrow" onclick="toggleFolder('98_141_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeNode.html" target="_self">tvm::TypeNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> is the base type of all types </td></tr>
+<tr id="row_98_141_0_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_141_0_" class="arrow" onclick="toggleFolder('98_141_0_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1BaseTensorTypeNode.html" target="_self">tvm::BaseTensorTypeNode</a></td><td class="desc">Base of all Tensor types This container can hold <a class="el" href="classtvm_1_1TensorType.html" title=" [...]
+<tr id="row_98_141_0_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TensorTypeNode.html" target="_self">tvm::TensorTypeNode</a></td><td class="desc">This is the most commonly used type in relay. <a class="el" href="classtvm_1_1TensorType.html" title="Managed reference to TensorTypeNode. ">TensorType</a> have a fixed dimension, data type </td></tr>
+<tr id="row_98_141_1_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1FuncTypeNode.html" target="_self">tvm::FuncTypeNode</a></td><td class="desc">Function type </td></tr>
+<tr id="row_98_141_2_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1GlobalTypeVarNode.html" target="_self">tvm::GlobalTypeVarNode</a></td><td class="desc">A global type variable that is used for defining new types or type aliases </td></tr>
+<tr id="row_98_141_3_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1IncompleteTypeNode.html" target="_self">tvm::IncompleteTypeNode</a></td><td class="desc">Intermediate values that is used to indicate incomplete type during type inference </td></tr>
+<tr id="row_98_141_4_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PointerTypeNode.html" target="_self">tvm::PointerTypeNode</a></td><td class="desc">Low-level raw pointer type </td></tr>
+<tr id="row_98_141_5_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PrimTypeNode.html" target="_self">tvm::PrimTypeNode</a></td><td class="desc">Primitive data types used in the low-level IR </td></tr>
+<tr id="row_98_141_6_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1RelayRefTypeNode.html" target="_self">tvm::RelayRefTypeNode</a></td><td class="desc">Reference <a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> High-level Relay IR </td></tr>
+<tr id="row_98_141_7_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TupleTypeNode.html" target="_self">tvm::TupleTypeNode</a></td><td class="desc">The type of tuple values </td></tr>
+<tr id="row_98_141_8_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeCallNode.html" target="_self">tvm::TypeCallNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> function application </td></tr>
+<tr id="row_98_141_9_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span id="arr_98_141_9_" class="arrow" onclick="toggleFolder('98_141_9_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeConstraintNode.html" target="_self">tvm::TypeConstraintNode</a></td><td class="desc">Potential Constraints in a function </td></tr>
+<tr id="row_98_141_9_0_" style="display:none;"><td class="entry"><span style="width:64px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeRelationNode.html" target="_self">tvm::TypeRelationNode</a></td><td class="desc">User defined type relation, it is an input-output relation on types </td></tr>
+<tr id="row_98_141_10_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeDataNode.html" target="_self">tvm::TypeDataNode</a></td><td class="desc"><a class="el" href="classtvm_1_1TypeData.html" title="Stores all data for an Algebraic Data Type (ADT). ">TypeData</a> container node </td></tr>
+<tr id="row_98_141_11_" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeVarNode.html" target="_self">tvm::TypeVarNode</a></td><td class="desc"><a class="el" href="classtvm_1_1Type.html" title="Managed reference to TypeNode. ">Type</a> parameter in functions </td></tr>
+<tr id="row_98_142_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TypeReporterNode.html" target="_self">tvm::TypeReporterNode</a></td><td class="desc">Reporter that reports back to the type resolution information </td></tr>
+<tr id="row_98_143_" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1WorkspaceMemoryPoolsNode.html" target="_self">tvm::WorkspaceMemoryPoolsNode</a></td><td class="desc"></td></tr>
 <tr id="row_99_"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1runtime_1_1ObjectEqual.html" target="_self">tvm::runtime::ObjectEqual</a></td><td class="desc">String-aware <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html" title="Base class of all object reference. ">ObjectRef</a> hash functor </td></tr>
 <tr id="row_100_" class="even"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="structtvm_1_1runtime_1_1ObjectHash.html" target="_self">tvm::runtime::ObjectHash</a></td><td class="desc">String-aware <a class="el" href="classtvm_1_1runtime_1_1ObjectRef.html" title="Base class of all object reference. ">ObjectRef</a> equal functor </td></tr>
 <tr id="row_101_"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ObjectPtr.html" target="_self">tvm::runtime::ObjectPtr&lt; T &gt;</a></td><td class="desc">A custom smart pointer for <a class="el" href="classtvm_1_1runtime_1_1Object.html" title="base class of all object containers. ">Object</a> </td></tr>
@@ -1031,180 +1030,179 @@ This inheritance list is sorted roughly, but not completely, alphabetically:</di
 <tr id="row_107_130_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1GenericFunc.html" target="_self">tvm::GenericFunc</a></td><td class="desc">Generic function that can be specialized on a per-target basis </td></tr>
 <tr id="row_107_131_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1instrument_1_1PassInstrument.html" target="_self">tvm::instrument::PassInstrument</a></td><td class="desc">Managed reference class for <a class="el" href="classtvm_1_1instrument_1_1PassInstrumentNode.html" title="PassInstrumentNode forms an instrument implementation. It  [...]
 <tr id="row_107_132_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1IRModule.html" target="_self">tvm::IRModule</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1IRModuleNode.html" title="IRModule that holds functions and type definitions. ">IRModuleNode</a> </td></tr>
-<tr id="row_107_133_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1LinkedParam.html" target="_self">tvm::LinkedParam</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1LinkedParamNode.html" title="Describes one parameter that should be linked into the generated module. ">LinkedParamNode</a> </td></tr>
-<tr id="row_107_134_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1MemoryInfo.html" target="_self">tvm::MemoryInfo</a></td><td class="desc">Defines memory info </td></tr>
-<tr id="row_107_135_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBest.html" target="_self">tvm::meta_schedule::ApplyHistoryBest</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1ApplyHistoryBestNode.html" title="An integration context that allows application o [...]
-<tr id="row_107_136_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_136_" class="arrow" onclick="toggleFolder('107_136_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfo.html" target="_self">tvm::meta_schedule::ArgInfo</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1ArgInfoNode.h [...]
-<tr id="row_107_136_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TensorInfo.html" target="_self">tvm::meta_schedule::TensorInfo</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1TensorInfoNode.html" title="The tensor argument information. ">TensorInfoNode</a> </td></tr>
-<tr id="row_107_137_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1Builder.html" target="_self">tvm::meta_schedule::Builder</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1BuilderNode.html" title="The abstract builder interface. ">BuilderNode</a> </td></tr>
-<tr id="row_107_138_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInput.html" target="_self">tvm::meta_schedule::BuilderInput</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1BuilderInputNode.html" title="The builder&#39;s input, containing an IRModule and the targ [...]
-<tr id="row_107_139_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1BuilderResult.html" target="_self">tvm::meta_schedule::BuilderResult</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1BuilderResultNode.html" title="The builder&#39;s output, containing the artifact path or [...]
-<tr id="row_107_140_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1CostModel.html" target="_self">tvm::meta_schedule::CostModel</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1CostModelNode.html" title="Cost model. ">CostModelNode</a> </td></tr>
-<tr id="row_107_141_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1Database.html" target="_self">tvm::meta_schedule::Database</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1DatabaseNode.html">DatabaseNode</a> </td></tr>
-<tr id="row_107_142_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ExtractedTask.html" target="_self">tvm::meta_schedule::ExtractedTask</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1ExtractedTaskNode.html" title="A tuning task extracted from the high-level IR. ">Extract [...]
-<tr id="row_107_143_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1FeatureExtractor.html" target="_self">tvm::meta_schedule::FeatureExtractor</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1FeatureExtractorNode.html" title="Extractor for features from measure candidates f [...]
-<tr id="row_107_144_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCallback.html" target="_self">tvm::meta_schedule::MeasureCallback</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCallbackNode.html" title="Rules to apply after measure results is available.  [...]
-<tr id="row_107_145_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCandidate.html" target="_self">tvm::meta_schedule::MeasureCandidate</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1MeasureCandidateNode.html" title="The schedule (with input shapes) to be measured. [...]
-<tr id="row_107_146_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1Mutator.html" target="_self">tvm::meta_schedule::Mutator</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1MutatorNode.html" title="Mutator is designed to mutate the trace to explore the design space. ">Muta [...]
-<tr id="row_107_147_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1Postproc.html" target="_self">tvm::meta_schedule::Postproc</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1PostprocNode.html" title="Rules to apply a postprocessor to a schedule. ">PostprocNode</a> </td></tr>
-<tr id="row_107_148_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1Runner.html" target="_self">tvm::meta_schedule::Runner</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1RunnerNode.html" title="The abstract runner interface. ">RunnerNode</a> </td></tr>
-<tr id="row_107_149_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerFuture.html" target="_self">tvm::meta_schedule::RunnerFuture</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1RunnerFutureNode.html" title="A class to asynchronously fetch runner&#39;s output. ">Runne [...]
-<tr id="row_107_150_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerInput.html" target="_self">tvm::meta_schedule::RunnerInput</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1RunnerInputNode.html" title="Runner&#39;s input containing path of artifact, type of device  [...]
-<tr id="row_107_151_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1RunnerResult.html" target="_self">tvm::meta_schedule::RunnerResult</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1RunnerResultNode.html" title="Runner&#39;s output containing measurement result of Measure [...]
-<tr id="row_107_152_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1ScheduleRule.html" target="_self">tvm::meta_schedule::ScheduleRule</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1ScheduleRuleNode.html" title="Rules to modify a block in a schedule. ">ScheduleRuleNode</a [...]
-<tr id="row_107_153_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1SearchStrategy.html" target="_self">tvm::meta_schedule::SearchStrategy</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1SearchStrategyNode.html" title="The search strategy for measure candidates generation. [...]
-<tr id="row_107_154_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1SpaceGenerator.html" target="_self">tvm::meta_schedule::SpaceGenerator</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1SpaceGeneratorNode.html" title="The abstract class for design space generation. ">Spac [...]
-<tr id="row_107_155_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TaskScheduler.html" target="_self">tvm::meta_schedule::TaskScheduler</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1TaskSchedulerNode.html" title="The abstract interface of task schedulers. ">TaskSchedule [...]
-<tr id="row_107_156_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TuneContext.html" target="_self">tvm::meta_schedule::TuneContext</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1TuneContextNode.html" title="The auto tuning context. ">TuneContextNode</a> </td></tr>
-<tr id="row_107_157_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1TuningRecord.html" target="_self">tvm::meta_schedule::TuningRecord</a></td><td class="desc">The managed reference of <a class="el" href="classtvm_1_1meta__schedule_1_1TuningRecordNode.html" title="The class of tuning records. ">TuningRecordNode</a> </td></tr>
-<tr id="row_107_158_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1meta__schedule_1_1Workload.html" target="_self">tvm::meta_schedule::Workload</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1meta__schedule_1_1WorkloadNode.html" title="A workload, i.e. an IRModule and its structural hash. ">WorkloadNode</a> [...]
-<tr id="row_107_159_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1parser_1_1Source.html" target="_self">tvm::parser::Source</a></td><td class="desc"></td></tr>
-<tr id="row_107_160_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1parser_1_1SourceMap.html" target="_self">tvm::parser::SourceMap</a></td><td class="desc"></td></tr>
-<tr id="row_107_161_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1PoolInfo.html" target="_self">tvm::PoolInfo</a></td><td class="desc"></td></tr>
-<tr id="row_107_162_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Range.html" target="_self">tvm::Range</a></td><td class="desc"><a class="el" href="classtvm_1_1Range.html" title="Range constainer. ">Range</a> constainer </td></tr>
-<tr id="row_107_163_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1Clause.html" target="_self">tvm::relay::Clause</a></td><td class="desc"></td></tr>
-<tr id="row_107_164_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ConstructorValue.html" target="_self">tvm::relay::ConstructorValue</a></td><td class="desc"></td></tr>
-<tr id="row_107_165_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_165_" class="arrow" onclick="toggleFolder('107_165_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DFPattern.html" target="_self">tvm::relay::DFPattern</a></td><td class="desc">Managed reference to dataflow patterns </td></tr>
-<tr id="row_107_165_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1AltPattern.html" target="_self">tvm::relay::AltPattern</a></td><td class="desc">A pattern which matches either of two patterns </td></tr>
-<tr id="row_107_165_1_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1AttrPattern.html" target="_self">tvm::relay::AttrPattern</a></td><td class="desc">A pattern which matches attributes in another pattern </td></tr>
-<tr id="row_107_165_2_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1CallPattern.html" target="_self">tvm::relay::CallPattern</a></td><td class="desc"></td></tr>
-<tr id="row_107_165_3_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ConstantPattern.html" target="_self">tvm::relay::ConstantPattern</a></td><td class="desc"></td></tr>
-<tr id="row_107_165_4_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DataTypePattern.html" target="_self">tvm::relay::DataTypePattern</a></td><td class="desc">A pattern which matches a type in another pattern </td></tr>
-<tr id="row_107_165_5_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DominatorPattern.html" target="_self">tvm::relay::DominatorPattern</a></td><td class="desc">A pattern which matches a variable length dominator path </td></tr>
-<tr id="row_107_165_6_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ExprPattern.html" target="_self">tvm::relay::ExprPattern</a></td><td class="desc">A pattern which matches a literal expression </td></tr>
-<tr id="row_107_165_7_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1FunctionPattern.html" target="_self">tvm::relay::FunctionPattern</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1relay_1_1FunctionNode.html" title="Relay Function container. ">FunctionNode</a> </td></tr>
-<tr id="row_107_165_8_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1IfPattern.html" target="_self">tvm::relay::IfPattern</a></td><td class="desc"></td></tr>
-<tr id="row_107_165_9_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1LetPattern.html" target="_self">tvm::relay::LetPattern</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Let.html">Let</a> binding that binds a local var </td></tr>
-<tr id="row_107_165_10_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1ShapePattern.html" target="_self">tvm::relay::ShapePattern</a></td><td class="desc">A pattern which matches a type in another pattern </td></tr>
-<tr id="row_107_165_11_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TupleGetItemPattern.html" target="_self">tvm::relay::TupleGetItemPattern</a></td><td class="desc"></td></tr>
-<tr id="row_107_165_12_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TuplePattern.html" target="_self">tvm::relay::TuplePattern</a></td><td class="desc"></td></tr>
-<tr id="row_107_165_13_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1TypePattern.html" target="_self">tvm::relay::TypePattern</a></td><td class="desc">A pattern which matches a type in another pattern </td></tr>
-<tr id="row_107_165_14_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1VarPattern.html" target="_self">tvm::relay::VarPattern</a></td><td class="desc"></td></tr>
-<tr id="row_107_165_15_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1WildcardPattern.html" target="_self">tvm::relay::WildcardPattern</a></td><td class="desc">A pattern which matches anything </td></tr>
-<tr id="row_107_166_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1DFPatternCallback.html" target="_self">tvm::relay::DFPatternCallback</a></td><td class="desc">Managed reference to dataflow pattern callbacks </td></tr>
-<tr id="row_107_167_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1Executor.html" target="_self">tvm::relay::Executor</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1relay_1_1ExecutorNode.html" title="Executor information. ">ExecutorNode</a> </td></tr>
-<tr id="row_107_168_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1Id.html" target="_self">tvm::relay::Id</a></td><td class="desc"></td></tr>
-<tr id="row_107_169_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpImplementation.html" target="_self">tvm::relay::OpImplementation</a></td><td class="desc">Operator implementation class </td></tr>
-<tr id="row_107_170_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpSpecialization.html" target="_self">tvm::relay::OpSpecialization</a></td><td class="desc">Operator specialization class </td></tr>
-<tr id="row_107_171_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1OpStrategy.html" target="_self">tvm::relay::OpStrategy</a></td><td class="desc">Operator strategy class </td></tr>
-<tr id="row_107_172_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_172_" class="arrow" onclick="toggleFolder('107_172_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1Pattern.html" target="_self">tvm::relay::Pattern</a></td><td class="desc"><a class="el" href="classtvm_1_1relay_1_1Pattern.html" title="Pattern is the base type for an ADT mat [...]
-<tr id="row_107_172_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternConstructor.html" target="_self">tvm::relay::PatternConstructor</a></td><td class="desc"></td></tr>
-<tr id="row_107_172_1_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternTuple.html" target="_self">tvm::relay::PatternTuple</a></td><td class="desc"></td></tr>
-<tr id="row_107_172_2_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternVar.html" target="_self">tvm::relay::PatternVar</a></td><td class="desc"></td></tr>
-<tr id="row_107_172_3_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1PatternWildcard.html" target="_self">tvm::relay::PatternWildcard</a></td><td class="desc"></td></tr>
-<tr id="row_107_173_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RecClosure.html" target="_self">tvm::relay::RecClosure</a></td><td class="desc"></td></tr>
-<tr id="row_107_174_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1RefValue.html" target="_self">tvm::relay::RefValue</a></td><td class="desc"></td></tr>
-<tr id="row_107_175_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1Runtime.html" target="_self">tvm::relay::Runtime</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1relay_1_1RuntimeNode.html" title="Runtime information. ">RuntimeNode</a> </td></tr>
-<tr id="row_107_176_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ADT.html" target="_self">tvm::runtime::ADT</a></td><td class="desc">Reference to algebraic data type objects </td></tr>
-<tr id="row_107_177_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1Array.html" target="_self">tvm::runtime::Array&lt; T, typename &gt;</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Array.html" title="Array, container representing a contiguous sequence of ObjectRefs. ">Array</a>, container representing a  [...]
-<tr id="row_107_178_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_178_" class="arrow" onclick="toggleFolder('107_178_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1Closure.html" target="_self">tvm::runtime::Closure</a></td><td class="desc">Reference to closure </td></tr>
-<tr id="row_107_178_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1relay_1_1InterpreterClosure.html" target="_self">tvm::relay::InterpreterClosure</a></td><td class="desc"></td></tr>
-<tr id="row_107_178_1_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1VMClosure.html" target="_self">tvm::runtime::vm::VMClosure</a></td><td class="desc">Reference to closure </td></tr>
-<tr id="row_107_179_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1Map.html" target="_self">tvm::runtime::Map&lt; K, V, typename, typename &gt;</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Map.html" title="Map container of NodeRef-&gt;NodeRef in DSL graph. Map implements copy on write semantics, which m [...]
-<tr id="row_107_180_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_180_" class="arrow" onclick="toggleFolder('107_180_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataBase.html" target="_self">tvm::runtime::metadata::MetadataBase</a></td><td class="desc">Reference class for the common <a class="el" href="classtvm_1_1ru [...]
-<tr id="row_107_180_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataArray.html" target="_self">tvm::runtime::metadata::MetadataArray</a></td><td class="desc">Reference class for <a class="el" href="classtvm_1_1runtime_1_1metadata_1_1MetadataArray.html" title="Reference class for MetadataArray. ">MetadataA [...]
-<tr id="row_107_181_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1Module.html" target="_self">tvm::runtime::Module</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Module.html" title="Module container of TVM. ">Module</a> container of TVM </td></tr>
-<tr id="row_107_182_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1NDArray.html" target="_self">tvm::runtime::NDArray</a></td><td class="desc">Managed <a class="el" href="classtvm_1_1runtime_1_1NDArray.html" title="Managed NDArray. The array is backed by reference counted blocks. ">NDArray</a>. The array is backed by referenc [...]
-<tr id="row_107_183_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1Optional.html" target="_self">tvm::runtime::Optional&lt; T &gt;</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Optional.html" title="Optional container that to represent to a Nullable variant of T. ">Optional</a> container that to represen [...]
-<tr id="row_107_184_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1PackedFunc.html" target="_self">tvm::runtime::PackedFunc</a></td><td class="desc">Packed function is a type-erased function. The arguments are passed by packed format </td></tr>
-<tr id="row_107_185_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1DeviceWrapper.html" target="_self">tvm::runtime::profiling::DeviceWrapper</a></td><td class="desc">Wrapper for <code>Device</code> </td></tr>
-<tr id="row_107_186_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1MetricCollector.html" target="_self">tvm::runtime::profiling::MetricCollector</a></td><td class="desc">Wrapper for <code><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1MetricCollectorNode.html" title="Interface for user defined profiling  [...]
-<tr id="row_107_187_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1profiling_1_1Report.html" target="_self">tvm::runtime::profiling::Report</a></td><td class="desc"></td></tr>
-<tr id="row_107_188_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1ShapeTuple.html" target="_self">tvm::runtime::ShapeTuple</a></td><td class="desc">Reference to shape tuple objects </td></tr>
-<tr id="row_107_189_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1String.html" target="_self">tvm::runtime::String</a></td><td class="desc">Reference to string objects </td></tr>
-<tr id="row_107_190_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1Timer.html" target="_self">tvm::runtime::Timer</a></td><td class="desc"><a class="el" href="classtvm_1_1runtime_1_1Timer.html" title="Timer for a specific device. ">Timer</a> for a specific device </td></tr>
-<tr id="row_107_191_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1runtime_1_1vm_1_1Storage.html" target="_self">tvm::runtime::vm::Storage</a></td><td class="desc">Reference to storage </td></tr>
-<tr id="row_107_192_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1SourceName.html" target="_self">tvm::SourceName</a></td><td class="desc">The source name of a file span </td></tr>
-<tr id="row_107_193_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Span.html" target="_self">tvm::Span</a></td><td class="desc"></td></tr>
-<tr id="row_107_194_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1Target.html" target="_self">tvm::Target</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetNode.html" title="Compilation target. ">TargetNode</a> </td></tr>
-<tr id="row_107_195_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetKind.html" target="_self">tvm::TargetKind</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetKindNode.html" title="Target kind, specifies the kind of the target. ">TargetKindNode</a> </td></tr>
-<tr id="row_107_196_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1TargetTag.html" target="_self">tvm::TargetTag</a></td><td class="desc">Managed reference class to <a class="el" href="classtvm_1_1TargetTagNode.html" title="A target tag. ">TargetTagNode</a> </td></tr>
-<tr id="row_107_197_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1IterVarAttr.html" target="_self">tvm::te::IterVarAttr</a></td><td class="desc">Additional scheduable attributes about IterVar </td></tr>
-<tr id="row_107_198_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_198_" class="arrow" onclick="toggleFolder('107_198_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1IterVarRelation.html" target="_self">tvm::te::IterVarRelation</a></td><td class="desc">The schedule relation between IterVars can be <a class="el" href="classtvm_1_1te_1_1Split.h [...]
-<tr id="row_107_198_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Fuse.html" target="_self">tvm::te::Fuse</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1FuseNode.html" title="Fuse two domains into one domain. ">FuseNode</a> </td></tr>
-<tr id="row_107_198_1_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Rebase.html" target="_self">tvm::te::Rebase</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1RebaseNode.html" title="Rebase the iteration to make min to be 0. This is useful to normalize the Schedule to make every leaf...">Rebas [...]
-<tr id="row_107_198_2_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Singleton.html" target="_self">tvm::te::Singleton</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1SingletonNode.html" title="Singleton iterator [0, 1) ">SingletonNode</a> </td></tr>
-<tr id="row_107_198_3_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Split.html" target="_self">tvm::te::Split</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1SplitNode.html" title="Split the parent domain into product of outer and iter. ">SplitNode</a> </td></tr>
-<tr id="row_107_198_4_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Transform.html" target="_self">tvm::te::Transform</a></td><td class="desc"></td></tr>
-<tr id="row_107_199_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_199_" class="arrow" onclick="toggleFolder('107_199_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Operation.html" target="_self">tvm::te::Operation</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Operation.html" title="Operation that produces tensors. ">Operati [...]
-<tr id="row_107_199_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ComputeOp.html" target="_self">tvm::te::ComputeOp</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1ComputeOpNode.html" title="A Compute op that compute a tensor on certain domain. ">ComputeOpNode</a> </td></tr>
-<tr id="row_107_199_1_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ExternOp.html" target="_self">tvm::te::ExternOp</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1ExternOpNode.html" title="External computation that cannot be splitted. ">ExternOpNode</a> </td></tr>
-<tr id="row_107_199_2_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1HybridOp.html" target="_self">tvm::te::HybridOp</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1HybridOpNode.html" title="A computation operator that generated by hybrid script. ">HybridOpNode</a> </td></tr>
-<tr id="row_107_199_3_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1PlaceholderOp.html" target="_self">tvm::te::PlaceholderOp</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1PlaceholderOpNode.html" title="A placeholder op represents an input placeholder. ">PlaceholderOpNode</a> </td></tr>
-<tr id="row_107_199_4_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1ScanOp.html" target="_self">tvm::te::ScanOp</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1ScanOpNode.html" title="Symbolic scan. ">ScanOpNode</a> </td></tr>
-<tr id="row_107_199_5_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorComputeOp.html" target="_self">tvm::te::TensorComputeOp</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1TensorComputeOpNode.html" title="A TenorCompute op that compute a tensor with an tensor intrinsic. ">TensorComputeOpN [...]
-<tr id="row_107_200_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Schedule.html" target="_self">tvm::te::Schedule</a></td><td class="desc">Global schedule container For operations and all the operations they depend on. The schedule per <a class="el" href="classtvm_1_1te_1_1Operation.html" title="Operation that produces tensors. " [...]
-<tr id="row_107_201_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1SpecializedCondition.html" target="_self">tvm::te::SpecializedCondition</a></td><td class="desc">Specialized condition to enable op specialization </td></tr>
-<tr id="row_107_202_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Stage.html" target="_self">tvm::te::Stage</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Stage.html" title="Stage, contains scheduling for a stage of computation. ">Stage</a>, contains scheduling for a stage of computation </td></tr>
-<tr id="row_107_203_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorIntrin.html" target="_self">tvm::te::TensorIntrin</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1TensorIntrinNode.html" title="Node to represent a Tensor intrinsic operator. ">TensorIntrinNode</a> </td></tr>
-<tr id="row_107_204_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1TensorIntrinCall.html" target="_self">tvm::te::TensorIntrinCall</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1te_1_1TensorIntrinCallNode.html">TensorIntrinCallNode</a> </td></tr>
-<tr id="row_107_205_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BijectiveLayout.html" target="_self">tvm::tir::BijectiveLayout</a></td><td class="desc">Bijective function mapping for data layout transformation. Given two <a class="el" href="classtvm_1_1tir_1_1Layout.html" title="Managed reference to LayoutNode. ">Layout</a>, < [...]
-<tr id="row_107_206_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockRV.html" target="_self">tvm::tir::BlockRV</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1BlockRVNode.html" title="A random variable that evaluates to a TensorIR block. ">BlockRVNode</a> </td></tr>
-<tr id="row_107_207_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BlockScope.html" target="_self">tvm::tir::BlockScope</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1BlockScopeNode.html" title="An object with 1-to-1 correspondence with each block reference in the sref tree. This data structu [...]
-<tr id="row_107_208_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1Buffer.html" target="_self">tvm::tir::Buffer</a></td><td class="desc"><a class="el" href="classtvm_1_1tir_1_1Buffer.html" title="Buffer is a symbolic n-darray structure. It is a composition of primitive symbolic types...">Buffer</a> is a symbolic n-darray structur [...]
-<tr id="row_107_209_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1BufferRegion.html" target="_self">tvm::tir::BufferRegion</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1BufferRegionNode.html" title="Representing the region of multi-dimensional buffer access. ">BufferRegionNode</a> </td></tr>
-<tr id="row_107_210_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1CommReducer.html" target="_self">tvm::tir::CommReducer</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1CommReducerNode.html" title="A commutative reducer node to represent a commutative binary operator with identity element..." [...]
-<tr id="row_107_211_" class="even" style="display:none;"><td class="entry"><span style="width:16px;display:inline-block;">&#160;</span><span id="arr_107_211_" class="arrow" onclick="toggleFolder('107_211_')">&#9658;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1DataProducer.html" target="_self">tvm::tir::DataProducer</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1DataProducerNode.html" title="Base  [...]
-<tr id="row_107_211_0_" class="even" style="display:none;"><td class="entry"><span style="width:48px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1te_1_1Tensor.html" target="_self">tvm::te::Tensor</a></td><td class="desc"><a class="el" href="classtvm_1_1te_1_1Tensor.html" title="Tensor structure representing a possible input, or intermediate computation result. ">Tensor</a> structure representing a possible input [...]
-<tr id="row_107_212_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1Dependency.html" target="_self">tvm::tir::Dependency</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1DependencyNode.html" title="A tuple (src, dst, kind) representing certain types of dependency. For example, (A, B, kRAW) means [...]
-<tr id="row_107_213_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IndexMap.html" target="_self">tvm::tir::IndexMap</a></td><td class="desc"></td></tr>
-<tr id="row_107_214_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1Instruction.html" target="_self">tvm::tir::Instruction</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1InstructionNode.html" title="Schedule instructions each corresponds to a schedule primitive. ">InstructionNode</a> </td></tr>
-<tr id="row_107_215_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1InstructionKind.html" target="_self">tvm::tir::InstructionKind</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1InstructionKindNode.html" title="Kind of an instruction, e.g. Split, Reorder, etc. Besides the name, every kind of i [...]
-<tr id="row_107_216_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1IterVar.html" target="_self">tvm::tir::IterVar</a></td><td class="desc">Iteration Variable, represents an iteration over an integer interval </td></tr>
-<tr id="row_107_217_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1Layout.html" target="_self">tvm::tir::Layout</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1LayoutNode.html" title="Layout is to describe how data is organized within an N-dimention tensor. It is composed of upper cas...">Layo [...]
-<tr id="row_107_218_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1LoopRV.html" target="_self">tvm::tir::LoopRV</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1LoopRVNode.html" title="A random variable that evaluates to a TensorIR for loop. ">LoopRVNode</a> </td></tr>
-<tr id="row_107_219_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1MatchBufferRegion.html" target="_self">tvm::tir::MatchBufferRegion</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1MatchBufferRegionNode.html" title="Match introduces a constraint that the source buffer region can be remapped t [...]
-<tr id="row_107_220_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1Schedule.html" target="_self">tvm::tir::Schedule</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1ScheduleNode.html" title="The user-facing schedule class. ">ScheduleNode</a> </td></tr>
-<tr id="row_107_221_" class="even" style="display:none;"><td class="entry"><span style="width:32px;display:inline-block;">&#160;</span><span class="icona"><span class="icon">C</span></span><a class="el" href="classtvm_1_1tir_1_1ScheduleState.html" target="_self">tvm::tir::ScheduleState</a></td><td class="desc">Managed reference to <a class="el" href="classtvm_1_1tir_1_1ScheduleStateNode.html" title="The state of scheduling, which exposes a Replace method as the primary interface for all  [...]
... 17726 lines suppressed ...