# Feature Improvement

## Accelerator and Microcontroller Support

- Clean up legacy Verilog code ([#4576](
- uTVM support for ARM STM32F746XX boards ([#4274](
- Add --runtime=c, remove micro_dev target, enable LLVM backend [#6145](

## Arithmetic Analysis

- Linear system and equation solver ([#5171](
- Inequalities solver [#5618](
- Improve IntervalSet's floormod ([#5367](
- Remove legacy const pattern functions ([#5387](
- Handle likely in IRMutatorWithAnalyzer [#5665](
- Merge ExtendedEuclidean implementation into int_operator [#5625](
- Rewrite simplify fix for Vectorized Cooperative Fetching [#5924](
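
As a rough illustration of the arithmetic behind the solver and ExtendedEuclidean entries above, the sketch below implements the textbook extended Euclidean algorithm in plain Python and uses it to find one integer solution of `a*x + b*y = c`. It is not TVM's implementation: the names `extended_euclidean` and `solve_linear_diophantine` are hypothetical, and the code assumes non-negative integer coefficients.

```python
def extended_euclidean(a: int, b: int):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g.

    Assumes a >= 0 and b >= 0 (an assumption of this sketch).
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_diophantine(a: int, b: int, c: int):
    """Return one integer solution (x, y) of a*x + b*y == c, or None."""
    g, x, y = extended_euclidean(a, b)
    if g == 0:
        # Both coefficients are zero: solvable only when c is zero too.
        return (0, 0) if c == 0 else None
    if c % g != 0:
        # c must be a multiple of gcd(a, b) for a solution to exist.
        return None
    scale = c // g
    return x * scale, y * scale


if __name__ == "__main__":
    g, x, y = extended_euclidean(240, 46)
    print(g, x, y)
    print(solve_linear_diophantine(6, 4, 10))
```

A solution exists exactly when `c` is a multiple of `gcd(a, b)`, which is why the helper returns `None` for inputs such as `(6, 4, 5)`.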

## AutoTVM and Graph Tuner

- Adding ROCM schedules for TOPI ([#4507](
- NHWC conv2d schedule templates for ARM ([#3859](
- Use VM compile to extract autotvm tasks [#4328](
- Download fallback schedule file if it does not exist [#4671](
- Ignore error when removing tmpdir [#4781](
- Fix a bug in generating the search space [#4779](
- Minor bug fixes in AutoTVM for QNN graphs [#4797](
- Fix autotvm customized template [#5034](
- Add opt out operator for has_multiple_inputs for graph tuner [#5000](
- Customize SI prefix in logging ([#5411](
- Update XGBoost verbosity option [#5649](
- Support range in index based tuners [#4870](
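
To make the "Customize SI prefix in logging" entry more concrete, here is a minimal sketch of scaling a throughput figure by a chosen SI prefix before printing it. The helper `format_flops` and its prefix table are assumptions written for this example, not AutoTVM's actual logging code.

```python
# Scale factors for a few common SI prefixes (subset chosen for this sketch).
SI_SCALE = {"k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}


def format_flops(flops: float, si_prefix: str = "G") -> str:
    """Format a FLOPS value using the requested SI prefix."""
    if si_prefix not in SI_SCALE:
        raise ValueError(f"unsupported SI prefix: {si_prefix!r}")
    return f"{flops / SI_SCALE[si_prefix]:.2f} {si_prefix}FLOPS"


if __name__ == "__main__":
    print(format_flops(2.5e9))       # default prefix "G"
    print(format_flops(2.5e9, "M"))  # same value, finer unit
```

A configurable prefix matters mostly for small workloads, where a fixed giga-scale unit would round most measurements to zero.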


## BYOC

- [BYOC] Bind constant tuples in graph partitioner ([#5476](
- [BYOC] Add support for composite functions in BYOC ([#5261](
- [BYOC] Register pattern tables from external codegens ([#5262](
- [BYOC] Enhance partitioning and external codegen ([#5310](
- [BYOC] Refine AnnotateTarget and MergeCompilerRegion Passes ([#5277](
- [BYOC] Use Non-Recursive Visitor/Mutator ([#5410](
- [BYOC] Refine DNNL Codegen ([#5288](
- [BYOC] Add example of Composite + Annotate for DNNL fused op ([#5272](
- [BYOC] Prevent duplicate outputs in subgraph Tuple ([#5320](

## Codegen

- Intrinsic dispatching with OCML instead of LLVM for ROCm ([#4499](
- Make target codegen take IRModule and PrimFunc. [#5107](
- Enhance CUDA codegen for SelectNode [#4983](
- Vectorization for intrinsics [#5101](
- [LLVM] Do not use x86_vcvtph2ps_256 intrinsic with LLVM 11+ ([#5267](
- [LLVM] Use llvm::ElementCount with LLVM 11+ when creating vectors ([#5265](
- [LLVM] Use llvm::FunctionCallee in IRBuilder::CreateCall with LLVM 11+ ([#5338](
- [LLVM] Include Support/Host.h for declaration of getDefaultTargetTriple ([#5268](
- [LLVM] Replace calls to Type::getVectorNumElements ([#5398](
- [LLVM] Use ArrayRef in calls to CreateShuffleVector ([#5399](
- [LLVM] Use llvm::Align with LLVM 11+ to avoid warnings ([#5264](
- [CodeGen] Cleanup generated code (#5424)
- Rename target_id => target_kind [#6199](
- 64-bit RPi4b target [#6211](
- Creating Target from JSON-like Configuration [#6218](
- Add python binding to new JSON target construction [#6315](
- Use target class in all codegens [#6347](
- Initial support for Hexagon codegen [#6261](
- Add --runtime=c, remove micro_dev target, enable LLVM backend [#6145](
- Add tvm::support::hexdump() debug utility [#6154](
- Adding AMD codegen unit tests ([#4509](
- Support cuda tensorcore subbyte int data type in auto tensorcore [#4546](
- Handle empty LLVMModule in GetFunction [#5146](
- Support int4/int8 conv2d tensor core with HWNC layout [#6121](
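
The `tvm::support::hexdump()` entry above refers to a C++ debug utility. As a hedged illustration of what a hexdump helper typically produces, the sketch below is a plain-Python analogue; its name, signature, and output layout are assumptions for this example and are not taken from TVM.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex column, and printable-ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short rows.
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {ascii_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(hexdump(b"TVM\x00generated code", width=8))
```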

## Dynamism Support

- Add shape function for zero, zeros_like, ones, ones_like ([#4448](, tile ([#4441](
- Support symbolic newshape for Reshape #5429
- Support symbolic TopK, Ones, Zeros and Full [#5459](
- Add shape_of instruction [#5855](
- Symbolic max_output_size [#5844](
- Dynamic TopK Op [#6008](
- Dynamic broadcast_to, zeros, ones [#6007](
- Add dynamic reshape grad [#6080](
- Keep fixed dim when unifying dynamic shape [#5795](
- OneHot operation [#6209](
- Add Dynamic Resize Op [#6198](
- Dynamic full operator [#6260](
- Dynamic upsampling relay op [#6273](
- Dynamic Tile Op [#5983](
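
Several entries above add shape functions, which compute an operator's output shape from input shapes that are only known at run time. The sketch below shows the idea for `tile`, following NumPy's `tile` shape rule (left-pad the shorter of shape and reps with ones, then multiply elementwise). The function name `tile_shape` is hypothetical, and this is not the Relay shape function itself.

```python
from typing import Sequence, Tuple


def tile_shape(data_shape: Sequence[int], reps: Sequence[int]) -> Tuple[int, ...]:
    """Compute the output shape of tile(data, reps) using NumPy semantics."""
    ndim = max(len(data_shape), len(reps))
    # Left-pad both sequences with ones so they have equal rank.
    shape = (1,) * (ndim - len(data_shape)) + tuple(data_shape)
    full_reps = (1,) * (ndim - len(reps)) + tuple(reps)
    return tuple(dim * rep for dim, rep in zip(shape, full_reps))


if __name__ == "__main__":
    print(tile_shape((2, 3), (2,)))
    print(tile_shape((3,), (2, 2)))
```

Because the result depends only on the input shape and `reps`, it can be evaluated at run time before the output buffer is allocated.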

## Frontend and User Interface

- TFLite parser support for transpose_conv ([#4440](, unpack ([#4447](
- LLDB pretty printers for relay ([#4453](
- ONNX to Relay converter op support: expand op ([#4483](
- ONNX auto_pad in conv and convtranspose ([#4563](
- TF to Relay converter op support: bilinear and neighbour implementation refactor ([#4504](, max_pool3d ([#4551](, conv2d_transpose with “same” padding support for larger than 1x1 kernels ([#4484](
- Remove unnecessary cast of constants in ONNX converter ([#4573](
- Add support for tf.Keras networks in Relay Keras frontend [#4630](
- Add conv3d [#4604](
- Fix incorrect calculations in tf SLICE [#4518](
- Dynamically calculate input_stats of any fake_quant range [#4789](
- LSTM Support [#4825](
- Add MIRROR_PAD operator [#4822](
- Use QNN helper function in softmax [#4840](
- Add Resize op converter [#4838](
- Add support for TFLite_Detection_PostProcess [#4543](
- Fix tests for tflite unary elemwise operations [#4913](
- GaussianDropout/Noise parsing support [#4928](
- Add parser support for 'square' operator [#4915](
- make_loss operator support [#4930](
- Add parser support for l2_normalization [#4966](
- ReadVariableOp operator support [#4952](
- Check graph inputs match expected [#4992](
- Support multiple outputs [#4980](
- TFLite: Using real image for QNN testing. [#4816](
- TFLite: FLOOR_MOD & FLOOR_DIV support [#4971](
- PyTorch: Upsampling op support and enable registering a user defined op conversion map [#4961](
- PyTorch: fix unordered dictionary problem for python version under 3.6 [#4982](
- Operator support NonZero [#5073](
- Add support for quantized models via QNN [#4977](
- Add initial control flow support [#4964](
- Remove FP32 piggy back and use QNN add/mul/concatenate [#5061](
- Add missing upcast to uint8 avg_pool conversion [#5089](
- Add initial 3D op support and test on Resnet 3D [#5075](
- Fix conv2d conversion for group conv (group > 1 but != in channels) [#5132](
- Add support for max_pool1d [#5142](
- Add support for split [#5174](
- Activation functions support [#4978](
- Round op parsing support added [#5022](
- DepthToSpace and SpaceToDepth support [#5041](
- TOP_K op parser support [#5051](
- Reduce_any op parsing support [#4926](
- TensorFlow Parser Control Flow Enhancement [#5020](
- TensorFlow Frontend support with shared params [#5042](
- Support for AddV2 in Relay Tensorflow frontend converter. [#5046](
- conv3d frontend operator support [#5080](
- Max_pool3d and Averagepool3d operator support [#5085](
- Support for Atan/Atan2 in Relay Tensorflow frontend converter. [#5104](
- Use leaky by default for LeakyReLU [#5192](
- Conv3D ONNX support and conv3D_ncdhw x86 schedules [#4949](
- Add support for FusedBatchNormV3 [#5065](
- Activations for pytorch [#5194](
- Dropouts And InstanceNorm support added [#5203](
- [Frontend] Asymmetric padding of convolution support ([#4803](
- [ONNX]Pool3d & upsample3d op support ([#5135](
- Add TopK to ONNX Frontend ([#5441](
- Add RoiAlign to Onnx frontend ([#5454](
- [PYTORCH]AvgPool3d, MaxPool3d and Squeeze op support ([#5220](
- [PYTORCH]celu, gelu, selu activations ([#5263](
- [Pytorch]layernorm bug fix and testcase updated ([#5257](
- [PYTORCH]LayerNorm support added ([#5249](
- [RELAY-OP][PYTORCH]GroupNorm op support added ([#5358](
- [TOPI][PYTORCH]Logical & Bitwise operator support ([#5341](
- [PYTORCH]Tensor creation ops support ([#5347](
- [RELAY][PYTORCH]cosh,sinh,log2,log10,log1p op support ([#5395](
- [PYTORCH]Rsub, Embedded, OneHot ops support ([#5434](
- [PYTORCH]Abs, Arange, Softplus ops ([#5295](
- [RELAY][PYTORCH]isNan, isinf, isfinite, ceil, clamp, round ops ([#5316](
- [PYTORCH]Activations for pytorch ([#5194](
- [PYTORCH]Repeat, Reciprocal & Reshape Op support ([#5280](
- [PYTORCH]Reduce_ops support added ([#5308](
- [PYTORCH]Take, Topk op support ([#5332](
- [PYTORCH]Dropouts And InstanceNorm support added ([#5203](
- [PYTORCH]Unary Ops frontend support. ([#5378](
- [Torch] Support Python list, more realistic recurrent networks ([#5306](
- [PYTORCH]where, addcdiv, addcmul op support ([#5383](
- [Torch] Add support for split ([#5174](
- [Frontend][Torch] Fix up graph input handling ([#5204](
- [FRONTEND][TFLITE]Logical not op support ([#5475](
- [TFLITE]Hard Swish & MobileNetV3 model testing ([#5239](
- [FRONTEND][TFLITE]Gather, StridedSlice op support added ([#4788](
- [TFLITE] Match TFLite shape for SSD custom op ([#5473](
- Factor out import of common tflite.Operator in tflite frontend. ([#5355](
- [Frontend][TFLite] support for FILL and SPLIT_V operators ([#5330](
- [Frontend][TFLite] L2_POOL_2D operator ([#5452](
- [TFLite] Add config option to specify FlatBuffers location ([#5425](
- [TENSORFLOW]reduce ops updated ([#5180](
- [FRONTEND][TENSORFLOW] Fix gather_nd indices ([#5279](
- [Frontend][TensorFlow]Improve TensorFlow Static Shape Tensor Array ([#5243](
- [KERAS]Minimum & AlphaDropout op support ([#5380](
- [KERAS]Embedding layer ([#5444](
- [FRONTEND][KERAS]Max_pool3d and Averagepool3d operator support ([#5085](
- [RELAY][FRONTEND][CAFFE2] add Mul and ConvTranspose operator ([#5302](
- [MXNET]DepthToSpace & SpaceToDepth Operator ([#5408](
- [MXNET]broadcast and logical op support ([#5461](
- [FRONTEND][MXNET] Use leaky by default for LeakyReLU ([#5192](
- [FRONTEND][MXNET] support elemwise logic ops ([#5361](
- [Frontend|MXNet] SwapAxis operator support ([#5246](
- [RELAY] Move frontend utils (#5345)
- [Frontend][Pytorch] Fix translation of transpose when axis argument is a list (#5451)
- LpPool Support added [#5696](
- Skip ADD inside Gemm op when vector is zero [#5697](
- ReduceL1, ReduceL2, ReduceSumSquare, ReduceLogSum ops added [#5721](
- MaxRoiPool, Mod & Xor op support added [#5729](
- Skip multiply with 1.0f constant for GEMM import [#5800](
- StatefulPartitionedCall/PartitionedCall Ops support added [#5617](
- Don't add cast for batch norm when type isn't changing [#5731](
- Conv3d Transpose OP added [#5775](
- expand bug fix [#5576](
- Support max_pool2d_with_indices [#5549](
- Add prim::device op [#5584](
- ImplicitTensorToNum support added [#5603](
- Matmul fix for batch_matmul [#5604](
- ReflectionPad2d op [#5624](
- Padding op support [#5638](
- Minor bug fixes [#5683](
- floor_divide support for squeezenet [#5702](
- ReplicationPad support added [#5708](
- aten::norm support added [#5776](
- broadcast and logical op support [#5461](
- MaxPool3d and AvgPool3d Ops support added [#5614](
- Softmin, trunc op support added [#5715](
- conv3d and conv3d_transpose added [#5814](
- Model importer to be compatible with tflite 2.1.0 [#5497](
- Nit: Function names made consistent [#5515](
- Select op support for tflite frontend [#5486](
- GATHER_ND [#5508](
- Quantize & Dequantize op [#5394](
- Fully connected op conversion made in sync with TFLite [#5510](
- ADD_N operator [#5474](
- onnx, mxnet, pytorch mathops added [#5561](
- abs, round, reciprocal, sign, softsign, hard_sigmoid ops support [#5587](
- Gather nd bug fix for one dim support in tensorflow [#5588](
- Add parser support for shape and range [#5329](
- Darknet support batch size for yolo [#5688](
- Improve Control Flow and TensorArray [#5699](
- MXNet: Softmin, trunc op support added [#5715](
- MXNet: conv3d and conv3d_transpose added [#5814](
- MXNet: Add parser for contrib.box_decode [#5967](
- Onnx: ReduceL1, ReduceL2, ReduceSumSquare, ReduceLogSum ops added [#5721](
- Onnx: MaxRoiPool, Mod & Xor op support added [#5729](
- Onnx: Skip multiply with 1.0f constant for GEMM import [#5800](
- Onnx: Fix an issue with #5755 and add Batch norm unit tests. [#5845](
- TensorFlow: StatefulPartitionedCall/PartitionedCall Ops support added [#5617](
- TensorFlow: Don’t add cast for batch norm when type isn’t changing [#5731](
- TensorFlow: Conv3d Transpose OP added [#5775](
- Improve TF Parser to keep output nodes for saved_model [#5794](
- Add parser support for relu6, leaky_relu, relu_n1_to_1, log_softmax [#4805](
- Fix TF Dynamic input shape [#5825](
- Support a few contrib ops in mxnet [#5819](
- Check all unsupported ops before raising an exception [#5929](
- Add Pytorch advanced indexing [#6318](
- Support index_select [#6295](
- Fix cast to long [#6301](
- Fix dtype handling for modules with integer parameters [#6311](
- PyTorch frontend: support conv1d [#6203](
- Add cast to double, fix flatten conversion [#6357](
- Fix aten::max and aten::min conversion [#6372](
- Match pytorch 1.6 googlenet pretrained model (#6201) [#6212](
- Add unbiased variance op and corresponding support in pytorch frontend [#6232](
- Implemented PADV2 Operator for TFLite and added support for constant values in PAD. [#6167](
- Implemented ONE_HOT Operator for TFLite. [#6223](
- Implemented EXPAND_DIMS Operator for TFLite. [#6243](
- Implemented REVERSE_V2 Operator for TFLite. [#6304](
- Implemented MATRIX_SET_DIAG Operator for Relay/TOPI and TFLite Frontend. [#6303](
- RESHAPE with dynamic shape arg in TFLite frontend [#6208](
- Constant input attr added to fully connected operation in TFLite frontend [#6228](
- Gather operation with indices as tensor expr in TFLite frontend [#6168](
- Added support for tflite quantized maximum and minimum [#6018](
- Unary ops support added in frontend [#6196](
- Introduce caffe frontend for tvm [#6206](
- Keras softmax and prelu fix under NHWC [#6278](
- Add support for MXNet NumPy operators [#6054](
- Refine tensorflow frontend 1.x & 2.x compatibility [#6240](
- Reduceops support added to frontend [#6252](
- Update precision in the ONNX strided_slice, update precision of ToScalar [#6272](
- NHWC import support. [#4899](
- Fix node indices attribute error for tensorflow 2.3 [#6288](
- Support NMSv4 [#6085](
- Support for PyTorch Non-Maximum Suppression [#6314](
- MXNet pre-quantized BERT [#6039](
- Keep parameter names from PyTorch [#5887](
- Refine LSTMBlockCell to support dynamic rnn [#5963](
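
Several frontend entries above concern ONNX `auto_pad` handling and asymmetric convolution padding. The sketch below shows, for one spatial dimension, how `SAME_UPPER` and `SAME_LOWER` padding can be derived so that the output size equals `ceil(input / stride)`. It follows the semantics described in the ONNX operator documentation as understood here; the function name `same_padding_1d` is hypothetical and this is not the Relay frontend's code.

```python
import math
from typing import Tuple


def same_padding_1d(in_size: int, kernel: int, stride: int = 1,
                    dilation: int = 1, mode: str = "SAME_UPPER") -> Tuple[int, int]:
    """Return (pad_begin, pad_end) for one spatial dimension."""
    out_size = math.ceil(in_size / stride)
    effective_kernel = (kernel - 1) * dilation + 1
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
    small, large = total // 2, total - total // 2
    if mode == "SAME_UPPER":
        # Any odd leftover padding goes at the end.
        return small, large
    if mode == "SAME_LOWER":
        # Any odd leftover padding goes at the beginning.
        return large, small
    raise ValueError(f"unsupported auto_pad mode: {mode!r}")


if __name__ == "__main__":
    print(same_padding_1d(5, kernel=3, stride=2))
    print(same_padding_1d(4, kernel=3, stride=2))
    print(same_padding_1d(4, kernel=3, stride=2, mode="SAME_LOWER"))
```

When the total padding is odd, the begin and end values differ, which is the case that requires asymmetric padding support in the convolution operator.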

