Posted to commits@mxnet.apache.org by gi...@apache.org on 2023/01/04 12:10:56 UTC
[mxnet] branch dependabot/pip/cd/utils/pyyaml-5.4 updated (389478b639 -> d83748854f)

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/pip/cd/utils/pyyaml-5.4
in repository https://gitbox.apache.org/repos/asf/mxnet.git


 discard 389478b639 Bump pyyaml from 5.1 to 5.4 in /cd/utils
     add b322bee0e7 [FEATURE] Add property removing duplicate Cast operations (#21020)
     add 08f578b946 Fix test_bf16_binary_broadcast_elemwise_mixed_input (#20986)
     add 1dba76998d Diversify default RNG seed (#21058)
     add 1ad198d639 [FEATURE] Refactor SwapAxis operator. (#21024)
     add 63aea9e031 [FEATURE] Add quantization for npi_add with oneDNN (#21041)
     add ef2be51265 Refactor SupportDNNL functions (#21032)
     add b4aca83e31 Use requested mem in dot op to reduce memory usage (#21067)
     add afbef154ed Type fix for FullyConnected with sum (#21043)
     add c486a0e304 [master] Node elimination graph pass (#21046)
     add cdffaf0994 [FEATURE] Add tanh approximation for GeLU activation (#21034)
     add d8872c876b Fix building of master website due to removed blog page. (#21083)
     add 9745d36ff4 Improve masked_softmax performance with temperature parameter (#21082)
     add 5abdc77f3c [FEATURE] Add _npi_power_scalar and _npi_multiply_scalar fuse (#20976)
     add b713dc5aa3 [BUGFIX] Fix DNNL requantize operator overflow error (#21079)
     add 84b1626b66 oneDNN FullyConnected weight caching & refactor (#21047)
     add 26243eea86 Fix broadcast ops descriptions (#21087)
     add e36c9f075a Refactor fc_sum_fuse (#21077)
     add e522bea513 [BUGFIX] Fix Gluon2.0 guide (#21090)
     add ded6096126 [FEATURE] Add pytest with benchmarking operator (#21088)
     add f6d1ed1872 Improve bf16 support (#21002)
     add cf15e0a478 [BUGFIX] Fix remove Cast fuse (#21086)
     add ef0415d645 [BUGFIX] Fix floor divide (#21096)
     add cca8f4e8c6 Reduce overhead in sg_onednn_fully_connected for floats (#21092)
     add 7b1daf9bc3 Requantize scale fix (#21100)
     add 183e012f01 Get rid of warnings (#21099)
     add ecb5026116 Comments formatting fix (#21101)
     add 5e5e0e3fc1 [BUGFIX] Fix SupportDNNL for multiple inputs (#21102)
     add db39bb1126 Fix multivariate normal bug (#21105)
     add dedb8c97af [WIP] [BUGFIX] Fix flakey TemporaryDirectory() cleanup on Windows (#21107)
     add 0b4ecdbc4a [BUGFIX] Fix threadsafety and shutdown issues with threaded_engine_perdevice (#21110)
     add 97e25cfc7a [submodule] Upgrade oneDNN to v2.6.1 (#21108)
     add 9975ab41a6 [BUGFIX] Reenable fwd conv engine 5 on test_group_conv2d_16c (#21104)
     add 6d1fbe35d2 Add size threshold for few oneDNN operators (#21106)
     add 736313f4e7 Add support for bool data type for condition in where operator (#21103)
     add 1058369f8a [BUGFIX] _npi_repeats with swap (#21112)
     add 7748ae7edf docs: Fix a few typos (#21094)
     add daac02c785 Fix fused resnet low accuracy (#21122)
     add 1a418e4e1c [FEATURE] Add query_keys transformer version without split (#21115)
     add 2d72ce465a [DOC] Add tutotrial about improving accuracy of quantization with oneDNN (#21127)
     add 8d933fdcdb Add proper link to scripts in quantization with INC example (#21133)
     add 3a19f0e50d [FEATURE] Dnnl sum primitive path (#21132)
     add f803641b53 [DOC] Add custom strategy script to quantization with INC example (#21134)
     add bd6405b787 Add quantized batch norm operator fused with ReLU (#21137)
     add c8922fedff Python string formatting (#21136)
     add 7d602e3b23 [DOC] Fix the table in Improving accuracy with INC (#21140)
     add 48d7f4af70 Port top-level-project updates from v1.x branch (#21162)
     add d83748854f Bump pyyaml from 5.1 to 5.4 in /cd/utils

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs when a
user force-pushes a change (git push --force), producing a history
like this:

 * -- * -- B -- O -- O -- O   (389478b639)
            \
             N -- N -- N   refs/heads/dependabot/pip/cd/utils/pyyaml-5.4 (d83748854f)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.
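
The B / O / N split described above can be illustrated with a minimal
sketch. It assumes both the old and the new branch are simple linear
histories listed root-first; the revision labels are hypothetical
placeholders that mirror the diagram, not real hashes, and the helper
name classify_force_push is made up for this example. It does not
model whether an O revision ends up marked "omit" or "discard", since
that depends on other references in the repository.

```python
def classify_force_push(old_branch, new_branch):
    """Split two linear histories (root first) into (B, O revisions, N revisions).

    B is the last revision shared by both histories, the O revisions exist
    only on the old branch tip, and the N revisions only on the new one.
    """
    shared = 0
    for old_rev, new_rev in zip(old_branch, new_branch):
        if old_rev != new_rev:
            break
        shared += 1
    base = old_branch[shared - 1] if shared else None
    return base, old_branch[shared:], new_branch[shared:]


# Hypothetical labels mirroring the diagram: * -- * -- B -- O -- O -- O
#                                                       \
#                                                        N -- N -- N
old = ["A1", "A2", "B", "O1", "O2", "O3"]
new = ["A1", "A2", "B", "N1", "N2", "N3"]

base, old_only, new_only = classify_force_push(old, new)
print("common base:", base)
print("O revisions (no longer on the branch):", old_only)
print("N revisions (described by the following emails):", new_only)
```

In this notification the old tip is 389478b639 and the new tip is
d83748854f; the "add" lines listed earlier correspond to the N side.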

No new revisions were added by this update.

Summary of changes:
 .licenserc.yaml                                    |   6 +
 3rdparty/mshadow/mshadow/base.h                    |  20 +-
 3rdparty/mshadow/mshadow/bfloat.h                  |   2 +
 3rdparty/onednn                                    |   2 +-
 CMakeLists.txt                                     |  17 +
 CONTRIBUTORS.md                                    |   9 +-
 DISCLAIMER                                         |  10 -
 LICENSE                                            |   4 +-
 NEWS.md                                            |  96 +--
 NOTICE                                             |   4 +-
 README.md                                          |  47 +-
 benchmark/opperf/README.md                         |   6 +-
 benchmark/opperf/nd_operations/misc_operators.py   |   2 +-
 benchmark/opperf/opperf.py                         |   6 +-
 benchmark/opperf/rules/default_params.py           |   8 +-
 .../opperf/utils/benchmark_operators_pytest.py     | 110 ++++
 benchmark/opperf/utils/benchmark_utils.py          |  35 +-
 benchmark/opperf/utils/common_utils.py             |   9 +-
 benchmark/opperf/utils/ndarray_utils.py            |   6 +-
 benchmark/opperf/utils/op_registry_utils.py        |   4 +-
 benchmark/python/control_flow/rnn.py               |   8 +-
 benchmark/python/dnnl/fc_add.py                    |  26 +-
 benchmark/python/einsum/benchmark_einsum.py        |  20 +-
 benchmark/python/ffi/benchmark_ffi.py              |  10 +-
 benchmark/python/metric/benchmark_metric.py        |  16 +-
 benchmark/python/quantization/benchmark_op.py      |   9 +-
 benchmark/python/sparse/cast_storage.py            |   6 +-
 benchmark/python/sparse/dot.py                     |  35 +-
 benchmark/python/sparse/sparse_op.py               |  18 +-
 cd/README.md                                       |   2 +-
 cd/python/pypi/README.md                           |   2 +-
 cd/python/pypi/pypi_package.sh                     |   2 +-
 cd/utils/artifact_repository.md                    |   2 +-
 cd/utils/requirements.txt                          |  16 +
 ci/docker/runtime_functions.sh                     |   7 +-
 ci/jenkins/Jenkinsfile_centos_gpu                  |   2 +-
 ci/jenkins/Jenkinsfile_unix_cpu                    |   2 +-
 ci/publish/website/deploy.sh                       |  12 +-
 ci/requirements.txt                                |  16 +
 conftest.py                                        |   8 +-
 contrib/tvmop/opdef.py                             |   2 +-
 contrib/tvmop/space.py                             |  10 +-
 cpp-package/README.md                              |  10 +-
 cpp-package/example/README.md                      |  30 +-
 cpp-package/example/inference/README.md            |  18 +-
 .../multi_threaded_inference/get_model.py          |   2 +-
 .../multi_threaded_inference.cc                    |   2 +-
 cpp-package/include/mxnet-cpp/contrib.h            |   6 +-
 cpp-package/include/mxnet-cpp/symbol.hpp           |   6 +-
 cpp-package/scripts/OpWrapperGenerator.py          |  40 +-
 cpp-package/scripts/lint.py                        |   7 +-
 cpp-package/tests/ci_test.sh                       |   2 +-
 doap.rdf                                           |   6 +-
 docker/docker-python/README.md                     |   4 +-
 docs/README.md                                     |  10 +-
 docs/python_docs/python/scripts/conf.py            |   8 +-
 docs/python_docs/python/scripts/md2ipynb.py        |   4 +-
 docs/python_docs/python/scripts/process_rst.py     |   2 +-
 .../getting-started/crash-course/6-train-nn.md     |   2 +-
 .../getting-started/crash-course/7-use-gpus.md     |   2 +-
 .../gluon_from_experiment_to_deployment.md         |   4 +-
 .../getting-started/gluon_migration_guide.md       |  40 +-
 .../packages/gluon/training/fit_api_tutorial.md    |   2 +-
 .../legacy/ndarray/gotchas_numpy_in_mxnet.md       |   4 +-
 .../performance/backend/dnnl/dnnl_quantization.md  |   2 +
 .../backend/dnnl/dnnl_quantization_inc.md          | 296 +++++++++
 .../performance/backend/dnnl/dnnl_readme.md        |  20 +-
 .../tutorials/performance/backend/dnnl/index.rst   |   9 +-
 .../tutorials/performance/backend/profiler.md      |   4 +-
 .../python/tutorials/performance/index.rst         |   2 +-
 .../themes/mx-theme/mxtheme/footer.html            |   6 +-
 .../themes/mx-theme/mxtheme/header_top.html        |   2 +-
 docs/static_site/src/_config.yml                   |   2 +-
 docs/static_site/src/_config_beta.yml              |   2 +-
 docs/static_site/src/_config_prod.yml              |   2 +-
 docs/static_site/src/_includes/footer.html         |  12 +-
 .../src/_includes/get_started/cloud/cpu.md         |   2 +-
 .../src/_includes/get_started/cloud/gpu.md         |   2 +-
 .../get_started/linux/python/cpu/docker.md         |   2 +-
 .../_includes/get_started/linux/python/cpu/pip.md  |   2 +-
 .../get_started/linux/python/gpu/docker.md         |   2 +-
 .../_includes/get_started/linux/python/gpu/pip.md  |   2 +-
 docs/static_site/src/_includes/header.html         |   3 +-
 docs/static_site/src/assets/img/asf_logo.svg       | 210 ++++++
 docs/static_site/src/index.html                    |   2 +-
 .../pages/api/architecture/exception_handling.md   |   2 +-
 .../src/pages/api/architecture/note_engine.md      |   2 +-
 .../src/pages/api/architecture/program_model.md    |   2 +-
 .../cpp/docs/tutorials/multi_threaded_inference.md |  16 +-
 .../docs/tutorials/mxnet_cpp_inference_tutorial.md |  22 +-
 docs/static_site/src/pages/api/cpp/index.md        |  10 +-
 ...github_contribution_and_PR_verification_tips.md |   6 +-
 .../exception_handing_and_custom_error_types.md    |   2 +-
 .../src/pages/api/faq/add_op_in_backend.md         |   8 +-
 .../src/pages/api/faq/distributed_training.md      |   8 +-
 docs/static_site/src/pages/api/faq/env_var.md      |   4 +-
 docs/static_site/src/pages/api/faq/float16.md      |   2 +-
 .../src/pages/api/faq/gradient_compression.md      |   2 +-
 .../src/pages/api/faq/large_tensor_support.md      |   6 +-
 docs/static_site/src/pages/api/faq/perf.md         |   2 +-
 .../pages/api/java/docs/tutorials/ssd_inference.md |   6 +-
 .../src/pages/api/r/docs/tutorials/symbol.md       |   2 +-
 .../src/pages/api/scala/docs/tutorials/infer.md    |   6 +-
 .../src/pages/api/scala/docs/tutorials/io.md       |   6 +-
 docs/static_site/src/pages/api/scala/index.md      |   4 +-
 docs/static_site/src/pages/ecosystem.html          |   2 +-
 .../src/pages/get_started/build_from_source.md     |   8 +-
 docs/static_site/src/pages/get_started/download.md |   2 +-
 docs/static_site/src/pages/get_started/index.html  |   2 +-
 .../src/pages/get_started/jetson_setup.md          |   4 +-
 .../src/pages/get_started/validate_mxnet.md        |   2 +-
 example/README.md                                  |   6 +-
 .../distributed_training-horovod/gluon_mnist.py    |  17 +-
 .../resnet50_imagenet.py                           |   2 +-
 example/distributed_training/README.md             |   2 +-
 example/distributed_training/cifar10_dist.py       |   2 +-
 .../distributed_training/cifar10_kvstore_hvd.py    |   2 +-
 example/extensions/lib_external_ops/CMakeLists.txt |  17 +
 example/extensions/lib_pass/test_pass.py           |   2 +-
 example/extensions/lib_subgraph/test_subgraph.py   |   6 +-
 .../house_prices/kaggle_k_fold_cross_validation.py |   7 +-
 example/gluon/image_classification.py              |  12 +-
 example/gluon/mnist/mnist.py                       |   6 +-
 example/gluon/super_resolution/super_resolution.py |   4 +-
 example/profiler/profiler_ndarray.py               |   6 +-
 example/quantization/README.md                     |  14 +-
 example/quantization/imagenet_gen_qsym_onednn.py   |  31 +-
 example/quantization/imagenet_inference.py         |  22 +-
 example/quantization_inc/custom_strategy.py        | 193 ++++++
 .../quantization_inc/resnet50v2_mse.yaml           |  22 +-
 example/quantization_inc/resnet_measurement.py     |  68 ++
 example/quantization_inc/resnet_mse.py             |  65 ++
 example/quantization_inc/resnet_tuning.py          | 116 ++++
 example/recommenders/movielens_data.py             |  12 +-
 include/mxnet/imperative.h                         |   1 +
 include/mxnet/op_attr_types.h                      |   2 +-
 include/mxnet/tensor_blob.h                        |   4 +-
 python/mxnet/_ffi/_ctypes/function.py              |   4 +-
 python/mxnet/_ffi/function.py                      |   2 +-
 python/mxnet/_ffi/node_generic.py                  |   2 +-
 python/mxnet/amp/amp.py                            |   8 +-
 python/mxnet/amp/lists/symbol_bf16.py              | 690 +++++++++++++------
 python/mxnet/amp/lists/symbol_fp16.py              |   7 +-
 python/mxnet/autograd.py                           |  12 +-
 python/mxnet/base.py                               |  41 +-
 python/mxnet/contrib/quantization.py               |  76 +--
 python/mxnet/contrib/tensorboard.py                |   2 +-
 python/mxnet/contrib/text/embedding.py             |  42 +-
 python/mxnet/contrib/text/vocab.py                 |   2 +-
 python/mxnet/device.py                             |   2 +-
 python/mxnet/error.py                              |   2 +-
 python/mxnet/executor.py                           |   4 +-
 python/mxnet/gluon/block.py                        |  54 +-
 python/mxnet/gluon/contrib/estimator/estimator.py  |   6 +-
 .../mxnet/gluon/contrib/estimator/event_handler.py |  61 +-
 python/mxnet/gluon/data/_internal.py               |  14 +-
 python/mxnet/gluon/data/batchify.py                |   4 +-
 python/mxnet/gluon/data/dataset.py                 |   9 +-
 python/mxnet/gluon/data/sampler.py                 |   4 +-
 python/mxnet/gluon/data/vision/datasets.py         |   5 +-
 python/mxnet/gluon/loss.py                         |   7 +-
 python/mxnet/gluon/metric.py                       |   8 +-
 python/mxnet/gluon/model_zoo/model_store.py        |  13 +-
 python/mxnet/gluon/model_zoo/vision/__init__.py    |   3 +-
 python/mxnet/gluon/model_zoo/vision/densenet.py    |   2 +-
 python/mxnet/gluon/model_zoo/vision/mobilenet.py   |   4 +-
 python/mxnet/gluon/model_zoo/vision/resnet.py      |   7 +-
 python/mxnet/gluon/model_zoo/vision/squeezenet.py  |   2 +-
 python/mxnet/gluon/model_zoo/vision/vgg.py         |   2 +-
 python/mxnet/gluon/nn/activations.py               |  12 +-
 python/mxnet/gluon/nn/basic_layers.py              |  89 +--
 python/mxnet/gluon/parameter.py                    | 120 ++--
 python/mxnet/gluon/rnn/conv_rnn_cell.py            |   2 +-
 python/mxnet/gluon/rnn/rnn_cell.py                 |   2 +-
 python/mxnet/gluon/rnn/rnn_layer.py                |   5 +-
 python/mxnet/gluon/trainer.py                      |  13 +-
 python/mxnet/gluon/utils.py                        |  11 +-
 python/mxnet/image/detection.py                    |  11 +-
 python/mxnet/image/image.py                        |   2 +-
 python/mxnet/initializer.py                        |  11 +-
 python/mxnet/io/io.py                              |  10 +-
 python/mxnet/io/utils.py                           |   4 +-
 python/mxnet/kvstore/base.py                       |   8 +-
 python/mxnet/kvstore/kvstore_server.py             |   3 +-
 python/mxnet/library.py                            |   6 +-
 python/mxnet/lr_scheduler.py                       |   2 +-
 python/mxnet/model.py                              |  20 +-
 python/mxnet/name.py                               |   2 +-
 python/mxnet/ndarray/contrib.py                    |  14 +-
 python/mxnet/ndarray/ndarray.py                    |  36 +-
 python/mxnet/ndarray/numpy/_op.py                  |   6 +-
 .../mxnet/ndarray/numpy_extension/control_flow.py  |   4 +-
 python/mxnet/ndarray/random.py                     |   6 +-
 python/mxnet/ndarray/register.py                   |   8 +-
 python/mxnet/ndarray/sparse.py                     |  11 +-
 python/mxnet/ndarray_doc.py                        |   7 +-
 python/mxnet/numpy/function_base.py                |   3 +-
 python/mxnet/numpy/multiarray.py                   |   2 +-
 python/mxnet/numpy_op_fallback.py                  |   2 +-
 python/mxnet/onnx/mx2onnx/_export_model.py         |   2 +-
 python/mxnet/onnx/mx2onnx/_export_onnx.py          |  10 +-
 .../_op_translations/_op_translations_opset12.py   |   6 +-
 .../_op_translations/_op_translations_opset13.py   |   2 +-
 python/mxnet/onnx/setup.py                         |   2 +-
 python/mxnet/operator.py                           | 115 ++--
 python/mxnet/optimizer/optimizer.py                |   9 +-
 python/mxnet/recordio.py                           |   4 +-
 python/mxnet/registry.py                           |  23 +-
 python/mxnet/rtc.py                                |  15 +-
 python/mxnet/symbol/contrib.py                     |  15 +-
 python/mxnet/symbol/numpy/_symbol.py               |   8 +-
 python/mxnet/symbol/random.py                      |   6 +-
 python/mxnet/symbol/register.py                    |   8 +-
 python/mxnet/symbol/symbol.py                      |  72 +-
 python/mxnet/symbol_doc.py                         |   7 +-
 python/mxnet/test_utils.py                         |  75 +--
 python/mxnet/util.py                               |  24 +-
 python/setup.py                                    |  10 +-
 rat-excludes                                       |   1 -
 .../operator/numpy_extension/npx_leaky_relu_op.cc  |   6 +-
 src/c_api/c_api_symbolic.cc                        |   8 +-
 src/c_api/c_api_test.cc                            |   1 +
 src/common/cuda/cudnn_cxx.cc                       |   2 +-
 src/common/cuda/cudnn_cxx.h                        |   4 +-
 src/common/cuda/rtc/backward_functions-inl.h       |   2 +-
 src/common/cuda/rtc/forward_functions-inl.h        |   2 +-
 src/common/utils.h                                 |   7 +-
 src/engine/threaded_engine_perdevice.cc            |  32 +-
 src/imperative/cached_op.cc                        |   2 +-
 src/imperative/imperative.cc                       |   2 +-
 src/ndarray/ndarray.cc                             |   4 +-
 src/nnvm/low_precision_pass.cc                     |   6 +-
 src/operator/c_lapack_api.h                        |   8 +
 src/operator/contrib/adaptive_avg_pooling.cc       |  22 +-
 src/operator/cudnn_ops.cc                          | 148 ++++-
 src/operator/cudnn_ops.h                           |  83 ++-
 src/operator/fusion/fused_op-inl.h                 |   4 +-
 src/operator/leaky_relu-inl.h                      |  42 +-
 src/operator/leaky_relu.cc                         |  19 +-
 src/operator/linalg_impl.h                         |   2 +-
 src/operator/mshadow_op.h                          |  60 +-
 src/operator/nn/batch_norm-inl.h                   |   2 +-
 src/operator/nn/batch_norm.cc                      |  25 +-
 src/operator/nn/concat.cc                          |  11 +-
 src/operator/nn/dnnl/dnnl_act.cc                   |  23 +-
 src/operator/nn/dnnl/dnnl_base-inl.h               | 182 +++--
 src/operator/nn/dnnl/dnnl_batch_dot-inl.h          |   6 +-
 src/operator/nn/dnnl/dnnl_batch_dot.cc             |  13 +-
 src/operator/nn/dnnl/dnnl_batch_norm-inl.h         | 420 +++---------
 .../{dnnl_batch_norm-inl.h => dnnl_batch_norm.cc}  | 287 +++-----
 src/operator/nn/dnnl/dnnl_binary.cc                |  20 +-
 src/operator/nn/dnnl/dnnl_convolution-inl.h        |  10 +-
 src/operator/nn/dnnl/dnnl_convolution.cc           |   3 +-
 src/operator/nn/dnnl/dnnl_deconvolution.cc         |   4 +-
 src/operator/nn/dnnl/dnnl_dot-inl.h                |   5 +-
 src/operator/nn/dnnl/dnnl_dot.cc                   |  32 +-
 src/operator/nn/dnnl/dnnl_eltwise.cc               |  10 +-
 src/operator/nn/dnnl/dnnl_fully_connected-inl.h    |  75 ++-
 src/operator/nn/dnnl/dnnl_fully_connected.cc       |  16 +-
 src/operator/nn/dnnl/dnnl_layer_norm.cc            |  12 +-
 src/operator/nn/dnnl/dnnl_log_softmax.cc           |  16 +-
 src/operator/nn/dnnl/dnnl_masked_softmax.cc        |  36 +-
 src/operator/nn/dnnl/dnnl_pooling-inl.h            |  14 +-
 src/operator/nn/dnnl/dnnl_pow_mul_scalar-inl.h     | 100 +++
 ...dnnl_power_scalar.cc => dnnl_pow_mul_scalar.cc} |  65 +-
 src/operator/nn/dnnl/dnnl_power_scalar-inl.h       |  66 --
 src/operator/nn/dnnl/dnnl_reduce.cc                |  13 +-
 src/operator/nn/dnnl/dnnl_reshape.cc               |   8 +-
 src/operator/nn/dnnl/dnnl_rnn-inl.h                |   1 +
 src/operator/nn/dnnl/dnnl_slice-inl.h              |  71 --
 src/operator/nn/dnnl/dnnl_slice.cc                 | 111 ----
 src/operator/nn/dnnl/dnnl_softmax.cc               |  22 +-
 src/operator/nn/dnnl/dnnl_softmax_output.cc        |   6 +-
 src/operator/nn/dnnl/dnnl_split.cc                 |   6 -
 src/operator/nn/dnnl/dnnl_stack.cc                 |  22 +-
 src/operator/nn/dnnl/dnnl_transpose-inl.h          |   2 -
 src/operator/nn/dnnl/dnnl_transpose.cc             |  10 -
 src/operator/nn/dnnl/dnnl_where.cc                 |  53 +-
 src/operator/nn/fully_connected-inl.h              |   2 +-
 src/operator/nn/fully_connected.cc                 |   7 +-
 src/operator/nn/log_softmax.cc                     |   4 +-
 src/operator/nn/lrn.cc                             |   4 +-
 src/operator/nn/masked_softmax.cc                  |   3 +-
 src/operator/nn/softmax.cc                         |   4 +-
 src/operator/numpy/np_broadcast_reduce_op_value.h  |   1 -
 src/operator/numpy/np_dot_forward.cc               |   2 +-
 src/operator/numpy/np_elemwise_broadcast_op.h      |  11 +-
 src/operator/numpy/np_matrix_op.cc                 |   2 +-
 src/operator/numpy/np_repeat_op-inl.h              | 165 +++--
 src/operator/numpy/np_true_divide-inl.h            |   8 +-
 src/operator/operator_common.h                     |   7 +-
 src/operator/operator_tune.cc                      |   6 +-
 .../quantization/dnnl/dnnl_quantized_batch_norm.cc |  23 +-
 .../dnnl/dnnl_quantized_elemwise_add.cc            | 234 ++++---
 .../quantization/dnnl/dnnl_quantized_flatten.cc    |   2 +-
 .../dnnl/dnnl_quantized_fully_connected.cc         |   9 +-
 .../quantization/dnnl/dnnl_quantized_reshape.cc    |   2 +-
 .../quantization/dnnl/dnnl_quantized_rnn.cc        |   2 -
 .../quantization/dnnl/dnnl_quantized_transpose.cc  |   8 +-
 .../quantization/dnnl/dnnl_requantize-inl.h        |  11 +-
 ..._batch_norm.cc => quantized_batch_norm_relu.cc} |  39 +-
 .../quantization/quantized_elemwise_add-inl.h      |  15 +
 .../quantization/quantized_elemwise_add.cc         |  50 ++
 src/operator/random/sample_op.h                    |  34 +-
 src/operator/softmax_output.cc                     |   2 +-
 src/operator/subgraph/build_subgraph.cc            |  14 +-
 src/operator/subgraph/common.h                     |   2 +
 src/operator/subgraph/default_subgraph_property.cc |   4 +-
 .../subgraph/default_subgraph_property_v2.cc       |   4 +-
 src/operator/subgraph/dnnl/dnnl_batch_dot.cc       |  34 +-
 .../dnnl/dnnl_bn_relu.cc}                          | 103 +--
 src/operator/subgraph/dnnl/dnnl_bn_relu_property.h |   9 +-
 src/operator/subgraph/dnnl/dnnl_conv.cc            |  32 +-
 src/operator/subgraph/dnnl/dnnl_conv_property.h    |   9 +-
 src/operator/subgraph/dnnl/dnnl_fc.cc              | 731 ++++++++++++---------
 src/operator/subgraph/dnnl/dnnl_fc_property.h      |  15 +-
 ...l_fc_sum_fuse.h => dnnl_fc_sum_fuse_property.h} | 117 ++--
 .../subgraph/dnnl/dnnl_post_amp_property.h         |   3 +-
 .../subgraph/dnnl/dnnl_post_quantize_property.h    |  33 +-
 src/operator/subgraph/dnnl/dnnl_pow_mul_scalar.cc  | 208 ++++++
 .../subgraph/dnnl/dnnl_pow_mul_scalar_property.h   | 126 ++++
 .../subgraph/dnnl/dnnl_remove_casts_property.h     | 154 +++++
 .../subgraph/dnnl/dnnl_subgraph_property.cc        |   9 +-
 src/operator/subgraph/dnnl/dnnl_transformer-inl.h  |  14 +-
 src/operator/subgraph/dnnl/dnnl_transformer.cc     | 328 ++++++---
 .../subgraph/dnnl/dnnl_transformer_qk_common.h     | 230 +++++++
 .../subgraph/dnnl/dnnl_transformer_qk_property.h   | 217 +++---
 .../dnnl/dnnl_transformer_valatt_property.h        |  12 +-
 .../eliminate_common_nodes_pass.cc}                |  33 +-
 .../partitioner/custom_subgraph_property.h         |   8 +-
 .../subgraph/static_shape_subgraph_property.cc     |   4 +-
 src/operator/subgraph/subgraph_property.h          |  14 +-
 src/operator/swapaxis-inl.h                        | 317 ++++-----
 src/operator/swapaxis.cc                           |  48 +-
 src/operator/swapaxis.cu                           |   9 +-
 src/operator/tensor/broadcast_reduce-inl.h         |   2 +-
 src/operator/tensor/dot.cc                         |   4 +-
 src/operator/tensor/elemwise_binary_broadcast_op.h |  16 +-
 .../tensor/elemwise_binary_broadcast_op_basic.cc   |  69 +-
 .../elemwise_binary_broadcast_op_extended.cc       |   6 +-
 src/operator/tensor/elemwise_binary_op_basic.cc    |  11 +-
 src/operator/tensor/elemwise_binary_scalar_op.h    |   3 -
 .../tensor/elemwise_binary_scalar_op_extended.cc   |  11 +-
 src/operator/tensor/elemwise_unary_op.h            |  10 +-
 src/operator/tensor/index_update-inl.h             |   1 -
 src/operator/tensor/matrix_op-inl.h                |  24 +-
 src/operator/tensor/matrix_op.cc                   |  23 +-
 src/resource.cc                                    |   2 +-
 tests/CMakeLists.txt                               |  16 +
 tests/cpp/engine/threaded_engine_test.cc           |   2 -
 tests/cpp/include/test_util.h                      |  10 +-
 tests/cpp/operator/batchnorm_test.cc               |   2 +-
 tests/nightly/TestDoc/doc_spell_checker.py         |   4 +-
 .../model_backwards_compatibility_check/common.py  |  16 +-
 .../model_backwards_compat_inference.py            |  28 +-
 .../model_backwards_compat_train.py                |  10 +-
 tests/nightly/test_large_array.py                  |  23 +-
 tests/nightly/test_large_vector.py                 |  12 +-
 tests/python/dnnl/op_cfg.py                        | 416 ++++++++++++
 tests/python/dnnl/subgraphs/subgraph_common.py     |  32 +-
 tests/python/dnnl/subgraphs/test_amp_subgraph.py   |  48 +-
 tests/python/dnnl/subgraphs/test_conv_subgraph.py  |  24 +-
 tests/python/dnnl/subgraphs/test_fc_subgraph.py    |  61 +-
 .../python/dnnl/subgraphs/test_matmul_subgraph.py  |  38 +-
 .../subgraphs/test_pow_mul_subgraph.py}            |  38 +-
 tests/python/dnnl/test_amp.py                      |  66 ++
 tests/python/dnnl/test_bf16_operator.py            |   4 +-
 tests/python/dnnl/test_dnnl.py                     |  60 +-
 tests/python/doctest/test_docstring.py             |   5 +-
 tests/python/gpu/test_extensions_gpu.py            |   6 +-
 tests/python/gpu/test_gluon_model_zoo_gpu.py       |   4 +-
 tests/python/gpu/test_numpy_fallback.py            |   4 +-
 tests/python/gpu/test_operator_gpu.py              |   8 +-
 tests/python/gpu/test_profiler_gpu.py              |   6 +-
 tests/python/onnx/test_models.py                   |   8 +-
 tests/python/quantization/test_quantization.py     |  81 ++-
 tests/python/train/test_autograd.py                |   2 +-
 tests/python/unittest/common.py                    |  10 +-
 tests/python/unittest/test_deferred_compute.py     |   4 +-
 tests/python/unittest/test_executor.py             |   2 +-
 tests/python/unittest/test_extensions.py           |  10 +-
 tests/python/unittest/test_gluon.py                |  41 +-
 tests/python/unittest/test_gluon_model_zoo.py      |   2 +-
 tests/python/unittest/test_gluon_rnn.py            |  16 +-
 tests/python/unittest/test_gluon_utils.py          |   2 +-
 tests/python/unittest/test_memory_opt.py           |   8 +-
 tests/python/unittest/test_ndarray.py              |  20 +-
 tests/python/unittest/test_numpy_gluon.py          |   5 +
 tests/python/unittest/test_numpy_ndarray.py        |   2 +-
 tests/python/unittest/test_numpy_op.py             |   2 +-
 tests/python/unittest/test_operator.py             | 166 +++--
 tests/python/unittest/test_profiler.py             |  18 +-
 tests/python/unittest/test_random.py               |  24 +-
 tests/python/unittest/test_sparse_ndarray.py       |   2 +-
 tests/python/unittest/test_sparse_operator.py      |   4 +-
 tests/python/unittest/test_symbol.py               |   2 +-
 tests/python/unittest/test_test_utils.py           |   2 +-
 tests/tutorials/test_tutorials.py                  |   6 +-
 tools/bandwidth/measure.py                         |   5 +-
 tools/create_source_archive.sh                     |   2 +-
 tools/dependencies/README.md                       |  18 +-
 tools/diagnose.py                                  |   2 +-
 tools/im2rec.py                                    |  20 +-
 tools/kill-mxnet.py                                |   4 +-
 tools/launch.py                                    |   4 +-
 tools/parse_log.py                                 |   8 +-
 tools/pip/MANIFEST.in                              |   1 -
 tools/pip/doc/PYPI_README.md                       |   2 +-
 tools/pip/setup.py                                 |   2 +-
 tools/rec2idx.py                                   |   2 +-
 tools/staticbuild/build.sh                         |   1 -
 tools/windowsbuild/README.md                       |   2 +-
 412 files changed, 7148 insertions(+), 4388 deletions(-)
 delete mode 100644 DISCLAIMER
 create mode 100644 benchmark/opperf/utils/benchmark_operators_pytest.py
 create mode 100644 docs/python_docs/python/tutorials/performance/backend/dnnl/dnnl_quantization_inc.md
 create mode 100644 docs/static_site/src/assets/img/asf_logo.svg
 create mode 100755 example/quantization_inc/custom_strategy.py
 copy docs/cpp_docs/Makefile => example/quantization_inc/resnet50v2_mse.yaml (72%)
 create mode 100644 example/quantization_inc/resnet_measurement.py
 create mode 100644 example/quantization_inc/resnet_mse.py
 create mode 100644 example/quantization_inc/resnet_tuning.py
 copy src/operator/nn/dnnl/{dnnl_batch_norm-inl.h => dnnl_batch_norm.cc} (56%)
 create mode 100644 src/operator/nn/dnnl/dnnl_pow_mul_scalar-inl.h
 rename src/operator/nn/dnnl/{dnnl_power_scalar.cc => dnnl_pow_mul_scalar.cc} (51%)
 delete mode 100644 src/operator/nn/dnnl/dnnl_power_scalar-inl.h
 delete mode 100644 src/operator/nn/dnnl/dnnl_slice-inl.h
 delete mode 100644 src/operator/nn/dnnl/dnnl_slice.cc
 copy src/operator/quantization/{quantized_batch_norm.cc => quantized_batch_norm_relu.cc} (84%)
 rename src/operator/{contrib/batch_norm_relu.cc => subgraph/dnnl/dnnl_bn_relu.cc} (69%)
 rename src/operator/subgraph/dnnl/{dnnl_fc_sum_fuse.h => dnnl_fc_sum_fuse_property.h} (69%)
 create mode 100644 src/operator/subgraph/dnnl/dnnl_pow_mul_scalar.cc
 create mode 100644 src/operator/subgraph/dnnl/dnnl_pow_mul_scalar_property.h
 create mode 100644 src/operator/subgraph/dnnl/dnnl_remove_casts_property.h
 create mode 100644 src/operator/subgraph/dnnl/dnnl_transformer_qk_common.h
 copy src/operator/{operator.cc => subgraph/eliminate_common_nodes_pass.cc} (56%)
 create mode 100644 tests/python/dnnl/op_cfg.py
 copy tests/python/{profiling/simple_forward.py => dnnl/subgraphs/test_pow_mul_subgraph.py} (51%)