Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/07/03 07:31:23 UTC

[GitHub] [incubator-mxnet] xinyu-intel commented on a change in pull request #15448: [MKLDNN]Enhance Quantization APIs and Tutorial

xinyu-intel commented on a change in pull request #15448: [MKLDNN]Enhance Quantization APIs and Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/15448#discussion_r299812483
 
 

 ##########
 File path: docs/tutorials/mkldnn/MKLDNN_QUANTIZATION.md
 ##########
 @@ -0,0 +1,257 @@
+
+<!--- Licensed to the Apache Software Foundation (ASF) under one -->
+<!--- or more contributor license agreements.  See the NOTICE file -->
+<!--- distributed with this work for additional information -->
+<!--- regarding copyright ownership.  The ASF licenses this file -->
+<!--- to you under the Apache License, Version 2.0 (the -->
+<!--- "License"); you may not use this file except in compliance -->
+<!--- with the License.  You may obtain a copy of the License at -->
+
+<!---   http://www.apache.org/licenses/LICENSE-2.0 -->
+
+<!--- Unless required by applicable law or agreed to in writing, -->
+<!--- software distributed under the License is distributed on an -->
+<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
+<!--- KIND, either express or implied.  See the License for the -->
+<!--- specific language governing permissions and limitations -->
+<!--- under the License. -->
+
+# Quantize custom models for production-level inference with MKL-DNN backend
+
+This document describes how to quantize custom models from FP32 to INT8 with the Apache MXNet toolkit and APIs on Intel CPUs.
+
+If you are not familiar with the Apache MXNet quantization flow, please read the [quantization blog](https://medium.com/apache-mxnet/model-quantization-for-production-level-neural-network-inference-f54462ebba05) first; performance data is available for the [Apache MXNet C++ interface](https://github.com/apache/incubator-mxnet/tree/master/cpp-package/example/inference) and [GluonCV](https://gluon-cv.mxnet.io/build/examples_deployment/int8_inference.html).
+
+## Installation and Prerequisites
+
+Installing MXNet with the MKLDNN backend is an easy and essential step. You can follow [How to build and install MXNet with MKL-DNN backend](https://mxnet.incubator.apache.org/tutorials/mkldnn/MKLDNN_README.html) to build and install MXNet from source, or install a release or nightly version directly from PyPI by running:
+
+```bash
+# release version
+pip install mxnet-mkl
+# nightly version
+pip install mxnet-mkl --pre
+```
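After installing, you can confirm that the build actually has MKL-DNN compiled in via the runtime feature-detection API (available from MXNet 1.5 onward; the snippet below is a sketch, guarded so it degrades gracefully where mxnet is not installed):

```python
def mkldnn_enabled():
    """Return True/False for MKLDNN support, or None if mxnet is missing."""
    try:
        import mxnet as mx
    except ImportError:
        return None
    # Query the runtime feature flags compiled into this build.
    return mx.runtime.Features().is_enabled('MKLDNN')

print(mkldnn_enabled())
```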
+
+## Image Classification Demo
+
+A quantization script, [imagenet_gen_qsym_mkldnn.py](https://github.com/apache/incubator-mxnet/blob/master/example/quantization/imagenet_gen_qsym_mkldnn.py), has been designed to launch quantization for image-classification models. This script is integrated with the [Gluon-CV model zoo](https://gluon-cv.mxnet.io/model_zoo/classification.html), so that all pre-trained models can be downloaded from Gluon-CV and then converted for quantization. For details, refer to [Model Quantization with Calibration Examples](https://github.com/apache/incubator-mxnet/blob/master/example/quantization/README.md).
+
+## Integrate Quantization Flow to Your Project
+
+The quantization flow works for both symbolic and Gluon models. If you're using Gluon, first refer to [Saving and Loading Gluon Models](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/save_load_params.html) to hybridize your computation graph and export it as a symbol before running quantization.
+
+In general, the quantization flow includes four steps. Users can reach acceptable accuracy with steps 1 to 3 with minimal effort; most of this stage works out of the box, so data scientists and researchers only need to focus on how to represent the data and layers in their model. After a quantized model is generated, you may want to deploy it online, at which point performance becomes the next key concern. Step 4, calibration, can then improve performance significantly by removing a large amount of runtime calculation.
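As a sketch of how the four steps map onto code, the flow can be wrapped around the `quantize_model` API from `mxnet.contrib.quantization` (the wrapper name is ours and the keyword arguments follow the MXNet 1.5 signature; verify against your installed version):

```python
import logging

def quantize_flow(sym, arg_params, aux_params, calib_data=None,
                  excluded_sym_names=None):
    """Hypothetical wrapper illustrating the four-step quantization flow."""
    import mxnet as mx
    from mxnet.contrib.quantization import quantize_model

    qsym, qarg_params, qaux_params = quantize_model(
        sym, arg_params, aux_params,
        ctx=mx.cpu(),
        # Steps 1-3: quantize the graph, optionally excluding layers
        # that are sensitive to reduced precision.
        excluded_sym_names=excluded_sym_names,
        # Step 4: calibrate with a representative dataset when provided.
        calib_mode='none' if calib_data is None else 'entropy',
        calib_data=calib_data,
        quantized_dtype='auto',
        logger=logging.getLogger('quantization'))
    return qsym, qarg_params, qaux_params
```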
+
+![quantization flow](quantization.png)
+
+Now, let's take Gluon ResNet18 as an example to show how each step works.
+
+### Initialize Model
+
+```python
+import logging
+import mxnet as mx
+from mxnet.gluon.model_zoo import vision
+from mxnet.contrib.quantization import *
+
+logging.basicConfig()
+logger = logging.getLogger('logger')
+logger.setLevel(logging.INFO)
+
+batch_shape = (1, 3, 224, 224)
+resnet18 = vision.resnet18_v1(pretrained=True)
+resnet18.hybridize()
+resnet18.forward(mx.nd.zeros(batch_shape))
+resnet18.export('resnet18_v1')
+sym, arg_params, aux_params = mx.model.load_checkpoint('resnet18_v1', 0)
+# (optional) visualize float32 model
+mx.viz.plot_network(sym)
+```
+First, we download the ResNet18-v1 model from the Gluon model zoo and export it as a symbol. Optionally, you can visualize the float32 model. Below is a raw residual block.
+
+![float32 model](fp32_raw.png)
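Before quantizing, it can be useful to sanity-check the exported float32 symbol with one forward pass. The following is a sketch that assumes the `resnet18_v1` checkpoint files written by the export step above:

```python
def check_fp32_model(prefix='resnet18_v1', epoch=0,
                     batch_shape=(1, 3, 224, 224)):
    """Load the exported checkpoint and run one dummy forward pass."""
    import mxnet as mx

    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
    mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
    mod.bind(for_training=False, data_shapes=[('data', batch_shape)])
    mod.set_params(arg_params, aux_params)
    mod.forward(mx.io.DataBatch([mx.nd.ones(batch_shape)]))
    # For ImageNet classification models this is typically (1, 1000).
    return mod.get_outputs()[0].shape
```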
 
 Review comment:
   thanks:)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services