Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2022/02/14 14:16:36 UTC
[GitHub] [incubator-mxnet] bgawrych opened a new pull request #20894: Reduce after quantization memory usage
bgawrych opened a new pull request #20894:
URL: https://github.com/apache/incubator-mxnet/pull/20894
## Description ##
This change prevents MXNet from allocating additional memory for gradients in a quantized model, since those gradients cannot be used anyway.
Memory measurement script:
```
import mxnet as mx
from mxnet.gluon.model_zoo import vision
import psutil
import os


def get_process_memory():
    """Return the resident set size (RSS) of the current process in MB."""
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss * 1e-6  # bytes -> MB


batch_shape = (1, 3, 224, 224)
data = mx.np.random.normal(size=batch_shape)
print("memory before loading model: ", get_process_memory())

net = vision.resnet50_v1(pretrained=True)
print("memory after loading model: ", get_process_memory())

# Run one fp32 forward pass and wait for the asynchronous computation to finish.
out = net(data)
out.wait_to_read()
print("memory after fp32 forward pass", get_process_memory())

# Use the single input batch as calibration data for quantization.
dataset = mx.gluon.data.ArrayDataset(data)
data_loader = mx.gluon.data.DataLoader(dataset, batch_size=1)
net_quantized = mx.contrib.quant.quantize_net(net, quantized_dtype='int8',
                                              quantize_mode="smart",
                                              calib_mode='naive',
                                              calib_data=data_loader,
                                              num_calib_batches=1,
                                              ctx=mx.current_context())
print("memory after quantization: ", get_process_memory())

# Run one int8 forward pass on the quantized network.
outputs = net_quantized(data)
outputs.wait_to_read()
print("memory after int8 forward pass: ", get_process_memory())
```
**Output before:**
```
memory before loading model: 213.430272
[15:14:11] ../src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU
memory after loading model: 530.702336
memory after fp32 forward pass 611.241984
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1918: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
data: None
input_sym_arg_type = in_param.infer_type()[0]
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1251: UserWarning: register_op_hook is experimental when static_alloc=True / static_shape=True and may not work correctly
warnings.warn("register_op_hook is experimental when static_alloc=True / static_shape=True "
memory after quantization: 1064.57088
memory after int8 forward pass: 1071.005696
```
**Output after:**
```
memory before loading model: 214.28633599999998
[15:13:17] ../src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU
memory after loading model: 531.2593919999999
memory after fp32 forward pass 609.513472
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1918: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
data: None
input_sym_arg_type = in_param.infer_type()[0]
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1251: UserWarning: register_op_hook is experimental when static_alloc=True / static_shape=True and may not work correctly
warnings.warn("register_op_hook is experimental when static_alloc=True / static_shape=True "
memory after quantization: 890.273792
memory after int8 forward pass: 895.2258559999999
```
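A significant reduction in memory usage can be observed after quantization and after the int8 forward pass, while the fp32 figures are essentially unchanged.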
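To put a number on the difference, the figures from the two logs above can be compared directly. The snippet below is only a small arithmetic sketch and not part of the PR: the `reduction` helper and the dictionary names are made up for illustration, and the values are copied from the logs (in MB, since the script multiplies RSS bytes by 1e-6).

```python
# Figures (in MB) copied from the "Output before" and "Output after" logs.
before = {"after quantization": 1064.57088, "after int8 forward pass": 1071.005696}
after = {"after quantization": 890.273792, "after int8 forward pass": 895.2258559999999}


def reduction(before_mb, after_mb):
    """Return (absolute saving in MB, relative saving in percent)."""
    saved = before_mb - after_mb
    return saved, 100.0 * saved / before_mb


# Hypothetical helper names; this only post-processes the logged numbers.
savings = {stage: reduction(before[stage], after[stage]) for stage in before}
for stage, (saved_mb, saved_pct) in savings.items():
    print(f"{stage}: {saved_mb:.1f} MB saved ({saved_pct:.1f}%)")
# after quantization: 174.3 MB saved (16.4%)
# after int8 forward pass: 175.8 MB saved (16.4%)
```

These are single-run RSS readings from one machine, so they indicate the size of the effect rather than a precise benchmark.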
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@mxnet.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20894: Reduce after quantization memory usage
Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20894:
URL: https://github.com/apache/incubator-mxnet/pull/20894#issuecomment-1039137401
Hey @bgawrych, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [edge, miscellaneous, windows-gpu, centos-cpu, unix-cpu, unix-gpu, windows-cpu, website, sanity, clang, centos-gpu]
***
_Note_:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
[GitHub] [incubator-mxnet] bgawrych merged pull request #20894: Reduce after quantization memory usage
Posted by GitBox <gi...@apache.org>.
bgawrych merged pull request #20894:
URL: https://github.com/apache/incubator-mxnet/pull/20894