Posted to discuss-archive@tvm.apache.org by Sergio via TVM Discuss <no...@discuss.tvm.ai> on 2020/05/06 16:05:48 UTC
[TVM Discuss] [Questions] Question about "qconfig" options for quantization
Hi everyone,
I would like to know whether int4/int16 quantization is possible using `relay.quantize.quantize`. So far, I have gone through the documentation in
https://github.com/apache/incubator-tvm/blob/master/python/tvm/relay/quantize/quantize.py
But I have a few questions:
1) What is the difference between `nbit_weight` and `dtype_weight`? I expected the type of the workloads for my tasks to change by setting only `nbit_weight`, but I also had to set `dtype_weight = "int16"` to achieve that.
The same applies to `nbit_input` and `dtype_input`.
2) Which parameters have to be modified to get int16 quantization? So far, my code is:
`with relay.quantize.qconfig(calibrate_mode='global_scale', nbit_input=16, nbit_weight=16, dtype_input="int16", dtype_weight="int16", global_scale=8.0):`
`    mod = relay.quantize.quantize(mod, params=dict_params)`
Would this be enough?
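In case it helps frame the question: my understanding (an assumption on my part, not something I have confirmed in the TVM docs) is that this configuration simulates symmetric, offset-free quantization with one fixed scale. A minimal plain-Python sketch of that scheme, with `nbit` and `scale` standing in for the qconfig options:

```python
def quantize_symmetric(x, scale, nbit=16):
    """Symmetric (zero-offset) quantization: map the real range
    [-scale, scale] onto the signed nbit integer grid and clip."""
    qmax = 2 ** (nbit - 1) - 1          # e.g. 32767 for int16
    step = scale / qmax                 # real value covered per integer step
    return max(-qmax - 1, min(qmax, round(x / step)))

# With scale=8.0 (the global_scale above), 4.0 lands at half the range:
print(quantize_symmetric(4.0, 8.0))    # 16384
```

Values outside `[-scale, scale]` saturate at the int16 limits (-32768/32767) rather than wrapping.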
3) In the literature, quantization often takes the form:
`x_int = x_float / scale + offset`
Is there any `offset` available in the `relay.quantize.qconfig` function?
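For comparison, here is a plain-Python sketch of the textbook affine mapping from the formula above, with a `round` and clipping step added; `scale` and `offset` are illustrative names here, not `qconfig` parameters:

```python
def quantize_affine(x_float, scale, offset, nbit=8):
    """Affine quantization: x_int = round(x_float / scale) + offset,
    clipped to the signed nbit integer range."""
    qmin, qmax = -(2 ** (nbit - 1)), 2 ** (nbit - 1) - 1
    x_int = round(x_float / scale) + offset
    return max(qmin, min(qmax, x_int))

def dequantize_affine(x_int, scale, offset):
    """Approximate inverse: x_float ~= (x_int - offset) * scale."""
    return (x_int - offset) * scale

q = quantize_affine(0.5, scale=0.25, offset=3)
print(q, dequantize_affine(q, 0.25, 3))   # 5 0.5
```

The `offset` (zero point) is what lets the scheme represent asymmetric ranges; with `offset = 0` it reduces to the symmetric scheme.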
---
[Visit Topic](https://discuss.tvm.ai/t/question-about-qconfig-options-for-quantization/6602/1) to respond.