Posted to discuss-archive@tvm.apache.org by Sergio via TVM Discuss <no...@discuss.tvm.ai> on 2020/05/06 16:05:48 UTC

[TVM Discuss] [Questions] Question about "qconfig" options for quantization


Hi everyone,

I would like to know whether int4/int16 quantization is possible using `relay.quantize.quantize`. So far, I have gone through the documentation in

https://github.com/apache/incubator-tvm/blob/master/python/tvm/relay/quantize/quantize.py

But I have a few questions:

1) What is the difference between `nbit_weight` and `dtype_weight`? I expected the dtype of the workloads for my tasks to change by changing only `nbit_weight`, but I also had to set `dtype_weight = "int16"` to achieve that.

The same applies to `nbit_input` and `dtype_input`. (My first attempt is sketched below.)
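
For reference, a minimal sketch of that first attempt (`mod` and `dict_params` come from my own model, so treat them as placeholders):

```python
from tvm import relay

# Raising nbit_weight alone did not change the weight dtype in the
# lowered workloads; dtype_weight="int16" was needed as well.
with relay.quantize.qconfig(nbit_weight=16):
    mod_q = relay.quantize.quantize(mod, params=dict_params)
```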

2) Which parameters do you have to modify to get int16 quantization? So far, my code is:

```python
with relay.quantize.qconfig(calibrate_mode='global_scale', nbit_input=16,
                            nbit_weight=16, dtype_input="int16",
                            dtype_weight="int16", global_scale=8.0):
    mod = relay.quantize.quantize(mod, params=dict_params)
```

Would this be enough?
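
What I currently do to sanity-check the result is just printing the quantized IR (not sure this is the intended way):

```python
# After quantizing, inspect the main function to see which dtypes
# (casts, constants) actually appear in the rewritten graph.
print(mod["main"])
```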

3) In the literature, you often find quantization written in the form:

`x_int = x_float / scale + offset`

Is there any `offset` available in the `relay.quantize.qconfig` function?
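
For concreteness, a tiny numeric sketch of that affine scheme (the scale and offset values are arbitrary, purely illustrative):

```python
import numpy as np

# Affine quantization: x_int = round(x_float / scale) + offset
scale, offset = 0.1, 3  # arbitrary example values
x_float = np.array([-0.25, 0.0, 0.7], dtype=np.float32)
x_int = np.round(x_float / scale).astype(np.int32) + offset
print(x_int)  # [ 1  3 10]
```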

---
[Visit Topic](https://discuss.tvm.ai/t/question-about-qconfig-options-for-quantization/6602/1) to respond.