Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/09/06 08:10:15 UTC

[GitHub] [tvm] AndrewZhaoLuo commented on issue #8296: [RFC][Tracking Issue][AMP] Tracking Issue for Mixed Precision Pass

AndrewZhaoLuo commented on issue #8296:
URL: https://github.com/apache/tvm/issues/8296#issuecomment-913440596

Yeah, the issue with creating defaults is that no single set of defaults works best for every situation. This is especially true because gaining speed usually means giving up some accuracy, which can sometimes become a problem.

For the defaults, I envision that most ops don't accumulate to FP32. For some ops, like the global pools and sums, we might turn it on. Really, the best way to determine the criteria is to keep doing the kind of work you've been doing: trying out different models in different applications and seeing what needs to be turned on and off.

That being said, this is really designed to be a tool that sometimes requires the user to go back and modify the provided defaults, either to get more speed if their model can afford it or more accuracy if they need it. It requires investigation, and I don't think we can cover every case well. A tutorial here would help (which is on my long list of TODOs).
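To make the "defaults plus user overrides" idea concrete, here is a minimal sketch in plain Python. It does not call TVM; the table contents, the function name `accumulation_dtype`, and the `overrides` parameter are all hypothetical and only illustrate the workflow, not the pass's actual implementation.

```python
# Hypothetical defaults: most ops do NOT accumulate in FP32.
DEFAULT_ACCUMULATE_FP32 = False

# Hypothetical per-op exceptions where FP32 accumulation is on by default.
# The op names are illustrative examples of pooling and sum ops.
FP32_ACCUMULATION_DEFAULTS = {
    "nn.global_avg_pool2d": True,
    "sum": True,
}


def accumulation_dtype(op_name, overrides=None, mixed_precision_type="float16"):
    """Pick the accumulation dtype for an op.

    `overrides` is a user-supplied dict of op name -> bool that takes
    priority over the defaults, so a user can trade accuracy for speed
    (or the reverse) on a per-op basis.
    """
    table = {**FP32_ACCUMULATION_DEFAULTS, **(overrides or {})}
    use_fp32 = table.get(op_name, DEFAULT_ACCUMULATE_FP32)
    return "float32" if use_fp32 else mixed_precision_type
```

With this shape, a user whose model tolerates the accuracy loss could pass `overrides={"sum": False}` to get more speed, while one who sees accuracy problems could pass `overrides={"nn.conv2d": True}`.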

Finally, while things are done on a per-op basis, the actual mixed precision function can look at parts of the Relay call, such as the attributes, the node, or the input tensor sizes. Therefore we can be smart about the conversion (e.g. for global pooling, only accumulate in FP32 if the input-to-output reduction is large enough). Again, a tutorial or example would help flesh this out.
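As a rough illustration of the global pooling case, the sketch below decides the accumulation dtype from the input tensor shape. It is plain Python rather than a real TVM hook: the function name `global_pool_rule`, the `REDUCTION_THRESHOLD` value, and the way the shape and layout are passed in are assumptions made for the example.

```python
from math import prod

# Hypothetical cutoff: how many input elements are reduced into each
# output element before FP32 accumulation is considered worth the cost.
REDUCTION_THRESHOLD = 4096


def global_pool_rule(input_shape, mixed_precision_type="float16", layout="NCHW"):
    """Return (accumulation_dtype, output_dtype) for a global pooling op.

    Global pooling reduces every spatial dimension to size 1, so the
    reduction factor is the product of the spatial dimensions.
    """
    if layout == "NCHW":
        spatial_dims = input_shape[2:]
    elif layout == "NHWC":
        spatial_dims = input_shape[1:-1]
    else:
        raise ValueError(f"unsupported layout: {layout}")

    reduction = prod(spatial_dims)
    if reduction >= REDUCTION_THRESHOLD:
        accumulation_dtype = "float32"  # large reduction: protect accuracy
    else:
        accumulation_dtype = mixed_precision_type  # small reduction: favor speed
    return accumulation_dtype, mixed_precision_type
```

For example, a `(1, 512, 7, 7)` input reduces only 49 elements per output, so it stays in the reduced precision type, while a `(1, 64, 112, 112)` input reduces 12544 elements per output and accumulates in FP32. The right threshold would have to come from experiments on real models.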
