Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/10/16 15:33:24 UTC

[GitHub] [incubator-mxnet] grygielski opened a new issue #19361: [RFC] Denormal floating point values handling

grygielski opened a new issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361


   ## Problem statement
   Currently, MXNet has no mechanism for handling denormal floating point values ([wikipedia](https://en.wikipedia.org/wiki/Denormal_number)) in parameters, inputs, or outputs. Such numbers are problematic because adding or multiplying them requires more CPU instructions than the same operations on normal floating point numbers. However, they are so close to zero (e.g. below about 1.18e-38 for float32) that most of the time they can be rounded to 0 without any loss in the model's accuracy.
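
To make the definition concrete: a float32 value is denormal when its exponent bits are all zero and its mantissa is nonzero, which covers magnitudes below 2**-126 (about 1.18e-38). The following is a minimal sketch using only the Python standard library. The name `classify_float32` is illustrative and is not part of MXNet.

```python
import struct


def classify_float32(x: float) -> str:
    """Round x to float32 and classify it by its IEEE 754 bit pattern."""
    # Reinterpret the 4 bytes of the float32 as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF  # 8 exponent bits
    mantissa = bits & 0x7FFFFF      # 23 mantissa bits
    if exponent == 0:
        # All-zero exponent: either signed zero or a denormal.
        return "zero" if mantissa == 0 else "denormal"
    if exponent == 0xFF:
        return "special"  # infinity or NaN
    return "normal"


for value in (1e-30, 2.0 ** -126, 2.0 ** -127, 1e-40, 0.0):
    print(value, classify_float32(value))
```

Note that 1e-30 is still a normal float32, so a threshold-based cleanup with a larger cutoff zeroes more than just denormals.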
   
   This can be done simply by comparing every parameter of the model against some small threshold and rounding all parameters whose magnitude falls below it to 0. This adds some overhead to saving/loading parameters, and it is not perfect because denormal values can also be created during inference in input/output values.
   
   A cleaner solution would be to use the hardware features of modern CPUs. Since the SSE2 extension, there are CPU flags that handle denormals automatically: DAZ (denormals-are-zero) and FTZ (flush-to-zero). They can be set from C++ code using intrinsics.
   
   An important point is that denormal values are rather rare, since most modern NN architectures do not work asymptotically close to 0. However, they can show up in RNN models (because of the sigmoid gate activation) or when using layers like PReLU (https://github.com/apache/incubator-mxnet/issues/19218).
   
   My question is: which way of handling such cases does the community prefer? I would love to hear your suggestions and opinions about the proposed solutions.
   
   ## Proposed solutions
   - Simplest one: leave handling denormals as the user's responsibility. Users can iterate through the parameters themselves or use an external package for setting CPU flags, such as https://github.com/chainer/daz.
   **Example** code removing denormals from the PReLU gamma parameter:
   ```Python
   def fix_denorm_params():
       global arg_params
       for key in arg_params.keys():
           if 'prelu' in key:
               gammas = arg_params[key]
               for index, gamma in enumerate(gammas):
                   if abs(gamma) < 1e-20:
                       arg_params[key][index] = 0.
   ```
   **Pros:** simple solution, no change in framework behavior
   **Cons:** users may not be aware of the slow-down caused by denormals, requires additional code or a library, not user-friendly
   
   - Enable the DAZ and FTZ flags by default and do not expose a Python API for them. This is the TensorFlow-like solution: TensorFlow enables these 2 flags and does not allow the user to change that.
   **Example** code used during execution in TensorFlow:
   ```Cpp
   ScopedFlushDenormal::ScopedFlushDenormal() {
     SetDenormalState(/*flush_zero_mode=*/true, /*denormals_zero_mode=*/true);
   }
   ```
   **Usage:**
   ```Cpp
   EnvThread* CreateThread(std::function<void()> f) {
       return env_->StartThread(thread_options_, name_, [=]() {
         // Set the processor flag to flush denormals to zero.
         port::ScopedFlushDenormal flush;
         // Set the processor rounding mode to ROUND TO NEAREST.
         port::ScopedSetRound round(FE_TONEAREST);
         if (thread_options_.numa_node != port::kNUMANoAffinity) {
           port::NUMASetThreadNodeAffinity(thread_options_.numa_node);
         }
         f();
       });
     }
   ```
   **Pros:** users do not have to worry about denorm cases, no change in external API
   **Cons:** sometimes it may lead to wrong results (?), it cannot be switched off if needed
   
   - Create a Python API function that enables the DAZ and FTZ flags. This is the PyTorch-like solution: PyTorch does not handle denormals by default, but the user can invoke a Python function to treat denormals as 0s.
   **Example** from PyTorch documentation:
   https://pytorch.org/docs/stable/generated/torch.set_flush_denormal.html
   **Pros:** users can control behavior of the framework, simple one-line API
   **Cons:** users have to be aware that denormals exist, additional functionality in the API
   
   - Combination of the 2 previous ones: enable the flags by default and expose a Python API function for disabling DAZ and FTZ.
   **Pros:** user-friendly solution but allows user to control framework behavior if needed
   **Cons:** the most complex solution in terms of implementation
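
To illustrate the shape of the last option (flushing on by default, with an opt-out), here is a rough sketch. It emulates the behavior in software on float32 values and does not touch any CPU flag. A real implementation would set the FTZ and DAZ bits in the backend, per thread, as in the TensorFlow snippet. The names `set_flush_denormal` and `to_float32` are hypothetical and do not exist in MXNet.

```python
import math
import struct

# Smallest positive normal float32 value, 2**-126 (about 1.18e-38).
FLOAT32_MIN_NORMAL = 2.0 ** -126

# Hypothetical module-level switch: flushing is enabled by default.
_flush_denormal = True


def set_flush_denormal(mode: bool) -> bool:
    """Hypothetical API: turn flushing on or off and return the previous mode."""
    global _flush_denormal
    previous = _flush_denormal
    _flush_denormal = bool(mode)
    return previous


def to_float32(x: float) -> float:
    """Round x to float32, emulating DAZ/FTZ in software when enabled."""
    y = struct.unpack("<f", struct.pack("<f", x))[0]
    if _flush_denormal and 0.0 < abs(y) < FLOAT32_MIN_NORMAL:
        return math.copysign(0.0, y)  # flush to a zero of the same sign
    return y


print(to_float32(1e-40))   # flushed, because the default is enabled
set_flush_denormal(False)  # the user opts out
print(to_float32(1e-40))   # the denormal value is kept
```

The design point is that users who never think about denormals get the fast path, while users who need gradual underflow can restore it with a single call.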


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org


[GitHub] [incubator-mxnet] szha commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
szha commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-775423615


   Based on the discussion, I think the combined approach for dealing with denormal floats sounds reasonable. @grygielski thanks for the proposal




[GitHub] [incubator-mxnet] xidulu commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
xidulu commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-713305611


   @szha 
   In gluon.distribution, floating point numbers with very small values are often clipped to a minimum value to avoid numerical issues in downstream tasks, e.g.
   https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/probability/distributions/utils.py#L164-L172
   
   The clipping is really necessary; otherwise tons of NaNs come up when very small values are fed into ops like `log`.
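
The clipping pattern described here can be sketched as follows. This is a standard-library stand-in and not the linked MXNet helper. The names `clip_probability` and `safe_log`, and the use of float32 machine epsilon as the bound, are assumptions made for illustration.

```python
import math

# Machine epsilon of float32, 2**-23 (about 1.19e-7).
FLOAT32_EPS = 2.0 ** -23


def clip_probability(p: float, eps: float = FLOAT32_EPS) -> float:
    """Clamp p into [eps, 1 - eps] so log(p) and log(1 - p) stay finite."""
    return min(max(p, eps), 1.0 - eps)


def safe_log(p: float) -> float:
    """Logarithm of a probability after clipping it away from 0 and 1."""
    return math.log(clip_probability(p))


# Plain math.log(0.0) raises ValueError; the clipped version stays finite.
print(safe_log(0.0))
print(safe_log(0.5))
```

Because the lower bound is the machine epsilon, which is many orders of magnitude above the denormal range, clipped values are never denormal.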




[GitHub] [incubator-mxnet] TaoLv commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
TaoLv commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-712591488


   @pengzhao-intel @mgouicem could you please help to review the proposal? Many thanks!




[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
pengzhao-intel commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-712596251


   It would be more convenient to set FTZ/DAZ to true by default. The only concern is whether it affects training accuracy (I suppose the effect is very limited).
   
   We have encountered several performance issues with denormal computation in the past, but they only happened in users' debugging runs with randomly generated numbers. Thus, I am not sure whether this issue will happen in real cases.
   
   Let's wait a while for input from other members :)
   




[GitHub] [incubator-mxnet] szha commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
szha commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-713288634


   I'd like to see whether there are real use cases where denormal floats are legitimately needed. @xidulu @szhengac are there any known cases where such precision is needed?




[GitHub] [incubator-mxnet] mgouicem commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
mgouicem commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-713136788


   Thanks @grygielski for the proposal. I definitely agree with the premise of this proposal: most users do not know/care about denormals and they just get in the way of good performance for some use cases.
   
   For ease of use, I would encourage disabling denormals by default and going for option 2 or 4 (so set FTZ and DAZ), since the users that need denormals for accuracy usually know about denormals in the first place, whereas for general users, denormals will likely not make any difference for accuracy but will impact performance.
   
   I have no opinion on which one is best for code simplicity/maintenance though, so I will leave it to the MXNet contributors to comment further on that.




[GitHub] [incubator-mxnet] github-actions[bot] commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-710119242


   Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
   Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly.
   If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on [contributing to MXNet](https://mxnet.apache.org/community/contribute) and our [development guides wiki](https://cwiki.apache.org/confluence/display/MXNET/Developments).




[GitHub] [incubator-mxnet] TaoLv commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
TaoLv commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-713266400


   Thanks for your comments, @mgouicem! @szha @leezu could you please help to review? If we want to address this at the framework level, we probably need to clearly define the behavior on different hardware platforms.




[GitHub] [incubator-mxnet] grygielski commented on issue #19361: [RFC] Denormal floating point values handling

Posted by GitBox <gi...@apache.org>.
grygielski commented on issue #19361:
URL: https://github.com/apache/incubator-mxnet/issues/19361#issuecomment-713356876


   @xidulu Thanks a lot for sharing your user experience. In this case, `np.finfo('float32').eps` returns the machine epsilon, which is far from being a denormal number. Therefore, these 2 flags shouldn't affect your clipping. To create a denormal number from the machine epsilon, you have to raise it to about the 6th power.
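   ```Python
   import daz
   import numpy as np

   # Enable flush-to-zero and denormals-are-zero for this process.
   daz.set_ftz()
   daz.set_daz()
   
   # eps**5 = 2**-115 is still a normal float32.
   np.power(np.finfo('float32').eps, 5, dtype=np.float32)
   # -> 2.4074124e-35
   
   # eps**6 = 2**-138 is denormal, so it is flushed to zero.
   np.power(np.finfo('float32').eps, 6, dtype=np.float32)
   # -> 0.0
   ```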
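
The arithmetic behind this can be checked without NumPy or `daz`, by comparing exact powers of two against the float32 limits. This is a small sketch using only the standard library, and the helper name `float32_class_of_eps_power` is illustrative.

```python
import math

EPS = math.ldexp(1.0, -23)            # float32 machine epsilon, 2**-23
MIN_NORMAL = math.ldexp(1.0, -126)    # smallest positive normal float32
MIN_DENORMAL = math.ldexp(1.0, -149)  # smallest positive denormal float32


def float32_class_of_eps_power(k: int) -> str:
    """Say where eps**k = 2**(-23*k) falls in the float32 range."""
    value = EPS ** k  # exact in double precision for the small k used here
    if value >= MIN_NORMAL:
        return "normal"
    if value >= MIN_DENORMAL:
        return "denormal"
    return "zero"  # a power of two below 2**-149 rounds to 0 in float32


for k in range(1, 8):
    print(k, EPS ** k, float32_class_of_eps_power(k))
```

Powers 1 through 5 stay normal, the 6th power (2**-138) is the first denormal one, and the 7th (2**-161) is not representable in float32 at all.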

