Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/27 03:21:24 UTC

[GitHub] [incubator-mxnet] DickJC123 edited a comment on issue #15529: MXNET_CUDNN_AUTOTUNE_DEFAULT problems

DickJC123 edited a comment on issue #15529:
URL: https://github.com/apache/incubator-mxnet/issues/15529#issuecomment-664097394


   Sorry this issue got buried on the todo stack.  To be honest though, this may not be a bug in the convolution algo cache implementation so much as an issue with the policy for emitting warning messages.  The repeated message appears after every 50 new additions to the convolution algo cache.  Once there are 1000 differently-spec'd convolutions in the model, the following message appears once:
   ```
   If you see this message in the middle of training, you are probably using bucketing. Consider setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable cudnn tuning.
   ```
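
For reference, a minimal sketch of setting that variable from Python. It assumes the variable is set before MXNet is imported, which is the conservative ordering; the import itself is left as a comment so the snippet runs without MXNet installed.

```python
import os

# Disable cuDNN autotuning, as the warning message suggests.
# Assumption: setting it before importing MXNet guarantees the library sees it.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"

# import mxnet as mx  # the MXNet import would follow here
```

The same effect can be had by exporting the variable in the shell before launching the training script.
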
   So if @intgogo still cares about this issue, maybe we can hear if the model in question is continually generating convolutions with unique parameters (shapes, strides, etc.).  Also, I wouldn't mind hearing opinions about a new algo-cache-growth warning policy.  Here are a couple of options:
   - Emit warning once when the 1st conv algo is added to the cache,
   - Emit warning once when the Xth conv algo is added to the cache, e.g. 50,
   - Emit warnings as the cache doubles in size, e.g. with size 50, 100, 200, 400, 800, 1600, ...
   
   Or we could add an environment variable to silence the warning altogether.
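
To make the options above concrete, here is a rough Python sketch of the three policies plus the silencing variable. It is only an illustration of the decision logic, not MXNet's C++ code: the function name, the policy names, and the environment variable `EXAMPLE_SILENCE_ALGO_CACHE_WARNING` are all hypothetical.

```python
import os

# Hypothetical variable name for illustration; not an existing MXNet setting.
SILENCE_VAR = "EXAMPLE_SILENCE_ALGO_CACHE_WARNING"


def make_warning_policy(policy, threshold=50):
    """Return should_warn(cache_size) -> bool for one of the proposed policies.

    policy: "first"    warn once when the 1st entry is added
            "once_at"  warn once when the threshold-th entry is added
            "doubling" warn at threshold, 2*threshold, 4*threshold, ...
    """
    if policy not in ("first", "once_at", "doubling"):
        raise ValueError("unknown policy: %r" % (policy,))

    # Cache size at which the next warning is due; None means never again.
    state = {"next": 1 if policy == "first" else threshold}

    def should_warn(cache_size):
        # The silencing variable overrides every policy.
        if os.environ.get(SILENCE_VAR) == "1":
            return False
        if state["next"] is None or cache_size < state["next"]:
            return False
        if policy == "doubling":
            # Move the trigger point past the current size.
            while state["next"] <= cache_size:
                state["next"] *= 2
        else:
            # One-shot policies never warn a second time.
            state["next"] = None
        return True

    return should_warn


if __name__ == "__main__":
    should_warn = make_warning_policy("doubling", threshold=50)
    # Simulate a cache growing by one entry at a time.
    for size in range(1, 1601):
        if should_warn(size):
            print("warning: conv algo cache now holds", size, "entries")
```

The caller would invoke `should_warn` with the new cache size after each insertion. With the doubling policy, the number of warnings grows only logarithmically with the cache size, whereas the current every-50 behaviour grows linearly.
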

