Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/06/22 15:12:14 UTC

[GitHub] wenyangchu edited a comment on issue #11341: Deterministic cudnn algorithms

URL: https://github.com/apache/incubator-mxnet/issues/11341#issuecomment-399475683
 
 
   Hi @DickJC123,
   Thanks for your reply.
   I needed this urgently, so I implemented it this week and opened a pull request, intended for discussion for now:
   
   https://github.com/apache/incubator-mxnet/pull/11361
   Please check the last 2 commits.
   
   The goal is for us to come up with a good solution later on.
   
   For your questions:
   1. If MXNET_PREFER_DETERMINISM is set and no deterministic algorithm can be found, I think it has to be a fatal error, because the user's requirement cannot be satisfied.
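
   A minimal sketch of that fail-fast policy, in plain Python rather than the actual backend code. MXNET_PREFER_DETERMINISM is the name proposed in this thread; treating "1" as enabled, and the names choose_algo and NoDeterministicAlgoError, are assumptions made only for illustration:

```python
import os


class NoDeterministicAlgoError(RuntimeError):
    """Raised when determinism is requested but no algorithm can provide it."""


def prefer_determinism(env=None):
    # MXNET_PREFER_DETERMINISM is the variable proposed in this thread;
    # interpreting the value "1" as "enabled" is an assumption.
    env = os.environ if env is None else env
    return env.get("MXNET_PREFER_DETERMINISM", "0") == "1"


def choose_algo(candidates, deterministic_only):
    """Pick the first usable algorithm.

    candidates: list of (name, is_deterministic) pairs, best first.
    """
    if deterministic_only:
        candidates = [c for c in candidates if c[1]]
        if not candidates:
            # Fail loudly instead of silently falling back to a
            # non-deterministic algorithm.
            raise NoDeterministicAlgoError(
                "determinism requested but no deterministic algorithm found")
    if not candidates:
        raise ValueError("no candidate algorithms given")
    return candidates[0][0]
```

   Raising instead of falling back means a user who asked for determinism never gets a silently non-repeatable run.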
   
   2. I think it is a good idea to have it for the entire platform, but I will try to solve it for cuDNN first because I suppose it is the most widely used backend. I do not see obvious issues in other backends yet. Maybe someone else can point out where they can be non-deterministic?
   
   I have tested the CPU version with Intel MKL in a limited set of scenarios, and training was deterministic. I think we need advice and tests to figure out which parts of the other backends are non-deterministic.
   
   3. I think it is good to have a global determinism control if feasible. If control over individual layers is also possible, that would be very good to have as well.
   
   In the pull request I added a deterministic parameter (default = False) to max pooling:
   nn.MaxPool2D(pool_size=(3,3), strides=(2,2), deterministic=True)
   
   I also added environment variable values that select deterministic algorithms for convolution backpropagation:
   os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "3"
   
   Old:
   # Value of 1 chooses the best algo in a limited workspace
   # Value of 2 chooses the fastest algo whose memory requirements may be larger than the default workspace threshold
   Added:
   # Value of 3 chooses the best deterministic algo in a limited workspace
   # Value of 4 chooses the fastest deterministic algo whose memory requirements may be larger than the default workspace threshold
   
   They could be replaced by a global deterministic flag.
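
   To make the four values concrete, here is a rough sketch of the selection rule in plain Python. It is not the code from the pull request: the Algo record, select_algo, and the way candidates are described (time, workspace size, determinism flag) are all hypothetical and only illustrate the intended semantics:

```python
from typing import NamedTuple, Sequence


class Algo(NamedTuple):
    # Hypothetical description of one candidate algorithm.
    name: str
    time_ms: float
    workspace_bytes: int
    deterministic: bool


def select_algo(algos: Sequence[Algo], mode: int, workspace_limit: int) -> Algo:
    """Pick an algorithm according to autotune values 1 to 4.

    1: fastest within the workspace limit
    2: fastest, workspace may exceed the limit
    3: fastest deterministic within the workspace limit
    4: fastest deterministic, workspace may exceed the limit
    """
    if mode not in (1, 2, 3, 4):
        raise ValueError("unsupported mode: %r" % (mode,))
    pool = list(algos)
    if mode in (3, 4):
        # Deterministic modes drop every non-deterministic candidate.
        pool = [a for a in pool if a.deterministic]
    if mode in (1, 3):
        # Limited-workspace modes drop candidates that need too much memory.
        pool = [a for a in pool if a.workspace_bytes <= workspace_limit]
    if not pool:
        raise RuntimeError("no algorithm satisfies mode %d" % mode)
    return min(pool, key=lambda a: a.time_ms)
```

   Written this way, determinism and the workspace limit are two independent filters, which is why a global deterministic flag combined with the existing values 1 and 2 could express the same four cases.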
   
   4. As you can see above, I think it is good to let the user select the deterministic algorithm according to their constraint: speed or memory size.
   
   The problem with this solution is that if cuDNN chooses different deterministic algorithms on different runs, repeatability can still fail. I think it would be good to have another mechanism that lets the user select the cuDNN algorithm directly, if available.
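
   One possible shape for such a mechanism, sketched in plain Python under my own assumptions: the first run records which algorithm was chosen for each layer, and later runs reuse that record instead of autotuning again. The names resolve_algo, pins_to_json and pins_from_json are hypothetical and nothing like this exists in the pull request:

```python
import json


def resolve_algo(layer_key, autotune, pins):
    """Return the pinned algorithm for layer_key.

    autotune is only called when the layer has no pinned choice yet,
    and its result is recorded so that later lookups repeat it.
    """
    if layer_key not in pins:
        pins[layer_key] = autotune(layer_key)
    return pins[layer_key]


def pins_to_json(pins):
    # Serialize the choices so that they can be stored next to the model.
    return json.dumps(pins, indent=2, sort_keys=True)


def pins_from_json(text):
    # Restore the choices made by a previous run.
    return json.loads(text)
```

   With the pins saved from one run and loaded in the next, every layer uses the same algorithm even if autotuning would have picked a different deterministic one.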
   
