Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/11/14 01:20:28 UTC

[GitHub] mseth10 commented on issue #12314: flaky test: test_operator.test_dropout

URL: https://github.com/apache/incubator-mxnet/issues/12314#issuecomment-438501681
 
 
   I investigated this issue. When the MKL and OpenMP flags are both on, the dropout symbol implementation calls the function BernoulliGenerate, and that is where the flakiness comes from.
   
   BernoulliGenerate is multithreaded and calls the MKL library function viRngBernoulli to populate a mask vector: https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/dropout-inl.h#L82
   
   When **p**=1.0, it is supposed to populate the mask vector **r** of length **n** with all 1s. But for a few internally generated values of **seed**, the mask vector ends up with n-1 1s and a single 0 (at a different index for each flaky seed), which causes the test to fail.
   
   Also, the error only occurs when multithreading is used. It does not reproduce with a single thread.
   
   I suspect that viRngBernoulli is not thread-safe.
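
For comparison, here is a minimal sketch of the invariant described above, written without MKL. It is not the code in dropout-inl.h and it does not reproduce the failure. The function name `bernoulli_mask`, its parameters, and the use of `std::mt19937` with `std::bernoulli_distribution` in place of viRngBernoulli are all assumptions made for illustration. Each chunk gets its own engine, so no generator state is shared across threads, and with p=1.0 every element should come out as 1 regardless of how many chunks are used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a chunked Bernoulli mask generator.
// Fills a mask of length n with Bernoulli(p) draws, split into num_chunks
// contiguous blocks. Each block seeds a private engine from (seed, chunk
// index), so no random-number state is shared between threads.
std::vector<int> bernoulli_mask(std::size_t n, double p, std::uint32_t seed,
                                int num_chunks) {
  assert(num_chunks > 0);
  std::vector<int> mask(n, 0);
  const std::size_t chunks = static_cast<std::size_t>(num_chunks);
  const std::size_t chunk_len = (n + chunks - 1) / chunks;  // ceiling division

  // The pragma is ignored when OpenMP is not enabled; the loop then runs
  // sequentially and produces the same mask.
  #pragma omp parallel for
  for (int c = 0; c < num_chunks; ++c) {
    const std::size_t begin = std::min(n, chunk_len * static_cast<std::size_t>(c));
    const std::size_t end = std::min(n, begin + chunk_len);
    std::seed_seq seq{seed, static_cast<std::uint32_t>(c)};
    std::mt19937 engine(seq);             // private to this chunk
    std::bernoulli_distribution dist(p);  // private to this chunk
    for (std::size_t i = begin; i < end; ++i) {
      mask[i] = dist(engine) ? 1 : 0;
    }
  }
  return mask;
}
```

Calling `bernoulli_mask(n, 1.0, seed, k)` for several values of `seed` and `k` and counting the zeros gives a reference to compare against the MKL path: any zero in the MKL-generated mask at p=1.0 would point to the generator call rather than to the test.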
   
   @TaoLv @pengzhao-intel @ZhennanQin @xinyu-intel Can you please take a look? I would really appreciate your input on this. Thanks!
