Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/10 12:17:21 UTC

[GitHub] [incubator-mxnet] wkcn opened a new pull request #18688: Fix the flaky bug of 'test_npx_batch_norm'

wkcn opened a new pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688


   ## Description ##
   Fix #18687 
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be made.
   - Interesting edge cases to note here
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] wkcn commented on pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
wkcn commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-656786234


   @mxnet-bot run ci [windows-cpu]





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18688: Fix the flaky bug of 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-656645431


   Hey @wkcn , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [windows-cpu, sanity, windows-gpu, website, unix-cpu, centos-cpu, centos-gpu, miscellaneous, edge, clang, unix-gpu]
   *** 
   _Note_: 
   Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   





[GitHub] [incubator-mxnet] DickJC123 commented on pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
DickJC123 commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-656859038


   I ran into this issue while developing a PR I have yet to submit. The discrepancy in model outputs arises because cuDNN uses the 'sample variance' when it calculates the running variance, while this test compares against the 'population variance' in all cases. The sample variance divides by N-1, while the population variance divides by N (where N is the number of elements in the sample).
   
   My upcoming PR will include a fix for this. After it's merged, you could revert the commit that changed the problem sizes if you want, since the problem sizes are not the real issue here.
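The sample- vs. population-variance distinction described above can be illustrated with a minimal sketch in plain Python. This is not the cuDNN or MXNet implementation; the function names `population_variance`, `sample_variance`, and `relative_gap`, and the example data, are hypothetical and exist only for illustration.

```python
import statistics


def population_variance(xs):
    """Variance with N in the denominator."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def sample_variance(xs):
    """Variance with N-1 in the denominator (Bessel's correction)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def relative_gap(n):
    """Relative difference (sample - population) / population for N elements."""
    return 1.0 / (n - 1)


# Hypothetical data, chosen only to make the arithmetic easy to follow.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
pop = population_variance(data)
samp = sample_variance(data)

# The standard library provides both definitions under separate names.
std_pop = statistics.pvariance(data)
std_samp = statistics.variance(data)

print("population variance:", pop)
print("sample variance:", samp)
print("relative gap:", (samp - pop) / pop)
```

Because the two definitions differ by a factor of N/(N-1), the relative gap is 1/(N-1). It shrinks as N grows, so a comparison against the wrong definition is more likely to exceed a fixed tolerance when the number of elements per channel is small.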





[GitHub] [incubator-mxnet] wkcn commented on pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
wkcn commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-659811667


   @DickJC123 
   Thank you!
   The problem size can be reverted after PR #18694 is merged : )





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-656786283


   Jenkins CI successfully triggered : [windows-cpu]





[GitHub] [incubator-mxnet] leezu merged pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
leezu merged pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688


   





[GitHub] [incubator-mxnet] DickJC123 commented on pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
DickJC123 commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-659766979


   My PR https://github.com/apache/incubator-mxnet/pull/18694 is now pushed, and I am working toward a clean CI run before merging. The sample- vs. population-variance issue I mentioned above is corrected in that PR's commit https://github.com/apache/incubator-mxnet/pull/18694/commits/e0a7dda38d17c7c607a94ef4efe3b88ff1955fb3 . Let me know if you would like the problem sizes reverted to what they were, and I can add a commit for that.
   





[GitHub] [incubator-mxnet] leezu commented on pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
leezu commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-656794745


   Can you elaborate on why 24 produces imprecise results?





[GitHub] [incubator-mxnet] leezu commented on pull request #18688: Fix the flaky test 'test_npx_batch_norm'

Posted by GitBox <gi...@apache.org>.
leezu commented on pull request #18688:
URL: https://github.com/apache/incubator-mxnet/pull/18688#issuecomment-656833732


   I merged this PR to prevent flakiness. Please still elaborate on the root cause of the imprecision. Thank you!

