Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/09/15 03:08:06 UTC

[GitHub] [incubator-mxnet] DickJC123 opened a new pull request #19148: [v1.x] Backport Unittest tolerance handling improvements (#18694). Also test seeding (#18762).

DickJC123 opened a new pull request #19148:
URL: https://github.com/apache/incubator-mxnet/pull/19148


   ## Description ##
   This backport prepares MXNet 1.8 to be built against CUDA 11 and cuDNN 8 and run on A100 GPUs, which employ TensorFloat-32 (TF32) by default.  See PR https://github.com/apache/incubator-mxnet/pull/18694 for full details.
   
   While developing the PR being backported, I fixed numerous other CI issues that kept CI from passing.  When that PR was accepted, I was still working on a couple of additional fixes, which I put into a follow-up PR, https://github.com/apache/incubator-mxnet/pull/18762 "Improve test seeding and robustness in test_numpy_interoperablity.py".  To help get a passing CI, this PR backports that as well.
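
For context on why TF32 affects unit-test tolerances: TF32 keeps only 10 explicit mantissa bits of a float32 input, so results can drift from a float32 or float64 reference by far more than a typical float32 tolerance allows. The snippet below is a rough sketch of that effect in plain NumPy and is not code from this PR. The helper `round_to_tf32` is a hypothetical name, it emulates TF32 by truncating bits (real hardware rounding may differ), and the tolerance values are illustrative assumptions rather than the ones MXNet uses.

```python
import numpy as np

def round_to_tf32(x):
    """Hypothetical helper: emulate TF32 input precision by zeroing the
    low 13 of float32's 23 mantissa bits (truncation, not true rounding)."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFFE000)).view(np.float32)

# Fixed seed so the comparison is reproducible from run to run.
rng = np.random.RandomState(0)
a = rng.uniform(1.0, 2.0, size=(64, 64)).astype(np.float32)
b = rng.uniform(1.0, 2.0, size=(64, 64)).astype(np.float32)

# Reference product in float64 versus a product of TF32-like inputs.
ref = a.astype(np.float64) @ b.astype(np.float64)
tf32_like = round_to_tf32(a).astype(np.float64) @ round_to_tf32(b).astype(np.float64)

# A tolerance suited to full float32 precision versus a looser one.
tight_ok = np.allclose(tf32_like, ref, rtol=1e-5, atol=0)
loose_ok = np.allclose(tf32_like, ref, rtol=1e-2, atol=0)
print("passes with rtol=1e-5:", tight_ok)
print("passes with rtol=1e-2:", loose_ok)
```

With these inputs the tight check fails and the loose one passes, which is the kind of mismatch that tolerance handling keyed to the effective precision is meant to absorb.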
   
   @samskalicky @anirudh2290 @ChaiBapchya @ptrendx
   ## Checklist ##
   ### Essentials ###
   - [x] PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] Code is well-documented
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] samskalicky commented on pull request #19148: [v1.x] Backport Unittest tolerance handling improvements (#18694). Also test seeding (#18762).

Posted by GitBox <gi...@apache.org>.
samskalicky commented on pull request #19148:
URL: https://github.com/apache/incubator-mxnet/pull/19148#issuecomment-692463200


   ```
   [2020-09-15T04:31:26.709Z] [ 99%] Linking CXX shared library mxnet_52.dll
   [2020-09-15T04:37:33.250Z] LINK: command "C:\PROGRA~2\MICROS~1.0\VC\bin\X86_AM~1\link.exe /nologo @CMakeFiles\mxnet_52.dir\objects1.rsp /out:mxnet_52.dll /implib:mxnet_52.lib /pdb:C:\jenkins_slave\workspace\build-gpu\build\mxnet_52.pdb /dll /version:0.0 /machine:x64 /INCREMENTAL:NO /OPT:REF /OPT:ICF -LIBPATH:C:\PROGRA~1\NVIDIA~2\CUDA\v10.2\lib\x64 3rdparty\mkldnn\src\dnnl.lib C:\Program Files\OpenBLAS-windows-v0_2_19\lib\libopenblas.dll.a C:\Program Files\opencv\x64\vc14\lib\opencv_world412.lib C:\Program Files\opencv\x64\vc14\lib\opencv_world412.lib C:\Program Files\opencv\x64\vc14\lib\opencv_world412.lib C:\Program Files\opencv\x64\vc14\lib\opencv_world412.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cudnn.lib cuda.lib 3rdparty\dmlc-core\dmlc.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cudart.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cufft.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cublas.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cusolver.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cusparse.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\curand.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\nvrtc.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cuda.lib cudadevrt.lib cudart_static.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:mxnet_52.dll.manifest" failed (exit code 1102) with the following output:
   [2020-09-15T04:37:33.250Z]    Creating library mxnet_52.lib and object mxnet_52.exp
   [2020-09-15T04:37:33.250Z] LINK : fatal error LNK1102: out of memory
   ```
   Do we need to enable compression?





[GitHub] [incubator-mxnet] szha merged pull request #19148: [v1.x] Backport Unittest tolerance handling improvements (#18694). Also test seeding (#18762).

Posted by GitBox <gi...@apache.org>.
szha merged pull request #19148:
URL: https://github.com/apache/incubator-mxnet/pull/19148


   





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19148: [v1.x] Backport Unittest tolerance handling improvements (#18694). Also test seeding (#18762).

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19148:
URL: https://github.com/apache/incubator-mxnet/pull/19148#issuecomment-692435078


   Hey @DickJC123, thanks for submitting the PR.
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [website, unix-gpu, sanity, centos-gpu, windows-gpu, edge, miscellaneous, windows-cpu, clang, unix-cpu, centos-cpu]
   *** 
   _Note_: 
    Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   

