Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/11/22 20:36:34 UTC

[GitHub] [incubator-mxnet] access2rohit opened a new pull request #19576: Fix large tensor nightly tests by splitting into 2 parts

access2rohit opened a new pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576


   ## Description ##
   Splits the Large Tensor nightly tests into 2 parts so that they don't cause OOM.
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] Code is well-documented
   
   ## Testing ##
   Part 1:
   ```
   (pytest) ubuntu@ip-172-31-90-243 ~/workspace/incubator-mxnet (lt_nightly) $ python -m pytest -s --exitfirst --timeout=0 --verbose tests/nightly/test_np_large_array_part1.py
   =============================================================== test session starts ================================================================
   platform linux -- Python 3.6.10, pytest-5.3.5, py-1.8.2, pluggy-0.13.1 -- /home/ubuntu/anaconda3/envs/pytest/bin/python
   cachedir: .pytest_cache
   rootdir: /home/ubuntu/workspace/incubator-mxnet, inifile: pytest.ini
   plugins: timeout-1.4.2
   collected 115 items
   
   
   tests/nightly/test_np_large_array_part1.py::test_gluon_embedding Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=294716448 to reproduce.
   [23:41:23] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
   PASSED
   tests/nightly/test_np_large_array_part1.py::test_fully_connected PASSED
   tests/nightly/test_np_large_array_part1.py::test_dense PASSED
   tests/nightly/test_np_large_array_part1.py::test_softmax PASSED
   tests/nightly/test_np_large_array_part1.py::test_ones PASSED
   tests/nightly/test_np_large_array_part1.py::test_zeros PASSED
   tests/nightly/test_np_large_array_part1.py::test_ones_like PASSED
   tests/nightly/test_np_large_array_part1.py::test_zeros_like PASSED
   tests/nightly/test_np_large_array_part1.py::test_abs [23:43:41] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600).  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
   PASSED
   tests/nightly/test_np_large_array_part1.py::test_binary_broadcast PASSED
   tests/nightly/test_np_large_array_part1.py::test_all PASSED
   tests/nightly/test_np_large_array_part1.py::test_amin PASSED
   tests/nightly/test_np_large_array_part1.py::test_amax PASSED
   tests/nightly/test_np_large_array_part1.py::test_argmin PASSED
   tests/nightly/test_np_large_array_part1.py::test_argmax PASSED
   tests/nightly/test_np_large_array_part1.py::test_trigonometric_family PASSED
   tests/nightly/test_np_large_array_part1.py::test_any PASSED
   tests/nightly/test_np_large_array_part1.py::test_append PASSED
   tests/nightly/test_np_large_array_part1.py::test_arange PASSED
   tests/nightly/test_np_large_array_part1.py::test_argsort PASSED
   tests/nightly/test_np_large_array_part1.py::test_atleast_xd_family PASSED
   tests/nightly/test_np_large_array_part1.py::test_average PASSED
   tests/nightly/test_np_large_array_part1.py::test_bincount PASSED
   tests/nightly/test_np_large_array_part1.py::test_blackman PASSED
   tests/nightly/test_np_large_array_part1.py::test_broadcast_to PASSED
   tests/nightly/test_np_large_array_part1.py::test_root_family PASSED
   tests/nightly/test_np_large_array_part1.py::test_ceil_floor PASSED
   tests/nightly/test_np_large_array_part1.py::test_clip PASSED
   tests/nightly/test_np_large_array_part1.py::test_column_stack PASSED
   tests/nightly/test_np_large_array_part1.py::test_concatenate PASSED
   tests/nightly/test_np_large_array_part1.py::test_copysign PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_uniform PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_normal PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_gamma SKIPPED
   tests/nightly/test_np_large_array_part1.py::test_random_exponential PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_laplace PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_choice PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_gumbel PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_logistic PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_multinomial SKIPPED
   tests/nightly/test_np_large_array_part1.py::test_random_pareto PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_power PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_rayleigh PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_weibull PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_shuffle PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_lognormal PASSED
   tests/nightly/test_np_large_array_part1.py::test_random_randint PASSED
   tests/nightly/test_np_large_array_part1.py::test_slice_assign PASSED
   tests/nightly/test_np_large_array_part1.py::test_logical_family PASSED
   tests/nightly/test_np_large_array_part1.py::test_deg_rad PASSED
   tests/nightly/test_np_large_array_part1.py::test_divide PASSED
   tests/nightly/test_np_large_array_part1.py::test_minimum PASSED
   tests/nightly/test_np_large_array_part1.py::test_maximum PASSED
   tests/nightly/test_np_large_array_part1.py::test_eye PASSED
   tests/nightly/test_np_large_array_part1.py::test_fix PASSED
   tests/nightly/test_np_large_array_part1.py::test_flip PASSED
   tests/nightly/test_np_large_array_part1.py::test_fliplr PASSED
   tests/nightly/test_np_large_array_part1.py::test_flipud PASSED
   tests/nightly/test_np_large_array_part1.py::test_full PASSED
   tests/nightly/test_np_large_array_part1.py::test_full_like PASSED
   tests/nightly/test_np_large_array_part1.py::test_comparison_family PASSED
   tests/nightly/test_np_large_array_part1.py::test_lcm PASSED
   tests/nightly/test_np_large_array_part1.py::test_log_family PASSED
   tests/nightly/test_np_large_array_part1.py::test_expand_dims PASSED
   tests/nightly/test_np_large_array_part1.py::test_hamming PASSED
   tests/nightly/test_np_large_array_part1.py::test_hanning PASSED
   tests/nightly/test_np_large_array_part1.py::test_fmax PASSED
   tests/nightly/test_np_large_array_part1.py::test_fmin PASSED
   tests/nightly/test_np_large_array_part1.py::test_fmod PASSED
   tests/nightly/test_np_large_array_part1.py::test_mod PASSED
   tests/nightly/test_np_large_array_part1.py::test_value_check_family PASSED
   tests/nightly/test_np_large_array_part1.py::test_rint PASSED
   tests/nightly/test_np_large_array_part1.py::test_invert PASSED
   tests/nightly/test_np_large_array_part1.py::test_exp PASSED
   tests/nightly/test_np_large_array_part1.py::test_expm1 PASSED
   tests/nightly/test_np_large_array_part1.py::test_frexp SKIPPED
   tests/nightly/test_np_large_array_part1.py::test_reciprocal PASSED
   tests/nightly/test_np_large_array_part1.py::test_sum PASSED
   tests/nightly/test_np_large_array_part1.py::test_negative PASSED
   tests/nightly/test_np_large_array_part1.py::test_identity PASSED
   tests/nightly/test_np_large_array_part1.py::test_square PASSED
   tests/nightly/test_np_large_array_part1.py::test_sign PASSED
   tests/nightly/test_np_large_array_part1.py::test_prod PASSED
   tests/nightly/test_np_large_array_part1.py::test_add PASSED
   tests/nightly/test_np_large_array_part1.py::test_hypot PASSED
   tests/nightly/test_np_large_array_part1.py::test_power PASSED
   tests/nightly/test_np_large_array_part1.py::test_ldexp PASSED
   tests/nightly/test_np_large_array_part1.py::test_multiply PASSED
   tests/nightly/test_np_large_array_part1.py::test_subtract PASSED
   tests/nightly/test_np_large_array_part1.py::test_diag PASSED
   tests/nightly/test_np_large_array_part1.py::test_diag_indices_from PASSED
   tests/nightly/test_np_large_array_part1.py::test_diagflat PASSED
   tests/nightly/test_np_large_array_part1.py::test_diagonal PASSED
   tests/nightly/test_np_large_array_part1.py::test_roll PASSED
   tests/nightly/test_np_large_array_part1.py::test_polyval PASSED
   tests/nightly/test_np_large_array_part1.py::test_activation PASSED
   tests/nightly/test_np_large_array_part1.py::test_arange_like PASSED
   tests/nightly/test_np_large_array_part1.py::test_batch_dot SKIPPED
   tests/nightly/test_np_large_array_part1.py::test_cast PASSED
   tests/nightly/test_np_large_array_part1.py::test_broadcast_like PASSED
   tests/nightly/test_np_large_array_part1.py::test_constraint_check PASSED
   tests/nightly/test_np_large_array_part1.py::test_batch_flatten PASSED
   tests/nightly/test_np_large_array_part1.py::test_batch_norm SKIPPED
   tests/nightly/test_np_large_array_part1.py::test_nonzero PASSED
   tests/nightly/test_np_large_array_part1.py::test_one_hot PASSED
   tests/nightly/test_np_large_array_part1.py::test_pick PASSED
   tests/nightly/test_np_large_array_part1.py::test_scalar_poisson PASSED
   tests/nightly/test_np_large_array_part1.py::test_tensor_poisson PASSED
   tests/nightly/test_np_large_array_part1.py::test_reshape PASSED
   tests/nightly/test_np_large_array_part1.py::test_reshape_like PASSED
   tests/nightly/test_np_large_array_part1.py::test_sigmoid PASSED
   tests/nightly/test_np_large_array_part1.py::test_shape_array PASSED
   tests/nightly/test_np_large_array_part1.py::test_stop_gradient PASSED
   tests/nightly/test_np_large_array_part1.py::test_sequence_mask PASSED
   tests/nightly/test_np_large_array_part1.py::test_topk PASSED
   
   ===================================================================================================================================== warnings summary ======================================================================================================================================
   tests/nightly/test_np_large_array_part1.py:91
     /home/ubuntu/workspace/incubator-mxnet/tests/nightly/test_np_large_array_part1.py:91: DeprecationWarning: invalid escape sequence \
       '''
   tests/nightly/test_np_large_array_part1.py:1321
     /home/ubuntu/workspace/incubator-mxnet/tests/nightly/test_np_large_array_part1.py:1321: DeprecationWarning: invalid escape sequence \
       '''
   
   -- Docs: https://docs.pytest.org/en/latest/warnings.html
   ================================================================================================================== 110 passed, 5 skipped, 2 warnings in 1877.16s (0:31:17) ==================================================================================================================
   ```
   
   Part 2:
   ```
   (pytest) ubuntu@ip-172-31-90-243 ~/workspace/incubator-mxnet (lt_nightly) $ python -m pytest -s --exitfirst --timeout=0 --verbose tests/nightly/test_np_large_array_part2.py
   =============================================================== test session starts ================================================================
   platform linux -- Python 3.6.10, pytest-5.3.5, py-1.8.2, pluggy-0.13.1 -- /home/ubuntu/anaconda3/envs/pytest/bin/python
   cachedir: .pytest_cache
   rootdir: /home/ubuntu/workspace/incubator-mxnet, inifile: pytest.ini
   plugins: timeout-1.4.2
   collected 51 items
   
   tests/nightly/test_np_large_array_part2.py::test_slice Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1391863372 to reproduce.
   [00:49:35] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
   PASSED
   tests/nightly/test_np_large_array_part2.py::test_smooth_l1 [00:49:40] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600).  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
   PASSED
   tests/nightly/test_np_large_array_part2.py::test_gamma PASSED
   tests/nightly/test_np_large_array_part2.py::test_gammaln PASSED
   tests/nightly/test_np_large_array_part2.py::test_digamma PASSED
   tests/nightly/test_np_large_array_part2.py::test_rnn_dim_check SKIPPED
   tests/nightly/test_np_large_array_part2.py::test_rnn_vanilla SKIPPED
   tests/nightly/test_np_large_array_part2.py::test_rnn_gru PASSED
   tests/nightly/test_np_large_array_part2.py::test_rnn_lstm PASSED
   tests/nightly/test_np_large_array_part2.py::test_ctc_loss PASSED
   tests/nightly/test_np_large_array_part2.py::test_erf PASSED
   tests/nightly/test_np_large_array_part2.py::test_erfinv PASSED
   tests/nightly/test_np_large_array_part2.py::test_index_add PASSED
   tests/nightly/test_np_large_array_part2.py::test_index_update PASSED
   tests/nightly/test_np_large_array_part2.py::test_layer_norm PASSED
   tests/nightly/test_np_large_array_part2.py::test_dlpack PASSED
   tests/nightly/test_np_large_array_part2.py::test_pooling PASSED
   tests/nightly/test_np_large_array_part2.py::test_roi_pooling PASSED
   tests/nightly/test_np_large_array_part2.py::test_save_load SKIPPED
   tests/nightly/test_np_large_array_part2.py::test_gather_nd PASSED
   tests/nightly/test_np_large_array_part2.py::test_random_bernoulli PASSED
   tests/nightly/test_np_large_array_part2.py::test_cumsum PASSED
   tests/nightly/test_np_large_array_part2.py::test_round PASSED
   tests/nightly/test_np_large_array_part2.py::test_cross PASSED
   tests/nightly/test_np_large_array_part2.py::test_array_split PASSED
   tests/nightly/test_np_large_array_part2.py::test_take PASSED
   tests/nightly/test_np_large_array_part2.py::test_std PASSED
   tests/nightly/test_np_large_array_part2.py::test_var PASSED
   tests/nightly/test_np_large_array_part2.py::test_rollaxis PASSED
   tests/nightly/test_np_large_array_part2.py::test_vstack PASSED
   tests/nightly/test_np_large_array_part2.py::test_ediff1d PASSED
   tests/nightly/test_np_large_array_part2.py::test_split PASSED
   tests/nightly/test_np_large_array_part2.py::test_hsplit PASSED
   tests/nightly/test_np_large_array_part2.py::test_vsplit PASSED
   tests/nightly/test_np_large_array_part2.py::test_dsplit PASSED
   tests/nightly/test_np_large_array_part2.py::test_tril_indices PASSED
   tests/nightly/test_np_large_array_part2.py::test_tril_indices_extreme PASSED
   tests/nightly/test_np_large_array_part2.py::test_diff PASSED
   tests/nightly/test_np_large_array_part2.py::test_kron PASSED
   tests/nightly/test_np_large_array_part2.py::test_logspace PASSED
   tests/nightly/test_np_large_array_part2.py::test_linspace PASSED
   tests/nightly/test_np_large_array_part2.py::test_histogram PASSED
   tests/nightly/test_np_large_array_part2.py::test_nan_to_num PASSED
   tests/nightly/test_np_large_array_part2.py::test_interp PASSED
   tests/nightly/test_np_large_array_part2.py::test_edge_padding PASSED
   tests/nightly/test_np_large_array_part2.py::test_constant_padding PASSED
   tests/nightly/test_np_large_array_part2.py::test_minimum_padding PASSED
   tests/nightly/test_np_large_array_part2.py::test_reflection_padding PASSED
   tests/nightly/test_np_large_array_part2.py::test_symmetric_padding PASSED
   tests/nightly/test_np_large_array_part2.py::test_fill_diagonal PASSED
   tests/nightly/test_np_large_array_part2.py::test_insert PASSED
   
   ================================================================= warnings summary =================================================================
   tests/nightly/test_np_large_array_part2.py:49
     /home/ubuntu/workspace/incubator-mxnet/tests/nightly/test_np_large_array_part2.py:49: DeprecationWarning: invalid escape sequence \
       '''
   
   -- Docs: https://docs.pytest.org/en/latest/warnings.html
   ============================================== 48 passed, 3 skipped, 1 warning in 1640.82s (0:27:20) ===============================================
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #19576: Fix large tensor nightly tests by running each test as subprocess

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-736795418


   @access2rohit is this ready to merge?
   





[GitHub] [incubator-mxnet] Zha0q1 merged pull request #19576: Fix large tensor nightly tests by running each test as subprocess

Posted by GitBox <gi...@apache.org>.
Zha0q1 merged pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576


   





[GitHub] [incubator-mxnet] leezu commented on pull request #19576: Fix large tensor nightly tests by splitting into 2 parts

Posted by GitBox <gi...@apache.org>.
leezu commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-732421203


   If it's an issue with the memory pool, would there be any issue with disabling the pool? `MXNET_GPU_MEM_POOL_TYPE=Unpooled`
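
In practice this suggestion means setting the variable in the environment of the test process before MXNet is loaded. The following is only a minimal sketch: `unpooled_env` is a hypothetical helper, not part of MXNet or of this PR, and nothing in this thread confirms that unpooled allocation avoids the OOM.

```python
import os


def unpooled_env(base=None):
    """Return a copy of the environment with the GPU memory pool disabled.

    The variable name and value come from the comment above; the helper
    itself is an illustration only.
    """
    env = dict(os.environ if base is None else base)
    env["MXNET_GPU_MEM_POOL_TYPE"] = "Unpooled"
    return env
```

The returned mapping could then be passed as the `env` argument of `subprocess.call` when launching `python -m pytest tests/nightly/test_np_large_array_part1.py`.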





[GitHub] [incubator-mxnet] access2rohit commented on pull request #19576: Fix large tensor nightly tests by splitting into 2 parts

Posted by GitBox <gi...@apache.org>.
access2rohit commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-732414584


   > Why does the single file cause OOM? It looks like there is a memory leak?
   
   No, there isn't. The issue here is MXNet's pooled memory management. Individually, none of these tests takes more than 120GB, but when run consecutively they cause OOM.
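
If memory held by the pool accumulates across consecutive tests, one way around it is to run every test in a fresh process, so that whatever the pool holds is released when that process exits. The later title of this PR mentions running each test as a subprocess, but the code below is not taken from the PR. It is a minimal sketch, and the helper names `build_command` and `run_isolated` are hypothetical.

```python
import subprocess
import sys


def build_command(test_file, test_name):
    """Return a pytest command line that runs exactly one test by node ID."""
    node_id = f"{test_file}::{test_name}"
    return [sys.executable, "-m", "pytest", "-s", "--verbose", node_id]


def run_isolated(test_file, test_names, runner=subprocess.call):
    """Run each test in its own process and return the names that failed.

    Each child process exits after its single test, so memory it was
    holding is returned to the OS before the next test starts.
    """
    failed = []
    for name in test_names:
        # A non-zero exit code from pytest means the test did not pass.
        if runner(build_command(test_file, name)) != 0:
            failed.append(name)
    return failed
```

For example, `run_isolated("tests/nightly/test_np_large_array_part1.py", ["test_ones", "test_zeros"])` would start two separate pytest processes, one per test. The `runner` parameter exists only so the loop can be exercised without launching real processes.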





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19576: Fix large tensor nightly tests by splitting into 2 parts

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-731843270


   Hey @access2rohit, thanks for submitting the PR.
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [windows-gpu, unix-gpu, miscellaneous, sanity, clang, website, edge, centos-gpu, centos-cpu, unix-cpu, windows-cpu]
   *** 
   _Note_: 
   Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
   All CI tests must pass before the PR can be merged. 
   

