Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/11/22 20:36:34 UTC
[GitHub] [incubator-mxnet] access2rohit opened a new pull request #19576: Fix large tensor nightly tests by splitting into 2 parts
access2rohit opened a new pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576
## Description ##
Splits the Large Tensor nightly tests into 2 parts so that they don't cause OOM.
## Checklist ##
### Essentials ###
- [x] Changes are complete (i.e. I finished coding on this PR)
- [x] All changes have test coverage
- [x] Code is well-documented
## Testing ##
Part 1
```
(pytest) ubuntu@ip-172-31-90-243 ~/workspace/incubator-mxnet (lt_nightly) $ python -m pytest -s --exitfirst --timeout=0 --verbose tests/nightly/test_np_large_array_part1.py
==================================================================================================================================== test session starts ====================================================================================================================================
platform linux -- Python 3.6.10, pytest-5.3.5, py-1.8.2, pluggy-0.13.1 -- /home/ubuntu/anaconda3/envs/pytest/bin/python
cachedir: .pytest_cache
rootdir: /home/ubuntu/workspace/incubator-mxnet, inifile: pytest.ini
plugins: timeout-1.4.2
collected 115 items
tests/nightly/test_np_large_array_part1.py::test_gluon_embedding Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=294716448 to reproduce.
[23:41:23] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
PASSED
tests/nightly/test_np_large_array_part1.py::test_fully_connected PASSED
tests/nightly/test_np_large_array_part1.py::test_dense PASSED
tests/nightly/test_np_large_array_part1.py::test_softmax PASSED
tests/nightly/test_np_large_array_part1.py::test_ones PASSED
tests/nightly/test_np_large_array_part1.py::test_zeros PASSED
tests/nightly/test_np_large_array_part1.py::test_ones_like PASSED
tests/nightly/test_np_large_array_part1.py::test_zeros_like PASSED
tests/nightly/test_np_large_array_part1.py::test_abs [23:43:41] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600). Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
PASSED
tests/nightly/test_np_large_array_part1.py::test_binary_broadcast PASSED
tests/nightly/test_np_large_array_part1.py::test_all PASSED
tests/nightly/test_np_large_array_part1.py::test_amin PASSED
tests/nightly/test_np_large_array_part1.py::test_amax PASSED
tests/nightly/test_np_large_array_part1.py::test_argmin PASSED
tests/nightly/test_np_large_array_part1.py::test_argmax PASSED
tests/nightly/test_np_large_array_part1.py::test_trigonometric_family PASSED
tests/nightly/test_np_large_array_part1.py::test_any PASSED
tests/nightly/test_np_large_array_part1.py::test_append PASSED
tests/nightly/test_np_large_array_part1.py::test_arange PASSED
tests/nightly/test_np_large_array_part1.py::test_argsort PASSED
tests/nightly/test_np_large_array_part1.py::test_atleast_xd_family PASSED
tests/nightly/test_np_large_array_part1.py::test_average PASSED
tests/nightly/test_np_large_array_part1.py::test_bincount PASSED
tests/nightly/test_np_large_array_part1.py::test_blackman PASSED
tests/nightly/test_np_large_array_part1.py::test_broadcast_to PASSED
tests/nightly/test_np_large_array_part1.py::test_root_family PASSED
tests/nightly/test_np_large_array_part1.py::test_ceil_floor PASSED
tests/nightly/test_np_large_array_part1.py::test_clip PASSED
tests/nightly/test_np_large_array_part1.py::test_column_stack PASSED
tests/nightly/test_np_large_array_part1.py::test_concatenate PASSED
tests/nightly/test_np_large_array_part1.py::test_copysign PASSED
tests/nightly/test_np_large_array_part1.py::test_random_uniform PASSED
tests/nightly/test_np_large_array_part1.py::test_random_normal PASSED
tests/nightly/test_np_large_array_part1.py::test_random_gamma SKIPPED
tests/nightly/test_np_large_array_part1.py::test_random_exponential PASSED
tests/nightly/test_np_large_array_part1.py::test_random_laplace PASSED
tests/nightly/test_np_large_array_part1.py::test_random_choice PASSED
tests/nightly/test_np_large_array_part1.py::test_random_gumbel PASSED
tests/nightly/test_np_large_array_part1.py::test_random_logistic PASSED
tests/nightly/test_np_large_array_part1.py::test_random_multinomial SKIPPED
tests/nightly/test_np_large_array_part1.py::test_random_pareto PASSED
tests/nightly/test_np_large_array_part1.py::test_random_power PASSED
tests/nightly/test_np_large_array_part1.py::test_random_rayleigh PASSED
tests/nightly/test_np_large_array_part1.py::test_random_weibull PASSED
tests/nightly/test_np_large_array_part1.py::test_random_shuffle PASSED
tests/nightly/test_np_large_array_part1.py::test_random_lognormal PASSED
tests/nightly/test_np_large_array_part1.py::test_random_randint PASSED
tests/nightly/test_np_large_array_part1.py::test_slice_assign PASSED
tests/nightly/test_np_large_array_part1.py::test_logical_family PASSED
tests/nightly/test_np_large_array_part1.py::test_deg_rad PASSED
tests/nightly/test_np_large_array_part1.py::test_divide PASSED
tests/nightly/test_np_large_array_part1.py::test_minimum PASSED
tests/nightly/test_np_large_array_part1.py::test_maximum PASSED
tests/nightly/test_np_large_array_part1.py::test_eye PASSED
tests/nightly/test_np_large_array_part1.py::test_fix PASSED
tests/nightly/test_np_large_array_part1.py::test_flip PASSED
tests/nightly/test_np_large_array_part1.py::test_fliplr PASSED
tests/nightly/test_np_large_array_part1.py::test_flipud PASSED
tests/nightly/test_np_large_array_part1.py::test_full PASSED
tests/nightly/test_np_large_array_part1.py::test_full_like PASSED
tests/nightly/test_np_large_array_part1.py::test_comparison_family PASSED
tests/nightly/test_np_large_array_part1.py::test_lcm PASSED
tests/nightly/test_np_large_array_part1.py::test_log_family PASSED
tests/nightly/test_np_large_array_part1.py::test_expand_dims PASSED
tests/nightly/test_np_large_array_part1.py::test_hamming PASSED
tests/nightly/test_np_large_array_part1.py::test_hanning PASSED
tests/nightly/test_np_large_array_part1.py::test_fmax PASSED
tests/nightly/test_np_large_array_part1.py::test_fmin PASSED
tests/nightly/test_np_large_array_part1.py::test_fmod PASSED
tests/nightly/test_np_large_array_part1.py::test_mod PASSED
tests/nightly/test_np_large_array_part1.py::test_value_check_family PASSED
tests/nightly/test_np_large_array_part1.py::test_rint PASSED
tests/nightly/test_np_large_array_part1.py::test_invert PASSED
tests/nightly/test_np_large_array_part1.py::test_exp PASSED
tests/nightly/test_np_large_array_part1.py::test_expm1 PASSED
tests/nightly/test_np_large_array_part1.py::test_frexp SKIPPED
tests/nightly/test_np_large_array_part1.py::test_reciprocal PASSED
tests/nightly/test_np_large_array_part1.py::test_sum PASSED
tests/nightly/test_np_large_array_part1.py::test_negative PASSED
tests/nightly/test_np_large_array_part1.py::test_identity PASSED
tests/nightly/test_np_large_array_part1.py::test_square PASSED
tests/nightly/test_np_large_array_part1.py::test_sign PASSED
tests/nightly/test_np_large_array_part1.py::test_prod PASSED
tests/nightly/test_np_large_array_part1.py::test_add PASSED
tests/nightly/test_np_large_array_part1.py::test_hypot PASSED
tests/nightly/test_np_large_array_part1.py::test_power PASSED
tests/nightly/test_np_large_array_part1.py::test_ldexp PASSED
tests/nightly/test_np_large_array_part1.py::test_multiply PASSED
tests/nightly/test_np_large_array_part1.py::test_subtract PASSED
tests/nightly/test_np_large_array_part1.py::test_diag PASSED
tests/nightly/test_np_large_array_part1.py::test_diag_indices_from PASSED
tests/nightly/test_np_large_array_part1.py::test_diagflat PASSED
tests/nightly/test_np_large_array_part1.py::test_diagonal PASSED
tests/nightly/test_np_large_array_part1.py::test_roll PASSED
tests/nightly/test_np_large_array_part1.py::test_polyval PASSED
tests/nightly/test_np_large_array_part1.py::test_activation PASSED
tests/nightly/test_np_large_array_part1.py::test_arange_like PASSED
tests/nightly/test_np_large_array_part1.py::test_batch_dot SKIPPED
tests/nightly/test_np_large_array_part1.py::test_cast PASSED
tests/nightly/test_np_large_array_part1.py::test_broadcast_like PASSED
tests/nightly/test_np_large_array_part1.py::test_constraint_check PASSED
tests/nightly/test_np_large_array_part1.py::test_batch_flatten PASSED
tests/nightly/test_np_large_array_part1.py::test_batch_norm SKIPPED
tests/nightly/test_np_large_array_part1.py::test_nonzero PASSED
tests/nightly/test_np_large_array_part1.py::test_one_hot PASSED
tests/nightly/test_np_large_array_part1.py::test_pick PASSED
tests/nightly/test_np_large_array_part1.py::test_scalar_poisson PASSED
tests/nightly/test_np_large_array_part1.py::test_tensor_poisson PASSED
tests/nightly/test_np_large_array_part1.py::test_reshape PASSED
tests/nightly/test_np_large_array_part1.py::test_reshape_like PASSED
tests/nightly/test_np_large_array_part1.py::test_sigmoid PASSED
tests/nightly/test_np_large_array_part1.py::test_shape_array PASSED
tests/nightly/test_np_large_array_part1.py::test_stop_gradient PASSED
tests/nightly/test_np_large_array_part1.py::test_sequence_mask PASSED
tests/nightly/test_np_large_array_part1.py::test_topk PASSED
===================================================================================================================================== warnings summary ======================================================================================================================================
tests/nightly/test_np_large_array_part1.py:91
/home/ubuntu/workspace/incubator-mxnet/tests/nightly/test_np_large_array_part1.py:91: DeprecationWarning: invalid escape sequence \
'''
tests/nightly/test_np_large_array_part1.py:1321
/home/ubuntu/workspace/incubator-mxnet/tests/nightly/test_np_large_array_part1.py:1321: DeprecationWarning: invalid escape sequence \
'''
-- Docs: https://docs.pytest.org/en/latest/warnings.html
================================================================================================================== 110 passed, 5 skipped, 2 warnings in 1877.16s (0:31:17) ==================================================================================================================
```
Part 2
```
(pytest) ubuntu@ip-172-31-90-243 ~/workspace/incubator-mxnet (lt_nightly) $ python -m pytest -s --exitfirst --timeout=0 --verbose tests/nightly/test_np_large_array_part2.py
=============================================================== test session starts ================================================================
platform linux -- Python 3.6.10, pytest-5.3.5, py-1.8.2, pluggy-0.13.1 -- /home/ubuntu/anaconda3/envs/pytest/bin/python
cachedir: .pytest_cache
rootdir: /home/ubuntu/workspace/incubator-mxnet, inifile: pytest.ini
plugins: timeout-1.4.2
collected 51 items
tests/nightly/test_np_large_array_part2.py::test_slice Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1391863372 to reproduce.
[00:49:35] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
PASSED
tests/nightly/test_np_large_array_part2.py::test_smooth_l1 [00:49:40] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600). Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
PASSED
tests/nightly/test_np_large_array_part2.py::test_gamma PASSED
tests/nightly/test_np_large_array_part2.py::test_gammaln PASSED
tests/nightly/test_np_large_array_part2.py::test_digamma PASSED
tests/nightly/test_np_large_array_part2.py::test_rnn_dim_check SKIPPED
tests/nightly/test_np_large_array_part2.py::test_rnn_vanilla SKIPPED
tests/nightly/test_np_large_array_part2.py::test_rnn_gru PASSED
tests/nightly/test_np_large_array_part2.py::test_rnn_lstm PASSED
tests/nightly/test_np_large_array_part2.py::test_ctc_loss PASSED
tests/nightly/test_np_large_array_part2.py::test_erf PASSED
tests/nightly/test_np_large_array_part2.py::test_erfinv PASSED
tests/nightly/test_np_large_array_part2.py::test_index_add PASSED
tests/nightly/test_np_large_array_part2.py::test_index_update PASSED
tests/nightly/test_np_large_array_part2.py::test_layer_norm PASSED
tests/nightly/test_np_large_array_part2.py::test_dlpack PASSED
tests/nightly/test_np_large_array_part2.py::test_pooling PASSED
tests/nightly/test_np_large_array_part2.py::test_roi_pooling PASSED
tests/nightly/test_np_large_array_part2.py::test_save_load SKIPPED
tests/nightly/test_np_large_array_part2.py::test_gather_nd PASSED
tests/nightly/test_np_large_array_part2.py::test_random_bernoulli PASSED
tests/nightly/test_np_large_array_part2.py::test_cumsum PASSED
tests/nightly/test_np_large_array_part2.py::test_round PASSED
tests/nightly/test_np_large_array_part2.py::test_cross PASSED
tests/nightly/test_np_large_array_part2.py::test_array_split PASSED
tests/nightly/test_np_large_array_part2.py::test_take PASSED
tests/nightly/test_np_large_array_part2.py::test_std PASSED
tests/nightly/test_np_large_array_part2.py::test_var PASSED
tests/nightly/test_np_large_array_part2.py::test_rollaxis PASSED
tests/nightly/test_np_large_array_part2.py::test_vstack PASSED
tests/nightly/test_np_large_array_part2.py::test_ediff1d PASSED
tests/nightly/test_np_large_array_part2.py::test_split PASSED
tests/nightly/test_np_large_array_part2.py::test_hsplit PASSED
tests/nightly/test_np_large_array_part2.py::test_vsplit PASSED
tests/nightly/test_np_large_array_part2.py::test_dsplit PASSED
tests/nightly/test_np_large_array_part2.py::test_tril_indices PASSED
tests/nightly/test_np_large_array_part2.py::test_tril_indices_extreme PASSED
tests/nightly/test_np_large_array_part2.py::test_diff PASSED
tests/nightly/test_np_large_array_part2.py::test_kron PASSED
tests/nightly/test_np_large_array_part2.py::test_logspace PASSED
tests/nightly/test_np_large_array_part2.py::test_linspace PASSED
tests/nightly/test_np_large_array_part2.py::test_histogram PASSED
tests/nightly/test_np_large_array_part2.py::test_nan_to_num PASSED
tests/nightly/test_np_large_array_part2.py::test_interp PASSED
tests/nightly/test_np_large_array_part2.py::test_edge_padding PASSED
tests/nightly/test_np_large_array_part2.py::test_constant_padding PASSED
tests/nightly/test_np_large_array_part2.py::test_minimum_padding PASSED
tests/nightly/test_np_large_array_part2.py::test_reflection_padding PASSED
tests/nightly/test_np_large_array_part2.py::test_symmetric_padding PASSED
tests/nightly/test_np_large_array_part2.py::test_fill_diagonal PASSED
tests/nightly/test_np_large_array_part2.py::test_insert PASSED
================================================================= warnings summary =================================================================
tests/nightly/test_np_large_array_part2.py:49
/home/ubuntu/workspace/incubator-mxnet/tests/nightly/test_np_large_array_part2.py:49: DeprecationWarning: invalid escape sequence \
'''
-- Docs: https://docs.pytest.org/en/latest/warnings.html
============================================== 48 passed, 3 skipped, 1 warning in 1640.82s (0:27:20) ===============================================
```
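As a quick sanity check on the two logs above, the outcome counts in each summary line should add up to the number of collected items (115 and 51). A small sketch of that arithmetic follows; `outcome_counts` is a hypothetical helper written for this note, not part of pytest or MXNet.

```python
import re


def outcome_counts(summary_line):
    """Extract per-outcome counts (passed/failed/skipped) from a pytest summary line."""
    pairs = re.findall(r"(\d+) (passed|failed|skipped)", summary_line)
    return {label: int(count) for count, label in pairs}


# Summary lines copied from the two runs above.
part1 = outcome_counts("110 passed, 5 skipped, 2 warnings in 1877.16s (0:31:17)")
part2 = outcome_counts("48 passed, 3 skipped, 1 warning in 1640.82s (0:27:20)")

# Each total should equal the "collected N items" line of the same run.
print(sum(part1.values()), sum(part2.values()))
```

Warnings are deliberately not counted, since they are not test outcomes.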
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #19576: Fix large tensor nightly tests by running each test as subprocess
Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-736795418
@access2rohit is this ready to merge?
----------------------------------------------------------------
[GitHub] [incubator-mxnet] Zha0q1 merged pull request #19576: Fix large tensor nightly tests by running each test as subprocess
Posted by GitBox <gi...@apache.org>.
Zha0q1 merged pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576
----------------------------------------------------------------
[GitHub] [incubator-mxnet] leezu commented on pull request #19576: Fix large tensor nightly tests by splitting into 2 parts
Posted by GitBox <gi...@apache.org>.
leezu commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-732421203
If it's an issue with the memory pool, would there be any issue with disabling the pool? `MXNET_GPU_MEM_POOL_TYPE=Unpooled`
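One way to try this suggestion locally, sketched under assumptions: launch the test file in a child process whose environment sets `MXNET_GPU_MEM_POOL_TYPE=Unpooled`. The helper names below are hypothetical. The pytest flags and the test path are taken from the PR description, and `--timeout=0` relies on the pytest-timeout plugin listed in those logs. Whether this variable also changes the CPU storage manager seen in the logs is not established in this thread.

```python
import os
import subprocess
import sys


def unpooled_env(base_env=None):
    """Return a copy of the environment with the GPU memory pool set to Unpooled."""
    env = dict(os.environ if base_env is None else base_env)
    env["MXNET_GPU_MEM_POOL_TYPE"] = "Unpooled"
    return env


def pytest_command(test_path):
    """Build the pytest invocation used in the PR description for one test file."""
    return [sys.executable, "-m", "pytest", "-s", "--exitfirst",
            "--timeout=0", "--verbose", test_path]


def run_unpooled(test_path):
    """Run the test file in a child process that sees the modified environment."""
    return subprocess.run(pytest_command(test_path), env=unpooled_env()).returncode


# Example call (not executed here):
# run_unpooled("tests/nightly/test_np_large_array_part1.py")
```

Passing the variable through `env=` keeps the change scoped to the child process, so the parent shell and other test runs are unaffected.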
----------------------------------------------------------------
[GitHub] [incubator-mxnet] access2rohit commented on pull request #19576: Fix large tensor nightly tests by splitting into 2 parts
Posted by GitBox <gi...@apache.org>.
access2rohit commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-732414584
> Why does the single file cause OOM? It looks like there is a memory leak?
No, there isn't. It's MXNet's pooled memory management that is the issue here. Individually, none of these tests takes more than 120 GB, but when run consecutively they cause OOM.
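The later title of this PR mentions running each test as a subprocess. A minimal sketch of that idea follows, assuming that memory retained by the pool is returned to the OS when the process exits. The helpers `per_test_commands` and `run_isolated` are hypothetical and are not the PR's actual implementation; the `file::test_name` form is pytest's standard node ID syntax.

```python
import subprocess
import sys


def per_test_commands(test_file, test_names):
    """Build one pytest invocation per test, addressed by its file::name node ID."""
    return [
        [sys.executable, "-m", "pytest", "-s", "--timeout=0", "--verbose",
         f"{test_file}::{name}"]
        for name in test_names
    ]


def run_isolated(commands, runner=subprocess.run):
    """Run each command in a fresh process and return the commands that failed."""
    failed = []
    for cmd in commands:
        # A new process per test means no pooled memory carries over between tests.
        if runner(cmd).returncode != 0:
            failed.append(cmd)
    return failed


# Example call (not executed here):
# cmds = per_test_commands("tests/nightly/test_np_large_array_part1.py",
#                          ["test_ones", "test_zeros"])
# run_isolated(cmds)
```

The trade-off is startup cost: every test pays for a fresh interpreter and library import, in exchange for a clean memory state.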
----------------------------------------------------------------
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19576: Fix large tensor nightly tests by splitting into 2 parts
Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19576:
URL: https://github.com/apache/incubator-mxnet/pull/19576#issuecomment-731843270
Hey @access2rohit, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [windows-gpu, unix-gpu, miscellaneous, sanity, clang, website, edge, centos-gpu, centos-cpu, unix-cpu, windows-cpu]
***
_Note_:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.