Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/09 06:04:36 UTC

[GitHub] [incubator-mxnet] eric-haibin-lin opened a new issue #18400: flaky test: check leak ndarray

eric-haibin-lin opened a new issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400


   http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-18394/1/pipeline
   
   ```
   [2020-05-24T09:53:03.464Z] ==================================== ERRORS ====================================
   [2020-05-24T09:53:03.464Z] _____________________ ERROR at teardown of test_function1 ______________________
   [2020-05-24T09:53:03.464Z] 
   [2020-05-24T09:53:03.464Z] request = <SubRequest 'check_leak_ndarray' for <Function test_function1>>
   [2020-05-24T09:53:03.464Z] 
   [2020-05-24T09:53:03.464Z]     @pytest.fixture(autouse=True)
   [2020-05-24T09:53:03.464Z]     def check_leak_ndarray(request):
   [2020-05-24T09:53:03.464Z]         garbage_expected = request.node.get_closest_marker('garbage_expected')
   [2020-05-24T09:53:03.464Z]         if garbage_expected:  # Some tests leak references. They should be fixed.
   [2020-05-24T09:53:03.464Z]             yield  # run test
   [2020-05-24T09:53:03.464Z]             return
   [2020-05-24T09:53:03.464Z]     
   [2020-05-24T09:53:03.464Z]         if 'centos' in platform.platform():
   [2020-05-24T09:53:03.464Z]             # Multiple tests are failing due to reference leaks on CentOS. It's not
   [2020-05-24T09:53:03.464Z]             # yet known why there are more memory leaks in the Python 3.6.9 version
   [2020-05-24T09:53:03.464Z]             # shipped on CentOS compared to the Python 3.6.9 version shipped in
   [2020-05-24T09:53:03.464Z]             # Ubuntu.
   [2020-05-24T09:53:03.464Z]             yield
   [2020-05-24T09:53:03.464Z]             return
   [2020-05-24T09:53:03.464Z]     
   [2020-05-24T09:53:03.464Z]         del gc.garbage[:]
   [2020-05-24T09:53:03.464Z]         # Collect garbage prior to running the next test
   [2020-05-24T09:53:03.464Z]         gc.collect()
   [2020-05-24T09:53:03.464Z]         # Enable gc debug mode to check if the test leaks any arrays
   [2020-05-24T09:53:03.464Z]         gc_flags = gc.get_debug()
   [2020-05-24T09:53:03.464Z]         gc.set_debug(gc.DEBUG_SAVEALL)
   [2020-05-24T09:53:03.464Z]     
   [2020-05-24T09:53:03.464Z]         # Run the test
   [2020-05-24T09:53:03.464Z]         yield
   [2020-05-24T09:53:03.464Z]     
   [2020-05-24T09:53:03.464Z]         # Check for leaked NDArrays
   [2020-05-24T09:53:03.464Z]         gc.collect()
   [2020-05-24T09:53:03.464Z]         gc.set_debug(gc_flags)  # reset gc flags
   [2020-05-24T09:53:03.464Z]     
   [2020-05-24T09:53:03.464Z]         seen = set()
   [2020-05-24T09:53:03.464Z]         def has_array(element):
   [2020-05-24T09:53:03.464Z]             try:
   [2020-05-24T09:53:03.464Z]                 if element in seen:
   [2020-05-24T09:53:03.464Z]                     return False
   [2020-05-24T09:53:03.464Z]                 seen.add(element)
   [2020-05-24T09:53:03.464Z]             except (TypeError, ValueError):  # unhashable
   [2020-05-24T09:53:03.464Z]                 pass
   [2020-05-24T09:53:03.464Z]     
   [2020-05-24T09:53:03.464Z]             if isinstance(element, mx.nd._internal.NDArrayBase):
   [2020-05-24T09:53:03.464Z]                 return True
   [2020-05-24T09:53:03.464Z]             elif isinstance(element, mx.sym._internal.SymbolBase):
   [2020-05-24T09:53:03.464Z]                 return False
   [2020-05-24T09:53:03.464Z]             elif hasattr(element, '__dict__'):
   [2020-05-24T09:53:03.464Z]                 return any(has_array(x) for x in vars(element))
   [2020-05-24T09:53:03.464Z]             elif isinstance(element, dict):
   [2020-05-24T09:53:03.464Z]                 return any(has_array(x) for x in element.items())
   [2020-05-24T09:53:03.464Z]             else:
   [2020-05-24T09:53:03.464Z]                 try:
   [2020-05-24T09:53:03.464Z]                     return any(has_array(x) for x in element)
   [2020-05-24T09:53:03.464Z]                 except (TypeError, KeyError, RecursionError):
   [2020-05-24T09:53:03.464Z]                     return False
   [2020-05-24T09:53:03.464Z]     
   [2020-05-24T09:53:03.464Z] >       assert not any(has_array(x) for x in gc.garbage), 'Found leaked NDArrays due to reference cycles'
   [2020-05-24T09:53:03.464Z] E       AssertionError: Found leaked NDArrays due to reference cycles
   [2020-05-24T09:53:03.464Z] E       assert not True
   [2020-05-24T09:53:03.464Z] E        +  where True = any(<generator object check_leak_ndarray.<locals>.<genexpr> at 0x7f96c07802b0>)
   [2020-05-24T09:53:03.464Z] 
   [2020-05-24T09:53:03.464Z] tests/python/conftest.py:78: AssertionError
   [2020-05-24T09:53:03.464Z] ---------------------------- Captured stderr setup -----------------------------
   [2020-05-24T09:53:03.464Z] DEBUG:root:np/mx/python random seeds are set to 135663639, use MXNET_TEST_SEED=135663639 to reproduce.
   [2020-05-24T09:53:03.464Z] ------------------------------ Captured log setup ------------------------------
   [2020-05-24T09:53:03.464Z] DEBUG    root:conftest.py:193 np/mx/python random seeds are set to 135663639, use MXNET_TEST_SEED=135663639 to reproduce.
   [2020-05-24T09:53:03.464Z] ----------------------------- Captured stderr call -----------------------------
   [2020-05-24T09:53:03.464Z] [DEBUG] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1404816900 to reproduce.
   [2020-05-24T09:53:03.465Z] DEBUG:common:Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1404816900 to reproduce.
   [2020-05-24T09:53:03.465Z] ------------------------------ Captured log call -------------------------------
   [2020-05-24T09:53:03.465Z] DEBUG    common:common.py:221 Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1404816900 to reproduce.
   
   ```
   @leezu 
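
For readers unfamiliar with the mechanism the fixture in the log relies on, here is a minimal standard-library sketch. It assumes nothing about MXNet: `Node`, `collect_cycle_garbage`, `make_cycle` and `make_no_cycle` are hypothetical names introduced only for illustration. With `gc.DEBUG_SAVEALL` set, the collector appends every unreachable object it finds to `gc.garbage` instead of freeing it, so objects kept alive only by a reference cycle show up there, while objects freed by plain reference counting do not.

```python
import gc


class Node:
    """Hypothetical stand-in for an object that holds an array."""

    def __init__(self):
        self.ref = None


def collect_cycle_garbage(make_objects):
    """Run make_objects() and return the objects the cycle collector found."""
    del gc.garbage[:]
    gc.collect()                      # start from a clean slate
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        make_objects()
        gc.collect()
        found = list(gc.garbage)
    finally:
        gc.set_debug(old_flags)       # restore the previous debug flags
        del gc.garbage[:]
    return found


def make_cycle():
    a, b = Node(), Node()
    a.ref, b.ref = b, a               # cycle: only the collector can free these


def make_no_cycle():
    a, b = Node(), Node()
    a.ref = b                         # no cycle: freed by reference counting


leaked = [o for o in collect_cycle_garbage(make_cycle) if isinstance(o, Node)]
clean = [o for o in collect_cycle_garbage(make_no_cycle) if isinstance(o, Node)]
print(len(leaked), len(clean))
```

The fixture in the log follows the same pattern around `yield`, then searches `gc.garbage` for NDArray instances rather than for a toy class.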


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] leezu commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-714064592


   `ERROR at teardown of test_foreach`
   
   https://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-19336/runs/4/nodes/284/steps/420/log/?start=0





---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org


[GitHub] [incubator-mxnet] szha edited a comment on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
szha edited a comment on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-655920921


   http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18562/runs/11/nodes/354/steps/501/log/?start=0
   
   `ERROR at teardown of test_grad_with_stype`





[GitHub] [incubator-mxnet] leezu closed issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
leezu closed issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400


   





[GitHub] [incubator-mxnet] leezu commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-639195881


   ```
    _____________________ ERROR at teardown of test_get_symbol _____________________
   [2020-06-04T22:33:35.745Z] 
   [2020-06-04T22:33:35.745Z] request = <SubRequest 'check_leak_ndarray' for <Function test_get_symbol>>
   [2020-06-04T22:33:35.745Z] 
   [2020-06-04T22:33:35.745Z]     @pytest.fixture(autouse=True)
   [2020-06-04T22:33:35.745Z]     def check_leak_ndarray(request):
   [2020-06-04T22:33:35.745Z]         garbage_expected = request.node.get_closest_marker('garbage_expected')
   [2020-06-04T22:33:35.745Z]         if garbage_expected:  # Some tests leak references. They should be fixed.
   [2020-06-04T22:33:35.745Z]             yield  # run test
   [2020-06-04T22:33:35.745Z]             return
   [2020-06-04T22:33:35.745Z]     
   [2020-06-04T22:33:35.745Z]         if 'centos' in platform.platform():
   [2020-06-04T22:33:35.745Z]             # Multiple tests are failing due to reference leaks on CentOS. It's not
   [2020-06-04T22:33:35.745Z]             # yet known why there are more memory leaks in the Python 3.6.9 version
   [2020-06-04T22:33:35.745Z]             # shipped on CentOS compared to the Python 3.6.9 version shipped in
   [2020-06-04T22:33:35.745Z]             # Ubuntu.
   [2020-06-04T22:33:35.745Z]             yield
   [2020-06-04T22:33:35.745Z]             return
   [2020-06-04T22:33:35.745Z]     
   [2020-06-04T22:33:35.745Z]         del gc.garbage[:]
   [2020-06-04T22:33:35.745Z]         # Collect garbage prior to running the next test
   [2020-06-04T22:33:35.745Z]         gc.collect()
   [2020-06-04T22:33:35.745Z]         # Enable gc debug mode to check if the test leaks any arrays
   [2020-06-04T22:33:35.745Z]         gc_flags = gc.get_debug()
   [2020-06-04T22:33:35.745Z]         gc.set_debug(gc.DEBUG_SAVEALL)
   [2020-06-04T22:33:35.745Z]     
   [2020-06-04T22:33:35.745Z]         # Run the test
   [2020-06-04T22:33:35.745Z]         yield
   [2020-06-04T22:33:35.745Z]     
   [2020-06-04T22:33:35.745Z]         # Check for leaked NDArrays
   [2020-06-04T22:33:35.745Z]         gc.collect()
   [2020-06-04T22:33:35.745Z]         gc.set_debug(gc_flags)  # reset gc flags
   [2020-06-04T22:33:35.745Z]     
   [2020-06-04T22:33:35.745Z]         seen = set()
   [2020-06-04T22:33:35.745Z]         def has_array(element):
   [2020-06-04T22:33:35.745Z]             try:
   [2020-06-04T22:33:35.745Z]                 if element in seen:
   [2020-06-04T22:33:35.745Z]                     return False
   [2020-06-04T22:33:35.745Z]                 seen.add(element)
   [2020-06-04T22:33:35.745Z]             except (TypeError, ValueError):  # unhashable
   [2020-06-04T22:33:35.745Z]                 pass
   [2020-06-04T22:33:35.745Z]     
   [2020-06-04T22:33:35.745Z]             if isinstance(element, mx.nd._internal.NDArrayBase):
   [2020-06-04T22:33:35.745Z]                 return True
   [2020-06-04T22:33:35.745Z]             elif isinstance(element, mx.sym._internal.SymbolBase):
   [2020-06-04T22:33:35.745Z]                 return False
   [2020-06-04T22:33:35.745Z]             elif hasattr(element, '__dict__'):
   [2020-06-04T22:33:35.745Z]                 return any(has_array(x) for x in vars(element))
   [2020-06-04T22:33:35.745Z]             elif isinstance(element, dict):
   [2020-06-04T22:33:35.745Z]                 return any(has_array(x) for x in element.items())
   [2020-06-04T22:33:35.745Z]             else:
   [2020-06-04T22:33:35.745Z]                 try:
   [2020-06-04T22:33:35.745Z]                     return any(has_array(x) for x in element)
   [2020-06-04T22:33:35.745Z]                 except (TypeError, KeyError, RecursionError):
   [2020-06-04T22:33:35.745Z]                     return False
   [2020-06-04T22:33:35.745Z]     
   [2020-06-04T22:33:35.745Z] >       assert not any(has_array(x) for x in gc.garbage), 'Found leaked NDArrays due to reference cycles'
   [2020-06-04T22:33:35.745Z] E       AssertionError: Found leaked NDArrays due to reference cycles
   [2020-06-04T22:33:35.745Z] E       assert not True
   [2020-06-04T22:33:35.745Z] E        +  where True = any(<generator object check_leak_ndarray.<locals>.<genexpr> at 0x7f8a046fb0a0>)
   [2020-06-04T22:33:35.745Z] 
   ```
   
   http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18485/runs/2/nodes/365/steps/570/log/?start=0





[GitHub] [incubator-mxnet] eric-haibin-lin commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
eric-haibin-lin commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-642217396


   Happened again for `test_get_symbol`: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-18525/8/pipeline





[GitHub] [incubator-mxnet] szha commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
szha commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-655920921


   http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18562/runs/11/nodes/354/steps/501/log/?start=0





[GitHub] [incubator-mxnet] szha commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
szha commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-646900164


   http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18562/runs/5/nodes/364/steps/758/log/?start=0








[GitHub] [incubator-mxnet] leezu commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-634171210


   As the flakiness occurs with `mx.autograd.Function`, which is "known to leak" (cf. the `test_function` in the same file), I suggest marking the flaky `test_function1` as "known to leak" as well. I'm not yet sure why `test_function1` leaks only sometimes.
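
A minimal sketch of what such a marking could look like, assuming the `garbage_expected` marker name that the `check_leak_ndarray` fixture in the log looks up via `request.node.get_closest_marker('garbage_expected')`. The test body here is a placeholder, not the real `test_function1`.

```python
import pytest


# The marker name is taken from the fixture shown in the CI log; when the
# fixture finds it on a test, it yields and returns without checking gc.garbage.
@pytest.mark.garbage_expected
def test_function1():
    # Placeholder body (assumption): the real test exercises
    # mx.autograd.Function and is not reproduced here.
    pass
```

With the marker applied, the leak check is skipped for this one test, so an intermittent reference cycle no longer fails teardown. The trade-off is that a genuine new leak in that test would also go unnoticed.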





[GitHub] [incubator-mxnet] leezu commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-634416116


   Happened also in http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18408/runs/4/nodes/364/steps/758/log/?start=0





[GitHub] [incubator-mxnet] leezu commented on issue #18400: flaky test: check leak ndarray

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18400:
URL: https://github.com/apache/incubator-mxnet/issues/18400#issuecomment-646891134


   And a third time. I'm not sure why this happens from time to time or why it only affects `test_get_symbol`, but let's disable the check for `test_get_symbol` in favor of CI stability: https://github.com/apache/incubator-mxnet/pull/18595
   
   http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18589/runs/1/nodes/364/steps/755/log/?start=0

