Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2022/01/04 16:44:04 UTC

[GitHub] [incubator-mxnet] tpyl commented on pull request #20491: [BUGFIX] [MXNET-1456] Added support for OptimizeForBackend in C++ API

tpyl commented on pull request #20491:
URL: https://github.com/apache/incubator-mxnet/pull/20491#issuecomment-1004978787


   @nswamy @aaronmarkham 
   Looks like unix-gpu (e.g. ubuntu_gpu_cu114), which this PR bumps up to Ubuntu 20.04 (because installing TensorRT 8 is by far the most straightforward on Ubuntu 20.04), suffers from the same reference leak issue that has been flagged for CentOS. 
   
   This could be a clue as to why the reference leaks happen. I don't think this is an issue strictly introduced by this PR. 
   
   Similarly, it is hard to see how the segfault for centos-gpu could be caused by the changes in this PR. None of the TensorRT code is compiled in for the centos-gpu tests, and none of the CentOS CI code is changed. (The only place where we build mxnet with -DUSE_TENSORRT=1 is build_ubuntu_gpu_tensorrt().) 
   
   I would be grateful for any suggestions on how to get the tests passing for this PR. 
   ```
   [2022-01-04T06:01:03.508Z] ==================================== ERRORS ====================================
   [2022-01-04T06:01:03.508Z] _ ERROR at teardown of test_np_standard_binary_funcs[lshape5-rshape5-add-add-True-numeric-<lambda>-None--1.0-1.0] _
   [2022-01-04T06:01:03.508Z] [gw1] linux -- Python 3.8.10 /usr/bin/python3
   [2022-01-04T06:01:03.508Z] 
   [2022-01-04T06:01:03.508Z] request = <SubRequest 'check_leak_ndarray' for <Function test_np_standard_binary_funcs[lshape5-rshape5-add-add-True-numeric-<lambda>-None--1.0-1.0]>>
   [2022-01-04T06:01:03.508Z] 
   [2022-01-04T06:01:03.508Z]     @pytest.fixture(autouse=True)
   [2022-01-04T06:01:03.508Z]     def check_leak_ndarray(request):
   [2022-01-04T06:01:03.508Z]         garbage_expected = request.node.get_closest_marker('garbage_expected')
   [2022-01-04T06:01:03.508Z]         if garbage_expected:  # Some tests leak references. They should be fixed.
   [2022-01-04T06:01:03.508Z]             yield  # run test
   [2022-01-04T06:01:03.508Z]             return
   [2022-01-04T06:01:03.508Z]     
   [2022-01-04T06:01:03.508Z]         if 'centos' in platform.platform():
   [2022-01-04T06:01:03.508Z]             # Multiple tests are failing due to reference leaks on CentOS. It's not
   [2022-01-04T06:01:03.508Z]             # yet known why there are more memory leaks in the Python 3.6.9 version
   [2022-01-04T06:01:03.508Z]             # shipped on CentOS compared to the Python 3.6.9 version shipped in
   [2022-01-04T06:01:03.508Z]             # Ubuntu.
   [2022-01-04T06:01:03.508Z]             yield
   [2022-01-04T06:01:03.508Z]             return
   ```
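
The log above is cut off before the fixture's actual leak check. For anyone trying to reproduce this locally, here is a minimal sketch of the general technique such a check can rely on, assuming it is based on Python's `gc` module. This is not the MXNet fixture: `FakeArray`, `find_leaked`, `leaky` and `clean` are hypothetical names made up for illustration.

```python
import gc


class FakeArray:
    """Hypothetical stand-in for an NDArray; not an MXNet type."""

    def __init__(self):
        self.ref = None


def find_leaked(test_fn, leaked_type):
    """Run test_fn and return the leaked_type objects that only the
    cycle collector (not plain reference counting) was able to free."""
    gc.collect()                    # start from a clean state
    del gc.garbage[:]
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage
    try:
        test_fn()
        gc.collect()
        return [obj for obj in gc.garbage if isinstance(obj, leaked_type)]
    finally:
        gc.set_debug(old_flags)     # restore the previous debug flags
        del gc.garbage[:]


def leaky():
    # Reference cycle: reference counting alone cannot free a and b.
    a, b = FakeArray(), FakeArray()
    a.ref, b.ref = b, a


def clean():
    # No cycle: the object is freed as soon as the function returns.
    a = FakeArray()
    a.ref = None


print(len(find_leaked(leaky, FakeArray)))
print(len(find_leaked(clean, FakeArray)))
```

A check like this flags any test that leaves reference cycles behind, so whether it fires can depend on the Python version and on how objects happen to reference each other, which may be part of why the failures differ between platforms.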

