Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/02/13 07:30:18 UTC

[GitHub] [tvm] masahi opened a new issue #7455: [FLAKY] tvmc/test_frontends.py

masahi opened a new issue #7455:
URL: https://github.com/apache/tvm/issues/7455


   There is another CI incident on `main`: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/activity?branch=main
   
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/561/pipeline/
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/560/pipeline/
   
   The error is the same one discussed in https://github.com/apache/tvm/pull/7366. It could be due to the recurring omp + pytorch issue.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] areusch commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
areusch commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-780778447


   Happened again this morning: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/570/pipeline
   
   Relevant traceback:
   ```
   tests/python/driver/tvmc/test_frontends.py::test_load_model__pth munmap_chunk(): invalid pointer
   Fatal Python error: Aborted
   
   Current thread 0x00007fca21074740 (most recent call first):
     File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 416 in _conv_forward
     File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 419 in forward
     File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 704 in _slow_forward
     File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 720 in _call_impl
     File "/workspace/.local/lib/python3.6/site-packages/torchvision/models/resnet.py", line 203 in _forward_impl
     File "/workspace/.local/lib/python3.6/site-packages/torchvision/models/resnet.py", line 220 in forward
     File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 704 in _slow_forward
     File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 720 in _call_impl
     File "/workspace/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 1109 in trace_module
     File "/workspace/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 955 in trace
     File "/workspace/tests/python/driver/tvmc/conftest.py", line 113 in pytorch_resnet18
     File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 930 in call_fixture_func
     File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 1124 in pytest_fixture_setup
     File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
     File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
     File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 1070 in execute
     File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 689 in _compute_fixture_value
     File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 605 in _get_active_fixturedef
     File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 585 in getfixturevalue
     File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 572 in _fillfixtures
     File "/usr/local/lib/python3.6/dist-packages/_pytest/python.py", line 1633 in setup
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 448 in prepare
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 151 in pytest_runtest_setup
     File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
     File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 256 in <lambda>
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 310 in from_call
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 256 in call_runtest_hook
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 216 in call_and_report
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 121 in runtestprotocol
     File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 110 in pytest_runtest_protocol
     File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
     File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
     File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 338 in pytest_runtestloop
     File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
     File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
     File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 313 in _main
     File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 257 in wrap_session
     File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 306 in pytest_cmdline_main
     File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
     File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
     File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
     File "/usr/local/lib/python3.6/dist-packages/_pytest/config/__init__.py", line 165 in main
     File "/usr/local/lib/python3.6/dist-packages/_pytest/config/__init__.py", line 187 in console_main
     File "/usr/local/lib/python3.6/dist-packages/pytest/__main__.py", line 5 in <module>
     File "/usr/lib/python3.6/runpy.py", line 85 in _run_code
     File "/usr/lib/python3.6/runpy.py", line 193 in _run_module_as_main
   ```
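One way to check whether the tracing step alone triggers the abort is to run it in a child interpreter, so that a native crash shows up as an exit status instead of taking down the whole pytest run. The following is only a sketch under assumptions: `TRACE_SNIPPET`, `run_isolated` and `describe` are hypothetical names, and the snippet is a guess at what the `pytorch_resnet18` fixture does, based on the frames above (`torch.jit.trace` on a torchvision ResNet); the input shape is assumed.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for the fixture body: the traceback shows torch.jit.trace
# being called on a torchvision ResNet, but the input shape here is a guess.
TRACE_SNIPPET = """
import torch, torchvision
model = torchvision.models.resnet18().eval()
torch.jit.trace(model, torch.randn(1, 3, 224, 224))
"""


def run_isolated(code):
    """Run `code` in a fresh interpreter and return its exit status.

    On POSIX a negative value -N means the child was killed by signal N.
    """
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe(returncode):
    """Turn an exit status into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exit code %d" % returncode


if __name__ == "__main__":
    # Requires torch and torchvision in the environment under test.
    print(describe(run_isolated(TRACE_SNIPPET)))
```

Running this repeatedly on the affected CI node and on a local machine would show whether the abort depends on the environment rather than on the test itself.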



[GitHub] [tvm] ekalda commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
ekalda commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781505252


   @areusch I built and ran the tests in the Docker container using the commands you gave me, and I see this test being skipped (as in the successful CI runs), which makes me suspect that there is some CI problem...



[GitHub] [tvm] leandron edited a comment on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
leandron edited a comment on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781481566


   From a quick look at the CI jobs behind all the links posted so far (including the ones by @apivovarov), the failing job was always executed by `INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=1`, whereas I couldn't find any failure on `INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=0` (list below).
   
   Recent builds, picked at random:
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/567/pipeline/40
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/569/pipeline/40
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/576/pipeline/40
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/574/pipeline/40
   
   Any idea how this could be an infrastructure failure rather than a flaky test? (cc @areusch @masahi )
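The executor comparison above could be made systematic by tallying build outcomes per node and executor. The sketch below assumes the console log of each build has been saved as text and paired with an outcome label by hand; `tally_by_executor`, the `(log_text, outcome)` pairing and the labels are hypothetical, and only the `NODE_NAME=... EXECUTOR_NUMBER=...` line format is taken from the logs quoted in this comment.

```python
import re
from collections import Counter

# Matches the Jenkins log line quoted in the comment above.
NODE_LINE = re.compile(r"NODE_NAME=(?P<node>\S+)\s+EXECUTOR_NUMBER=(?P<executor>\d+)")


def tally_by_executor(builds):
    """Count builds per (node, executor, outcome).

    `builds` is an iterable of (log_text, outcome) pairs. The pairing and the
    outcome labels are hypothetical; Jenkins does not hand them over like this.
    """
    counts = Counter()
    for log_text, outcome in builds:
        match = NODE_LINE.search(log_text)
        if match is None:
            continue  # the log names no executor; skip it rather than guess
        counts[(match["node"], int(match["executor"]), outcome)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ("INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=1", "failed"),
        ("INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=0", "passed"),
    ]
    for key, count in sorted(tally_by_executor(sample).items()):
        print(key, count)
```

If failures cluster on a single executor across many builds, that points toward the environment of that executor rather than the test.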



[GitHub] [tvm] lhutton1 commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
lhutton1 commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-966386532


   I believe this can be closed now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] areusch commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
areusch commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781455336


   @ekalda are you using the Docker container to reproduce?
   
   For example:
   `docker/bash.sh tlcpack/ci-cpu:v0.72-t0 ./tests/scripts/task_config_build_cpu.sh`
   `docker/bash.sh tlcpack/ci-cpu:v0.72-t0 ./tests/scripts/task_build.sh build -j2`
   `docker/bash.sh tlcpack/ci-cpu:v0.72-t0 ./tests/scripts/task_python_integration.sh`
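The three steps above can also be scripted so that they always run in the same order and stop at the first failure. This is a minimal sketch, assuming it is started from the root of a TVM checkout with Docker available; `CI_IMAGE`, `reproduction_commands` and `run_all` are hypothetical names, while the image tag and script paths are copied from the commands above.

```python
import subprocess

# Image tag and script paths are copied from the comment above.
CI_IMAGE = "tlcpack/ci-cpu:v0.72-t0"


def reproduction_commands(image=CI_IMAGE):
    """Return the three steps as argv lists, in the order given above."""
    wrapper = "docker/bash.sh"
    return [
        [wrapper, image, "./tests/scripts/task_config_build_cpu.sh"],
        [wrapper, image, "./tests/scripts/task_build.sh", "build", "-j2"],
        [wrapper, image, "./tests/scripts/task_python_integration.sh"],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in turn; check=True stops at the first non-zero exit."""
    for argv in commands:
        runner(argv, check=True)


if __name__ == "__main__":
    # Must be started from the root of a TVM checkout with Docker available.
    run_all(reproduction_commands())
```

The `runner` parameter exists only so the sequence can be exercised without Docker, for instance by passing a function that records the commands instead of executing them.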
   



[GitHub] [tvm] masahi commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781610350


   FYI this test is getting disabled in https://github.com/apache/tvm/pull/7465



[GitHub] [tvm] tqchen closed issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
tqchen closed issue #7455:
URL: https://github.com/apache/tvm/issues/7455


   



[GitHub] [tvm] leandron commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
leandron commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781481566


   From a quick look at the CI jobs, all the links posted so far (including the ones by @apivovarov) indicate that the job was being executed by `INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=1`, whereas I couldn't find any failure on `INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=0`.
   
   Recent builds, picked at random:
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/567/pipeline/40
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/569/pipeline/40
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/576/pipeline/40
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/574/pipeline/40
   
   Any idea how this could be an infrastructure failure rather than a flaky test? (cc @areusch @masahi )



[GitHub] [tvm] leandron commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
leandron commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781243496


   cc @ekalda



[GitHub] [tvm] ekalda commented on issue #7455: [FLAKY] tvmc/test_frontends.py

Posted by GitBox <gi...@apache.org>.
ekalda commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781320885


   Very odd. It looks like CI is always either skipping that test or failing with that error. That test is passing for me locally though, so I'm not sure how to reproduce it...

