Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/02/13 07:30:18 UTC
[GitHub] [tvm] masahi opened a new issue #7455: [FLAKY] tvmc/test_frontends.py
masahi opened a new issue #7455:
URL: https://github.com/apache/tvm/issues/7455
There is another CI incident on `main`: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/activity?branch=main
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/561/pipeline/
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/560/pipeline/
The error is the same one discussed in https://github.com/apache/tvm/pull/7366. It could be due to the recurring omp + PyTorch issue.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] areusch commented on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
areusch commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-780778447
Happened again this morning: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/570/pipeline
Relevant traceback:
```
tests/python/driver/tvmc/test_frontends.py::test_load_model__pth munmap_chunk(): invalid pointer
Fatal Python error: Aborted
Current thread 0x00007fca21074740 (most recent call first):
File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 416 in _conv_forward
File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 419 in forward
File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 704 in _slow_forward
File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 720 in _call_impl
File "/workspace/.local/lib/python3.6/site-packages/torchvision/models/resnet.py", line 203 in _forward_impl
File "/workspace/.local/lib/python3.6/site-packages/torchvision/models/resnet.py", line 220 in forward
File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 704 in _slow_forward
File "/workspace/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 720 in _call_impl
File "/workspace/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 1109 in trace_module
File "/workspace/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 955 in trace
File "/workspace/tests/python/driver/tvmc/conftest.py", line 113 in pytorch_resnet18
File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 930 in call_fixture_func
File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 1124 in pytest_fixture_setup
File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 1070 in execute
File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 689 in _compute_fixture_value
File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 605 in _get_active_fixturedef
File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 585 in getfixturevalue
File "/usr/local/lib/python3.6/dist-packages/_pytest/fixtures.py", line 572 in _fillfixtures
File "/usr/local/lib/python3.6/dist-packages/_pytest/python.py", line 1633 in setup
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 448 in prepare
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 151 in pytest_runtest_setup
File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 256 in <lambda>
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 310 in from_call
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 256 in call_runtest_hook
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 216 in call_and_report
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 121 in runtestprotocol
File "/usr/local/lib/python3.6/dist-packages/_pytest/runner.py", line 110 in pytest_runtest_protocol
File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 338 in pytest_runtestloop
File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 313 in _main
File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 257 in wrap_session
File "/usr/local/lib/python3.6/dist-packages/_pytest/main.py", line 306 in pytest_cmdline_main
File "/usr/local/lib/python3.6/dist-packages/pluggy/callers.py", line 187 in _multicall
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 87 in <lambda>
File "/usr/local/lib/python3.6/dist-packages/pluggy/manager.py", line 93 in _hookexec
File "/usr/local/lib/python3.6/dist-packages/pluggy/hooks.py", line 286 in __call__
File "/usr/local/lib/python3.6/dist-packages/_pytest/config/__init__.py", line 165 in main
File "/usr/local/lib/python3.6/dist-packages/_pytest/config/__init__.py", line 187 in console_main
File "/usr/local/lib/python3.6/dist-packages/pytest/__main__.py", line 5 in <module>
File "/usr/lib/python3.6/runpy.py", line 85 in _run_code
File "/usr/lib/python3.6/runpy.py", line 193 in _run_module_as_main
```
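For anyone triaging a dump like this, the interesting frame is usually the first one outside the installed packages. Below is a minimal sketch, assuming the dump has the `File "...", line N in func` shape shown above; the helper name `first_project_frame` and the default skip list are illustrative only and not part of TVM or pytest.

```python
import re

# One frame of the dump above: File "<path>", line <N> in <function>
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+) in (?P<func>\S+)')


def first_project_frame(dump, skip=("site-packages", "dist-packages", "/usr/lib/")):
    """Return (path, line, function) of the innermost frame whose path
    contains none of the `skip` substrings, or None if there is no such frame.

    The dump lists the most recent call first, so the first match is the
    project frame closest to the crash.
    """
    for match in FRAME_RE.finditer(dump):
        if not any(part in match["path"] for part in skip):
            return match["path"], int(match["line"]), match["func"]
    return None
```

Applied to the traceback above, this should point at the `pytorch_resnet18` fixture in `tests/python/driver/tvmc/conftest.py`, i.e. the crash happens during fixture setup (inside `torch.jit.trace`) rather than in the test body.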
----------------------------------------------------------------
[GitHub] [tvm] ekalda commented on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
ekalda commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781505252
@areusch I built and ran the tests in this docker container using the commands you gave me and I see this test being skipped (similarly to the successful CI runs), which makes me suspect that there is some CI problem...
----------------------------------------------------------------
[GitHub] [tvm] leandron edited a comment on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
leandron edited a comment on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781481566
From a quick look at the CI jobs in all the links posted so far (including the ones by @apivovarov), the failing job was always executed by `INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=1`, whereas I couldn't find any failure on `INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=0` (list below).
Looking at a random sample of recent builds:
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/567/pipeline/40
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/569/pipeline/40
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/576/pipeline/40
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/574/pipeline/40
Any idea how this could be an infrastructure failure rather than a flaky test? (cc @areusch @masahi )
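The executor comparison above could be made less manual with a small tally over saved console logs. This is only a sketch under assumptions: it takes each log as a plain string, treats a log as failed when it contains the `munmap_chunk(): invalid pointer` message from the traceback in this thread, and the function name `tally_executors` is made up for illustration. How the logs are downloaded from Jenkins is left out.

```python
import re
from collections import Counter

# Matches the Jenkins log line quoted above, e.g.
# "INFO: NODE_NAME=node.aladdin.cudabuild EXECUTOR_NUMBER=1"
NODE_RE = re.compile(r"NODE_NAME=(\S+)\s+EXECUTOR_NUMBER=(\d+)")

# Crash message taken from the traceback posted in this thread.
CRASH_MARKER = "munmap_chunk(): invalid pointer"


def tally_executors(logs):
    """Count builds per (node, executor), split into failed and passed.

    `logs` is an iterable of console-log strings. Logs without a
    NODE_NAME/EXECUTOR_NUMBER line are ignored.
    """
    failed, passed = Counter(), Counter()
    for text in logs:
        match = NODE_RE.search(text)
        if match is None:
            continue
        key = (match.group(1), int(match.group(2)))
        if CRASH_MARKER in text:
            failed[key] += 1
        else:
            passed[key] += 1
    return failed, passed
```

If every failure lands on one executor while the other only ever passes or skips, that would support the infrastructure explanation, although a small sample of builds cannot rule out coincidence.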
----------------------------------------------------------------
[GitHub] [tvm] lhutton1 commented on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
lhutton1 commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-966386532
I believe this can be closed now.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] areusch commented on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
areusch commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781455336
@ekalda are you using the docker container to reproduce?
e.g.
`docker/bash.sh tlcpack/ci-cpu:v0.72-t0 ./tests/scripts/task_config_build_cpu.sh`
`docker/bash.sh tlcpack/ci-cpu:v0.72-t0 ./tests/scripts/task_build.sh build -j2`
`docker/bash.sh tlcpack/ci-cpu:v0.72-t0 ./tests/scripts/task_python_integration.sh`
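To run the three steps in order and stop at the first failure, a small wrapper like the following might help. It is a sketch only: the image tag and script paths are copied from the commands above, while the helper names `repro_commands` and `run_repro` are hypothetical, and it assumes it is run from the root of a TVM checkout with Docker available.

```python
import subprocess

# Image tag and script paths are copied from the commands above.
IMAGE = "tlcpack/ci-cpu:v0.72-t0"
STEPS = [
    ["./tests/scripts/task_config_build_cpu.sh"],
    ["./tests/scripts/task_build.sh", "build", "-j2"],
    ["./tests/scripts/task_python_integration.sh"],
]


def repro_commands(image=IMAGE):
    """Build the argv list for each docker/bash.sh invocation, in order."""
    return [["docker/bash.sh", image, *step] for step in STEPS]


def run_repro(image=IMAGE, cwd="."):
    """Run each step in sequence; check=True raises CalledProcessError
    on the first step that exits with a non-zero status."""
    for argv in repro_commands(image):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_repro()` from the checkout root would then mirror the config, build, and integration-test sequence.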
----------------------------------------------------------------
[GitHub] [tvm] masahi commented on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
masahi commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781610350
FYI this test is getting disabled in https://github.com/apache/tvm/pull/7465
----------------------------------------------------------------
[GitHub] [tvm] tqchen closed issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
tqchen closed issue #7455:
URL: https://github.com/apache/tvm/issues/7455
--
[GitHub] [tvm] leandron commented on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
leandron commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781243496
cc @ekalda
----------------------------------------------------------------
[GitHub] [tvm] ekalda commented on issue #7455: [FLAKY] tvmc/test_frontends.py
Posted by GitBox <gi...@apache.org>.
ekalda commented on issue #7455:
URL: https://github.com/apache/tvm/issues/7455#issuecomment-781320885
Very odd. It looks like CI is always either skipping that test or failing with that error. That test is passing for me locally though, so I'm not sure how to reproduce it...
----------------------------------------------------------------