Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/05/26 10:43:45 UTC
[GitHub] [tvm] ekalda opened a new issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
ekalda opened a new issue #8140:
URL: https://github.com/apache/tvm/issues/8140
That test failed on a whitespace change, which looks suspicious... https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8124/1/pipeline/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-870347980
Another failure from the `(1, 64, 20, 4)` workload despite the fix in https://github.com/apache/tvm/pull/8335. `Mismatched elements: 65 / 256 (25.4%)` suggests that something is off @trevor-m
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8358/2/pipeline
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] trevor-m commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
trevor-m commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-867786119
I'm fine with disabling the test for now. Sorry, I haven't had a chance to look into the flakiness yet.
[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857175926
The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
```
Mismatched elements: 8 / 256 (3.12%)
Max absolute difference: 1.8291236
Max relative difference: 21.120028
x: array([[[ 0.613913, -0.086606, 0.650708, 0.433798],
[ 0.398791, 1.742517, 1.710477, 0.569614],
[ 0.736459, 0.409182, 0.026915, -0.968162],...
y: array([[[ 0.398791, 1.742517, 1.710477, 0.569614],
[ 0.613913, -0.086606, 0.650708, 0.433798],
[ 0.736459, 0.409182, 0.026915, -0.968162],...
```
@trevor-m I wouldn't call it a bug, since neither TF nor ONNX specifies what the order should be when there is a tie. In particular, TVM uses a stable sort, while it seems TF uses an unstable sort for NMS. I confirmed this by comparing input and output box coordinates.
We should probably change the test code to make sure there are no ties in scores.
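
To illustrate the tie-breaking point, here is a minimal Python sketch that uses only the standard library. It does not reproduce the TF or TVM sort implementations: `stable_descending`, `last_wins_descending`, and `tie_free_scores` are hypothetical helpers that only model two tie-breaking policies and one possible way to build scores without ties.

```python
import random


def stable_descending(scores):
    """Indices by descending score; tied entries keep their input order."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def last_wins_descending(scores):
    """Same ranking, but a tie is resolved in favour of the later index."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], -i))


def tie_free_scores(n, seed=0):
    """Return n pairwise-distinct scores in (0, 1], in shuffled order."""
    rng = random.Random(seed)
    scores = [(i + 1) / n for i in range(n)]
    rng.shuffle(scores)
    return scores


# With a tie, the two policies disagree on the order of the top two entries.
tied = [0.9, 0.9, 0.5]
print(stable_descending(tied))     # [0, 1, 2]
print(last_wins_descending(tied))  # [1, 0, 2]

# Without ties, the tie-breaking policy no longer affects the result.
unique = tie_free_scores(64)
assert stable_descending(unique) == last_wins_descending(unique)
```

The same idea would apply to the randomly generated score tensor in the real test; how that would be wired into `test_forward_combined_nms` is not shown here.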
[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-856564093
The same error happened at https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8126/8/pipeline
@trevor-m This is the `q != 1` case (each class has its own box), so the new code path from https://github.com/apache/tvm/commits/main is not used.
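
For context on `q`: per the TensorFlow documentation for `tf.image.combined_non_max_suppression` (stated here as an assumption, not taken from this thread), the boxes input has shape `[batch_size, num_boxes, q, 4]`, where `q` is either 1 (one box shared by all classes) or the number of classes (a separate box per class). Under that reading, the `(1, 64, 20, 4)` workload mentioned in this thread would be a `q = 20` case. The sketch below uses plain nested lists and a hypothetical helper, `boxes_for_class`, to show the difference. It is not the TVM frontend's conversion logic.

```python
def boxes_for_class(boxes, class_id):
    """Select the boxes that apply to one class.

    boxes is a nested list shaped [num_boxes][q][4]; the batch
    dimension is omitted to keep the example small.
    """
    q = len(boxes[0])
    # q == 1: every class shares the single box stored per candidate.
    # q != 1: each class has its own box, so index by class_id.
    column = 0 if q == 1 else class_id
    return [per_box[column] for per_box in boxes]


# Two candidates, q == 1 (shared boxes).
shared = [[[0.0, 0.0, 1.0, 1.0]],
          [[0.5, 0.5, 1.0, 1.0]]]

# Two candidates, q == 2 (one box per class).
per_class = [[[0.0, 0.0, 1.0, 1.0], [0.1, 0.1, 0.9, 0.9]],
             [[0.5, 0.5, 1.0, 1.0], [0.6, 0.6, 0.8, 0.8]]]

print(boxes_for_class(shared, 1))     # same boxes as for class 0
print(boxes_for_class(per_class, 1))  # boxes specific to class 1
```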
[GitHub] [tvm] mbrookhart commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
mbrookhart commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-867743767
I'm seeing flakiness in this test in about 1/3 of CI jobs; it's becoming a real problem for getting other PRs merged. Should we think about disabling this test until we can resolve the flakiness?
[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-870347980
Another failure from the `(1, 64, 20, 4)` workload despite the fix in https://github.com/apache/tvm/pull/8335. `Mismatched elements: 65 / 256 (25.4%)` or `Mismatched elements: 102 / 256 (39.8%)` suggest that something is off @trevor-m
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8358/2/pipeline
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8357/1/pipeline
[GitHub] [tvm] trevor-m commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
trevor-m commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857057327
Thanks for letting me know. I guess we can look into the NMS code that is shared by both paths?
[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-883706777
This can be closed in the sense that the flaky test is now disabled. But the underlying problem with the combined NMS converter for the `q != 1` case (each class has its own box) should still be addressed @trevor-m
[GitHub] [tvm] masahi closed issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi closed issue #8140:
URL: https://github.com/apache/tvm/issues/8140
[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857061153
I'm not sure whether the two flaky failures have the same cause; if they do, then yes, we need to look at the core NMS loop. Are you sure the conversion logic in the TF frontend for the `q != 1` case is correct?
[GitHub] [tvm] tqchen commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-867005295
Normally we should construct test cases to ensure there are no ties.
[GitHub] [tvm] tqchen commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-883664861
@masahi please follow up to see if we can close this issue.
[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms
Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857054789
I was informed that there was another failure at https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8179/6/pipeline/ which does use the new code path. I was able to get it to fail after 970 trials.
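
One way to reproduce a failure that shows up only once in several hundred runs is to repeat the randomized check with a fresh seed each time and record the seed that fails. The following is a minimal sketch of such a loop; `find_failing_seed` and `toy_check` are hypothetical names, and `toy_check` merely stands in for the real TF-versus-TVM comparison, which is not reproduced here.

```python
import random


def find_failing_seed(check, max_trials=1000):
    """Run check(rng) with seeds 0, 1, 2, ... and return the first seed
    for which it raises AssertionError, or None if every trial passes."""
    for seed in range(max_trials):
        try:
            check(random.Random(seed))
        except AssertionError:
            return seed
    return None


def toy_check(rng):
    # Stand-in for the real comparison: it fails whenever two of the
    # eight randomly drawn integer scores happen to be equal.
    scores = [rng.randrange(100) for _ in range(8)]
    assert len(set(scores)) == len(scores), "tie in scores"


seed = find_failing_seed(toy_check)
print("first failing seed:", seed)
```

Once a failing seed is known, rerunning the check with that seed alone should reproduce the failure deterministically, provided the check draws all of its randomness from the generator it is given.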