Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/05/26 10:43:45 UTC

[GitHub] [tvm] ekalda opened a new issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

ekalda opened a new issue #8140:
URL: https://github.com/apache/tvm/issues/8140


   That test failed on a whitespace change, which looks suspicious... https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8124/1/pipeline/ 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-870347980


   Another failure from the `(1, 64, 20, 4)` workload despite the fix in https://github.com/apache/tvm/pull/8335. `Mismatched elements: 65 / 256 (25.4%)` suggests that something is off @trevor-m 
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8358/2/pipeline


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857175926


   The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
   
   ```
   Mismatched elements: 8 / 256 (3.12%)
   Max absolute difference: 1.8291236
   Max relative difference: 21.120028
    x: array([[[ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
    y: array([[[ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
   ```
   
   @trevor-m I wouldn't call it a bug, since neither TF nor ONNX specifies what the order should be when there is a tie. In particular, TVM uses a stable sort, while TF seems to use an unstable sort for NMS.
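
To illustrate how a tie alone can produce this kind of mismatch, here is a minimal NumPy sketch. The three box rows are copied from the output above; the score values (0.9, 0.9, 0.5) are made up for illustration, and `np.lexsort` only stands in for "some other valid order among tied scores". It is not a claim about how TF sorts internally.

```python
import numpy as np

# Hypothetical scores: the first two boxes are tied for the top score.
scores = np.array([0.9, 0.9, 0.5], dtype="float32")
# Box rows taken from the mismatch log above.
boxes = np.array(
    [
        [0.613913, -0.086606, 0.650708, 0.433798],
        [0.398791, 1.742517, 1.710477, 0.569614],
        [0.736459, 0.409182, 0.026915, -0.968162],
    ],
    dtype="float32",
)

# Stable descending sort: tied entries keep their input order.
stable_order = np.argsort(-scores, kind="stable")

# Another equally valid descending order: tied entries come out reversed.
# lexsort uses the last key as the primary key and earlier keys as tie-breakers.
alt_order = np.lexsort((-np.arange(len(scores)), -scores))

sorted_a = boxes[stable_order]
sorted_b = boxes[alt_order]

# Both orders are sorted by score, yet the box rows differ.
mismatched = int(np.sum(sorted_a != sorted_b))
max_abs_diff = float(np.max(np.abs(sorted_a - sorted_b)))
print(stable_order, alt_order, mismatched, max_abs_diff)
```

Both index orders are valid descending sorts of the scores, but the first two box rows end up swapped, which gives 8 mismatched elements out of 12 in this toy case.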





[GitHub] [tvm] trevor-m commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
trevor-m commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-867786119


   I'm fine with disabling the test for now. Sorry, I haven't had a chance to look into the flakiness yet.





[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857175926


   The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
   
   ```
   Mismatched elements: 8 / 256 (3.12%)
   Max absolute difference: 1.8291236
   Max relative difference: 21.120028
    x: array([[[ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
    y: array([[[ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
   ```
   
   @trevor-m I wouldn't call it a bug, since neither TF nor ONNX specifies what the order should be when there is a tie. In particular, TVM uses a stable sort, while TF seems to use an unstable sort for NMS. I confirmed this by comparing input and output box coordinates.
   
   We should probably change the test code so that there are no ties in scores.
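
One way to guarantee tie-free scores in a test is to shuffle a set of pairwise-distinct values instead of sampling them independently. The sketch below assumes NumPy; the helper name `make_unique_scores` and the `(1, 64, 20)` score shape are hypothetical and not taken from the TVM test code.

```python
import numpy as np


def make_unique_scores(shape, seed=0):
    """Return float32 scores in (0, 1) with no repeated values (illustrative helper)."""
    rng = np.random.default_rng(seed)
    n = int(np.prod(shape))
    # Evenly spaced interior points of [0, 1] are pairwise distinct.
    values = np.linspace(0.0, 1.0, num=n + 2, dtype="float32")[1:-1]
    # Shuffling randomizes which box gets which score without creating ties.
    return rng.permutation(values).reshape(shape)


scores = make_unique_scores((1, 64, 20))
print(scores.shape, np.unique(scores).size == scores.size)
```

This makes every score unique across the whole tensor, which is stricter than needed (ties only matter within one class), but it keeps the sketch simple.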





[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-856564093


   The same error happened at https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8126/8/pipeline
   
   @trevor-m This is the `q != 1` case (each class has its own box), so the new code path from https://github.com/apache/tvm/commits/main is not used.





[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857175926


   The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
   
   ```
   Mismatched elements: 8 / 256 (3.12%)
   Max absolute difference: 1.8291236
   Max relative difference: 21.120028
    x: array([[[ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
    y: array([[[ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
   ```
   
   @trevor-m I wouldn't call it a bug, since neither TF nor ONNX specifies what the order should be when there is a tie.





[GitHub] [tvm] mbrookhart commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
mbrookhart commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-867743767


   I'm seeing flakiness in this test in about 1/3 of CI jobs, and it's becoming a real obstacle to getting other PRs merged. Should we consider disabling this test until we can resolve the flakiness?





[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-870347980


   Another failure from the `(1, 64, 20, 4)` workload despite the fix in https://github.com/apache/tvm/pull/8335. `Mismatched elements: 65 / 256 (25.4%)` or `Mismatched elements: 102 / 256 (39.8%)` suggests that something is off @trevor-m
   
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8358/2/pipeline
   https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8357/1/pipeline





[GitHub] [tvm] trevor-m commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
trevor-m commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857057327


   Thanks for letting me know. I guess we can look into the NMS code, which is shared by both paths?





[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-883706777


   This can be closed in the sense that the flaky test is now disabled. But the underlying problem with the combined NMS converter for the `q != 1` case (each class has its own box) should still be addressed @trevor-m





[GitHub] [tvm] masahi closed issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi closed issue #8140:
URL: https://github.com/apache/tvm/issues/8140


   





[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857061153


   I'm not sure whether the two instances of flakiness have the same cause; if they do, then yes, we need to look at the core NMS loop. Are you sure the conversion logic in the TF frontend for the `q != 1` case is correct?





[GitHub] [tvm] tqchen commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-867005295


   Normally we should construct test cases to ensure there are no ties.





[GitHub] [tvm] tqchen commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-883664861


   @masahi please follow up to see if we can close this issue





[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857054789


   I was informed that there was another failure at https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8179/6/pipeline/ which does use the new code path. I was able to get it to fail after 970 trials.
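
A rare failure like this can be hunted down by rerunning a check under different random seeds and recording the first seed that fails. The sketch below is only an illustration of that loop: `first_failing_seed` and `top2_scores_distinct` are hypothetical names, and the toy check (coarsely quantized random scores) stands in for the real TF/TVM comparison.

```python
import numpy as np


def first_failing_seed(check, max_trials=1000):
    """Return the first seed in [0, max_trials) for which check(seed) is false, else None."""
    for seed in range(max_trials):
        if not check(seed):
            return seed
    return None


def top2_scores_distinct(seed):
    # Hypothetical stand-in for the real test: it passes only when the two
    # highest scores differ. Scores are coarsely quantized so ties can occur.
    rng = np.random.default_rng(seed)
    scores = rng.integers(0, 100, size=20) / 100.0
    top2 = np.sort(scores)[-2:]
    return bool(top2[0] != top2[1])


seed = first_failing_seed(top2_scores_distinct)
print("first failing seed:", seed)
```

Once a failing seed is known, the same seed can be reused to reproduce the failure deterministically while debugging.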





[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857175926


   The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
   
   ```
   Mismatched elements: 8 / 256 (3.12%)
   Max absolute difference: 1.8291236
   Max relative difference: 21.120028
    x: array([[[ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
    y: array([[[ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
   ```
   
   @trevor-m I wouldn't call it a bug, since neither TF nor ONNX specifies what the order should be when there is a tie. In particular, TVM uses a stable sort, while TF seems to use an unstable sort for NMS. I confirmed this by comparing input and output box coordinates.





[GitHub] [tvm] masahi edited a comment on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857175926


   The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
   
   ```
   Mismatched elements: 8 / 256 (3.12%)
   Max absolute difference: 1.8291236
   Max relative difference: 21.120028
    x: array([[[ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
    y: array([[[ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
   ```
   
   @trevor-m I wouldn't call it a bug, since neither TF nor ONNX specifies what the order should be when there is a tie. In particular, TVM uses a stable sort, while TF seems to use an unstable sort for NMS (based on comparing input and output box coordinates).





[GitHub] [tvm] masahi commented on issue #8140: [TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms

Posted by GitBox <gi...@apache.org>.
masahi commented on issue #8140:
URL: https://github.com/apache/tvm/issues/8140#issuecomment-857175926


   The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
   
   ```
   Mismatched elements: 8 / 256 (3.12%)
   Max absolute difference: 1.8291236
   Max relative difference: 21.120028
    x: array([[[ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
    y: array([[[ 0.398791,  1.742517,  1.710477,  0.569614],
           [ 0.613913, -0.086606,  0.650708,  0.433798],
           [ 0.736459,  0.409182,  0.026915, -0.968162],...
   ```




