Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/03/25 23:43:22 UTC

[GitHub] [hadoop-ozone] hanishakoneru opened a new pull request #723: HDDS-3281. Add timeouts to all robot tests

hanishakoneru opened a new pull request #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723
 
 
   ## What changes were proposed in this pull request?
   
   In some CI runs the acceptance test suite is cancelled because it runs for more than 6 hours. When that happens, the test results and logs are not saved.
   
   This Jira adds a 5-minute timeout to all robot tests. If some tests need more time, we can raise their timeout. This should help isolate the test that causes the whole acceptance test suite to time out.
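
Because the setting has to be repeated in every robot file, a file can easily be missed. As a rough sketch (in Python, and not part of this patch), the check below lists suite files that lack the setting. The helper name `suites_missing_timeout` and the sample files are hypothetical, and the sketch assumes the setting is spelled `Test Timeout` in the space- or tab-separated format.

```python
import re
import tempfile
from pathlib import Path

# Matches a suite-level "Test Timeout" setting followed by a value.
# Ignoring case is a leniency of this sketch, not a statement about Robot Framework.
TIMEOUT_SETTING = re.compile(
    r"^Test Timeout(?: {2,}|\t)[ \t]*\S", re.IGNORECASE | re.MULTILINE
)


def suites_missing_timeout(root: Path) -> list[Path]:
    """Return the .robot files under root that declare no Test Timeout."""
    return sorted(
        path
        for path in root.rglob("*.robot")
        if not TIMEOUT_SETTING.search(path.read_text(encoding="utf-8"))
    )


if __name__ == "__main__":
    # Demonstration on two throwaway files (their contents are made up).
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "with_timeout.robot").write_text(
            "*** Settings ***\nTest Timeout        5 minutes\n", encoding="utf-8"
        )
        (root / "without_timeout.robot").write_text(
            "*** Settings ***\nLibrary             OperatingSystem\n", encoding="utf-8"
        )
        for path in suites_missing_timeout(root):
            print(f"missing Test Timeout: {path.name}")
```

Pointing the helper at the smoketest directory instead of the temporary one would give a list of files to fix.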
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3281
   
   ## How was this patch tested?
   
   The CI acceptance test suite can test this change, as it only adds a timeout to robot tests.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-605096549
 
 
   > Note: originally I suggested putting the Timeout in commonlib.robot to avoid code duplication, but I tested it and it doesn't work.
   
   Yes. Learned that Robot Framework does not allow a "global timeout" by design.
   
   @adoroszlai, @elek are we good to merge this patch?



[GitHub] [hadoop-ozone] elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-607673623
 
 
   > Good catch. But can you please explain why topology/cli would be the one that timed out? 
   
   You are right, it seems to be freon. But in this case:
   
    1. Why can't I see the timeout?
    2. Why is freon hanging?



[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-607328861
 
 
   @adoroszlai, @elek how about we print a start and end time (or duration) for each test suite?
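
One possible shape for this, sketched in Python under assumptions: `run_timed` is a hypothetical wrapper, not an existing script in the repository, and the command it runs here is a placeholder for however a suite is actually started.

```python
import subprocess
import sys
import time
from datetime import datetime, timezone


def run_timed(name, command):
    """Run command, printing UTC start/end timestamps and the duration.

    Returns (exit_code, duration_in_seconds).
    """
    print(f"{datetime.now(timezone.utc).isoformat()} START {name}", flush=True)
    started = time.monotonic()
    exit_code = subprocess.run(command).returncode
    duration = time.monotonic() - started
    print(
        f"{datetime.now(timezone.utc).isoformat()} END   {name} "
        f"(exit code {exit_code}, {duration:.1f}s)",
        flush=True,
    )
    return exit_code, duration


if __name__ == "__main__":
    # Stand-in for a real suite invocation.
    run_timed("demo-suite", [sys.executable, "-c", "print('suite body')"])
```

Because the START line is flushed before the suite begins, it would still appear in the log if the job is cancelled while that suite is running.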



[GitHub] [hadoop-ozone] hanishakoneru merged pull request #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru merged pull request #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723
 
 
   



[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-604676324
 
 
   @adoroszlai, I think even with that limitation the timeout will help us isolate the problem. If the acceptance suite is cancelled, we could still find out which test contributed to the timeout.



[GitHub] [hadoop-ozone] adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-606769748
 
 
   > > @elek @hanishakoneru a 6-hour run despite timeout settings: https://github.com/apache/hadoop-ozone/runs/548358523
   > 
   > This branch was not rebased to include the timeout change. Can we rebase and retry?
   
   This run is a push build for 94413cd8c903d153ae183687b5cd4c5990aac341 on [`master`](https://github.com/apache/hadoop-ozone/commits/master), which does have the timeout change eece60420285330e21153d73d682c5eb3bc5458e.



[GitHub] [hadoop-ozone] adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-607133321
 
 
   > But not in `topology/cli`, which timed out. Will create a follow-up PR to copy it there...
   
   Good catch. But can you please explain why `topology/cli` would be the one that timed out?
   The freon test from `basic/basic` was not yet done:
   
   ```
   2020-03-31T10:47:00.9355111Z ozone-topology-basic :: Smoketest ozone cluster startup                       
   2020-03-31T10:47:00.9355811Z ==============================================================================
   2020-03-31T10:47:01.0327747Z Check webui static resources                                          | PASS |
   2020-03-31T10:47:01.0328887Z ------------------------------------------------------------------------------
   2020-03-31T16:26:10.4480521Z ##[error]The operation was canceled.
   ```
   
   Compare this to a normal run:
   
   ```
   2020-04-01T03:34:33.4721761Z ozone-topology-basic :: Smoketest ozone cluster startup                       
   2020-04-01T03:34:33.4722354Z ==============================================================================
   2020-04-01T03:34:33.5768474Z Check webui static resources                                          | PASS |
   2020-04-01T03:34:33.5772576Z ------------------------------------------------------------------------------
   2020-04-01T03:35:39.2555783Z Start freon testing                                                   | PASS |
   2020-04-01T03:35:39.2557114Z ------------------------------------------------------------------------------
   2020-04-01T03:35:39.2568636Z ozone-topology-basic :: Smoketest ozone cluster startup               | PASS |
   2020-04-01T03:35:39.2573293Z 2 critical tests, 2 passed, 0 failed
   2020-04-01T03:35:39.2573731Z 2 tests total, 2 passed, 0 failed
   2020-04-01T03:35:39.2573985Z ==============================================================================
   2020-04-01T03:35:39.2586020Z Output:  /tmp/smoketest/ozone-topology/result/robot-ozone-topology-ozone-topology-basic-scm.xml
   2020-04-01T03:35:41.8733119Z ==============================================================================
   2020-04-01T03:35:41.8738206Z ozone-topology-cli :: Smoketest ozone cluster startup                         
   2020-04-01T03:35:41.8738825Z ==============================================================================
   2020-04-01T03:35:43.3927035Z Run printTopology                                                     | PASS |
   2020-04-01T03:35:43.3927937Z ------------------------------------------------------------------------------
   2020-04-01T03:35:44.8556870Z Run printTopology -o                                                  | PASS |
   2020-04-01T03:35:44.8557296Z ------------------------------------------------------------------------------
   2020-04-01T03:35:44.8576945Z ozone-topology-cli :: Smoketest ozone cluster startup                 | PASS |
   2020-04-01T03:35:44.8580699Z 2 critical tests, 2 passed, 0 failed
   2020-04-01T03:35:44.8580837Z 2 tests total, 2 passed, 0 failed
   ```



[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-607572973
 
 
   Just saw that there is an option to show the timestamp of each message.
   The last test to complete is _ozone-topology-basic/Check webui static resources_. The next test in the suite is _basic/Start freon testing_, which is probably the one that runs forever and so leads to cancellation of the whole acceptance test suite.
   ```
   Tue, 31 Mar 2020 10:46:59 GMT   Safe mode is off
   Tue, 31 Mar 2020 10:47:00 GMT   ==============================================
   Tue, 31 Mar 2020 10:47:00 GMT   ozone-topology-basic :: Smoketest ozone cluster startup                       
   Tue, 31 Mar 2020 10:47:00 GMT   ==============================================
   Tue, 31 Mar 2020 10:47:01 GMT   Check webui static resources                                          | PASS |
   Tue, 31 Mar 2020 10:47:01 GMT   ----------------------------------------------------------------
   Tue, 31 Mar 2020 16:26:10 GMT   ##[error]The operation was canceled.
   ```
   @elek @adoroszlai 
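
The same reading of the log can be automated. The sketch below, in Python, finds the longest silence between consecutive timestamped lines; the sample is the log excerpt above with separator lines left out, and `largest_gap` is a hypothetical helper name.

```python
from email.utils import parsedate_to_datetime

# Log lines copied from the excerpt above: "<timestamp> GMT   <message>".
LOG = """\
Tue, 31 Mar 2020 10:46:59 GMT   Safe mode is off
Tue, 31 Mar 2020 10:47:00 GMT   ozone-topology-basic :: Smoketest ozone cluster startup
Tue, 31 Mar 2020 10:47:01 GMT   Check webui static resources                                          | PASS |
Tue, 31 Mar 2020 16:26:10 GMT   ##[error]The operation was canceled.
"""


def largest_gap(log_text):
    """Return (gap, last_message_before_gap) for the longest silence in the log."""
    entries = []
    for line in log_text.splitlines():
        stamp, separator, message = line.partition(" GMT")
        if not separator:
            continue  # line carries no timestamp
        entries.append((parsedate_to_datetime(stamp + " GMT"), message.strip()))
    # Pair each entry with its successor and measure the time between them.
    gaps = [
        (later[0] - earlier[0], earlier[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return max(gaps, key=lambda gap: gap[0])


gap, message = largest_gap(LOG)
print(f"{gap} of silence after: {message}")
```

For this excerpt the longest gap is 5:39:09, right after _Check webui static resources_, which is consistent with the next test being the one that hangs.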



[GitHub] [hadoop-ozone] elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-604911971
 
 
   > @adoroszlai, I think even with that limitation the timeout will help us isolate the problem. If the acceptance suite is cancelled, we could still find out which test contributed to the timeout.
   
   There are two timeouts:
     1. timeout of the test (measured between two steps)
     2. timeout of a single test step
   
   As far as I understood, @adoroszlai warned us that a test-level timeout doesn't help at all if the 2nd is not in place. If one `curl`-based command is hanging (and the robot test doesn't do a `kill`), it won't be stopped (and we won't have any logs / results).
   
   But I agree that even without the 2nd, it's good to have this patch.
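
To illustrate the second kind of timeout in isolation, here is a minimal Python sketch of a step that kills its own command when it hangs. This is not how Robot Framework implements timeouts; `run_step` is a hypothetical helper, and a sleeping Python one-liner stands in for the hanging `curl` command.

```python
import subprocess
import sys
import time


def run_step(command, timeout_seconds):
    """Run one test step; return its output, or None if it was killed for hanging."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired,
        # so a hung command does not outlive the step.
        return None
    return completed.stdout


if __name__ == "__main__":
    # A Python one-liner stands in for a hanging command.
    hanging = [sys.executable, "-c", "import time; time.sleep(60)"]
    started = time.monotonic()
    print(run_step(hanging, timeout_seconds=1))
    print(f"returned after {time.monotonic() - started:.1f}s")
```

The point of the sketch is that the step both stops waiting and terminates the process, so the run can continue and still produce logs and results.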
   
   On the other hand, I tested it with sleep, and it seems to be working for me...
   
   ```
   *** Settings ***
   Documentation       Timeout test
   Library             OperatingSystem
   Test Timeout        20 seconds
   #Resource            commonlib.robot
   
   *** Test cases ***
   Execute PI calculation
                       ${output} =      Run                     sleep 60
                       Should Contain   ${output}               completed successfully
   ```
   
   ```
   time robot test.robot
   ==============================================================================
   Test :: Timeout test
   ==============================================================================
   Execute PI calculation                                                | FAIL |
   Test timeout 20 seconds exceeded.
   ------------------------------------------------------------------------------
   Test :: Timeout test                                                  | FAIL |
   1 critical test, 0 passed, 1 failed
   1 test total, 0 passed, 1 failed
   ==============================================================================
   Output:  /home/elek/projects/ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/smoketest/output.xml
   Log:     /home/elek/projects/ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/smoketest/log.html
   Report:  /home/elek/projects/ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/smoketest/report.html
   robot test.robot  0.25s user 0.03s system 1% cpu 20.285 total
   ```
   
   As you can see, my sleep command was killed after 20 seconds.
   
   Note: originally I suggested putting the `Timeout` in `commonlib.robot` to avoid code duplication, but I tested it and it doesn't work.
   
   



[GitHub] [hadoop-ozone] elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-607127054
 
 
   But not in `topology/cli`, which timed out. Will create a follow-up PR to copy it there...



[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-605255750
 
 
   Thank you all for the reviews. Will merge this PR.



[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-606761659
 
 
   > @elek @hanishakoneru a 6-hour run despite timeout settings: https://github.com/apache/hadoop-ozone/runs/548358523
   
   This branch was not rebased to include the timeout change. Can we rebase and retry?



[GitHub] [hadoop-ozone] adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-605153928
 
 
   > @adoroszlai, @elek are we good to merge this patch?
   
   Yes, thanks.



[GitHub] [hadoop-ozone] adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-606754436
 
 
   @elek @hanishakoneru a 6-hour run despite timeout settings: https://github.com/apache/hadoop-ozone/runs/548358523



[GitHub] [hadoop-ozone] hanishakoneru edited a comment on issue #723: HDDS-3281. Add timeouts to all robot tests

Posted by GitBox <gi...@apache.org>.
hanishakoneru edited a comment on issue #723: HDDS-3281. Add timeouts to all robot tests
URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-605096549
 
 
   > Note: originally I suggested putting the Timeout in commonlib.robot to avoid code duplication, but I tested it and it doesn't work.
   
   Yes. Learned that Robot Framework does not allow a "global timeout" by design.
   
   @adoroszlai, @elek are we good to merge this patch?
