Posted to issues@ozone.apache.org by "Siyao Meng (Jira)" <ji...@apache.org> on 2021/11/10 02:23:00 UTC

[jira] [Updated] (HDDS-5956) Speed up TestOzoneRpcClientAbstract#testZReadKeyWithUnhealthyContainerReplica

     [ https://issues.apache.org/jira/browse/HDDS-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siyao Meng updated HDDS-5956:
-----------------------------
    Summary: Speed up TestOzoneRpcClientAbstract#testZReadKeyWithUnhealthyContainerReplica  (was: Speed up TestOzoneRpcClientAbstract#testZReadKeyWithUnhealthyContainerReplia)

> Speed up TestOzoneRpcClientAbstract#testZReadKeyWithUnhealthyContainerReplica
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-5956
>                 URL: https://issues.apache.org/jira/browse/HDDS-5956
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: test
>            Reporter: Siyao Meng
>            Assignee: Siyao Meng
>            Priority: Major
>         Attachments: HDDS-5956.001.patch
>
>
> When working on HDDS-5891, I found that {{TestOzoneRpcClientAbstract#testZReadKeyWithUnhealthyContainerReplia}} is remarkably slow.
> This abstract test class is extended by three test classes, one of which explicitly disables this test case.
> For instance, on my local machine, the entire {{TestOzoneRpcClient}} took 2 min 15 sec to run, of which {{testZReadKeyWithUnhealthyContainerReplia}} alone took 2 min 9 sec. I assume it would take even longer on GitHub Actions machines. The other 70+ test cases in this class mostly took tens of milliseconds each.
> In a 2 min 9 sec run, ~90 seconds are spent waiting for the DNs to be stopped:
> {code:java}
> 2021-11-09 18:03:57,217 [Time-limited test] INFO  ozone.MiniOzoneClusterImpl (MiniOzoneClusterImpl.java:lambda$waitForHddsDatanodesStop$3(389)) - Waiting on 3 datanodes out of 2 to be marked unhealthy.
> ...
> 2021-11-09 18:05:28,622 [Time-limited test] INFO  ozone.MiniOzoneClusterImpl (MiniOzoneClusterImpl.java:lambda$waitForHddsDatanodesStop$3(389)) - Waiting on 3 datanodes out of 2 to be marked unhealthy.
> {code}
> Solution:
> 1. Setting "ozone.scm.stale.node.interval" to 10s (TestReconTasks also did this) for the test alone reduced run time from 129s to 69s, saving 60s (x2=120s for both test classes).
> 2. Moving the extra 5000ms sleep length into {{GenericTestUtils.waitFor()}} saved another 5s.
> 3. Fix the typo in this test case method name. :)
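
The idea behind item 2 can be sketched as follows: a fixed sleep always costs its full length, while a polling wait returns as soon as the condition holds and uses the same length only as an upper bound. This is a minimal, self-contained illustration and not the actual patch. The class {{PollingWaitSketch}}, its {{waitFor}} helper, and the 300 ms stand-in condition are hypothetical names and values chosen for the example; the helper only imitates the general shape of a wait-until-condition utility and is not the real {{GenericTestUtils.waitFor()}}. The configuration change from item 1 is not shown.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/**
 * Hypothetical sketch, not Ozone code: compares a polling wait with a fixed sleep.
 */
public class PollingWaitSketch {

  /**
   * Polls {@code check} every {@code intervalMs} until it returns true,
   * or throws once {@code timeoutMs} has elapsed.
   * Returns the approximate time waited in milliseconds.
   */
  public static long waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long start = System.nanoTime();
    long timeoutNanos = timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      // Compare elapsed time rather than absolute nanoTime values.
      if (System.nanoTime() - start >= timeoutNanos) {
        throw new TimeoutException("Condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
    return (System.nanoTime() - start) / 1_000_000L;
  }

  public static void main(String[] args) throws Exception {
    // Assumed stand-in for "the datanode is marked unhealthy":
    // the condition becomes true roughly 300 ms after start.
    final long begin = System.nanoTime();
    BooleanSupplier condition = () -> System.nanoTime() - begin >= 300 * 1_000_000L;

    // The former fixed sleep length (5000 ms) becomes the timeout instead.
    long waited = waitFor(condition, 50, 5000);
    System.out.println("Condition met after about " + waited
        + " ms; a fixed sleep would have taken 5000 ms");
  }
}
```

In this sketch the wait ends shortly after the condition turns true, so the 5000 ms is only paid in the worst case, where the helper throws a TimeoutException instead of continuing silently.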


