Posted to issues@flink.apache.org by "Till Rohrmann (Jira)" <ji...@apache.org> on 2020/02/28 11:13:00 UTC

[jira] [Closed] (FLINK-9678) Remove hard-coded sleeps in HA E2E test

     [ https://issues.apache.org/jira/browse/FLINK-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann closed FLINK-9678.
--------------------------------
    Resolution: Duplicate

> Remove hard-coded sleeps in HA E2E test
> ---------------------------------------
>
>                 Key: FLINK-9678
>                 URL: https://issues.apache.org/jira/browse/FLINK-9678
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination, Tests
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Chesnay Schepler
>            Priority: Major
>
> {{test_ha.sh}} uses 2 hard-coded sleeps.
> {code:bash}
> # let the job run for a while to take some checkpoints
> sleep 20
> for (( c=0; c<${JM_KILLS}; c++ )); do
>     # kill the JM and wait for watchdog to
>     # create a new one which will take over
>     kill_jm
>     sleep 60
> done{code}
> These sleeps are always troublesome: they either make the test brittle by being too small or cause the test to idle when they are too large.
> The first sleep should be replaced with {{wait_num_checkpoints}}.
> I'm not entirely sure about the semantics of the second sleep, but I guess we're waiting for the new JM to continue the job execution. In this case I suggest instead querying the job status via REST and waiting until the job is running.
>  
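For illustration, the REST-based wait suggested in the description could look roughly like the sketch below. It is not taken from test_ha.sh. The helper names wait_until and job_is_running, the REST_URL default, and the 120-second timeout are all assumptions made for this example, and the check assumes that the job details response of GET /jobs/<jobid> carries a top-level "state" field.

```shell
# Sketch only: REST_URL and both helper names are assumptions, not part of test_ha.sh.
REST_URL="${REST_URL:-http://localhost:8081}"

# Retry a command once per second until it succeeds or the timeout (in seconds) expires.
wait_until() {
    local timeout=$1
    shift
    local deadline=$(( $(date +%s) + timeout ))
    until "$@"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            echo "Timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep 1
    done
}

# Succeeds only if the REST API answers and reports the job state as RUNNING.
# While no JM is reachable, curl prints nothing and the check simply fails.
job_is_running() {
    local job_id=$1
    curl -s "${REST_URL}/jobs/${job_id}" \
        | grep -Eq '"state"[[:space:]]*:[[:space:]]*"RUNNING"'
}

# Possible use inside the kill loop (JOB_ID is assumed to be set by the test):
#   kill_jm
#   wait_until 120 job_is_running "${JOB_ID}"    # instead of: sleep 60
```

The timeout keeps the test from hanging forever if the new JM never takes over, while a fast recovery no longer pays the full 60 seconds. A job may report RUNNING before all of its tasks have been restored, so a stricter condition may be needed in practice.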



--
This message was sent by Atlassian Jira
(v8.3.4#803005)