Posted to issues@flink.apache.org by "Till Rohrmann (Jira)" <ji...@apache.org> on 2020/02/28 11:13:00 UTC
[jira] [Closed] (FLINK-9678) Remove hard-coded sleeps in HA E2E test
[ https://issues.apache.org/jira/browse/FLINK-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann closed FLINK-9678.
--------------------------------
Resolution: Duplicate
> Remove hard-coded sleeps in HA E2E test
> ---------------------------------------
>
> Key: FLINK-9678
> URL: https://issues.apache.org/jira/browse/FLINK-9678
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination, Tests
> Affects Versions: 1.5.0, 1.6.0
> Reporter: Chesnay Schepler
> Priority: Major
>
> {{test_ha.sh}} uses 2 hard-coded sleeps.
> {code:java}
> # let the job run for a while to take some checkpoints
> sleep 20
> for (( c=0; c<${JM_KILLS}; c++ )); do
> # kill the JM and wait for watchdog to
> # create a new one which will take over
> kill_jm
> sleep 60
> done{code}
> These sleeps are always troublesome: they either make the test brittle by being too small, or cause the test to idle when they are too large.
> The first sleep should be replaced with {{wait_num_checkpoints}}.
> I'm not entirely sure about the semantics of the second sleep, but I guess we're waiting for the new JM to continue the job execution. In this case I suggest instead querying the job status via REST and waiting until the job is running.
>
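A minimal sketch of the REST-based wait suggested for the second sleep. It is not taken from {{test_ha.sh}}. The helper names get_job_state and wait_job_running, the REST_URL default, and the JOB_ID variable are hypothetical, and the sketch assumes that GET /jobs/<job-id> returns JSON with a top-level "state" field and that curl is available.

```shell
# Assumption: the JobManager REST endpoint is reachable here.
REST_URL="${REST_URL:-http://localhost:8081}"

# Hypothetical helper: print the job state reported by the REST API.
# Crude text extraction of the first "state" field; a JSON parser
# would be more robust if one is available in the test environment.
get_job_state() {
    curl -s "${REST_URL}/jobs/$1" 2>/dev/null \
        | grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
        | head -n 1 \
        | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Hypothetical helper: poll until the job reports RUNNING or the
# timeout (in seconds, default 120) expires. Returns 0 on success, 1 on timeout.
wait_job_running() {
    local job_id="$1"
    local timeout_s="${2:-120}"
    local interval_s="${3:-1}"
    local waited=0
    while [ "${waited}" -lt "${timeout_s}" ]; do
        if [ "$(get_job_state "${job_id}")" = "RUNNING" ]; then
            return 0
        fi
        sleep "${interval_s}"
        waited=$(( waited + interval_s ))
    done
    echo "Job ${job_id} did not reach RUNNING within ${timeout_s}s" >&2
    return 1
}

# Hypothetical usage inside the kill loop quoted in the issue:
#   kill_jm
#   wait_job_running "${JOB_ID}" 120
```

Polling with an upper bound keeps the failure mode of the fixed sleep (the test still gives up eventually) while letting the loop continue as soon as the new JM has resumed the job. A failed request during the failover window yields an empty state and is simply retried.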
--
This message was sent by Atlassian Jira
(v8.3.4#803005)