Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/06/27 11:14:00 UTC

[jira] [Commented] (FLINK-9674) Remove 65s sleep in QueryableState E2E test

    [ https://issues.apache.org/jira/browse/FLINK-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524924#comment-16524924 ] 

ASF GitHub Bot commented on FLINK-9674:
---------------------------------------

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/6216

    [FLINK-9674][tests] Replace hard-coded sleeps in QS E2E test 

    ## What is the purpose of the change
    
    This PR replaces a hard-coded sleep in a QueryableState end-to-end test. The test was sleeping for 65 seconds after the TM was restarted, to wait until the job had restarted.
    
    Instead, we now first wait for the job restart procedure to kick in (using a new method to check for state transitions) and then wait for the job to run.
    
    Additionally:
    * reduce `heartbeat.timeout` to speed up TM loss detection
    * fix `common#wait_job_running` returning true for scheduled jobs; we now pass the `-r` flag
    * simplify jar paths in QS test (now relative to END_TO_END_DIR)
    * change `test-runner-common#cleanup` to use existing `common#clean_[log|stdout]_files` functions
    
    ## Verifying this change
    
    * manually verified

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 9674

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/6216.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6216
    
----
commit b6ab955173e1813b9903bf1fb2eb1345ff17abd3
Author: zentol <ch...@...>
Date:   2018-06-27T10:58:49Z

    [hotfix][tests] Reuse existing functions for cleaning logs

commit b4866c55bf64710aa5f21d214c86caa4702dd73e
Author: zentol <ch...@...>
Date:   2018-06-27T10:59:16Z

    [hotfix][tests] Simplify jar paths to QS tests

commit 2e4788f5c6840780c4ff0152a91435087802a14e
Author: zentol <ch...@...>
Date:   2018-06-27T11:01:06Z

    [FLINK-9674][tests] Replace hard-coded sleeps in QS E2E test

----


> Remove 65s sleep in QueryableState E2E test
> -------------------------------------------
>
>                 Key: FLINK-9674
>                 URL: https://issues.apache.org/jira/browse/FLINK-9674
>             Project: Flink
>          Issue Type: Improvement
>          Components: Queryable State, Tests
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Major
>              Labels: pull-request-available
>
> The {{test_queryable_state_restart_tm.sh}} test kills a TaskManager, waits for the loss to be noticed, starts a new TM, and waits for the job to continue.
> {code}
> kill_random_taskmanager
> [...]
> sleep 65 # this is a little longer than the heartbeat timeout so that the TM is gone
> start_and_wait_for_tm
> {code}
> Instead of waiting for a fixed amount of time that is tied to some config value, we should wait for a specific event, such as the job being canceled.
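
As a rough illustration of the event-based waiting proposed in the issue, the sketch below polls for an expected job state and gives up after a timeout. It is not the code from the pull request: `get_job_state`, `wait_for_job_state` and `JOB_STATE_FILE` are hypothetical names, and the status query is stubbed with a file read so that the example runs without a cluster. A real test would presumably replace the stub with a query against the cluster or its logs.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for a real status query (assumption: not a Flink API).
# It reads the current state from a file so the sketch runs anywhere.
get_job_state() {
  cat "${JOB_STATE_FILE}" 2>/dev/null
}

# Poll until the job reports the expected state or the timeout expires.
# Arguments: expected state, timeout in seconds (default 60),
# poll interval in seconds (default 1).
wait_for_job_state() {
  local expected="$1"
  local timeout="${2:-60}"
  local interval="${3:-1}"
  local waited=0

  while [ "${waited}" -lt "${timeout}" ]; do
    if [ "$(get_job_state)" = "${expected}" ]; then
      return 0
    fi
    sleep "${interval}"
    waited=$((waited + interval))
  done

  echo "Timed out after ${timeout}s waiting for state ${expected}" >&2
  return 1
}
```

A restart test could call `wait_for_job_state` once for the state that signals the restart and once more for the running state, instead of sleeping for a fixed interval. The timeout only bounds the worst case, so the test proceeds as soon as the state is observed. Note that polling can miss a state that is entered and left between two polls, so a short-lived state may be better detected from the logs.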


