Posted to issues@flink.apache.org by "Andrey Zagrebin (JIRA)" <ji...@apache.org> on 2019/07/19 14:55:00 UTC

[jira] [Comment Edited] (FLINK-12038) YARNITCase stalls on travis

    [ https://issues.apache.org/jira/browse/FLINK-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888946#comment-16888946 ] 

Andrey Zagrebin edited comment on FLINK-12038 at 7/19/19 2:54 PM:
------------------------------------------------------------------

True, I also see it in the JM logs.

I looped the test on Travis again with the previously suggested fix, where the test waits with a timeout for the YARN app to reach the FINISHED state.
https://travis-ci.org/azagrebin/flink/builds/560969859
(The CI run fails on the overall build timeout because the loop has too many iterations, but the test no longer fails after a couple of iterations as it did before.)

The wait does not take long, so there is no need to kill the app in this case. It only needs to be killed if the normal shutdown does not reach FINISHED within the timeout, which would again signal that there is some problem. I think this is a cleaner approach. I would treat the YARN mini cluster shutdown in the @AfterClass test method as an emergency cleanup.
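
To make the idea concrete, here is a minimal sketch of "wait for FINISHED with a timeout, kill only as a fallback". It is not the actual patch: the class name, the method names and the 50 ms poll interval are made up for illustration, and the application state is passed in as a plain Supplier<String> so that the snippet does not depend on Hadoop. In the real test the state would presumably come from the YARN client's application report, and the kill callback would ask YARN to kill the application.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper, not part of Flink or Hadoop.
final class AppShutdownWaiter {

    private AppShutdownWaiter() {}

    /**
     * Polls stateSupplier until it returns the target state or the timeout
     * elapses. Returns true if the target state was observed in time.
     */
    static boolean waitForState(
            Supplier<String> stateSupplier,
            String target,
            Duration timeout,
            Duration pollInterval) throws InterruptedException {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (target.equals(stateSupplier.get())) {
                return true;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }

    /**
     * Waits for a normal shutdown (state FINISHED). Only if that does not
     * happen within the timeout is the kill callback invoked. The return
     * value tells the caller whether the shutdown was clean, so the test
     * can still fail when the fallback had to be used.
     */
    static boolean awaitFinishedOrKill(
            Supplier<String> stateSupplier,
            Runnable kill,
            Duration timeout) throws InterruptedException {
        // Assumed poll interval, chosen arbitrarily for this sketch.
        boolean finished =
                waitForState(stateSupplier, "FINISHED", timeout, Duration.ofMillis(50));
        if (!finished) {
            kill.run();
        }
        return finished;
    }
}
```

A test would call awaitFinishedOrKill after the job completes and assert on the returned boolean, which keeps the @AfterClass mini cluster shutdown as the last-resort cleanup.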


> YARNITCase stalls on travis
> ---------------------------
>
>                 Key: FLINK-12038
>                 URL: https://issues.apache.org/jira/browse/FLINK-12038
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, Tests
>    Affects Versions: 1.9.0
>            Reporter: Chesnay Schepler
>            Assignee: shuai.xu
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.9.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://travis-ci.org/apache/flink/jobs/511932978


