Posted to issues@flink.apache.org by "Kostas Kloudas (JIRA)" <ji...@apache.org> on 2018/05/16 16:23:00 UTC

[jira] [Closed] (FLINK-9379) HA end-to-end test failing locally

     [ https://issues.apache.org/jira/browse/FLINK-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kostas Kloudas closed FLINK-9379.
---------------------------------
    Resolution: Fixed

Fixed on master with cddb65abe3a7b82ffb3235800903c016fdc38877

and on 1.5 with 7fb94fb345fd6a8d414162e61a000ac8101d81b5

> HA end-to-end test failing locally
> ----------------------------------
>
>                 Key: FLINK-9379
>                 URL: https://issues.apache.org/jira/browse/FLINK-9379
>             Project: Flink
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 1.5.0
>            Reporter: Till Rohrmann
>            Assignee: Kostas Kloudas
>            Priority: Critical
>              Labels: test-stability
>
> The HA end-to-end test sometimes fails with
> {code}
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Could not submit job 797547d5fd619ea240d4c6690adc9101.
>     at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:247)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
>     at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:410)
>     at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:781)
>     at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:275)
>     at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
>     at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
>     at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>     at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>     at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$5(RestClusterClient.java:357)
>     at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
>     at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>     at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
>     at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.util.RestClientException: [Service temporarily unavailable due to an ongoing leader election. Please refresh.]
>     at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
>     at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
>     at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
>     at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
>     at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
>     ... 4 more
> Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Service temporarily unavailable due to an ongoing leader election. Please refresh.]
>     at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:225)
>     at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:209)
>     at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
>     ... 5 more
> {code}
> when executing it locally.
> I assume that the test does not properly wait until the cluster is ready to accept a job submission.
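The suspected cause above (submitting before the cluster has finished leader election) is commonly handled by polling a readiness check before submitting. The following is only a minimal sketch of that idea, not the fix applied in the commits listed above. The names `wait_until_ready` and `rest_probe` are hypothetical, and the URL passed to `rest_probe` is an assumption about whichever REST endpoint a given setup exposes.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool],
                     timeout_s: float = 30.0,
                     interval_s: float = 0.5,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `probe` repeatedly until it returns True or the timeout expires.

    Exceptions raised by the probe (for example an HTTP 503 while a leader
    election is still in progress) are treated as "not ready yet".
    Returns True if the probe succeeded, False if the timeout expired.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # not ready yet; retry until the deadline
        if clock() >= deadline:
            return False
        sleep(interval_s)


def rest_probe(url: str) -> Callable[[], bool]:
    """Hypothetical probe: ready when an HTTP GET on `url` returns 200.

    urlopen raises HTTPError for 4xx/5xx responses, which
    wait_until_ready interprets as "not ready yet".
    """
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    return probe
```

A test script would call something like `wait_until_ready(rest_probe(url))` with the URL of its own cluster and submit the job only if the call returns True. Treating probe exceptions as "retry" rather than "fail" is the key design choice here, since the error in the trace above is described as temporary.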



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)