You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Li Lei (JIRA)" <ji...@apache.org> on 2018/07/12 08:18:00 UTC

[jira] [Updated] (FLINK-9830) submit job to yarn-flink cluster base on java API

     [ https://issues.apache.org/jira/browse/FLINK-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Lei updated FLINK-9830:
--------------------------
    Description: 
I use JAVA API to submit jobs to the Yarn cluster, the StreamExecutionEnvironment execute submit the job through RestClusterClient. and the webmonitor URL is the jobmanager.rpc.address configed in configuration flink-conf.yaml file.

But the flink-session that running on the yarn cluster does not use the configuration which is in  flink-conf.yaml ...

I want to know the correct method that submit Flink Job to Yarn cluster through Java API.

 

The exception stack is as follows:

 
||Exception Stack||
|

2018-07-12 15:14:27.763  INFO 77793 --- [pool-1-thread-1] o.a.f.c.program.rest.RestClusterClient   : Submitting job b0698c328bb79c05f2a653df8d84817c (detached: false).
2018-07-12 15:14:27.765  INFO 77793 --- [pool-1-thread-1] o.a.f.c.program.rest.RestClusterClient   : Requesting blob server port.
2018-07-12 15:15:28.307  INFO 77793 --- [pool-1-thread-1] o.apache.flink.runtime.rest.RestClient   : Shutting down rest endpoint.
2018-07-12 15:15:28.310  INFO 77793 --- [pool-1-thread-1] o.apache.flink.runtime.rest.RestClient   : Rest endpoint shutdown complete.
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result.
at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:258)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:457)
at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:219)
at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:178)
at com.tydic.tysc.job.service.SubmitJobService.submitJobToStandaloneCluster(SubmitJobService.java:215)
at com.tydic.tysc.rest.JobSubmitController$1.run(JobSubmitController.java:92)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$5(RestClusterClient.java:357)
at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:214)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:195)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
... 1 more
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.biApply(CompletableFuture.java:1088)
at java.util.concurrent.CompletableFuture$BiApply.tryFire(CompletableFuture.java:1070)
... 21 more
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
... 19 more
Caused by: java.util.concurrent.CompletionException: java.net.ConnectException: Connection refused: slave1/192.168.129.102:8081
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
... 16 more
Caused by: java.net.ConnectException: Connection refused: slave1/192.168.129.102:8081
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
... 7 more
 
 
 
|

 

 

 

  was:
I use JAVA API to submit jobs to the Yarn cluster, the StreamExecutionEnvironment execute submit the job through RestClusterClient. and the webmonitor URL is the jobmanager.rpc.address configed in configuration flink-conf.yaml file.

But the flink-session that running on the yarn cluster does not use the configuration which is in  flink-conf.yaml ...

I want to know the correct method that submit Flink Job to Yarn cluster through Java API.

 

 


> submit job to yarn-flink cluster base on java API
> -------------------------------------------------
>
>                 Key: FLINK-9830
>                 URL: https://issues.apache.org/jira/browse/FLINK-9830
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.5.0
>         Environment: CentOS 6.7
> JAVA 1.8.0_151
> Flink 1.5.0 
> Hadoop 2.6.5
>  
>            Reporter: Li Lei
>            Priority: Major
>              Labels: Flink, yarn, yarn-client
>
> I use JAVA API to submit jobs to the Yarn cluster, the StreamExecutionEnvironment execute submit the job through RestClusterClient. and the webmonitor URL is the jobmanager.rpc.address configed in configuration flink-conf.yaml file.
> But the flink-session that running on the yarn cluster does not use the configuration which is in  flink-conf.yaml ...
> I want to know the correct method that submit Flink Job to Yarn cluster through Java API.
>  
> The exception stack is as follows:
>  
> ||Exception Stack||
> |
> 2018-07-12 15:14:27.763  INFO 77793 --- [pool-1-thread-1] o.a.f.c.program.rest.RestClusterClient   : Submitting job b0698c328bb79c05f2a653df8d84817c (detached: false).
> 2018-07-12 15:14:27.765  INFO 77793 --- [pool-1-thread-1] o.a.f.c.program.rest.RestClusterClient   : Requesting blob server port.
> 2018-07-12 15:15:28.307  INFO 77793 --- [pool-1-thread-1] o.apache.flink.runtime.rest.RestClient   : Shutting down rest endpoint.
> 2018-07-12 15:15:28.310  INFO 77793 --- [pool-1-thread-1] o.apache.flink.runtime.rest.RestClient   : Rest endpoint shutdown complete.
> org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result.
> at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:258)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:457)
> at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:219)
> at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:178)
> at com.tydic.tysc.job.service.SubmitJobService.submitJobToStandaloneCluster(SubmitJobService.java:215)
> at com.tydic.tysc.rest.JobSubmitController$1.run(JobSubmitController.java:92)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
> at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$5(RestClusterClient.java:357)
> at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
> at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
> at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:214)
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
> at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:195)
> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
> at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
> at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> ... 1 more
> Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
> at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
> at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
> at java.util.concurrent.CompletableFuture.biApply(CompletableFuture.java:1088)
> at java.util.concurrent.CompletableFuture$BiApply.tryFire(CompletableFuture.java:1070)
> ... 21 more
> Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
> ... 19 more
> Caused by: java.util.concurrent.CompletionException: java.net.ConnectException: Connection refused: slave1/192.168.129.102:8081
> at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
> at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
> at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943)
> at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
> ... 16 more
> Caused by: java.net.ConnectException: Connection refused: slave1/192.168.129.102:8081
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
> at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
> ... 7 more
>  
>  
>  
> |
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)