Posted to user@flink.apache.org by Ethan Li <et...@gmail.com> on 2020/01/10 02:37:32 UTC

[Question] Failed to submit flink job to secure yarn cluster

Hello

I was following https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/yarn_setup.html#run-a-flink-job-on-yarn and trying to submit a Flink job on YARN.

I downloaded flink-1.9.1 and the pre-bundled Hadoop 2.8.3 from https://flink.apache.org/downloads.html#apache-flink-191. I used the default configs except:

security.kerberos.login.keytab: userA.keytab
security.kerberos.login.principal: userA@REALM


I already have a secure YARN cluster set up. Then I ran "./bin/flink run -m yarn-cluster -p 1 -yjm 1024m -ytm 1024m ./examples/streaming/WordCount.jar" and got the following errors:


org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:385)
	at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:251)
	at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
	at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
	at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1578605412668_0005 to YARN : Failed to renew token: Kind: kms-dt, Service: host3.com:3456, Ident: (owner=userA, renewer=adminB, realUser=, issueDate=1578606224956, maxDate=1579211024956, sequenceNumber=32, masterKeyId=52)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:275)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.startAppMaster(AbstractYarnClusterDescriptor.java:1004)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:507)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:378)
	... 9 more


Full client log: https://gist.github.com/Ethanlm/221284bcaa272270a799957dc05b94fd
Resource manager log: https://gist.github.com/Ethanlm/ecd0a3eb25582ad6b1552927fc0e5c47
Hostnames, IP addresses, usernames, etc. are anonymized.


Not sure how to proceed further. Wondering if anyone in the community has encountered this before. Thank you very much for your time!

Best,
Ethan


Re: [Question] Failed to submit flink job to secure yarn cluster

Posted by Ethan Li <et...@gmail.com>.
Sorry, I forgot to update on this.

I figured it out. KMS is not set up correctly in my environment, and the ResourceManager is also missing the key provider config. PE is fixing it.
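
For anyone hitting the same "Failed to renew token: Kind: kms-dt" error, here is a minimal sketch of how one might check whether a Hadoop site file defines a key provider at all. It is not taken from the thread. It assumes the property name hadoop.security.key.provider.path (with the older dfs.encryption.key.provider.uri as a fallback), and the sample values, including the http scheme and the /kms path, are hypothetical; only host3.com:3456 comes from the anonymized error above. The function name find_key_provider is likewise made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed property names: the first is the core-site.xml setting, the
# second is the older HDFS-side name. Verify both against your Hadoop version.
KEY_PROVIDER_PROPERTIES = (
    "hadoop.security.key.provider.path",
    "dfs.encryption.key.provider.uri",
)


def find_key_provider(site_xml_text):
    """Return (property_name, value) for the first configured key provider
    property in the given Hadoop site XML text, or None if neither is set."""
    root = ET.fromstring(site_xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in KEY_PROVIDER_PROPERTIES and value:
            found[name] = value
    # Prefer the newer property name when both are present.
    for name in KEY_PROVIDER_PROPERTIES:
        if name in found:
            return name, found[name]
    return None


# Hypothetical file contents; the scheme and path are illustrative only.
SAMPLE_WITH_PROVIDER = """<configuration>
  <property>
    <name>hadoop.security.key.provider.path</name>
    <value>kms://http@host3.com:3456/kms</value>
  </property>
</configuration>"""

SAMPLE_WITHOUT_PROVIDER = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(find_key_provider(SAMPLE_WITH_PROVIDER))
    print(find_key_provider(SAMPLE_WITHOUT_PROVIDER))
```

In practice one would read the text from the core-site.xml (or hdfs-site.xml) that the ResourceManager process actually loads, since a provider configured only on the client side would not help the ResourceManager renew the token.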

Thanks for looking into this

Ethan Li

> On Jan 13, 2020, at 21:38, Yang Wang <da...@gmail.com> wrote:
> 
> 
> I am not familiar with Kerberos. However, I found "keyProvider null cannot renew token" in the YARN
> ResourceManager logs. Could you please check whether the key provider has been configured correctly?
> 
> 
> Best,
> Yang
> 

Re: [Question] Failed to submit flink job to secure yarn cluster

Posted by Yang Wang <da...@gmail.com>.
I am not familiar with Kerberos. However, I found "keyProvider null cannot
renew token" in the YARN ResourceManager logs. Could you please check
whether the key provider has been configured correctly?


Best,
Yang

On Fri, Jan 10, 2020 at 10:54 PM, Ethan Li <et...@gmail.com> wrote:

> Hi Yangze,
>
> Thanks for your reply. Those are the docs I have read and followed. (I was
> also able to set up a standalone Flink cluster with secure HDFS, ZooKeeper,
> and Kafka.)
>
> Could you please let me know what I am missing? Thanks
>
>
> Best,
> Ethan
>

Re: [Question] Failed to submit flink job to secure yarn cluster

Posted by Ethan Li <et...@gmail.com>.
Hi Yangze,

Thanks for your reply. Those are the docs I have read and followed. (I was also able to set up a standalone Flink cluster with secure HDFS, ZooKeeper, and Kafka.)

Could you please let me know what I am missing? Thanks


Best,
Ethan

> On Jan 10, 2020, at 6:28 AM, Yangze Guo <ka...@gmail.com> wrote:
> 
> Hi, Ethan
> 
> You could first check your cluster by following this guide[1], and check whether
> all the related config[2] is set correctly.
> 
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/security-kerberos.html
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/config.html#security-kerberos-login-contexts
> 
> Best,
> Yangze Guo
> 


Re: [Question] Failed to submit flink job to secure yarn cluster

Posted by Yangze Guo <ka...@gmail.com>.
Hi, Ethan

You could first check your cluster by following this guide[1], and check whether
all the related config[2] is set correctly.

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/security-kerberos.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/config.html#security-kerberos-login-contexts

Best,
Yangze Guo

On Fri, Jan 10, 2020 at 10:37 AM Ethan Li <et...@gmail.com> wrote:
>
> Hello
>
> I was following https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/yarn_setup.html#run-a-flink-job-on-yarn and trying to submit a Flink job on YARN.
>
> I downloaded flink-1.9.1 and pre-bundled Hadoop 2.8.3 from https://flink.apache.org/downloads.html#apache-flink-191. I used default configs except:
>
> security.kerberos.login.keytab: userA.keytab
> security.kerberos.login.principal: userA@REALM
>
>
> I already have a secure YARN cluster set up. Then I ran "./bin/flink run -m yarn-cluster -p 1 -yjm 1024m -ytm 1024m ./examples/streaming/WordCount.jar" and got the following errors:
>
>
> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
> at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:385)
> at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:251)
> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
> at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
> at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
> at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1578605412668_0005 to YARN : Failed to renew token: Kind: kms-dt, Service: host3.com:3456, Ident: (owner=userA, renewer=adminB, realUser=, issueDate=1578606224956, maxDate=1579211024956, sequenceNumber=32, masterKeyId=52)
> at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:275)
> at org.apache.flink.yarn.AbstractYarnClusterDescriptor.startAppMaster(AbstractYarnClusterDescriptor.java:1004)
> at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:507)
> at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:378)
> ... 9 more
>
>
> Full client log: https://gist.github.com/Ethanlm/221284bcaa272270a799957dc05b94fd
> Resource manager log: https://gist.github.com/Ethanlm/ecd0a3eb25582ad6b1552927fc0e5c47
> Hostnames, IP addresses, usernames, etc. are anonymized.
>
>
> Not sure how to proceed further. Wondering if anyone in the community has encountered this before. Thank you very much for your time!
>
> Best,
> Ethan
>