Posted to user@flink.apache.org by Curt Buechter <tr...@gmail.com> on 2021/09/22 16:46:55 UTC

flink rest endpoint creation failure

Hi,
I'm getting an error that happens randomly when starting a Flink
application.

For context, this is running in YARN on AWS. This application is one that
converts from the Table API to the Stream API, so two Flink
applications/jobmanagers are trying to start up. I think what happens is
that the REST API port is chosen, and it is the same for both of the Flink
apps. If YARN chooses two different instances for the two task managers,
they each work fine and start their REST API on the same port on their own
respective machines. But if YARN chooses the same instance for both job
managers, they both try to start up the REST API on the same port on the
same machine, and I get the error.
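
To illustrate the suspected failure mode, here is a minimal Java sketch that is independent of Flink. It assumes nothing about how Flink picks its port; it only shows that a second listener on an already-bound port of the same host is rejected with java.net.BindException, the exception type reported in this message. The class and method names are made up for the example.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical demo class; not part of Flink.
class PortClashDemo {

    /** Returns true if a second listener on an already-bound port is rejected. */
    static boolean secondBindFails() throws IOException {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            // Try to open a second listener on the same port while the first is still open.
            try (ServerSocket second = new ServerSocket(port)) {
                return false;
            } catch (BindException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

On two different machines the same port number causes no conflict, which would match the observation that the error only shows up when both processes land on the same instance.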

Here is the error:

2021-09-22 15:47:27,724 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Could not start cluster entrypoint YarnJobClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint YarnJobClusterEntrypoint.
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:212) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:600) [flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:99) [flink-dist_2.12-1.13.2.jar:1.13.2]
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
	at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:275) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:250) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:189) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_282]
	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_282]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.2.1-amzn-3.jar:?]
	at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:186) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	... 2 more
Caused by: java.net.BindException: Could not start rest endpoint on any port in port range 35485
	at org.apache.flink.runtime.rest.RestServerEndpoint.start(RestServerEndpoint.java:234) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:172) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:250) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:189) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_282]
	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_282]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.2.1-amzn-3.jar:?]
	at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:186) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	... 2 more


And, here is part of the log from the other job manager, which
successfully started its rest api on the same port, just a few seconds
earlier:


2021-09-22 15:47:20,690 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Rest endpoint listening at ip-10-1-2-137.ec2.internal:35485
2021-09-22 15:47:20,691 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - http://ip-10-1-2-137.ec2.internal:35485 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000
2021-09-22 15:47:20,692 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Web frontend listening at http://ip-10-1-2-137.ec2.internal:35485.



Do you know of any configuration that would assist with this? I
thought about rest.bind-port, but the rest port already seems to be
chosen dynamically. My config file has that setting commented out.


Thanks

Re: flink rest endpoint creation failure

Posted by Robert Metzger <rm...@apache.org>.
Hi,

Yes, "rest.bind-port" seems to be set to "35485" on the JobManager
instance. Can you double-check the configuration that Flink is using?
The JobManager also prints the effective configuration on startup, so
you will probably see the value there as well.
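
One way to run that check is to filter a JobManager log for the REST port keys. The following is a rough Java sketch; it assumes only that the effective settings show up in the log under their key names ("rest.port", "rest.bind-port") and makes no assumption about the exact log line format. The class name and the log path argument are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Flink.
class RestConfigGrep {

    /** Returns the log lines that mention one of the REST port configuration keys. */
    static List<String> restSettings(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains("rest.port") || line.contains("rest.bind-port")) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path to a JobManager log file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        restSettings(lines).forEach(System.out::println);
    }
}
```

Comparing the output for the failing and the successful JobManager would show whether both were started with the same fixed port.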



Re: flink rest endpoint creation failure

Posted by Matthias Pohl <ma...@ververica.com>.
Hi Curt,
could you elaborate a bit more on your setup? Maybe provide the commands you
used to deploy the jobs and the Flink/YARN logs. What's puzzling me is your
statement about "two JobManagers spinning up" and "everything's working
fine if two TaskManagers are running on different instances".
- When talking about Flink applications, do you mean application mode?
- I have the feeling you're mixing up JobManager and TaskManager in your
initial description. Could you clarify this?
- Actually, each of the Flink components (JobManager and TaskManager)
should run in its own YARN container. The way you describe it, it sounds
as if Flink runs within one container?

Best,
Matthias




Re: flink rest endpoint creation failure

Posted by Curt Buechter <tr...@gmail.com>.
Thanks Robert,
But, no, the rest.bind-port is not set to 35485 in the configuration. Other
jobs use different ports, so it is getting set dynamically.

#==============================================================================
# Rest & web frontend
#==============================================================================

# The port to which the REST client connects to. If rest.bind-port has
# not been specified, then the server will bind to this port as well.
#
#rest.port: 8081

# The address to which the REST client will connect to
#
#rest.address: 0.0.0.0

# Port range for the REST and web server to bind to.
#
#rest.bind-port: 8080-8090

# The address that the REST & web server binds to
#
#rest.bind-address: 0.0.0.0

# Flag to specify whether job submission is enabled from the web-based
# runtime monitor. Uncomment to disable.

#web.submit.enable: false
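
The commented-out "rest.bind-port: 8080-8090" line above is a port range, while the error message reports a "range" consisting of the single port 35485. As a simplified Java sketch of why that difference matters (this is not Flink's actual RestServerEndpoint code, and the class, method, and port numbers are made up for illustration): a server that walks a range can fall back to the next port when one is taken, whereas a one-port range leaves nothing to fall back to.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical helper; a simplified illustration, not Flink's implementation.
class PortRangeBinder {

    /** Tries each port in [from, to] in order and returns the first socket that binds. */
    static ServerSocket bindFirstFree(int from, int to) throws IOException {
        for (int port = from; port <= to; port++) {
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                // Port is already taken on this host; try the next candidate.
            }
        }
        throw new BindException("Could not bind to any port in range " + from + "-" + to);
    }

    public static void main(String[] args) throws IOException {
        // Arbitrary example range; two listeners on one host end up on different ports.
        try (ServerSocket a = bindFirstFree(50100, 50110);
             ServerSocket b = bindFirstFree(50100, 50110)) {
            System.out.println(a.getLocalPort() + " and " + b.getLocalPort());
        }
    }
}
```

Whether setting an explicit range for rest.bind-port resolves the problem in this setup is untested here; it depends on where the fixed value 35485 is coming from.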


