Posted to user@flink.apache.org by "Yuan,Youjun" <yu...@baidu.com> on 2018/07/25 03:37:07 UTC

Best way to find the current alive jobmanager with HA mode zookeeper

Hi all,

I have a standalone cluster with 3 JobManagers, with high-availability set to zookeeper. Our client submits jobs through the REST API (POST /jars/:jarid/run), which means we need to know the host of one of the currently alive JobManagers. The problem is: how can we know which JobManager is alive, or the host of the current leader? We don't want to access a dead JM.

Thanks.
Youjun Yuan

Re: Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by Till Rohrmann <tr...@apache.org>.
Hi Martin,

When Flink is configured to use the ZooKeeper HA mode, it is not
necessary to specify the leader's address manually. The CLI asks
ZooKeeper for the leader information and sends the request to the current
leader. This should work with Flink >= 1.5 and also with Flink 1.4.
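
As a rough illustration of what this means for a submitting script, the
sketch below builds a detached `flink run` command with no -m option. It
assumes the flink-conf.yaml read by the CLI already has high-availability set
to zookeeper; the function name, the /opt/flink path and the namespace value
are made up for the example. The -z flag is the CLI's zookeeperNamespace
option.

```python
def build_ha_run_command(flink_home, job_jar, zk_namespace=None):
    """Build a detached `flink run` command that omits -m.

    Assumption: the CLI's flink-conf.yaml enables ZooKeeper HA, so the CLI
    can look up the current leader by itself.
    """
    cmd = [flink_home.rstrip("/") + "/bin/flink", "run", "-d"]
    if zk_namespace is not None:
        # Optional: select the ZooKeeper namespace of the target cluster.
        cmd += ["-z", zk_namespace]
    cmd.append(job_jar)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_ha_run_command("/opt/flink", "job.jar", "/my-cluster")))
```

The resulting list can be handed to subprocess.run as-is; whether the lookup
succeeds still depends on the CLI seeing the same ZooKeeper settings as the
cluster.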

Cheers,
Till


Re: Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by Martin Eden <ma...@gmail.com> on 2018/08/21.
Hi guys,

Just to close the loop: with the Flink 1.3.2 CLI you have to provide the
Flink Job Manager host address in order to submit a job, like so:
${FLINK_HOME}/bin/flink run -d -m ${FLINK_JOBMANAGER_ADDRESS} ${JOB_JAR}

Since we are running the DCOS Flink package, we use the Marathon REST API to
fetch the FLINK_JOBMANAGER_ADDRESS, which solved our problem.
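
A minimal sketch of such a lookup follows. It is not our actual script: it
assumes Marathon's GET /v2/apps/<app id> response lists the app's tasks with
"host" and "ports" fields, and that the first port is the one the CLI should
use. The Marathon URL, the app id and the port index are placeholders that
depend on the installation.

```python
import json
from urllib.request import urlopen


def jobmanager_address(app_response, port_index=0):
    """Pick host:port of the first task that exposes the wanted port.

    port_index is an assumption: which of the task's ports belongs to the
    Job Manager depends on how the Flink package is deployed.
    """
    tasks = app_response.get("app", {}).get("tasks", [])
    for task in tasks:
        ports = task.get("ports") or []
        if task.get("host") and len(ports) > port_index:
            return "{}:{}".format(task["host"], ports[port_index])
    raise LookupError("no task with a usable host and port found")


def fetch_jobmanager_address(marathon_url, app_id, port_index=0):
    """Query Marathon for the app and derive the Job Manager address."""
    url = "{}/v2/apps/{}".format(marathon_url.rstrip("/"), app_id.strip("/"))
    with urlopen(url, timeout=10) as resp:
        return jobmanager_address(json.load(resp), port_index)
```

The returned string can then be used as ${FLINK_JOBMANAGER_ADDRESS} in the
command above.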

We are now thinking of upgrading to the latest 1.6 release. From looking at
the CLI docs and the previous messages, it seems you still need to
provide the Job Manager address explicitly. Are there any plans to support
job submission that takes just a ZooKeeper ensemble and zookeeperNamespace
(which is currently accepted), without having to provide an explicit Job
Manager address? This would be more user-friendly and would eliminate the
extra step of figuring out the Job Manager address.

Thanks,
M




Re: Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by Till Rohrmann <tr...@apache.org> on 2018/07/31.
I think that the web UI automatically redirects to the current leader. So
if you access a JobManager which is not the leader, you should
get an HTTP redirect to the current leader. Because of that, it should not be
strictly necessary to know which of the JobManagers is the leader.
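
For a REST client this could look roughly like the sketch below: try each
known JobManager with a GET, let the HTTP library follow any redirect, and
remember which host finally answered. The /overview path and the host names
are assumptions made for the example, and I have not tested this against a
1.4 cluster.

```python
import http.client
from urllib.parse import urlsplit
from urllib.request import urlopen


def probe(base_url, path="/overview", timeout=3.0):
    """Return scheme://host:port that finally answered, or None.

    urlopen follows redirects for GET requests, so a redirect from a
    standby JobManager ends up at the address it points to.
    """
    try:
        with urlopen(base_url.rstrip("/") + path, timeout=timeout) as resp:
            final = urlsplit(resp.geturl())
            return "{}://{}".format(final.scheme, final.netloc)
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, malformed reply.
        return None


def find_live_jobmanager(candidates, probe_fn=probe):
    """Return the first address that answers, following redirects."""
    for base_url in candidates:
        answered_by = probe_fn(base_url)
        if answered_by is not None:
            return answered_by
    raise RuntimeError("none of the JobManagers answered")


def run_url(base_url, jar_id):
    """URL for POST /jars/:jarid/run on the given JobManager."""
    return "{}/jars/{}/run".format(base_url.rstrip("/"), jar_id)
```

Probing with a GET first and then sending the POST to the address that
answered avoids relying on the client re-sending a POST body after a
redirect.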

The RestClusterClient uses the ZooKeeperLeaderRetrievalService to retrieve
the leader address. You could try the same. Using the RestClusterClient
with Flink 1.4 won't work, though. Alternatively, you should be able to
directly read the address from the leader ZNode in ZooKeeper.
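
If you go the ZNode route, the bytes need decoding. The sketch below assumes
the payload is a Java ObjectOutputStream stream whose first item is the leader
address written with writeUTF (followed by the session ID). That is my reading
of how the leader information is stored, so please check it against your Flink
version. The function name is made up, and fetching the bytes (ZNode path,
ZooKeeper client) is left out because it depends on your configuration.

```python
import struct

STREAM_HEADER = b"\xac\xed\x00\x05"  # Java serialization magic + version
TC_BLOCKDATA = 0x77       # block data with a 1-byte length
TC_BLOCKDATALONG = 0x7A   # block data with a 4-byte length


def decode_leader_address(payload):
    """Extract the writeUTF string at the start of a Java object stream.

    Assumption: the leader ZNode holds such a stream with the leader
    address as its first item; this is not confirmed in this thread.
    """
    if len(payload) < 5 or payload[:4] != STREAM_HEADER:
        raise ValueError("not a Java serialization stream")
    tag = payload[4]
    if tag == TC_BLOCKDATA:
        start = 6   # skip tag + 1 length byte
    elif tag == TC_BLOCKDATALONG:
        start = 9   # skip tag + 4 length bytes
    else:
        raise ValueError("stream does not start with block data")
    if len(payload) < start + 2:
        raise ValueError("truncated payload")
    (utf_len,) = struct.unpack_from(">H", payload, start)
    raw = payload[start + 2:start + 2 + utf_len]
    if len(raw) != utf_len:
        raise ValueError("truncated payload")
    # Java writes "modified UTF-8"; for plain ASCII addresses it is
    # identical to UTF-8.
    return raw.decode("utf-8")
```

The decoded value is whatever address the leader registered, which is not
necessarily the REST endpoint, so it may still need mapping to the web
monitor's host and port.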

Cheers,
Till




Re: Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by vino yang <ya...@gmail.com> on 2018/07/26.
Hi Youjun,

Thanks. You can try this, but I am not sure whether it works correctly, because
the REST client changed quite a bit from 1.4 to 1.5.

Maybe you could adapt the 1.4 source code by referring to the 1.5
implementation? Another option is to upgrade your Flink version.

To Chesnay and Till: any suggestions or opinions?

Thanks, vino.


Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by "Yuan,Youjun" <yu...@baidu.com> on 2018/07/26.
Thanks for the information. I forgot to mention that I am using Flink 1.4, where the RestClusterClient does not seem to be able to retrieve the leader address. I did notice that there is a webMonitorRetrievalService member in Flink 1.5.

I wonder if I can use RestClusterClient@v1.5 on my client side to retrieve the leader JM of a Flink v1.4 cluster.

Thanks
Youjun




Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by vino yang <ya...@gmail.com> on 2018/07/25.
Hi Martin,


For a standalone cluster with multiple JM instances, if you use the cluster
client that Flink provides rather than the REST API, the client can
determine which of the JM instances is the leader.

For example, you can use the CLI to submit a Flink job from a non-leader node.

But I did not verify this case for Flink on Mesos.

Thanks, vino.


Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by Martin Eden <ma...@gmail.com> on 2018/07/25.
Hi,

This is actually very relevant to us as well.

We want to deploy Flink 1.3.2 on a 3-node DCOS cluster. In the case of
Mesos/DCOS, Flink HA runs only one JobManager, which Marathon restarts on
another node in case of failure and which then reloads its state from
ZooKeeper.

Yuan, I am guessing you are using Flink in standalone mode, and there it is
actually running 3 instances of the Job Manager: 1 active and 2 stand-bys.

Either way, in both cases there is a need to "discover" the hostname and
port of the Job Manager at runtime. This is needed, for instance, when you
want to use the CLI to submit jobs. Is there an elegant way to submit jobs
other than, say, just trying out all the possible nodes in your cluster?

Grateful if anyone could clarify any of the above, thanks,
M


Re: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by vino yang <ya...@gmail.com>.
Hi Yuan Youjun,

Actually, RestClusterClient has a method named getWebMonitorBaseUrl which
automatically retrieves the web monitor's leader address when you submit a
job. [1]

Ideally, you do not need to retrieve the JM yourself. Currently, the
web monitor is bound to the JobManager, so maybe after a JM failover you
cannot find the new web monitor?

Flink provides a component named "LeaderRetrievalService" to retrieve the
leader of various components. There is a ZooKeeper-based implementation named
"ZooKeeperLeaderRetrievalService".

ZooKeeperHaServices provides a method named
"getWebMonitorLeaderRetriever" to retrieve the web monitor's leader and
a method named "getJobManagerLeaderRetriever" to retrieve the
JobManager's leader, and ClusterClient#getJobManagerGateway uses it.

[1]:
https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/program/rest/RestClusterClient.java#L755

Thanks, vino.

