Posted to user@flink.apache.org by Martin Eden <ma...@gmail.com> on 2018/08/21 08:20:38 UTC

Re: Reply: Best way to find the current alive jobmanager with HA mode zookeeper

Hi guys,

Just to close the loop: with the Flink 1.3.2 CLI you have to provide the
Flink Job Manager host address in order to submit a job, like so:
${FLINK_HOME}/bin/flink run -d -m ${FLINK_JOBMANAGER_ADDRESS} ${JOB_JAR}

Since we are running the DCOS Flink package, we use the Marathon REST API to
fetch the FLINK_JOBMANAGER_ADDRESS, which solved our problem.
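
A minimal sketch of that lookup, assuming the app definition returned by
Marathon's GET /v2/apps/{appId} endpoint lists the running task's host and
ports. The app id "/flink", the sample values, and the idea that the first
port is the one the CLI needs are all assumptions for illustration; check
them against your own deployment.

```python
import json
from typing import Optional, Tuple


def jobmanager_address(app_json: str, port_index: int = 0) -> Optional[Tuple[str, int]]:
    """Return (host, port) of the first usable task in a Marathon app response.

    port_index selects which of the task's ports to use; which index maps to
    the JobManager port is an assumption that depends on the package.
    """
    app = json.loads(app_json).get("app", {})
    for task in app.get("tasks", []):
        ports = task.get("ports", [])
        if task.get("host") and len(ports) > port_index:
            return task["host"], ports[port_index]
    return None


# Hypothetical response body, trimmed to the fields used above.
sample = json.dumps({
    "app": {
        "id": "/flink",
        "tasks": [{"host": "10.0.1.12", "ports": [6123, 8081]}],
    }
})

host, port = jobmanager_address(sample)
print(f"{host}:{port}")
```

The printed host:port pair is what would be passed as
${FLINK_JOBMANAGER_ADDRESS} in the command above.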

We are now thinking of upgrading to the latest 1.6 release. From looking at
the CLI docs and from the previous messages, it seems you still need to
provide the Job Manager address explicitly. Are there any plans to support
job submission that just takes a ZooKeeper ensemble and zookeeperNamespace
(which is currently accepted) without having to provide an explicit Job
Manager address? This would be more user friendly and would eliminate the
extra step of figuring out the Job Manager address.

Thanks,
M



On Tue, Jul 31, 2018 at 3:54 PM, Till Rohrmann <tr...@apache.org> wrote:

> I think that the web UI automatically redirects to the current leader. So
> if you access a JobManager which is not the leader, you should
> get an HTTP redirect to the current leader. Because of that, it should not be
> strictly necessary to know which of the JobManagers is the leader.
>
> The RestClusterClient uses the ZooKeeperLeaderRetrievalService to
> retrieve the leader address. You could try the same. Using the
> RestClusterClient with Flink 1.4 won't work, though. Alternatively, you
> should be able to directly read the address from the leader ZNode in
> ZooKeeper.
>
> Cheers,
> Till
>
>
>
> On Thu, Jul 26, 2018 at 4:14 AM vino yang <ya...@gmail.com> wrote:
>
>> Hi Youjun,
>>
>> Thanks, you can try this, but I am not sure if it works correctly, because
>> there are quite a few changes to the REST client from 1.4 to 1.5.
>>
>> Maybe you can customize the 1.4 source code with reference to the specific
>> implementation in 1.5? Another option is to upgrade your Flink version.
>>
>> To Chesnay and Till: any suggestions or opinions?
>>
>> Thanks, vino.
>>
>> 2018-07-26 10:01 GMT+08:00 Yuan,Youjun <yu...@baidu.com>:
>>
>>> Thanks for the information. I forgot to mention that I am using Flink 1.4,
>>> where the RestClusterClient doesn't seem to have the ability to retrieve the
>>> leader address. I did notice there is a webMonitorRetrievalService member in
>>> Flink 1.5.
>>>
>>>
>>>
>>> I wonder if I can use RestClusterClient@v1.5 on my client side, to
>>> retrieve the leader JM of Flink v1.4 Cluster.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Youjun
>>>
>>>
>>>
>>> *From:* vino yang <ya...@gmail.com>
>>> *Sent:* Wednesday, July 25, 2018 7:11 PM
>>> *To:* Martin Eden <ma...@gmail.com>
>>> *Cc:* Yuan,Youjun <yu...@baidu.com>; user@flink.apache.org
>>> *Subject:* Re: Best way to find the current alive jobmanager with HA mode
>>> zookeeper
>>>
>>>
>>>
>>> Hi Martin,
>>>
>>>
>>>
>>>
>>>
>>> For a standalone cluster with multiple JM instances, if you do not use
>>> the REST API but instead use the cluster client provided by Flink, the
>>> client can detect which of the JM instances is the leader.
>>>
>>>
>>>
>>> For example, you can use the CLI to submit a Flink job from a non-leader node.
>>>
>>>
>>>
>>> But I did not verify this case for Flink on Mesos.
>>>
>>>
>>>
>>> Thanks, vino.
>>>
>>>
>>>
>>> 2018-07-25 17:22 GMT+08:00 Martin Eden <ma...@gmail.com>:
>>>
>>> Hi,
>>>
>>>
>>>
>>> This is actually very relevant to us as well.
>>>
>>>
>>>
>>> We want to deploy Flink 1.3.2 on a 3 node DCOS cluster. In the case of
>>> Mesos/DCOS, Flink HA runs only one JobManager, which gets restarted on
>>> another node by Marathon in case of failure and reloads its state from
>>> ZooKeeper.
>>>
>>>
>>>
>>> Yuan, I am guessing you are using Flink in standalone mode, and there it
>>> is actually running 3 instances of the Job Manager, 1 active and 2
>>> stand-bys.
>>>
>>>
>>>
>>> Either way, in both cases there is the need to "discover" the hostname
>>> and port of the Job Manager at runtime. This is needed when you want to use
>>> the CLI to submit jobs, for instance. Is there an elegant way to submit
>>> jobs other than, say, just trying out all the possible nodes in your cluster?
>>>
>>>
>>>
>>> Grateful if anyone could clarify any of the above, thanks,
>>>
>>> M
>>>
>>>
>>>
>>> On Wed, Jul 25, 2018 at 11:37 AM, Yuan,Youjun <yu...@baidu.com>
>>> wrote:
>>>
>>> Hi all,
>>>
>>>
>>>
>>> I have a standalone cluster with 3 JobManagers, and have set *high-availability
>>> to zookeeper*. Our client submits jobs via the REST API (POST
>>> /jars/:jarid/run), which means we need to know the host of one of the
>>> JobManagers that is currently alive. The problem is: how can we know which
>>> JobManager is alive, or the host of the current leader? We don’t want to
>>> access a dead JM.
>>>
>>>
>>>
>>> Thanks.
>>>
>>> Youjun Yuan
>>>
>>>
>>>
>>>
>>>
>>
>>

Re: Reply: Best way to find the current alive jobmanager with HA mode zookeeper

Posted by Till Rohrmann <tr...@apache.org>.
Hi Martin,

When Flink is configured to use the ZooKeeper HA mode, it is not
necessary to specify the leader's address manually. The CLI will ask
ZooKeeper for the leader information and send the request to the current
leader. This should work with Flink >= 1.5 at least, and also with Flink 1.4.
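
For anyone who wants to look at that leader information outside the CLI,
here is a rough sketch of decoding the address from the raw bytes of the
leader ZNode. It assumes the payload is a Java ObjectOutputStream stream
whose first record is a short block-data record holding a writeUTF string
(the leader address), followed by the serialized session ID. That layout is
an assumption to verify against your Flink version, and the sample address
and the function name decode_leader_address are made up for illustration.

```python
import struct

STREAM_HEADER = b"\xac\xed\x00\x05"  # Java serialization magic + version
TC_BLOCKDATA = 0x77                  # block-data record with a 1-byte length


def decode_leader_address(payload: bytes) -> str:
    """Extract the leading writeUTF string from a Java serialization stream.

    Only handles the short block-data form (block shorter than 256 bytes)
    and addresses that are plain ASCII, where Java's modified UTF-8 and
    standard UTF-8 coincide.
    """
    if payload[:4] != STREAM_HEADER:
        raise ValueError("not a Java serialization stream")
    if len(payload) < 8 or payload[4] != TC_BLOCKDATA:
        raise ValueError("expected a short block-data record first")
    block_len = payload[5]
    block = payload[6:6 + block_len]
    (utf_len,) = struct.unpack(">H", block[:2])  # 2-byte big-endian length
    return block[2:2 + utf_len].decode("utf-8")


# Build a hypothetical payload the same way, to exercise the decoder.
address = "akka.tcp://flink@jm-2:6123/user/jobmanager"
encoded = address.encode("utf-8")
block = struct.pack(">H", len(encoded)) + encoded
payload = STREAM_HEADER + bytes([TC_BLOCKDATA, len(block)]) + block

print(decode_leader_address(payload))
```

The bytes themselves would come from a ZooKeeper client reading the leader
ZNode; its path depends on the HA settings of the cluster (root path,
cluster id / namespace), so it is not hard-coded here.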

Cheers,
Till
