
Posted to user@flink.apache.org by "Jain, Ankit" <an...@here.com> on 2017/05/01 16:59:59 UTC

High Availability on Yarn

Hi fellow users,
We are trying to straighten out the high-availability story for Flink.

Our setup includes a long-running EMR cluster. Job submission is a two-step process: 1) a Flink cluster is first created using the Flink Yarn client on the already-running EMR cluster; 2) the Flink job is submitted.

I also saw references saying that with 1.2 these two steps have been combined into one. Is that change in FlinkYarnSessionCli.java? Can somebody point me to the documentation, please?

Without worrying about Yarn RM failure for now (I mean the Yarn RM, not the Flink Yarn RM that seems to be newly introduced), I first want to understand how TaskManager and JobManager failures are handled.

My questions:

1) https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 suggests a new RM has been added and that there is now one JobManager for each job. Since the Yarn RM will now talk to the Flink RM (instead of the JobManager, as before), will Yarn automatically restart a failing Flink RM?

2) Is there any documentation on the behavior of the new Flink RM that comes up? How will previously running JobManagers and TaskManagers find out about the new RM?

3) https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#configuration requires configuring ZooKeeper even for Yarn. Is this needed for handling TaskManager failures, JobManager failures, or both? Will Yarn not take care of JM failures?

It may sound like I am a little confused about the roles of the Yarn and Flink components: which one carries most of the burden of HA? The documentation in its current state lacks clarity, though I know it is still evolving.

Please let me know if somebody can help clear the confusion.

Thanks
Ankit

Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Thanks for the reply Robert – I will try out #1 & keep you posted.

From: Robert Metzger <rm...@apache.org>
Date: Wednesday, May 24, 2017 at 7:44 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Hi Ankit,

I realized I can answer your questions myself :)

#1 I think that's possible, by using the same high-availability.zookeeper.path.root configuration parameter between the runs.
By default, on YARN we are using the YARN application ID as the root path, but if you are putting a custom one there, Flink will recover running jobs even if you are starting a new EMR cluster (assuming the files are in s3).
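To make #1 concrete, here is a minimal sketch that renders the relevant flink-conf.yaml lines with a fixed ZooKeeper root path, so that a replacement cluster would look under the same path. The helper name `ha_settings`, the quorum hosts, and the path value are hypothetical. The storage directory key (the "files in s3" part) is left out because, as far as I recall, its name differs between Flink versions. Depending on the version, the path component that defaults to the YARN application ID may also be governed by a separate cluster-id or namespace key, so check the HA documentation for the release in use.

```python
def ha_settings(zk_quorum, root_path, app_attempts=10):
    """Return flink-conf.yaml lines for ZooKeeper HA with a fixed root path.

    Hypothetical helper for illustration; only the configuration keys
    themselves come from Flink.
    """
    if not root_path.startswith("/"):
        raise ValueError("ZooKeeper paths must start with '/'")
    return [
        "high-availability: zookeeper",
        f"high-availability.zookeeper.quorum: {zk_quorum}",
        # Keep this value identical between runs so a new cluster finds the old state.
        f"high-availability.zookeeper.path.root: {root_path}",
        # How often YARN may restart the ApplicationMaster (JobManager).
        f"yarn.application-attempts: {app_attempts}",
    ]


# Example values are assumptions, not taken from the thread.
print("\n".join(ha_settings("zk1:2181,zk2:2181,zk3:2181", "/flink-orders-job")))
```

The printed lines would go into the flink-conf.yaml used for both the original and the replacement deployment.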

#2 In current Flink we cannot expand a running Flink job! Yarn will see new machines being added to the cluster, and it can use them for future Flink deployments on YARN (as you said).
We are working on adding support for dynamically changing the Flink hardware allocations as part of FLIP-6.

Please keep asking and bugging us if we are not responding. It's just that most Flink developers are quite busy with the 1.3 release right now.

Regards,
Robert

On Wed, May 24, 2017 at 3:36 PM, Robert Metzger <rm...@apache.org> wrote:
Hi Ankit,

I'm sorry that nobody is responding to the message. I'll try to find somebody.

On Tue, May 23, 2017 at 10:27 PM, Jain, Ankit <an...@here.com> wrote:
Following up on this.

From: "Jain, Ankit" <an...@here.com>
Date: Tuesday, May 16, 2017 at 12:14 AM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Bringing it back to list’s focus.

From: "Jain, Ankit" <an...@here.com>
Date: Thursday, May 11, 2017 at 1:19 PM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Got the answer on #2; it looks like that will work. Still looking for suggestions on #1.

Thanks
Ankit

From: "Jain, Ankit" <an...@here.com>
Date: Thursday, May 11, 2017 at 8:26 AM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Following up further on this.

1) We are using a long-running EMR cluster to submit jobs right now and, as you know, EMR hasn’t made the Yarn ResourceManager HA.

Is there any way we can use the information put in ZooKeeper by the Flink JobManager to bring the jobs back up on a new EMR cluster if the RM goes down?

We are not looking for a completely automated option, but could we maybe write a script that reads ZooKeeper and restarts all jobs on a fresh EMR cluster?

I am assuming that if the Yarn ResourceManager goes down, there is no way to just bring it back up and you have to start a new EMR cluster?

2) Regarding elasticity, I know that for now a running Flink cluster can’t make use of new hosts added to EMR, but I am guessing Yarn will still see the new hosts and new Flink jobs can make use of them. Is that right?
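For the script idea in question 1, a rough sketch of the "read ZooKeeper" half could look like the following. It assumes that recoverable jobs appear as child znodes under `<root>/<cluster-id>/jobgraphs`; that layout, the helper names, and the sample IDs are assumptions to verify against the Flink version in use. The `zk` argument is any client exposing `get_children(path)`, which kazoo's `KazooClient` does; a small fake stands in here so the sketch runs without a ZooKeeper server. The znodes are expected to hold only handles, so the actual job data must also survive in the HA storage directory.

```python
def job_graph_path(root, cluster_id, jobgraphs="/jobgraphs"):
    """Compose the znode assumed to hold one child per recoverable job."""
    return root.rstrip("/") + "/" + cluster_id.strip("/") + jobgraphs


def recoverable_job_ids(zk, root, cluster_id):
    """Return the sorted job IDs found under the assumed job graph znode."""
    return sorted(zk.get_children(job_graph_path(root, cluster_id)))


class FakeZk:
    """Stand-in client so the sketch runs without a ZooKeeper server."""

    def __init__(self, tree):
        self._tree = tree

    def get_children(self, path):
        # A real client would raise an error for a missing node instead.
        return self._tree.get(path, [])


# Hypothetical root path, cluster ID, and job IDs.
fake = FakeZk({"/flink/orders/jobgraphs": ["b2f1", "a9c0"]})
print(recoverable_job_ids(fake, "/flink", "orders"))  # ['a9c0', 'b2f1']
```

A real script would then decide, per job ID, how to resubmit or recover the job on the new cluster; that part is deliberately left out.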

Thanks
Ankit

From: "Jain, Ankit" <an...@here.com>
Date: Monday, May 8, 2017 at 9:09 AM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Thanks Stephan. We will go with a central ZooKeeper instance and hopefully have it started through a CloudFormation script as part of EMR startup.

Is ZooKeeper also used to keep track of checkpoint metadata and the execution graph of the running job in order to recover from ApplicationMaster failure, as Aljoscha was guessing below, or only for leader election in case multiple ApplicationMasters are accidentally running?

Thanks
Ankit

From: Stephan Ewen <se...@apache.org>
Date: Monday, May 8, 2017 at 9:00 AM
To: "user@flink.apache.org" <us...@flink.apache.org>, "Jain, Ankit" <an...@here.com>
Subject: Re: High Availability on Yarn

@Ankit:

ZooKeeper is required in YARN setups still. Even if there is only one JobManager in the normal case, Yarn can accidentally create a second one when there is a network partition.
To prevent this from leading to inconsistencies, we use ZooKeeper.
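One way to picture why a coordination service helps in the two-JobManager case is a fencing check: every new leader gets a higher epoch, and writes from an older epoch are rejected. The sketch below is a toy model of that idea only. The class and method names are invented for illustration and this is not how Flink or ZooKeeper is implemented internally.

```python
class FencedStore:
    """Toy shared store that accepts writes only from the latest leader."""

    def __init__(self):
        self.current_epoch = 0
        self.checkpoints = []

    def grant_leadership(self):
        """Hand out a new, strictly higher epoch to whoever becomes leader."""
        self.current_epoch += 1
        return self.current_epoch

    def record_checkpoint(self, epoch, checkpoint_id):
        """Reject the write if it comes from a stale (superseded) leader."""
        if epoch != self.current_epoch:
            return False
        self.checkpoints.append(checkpoint_id)
        return True


store = FencedStore()
old_jm = store.grant_leadership()  # first JobManager becomes leader
new_jm = store.grant_leadership()  # a second one is started during a partition
print(store.record_checkpoint(old_jm, "chk-7"))  # False
print(store.record_checkpoint(new_jm, "chk-7"))  # True
```

Only the most recently elected JobManager can change shared state, so the older instance cannot corrupt it even if it is still running.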

Flink uses ZooKeeper very little, so you can just let Flink attach to any existing ZooKeeper, or use one ZooKeeper cluster for very many Flink clusters/jobs.

Stephan

On Mon, May 8, 2017 at 2:11 PM, Aljoscha Krettek <al...@apache.org> wrote:
Hi,
Yes, it’s recommended to use one ZooKeeper cluster for all Flink clusters.

Best,
Aljoscha

On 5. May 2017, at 16:56, Jain, Ankit <an...@here.com> wrote:

Thanks for the update Aljoscha.

@Till Rohrmann <tr...@apache.org>,
Can you please chime in?

Also, we currently have a long-running EMR cluster where we create one Flink cluster per job. Can we just choose to install ZooKeeper when creating the EMR cluster and use one ZooKeeper instance for ALL of the Flink jobs?
Or is the recommendation to have a dedicated ZooKeeper instance per Flink job?

Thanks
Ankit

From: Aljoscha Krettek <al...@apache.org>
Date: Thursday, May 4, 2017 at 1:19 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <tr...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
Yes, for YARN there is only one running JobManager. As far as I know, in this case ZooKeeper is only used to keep track of checkpoint metadata and the execution graph of the running job, so that a restoring JobManager can pick up the data again.

I’m not 100% sure on this, though, so maybe Till can shed some light on this.

Best,
Aljoscha
On 3. May 2017, at 16:58, Jain, Ankit <an...@here.com> wrote:

Thanks for your reply Aljoscha.

After building a better understanding of Yarn and spending a copious amount of time in the Flink codebase, I think I now get how Flink and Yarn interact. I plan to document this soon in case it could help somebody starting afresh with Flink on Yarn.

Regarding ZooKeeper: in YARN mode there is only one JobManager running, so do we still need leader election?

If the ApplicationMaster (where the JM runs) goes down, it is restarted by the Yarn RM, and while restarting, the Flink AM will bring back the previously running containers. So, where does ZooKeeper sit in this setup?

Thanks
Ankit

From: Aljoscha Krettek <al...@apache.org>
Date: Wednesday, May 3, 2017 at 2:05 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <tr...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
As a first comment, the work mentioned in the FLIP-6 doc you linked is still work-in-progress. You cannot use these abstractions yet without going into the code and setting up a cluster “by hand”.

The documentation for one-step deployment of a Job to YARN is available here: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/yarn_setup.html#run-a-single-flink-job-on-yarn
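As a rough illustration of the difference between the two-step and one-step paths, the sketch below builds the command lines for each. The script names and flags are written as I recall them from the 1.2-era YARN setup page and should be checked against `--help` for the installed release. The helper names, resource sizes, and jar path are placeholders.

```python
import shlex


def session_then_submit(jar, containers=4, jm_mb=1024, tm_mb=4096):
    """Two steps: start a YARN session, then submit the job to it."""
    start_session = ["./bin/yarn-session.sh", "-n", str(containers),
                     "-jm", str(jm_mb), "-tm", str(tm_mb)]
    submit_job = ["./bin/flink", "run", jar]
    return [start_session, submit_job]


def single_job(jar, containers=4, jm_mb=1024, tm_mb=4096):
    """One step: bring up a per-job YARN cluster and run the job on it."""
    return ["./bin/flink", "run", "-m", "yarn-cluster",
            "-yn", str(containers), "-yjm", str(jm_mb), "-ytm", str(tm_mb), jar]


def render(cmd):
    """Quote each argument so the line can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)


for cmd in session_then_submit("./examples/batch/WordCount.jar"):
    print(render(cmd))
print(render(single_job("./examples/batch/WordCount.jar")))
```

The functions only build argument lists; nothing is executed, so the sketch is safe to run anywhere.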

Regarding your third question, ZooKeeper is mostly used for discovery and leader election. That is, JobManagers use it to decide who is the main JM and who are standby JMs. TaskManagers use it to discover the leading JobManager that they should connect to.
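The two roles described above, election and discovery, can be pictured with a small in-memory model: candidates register, the one with the lowest sequence number leads, and anyone can ask who the current leader is. This is similar in spirit to ZooKeeper election recipes built on sequential nodes, but it is only a toy. The class name and the addresses are made up, and nothing here is Flink's actual implementation.

```python
class ToyCoordinator:
    """In-memory stand-in for a coordination service (not ZooKeeper itself)."""

    def __init__(self):
        self._next_seq = 0
        self._candidates = {}  # sequence number -> JobManager address

    def register(self, address):
        """A JobManager announces itself and receives a sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._candidates[seq] = address
        return seq

    def unregister(self, seq):
        """Called when a JobManager dies or its session expires."""
        self._candidates.pop(seq, None)

    def leader(self):
        """Discovery: TaskManagers ask this to find the leading JobManager."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


coord = ToyCoordinator()
first = coord.register("jm-a:6123")
second = coord.register("jm-b:6123")
print(coord.leader())  # jm-a:6123
coord.unregister(first)
print(coord.leader())  # jm-b:6123
```

When the leading JobManager disappears, the standby with the next-lowest number takes over, and a TaskManager that asks again is pointed at the new leader.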

I’m also cc’ing Till, who should know this stuff better and can maybe explain it in a bit more detail.

Best,
Aljoscha

Re: High Availability on Yarn

Posted by Robert Metzger <rm...@apache.org>.

Hi Ankit,

I realized I can answer your questions myself :)

#1 I think that's possible, by using the same
high-availability.zookeeper.path.root configuration parameter between the
runs.
By default, on YARN we are using the YARN application ID as the root path,
but if you are putting a custom one there, Flink will recover running jobs
even if you are starting a new EMR cluster (assuming the files are in s3).

#2 In current Flink we cannot expand a running Flink job! Yarn will see
new machines being added to the cluster, and it can use them for future
Flink deployments on YARN (as you said).
We are working on adding support for dynamically changing the Flink
hardware allocations as part of FLIP-6.


Please keep asking and bugging us if we are not responding. It's just that
most Flink developers are quite busy with the 1.3 release right now.

Regards,
Robert



Re: High Availability on Yarn

Posted by Robert Metzger <rm...@apache.org>.

Hi Ankit,

I'm sorry that nobody is responding to the message. I'll try to find
somebody.

On Tue, May 23, 2017 at 10:27 PM, Jain, Ankit <an...@here.com> wrote:

> Following up on this.
>
>
>
> *From: *"Jain, Ankit" <an...@here.com>
> *Date: *Tuesday, May 16, 2017 at 12:14 AM
>
> *To: *Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <
> user@flink.apache.org>
> *Subject: *Re: High Availability on Yarn
>
>
>
> Bringing it back to list’s focus.
>
>
>
> *From: *"Jain, Ankit" <an...@here.com>
> *Date: *Thursday, May 11, 2017 at 1:19 PM
> *To: *Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <
> user@flink.apache.org>
> *Subject: *Re: High Availability on Yarn
>
>
>
> Got the answer on #2, looks like that will work, still looking for
> suggestions on #1.
>
>
>
> Thanks
>
> Ankit
>
>
>
> *From: *"Jain, Ankit" <an...@here.com>
> *Date: *Thursday, May 11, 2017 at 8:26 AM
> *To: *Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <
> user@flink.apache.org>
> *Subject: *Re: High Availability on Yarn
>
>
>
> Following up further on this.
>
>
>
> 1)      We are using a long running EMR cluster to submit jobs right now
> and as you know EMR hasn’t made Yarn ResourceManager HA.
>
> Is there any way we can use the information put in Zookeeper by Flink Job
> Manager to bring the jobs back up on a new EMR cluster if RM goes down?
>
>
>
> We are not looking for completely automated option but maybe write a
> script which reads Zookeeper and re-starts all jobs on a fresh EMR cluster?
>
> I am assuming if  Yarn ResouceManager goes down, there is no way to just
> bring it back up – you have to start a new EMR cluster?
>
>
>
> 2)      Regarding elasticity, I know for now a running flink cluster
> can’t make use of new hosts added to EMR but can I am guessing Yarn will
> still see the new hosts and new flink jobs can make use it, is that right?
>
>
>
>
>
> Thanks
>
> Ankit
>
>
>
> *From: *"Jain, Ankit" <an...@here.com>
> *Date: *Monday, May 8, 2017 at 9:09 AM
> *To: *Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <
> user@flink.apache.org>
> *Subject: *Re: High Availability on Yarn
>
>
>
> Thanks Stephan – we will go with a central ZooKeeper Instance and
> hopefully have it started through a cloudformation script as part of EMR
> startup.
>
>
>
> Is Zk also used to keep track of checkpoint metadata and the execution
> graph of the running job to recover from ApplicationMaster failure as
> Aljoscha was guessing below or only for leader election in case of
> accidently running multiple Application Masters ?
>
>
>
> Thanks
>
> Ankit
>
>
>
> *From: *Stephan Ewen <se...@apache.org>
> *Date: *Monday, May 8, 2017 at 9:00 AM
> *To: *"user@flink.apache.org" <us...@flink.apache.org>, "Jain, Ankit" <
> ankit.jain@here.com>
> *Subject: *Re: High Availability on Yarn
>
>
>
> @Ankit:
>
>
>
> ZooKeeper is required in YARN setups still. Even if there is only one
> JobManager in the normal case, Yarn can accidentally create a second one
> when there is a network partition.
>
> To prevent that this leads to inconsistencies, we use ZooKeeper.
>
>
>
> Flink uses ZooKeeper very little, so you can just let Flink attach to any
> existing ZooKeeper, or user one ZooKeeper cluster for very many Flink
> clusters/jobs.
>
>
>
> Stephan
>
>
>
>
>
> On Mon, May 8, 2017 at 2:11 PM, Aljoscha Krettek <al...@apache.org>
> wrote:
>
> Hi,
>
> Yes, it’s recommended to use one ZooKeeper cluster for all Flink clusters.
>
>
>
> Best,
>
> Aljoscha
>
>
>
> On 5. May 2017, at 16:56, Jain, Ankit <an...@here.com> wrote:
>
>
>
> Thanks for the update Aljoscha.
>
>
>
> @Till Rohrmann <tr...@apache.org>,
>
> Can you please chim in?
>
>
>
> Also, we currently have a long running EMR cluster where we create one
> flink cluster per job – can we just choose to install Zookeeper when
> creating the EMR cluster and use one Zookeeper instance for ALL of flink
> jobs?
>
> Or
>
> Recommendation is to have a dedicated Zookeeper instance per flink job?
>
>
>
> Thanks
>
> Ankit
>
>
>
> *From: *Aljoscha Krettek <al...@apache.org>
> *Date: *Thursday, May 4, 2017 at 1:19 AM
> *To: *"Jain, Ankit" <an...@here.com>
> *Cc: *"user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <
> trohrmann@apache.org>
> *Subject: *Re: High Availability on Yarn
>
>
>
> Hi,
>
> Yes, for YARN there is only one running JobManager. As far as I Know, In
> this case ZooKeeper is only used to keep track of checkpoint metadata and
> the execution graph of the running job. Such that a restoring JobManager
> can pick up the data again.
>
>
>
> I’m not 100 % sure on this, though, so maybe Till can shed some light on
> this.
>
>
>
> Best,
>
> Aljoscha
>
> On 3. May 2017, at 16:58, Jain, Ankit <an...@here.com> wrote:
>
>
>
> Thanks for your reply Aljoscha.
>
>
>
> After building better understanding of Yarn and spending copious amount of
> time on Flink codebase, I think I now get how Flink & Yarn interact – I
> plan to document this soon in case it could help somebody starting afresh
> with Flink-Yarn.
>
>
>
> Regarding Zookeper, in YARN mode there is only one JobManager running, do
> we still need leader election?
>
>
>
> If the ApplicationMaster goes down (where JM runs) it is restarted by Yarn
> RM and while restarting, Flink AM will bring back previous running
> containers.  So, where does Zookeeper sit in this setup?
>
>
>
> Thanks
>
> Ankit
>
>
>
> *From: *Aljoscha Krettek <al...@apache.org>
> *Date: *Wednesday, May 3, 2017 at 2:05 AM
> *To: *"Jain, Ankit" <an...@here.com>
> *Cc: *"user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <
> trohrmann@apache.org>
> *Subject: *Re: High Availability on Yarn
>
>
>
> Hi,
>
> As a first comment, the work mentioned in the FLIP-6 doc you linked is
> still work-in-progress. You cannot use these abstractions yet without going
> into the code and setting up a cluster “by hand”.
>
>
>
> The documentation for one-step deployment of a Job to YARN is available
> here: https://ci.apache.org/projects/flink/flink-docs-
> release-1.2/setup/yarn_setup.html#run-a-single-flink-job-on-yarn
> <https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fci.apache.org%2Fprojects%2Fflink%2Fflink-docs-release-1.2%2Fsetup%2Fyarn_setup.html%23run-a-single-flink-job-on-yarn&data=01%7C01%7C%7Cfb13823970ba4476ebbf08d49203846c%7C6d4034cd72254f72b85391feaea64919%7C1&sdata=fzBiQtv7MR2%2Fehg6GepwPa1uWxpqEgPJakto2B8k0Zk%3D&reserved=0>
>
>
>
> Regarding your third question, ZooKeeper is mostly used for discovery and
> leader election. That is, JobManagers use it to decide who is the main JM
> and who are standby JMs. TaskManagers use it to discover the leading
> JobManager that they should connect to.
>
>
>
> I’m also cc’ing Till, who should know this stuff better and can maybe
> explain it in a bit more detail.
>
>
>
> Best,
>
> Aljoscha
>
> On 1. May 2017, at 18:59, Jain, Ankit <an...@here.com> wrote:
>
>
>
> Hi fellow users,
>
> We are trying to straighten out high availability story for flink.
>
>
>
> Our setup includes a long running EMR cluster, job submission is a
> two-step process – 1) Flink cluster is first created using flink yarn
> client on the EMR cluster already running 2) Flink job is submitted.
>
>
>
> I also saw references that with 1.2, these two steps have been combined
> into 1 – is that change in FlinkYarnSessionCli.java? Can somebody point to
> documentation please?
>
>
>
> W/o worrying about Yarn RM (not Flink Yarn RM that seems to be newly
> introduced) failure for now, I want to understand first how task manager &
> job manager failures are handled.
>
>
>
> My questions-
>
> 1)       https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=65147077
> <https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fpages%2Fviewpage.action%3FpageId%3D65147077&data=01%7C01%7C%7Cfb13823970ba4476ebbf08d49203846c%7C6d4034cd72254f72b85391feaea64919%7C1&sdata=29Se7mWQZ09ukF3rkQNmSRPXY4RkA8RCNO4ec4Glj8I%3D&reserved=0>
>  suggests a new RM has been added and now there is one JobManager for
> each job. Since Yarn RM will now talk to Flink RM( instead of JobManager
> previously), will Yarn automatically restart failing Flink RM?
>
> 2)       Is there any documentation on behavior of new Flink RM that will
> come up? How will previously running JobManagers & TaskManagers find out
> about new RM?
>
> 3)       https://ci.apache.org/projects/flink/flink-docs-
> release-1.3/setup/jobmanager_high_availability.html#configuration
> <https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fci.apache.org%2Fprojects%2Fflink%2Fflink-docs-release-1.3%2Fsetup%2Fjobmanager_high_availability.html%23configuration&data=01%7C01%7C%7Cfb13823970ba4476ebbf08d49203846c%7C6d4034cd72254f72b85391feaea64919%7C1&sdata=nYTYWaaWA4T1D7EwvL%2B7mwhrVcqn6xTzCv8SS6x%2FqLM%3D&reserved=0>
>  requires configuring Zookeeper even for Yarn – Is this needed for
> handling Task Manager failures or JM or both? Will Yarn not take care of JM
> failures?
>
>
>
> It may sound like I am little confused between role of Yarn and Flink
> components– who has the most burden of HA? Documentation in current state
> is lacking clarity – I know it is still evolving.
>
>
>
> Please let me know if somebody can help clear the confusion.
>
>
>
> Thanks
>
> Ankit

Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Following up on this.

From: "Jain, Ankit" <an...@here.com>
Date: Tuesday, May 16, 2017 at 12:14 AM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Bringing it back to list’s focus.

From: "Jain, Ankit" <an...@here.com>
Date: Thursday, May 11, 2017 at 1:19 PM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Got the answer on #2, looks like that will work, still looking for suggestions on #1.

Thanks
Ankit

From: "Jain, Ankit" <an...@here.com>
Date: Thursday, May 11, 2017 at 8:26 AM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Following up further on this.

1)      We are using a long running EMR cluster to submit jobs right now and as you know EMR hasn’t made Yarn ResourceManager HA.

Is there any way we can use the information put in Zookeeper by Flink Job Manager to bring the jobs back up on a new EMR cluster if RM goes down?

We are not looking for completely automated option but maybe write a script which reads Zookeeper and re-starts all jobs on a fresh EMR cluster?

I am assuming that if the Yarn ResourceManager goes down, there is no way to just bring it back up – you have to start a new EMR cluster?

2)      Regarding elasticity, I know that for now a running Flink cluster can’t make use of new hosts added to EMR, but I am guessing Yarn will still see the new hosts and new Flink jobs can make use of them, is that right?

Thanks
Ankit

From: "Jain, Ankit" <an...@here.com>
Date: Monday, May 8, 2017 at 9:09 AM
To: Stephan Ewen <se...@apache.org>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: High Availability on Yarn

Thanks Stephan – we will go with a central ZooKeeper Instance and hopefully have it started through a cloudformation script as part of EMR startup.

Is ZK also used to keep track of checkpoint metadata and the execution graph of the running job, to recover from ApplicationMaster failure as Aljoscha was guessing below, or only for leader election in case multiple ApplicationMasters are accidentally running?

Thanks
Ankit

From: Stephan Ewen <se...@apache.org>
Date: Monday, May 8, 2017 at 9:00 AM
To: "user@flink.apache.org" <us...@flink.apache.org>, "Jain, Ankit" <an...@here.com>
Subject: Re: High Availability on Yarn

@Ankit:

ZooKeeper is still required in YARN setups. Even if there is only one JobManager in the normal case, YARN can accidentally create a second one when there is a network partition.
To prevent this from leading to inconsistencies, we use ZooKeeper.

Flink uses ZooKeeper very little, so you can just let Flink attach to any existing ZooKeeper, or use one ZooKeeper cluster for very many Flink clusters/jobs.

Stephan
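
For readers setting this up, here is a minimal sketch of how several Flink clusters might share one ZooKeeper quorum while keeping their state apart. The key names are the ones I recall from the Flink 1.2/1.3 HA documentation and may differ in other releases, so check them against the docs for your version. The host names, the "/flink/<cluster>" layout and the helper names ha_settings and render are assumptions made up for illustration.

```python
# Sketch only: builds flink-conf.yaml style HA entries for one Flink cluster.
# Key names are recalled from the Flink 1.2/1.3 docs (assumption: verify them
# for your release). Hosts and paths below are hypothetical.

def ha_settings(zk_quorum, cluster_name, storage_dir, attempts=10):
    """Return HA settings that give each cluster its own ZooKeeper root."""
    return {
        "high-availability": "zookeeper",
        # The same quorum is shared by every cluster.
        "high-availability.zookeeper.quorum": zk_quorum,
        # A distinct root znode per cluster keeps leader and checkpoint
        # pointers of different clusters from colliding.
        "high-availability.zookeeper.path.root": "/flink/" + cluster_name,
        # JobManager recovery metadata lives on a shared file system;
        # ZooKeeper only holds pointers to it.
        "high-availability.zookeeper.storageDir": storage_dir + "/" + cluster_name,
        # How many times YARN may restart the ApplicationMaster.
        "yarn.application-attempts": str(attempts),
    }


def render(settings):
    """Render the settings as 'key: value' lines."""
    return "\n".join(key + ": " + value for key, value in settings.items())


if __name__ == "__main__":
    quorum = "zk1.example.internal:2181,zk2.example.internal:2181"
    for name in ("job-a", "job-b"):
        print(render(ha_settings(quorum, name, "hdfs:///flink/recovery")))
        print()
```

Both clusters point at the same quorum; only the root path and the storage directory differ, which is the property that lets one ZooKeeper ensemble serve many jobs.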

On Mon, May 8, 2017 at 2:11 PM, Aljoscha Krettek <al...@apache.org> wrote:
Hi,
Yes, it’s recommended to use one ZooKeeper cluster for all Flink clusters.

Best,
Aljoscha

On 5. May 2017, at 16:56, Jain, Ankit <an...@here.com> wrote:

Thanks for the update Aljoscha.

@Till Rohrmann,
Can you please chime in?

Also, we currently have a long-running EMR cluster where we create one Flink cluster per job – can we just install ZooKeeper when creating the EMR cluster and use one ZooKeeper instance for ALL Flink jobs?
Or is the recommendation to have a dedicated ZooKeeper instance per Flink job?

Thanks
Ankit

From: Aljoscha Krettek <al...@apache.org>
Date: Thursday, May 4, 2017 at 1:19 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <tr...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
Yes, for YARN there is only one running JobManager. As far as I know, in this case ZooKeeper is only used to keep track of checkpoint metadata and the execution graph of the running job, so that a restoring JobManager can pick up the data again.

I’m not 100 % sure on this, though, so maybe Till can shed some light on this.

Best,
Aljoscha
On 3. May 2017, at 16:58, Jain, Ankit <an...@here.com> wrote:

Thanks for your reply Aljoscha.

After building a better understanding of Yarn and spending a copious amount of time on the Flink codebase, I think I now get how Flink & Yarn interact – I plan to document this soon in case it could help somebody starting afresh with Flink-Yarn.

Regarding ZooKeeper: in YARN mode there is only one JobManager running, so do we still need leader election?

If the ApplicationMaster (where the JM runs) goes down, it is restarted by the Yarn RM, and while restarting, the Flink AM will bring back the previously running containers. So, where does ZooKeeper sit in this setup?

Thanks
Ankit

From: Aljoscha Krettek <al...@apache.org>
Date: Wednesday, May 3, 2017 at 2:05 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <tr...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
As a first comment, the work mentioned in the FLIP-6 doc you linked is still work-in-progress. You cannot use these abstractions yet without going into the code and setting up a cluster “by hand”.

The documentation for one-step deployment of a Job to YARN is available here: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/yarn_setup.html#run-a-single-flink-job-on-yarn

Regarding your third question, ZooKeeper is mostly used for discovery and leader election. That is, JobManagers use it to decide who is the main JM and who are standby JMs. TaskManagers use it to discover the leading JobManager that they should connect to.

I’m also cc’ing Till, who should know this stuff better and can maybe explain it in a bit more detail.

Best,
Aljoscha
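
To make the discovery and leader-election roles described above concrete, here is a toy, in-memory sketch. It is not a ZooKeeper client and not Flink's implementation. The class ToyCoordinator and the addresses are hypothetical. It only illustrates the idea that one JobManager attempt wins leadership, a duplicate attempt (for example one started by YARN during a network partition) is refused, and TaskManagers look up the current leader instead of guessing.

```python
import threading


class ToyCoordinator:
    """In-memory stand-in for a coordination service (NOT ZooKeeper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._leader = None  # address of the current leader, if any
        self._epoch = 0      # incremented on every leadership change

    def try_acquire(self, address):
        """Return a new epoch if `address` became leader, else None."""
        with self._lock:
            if self._leader is None:
                self._leader = address
                self._epoch += 1
                return self._epoch
            return None

    def release(self, address):
        """Give up leadership, e.g. when the leading process dies."""
        with self._lock:
            if self._leader == address:
                self._leader = None

    def current_leader(self):
        """What a TaskManager would query to find the JobManager."""
        with self._lock:
            return self._leader, self._epoch


def demo():
    coordinator = ToyCoordinator()
    first = coordinator.try_acquire("jm-attempt-1:6123")
    # A second attempt started while the first is still alive is refused.
    second = coordinator.try_acquire("jm-attempt-2:6123")
    seen_by_taskmanager = coordinator.current_leader()
    # Once the first attempt is gone, the second can take over.
    coordinator.release("jm-attempt-1:6123")
    takeover = coordinator.try_acquire("jm-attempt-2:6123")
    return first, second, seen_by_taskmanager, takeover


if __name__ == "__main__":
    print(demo())
```

The epoch counter stands in for the fencing role: a TaskManager that remembers the epoch can tell a stale leader from the current one.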
On 1. May 2017, at 18:59, Jain, Ankit <an...@here.com> wrote:

Hi fellow users,
We are trying to straighten out the high availability story for Flink.

Our setup includes a long-running EMR cluster, and job submission is a two-step process – 1) a Flink cluster is first created using the Flink YARN client on the already running EMR cluster, 2) the Flink job is submitted.

I also saw references that with 1.2, these two steps have been combined into 1 – is that change in FlinkYarnSessionCli.java? Can somebody point to documentation please?

W/o worrying about Yarn RM (not Flink Yarn RM that seems to be newly introduced) failure for now, I want to understand first how task manager & job manager failures are handled.

My questions-
1)       https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 suggests a new RM has been added and now there is one JobManager for each job. Since Yarn RM will now talk to Flink RM (instead of JobManager previously), will Yarn automatically restart failing Flink RM?
2)       Is there any documentation on behavior of new Flink RM that will come up? How will previously running JobManagers & TaskManagers find out about new RM?
3)       https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#configuration requires configuring Zookeeper even for Yarn – Is this needed for handling Task Manager failures or JM or both? Will Yarn not take care of JM failures?

It may sound like I am a little confused about the roles of Yarn and Flink components – who carries most of the burden of HA? The documentation in its current state lacks clarity – I know it is still evolving.

Please let me know if somebody can help clear the confusion.

Thanks
Ankit

Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Bringing it back to list’s focus.

Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Got the answer on #2, looks like that will work, still looking for suggestions on #1.

Thanks
Ankit

Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Following up further on this.


1)       We are using a long running EMR cluster to submit jobs right now and as you know EMR hasn’t made Yarn ResourceManager HA.

Is there any way we can use the information put in Zookeeper by Flink Job Manager to bring the jobs back up on a new EMR cluster if RM goes down?



We are not looking for completely automated option but maybe write a script which reads Zookeeper and re-starts all jobs on a fresh EMR cluster?

I am assuming that if the Yarn ResourceManager goes down, there is no way to just bring it back up – you have to start a new EMR cluster?



2)       Regarding elasticity, I know that for now a running Flink cluster can’t make use of new hosts added to EMR, but I am guessing Yarn will still see the new hosts and new Flink jobs can make use of them, is that right?


Thanks
Ankit

Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Thanks Stephan – we will go with a central ZooKeeper instance and hopefully have it started through a CloudFormation script as part of EMR startup.

Is ZooKeeper also used to keep track of checkpoint metadata and the execution graph of the running job, to recover from an ApplicationMaster failure as Aljoscha was guessing below, or only for leader election in case multiple ApplicationMasters are accidentally running?

Thanks
Ankit

From: Stephan Ewen <se...@apache.org>
Date: Monday, May 8, 2017 at 9:00 AM
To: "user@flink.apache.org" <us...@flink.apache.org>, "Jain, Ankit" <an...@here.com>
Subject: Re: High Availability on Yarn

@Ankit:

ZooKeeper is still required in YARN setups. Even if there is only one JobManager in the normal case, YARN can accidentally create a second one when there is a network partition.
To prevent this from leading to inconsistencies, we use ZooKeeper.

Flink uses ZooKeeper very little, so you can just let Flink attach to any existing ZooKeeper, or use one ZooKeeper cluster for very many Flink clusters/jobs.

Stephan

On Mon, May 8, 2017 at 2:11 PM, Aljoscha Krettek <al...@apache.org> wrote:
Hi,
Yes, it’s recommended to use one ZooKeeper cluster for all Flink clusters.

Best,
Aljoscha

On 5. May 2017, at 16:56, Jain, Ankit <an...@here.com> wrote:

Thanks for the update Aljoscha.

@Till Rohrmann,
Can you please chime in?

Also, we currently have a long running EMR cluster where we create one Flink cluster per job – can we just choose to install ZooKeeper when creating the EMR cluster and use one ZooKeeper instance for ALL Flink jobs?
Or is the recommendation to have a dedicated ZooKeeper instance per Flink job?

Thanks
Ankit

From: Aljoscha Krettek <al...@apache.org>
Date: Thursday, May 4, 2017 at 1:19 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <tr...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
Yes, for YARN there is only one running JobManager. As far as I know, in this case ZooKeeper is only used to keep track of checkpoint metadata and the execution graph of the running job, so that a recovering JobManager can pick up the data again.

I’m not 100% sure on this, though, so maybe Till can shed some light on this.

Best,
Aljoscha
On 3. May 2017, at 16:58, Jain, Ankit <an...@here.com> wrote:

Thanks for your reply Aljoscha.

After building a better understanding of YARN and spending a copious amount of time in the Flink codebase, I think I now get how Flink & YARN interact – I plan to document this soon in case it could help somebody starting afresh with Flink on YARN.

Regarding ZooKeeper: in YARN mode there is only one JobManager running, so do we still need leader election?

If the ApplicationMaster (where the JM runs) goes down, it is restarted by the YARN RM, and while restarting, the Flink AM will bring back the previously running containers. So, where does ZooKeeper sit in this setup?

Thanks
Ankit

From: Aljoscha Krettek <al...@apache.org>
Date: Wednesday, May 3, 2017 at 2:05 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <tr...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
As a first comment, the work mentioned in the FLIP-6 doc you linked is still work-in-progress. You cannot use these abstractions yet without going into the code and setting up a cluster “by hand”.

The documentation for one-step deployment of a Job to YARN is available here: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/yarn_setup.html#run-a-single-flink-job-on-yarn

Regarding your third question, ZooKeeper is mostly used for discovery and leader election. That is, JobManagers use it to decide who is the main JM and who are standby JMs. TaskManagers use it to discover the leading JobManager that they should connect to.
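
The discovery and leader election described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyLeaderRegistry` and the `jm-1:6123` style addresses are hypothetical and are not Flink or ZooKeeper APIs. The registry stands in for ZooKeeper, the oldest registered JobManager is treated as the leader, and TaskManagers would call `leader()` to find out where to connect.

```python
# Toy model only: NOT the Flink or ZooKeeper API. All names are hypothetical.
class ToyLeaderRegistry:
    """Stands in for ZooKeeper: tracks JobManager candidates in arrival order."""

    def __init__(self):
        self._candidates = []  # registration order decides who leads

    def register(self, address):
        # A JobManager announces itself; the oldest live entry is the leader.
        self._candidates.append(address)

    def remove(self, address):
        # Called when a JobManager disappears (crash, lost session).
        self._candidates.remove(address)

    def leader(self):
        # TaskManagers ask here which JobManager to connect to.
        return self._candidates[0] if self._candidates else None


registry = ToyLeaderRegistry()
registry.register("jm-1:6123")   # becomes the main JM
registry.register("jm-2:6123")   # standby JM
before = registry.leader()

registry.remove("jm-1:6123")     # main JM fails
after = registry.leader()        # standby takes over

print(before, after)
```

In this model a TaskManager that re-queries the registry after the failure is pointed at the standby, which is the role the thread attributes to ZooKeeper.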

I’m also cc’ing Till, who should know this stuff better and can maybe explain it in a bit more detail.

Best,
Aljoscha
On 1. May 2017, at 18:59, Jain, Ankit <an...@here.com> wrote:

Hi fellow users,
We are trying to straighten out high availability story for flink.

Our setup includes a long running EMR cluster, job submission is a two-step process – 1) Flink cluster is first created using flink yarn client on the EMR cluster already running 2) Flink job is submitted.

I also saw references that with 1.2, these two steps have been combined into 1 – is that change in FlinkYarnSessionCli.java? Can somebody point to documentation please?

W/o worrying about Yarn RM (not Flink Yarn RM that seems to be newly introduced) failure for now, I want to understand first how task manager & job manager failures are handled.

My questions-
1)       https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 suggests a new RM has been added and now there is one JobManager for each job. Since Yarn RM will now talk to Flink RM (instead of JobManager previously), will Yarn automatically restart failing Flink RM?
2)       Is there any documentation on behavior of new Flink RM that will come up? How will previously running JobManagers & TaskManagers find out about new RM?
3)       https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#configuration requires configuring Zookeeper even for Yarn – Is this needed for handling Task Manager failures or JM or both? Will Yarn not take care of JM failures?

It may sound like I am little confused between role of Yarn and Flink components– who has the most burden of HA? Documentation in current state is lacking clarity – I know it is still evolving.

Please let me know if somebody can help clear the confusion.

Thanks
Ankit

Re: High Availability on Yarn

Posted by Stephan Ewen <se...@apache.org>.

@Ankit:

ZooKeeper is still required in YARN setups. Even if there is only one
JobManager in the normal case, YARN can accidentally create a second one
when there is a network partition.
To prevent this from leading to inconsistencies, we use ZooKeeper.

Flink uses ZooKeeper very little, so you can just let Flink attach to any
existing ZooKeeper, or use one ZooKeeper cluster for very many Flink
clusters/jobs.
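
To make the network-partition case concrete, here is a minimal sketch of how a coordination service can keep two JobManagers from both acting as leader. It is a toy model under stated assumptions: `ToyCoordinator`, its methods, and the `jobmanager-attempt-N` names are hypothetical and are not Flink's or ZooKeeper's API. The idea shown is a fencing token (an epoch) that is invalidated when leadership changes.

```python
# Toy model only: illustrates why a coordination service is needed when two
# JobManagers may briefly coexist. This is NOT Flink's or ZooKeeper's API.
import itertools


class ToyCoordinator:
    """Stands in for ZooKeeper: grants leadership to one candidate at a time."""

    def __init__(self):
        self._epochs = itertools.count(1)
        self.leader = None   # name of the current leader, if any
        self.epoch = 0       # fencing token of the current leadership term

    def try_acquire(self, candidate):
        # Only the first candidate wins; later ones stay on standby.
        if self.leader is None:
            self.leader = candidate
            self.epoch = next(self._epochs)
            return self.epoch
        return None

    def session_lost(self, candidate):
        # Called when the leader's session expires (e.g. a network partition).
        if self.leader == candidate:
            self.leader = None

    def is_valid(self, candidate, epoch):
        # Actions carrying an outdated epoch are rejected.
        return self.leader == candidate and self.epoch == epoch


coord = ToyCoordinator()
old_epoch = coord.try_acquire("jobmanager-attempt-1")
assert coord.try_acquire("jobmanager-attempt-2") is None  # second JM must wait

coord.session_lost("jobmanager-attempt-1")                # partition hits attempt 1
new_epoch = coord.try_acquire("jobmanager-attempt-2")

print(coord.is_valid("jobmanager-attempt-1", old_epoch))  # stale leader
print(coord.is_valid("jobmanager-attempt-2", new_epoch))  # current leader
```

The partitioned first attempt may still be running, but anything it tries to do with its old epoch is refused, so only one JobManager can make changes at a time.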

Stephan



Re: High Availability on Yarn

Posted by Aljoscha Krettek <al...@apache.org>.

Hi,
Yes, it’s recommended to use one ZooKeeper cluster for all Flink clusters.
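
As a rough illustration of pointing several Flink clusters at one shared ZooKeeper quorum, the sketch below renders the high-availability entries for flink-conf.yaml. The key names are assumed from the Flink 1.3 high-availability documentation linked earlier in the thread and should be checked against the version in use; the host names, the HDFS path, and the helper `render_ha_config` are placeholders invented for this example.

```python
# Sketch: render HA settings for flink-conf.yaml. Key names are assumed from
# the Flink 1.3 high-availability docs; hosts and paths are placeholders.
def render_ha_config(zk_quorum, storage_dir, zk_root="/flink", attempts=10):
    settings = {
        "high-availability": "zookeeper",
        # Every Flink cluster points at the same shared quorum.
        "high-availability.zookeeper.quorum": zk_quorum,
        "high-availability.zookeeper.path.root": zk_root,
        # Durable storage (e.g. HDFS or S3) for the recovery metadata.
        "high-availability.storageDir": storage_dir,
        # How often YARN may restart the ApplicationMaster.
        "yarn.application-attempts": attempts,
    }
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


conf = render_ha_config(
    "zk1.example.internal:2181,zk2.example.internal:2181,zk3.example.internal:2181",
    "hdfs:///flink/recovery",
)
print(conf)
```

The same quorum string would go into the configuration of every Flink cluster started on the EMR cluster. Whether each cluster needs its own namespace setting depends on the Flink version and deployment mode, so that is left out here.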

Best,
Aljoscha


Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Thanks for the update Aljoscha.

@Till Rohrmann,
Can you please chime in?

Also, we currently have a long running EMR cluster where we create one Flink cluster per job – can we just choose to install ZooKeeper when creating the EMR cluster and use one ZooKeeper instance for ALL Flink jobs?
Or is the recommendation to have a dedicated ZooKeeper instance per Flink job?

Thanks
Ankit


Re: High Availability on Yarn

Posted by Aljoscha Krettek <al...@apache.org>.

Hi,
Yes, for YARN there is only one running JobManager. As far as I know, in this case ZooKeeper is only used to keep track of checkpoint metadata and the execution graph of the running job, so that a recovering JobManager can pick up the data again.

I’m not 100% sure on this, though, so maybe Till can shed some light on this.
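
One way to picture the recovery path guessed at above is a small entry in ZooKeeper that points at larger recovery data kept in durable storage. The sketch below is a toy model under that assumption: `ToyHaStore`, the `/flink/checkpoints/...` paths, and the `chk-N` file names are hypothetical and do not reflect Flink's actual layout.

```python
# Toy model only: NOT Flink's storage layout. All names are hypothetical.
class ToyHaStore:
    """Small pointers live in 'ZooKeeper', the payload in a 'storage dir'."""

    def __init__(self):
        self.zookeeper = {}    # small entries: path -> pointer
        self.storage_dir = {}  # large blobs: file name -> payload

    def persist(self, job_id, checkpoint_id, metadata):
        # Write the payload first, then publish the pointer to it.
        file_name = f"{job_id}/chk-{checkpoint_id}"
        self.storage_dir[file_name] = metadata
        self.zookeeper[f"/flink/checkpoints/{job_id}"] = file_name

    def recover(self, job_id):
        # A restarted JobManager follows the pointer to the latest metadata.
        pointer = self.zookeeper.get(f"/flink/checkpoints/{job_id}")
        return None if pointer is None else self.storage_dir[pointer]


store = ToyHaStore()
store.persist("job-a", 41, {"offsets": [10, 20]})
store.persist("job-a", 42, {"offsets": [15, 27]})

restored = store.recover("job-a")  # what a recovering JobManager would read
print(restored)
```

In this model the recovering JobManager only needs the ZooKeeper entry to locate the most recent checkpoint metadata. A job that was never persisted yields `None`.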

Best,
Aljoscha

Re: High Availability on Yarn

Posted by "Jain, Ankit" <an...@here.com>.

Thanks for your reply Aljoscha.

After building a better understanding of YARN and spending a copious amount of time in the Flink codebase, I think I now get how Flink & YARN interact – I plan to document this soon in case it could help somebody starting afresh with Flink on YARN.

Regarding ZooKeeper: in YARN mode there is only one JobManager running, so do we still need leader election?

If the ApplicationMaster (where the JM runs) goes down, it is restarted by the YARN RM, and while restarting, the Flink AM will bring back the previously running containers. So, where does ZooKeeper sit in this setup?

Thanks
Ankit

From: Aljoscha Krettek <al...@apache.org>
Date: Wednesday, May 3, 2017 at 2:05 AM
To: "Jain, Ankit" <an...@here.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>, Till Rohrmann <tr...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
As a first comment, the work mentioned in the FLIP-6 doc you linked is still work-in-progress. You cannot use these abstractions yet without going into the code and setting up a cluster “by hand”.

The documentation for one-step deployment of a Job to YARN is available here: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/yarn_setup.html#run-a-single-flink-job-on-yarn<https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fci.apache.org%2Fprojects%2Fflink%2Fflink-docs-release-1.2%2Fsetup%2Fyarn_setup.html%23run-a-single-flink-job-on-yarn&data=01%7C01%7C%7Cfb13823970ba4476ebbf08d49203846c%7C6d4034cd72254f72b85391feaea64919%7C1&sdata=fzBiQtv7MR2%2Fehg6GepwPa1uWxpqEgPJakto2B8k0Zk%3D&reserved=0>

Regarding your third question, ZooKeeper is mostly used for discovery and leader election. That is, JobManagers use it to decide who is the main JM and who are standby JMs. TaskManagers use it to discover the leading JobManager that they should connect to.

I’m also cc’ing Till, who should know this stuff better and can maybe explain it in a bit more detail.

Best,
Aljoscha
On 1. May 2017, at 18:59, Jain, Ankit <an...@here.com>> wrote:

Hi fellow users,
We are trying to straighten out high availability story for flink.

Our setup includes a long running EMR cluster, job submission is a two-step process – 1) Flink cluster is first created using flink yarn client on the EMR cluster already running 2) Flink job is submitted.

I also saw references that with 1.2, these two steps have been combined into 1 – is that change in FlinkYarnSessionCli.java? Can somebody point to documentation please?

W/o worrying about Yarn RM (not Flink Yarn RM that seems to be newly introduced) failure for now, I want to understand first how task manager & job manager failures are handled.

My questions-
1)       https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 suggests a new RM has been added and now there is one JobManager for each job. Since Yarn RM will now talk to Flink RM( instead of JobManager previously), will Yarn automatically restart failing Flink RM?
2)       Is there any documentation on behavior of new Flink RM that will come up? How will previously running JobManagers & TaskManagers find out about new RM?
3)       https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#configuration requires configuring Zookeeper even for Yarn – Is this needed for handling Task Manager failures or JM or both? Will Yarn not take care of JM failures?

It may sound like I am little confused between role of Yarn and Flink components– who has the most burden of HA? Documentation in current state is lacking clarity – I know it is still evolving.

Please let me know if somebody can help clear the confusion.

Thanks
Ankit

Re: High Availability on Yarn

Posted by Aljoscha Krettek <al...@apache.org>.

Hi,
As a first comment, the work mentioned in the FLIP-6 doc you linked is still work-in-progress. You cannot use these abstractions yet without going into the code and setting up a cluster “by hand”.

The documentation for one-step deployment of a Job to YARN is available here: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/yarn_setup.html#run-a-single-flink-job-on-yarn

Regarding your third question, ZooKeeper is mostly used for discovery and leader election. That is, JobManagers use it to decide who is the main JM and who are standby JMs. TaskManagers use it to discover the leading JobManager that they should connect to.
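
To make those two roles concrete, here is a toy sketch in Python. It is not Flink's implementation: ToyCoordinator is a hypothetical in-memory stand-in for ZooKeeper, the host names and ports are made up, and "lowest sequence number wins" is an assumed (though common) election recipe.

```python
class ToyCoordinator:
    """Hypothetical in-memory stand-in for a coordination service like ZooKeeper."""

    def __init__(self):
        self._next_seq = 0
        self._candidates = {}  # sequence number -> JobManager address

    def register(self, address):
        """A JobManager joins the election and gets a sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._candidates[seq] = address
        return seq

    def unregister(self, seq):
        """Simulates a JobManager's entry disappearing after a crash."""
        self._candidates.pop(seq, None)

    def leader(self):
        """Assumed rule: the lowest sequence number leads; None if nobody is registered."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


class ToyTaskManager:
    """A TaskManager only asks the coordinator where the current leader is."""

    def __init__(self, coordinator):
        self._coordinator = coordinator

    def discover_job_manager(self):
        return self._coordinator.leader()


zk = ToyCoordinator()
jm1 = zk.register("jm-host-1:6123")  # made-up addresses
jm2 = zk.register("jm-host-2:6123")
tm = ToyTaskManager(zk)

before = tm.discover_job_manager()  # leader while both JMs are alive
zk.unregister(jm1)                  # the leading JM fails
after = tm.discover_job_manager()   # the TM looks the leader up again
print(before, after)
```

The TaskManager never hard-codes a JobManager address; it only asks the coordinator. Under these assumptions the same lookup would also cover a setup with a single JobManager at a time, where a restarted JobManager registers under a new address.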

I’m also cc’ing Till, who should know this stuff better and can maybe explain it in a bit more detail.

Best,
Aljoscha
> On 1. May 2017, at 18:59, Jain, Ankit <an...@here.com> wrote:
> 
> Hi fellow users,
> We are trying to straighten out high availability story for flink.
>  
> Our setup includes a long running EMR cluster, job submission is a two-step process – 1) Flink cluster is first created using flink yarn client on the EMR cluster already running 2) Flink job is submitted.
>  
> I also saw references that with 1.2, these two steps have been combined into 1 – is that change in FlinkYarnSessionCli.java? Can somebody point to documentation please?
>  
> W/o worrying about Yarn RM (not Flink Yarn RM that seems to be newly introduced) failure for now, I want to understand first how task manager & job manager failures are handled.
>  
> My questions-
> 1)       https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 suggests a new RM has been added and now there is one JobManager for each job. Since Yarn RM will now talk to Flink RM( instead of JobManager previously), will Yarn automatically restart failing Flink RM?
> 2)       Is there any documentation on behavior of new Flink RM that will come up? How will previously running JobManagers & TaskManagers find out about new RM?
> 3)       https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#configuration requires configuring Zookeeper even for Yarn – Is this needed for handling Task Manager failures or JM or both? Will Yarn not take care of JM failures?
>  
> It may sound like I am little confused between role of Yarn and Flink components– who has the most burden of HA? Documentation in current state is lacking clarity – I know it is still evolving.
>  
> Please let me know if somebody can help clear the confusion.
>  
> Thanks
> Ankit
>  
>