Posted to user@flink.apache.org by Vishal Santoshi <vi...@gmail.com> on 2018/07/06 12:08:05 UTC

Re: is there a config to ask taskmanager to keep retrying connect to jobmanager after Disassociated?

Hello Chesnay, I have used an HA setup without the masters file and have
seen failover happen based on alerts from a leader election routine. Is a
masters file actually required when there is a central arbiter (ZK) that
tracks the live JMs and a callback that forces TMs to switch to a new
leader in case of failure?

On Tue, Jun 5, 2018, 6:45 AM Chesnay Schepler <ch...@apache.org> wrote:

> Please look into high-availability
> <https://ci.apache.org/projects/flink/flink-docs-master/ops/jobmanager_high_availability.html>
> to make your cluster resistant against shutdowns.
>
> On 05.06.2018 12:31, makeyang wrote:
>
> Can anybody share any thoughts or insights about this issue?
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>
>
>

Re: is there a config to ask taskmanager to keep retrying connect to jobmanager after Disassociated?

Posted by Vishal Santoshi <vi...@gmail.com>.

I must admit, though, that the jobs do restart, but they restart
successfully with the new JM.

Re: is there a config to ask taskmanager to keep retrying connect to jobmanager after Disassociated?

Posted by Vishal Santoshi <vi...@gmail.com>.

Yep, perfect, that is what we do. Can you confirm, though, that jobs will
restart in the case of a failover? That is what we see, and that is fine.

On Fri, Jul 6, 2018, 8:24 AM Chesnay Schepler <ch...@apache.org> wrote:

> If I remember correctly, the masters file is only used by the
> [start|stop]-cluster.sh scripts to determine how many JobManagers should be
> started / stopped and which port they should use.
>
> It's not necessarily *required*, but without it you have to manually
> start/stop all JobManagers.

Re: is there a config to ask taskmanager to keep retrying connect to jobmanager after Disassociated?

Posted by Chesnay Schepler <ch...@apache.org>.

If I remember correctly, the masters file is only used by the
[start|stop]-cluster.sh scripts to determine how many JobManagers should
be started / stopped and which port they should use.

It's not necessarily /required/, but without it you have to manually
start/stop all JobManagers.
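
To illustrate what the scripts take from the masters file, here is a minimal Python sketch (not Flink's actual shell logic) that reads masters-style content and reports how many JobManagers would be started and on which ports. It assumes one `host[:webui-port]` entry per line; the host names, the `parse_masters` helper, and the handling of blank lines and `#` comments are hypothetical choices made for this example.

```python
from typing import List, NamedTuple, Optional


class MasterEntry(NamedTuple):
    host: str
    webui_port: Optional[int]  # None when the line gives no port


def parse_masters(text: str) -> List[MasterEntry]:
    """Parse masters-file content, assuming one `host[:webui-port]` per line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            # Skipping blanks and comments is an assumption of this sketch.
            continue
        host, _, port = line.partition(":")
        entries.append(
            MasterEntry(host.strip(), int(port) if port.strip() else None)
        )
    return entries


# Hypothetical two-JobManager setup; the host names are placeholders.
example = """\
jm-host-1:8081
jm-host-2:8081
"""

masters = parse_masters(example)
print(f"{len(masters)} JobManagers would be started:")
for m in masters:
    print(f"  {m.host} (web UI port {m.webui_port})")
```

Without such a file the cluster scripts have nothing to iterate over, which is why each JobManager then has to be started and stopped by hand.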
