Posted to user@flink.apache.org by Siew Wai Yow <wa...@hotmail.com> on 2018/08/20 14:23:05 UTC

Cluster die when one of the TM killed

Hi,


When one of the task managers is killed, the whole cluster dies. Is this expected? We are using Flink 1.4. Thank you.


Regards,

Yow

Re: Cluster die when one of the TM killed

Posted by Lasse Nedergaard <la...@gmail.com>.
Hi. 
We have seen the same behaviour on Yarn. It turned out that the default value of the following setting was not optimal:
yarn.maximum-failed-containers: The maximum number of failed containers the ApplicationMaster accepts until it fails the YARN session. Default: The number of initially requested TaskManagers (-n).
So try to look up the configuration for your system.
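
As a rough sketch of what that could look like, assuming a YARN session whose settings are read from conf/flink-conf.yaml: the value 100 below is only an illustrative number, not a recommendation, and should be chosen for your own cluster.

```yaml
# conf/flink-conf.yaml (illustrative fragment; the value is an assumption)
# Allow more failed containers than the default, which equals the number of
# initially requested TaskManagers (-n), before the ApplicationMaster fails
# the YARN session.
yarn.maximum-failed-containers: 100
```

A higher limit only keeps the session alive for longer while containers are failing; it does not explain why they fail.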
The next step is to investigate why the task manager was killed.


Best regards
Lasse Nedergaard


> On 20 Aug 2018, at 16:34, Dominik Wosiński <wo...@gmail.com> wrote:
> 
> Hey, 
> Can you please provide a little more information about your setup, and maybe logs showing when the crash occurs?
> Best Regards,
> Dominik
> 
> 2018-08-20 16:23 GMT+02:00 Siew Wai Yow <wa...@hotmail.com>:
>> Hi,
>> 
>> When one of the task managers is killed, the whole cluster dies. Is this expected? We are using Flink 1.4. Thank you.
>> 
>> Regards,
>> Yow
> 

Re: Cluster die when one of the TM killed

Posted by Dominik Wosiński <wo...@gmail.com>.
Hey,
Can you please provide a little more information about your setup, and maybe
logs showing when the crash occurs?
Best Regards,
Dominik

2018-08-20 16:23 GMT+02:00 Siew Wai Yow <wa...@hotmail.com>:

> Hi,
>
>
> When one of the task managers is killed, the whole cluster dies. Is this
> expected? We are using Flink 1.4. Thank you.
>
>
> Regards,
>
> Yow
>