
Posted to issues@flink.apache.org by "fanxin (Jira)" <ji...@apache.org> on 2020/05/22 01:51:00 UTC

[jira] [Updated] (FLINK-17871) Make the default value of attemptFailuresValidityInterval more reasonable

     [ https://issues.apache.org/jira/browse/FLINK-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

fanxin updated FLINK-17871:
---------------------------
    Description: The default value of `yarn.application-attempt-failures-validity-interval` is currently `10000` milliseconds. Preparing the runtime context alone usually takes seconds, so a window of 10000 ms is too small to even cover that preparation. With the default config, a Flink-on-YARN job will hardly ever meet the condition of "fail 2 times in 10s". If the job has an internal problem, it can therefore easily get bogged down in endless retries.  (was: Default value of `yarn.application-attempt-failures-validity-interval` is `10000` milliseconds at present. Usually preparing the context alone can take seconds, which means that default value 10000 is too small even to ready a runtime context. With a default config, a flink on yarn job in will hardly meet the condition of ”fail 2 times in 10s“. If the job has some internal problems, unfortunately, it can easily get bogged down in endless retries.)

> Make the default value of attemptFailuresValidityInterval more reasonable
> -------------------------------------------------------------------------
>
>                 Key: FLINK-17871
>                 URL: https://issues.apache.org/jira/browse/FLINK-17871
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>            Reporter: fanxin
>            Priority: Minor
>
> The default value of `yarn.application-attempt-failures-validity-interval` is currently `10000` milliseconds. Preparing the runtime context alone usually takes seconds, so a window of 10000 ms is too small to even cover that preparation. With the default config, a Flink-on-YARN job will hardly ever meet the condition of "fail 2 times in 10s". If the job has an internal problem, it can therefore easily get bogged down in endless retries.
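
The retry behaviour described above can be illustrated with a small simulation. This is only a simplified sketch of a sliding-window failure counter, not YARN's or Flink's actual implementation. The function name, the fixed per-attempt duration, and the retry cap are hypothetical and exist only for illustration. It assumes that a failure stops counting once it is at least one validity interval old, and that the application is given up when the failures still inside the window reach the maximum number of attempts.

```python
from collections import deque


def attempts_until_give_up(attempt_duration_ms, validity_interval_ms,
                           max_attempts=2, cap=1000):
    """Return how many attempts run before the application is given up.

    Returns None if the (hypothetical) cap is reached first, which stands
    in for "retries indefinitely" in this simplified model.
    """
    failures = deque()  # timestamps (ms) of failures still inside the window
    now = 0
    for attempt in range(1, cap + 1):
        # Assumption: every attempt runs for the same time and then fails.
        now += attempt_duration_ms
        failures.append(now)
        # Forget failures that are at least one validity interval old.
        while failures and now - failures[0] >= validity_interval_ms:
            failures.popleft()
        # Give up once enough failures fall inside the window.
        if len(failures) >= max_attempts:
            return attempt
    return None


if __name__ == "__main__":
    # Each attempt fails after 15 s, window is 10 s: the counter never reaches 2.
    print(attempts_until_give_up(15_000, 10_000))   # None
    # Same job with a 10 min window: given up after the second failure.
    print(attempts_until_give_up(15_000, 600_000))  # 2
```

In this model, any job whose attempts each take longer than the validity interval to fail never accumulates two failures inside the window, which matches the endless-retry scenario reported here. The 600000 ms value is an arbitrary example, not a proposed default.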



--
This message was sent by Atlassian Jira
(v8.3.4#803005)