Posted to issues@flink.apache.org by "Chesnay Schepler (JIRA)" <ji...@apache.org> on 2019/07/05 08:23:00 UTC

[jira] [Closed] (FLINK-13060) FailoverStrategies should respect restart constraints

     [ https://issues.apache.org/jira/browse/FLINK-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chesnay Schepler closed FLINK-13060.
------------------------------------
      Resolution: Fixed
    Release Note: Users that have enabled the "region" failover strategy, along with a restart strategy that enforces a certain number of restarts or introduces a restart delay, will see changes in behavior. This failover strategy now respects constraints that are defined by the restart strategy.

master: fd85207c946683f33b5f5f5d0d644c2ba2ccb381

> FailoverStrategies should respect restart constraints
> -----------------------------------------------------
>
>                 Key: FLINK-13060
>                 URL: https://issues.apache.org/jira/browse/FLINK-13060
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.9.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> RestartStrategies can define their own restrictions on whether a job can be restarted. For example, they could count the total number of failures or observe failure rates.
> FailoverStrategies are used for partial restarts of jobs, and currently largely bypass the restrictions defined by the restart strategies.
> My proposal is the following:
> Introduce a new method in the {{RestartStrategy}} interface to notify the strategy of failed task executions. Currently, strategies handle this implicitly in {{RestartStrategy#restart}}, so migrating our existing strategies should be trivial.
> Next, before calling {{RestartStrategy#restart}}, inform the strategy about the task failure. This retains existing behavior.
> In addition, {{FailoverStrategy}} implementations may inform the restart strategy about task failures if and when they perform a local failover. All implementations have to check {{RestartStrategy#canRestart}} before attempting a failover.
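
The proposal above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink's actual code: the interface is a simplified stand-in for {{RestartStrategy}}, the method name notifyFailure is an assumption (the issue does not name the new method), and FixedAttemptsStrategy and RegionFailoverSketch are hypothetical classes made up for this example.

```java
/**
 * Simplified stand-in for a restart strategy. NOT Flink's real interface:
 * notifyFailure is a hypothetical name for the proposed notification method.
 */
interface SketchRestartStrategy {
    /** Records a failed task execution (hypothetical method). */
    void notifyFailure(Throwable cause);

    /** Returns whether another restart is still permitted. */
    boolean canRestart();
}

/** Hypothetical strategy that permits a fixed number of failures. */
final class FixedAttemptsStrategy implements SketchRestartStrategy {
    private final int maxAttempts;
    private int failures;

    FixedAttemptsStrategy(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    @Override
    public void notifyFailure(Throwable cause) {
        failures++;
    }

    @Override
    public boolean canRestart() {
        return failures <= maxAttempts;
    }
}

/** Hypothetical failover strategy that consults the restart strategy first. */
final class RegionFailoverSketch {
    private final SketchRestartStrategy restartStrategy;
    private int localRestarts;

    RegionFailoverSketch(SketchRestartStrategy restartStrategy) {
        this.restartStrategy = restartStrategy;
    }

    /**
     * Handles a task failure. Returns true if a local failover was performed,
     * false if the restart strategy's constraints forbid it.
     */
    boolean onTaskFailure(Throwable cause) {
        // Inform the restart strategy so local failovers count as failures.
        restartStrategy.notifyFailure(cause);
        // Respect the restart strategy's constraints before failing over.
        if (!restartStrategy.canRestart()) {
            return false;
        }
        localRestarts++; // a real implementation would restart the affected region here
        return true;
    }

    int getLocalRestarts() {
        return localRestarts;
    }
}
```

With a limit of two failures in this sketch, the first two calls to onTaskFailure perform a local failover and the third is refused, which is the behavior change the release note describes for the "region" failover strategy.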



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)