Posted to issues@flink.apache.org by "Chesnay Schepler (JIRA)" <ji...@apache.org> on 2019/07/05 08:23:00 UTC
[jira] [Closed] (FLINK-13060) FailoverStrategies should respect restart constraints
[ https://issues.apache.org/jira/browse/FLINK-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chesnay Schepler closed FLINK-13060.
------------------------------------
Resolution: Fixed
Release Note: Users who have enabled the "region" failover strategy together with a restart strategy that limits the number of restarts or introduces a restart delay will see changes in behavior. The failover strategy now respects the constraints defined by the restart strategy.
master: fd85207c946683f33b5f5f5d0d644c2ba2ccb381
> FailoverStrategies should respect restart constraints
> -----------------------------------------------------
>
> Key: FLINK-13060
> URL: https://issues.apache.org/jira/browse/FLINK-13060
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Chesnay Schepler
> Assignee: Chesnay Schepler
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.9.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> RestartStrategies can define their own restrictions on whether a job can be restarted. For example, they could count the total number of failures or observe failure rates.
> FailoverStrategies are used for partial restarts of jobs, and currently largely bypass the restrictions defined by the restart strategies.
> My proposal is the following:
> Introduce a new method into the {{RestartStrategy}} interface to notify the strategy of failed task executions. Currently, strategies handle this implicitly in {{RestartStrategy#restart}}, so migrating our existing strategies should be trivial.
> Next, before calling {{RestartStrategy#restart}}, inform the strategy about the task failure. This retains existing behavior.
> Additionally, {{FailoverStrategy}} implementations may inform the restart strategy about task failures if and when they perform a local failover. All implementations have to check {{RestartStrategy#canRestart}} before attempting a failover.
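
The proposal above can be illustrated with a minimal, self-contained sketch. It does not use Flink's actual interfaces: `RestartStrategySketch`, its `notifyFailure` method, `FixedAttemptsStrategy`, and `LocalFailoverSketch` are hypothetical names chosen for illustration, and only `canRestart` mirrors a method named in the issue. The sketch assumes a restart strategy that tolerates a fixed number of failures.

```java
/** Simplified stand-in for a restart strategy; NOT Flink's actual interface. */
interface RestartStrategySketch {
    /** Hypothetical notification hook: records one failed task execution. */
    void notifyFailure(Throwable cause);

    /** Whether the strategy's constraints still permit a restart. */
    boolean canRestart();
}

/** Hypothetical strategy that tolerates at most maxAttempts failures. */
final class FixedAttemptsStrategy implements RestartStrategySketch {
    private final int maxAttempts;
    private int failures;

    FixedAttemptsStrategy(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    @Override
    public void notifyFailure(Throwable cause) {
        // Every failure counts, whether it leads to a local or a full restart.
        failures++;
    }

    @Override
    public boolean canRestart() {
        return failures <= maxAttempts;
    }
}

/** Hypothetical failover strategy that performs local (partial) restarts. */
final class LocalFailoverSketch {
    private final RestartStrategySketch restartStrategy;

    LocalFailoverSketch(RestartStrategySketch restartStrategy) {
        this.restartStrategy = restartStrategy;
    }

    /**
     * Handles a task failure. Returns true if a local failover may proceed,
     * false if the restart strategy's constraints forbid another restart.
     */
    boolean onTaskFailure(Throwable cause) {
        // Step 1: inform the restart strategy about the failure.
        restartStrategy.notifyFailure(cause);
        // Step 2: check the constraints before attempting the failover.
        return restartStrategy.canRestart();
    }
}
```

With `new FixedAttemptsStrategy(2)`, the first two calls to `onTaskFailure` allow a local failover and the third is rejected, so partial restarts no longer bypass the limit set by the restart strategy.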
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)