Posted to user@flink.apache.org by Vijay Balakrishnan <bv...@gmail.com> on 2018/05/15 22:16:34 UTC

Recovering from 1 of the nodes/slots of a Task Manager failing without resetting entire state during Recovery

Hi,
I have been going through the book "Real time streaming with Apache Flink".
How do I recover the state of just a single failed node/slot in a TaskManager
without having the recovery reset the application state on all the TaskManagers?

The book says the following:
*Reset the state of the whole application to the latest checkpoint, i.e.,
resetting the state of each task. The JobManager provides the location to
the most recent checkpoints of task state. Note that, depending on the
topology of the application, certain optimizations are possible and not all
tasks need to be reset.*

TIA,
Vijay
Flink newbie.

Re: Recovering from 1 of the nodes/slots of a Task Manager failing without resetting entire state during Recovery

Posted by Vijay Balakrishnan <bv...@gmail.com>.
Hi,
I found the following documentation in the code:

flink-runtime:
org.apache.flink.runtime.executiongraph.failover.RestartIndividualStrategy
Simple failover strategy that restarts each task individually. This strategy
is only applicable if the entire job consists unconnected tasks, meaning each
task is its own component.

RestartPipelinedRegionStrategy:
A failover strategy that restarts regions of the ExecutionGraph. A region is
defined by this strategy as the weakly connected component of tasks that
communicate via pipelined data exchange.


Can someone please explain RestartIndividualStrategy (the entire job consists
of unconnected tasks) and RestartPipelinedRegionStrategy (weakly connected
component of tasks that communicate via pipelined data exchange) with an
example?

TIA,
Vijay
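
As a rough illustration of the "weakly connected component" wording in the
Javadoc quoted above, the sketch below groups tasks into regions by joining
any two tasks that share a pipelined data exchange, ignoring edge direction.
It is not Flink's implementation: the class FailoverRegionSketch, the Edge
type, and the integer task ids are all hypothetical names made up for this
example. A blocking exchange is assumed to separate regions, and a job with
no connections at all ends up with one region per task, which is the case the
RestartIndividualStrategy Javadoc describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; all names here are hypothetical, not Flink classes.
class FailoverRegionSketch {

    /** A data exchange between two tasks; pipelined == false means a blocking exchange. */
    static final class Edge {
        final int from;
        final int to;
        final boolean pipelined;

        Edge(int from, int to, boolean pipelined) {
            this.from = from;
            this.to = to;
            this.pipelined = pipelined;
        }
    }

    // Union-find lookup with path halving.
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /** Groups tasks 0..numTasks-1 into weakly connected components over pipelined edges. */
    static List<List<Integer>> regions(int numTasks, List<Edge> edges) {
        int[] parent = new int[numTasks];
        for (int i = 0; i < numTasks; i++) {
            parent[i] = i;
        }
        for (Edge e : edges) {
            if (e.pipelined) {
                // Direction is ignored, which is what "weakly" connected means.
                parent[find(parent, e.from)] = find(parent, e.to);
            }
        }
        // TreeMap keeps the regions in a deterministic order (by root id).
        Map<Integer, List<Integer>> byRoot = new TreeMap<>();
        for (int task = 0; task < numTasks; task++) {
            byRoot.computeIfAbsent(find(parent, task), k -> new ArrayList<>()).add(task);
        }
        return new ArrayList<>(byRoot.values());
    }

    /** Returns the tasks that would be restarted together if failedTask fails. */
    static List<Integer> regionOf(int failedTask, int numTasks, List<Edge> edges) {
        for (List<Integer> region : regions(numTasks, edges)) {
            if (region.contains(failedTask)) {
                return region;
            }
        }
        throw new IllegalArgumentException("unknown task " + failedTask);
    }

    public static void main(String[] args) {
        // 0 -> 1 pipelined, 1 -> 2 blocking, 2 -> 3 pipelined, task 4 unconnected.
        List<Edge> edges = Arrays.asList(
                new Edge(0, 1, true),
                new Edge(1, 2, false),
                new Edge(2, 3, true));
        System.out.println(regions(5, edges));     // [[0, 1], [2, 3], [4]]
        System.out.println(regionOf(3, 5, edges)); // [2, 3]
    }
}
```

In this toy job a failure of task 3 would restart only tasks 2 and 3, while a
failure of task 4 would restart task 4 alone. If every exchange were
pipelined, the whole job would form a single region and the restart would
cover all tasks.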
