Posted to user@flink.apache.org by "vprabhu@gmail.com" <vp...@gmail.com> on 2016/06/30 01:35:38 UTC

RollingSink - question on a failure scenario

Hi,

Is there a chance of data loss if a failure occurs between checkpoint
completion and the invocation of "notifyCheckpointComplete"?

The pending files are moved to their final state in the "notifyCheckpointComplete"
method. So if there is a failure in this method, or just before the method is
invoked, is the data in the pending files lost? Or am I missing something?

Thanks,
Prabhu 



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/RollingSink-question-on-a-failure-scenario-tp7735.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.

Re: RollingSink - question on a failure scenario

Posted by Till Rohrmann <tr...@apache.org>.
Hi Prabhu,

The rolling file sinks should not suffer from data loss, for the following
reason. The checkpointed state (the bucket state) contains the current file,
the offset, and all pending files that are ready to be moved. Once a
checkpoint is completed, the notifyCheckpointComplete method is called. In
this method, the pending files are moved to their final location. If a
failure occurs before notifyCheckpointComplete is called, or while the files
are being moved, then it is correct that not all pending files have been moved.
However, upon recovery, the rolling file sink receives its bucket state,
which contains the list of all pending files. It then checks for every file
whether it has already been moved. If a file has not been moved yet, it is
moved at that point. That's why the system can guarantee that you
won't lose any data.
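
To make the recovery step concrete, here is a minimal sketch of the idea in
plain Java. It is not the actual RollingSink code: the class name, the
".pending" suffix and the method names are assumptions chosen for
illustration, and it uses the local file system through java.nio rather than
a Hadoop FileSystem. What it shows is only that committing the pending list
from the checkpointed state is safe to repeat, because files that were
already moved are skipped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative only, NOT Flink's RollingSink implementation.
 * All names here are hypothetical.
 */
public class PendingFileRecovery {

    /** Hypothetical suffix marking a file that is closed but not yet published. */
    public static final String PENDING_SUFFIX = ".pending";

    /** Derives the final path by stripping the pending suffix from the file name. */
    public static Path finalPathFor(Path pending) {
        String name = pending.getFileName().toString();
        if (!name.endsWith(PENDING_SUFFIX)) {
            throw new IllegalArgumentException("Not a pending file: " + pending);
        }
        String finalName = name.substring(0, name.length() - PENDING_SUFFIX.length());
        return pending.resolveSibling(finalName);
    }

    /**
     * Moves every pending file listed in the (restored) checkpoint state to its
     * final location. A file that no longer exists under its pending name is
     * assumed to have been moved before the failure and is skipped, so calling
     * this again after a crash is harmless.
     *
     * @return the final paths of the files moved by this particular call
     */
    public static List<Path> commitPending(List<Path> pendingFromCheckpoint) throws IOException {
        List<Path> moved = new ArrayList<>();
        for (Path pending : pendingFromCheckpoint) {
            if (Files.exists(pending)) {
                Path target = finalPathFor(pending);
                // Rename within the same directory; atomic on common file systems.
                Files.move(pending, target, StandardCopyOption.ATOMIC_MOVE);
                moved.add(target);
            }
        }
        return moved;
    }
}
```

In this sketch the same commitPending call would serve both the normal path
(on checkpoint completion) and the recovery path (on restore), which is why a
failure in between does not lose the pending data.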

Cheers,
Till
