You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Lu Niu <qq...@gmail.com> on 2019/10/03 04:38:46 UTC
Re: Implementing CheckpointableInputFormat
Hi, Fabian
Thanks for replying!
I implemented a Custom RichInputFormat
implementing CheckpointableInputFormat. And I found it is executed
through InputFormatSourceFunction, which doesn't
use CheckpointableInputFormat during execution. If so, how does checkpoint
work here?
I also notice when one task finished, I cannot trigger savepoint anymore.
It throws exception "Not all tasks are running". Does that imply no
savepoint/checkpoint can be taken once any task finish?
Best
Lu
On Fri, Sep 6, 2019 at 6:33 AM Fabian Hueske <fh...@gmail.com> wrote:
> Hi,
>
> CheckpointableInputFormat is only relevant if you plan to use the
> InputFormat in a MonitoringFileSource, i.e., in a streaming application.
> If you plan to use it in a DataSet (batch) program, InputFormat is fine.
>
> Btw. the latest release Flink 1.9.0 has major improvements for the
> recovery of batch jobs.
>
> Best, Fabian
>
> Am Do., 5. Sept. 2019 um 19:01 Uhr schrieb Lu Niu <qq...@gmail.com>:
>
>> Hi, Team
>>
>> I am implementing a custom InputFormat. Shall I
>> implement CheckpointableInputFormat interface? If I don't, does that mean
>> the whole job has to restart given only one task fails? I ask because I
>> found all InputFormat implements CheckpointableInputFormat, which makes me
>> confused. Thank you!
>>
>> Best
>> Lu
>>
>
Re: Implementing CheckpointableInputFormat
Posted by Chesnay Schepler <ch...@apache.org>.
You have to use StreamExecutionEnvironment#createFileInput for
implementing CheckpointableInputFormat to have any effect. This
internally results in it being used by the MonitoringFileSource.
If you use StreamExecutionEnvironment#createInput nothing will be
checkpointed for the source; and yes this usually means having to
restart the entire job if an error occurs.
Checkpoints/savepoints cannot be taken if any task is no longer running,
see FLINK-2491.
On 03/10/2019 06:38, Lu Niu wrote:
> Hi, Fabian
>
> Thanks for replying!
>
> I implemented a Custom RichInputFormat
> implementing CheckpointableInputFormat. And I found it is executed
> through InputFormatSourceFunction, which doesn't
> use CheckpointableInputFormat during execution. If so, how does
> checkpoint work here?
>
> I also notice when one task finished, I cannot trigger savepoint
> anymore. It throws exception "Not all tasks are running". Does that
> imply no savepoint/checkpoint can be taken once any task finish?
>
> Best
> Lu
>
> On Fri, Sep 6, 2019 at 6:33 AM Fabian Hueske <fhueske@gmail.com
> <ma...@gmail.com>> wrote:
>
> Hi,
>
> CheckpointableInputFormat is only relevant if you plan to use the
> InputFormat in a MonitoringFileSource, i.e., in a streaming
> application.
> If you plan to use it in a DataSet (batch) program, InputFormat is
> fine.
>
> Btw. the latest release Flink 1.9.0 has major improvements for the
> recovery of batch jobs.
>
> Best, Fabian
>
> Am Do., 5. Sept. 2019 um 19:01 Uhr schrieb Lu Niu
> <qqibrow@gmail.com <ma...@gmail.com>>:
>
> Hi, Team
>
> I am implementing a custom InputFormat. Shall I
> implement CheckpointableInputFormat interface? If I don't,
> does that mean the whole job has to restart given only one
> task fails? I ask because I found all InputFormat
> implements CheckpointableInputFormat, which makes me confused.
> Thank you!
>
> Best
> Lu
>