Posted to user@flink.apache.org by M Singh <ma...@yahoo.com> on 2019/11/16 19:39:10 UTC

Apache Airflow - Question about checkpointing and re-run a job

Hi:
I have a Flink job, and sometimes I need to cancel and rerun it. From what I understand, the checkpoints for a job are saved under the job ID directory at the checkpoint location. If I run the same job again, it will get a new job ID, and the checkpoint saved by the previous run (which is stored under the previous job's ID directory) will not be used for the new run. Is that understanding correct? If I need to rerun the job from the previous checkpoint, is there any way to do that automatically without using a savepoint?
Also, I believe internal job restarts do not change the job ID, so in those cases the restarted job will pick up its state from the saved checkpoint. Is my understanding correct?
Thanks
Mans

Re: Apache Airflow - Question about checkpointing and re-run a job

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi,

I believe that the title of this email thread was a typo, and should be
"Apache Flink - Question about checkpointing and re-run a job."
I assume this because the contents of the previous conversations seem to be
purely about Flink.

Otherwise, as far as I know, there don't seem to be any publicly available
Airflow operators for Flink right now.

Cheers,
Gordon




Re: Apache Airflow - Question about checkpointing and re-run a job

Posted by kant kodali <ka...@gmail.com>.
Does Airflow have a Flink operator? I am not seeing one. Can you please point
me to it?


Re: Apache Airflow - Question about checkpointing and re-run a job

Posted by M Singh <ma...@yahoo.com>.
Thanks, Congxian, for your answer and reference.
Mans

Re: Apache Airflow - Question about checkpointing and re-run a job

Posted by Congxian Qiu <qc...@gmail.com>.
Hi,
Yes, checkpoint data is located under the job ID directory. You can try to
restore from the retained checkpoint [1].
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint
Best,
Congxian
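
For readers who want to script the restore described in [1], here is a minimal
sketch. It assumes the checkpoint location is a local or mounted filesystem
path, that checkpoints follow the layout from the linked docs
(<checkpoint-dir>/<job-id>/chk-<n>/_metadata), and that the job was configured
to retain checkpoints on cancellation (RETAIN_ON_CANCELLATION). The function
names, the "flink" executable on the PATH, and the jar name are hypothetical
choices for illustration, not part of Flink.

```python
import re
from pathlib import Path
from typing import List, Optional

# Retained checkpoints live in directories named chk-<number>.
CHK_DIR = re.compile(r"^chk-(\d+)$")


def latest_retained_checkpoint(checkpoint_root: Path, job_id: str) -> Optional[Path]:
    """Return the highest-numbered chk-<n> directory that has a _metadata file.

    Assumption: checkpoint_root is reachable as a normal filesystem path.
    For HDFS or S3 a different client would be needed.
    """
    job_dir = checkpoint_root / job_id
    if not job_dir.is_dir():
        return None
    best = None  # (checkpoint number, path)
    for child in job_dir.iterdir():
        match = CHK_DIR.match(child.name)
        # Skip shared/, taskowned/ and any chk-<n> without a _metadata file.
        if match and (child / "_metadata").is_file():
            number = int(match.group(1))
            if best is None or number > best[0]:
                best = (number, child)
    return best[1] if best else None


def build_resume_command(checkpoint_path: Path, job_jar: str) -> List[str]:
    """Build the 'flink run -s <path> <jar>' command from the linked docs."""
    return ["flink", "run", "-s", str(checkpoint_path), job_jar]
```

A wrapper script could call latest_retained_checkpoint with the previous job's
ID and pass the result of build_resume_command to subprocess.run. The new run
still gets a new job ID; it only starts from the old job's state.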



Re: Apache Airflow - Question about checkpointing and re-run a job

Posted by M Singh <ma...@yahoo.com>.
Folks, please let me know if you have any advice on this question. Thanks