Posted to issues@flink.apache.org by "Zhu Zhu (Jira)" <ji...@apache.org> on 2022/04/18 07:49:00 UTC

[jira] [Comment Edited] (FLINK-27274) Job cannot be recovered, after restarting cluster

    [ https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523586#comment-17523586 ] 

Zhu Zhu edited comment on FLINK-27274 at 4/18/22 7:48 AM:
----------------------------------------------------------

Yes, in flink-gum-standalonesession-0-hb3-dev-flink-000.log, no job is recovered. However, this is by-design behavior in Flink.
Job recovery only happens on [JobManager failures|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/ha/overview/#high-availability], and stopping and restarting a cluster is not a JobManager failure.


was (Author: zhuzh):
Yes, in flink-gum-standalonesession-0-hb3-dev-flink-000.log, no job is recovered. However, this is by-design behavior in Flink.
Job recovery only happens on [JobManager failures|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/ha/overview/#high-availability], and stopping and restarting a cluster is not a master failure.

> Job cannot be recovered, after restarting cluster
> -------------------------------------------------
>
>                 Key: FLINK-27274
>                 URL: https://issues.apache.org/jira/browse/FLINK-27274
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>    Affects Versions: 1.15.0
>         Environment: Flink 1.15.0-rc3
> [https://github.com/apache/flink/archive/refs/tags/release-1.15.0-rc3.tar.gz] 
>            Reporter: macdoor615
>            Priority: Blocker
>             Fix For: 1.15.1
>
>         Attachments: flink-conf.yaml, flink-gum-standalonesession-0-hb3-dev-flink-000.log.3.zip, flink-gum-standalonesession-0-hb3-dev-flink-000.log.zip, flink-gum-taskexecutor-2-hb3-dev-flink-000.log, log.recover.debug.zip, new_cf_alarm_no_recover.yaml.sql
>
>
> 1. execute new_cf_alarm_no_recover.yaml.sql with sql-client.sh
> config file: flink-conf.yaml
> the job runs properly
> 2. restart cluster with command
> stop-cluster.sh
> start-cluster.sh
> 3. job cannot be recovered
> log files
> flink-gum-standalonesession-0-hb3-dev-flink-000.log
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log
> 4. not all jobs fail to recover: some can be recovered and some cannot, at the same time
> 5. all jobs can be recovered on Flink 1.14.4



--
This message was sent by Atlassian Jira
(v8.20.1#820001)