Posted to dev@flink.apache.org by "KevinyhZou (Jira)" <ji...@apache.org> on 2022/07/19 06:33:00 UTC

[jira] [Created] (FLINK-28604) job failover and not restore from checkpoint in zookeeper HA mode

KevinyhZou created FLINK-28604:
----------------------------------

             Summary: job failover and not restore from checkpoint in zookeeper HA mode
                 Key: FLINK-28604
                 URL: https://issues.apache.org/jira/browse/FLINK-28604
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.14.2
            Reporter: KevinyhZou
         Attachments: image-2022-07-19-14-30-27-198.png

Run a job on Flink 1.14.2 with ZooKeeper HA configured as follows:
{code:java}
high-availability.storageDir: hdfs://testcluster/app/flink/ha
high-availability: zookeeper
high-availability.zookeeper.quorum: *****
high-availability.zookeeper.path.root: /flink
{code}
When a ZooKeeper node restarted, the JM failed over and logged "Close and clean up all data for ZookeeperHaServices", so the HA data was cleaned up when the first JM shut down.

When the second JM started, it logged "No checkpoint found during restore", and there was no checkpoint to restore from.

While debugging, I found that when the job fails over, execution reaches line 285 of `ClusterEntrypoint.java`

!image-2022-07-19-14-30-27-198.png!

and sets `cleanupHaData` to true.
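
To make the suspected sequence concrete, here is a minimal, self-contained sketch. It is not Flink code: `HaStore`, `shutDown`, and `restore` are hypothetical names chosen for illustration, and the store is just an in-memory list. It only models the behaviour described above, where shutting down with the cleanup flag set to true wipes the HA data, so the next JM finds nothing to restore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/**
 * Simplified, hypothetical model of the shutdown path described in this report.
 * None of these types are Flink classes.
 */
public class HaCleanupSketch {

    /** Stand-in for the HA data (checkpoint pointers kept for recovery). */
    static final class HaStore {
        private final List<Long> checkpointIds = new ArrayList<>();

        void addCheckpoint(long id) {
            checkpointIds.add(id);
        }

        /** Releases the handle but keeps the stored checkpoints. */
        void close() {
            // Nothing to delete: data must survive for the next JM.
        }

        /** Releases the handle and deletes all stored checkpoints. */
        void closeAndCleanupAllData() {
            checkpointIds.clear();
        }

        Optional<Long> latestCheckpoint() {
            return checkpointIds.isEmpty()
                    ? Optional.empty()
                    : Optional.of(checkpointIds.get(checkpointIds.size() - 1));
        }
    }

    /** Models the first JM shutting down; the flag decides whether HA data survives. */
    static void shutDown(HaStore store, boolean cleanupHaData) {
        if (cleanupHaData) {
            store.closeAndCleanupAllData();
        } else {
            store.close();
        }
    }

    /** Models the second JM looking for a checkpoint to restore from. */
    static String restore(HaStore store) {
        return store.latestCheckpoint()
                .map(id -> "Restoring from checkpoint " + id)
                .orElse("No checkpoint found during restore");
    }

    public static void main(String[] args) {
        HaStore kept = new HaStore();
        kept.addCheckpoint(42L);
        shutDown(kept, false); // failover without cleanup
        System.out.println(restore(kept));

        HaStore wiped = new HaStore();
        wiped.addCheckpoint(42L);
        shutDown(wiped, true); // failover with cleanup, as observed here
        System.out.println(restore(wiped));
    }
}
```

In this model, the second case reproduces the symptom from the log: the checkpoint existed before the first JM shut down, but it is gone by the time the second JM tries to restore.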

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)