Posted to user@flink.apache.org by "Liting Liu (litiliu)" <li...@cisco.com> on 2022/10/19 07:33:40 UTC

Re: Does kubernetes operator support manually triggering savepoint with canceling the job?

Hi Geng,
   I successfully triggered a savepoint manually, but the job was still running after the savepoint completed. I expected the job to be deleted, because the savepoint had been taken.
  jobStatus:
    jobId: 9de925e9d4a67e04ef6279925450907c
    jobName: sql-te-lab-s334c9
    savepointInfo:
      lastPeriodicSavepointTimestamp: 0
      lastSavepoint:
        location: >-
          hdfs://flink/sql-te/savepoint-9de925-b9ead1c58e7b
        timeStamp: 1666163606426
        triggerType: MANUAL
      savepointHistory:
        - location: >-
            hdfs://flink/sql-te/savepoint-9de925-b9ead1c58e7b
          timeStamp: 1666163606426
          triggerType: MANUAL
      triggerId: ''
      triggerTimestamp: 0
      triggerType: MANUAL
    startTime: '1666161791058'
    state: RUNNING

________________________________
From: Geng Biao <bi...@gmail.com>
Sent: October 4, 2022 13:57
To: Liting Liu (litiliu) <li...@cisco.com>; user <us...@flink.apache.org>
Subject: Re: Does kubernetes operator support manually triggering savepoint with canceling the job?

Hi liting,

Maybe you can check the code of deleteClusterDeployment. When the savepoint is finished, the operator will delete the job. Is the job not deleted as expected?

Best,
Biao Geng

________________________________
From: Liting Liu (litiliu) <li...@cisco.com>
Sent: Tuesday, October 4, 2022 12:53:45 PM
To: user <us...@flink.apache.org>
Subject: Does kubernetes operator support manually triggering savepoint with canceling the job?

Hello Flink community,
   I want to manually trigger a savepoint with the help of the Kubernetes operator, but the operator does not seem to provide an option for canceling the job when the savepoint is triggered: the `cancelJob` parameter is hard-coded to false in the latest code, AbstractFlinkService.java#L299<https://github.com/apache/flink-kubernetes-operator/blob/1f6a75056acae90e9fab182fd076ee6755b35bbb/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L299>.
   Do I have to watch for the savepoint to finish myself and then cancel the job ASAP? And is there a plan to support this option?

Re: Does kubernetes operator support manually triggering savepoint with canceling the job?

Posted by Gyula Fóra <gy...@gmail.com>.
I think you are confusing manual savepoints with savepoint upgrades.

Manual savepoints will trigger a savepoint but not shut down the job. If
you want to stop the job with a savepoint, set the upgradeMode to
savepoint and set the job state to SUSPENDED.
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/job-management/
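
A minimal sketch of what that change could look like in a FlinkDeployment
CR. The resource name is a hypothetical placeholder, and all other required
fields (image, flinkVersion, jobManager, taskManager, and so on) are omitted.
It assumes a savepoint directory is already configured for the job.

```yaml
# Sketch only: metadata.name is hypothetical; unrelated spec fields are omitted.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: my-flink-job          # hypothetical resource name
spec:
  job:
    upgradeMode: savepoint    # take a savepoint when the job is stopped or upgraded
    state: suspended          # desired state; was "running" before the change
```

Applying the edited CR (for example with kubectl apply) asks the operator to
stop the job with a savepoint, rather than leaving it RUNNING as a manual
savepoint does.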

The operator is not really a substitute for the CLI for these low-level
operations. They are possible, but the main goal is to provide comprehensive
lifecycle management easily.
You don't usually need to stop with a savepoint. Instead, you usually do a
stateful upgrade by changing the CR.
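
As an illustration of a stateful upgrade, here is a hedged sketch of a CR
fragment in which only the parallelism is changed. The values 2 and 4 are
hypothetical. With upgradeMode set to savepoint, the operator is expected to
suspend the job with a savepoint and redeploy it from that savepoint using the
new spec, as described in the job management docs linked above.

```yaml
# Fragment of a FlinkDeployment spec; values are illustrative assumptions.
spec:
  job:
    upgradeMode: savepoint    # state is carried across the upgrade via a savepoint
    state: running            # the job stays running; only the spec changes
    parallelism: 4            # changed from 2 (hypothetical values)
```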

Hope this helps
Gyula
