Posted to user@flink.apache.org by Matt Magsombol <ra...@gmail.com> on 2020/06/10 12:48:50 UTC

Running Kubernetes on Flink with Savepoint

We're currently using this template: https://github.com/docker-flink/examples/tree/master/helm/flink to run Flink on Kubernetes as a job-specific cluster (with the small tweak of specifying our class as the cluster's main entry point).


How would I go about setting up savepoints, so that we can modify our currently running jobs (e.g. add new pipelines to the Flink job) without losing state? The reason is that our state has a 1-day TTL, and redeploying our code without that state would force us to rebuild it from scratch.

From the documentation, I gather that I'd need to run some sort of command to trigger a savepoint, but there's no consistent way to do that when deploying with the helm charts linked above.

I see this email thread talking about a certain problem with savepoints + kubernetes but doesn't quite specify how to set this up with helm: https://lists.apache.org/thread.html/4299518f4da2810aa88fe6b21f841880b619f3f8ac264084a318c034%40%3Cuser.flink.apache.org%3E


According to hasun@zendesk in that thread: "We always make a savepoint before we shutdown the job-cluster. So the savepoint is always the latest. When we fix a bug or change the job graph, it can resume well."

This is exactly the use case I'm trying to support. Other than specifying configs, are there any additional parameters I'd need to add in helm so that the job picks up the latest savepoint on startup?
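
For context, the resume I have in mind would look roughly like this in the pod spec. This is only a sketch: the image, class name, and savepoint path are placeholders, and `--fromSavepoint` / `--allowNonRestoredState` are the flags accepted by the standalone job-cluster entrypoint:

```yaml
# Hypothetical container spec for a job-specific cluster resuming from a savepoint.
# The savepoint path must be on storage reachable by all pods (e.g. S3, HDFS).
containers:
  - name: flink-jobmanager
    image: my-registry/my-flink-job:1.10            # placeholder image
    args:
      - "job-cluster"
      - "--job-classname"
      - "com.example.MyStreamingJob"                # placeholder main class
      - "--fromSavepoint"
      - "s3://my-bucket/savepoints/savepoint-abc"   # placeholder path
      - "--allowNonRestoredState"                   # optional: tolerate dropped state
```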

Re: Running Kubernetes on Flink with Savepoint

Posted by Matt Magsombol <ra...@gmail.com>.
Yeah, our setup is a bit outdated (it dates from around Flink 1.7), but we're effectively just using helm templates. When upgrading to 1.10, I just ended up looking at diffs and changelogs for what to change.
Anyway, thanks. I was hoping Flink had a community-supported way of doing this, but I think I know what to do internally now.

On 2020/06/15 15:11:32, Robert Metzger <rm...@apache.org> wrote: 
> Hi Matt,
> 
> sorry for the late reply. Why are you using the "flink-docker" helm example
> instead of
> https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/kubernetes.html
>  or
> https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html
>  ?
> I don't think that the helm charts you are mentioning are actively
> maintained or recommended for production use.
> 
> If you want to create a savepoint in Flink, you'll need to trigger it via
> the JobManager's REST API (independent of how you deploy it). I guess
> you'll have to come up with some tooling that orchestrates triggering a
> savepoint before shutting down / upgrading the job.
> See also:
> https://ci.apache.org/projects/flink/flink-docs-master/monitoring/rest_api.html#jobs-jobid-savepoints
> 
> Best,
> Robert
> 
> 
> 
> On Wed, Jun 10, 2020 at 2:48 PM Matt Magsombol <ra...@gmail.com> wrote:
> 
> > We're currently using this template:
> > https://github.com/docker-flink/examples/tree/master/helm/flink for
> > running kubernetes flink for running a job specific cluster ( with a nit of
> > specifying the class as the main runner for the cluster ).
> >
> >
> > How would I go about setting up adding savepoints, so that we can edit our
> > currently existing running jobs to add pipes to the flink job without
> > having to restart our state? Reasoning is that our state has a 1 day TTL
> > and updating our code without state will have to restart this from scratch.
> >
> > Through documentation, I see that I'd need to run some sort of command.
> > This is not possible to be consistent if we're using the helm charts
> > specified in the link.
> >
> > I see this email thread talking about a certain problem with savepoints +
> > kubernetes but doesn't quite specify how to set this up with helm:
> > https://lists.apache.org/thread.html/4299518f4da2810aa88fe6b21f841880b619f3f8ac264084a318c034%40%3Cuser.flink.apache.org%3E
> >
> >
> > According to hasun@zendesk from that thread, they mention that "We always
> > make a savepoint before we shutdown the job-cluster. So the savepoint is
> > always the latest. When we fix a bug or change the job graph, it can resume
> > well."
> >
> > This is the exact use case that I'm looking to appease. Other than
> > specifying configs, are there any other additional parameters that I'd need
> > to add within helm to specify that it needs to take in the latest savepoint
> > upon starting?
> >
> 
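
For anyone finding this later, the tooling Robert describes could be sketched like this: a small script that hits the JobManager's REST API to trigger a savepoint (POST /jobs/:jobid/savepoints) and then polls the returned trigger handle for completion. The service URL, job id, and target directory below are placeholders for whatever your deployment uses:

```python
import json
import urllib.request

# Assumed in-cluster JobManager service URL; adjust for your deployment.
JOBMANAGER = "http://flink-jobmanager:8081"


def savepoint_request(jobmanager, job_id, target_dir, cancel_job=True):
    """Build the URL and JSON body for Flink's savepoint trigger endpoint.

    Setting cancel-job to true stops the job once the savepoint completes,
    which matches the "savepoint before shutdown" upgrade flow.
    """
    url = f"{jobmanager}/jobs/{job_id}/savepoints"
    body = {"target-directory": target_dir, "cancel-job": cancel_job}
    return url, body


def trigger_savepoint(jobmanager, job_id, target_dir):
    """POST the trigger request and return the async request-id."""
    url, body = savepoint_request(jobmanager, job_id, target_dir)
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["request-id"]


def savepoint_status(jobmanager, job_id, request_id):
    """Poll the trigger handle; once the operation status is COMPLETED,
    the response carries the savepoint location to restart the job from."""
    url = f"{jobmanager}/jobs/{job_id}/savepoints/{request_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

A deploy script would call trigger_savepoint, loop on savepoint_status until completion, record the savepoint path, and pass it to the new job cluster on startup.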
