Posted to dev@flink.apache.org by Őrhidi Mátyás <ma...@gmail.com> on 2022/03/01 12:51:24 UTC

[DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Hi All!

I'd like to start a quick discussion about the way we allow users to
trigger savepoints manually in the operator [FLINK-26181]
<https://issues.apache.org/jira/browse/FLINK-26181>. There are existing
solutions already for this functionality in other operators, for example:
- counter based
<https://github.com/spotify/flink-on-k8s-operator/blob/master/docs/savepoints_guide.md#2-taking-savepoints-by-updating-the-flinkcluster-custom-resource>
- annotation based
<https://github.com/spotify/flink-on-k8s-operator/blob/master/docs/savepoints_guide.md#3-taking-savepoints-by-attaching-annotation-to-the-flinkcluster-custom-resource>

We could implement either of these or both, or come up with our own approach.
It seems, the java-operator-sdk handles the changes of the .metadata and
.spec fields of custom resources differently. For further info see the
chapter Generation Awareness and Event Filtering in the docs
<https://javaoperatorsdk.io/docs/features>.
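
To make the difference concrete, here is a minimal Java sketch of the
filtering behavior described in that chapter. It is a toy model and does
not use the java-operator-sdk API: the Resource class, the savepointCounter
field and the example.org/savepoint-trigger annotation key are made up for
illustration. It assumes the usual Kubernetes behavior that
metadata.generation increases on .spec changes but not on .metadata changes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of generation-aware event filtering; not the java-operator-sdk API. */
final class GenerationFilterSketch {

    /** Minimal stand-in for a custom resource. */
    static final class Resource {
        final long generation;                 // assumed to change on .spec edits only
        final long savepointCounter;           // hypothetical .spec field (counter-based trigger)
        final Map<String, String> annotations; // .metadata.annotations (annotation-based trigger)

        Resource(long generation, long savepointCounter, Map<String, String> annotations) {
            this.generation = generation;
            this.savepointCounter = savepointCounter;
            this.annotations = Map.copyOf(annotations);
        }

        /** Counter-based trigger: edits .spec, so the generation increases. */
        Resource withCounterBumped() {
            return new Resource(generation + 1, savepointCounter + 1, annotations);
        }

        /** Annotation-based trigger: edits .metadata only, so the generation is unchanged. */
        Resource withAnnotation(String key, String value) {
            Map<String, String> copy = new HashMap<>(annotations);
            copy.put(key, value);
            return new Resource(generation, savepointCounter, copy);
        }
    }

    /** Decides whether an update event would reach the reconciler. */
    static boolean shouldReconcile(Resource before, Resource after, boolean generationAware) {
        if (!generationAware) {
            return true; // every update is delivered
        }
        return after.generation > before.generation; // metadata-only updates are skipped
    }

    public static void main(String[] args) {
        Resource base = new Resource(1, 0, Map.of());
        Resource viaCounter = base.withCounterBumped();
        Resource viaAnnotation = base.withAnnotation("example.org/savepoint-trigger", "1");

        System.out.println(shouldReconcile(base, viaCounter, true));     // true
        System.out.println(shouldReconcile(base, viaAnnotation, true));  // false
        System.out.println(shouldReconcile(base, viaAnnotation, false)); // true
    }
}
```

In this model the counter-based trigger works with either setting, while
the annotation-based trigger only reaches the reconciler when generation
awareness is switched off.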

Let me know what you think.

Cheers,
Matyas

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Yang Wang <wa...@apache.org>.
Using a separate CR to manage savepoints is a really good idea.
With that in place, managing savepoints will be easier and we will not
leak unusable savepoints in object storage.


Best,
Yang

On Wed, Mar 13, 2024 at 4:40 AM Gyula Fóra <gy...@gmail.com> wrote:

> That would be great Mate! If you could draw up a FLIP for this that would
> be nice as this is a rather large change that will have a significant
> impact for existing users.
>
> If possible it would be good to provide some backward compatibility /
> transition period while we preserve the current content of the status so
> it's easy to migrate to the new savepoint CRs.
>
> Cheers,
> Gyula

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Gyula Fóra <gy...@gmail.com>.
That would be great Mate! If you could draw up a FLIP for this that would
be nice as this is a rather large change that will have a significant
impact for existing users.

If possible it would be good to provide some backward compatibility /
transition period while we preserve the current content of the status so
it's easy to migrate to the new savepoint CRs.

Cheers,
Gyula

On Tue, Mar 12, 2024 at 9:22 PM Mate Czagany <cz...@gmail.com> wrote:

> Hi,
>
> I really like this idea as well, I think it would be a great improvement
> compared to how manual savepoints currently work, and suits Kubernetes
> workflows a lot better.
>
> If there are no objections, I can investigate it during the next few weeks
> and see how this could be implemented in the current code.
>
> Cheers,
> Mate

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Mate Czagany <cz...@gmail.com>.
Hi,

I really like this idea as well, I think it would be a great improvement
compared to how manual savepoints currently work, and suits Kubernetes
workflows a lot better.

If there are no objections, I can investigate it during the next few weeks
and see how this could be implemented in the current code.

Cheers,
Mate

On Tue, Mar 12, 2024 at 16:01 Gyula Fóra <gy...@gmail.com> wrote:

> That's definitely a good improvement Robert and we should add it at some
> point. At the point in time when this was implemented we went with the
> current simpler / more lightweight approach.
> However if anyone is interested in working on this / contributing this
> improvement I would personally support it.
>
> Gyula

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Gyula Fóra <gy...@gmail.com>.
That's definitely a good improvement Robert and we should add it at some
point. At the point in time when this was implemented we went with the
current simpler / more lightweight approach.
However if anyone is interested in working on this / contributing this
improvement I would personally support it.

Gyula

On Tue, Mar 12, 2024 at 3:53 PM Robert Metzger <rm...@apache.org> wrote:

> Have you guys considered making savepoints a first class citizen in the
> Kubernetes operator?
> E.g. to trigger a savepoint, you create a "FlinkSavepoint" CR, the K8s
> operator picks up that resource and tries to create a savepoint
> indefinitely until the savepoint has been successfully created. We report
> the savepoint status and location in the "status" field.
>
> We could even add an (optional) finalizer to delete the physical savepoint
> from the savepoint storage once the "FlinkSavepoint" CR has been deleted.
> optional: the savepoint spec could contain a field "retain
> physical savepoint" or something, that controls the delete behavior.

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Robert Metzger <rm...@apache.org>.
Have you guys considered making savepoints a first-class citizen in the
Kubernetes operator?
E.g. to trigger a savepoint, you create a "FlinkSavepoint" CR; the K8s
operator picks up that resource and keeps retrying until the savepoint
has been created successfully. We report the savepoint status and
location in the "status" field.

We could even add an (optional) finalizer to delete the physical savepoint
from the savepoint storage once the "FlinkSavepoint" CR has been deleted.
Optionally, the savepoint spec could contain a field such as "retain
physical savepoint" that controls the delete behavior.
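
A rough Java sketch of that lifecycle, to make the idea easier to discuss.
Everything here is hypothetical: there is no FlinkSavepoint CR yet, and the
SavepointBackend interface, the retainPhysicalSavepoint field and the state
names are placeholders rather than operator or Flink APIs.

```java
/** Toy model of the proposed "FlinkSavepoint" lifecycle; all names are hypothetical. */
final class FlinkSavepointSketch {

    enum State { PENDING, COMPLETED }

    /** Stand-in for the calls the operator would make to Flink and to savepoint storage. */
    interface SavepointBackend {
        /** Returns the savepoint location, or null if this attempt failed. */
        String tryTrigger();

        void delete(String location);
    }

    /** Spec and status of one savepoint resource, flattened into a single object. */
    static final class Savepoint {
        final boolean retainPhysicalSavepoint; // spec: keep the files when the CR is deleted
        State state = State.PENDING;           // status: current state
        String location;                       // status: set once the savepoint completes

        Savepoint(boolean retainPhysicalSavepoint) {
            this.retainPhysicalSavepoint = retainPhysicalSavepoint;
        }
    }

    /** One reconcile pass; returns true if another pass should be scheduled. */
    static boolean reconcile(Savepoint sp, SavepointBackend backend) {
        if (sp.state == State.COMPLETED) {
            return false; // nothing left to do
        }
        String location = backend.tryTrigger();
        if (location == null) {
            return true; // attempt failed, retry later
        }
        sp.state = State.COMPLETED;
        sp.location = location;
        return false;
    }

    /** Finalizer logic, run when the resource is deleted. */
    static void onDelete(Savepoint sp, SavepointBackend backend) {
        if (sp.state == State.COMPLETED && !sp.retainPhysicalSavepoint) {
            backend.delete(sp.location);
        }
    }

    public static void main(String[] args) {
        SavepointBackend flaky = new SavepointBackend() {
            private int attempts = 0;

            @Override
            public String tryTrigger() {
                attempts++;
                return attempts < 3 ? null : "s3://example-bucket/savepoint-1";
            }

            @Override
            public void delete(String location) {
                System.out.println("deleting " + location);
            }
        };

        Savepoint sp = new Savepoint(false);
        while (reconcile(sp, flaky)) {
            // a real operator would reschedule with a delay instead of looping
        }
        System.out.println(sp.state + " at " + sp.location);
        onDelete(sp, flaky);
    }
}
```

With retainPhysicalSavepoint set to true, onDelete would leave the files in
place and only the resource itself would go away.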


On Thu, Mar 3, 2022 at 4:02 AM Yang Wang <da...@gmail.com> wrote:

> I agree that we could start with the annotation approach and collect the
> feedback at the same time.
>
> Best,
> Yang

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Yang Wang <da...@gmail.com>.
I agree that we could start with the annotation approach and collect
feedback at the same time.

Best,
Yang

On Wed, Mar 2, 2022 at 20:06 Őrhidi Mátyás <ma...@gmail.com> wrote:

> Thank you for your feedback!
>
> The annotation on the
>
> @ControllerConfiguration(generationAwareEventProcessing = false)
> FlinkDeploymentController
>
> already enables the event triggering based on metadata changes. It was set
> earlier to support some failure scenarios. (It can be used for example to
> manually reenable the reconcile loop when it got stuck in an error phase)
>
> I will go ahead and propose a PR using annotations then.
>
> Cheers,
> Matyas

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Őrhidi Mátyás <ma...@gmail.com>.
Thank you for your feedback!

The annotation on the

@ControllerConfiguration(generationAwareEventProcessing = false)
FlinkDeploymentController

already enables event triggering based on metadata changes. It was set
earlier to support some failure scenarios. (For example, it can be used to
manually re-enable the reconcile loop when it gets stuck in an error phase.)

I will go ahead and propose a PR using annotations then.
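
One consequence worth keeping in mind: with generation awareness off, the
reconciler is expected to run on every update, so an annotation-based
trigger needs some way to tell a new request from one it has already
handled. A minimal Java sketch of one possible check follows. The
annotation key and the idea of recording the last handled value are
assumptions for illustration only, not what the PR will necessarily do.

```java
import java.util.Map;
import java.util.Optional;

/** Toy check for an annotation-based savepoint trigger; names are hypothetical. */
final class AnnotationTriggerSketch {

    // Hypothetical annotation key, chosen only for this illustration.
    static final String TRIGGER_KEY = "example.org/savepoint-trigger";

    /**
     * Returns the trigger value to act on, or empty if there is nothing new to do.
     *
     * @param annotations current .metadata.annotations of the resource
     * @param lastHandled trigger value recorded after the previous savepoint, or null
     */
    static Optional<String> pendingTrigger(Map<String, String> annotations, String lastHandled) {
        String requested = annotations.get(TRIGGER_KEY);
        if (requested == null || requested.equals(lastHandled)) {
            return Optional.empty(); // no request, or the request was already handled
        }
        return Optional.of(requested);
    }

    public static void main(String[] args) {
        Map<String, String> annotated = Map.of(TRIGGER_KEY, "1");

        System.out.println(pendingTrigger(Map.of(), null));  // Optional.empty
        System.out.println(pendingTrigger(annotated, null)); // Optional[1]
        System.out.println(pendingTrigger(annotated, "1"));  // Optional.empty
    }
}
```

Changing the annotation value to something new (for example "2") would
make the check return a pending trigger again.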

Cheers,
Matyas

On Wed, Mar 2, 2022 at 12:47 PM Yang Wang <da...@gmail.com> wrote:

> I also like the annotation approach since it is more natural.
> But I am not sure about whether the meta data change will trigger an event
> in java-operator-sdk.
>
>
> Best,
> Yang

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Yang Wang <da...@gmail.com>.
I also like the annotation approach since it is more natural.
But I am not sure whether a metadata change will trigger an event
in java-operator-sdk.


Best,
Yang

On Wed, Mar 2, 2022 at 4:29 PM Gyula Fóra <gy...@gmail.com> wrote:

> Thanks Matyas,
>
> From a user perspective I think the annotation is pretty nice and
> user-friendly, so I personally prefer that approach.
>
> You said:
>  "It seems, the java-operator-sdk handles the changes of the .metadata and
> .spec fields of custom resources differently."
>
> What implications does this have on the two approaches mentioned above? Does
> it make one more difficult than the other?
>
> Cheers
> Gyula
>
>
>
> On Tue, Mar 1, 2022 at 1:52 PM Őrhidi Mátyás <ma...@gmail.com> wrote:
>
> > Hi All!
> >
> > I'd like to start a quick discussion about the way we allow users to
> > trigger savepoints manually in the operator [FLINK-26181]
> > <https://issues.apache.org/jira/browse/FLINK-26181>. There are existing
> > solutions already for this functionality in other operators, for example:
> > - counter based
> > <https://github.com/spotify/flink-on-k8s-operator/blob/master/docs/savepoints_guide.md#2-taking-savepoints-by-updating-the-flinkcluster-custom-resource>
> > - annotation based
> > <https://github.com/spotify/flink-on-k8s-operator/blob/master/docs/savepoints_guide.md#3-taking-savepoints-by-attaching-annotation-to-the-flinkcluster-custom-resource>
> >
> > We could implement any of these or both or come up with our own approach.
> > It seems, the java-operator-sdk handles the changes of the .metadata and
> > .spec fields of custom resources differently. For further info see the
> > chapter Generation Awareness and Event Filtering in the docs
> > <https://javaoperatorsdk.io/docs/features>.
> >
> > Let me know what you think.
> >
> > Cheers,
> > Matyas
> >
>

Re: [DISCUSS] Manual savepoint triggering in flink-kubernetes-operator

Posted by Gyula Fóra <gy...@gmail.com>.
Thanks Matyas,

From a user perspective I think the annotation is pretty nice and
user-friendly, so I personally prefer that approach.

You said:
 "It seems, the java-operator-sdk handles the changes of the .metadata and
.spec fields of custom resources differently."

What implications does this have on the two approaches mentioned above? Does
it make one more difficult than the other?

Cheers
Gyula



On Tue, Mar 1, 2022 at 1:52 PM Őrhidi Mátyás <ma...@gmail.com> wrote:

> Hi All!
>
> I'd like to start a quick discussion about the way we allow users to
> trigger savepoints manually in the operator [FLINK-26181]
> <https://issues.apache.org/jira/browse/FLINK-26181>. There are existing
> solutions already for this functionality in other operators, for example:
> - counter based
> <https://github.com/spotify/flink-on-k8s-operator/blob/master/docs/savepoints_guide.md#2-taking-savepoints-by-updating-the-flinkcluster-custom-resource>
> - annotation based
> <https://github.com/spotify/flink-on-k8s-operator/blob/master/docs/savepoints_guide.md#3-taking-savepoints-by-attaching-annotation-to-the-flinkcluster-custom-resource>
>
> We could implement any of these or both or come up with our own approach.
> It seems, the java-operator-sdk handles the changes of the .metadata and
> .spec fields of custom resources differently. For further info see the
> chapter Generation Awareness and Event Filtering in the docs
> <https://javaoperatorsdk.io/docs/features>.
>
> Let me know what you think.
>
> Cheers,
> Matyas
>