Posted to dev@flink.apache.org by "Jassat, Usamah" <us...@amazon.co.uk.INVALID> on 2022/04/25 10:25:11 UTC

[DISCUSS] FLIP-223: Implement standalone mode support in the kubernetes operator

Hi everyone,

We would like to start a discussion on adding standalone mode support to the Flink Kubernetes operator. Standalone mode was initially considered as part of FLIP-212, but it was decided to be out of scope so that FLIP could focus on the Flink native k8s integration [1]. Standalone support would also open the door to supporting previous Flink versions in the operator, which I would also like to discuss.

I have created a FLIP with the details on the general changes that we are proposing: https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Implement+standalone+mode+support+in+the+kubernetes+operator


Looking forward to your feedback.

Regards,
Usamah

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator

Re: [DISCUSS] FLIP-223: Implement standalone mode support in the kubernetes operator

Posted by Gyula Fóra <gy...@gmail.com>.
+1 for the proposal :)

I think this will fit nicely with the current API.

If we use deployments we could simply extend the current status interfaces
with taskManagerDeploymentStatus (we already have
jobManagerDeploymentStatus) for the standalone mode.
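
For illustration, a minimal sketch of how such a status extension could
look. The class name StandaloneClusterStatus, the DeploymentStatus enum and
its values, and the isClusterReady helper are assumptions made up for this
example and are not the operator's actual types; only the two field names
come from the paragraph above.

```java
import java.util.Objects;

// Hypothetical sketch: none of these types are the operator's real classes.

/** Illustrative lifecycle states for a k8s Deployment managed by the operator. */
enum DeploymentStatus {
    MISSING,
    DEPLOYING,
    READY,
    ERROR
}

/** Status object carrying one field per Deployment used in standalone mode. */
final class StandaloneClusterStatus {
    // Field name mentioned in the thread as already existing.
    private DeploymentStatus jobManagerDeploymentStatus = DeploymentStatus.MISSING;
    // Field name suggested in the thread for standalone mode.
    private DeploymentStatus taskManagerDeploymentStatus = DeploymentStatus.MISSING;

    DeploymentStatus getJobManagerDeploymentStatus() {
        return jobManagerDeploymentStatus;
    }

    void setJobManagerDeploymentStatus(DeploymentStatus status) {
        this.jobManagerDeploymentStatus = Objects.requireNonNull(status);
    }

    DeploymentStatus getTaskManagerDeploymentStatus() {
        return taskManagerDeploymentStatus;
    }

    void setTaskManagerDeploymentStatus(DeploymentStatus status) {
        this.taskManagerDeploymentStatus = Objects.requireNonNull(status);
    }

    /** Assumed rule: the cluster counts as ready only when both Deployments are READY. */
    boolean isClusterReady() {
        return jobManagerDeploymentStatus == DeploymentStatus.READY
                && taskManagerDeploymentStatus == DeploymentStatus.READY;
    }
}
```

Keeping the two fields separate would let a reconciler tell a JobManager
that is up from a TaskManager pool that is still rolling out.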

We would have to check how we could use ZooKeeper HA to implement
last-state. There is quite a bit of trickery involved in the current
implementation that is specific to k8s HA.

Gyula

Re: [DISCUSS] FLIP-223: Implement standalone mode support in the kubernetes operator

Posted by "Jassat, Usamah" <us...@amazon.co.uk.INVALID>.
Thanks for the feedback.
 
# The TaskManager replicas
Yes, I think this makes sense: explicitly stating the TM replicas is a better fit for standalone mode. I will update the FLIP to clarify this.

# How are the JobManager and TaskManager pods managed?
I think Deployments for both the TaskManager and JobManager pods should be sufficient, as we just need a pool of TaskManagers. This also matches the Flink documentation for setting up a Flink standalone cluster on k8s [1].


# Version support
It's great to hear that we can support 1.13.

I would like to support ZooKeeper HA for last-state upgrades. However, if some limitation makes this impossible, we can fall back to imposing limitations on older versions. Do you know of any limitations of ZooKeeper HA that would prevent it from working with the last-state upgrade mode?

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/

Thanks,
Usamah

Re: [DISCUSS] FLIP-223: Implement standalone mode support in the kubernetes operator

Posted by Yang Wang <da...@gmail.com>.
Thanks for creating FLIP-223 and starting the discussion.

I have some quick questions.

# The TaskManager replicas


The TaskManager replicas need to be configured for both standalone session
and application mode, because they cannot be calculated if the parallelism
is set in Java code.
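
To make the point concrete, here is a rough sketch of how a replica count
could be resolved. Everything in it is an assumption for illustration: the
class and method names are hypothetical, and the fallback of deriving
replicas from a configured parallelism and slots per TaskManager is just one
possible rule, not something the FLIP defines. When neither value is known
(for example because the parallelism is only set in the job's Java code),
the sketch fails and asks for an explicit value.

```java
// Hypothetical helper, not part of the operator or of Flink.
final class TaskManagerReplicas {
    private TaskManagerReplicas() {}

    /**
     * Resolves the number of TaskManager pods for a standalone cluster.
     *
     * @param explicitReplicas replicas set by the user, or null if unset
     * @param configuredParallelism parallelism known from configuration, or null
     *     if it is only set inside the job's Java code
     * @param slotsPerTaskManager task slots offered by each TaskManager
     */
    static int resolve(
            Integer explicitReplicas, Integer configuredParallelism, int slotsPerTaskManager) {
        if (slotsPerTaskManager <= 0) {
            throw new IllegalArgumentException("slotsPerTaskManager must be positive");
        }
        // An explicit value always wins.
        if (explicitReplicas != null) {
            if (explicitReplicas <= 0) {
                throw new IllegalArgumentException("replicas must be positive");
            }
            return explicitReplicas;
        }
        // Assumed fallback: ceil(parallelism / slots) when the parallelism is known.
        if (configuredParallelism != null) {
            if (configuredParallelism <= 0) {
                throw new IllegalArgumentException("parallelism must be positive");
            }
            return (configuredParallelism + slotsPerTaskManager - 1) / slotsPerTaskManager;
        }
        // Nothing to calculate from, so the user has to state the replicas.
        throw new IllegalStateException(
                "TaskManager replicas must be configured explicitly when the parallelism is unknown");
    }
}
```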


# How are the JobManager and TaskManager pods managed?

We could use a k8s Deployment to manage the JobManager pods. Of course, a
k8s Job or StatefulSet would also make sense.


What would you like to do for the TaskManager pods?



# Version support

Native support could work from 1.13, and I have created a ticket for this [1].


For the last-state upgrade mode, K8s HA needs to be enabled. I am afraid
that even standalone mode could not work for versions before 1.12.

Do you want to introduce ZooKeeper HA, or add some limitations on the
choice of version?




[1]. https://issues.apache.org/jira/browse/FLINK-27412



Best,

Yang
