Posted to dev@flink.apache.org by Gyula Fóra <gy...@apache.org> on 2022/05/19 06:13:49 UTC

Re: Flink Kubernetes operator not having a scale subresource

Hi Team!

This is probably something for after the release, but I created a simple
prototype for the scale subresource based on the TaskManager replica count.
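
For readers unfamiliar with the mechanism: a CRD scale subresource is
declared through three JSON paths (specReplicasPath, statusReplicasPath,
labelSelectorPath), and that is what lets "kubectl scale" treat a custom
resource like a Deployment. The Python sketch below only illustrates how such
paths resolve against a resource. The field names under taskManager and the
example resource are assumptions made for illustration and may not match the
PR.

```python
from typing import Any

# Hypothetical scale subresource declaration. The three keys are the ones
# Kubernetes defines; the path values are assumed for illustration only.
SCALE_SUBRESOURCE = {
    "specReplicasPath": ".spec.taskManager.replicas",
    "statusReplicasPath": ".status.taskManager.replicas",
    "labelSelectorPath": ".status.taskManager.labelSelector",
}


def resolve(resource: dict, path: str) -> Any:
    """Resolve a dot-notation JSON path (no array notation), or return None."""
    node: Any = resource
    for key in path.lstrip(".").split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def scale_view(resource: dict) -> dict:
    """Return the values a scale request would read from the resource."""
    return {name: resolve(resource, path) for name, path in SCALE_SUBRESOURCE.items()}


# Made-up FlinkDeployment fragment, not taken from a real cluster.
example_deployment = {
    "spec": {"taskManager": {"replicas": 3}},
    "status": {
        "taskManager": {
            "replicas": 2,
            "labelSelector": "app=example,component=taskmanager",
        }
    },
}

if __name__ == "__main__":
    print(scale_view(example_deployment))
```

If a path does not exist in the resource, resolve() returns None, which is
a quick way to spot a path that points at a field missing from the CRD
schema.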

You can take a look here:
https://github.com/apache/flink-kubernetes-operator/pull/227

After some consideration, I decided against using parallelism and used
TaskManager replicas instead (still with native integration); I describe
this in the PR.
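
As a rough illustration of how the two knobs relate: with a fixed number of
slots per TaskManager, a replica count implies a parallelism and vice versa.
The sketch below assumes every TaskManager offers the same number of slots
(taskmanager.numberOfTaskSlots) and that the job needs one slot per unit of
parallelism. The function names are hypothetical and this is not the
operator's code.

```python
import math


def parallelism_for_replicas(replicas: int, slots_per_tm: int) -> int:
    """Parallelism that exactly fills the given number of TaskManagers."""
    if replicas < 0 or slots_per_tm < 1:
        raise ValueError("replicas must be >= 0 and slots_per_tm >= 1")
    return replicas * slots_per_tm


def replicas_for_parallelism(parallelism: int, slots_per_tm: int) -> int:
    """Smallest number of TaskManagers that can host the given parallelism."""
    if parallelism < 0 or slots_per_tm < 1:
        raise ValueError("parallelism must be >= 0 and slots_per_tm >= 1")
    return math.ceil(parallelism / slots_per_tm)


if __name__ == "__main__":
    print(parallelism_for_replicas(3, 2))
    print(replicas_for_parallelism(5, 2))
```

Note that the mapping is not symmetric: several parallelism values round up
to the same replica count, while each replica count maps to exactly one
parallelism value.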

I will leave the PR open so people can experiment/comment and we should
definitely get back to this after the 1.0.0 release because it seems to be
a very lightweight yet useful feature.

Cheers,
Gyula


On Sat, May 7, 2022 at 11:25 AM Gyula Fóra <gy...@apache.org> wrote:

> Hi Jay!
>
> I will take a closer look into this and see if we can use the parallelism
> in the scale subresource.
>
> If you could experiment with this and see if it works with the current CRD,
> that would be helpful. I am not sure whether we need to change the status
> or anything else, as parallelism is only part of the spec at the moment.
>
> If you have a working modified CRD, I would appreciate it if you could
> share it with us!
>
> Don’t worry about the release schedule. If we think that this is important
> and we need some changes for it, we can push the release out a few days if
> necessary.
>
> What is important at this point is to understand exactly what we need to
> make the parallelism scaling work natively, so that we avoid breaking
> changes to the spec/status after the release :)
>
> Cheers
> Gyula
>
> On Sat, 7 May 2022 at 11:14, Jay Ghiya <gh...@gmail.com> wrote:
>
>> Hi Team,
>>
>> Yes, we can change the parallelism of the Flink job. Going through the
>> roadmap, what I understand is that standalone mode has been made a second
>> priority, for the right reasons. If possible, can I be of any help in
>> accelerating this? We have a tight release schedule and would like to close
>> this in the next 10 days with your help.
>>
>> Looking forward to hearing from you!
>>
>> -Jay
>>
>> On 7 May 2022, 8:15 AM +0530, Yang Wang <da...@gmail.com>, wrote:
>>
>> Currently, the flink-kubernetes-operator uses the Flink native K8s
>> integration[1], which means the Flink ResourceManager dynamically allocates
>> TaskManagers on demand.
>> So users do not need to specify the number of TaskManager replicas.
>>
>> Just like Gyula said, one possible solution to make "kubectl scale" work
>> is to change the parallelism of the Flink job.
>>
>> If the standalone mode[2] is introduced in the operator, then it is also
>> possible to directly change the replicas of TaskManager pods.
>>
>>
>> [1].
>> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/native_kubernetes/
>> [2].
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-225%3A+Implement+standalone+mode+support+in+the+kubernetes+operator
>>
>> Best,
>> Yang
>>
>> On Sat, May 7, 2022 at 04:26, Gyula Fóra <gy...@apache.org> wrote:
>>
>>> Hi Jay!
>>>
>>> Interesting question/proposal to add the scale-subresource.
>>>
>>> I am not an expert in this area, but we will look into this a little,
>>> give you some feedback, and see if we can incorporate something into the
>>> upcoming release if it makes sense.
>>>
>>> On a high level there is not a single replicas value for a
>>> FlinkDeployment that would be easy to map, but maybe we could use the
>>> parallelism value for this purpose for Applications/Session jobs.
>>>
>>> Cheers,
>>> Gyula
>>>
>>> On Fri, May 6, 2022 at 8:04 PM Jay Ghiya <gh...@gmail.com> wrote:
>>>
>>>>  Hi Team,
>>>>
>>>>
>>>> I have been experimenting with the Flink Kubernetes operator. One of the
>>>> biggest gaps for us is that it does not currently support the scale
>>>> subresource, which we need for reactive scaling. Without it, commercial
>>>> use becomes very difficult for products like ours, whose load varies
>>>> widely from hour to hour.
>>>>
>>>>
>>>>
>>>> Can I get some direction on how to contribute a scale subresource
>>>> (https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource)
>>>> for our Kubernetes operator CRD?
>>>>
>>>> I have been having a hard time reading
>>>> https://github.com/apache/flink-kubernetes-operator/blob/main/helm/flink-kubernetes-operator/crds/flinkdeployments.flink.apache.org-v1.yml
>>>> to figure out the replicas, status, and label selector JSON paths of the
>>>> TaskManager. It may be due to my lack of knowledge, so a sense of
>>>> direction would help me.
>>>>
>>>> *-Jay*
>>>> *GEHC*
>>>>
>>>