You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by felixzheng zheng <fe...@gmail.com> on 2020/02/21 01:30:42 UTC

[DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Hi everyone,

I would like to kick off a discussion on refactoring the existing
Kubernetes resources construction architecture design.

I created a design document [1] that clarifies our motivation to do this
and some improvement proposals for the new design.

Briefly, we would like to
1. Introduce a unified monadic-step based orchestrator architecture that
has a better, cleaner and consistent abstraction for the Kubernetes
resources construction process, both applicable to the client side and the
master side.
2. Add some dedicated tools for centrally parsing, verifying, and managing
the Kubernetes parameters.

It would be great to start the efforts before adding big features for the
native Kubernetes submodule, and Tison and I plan to work together for this
issue.

Please let me know your thoughts.

Regards,
Canbin Zheng

[1]
https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
[2] https://issues.apache.org/jira/browse/FLINK-16194

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Canbin Zheng <fe...@gmail.com>.
Thanks very much Yang Wang.

Cheers,
Canbin Zheng

Yang Wang <da...@gmail.com> 于2020年2月27日周四 下午9:19写道:

> Great work! I could help to review and test.
>
> Best,
> Yang
>
> Canbin Zheng <fe...@gmail.com> 于2020年2月27日周四 下午4:24写道:
>
>> Hi, everyone,
>>
>> I have pushed a PR <https://github.com/apache/flink/pull/11233> for this
>> issue, looking forward to your feedback.
>>
>>
>> Cheers,
>> Canbin Zheng
>>
>> Canbin Zheng <fe...@gmail.com> 于2020年2月26日周三 上午10:39写道:
>>
>>> Thanks for the detailed PR advice, I would separate the commits as clear
>>> as possible to help the code reviewing.
>>>
>>>
>>> Cheers,
>>> Canbin Zheng
>>>
>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午10:11写道:
>>>
>>>> Thanks for your clarification Yang! We're on the same page.
>>>>
>>>> Best,
>>>> tison.
>>>>
>>>>
>>>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午10:07写道:
>>>>
>>>>> Hi tison,
>>>>>
>>>>> I do not mean to keep two decorator at the same. Since the two
>>>>> decorators are
>>>>> not api compatible, it is meaningless. I am just thinking how
>>>>> to organize the
>>>>> commits/PRs to make the review easier. The reviewers may need some
>>>>> context
>>>>> to get the point.
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>> Yang
>>>>>
>>>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:23写道:
>>>>>
>>>>>> The process in my mind is somehow like this commit[1] which belongs
>>>>>> to this pr[2]
>>>>>> that we firstly introduce the new implementation and then replace it
>>>>>> with the original
>>>>>> one. The difference is that these two versions of decorators are not
>>>>>> api compatible
>>>>>> while adding a switch for such an internal abstraction or extracting
>>>>>> a clumsy
>>>>>> "common" interface doesn't benefit.
>>>>>>
>>>>>> Best,
>>>>>> tison.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/apache/flink/commit/1f2969357c441e24b71daef83d21563da9a93bb4
>>>>>> [2] https://github.com/apache/flink/pull/9832
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:08写道:
>>>>>>
>>>>>>> I agree for separating commits we can have multiple commits that
>>>>>>> firstly add the new parameters
>>>>>>> and decorators,  and later replace current decorators with new
>>>>>>> decorators which are well
>>>>>>> unit tested.
>>>>>>>
>>>>>>> However, it makes no sense we have two codepaths from
>>>>>>> FlinkKubeClient to decorators
>>>>>>> since these two version of decorators are not api compatible and
>>>>>>> there is no reason we keep both
>>>>>>> of them.
>>>>>>>
>>>>>>> Best,
>>>>>>> tison.
>>>>>>>
>>>>>>>
>>>>>>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:
>>>>>>>
>>>>>>>> I think if we could, splitting into as many PRs as possible is
>>>>>>>> good. Maybe we could
>>>>>>>> introduce the new designed decorators and parameter parser first,
>>>>>>>> and leave the existing
>>>>>>>> decorators as legacy. Once all the new decorators is ready and well
>>>>>>>> tested, we could
>>>>>>>> remove the legacy codes and use the new decorators in the kube
>>>>>>>> client implementation.
>>>>>>>>
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Yang
>>>>>>>>
>>>>>>>> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>>>>>>>>
>>>>>>>>> Hi, Till,
>>>>>>>>>
>>>>>>>>> Great thanks for your advice, I totally agree with you to split
>>>>>>>>> the changes up in as many PRs as possible. The part of "Parameter Parser"
>>>>>>>>> is trivial so that we prefer to make one PR to avoid adapting a lot of
>>>>>>>>> pieces of code that would be deleted immediately with the following
>>>>>>>>> decorator refactoring PR. Actually I won't insist on one PR, could it be
>>>>>>>>> possible that I first try out with one PR and let the committers help
>>>>>>>>> assess whether it is necessary to split the changes into several PRs?
>>>>>>>>> Kindly expect to see your reply.
>>>>>>>>>
>>>>>>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Yang Wang <da...@gmail.com>.
Great work! I could help to review and test.

Best,
Yang

Canbin Zheng <fe...@gmail.com> 于2020年2月27日周四 下午4:24写道:

> Hi, everyone,
>
> I have pushed a PR <https://github.com/apache/flink/pull/11233> for this
> issue, looking forward to your feedback.
>
>
> Cheers,
> Canbin Zheng
>
> Canbin Zheng <fe...@gmail.com> 于2020年2月26日周三 上午10:39写道:
>
>> Thanks for the detailed PR advice, I would separate the commits as clear
>> as possible to help the code reviewing.
>>
>>
>> Cheers,
>> Canbin Zheng
>>
>> tison <wa...@gmail.com> 于2020年2月25日周二 下午10:11写道:
>>
>>> Thanks for your clarification Yang! We're on the same page.
>>>
>>> Best,
>>> tison.
>>>
>>>
>>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午10:07写道:
>>>
>>>> Hi tison,
>>>>
>>>> I do not mean to keep two decorator at the same. Since the two
>>>> decorators are
>>>> not api compatible, it is meaningless. I am just thinking how
>>>> to organize the
>>>> commits/PRs to make the review easier. The reviewers may need some
>>>> context
>>>> to get the point.
>>>>
>>>>
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:23写道:
>>>>
>>>>> The process in my mind is somehow like this commit[1] which belongs to
>>>>> this pr[2]
>>>>> that we firstly introduce the new implementation and then replace it
>>>>> with the original
>>>>> one. The difference is that these two versions of decorators are not
>>>>> api compatible
>>>>> while adding a switch for such an internal abstraction or extracting a
>>>>> clumsy
>>>>> "common" interface doesn't benefit.
>>>>>
>>>>> Best,
>>>>> tison.
>>>>>
>>>>> [1]
>>>>> https://github.com/apache/flink/commit/1f2969357c441e24b71daef83d21563da9a93bb4
>>>>> [2] https://github.com/apache/flink/pull/9832
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:08写道:
>>>>>
>>>>>> I agree for separating commits we can have multiple commits that
>>>>>> firstly add the new parameters
>>>>>> and decorators,  and later replace current decorators with new
>>>>>> decorators which are well
>>>>>> unit tested.
>>>>>>
>>>>>> However, it makes no sense we have two codepaths from FlinkKubeClient
>>>>>> to decorators
>>>>>> since these two version of decorators are not api compatible and
>>>>>> there is no reason we keep both
>>>>>> of them.
>>>>>>
>>>>>> Best,
>>>>>> tison.
>>>>>>
>>>>>>
>>>>>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:
>>>>>>
>>>>>>> I think if we could, splitting into as many PRs as possible is good.
>>>>>>> Maybe we could
>>>>>>> introduce the new designed decorators and parameter parser first,
>>>>>>> and leave the existing
>>>>>>> decorators as legacy. Once all the new decorators is ready and well
>>>>>>> tested, we could
>>>>>>> remove the legacy codes and use the new decorators in the kube
>>>>>>> client implementation.
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>> Yang
>>>>>>>
>>>>>>> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>>>>>>>
>>>>>>>> Hi, Till,
>>>>>>>>
>>>>>>>> Great thanks for your advice, I totally agree with you to split the
>>>>>>>> changes up in as many PRs as possible. The part of "Parameter Parser" is
>>>>>>>> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
>>>>>>>> of code that would be deleted immediately with the following decorator
>>>>>>>> refactoring PR. Actually I won't insist on one PR, could it be possible
>>>>>>>> that I first try out with one PR and let the committers help assess whether
>>>>>>>> it is necessary to split the changes into several PRs?  Kindly expect to
>>>>>>>> see your reply.
>>>>>>>>
>>>>>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Canbin Zheng <fe...@gmail.com>.
Hi, everyone,

I have pushed a PR <https://github.com/apache/flink/pull/11233> for this
issue, looking forward to your feedback.


Cheers,
Canbin Zheng

Canbin Zheng <fe...@gmail.com> 于2020年2月26日周三 上午10:39写道:

> Thanks for the detailed PR advice, I would separate the commits as clear
> as possible to help the code reviewing.
>
>
> Cheers,
> Canbin Zheng
>
> tison <wa...@gmail.com> 于2020年2月25日周二 下午10:11写道:
>
>> Thanks for your clarification Yang! We're on the same page.
>>
>> Best,
>> tison.
>>
>>
>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午10:07写道:
>>
>>> Hi tison,
>>>
>>> I do not mean to keep two decorator at the same. Since the two
>>> decorators are
>>> not api compatible, it is meaningless. I am just thinking how
>>> to organize the
>>> commits/PRs to make the review easier. The reviewers may need some
>>> context
>>> to get the point.
>>>
>>>
>>>
>>> Best,
>>> Yang
>>>
>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:23写道:
>>>
>>>> The process in my mind is somehow like this commit[1] which belongs to
>>>> this pr[2]
>>>> that we firstly introduce the new implementation and then replace it
>>>> with the original
>>>> one. The difference is that these two versions of decorators are not
>>>> api compatible
>>>> while adding a switch for such an internal abstraction or extracting a
>>>> clumsy
>>>> "common" interface doesn't benefit.
>>>>
>>>> Best,
>>>> tison.
>>>>
>>>> [1]
>>>> https://github.com/apache/flink/commit/1f2969357c441e24b71daef83d21563da9a93bb4
>>>> [2] https://github.com/apache/flink/pull/9832
>>>>
>>>>
>>>>
>>>>
>>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:08写道:
>>>>
>>>>> I agree for separating commits we can have multiple commits that
>>>>> firstly add the new parameters
>>>>> and decorators,  and later replace current decorators with new
>>>>> decorators which are well
>>>>> unit tested.
>>>>>
>>>>> However, it makes no sense we have two codepaths from FlinkKubeClient
>>>>> to decorators
>>>>> since these two version of decorators are not api compatible and there
>>>>> is no reason we keep both
>>>>> of them.
>>>>>
>>>>> Best,
>>>>> tison.
>>>>>
>>>>>
>>>>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:
>>>>>
>>>>>> I think if we could, splitting into as many PRs as possible is good.
>>>>>> Maybe we could
>>>>>> introduce the new designed decorators and parameter parser first, and
>>>>>> leave the existing
>>>>>> decorators as legacy. Once all the new decorators is ready and well
>>>>>> tested, we could
>>>>>> remove the legacy codes and use the new decorators in the kube client
>>>>>> implementation.
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>> Yang
>>>>>>
>>>>>> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>>>>>>
>>>>>>> Hi, Till,
>>>>>>>
>>>>>>> Great thanks for your advice, I totally agree with you to split the
>>>>>>> changes up in as many PRs as possible. The part of "Parameter Parser" is
>>>>>>> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
>>>>>>> of code that would be deleted immediately with the following decorator
>>>>>>> refactoring PR. Actually I won't insist on one PR, could it be possible
>>>>>>> that I first try out with one PR and let the committers help assess whether
>>>>>>> it is necessary to split the changes into several PRs?  Kindly expect to
>>>>>>> see your reply.
>>>>>>>
>>>>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Canbin Zheng <fe...@gmail.com>.
Thanks for the detailed PR advice, I would separate the commits as clear as
possible to help the code reviewing.


Cheers,
Canbin Zheng

tison <wa...@gmail.com> 于2020年2月25日周二 下午10:11写道:

> Thanks for your clarification Yang! We're on the same page.
>
> Best,
> tison.
>
>
> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午10:07写道:
>
>> Hi tison,
>>
>> I do not mean to keep two decorator at the same. Since the two decorators
>> are
>> not api compatible, it is meaningless. I am just thinking how to organize
>> the
>> commits/PRs to make the review easier. The reviewers may need some context
>> to get the point.
>>
>>
>>
>> Best,
>> Yang
>>
>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:23写道:
>>
>>> The process in my mind is somehow like this commit[1] which belongs to
>>> this pr[2]
>>> that we firstly introduce the new implementation and then replace it
>>> with the original
>>> one. The difference is that these two versions of decorators are not api
>>> compatible
>>> while adding a switch for such an internal abstraction or extracting a
>>> clumsy
>>> "common" interface doesn't benefit.
>>>
>>> Best,
>>> tison.
>>>
>>> [1]
>>> https://github.com/apache/flink/commit/1f2969357c441e24b71daef83d21563da9a93bb4
>>> [2] https://github.com/apache/flink/pull/9832
>>>
>>>
>>>
>>>
>>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:08写道:
>>>
>>>> I agree for separating commits we can have multiple commits that
>>>> firstly add the new parameters
>>>> and decorators,  and later replace current decorators with new
>>>> decorators which are well
>>>> unit tested.
>>>>
>>>> However, it makes no sense we have two codepaths from FlinkKubeClient
>>>> to decorators
>>>> since these two version of decorators are not api compatible and there
>>>> is no reason we keep both
>>>> of them.
>>>>
>>>> Best,
>>>> tison.
>>>>
>>>>
>>>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:
>>>>
>>>>> I think if we could, splitting into as many PRs as possible is good.
>>>>> Maybe we could
>>>>> introduce the new designed decorators and parameter parser first, and
>>>>> leave the existing
>>>>> decorators as legacy. Once all the new decorators is ready and well
>>>>> tested, we could
>>>>> remove the legacy codes and use the new decorators in the kube client
>>>>> implementation.
>>>>>
>>>>>
>>>>> Best,
>>>>> Yang
>>>>>
>>>>> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>>>>>
>>>>>> Hi, Till,
>>>>>>
>>>>>> Great thanks for your advice, I totally agree with you to split the
>>>>>> changes up in as many PRs as possible. The part of "Parameter Parser" is
>>>>>> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
>>>>>> of code that would be deleted immediately with the following decorator
>>>>>> refactoring PR. Actually I won't insist on one PR, could it be possible
>>>>>> that I first try out with one PR and let the committers help assess whether
>>>>>> it is necessary to split the changes into several PRs?  Kindly expect to
>>>>>> see your reply.
>>>>>>
>>>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by tison <wa...@gmail.com>.
Thanks for your clarification Yang! We're on the same page.

Best,
tison.


Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午10:07写道:

> Hi tison,
>
> I do not mean to keep two decorator at the same. Since the two decorators
> are
> not api compatible, it is meaningless. I am just thinking how to organize
> the
> commits/PRs to make the review easier. The reviewers may need some context
> to get the point.
>
>
>
> Best,
> Yang
>
> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:23写道:
>
>> The process in my mind is somehow like this commit[1] which belongs to
>> this pr[2]
>> that we firstly introduce the new implementation and then replace it with
>> the original
>> one. The difference is that these two versions of decorators are not api
>> compatible
>> while adding a switch for such an internal abstraction or extracting a
>> clumsy
>> "common" interface doesn't benefit.
>>
>> Best,
>> tison.
>>
>> [1]
>> https://github.com/apache/flink/commit/1f2969357c441e24b71daef83d21563da9a93bb4
>> [2] https://github.com/apache/flink/pull/9832
>>
>>
>>
>>
>> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:08写道:
>>
>>> I agree for separating commits we can have multiple commits that firstly
>>> add the new parameters
>>> and decorators,  and later replace current decorators with new
>>> decorators which are well
>>> unit tested.
>>>
>>> However, it makes no sense we have two codepaths from FlinkKubeClient to
>>> decorators
>>> since these two version of decorators are not api compatible and there
>>> is no reason we keep both
>>> of them.
>>>
>>> Best,
>>> tison.
>>>
>>>
>>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:
>>>
>>>> I think if we could, splitting into as many PRs as possible is good.
>>>> Maybe we could
>>>> introduce the new designed decorators and parameter parser first, and
>>>> leave the existing
>>>> decorators as legacy. Once all the new decorators is ready and well
>>>> tested, we could
>>>> remove the legacy codes and use the new decorators in the kube client
>>>> implementation.
>>>>
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>>>>
>>>>> Hi, Till,
>>>>>
>>>>> Great thanks for your advice, I totally agree with you to split the
>>>>> changes up in as many PRs as possible. The part of "Parameter Parser" is
>>>>> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
>>>>> of code that would be deleted immediately with the following decorator
>>>>> refactoring PR. Actually I won't insist on one PR, could it be possible
>>>>> that I first try out with one PR and let the committers help assess whether
>>>>> it is necessary to split the changes into several PRs?  Kindly expect to
>>>>> see your reply.
>>>>>
>>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Yang Wang <da...@gmail.com>.
Hi tison,

I do not mean to keep two decorator at the same. Since the two decorators
are
not api compatible, it is meaningless. I am just thinking how to organize
the
commits/PRs to make the review easier. The reviewers may need some context
to get the point.



Best,
Yang

tison <wa...@gmail.com> 于2020年2月25日周二 下午8:23写道:

> The process in my mind is somehow like this commit[1] which belongs to
> this pr[2]
> that we firstly introduce the new implementation and then replace it with
> the original
> one. The difference is that these two versions of decorators are not api
> compatible
> while adding a switch for such an internal abstraction or extracting a
> clumsy
> "common" interface doesn't benefit.
>
> Best,
> tison.
>
> [1]
> https://github.com/apache/flink/commit/1f2969357c441e24b71daef83d21563da9a93bb4
> [2] https://github.com/apache/flink/pull/9832
>
>
>
>
> tison <wa...@gmail.com> 于2020年2月25日周二 下午8:08写道:
>
>> I agree for separating commits we can have multiple commits that firstly
>> add the new parameters
>> and decorators,  and later replace current decorators with new decorators
>> which are well
>> unit tested.
>>
>> However, it makes no sense we have two codepaths from FlinkKubeClient to
>> decorators
>> since these two version of decorators are not api compatible and there is
>> no reason we keep both
>> of them.
>>
>> Best,
>> tison.
>>
>>
>> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:
>>
>>> I think if we could, splitting into as many PRs as possible is good.
>>> Maybe we could
>>> introduce the new designed decorators and parameter parser first, and
>>> leave the existing
>>> decorators as legacy. Once all the new decorators is ready and well
>>> tested, we could
>>> remove the legacy codes and use the new decorators in the kube client
>>> implementation.
>>>
>>>
>>> Best,
>>> Yang
>>>
>>> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>>>
>>>> Hi, Till,
>>>>
>>>> Great thanks for your advice, I totally agree with you to split the
>>>> changes up in as many PRs as possible. The part of "Parameter Parser" is
>>>> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
>>>> of code that would be deleted immediately with the following decorator
>>>> refactoring PR. Actually I won't insist on one PR, could it be possible
>>>> that I first try out with one PR and let the committers help assess whether
>>>> it is necessary to split the changes into several PRs?  Kindly expect to
>>>> see your reply.
>>>>
>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by tison <wa...@gmail.com>.
The process in my mind is somehow like this commit[1] which belongs to this
pr[2]
that we firstly introduce the new implementation and then replace it with
the original
one. The difference is that these two versions of decorators are not api
compatible
while adding a switch for such an internal abstraction or extracting a
clumsy
"common" interface doesn't benefit.

Best,
tison.

[1]
https://github.com/apache/flink/commit/1f2969357c441e24b71daef83d21563da9a93bb4
[2] https://github.com/apache/flink/pull/9832




tison <wa...@gmail.com> 于2020年2月25日周二 下午8:08写道:

> I agree for separating commits we can have multiple commits that firstly
> add the new parameters
> and decorators,  and later replace current decorators with new decorators
> which are well
> unit tested.
>
> However, it makes no sense we have two codepaths from FlinkKubeClient to
> decorators
> since these two version of decorators are not api compatible and there is
> no reason we keep both
> of them.
>
> Best,
> tison.
>
>
> Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:
>
>> I think if we could, splitting into as many PRs as possible is good.
>> Maybe we could
>> introduce the new designed decorators and parameter parser first, and
>> leave the existing
>> decorators as legacy. Once all the new decorators is ready and well
>> tested, we could
>> remove the legacy codes and use the new decorators in the kube client
>> implementation.
>>
>>
>> Best,
>> Yang
>>
>> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>>
>>> Hi, Till,
>>>
>>> Great thanks for your advice, I totally agree with you to split the
>>> changes up in as many PRs as possible. The part of "Parameter Parser" is
>>> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
>>> of code that would be deleted immediately with the following decorator
>>> refactoring PR. Actually I won't insist on one PR, could it be possible
>>> that I first try out with one PR and let the committers help assess whether
>>> it is necessary to split the changes into several PRs?  Kindly expect to
>>> see your reply.
>>>
>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by tison <wa...@gmail.com>.
I agree for separating commits we can have multiple commits that firstly
add the new parameters
and decorators,  and later replace current decorators with new decorators
which are well
unit tested.

However, it makes no sense we have two codepaths from FlinkKubeClient to
decorators
since these two version of decorators are not api compatible and there is
no reason we keep both
of them.

Best,
tison.


Yang Wang <da...@gmail.com> 于2020年2月25日周二 下午7:50写道:

> I think if we could, splitting into as many PRs as possible is good. Maybe
> we could
> introduce the new designed decorators and parameter parser first, and
> leave the existing
> decorators as legacy. Once all the new decorators is ready and well
> tested, we could
> remove the legacy codes and use the new decorators in the kube client
> implementation.
>
>
> Best,
> Yang
>
> Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:
>
>> Hi, Till,
>>
>> Great thanks for your advice, I totally agree with you to split the
>> changes up in as many PRs as possible. The part of "Parameter Parser" is
>> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
>> of code that would be deleted immediately with the following decorator
>> refactoring PR. Actually I won't insist on one PR, could it be possible
>> that I first try out with one PR and let the committers help assess whether
>> it is necessary to split the changes into several PRs?  Kindly expect to
>> see your reply.
>>
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Yang Wang <da...@gmail.com>.
I think if we could, splitting into as many PRs as possible is good. Maybe
we could
introduce the new designed decorators and parameter parser first, and leave
the existing
decorators as legacy. Once all the new decorators is ready and well tested,
we could
remove the legacy codes and use the new decorators in the kube client
implementation.


Best,
Yang

Canbin Zheng <fe...@gmail.com> 于2020年2月25日周二 下午6:16写道:

> Hi, Till,
>
> Great thanks for your advice, I totally agree with you to split the
> changes up in as many PRs as possible. The part of "Parameter Parser" is
> trivial so that we prefer to make one PR to avoid adapting a lot of pieces
> of code that would be deleted immediately with the following decorator
> refactoring PR. Actually I won't insist on one PR, could it be possible
> that I first try out with one PR and let the committers help assess whether
> it is necessary to split the changes into several PRs?  Kindly expect to
> see your reply.
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Canbin Zheng <fe...@gmail.com>.
Hi, Till,

Great thanks for your advice, I totally agree with you to split the changes
up in as many PRs as possible. The part of "Parameter Parser" is trivial so
that we prefer to make one PR to avoid adapting a lot of pieces of code
that would be deleted immediately with the following decorator refactoring
PR. Actually I won't insist on one PR, could it be possible that I first
try out with one PR and let the committers help assess whether it is
necessary to split the changes into several PRs?  Kindly expect to see your
reply.

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Till Rohrmann <tr...@apache.org>.
Just a quick meta comment concerning one PR vs. two PRs vs. multiple PRs: I
would recommend to split the changes up in as many PRs as possible. In my
experience, a large uber PR usually won't be reviewed as thoroughly because
of the sheer size. Moreover, the reviewing usually takes much longer. Of
course, this only makes sense if the change can be split in multiple parts.

Cheers,
Till

On Tue, Feb 25, 2020 at 5:39 AM Canbin Zheng <fe...@gmail.com> wrote:

> Thanks, Yang Wang and tison,
>
> Since we have reached a consensus on the decorator design evolution and the
> parameters parser, and after an offline discussion with tison and Yang
> Wang, we have agreed on attaching one PR instead of two to avoid repeated
> code adapting and reviewing.
>
> Thanks for all the feedback.
>
> tison <wa...@gmail.com> 于2020年2月24日周一 下午10:14写道:
>
> > Thank Canbin for starting the discussion and thanks for participating so
> > far.
> >
> > I agree that we can start with the "Parameters Parser" part and actually
> > split it into a
> > self-contained issue.
> >
> > Best,
> > tison.
> >
> >
> > Yang Wang <da...@gmail.com> 于2020年2月24日周一 下午8:38写道:
> >
> >> It seems that we could benefit a lot from the "Parameters Parser". Let's
> >> start to
> >> add the dedicated jobmanager/taskmanager config parser classes.
> >>
> >>
> >> Best,
> >> Yang
> >>
> >> Canbin Zheng <fe...@gmail.com> 于2020年2月24日周一 上午10:39写道:
> >>
> >>> Hi, Yang Wang,
> >>>
> >>> Thanks for the feedback.
> >>>
> >>> > Parameters Parser
> >>>
> >>> I can think of many benefits of the dedicated parameter
> >>> parsing/verifying tools, here is some of them:
> >>> 1. Reuse the parameters parsing and verifying code.
> >>> 2. Ensure consistent handling logic for the same setting.
> >>> 3. Simplify some of the method's signature because we can avoid
> defining
> >>> too many unnecessary input parameters.
> >>>
> >>> > BTW, the fabric8 kubernetes-client and Kubernetes api server already
> >>> has
> >>> the parameters check before starting to create the resources. I think
> >>> the exceptions
> >>> are usually meaning and enough for the users to get the root cause.
> >>>
> >>> 1. On the one hand, there are gaps between the declarative model used
> by
> >>> Kubernetes and the configuration model used by Flink. Indeed, the
> >>> Kubernetes client/server helps check the Kubernetes side parameters and
> >>> throw exceptions in case of failures; however, the exception messages
> are
> >>> not always easily understood from the perspective of Flink users.
> >>> Independent of the solutions, what we should make sure is that the
> >>> exception information in the Flink configuration model is meaningful
> enough
> >>> for the Flink users.
> >>> 2. On the other hand, some prechecking is beneficial in some scenarios
> >>> since we can avoid useless effort on Kubernetes resource creation in
> case
> >>> of failures leading to clean up all the created resources.
> >>>
> >>>
> >>> Regards,
> >>> Canbin Zheng
> >>>
> >>> Yang Wang <da...@gmail.com> 于2020年2月23日周日 下午10:37写道:
> >>>
> >>>> > The new introduced decorator
> >>>> After some offline discussion with Canbin and tison, i totally
> >>>> understand
> >>>> the evolved decorator design. Each decorator will be self-contained
> and
> >>>> is responsible for just one thing. Currently, if we want to mount a
> new
> >>>> config
> >>>> file to jobmanager/taskmanager, then both `ConfigMapDecorator`,
> >>>> `JobManagerDecorator`, `TaskManagerDecorator` needs to be updated.
> >>>> It is not convenient for the new contributors to do this. In the new
> >>>> design,
> >>>> by leveraging the accompanying kubernetes resources in decorator, we
> >>>> could finish the creating and mounting config map in one decorator.
> >>>>
> >>>> Since now we just have a basic implementation for native Kubernetes
> >>>> integration and lots of features need to be developed. And many users
> >>>> want to participate in and contribute to the integration. So i agree
> to
> >>>> refactor
> >>>> the current decorator implementation and make it easier for the new
> >>>> contributor.
> >>>>
> >>>> For the detailed divergences(naming, etc.), i think we could discuss
> in
> >>>> the PR.
> >>>>
> >>>> > Parameters Parser
> >>>> Currently, the decorator directly use Flink configuration to get the
> >>>> parameters
> >>>> to build the Kubernetes resource. It is straightforward, however we
> >>>> could not
> >>>> have a unified parameters check. So i am not sure whether you will
> >>>> introduce
> >>>> a tool to check the parameters or just simply have our own basic
> check.
> >>>>
> >>>> BTW, the fabric8 kubernetes-client and Kubernetes api server already
> has
> >>>> the parameters check before starting to create the resources. I think
> >>>> the exceptions
> >>>> are usually meaning and enough for the users to get the root cause.
> >>>>
> >>>>
> >>>>
> >>>> Best,
> >>>> Yang
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> felixzheng zheng <fe...@gmail.com> 于2020年2月22日周六 上午10:54写道:
> >>>>
> >>>>> Great thanks for the quick feedback Till. You are right; it is not a
> >>>>> fundamentally different approach compared to
> >>>>> what we have right now, all the Kubernetes resources created are the
> >>>>> same,
> >>>>> we aim to evolve the existing decorator approach so that,
> >>>>> 1. the decorators are monadic and smaller in size and functionality.
> >>>>> 2. the new decorator design allows reusing the decorators between the
> >>>>> client and the cluster as much as possible.
> >>>>> 3. all the decorators are independent with each other, and they could
> >>>>> have
> >>>>> arbitrary order in the chain, they share the same APIs and follow a
> >>>>> unified
> >>>>> orchestrator architecture so that new developers could quickly
> >>>>> understand
> >>>>> what should be done to introduce a new feature.
> >>>>>
> >>>>> Besides that, the new approach allows us adding tests for every
> >>>>> decorator
> >>>>> alone instead of doing a final test of all the decorators in the
> >>>>> Fabric8ClientTest.java.
> >>>>>
> >>>>> Cheers,
> >>>>> Canbin Zheng
> >>>>>
> >>>>> Till Rohrmann <tr...@apache.org> 于2020年2月22日周六 上午12:28写道:
> >>>>>
> >>>>> > Thanks for starting this discussion Canbin. If I understand your
> >>>>> proposal
> >>>>> > correctly, then you would like to evolve the existing decorator
> >>>>> approach so
> >>>>> > that decorators are monadic and smaller in size and functionality.
> >>>>> The
> >>>>> > latter aspect will allow to reuse them between the client and the
> >>>>> cluster.
> >>>>> > Just to make sure, it is not a fundamentally different approach
> >>>>> compared to
> >>>>> > what we have right now, is it?
> >>>>> >
> >>>>> > If this is the case, then I think it makes sense to reuse code as
> >>>>> much as
> >>>>> > possible and to create small code units which are easier to test.
> >>>>> >
> >>>>> > Cheers,
> >>>>> > Till
> >>>>> >
> >>>>> > On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <
> >>>>> felixzhengcb@gmail.com>
> >>>>> > wrote:
> >>>>> >
> >>>>> > > Thanks for the feedback @Yang Wang. I would like to discuss some
> >>>>> of the
> >>>>> > > details in depth about why I am confused about the existing
> design.
> >>>>> > >
> >>>>> > > Question 1: How do we mount a configuration file?
> >>>>> > >
> >>>>> > > For the existing design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    We need several classes to finish it:
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >       InitializerDecorator
> >>>>> > >       2.
> >>>>> > >
> >>>>> > >       OwnerReferenceDecorator
> >>>>> > >       3.
> >>>>> > >
> >>>>> > >       ConfigMapDecorator
> >>>>> > >       4.
> >>>>> > >
> >>>>> > >       KubernetesUtils: providing the getConfigMapVolume method to
> >>>>> share
> >>>>> > for
> >>>>> > >       the FlinkMasterDeploymentDecorator and the
> >>>>> TaskManagerPodDecorator.
> >>>>> > >       5.
> >>>>> > >
> >>>>> > >       FlinkMasterDeploymentDecorator: mounts the ConfigMap
> volume.
> >>>>> > >       6.
> >>>>> > >
> >>>>> > >       TaskManagerPodDecorator: mounts the ConfigMap volume.
> >>>>> > >       7.
> >>>>> > >
> >>>>> > >       If in the future, someone would like to introduce an init
> >>>>> > Container,
> >>>>> > >       the InitContainerDecorator has to mount the ConfigMap
> volume
> >>>>> too.
> >>>>> > >
> >>>>> > >
> >>>>> > > I am confused about the current solution to mounting a
> >>>>> configuration
> >>>>> > file:
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    Actually, we do not need so many Decorators for mounting a
> file.
> >>>>> > >    2.
> >>>>> > >
> >>>>> > >    If we would like to mount a new file, we have no choice but to
> >>>>> repeat
> >>>>> > >    the same tedious and scattered routine.
> >>>>> > >    3.
> >>>>> > >
> >>>>> > >    There’s no easy way to test the file mounting functionality
> >>>>> alone; we
> >>>>> > >    have to construct the ConfigMap, the Deployment or the
> >>>>> TaskManagerPod
> >>>>> > > first
> >>>>> > >    and then do a final test.
> >>>>> > >
> >>>>> > >
> >>>>> > > The reason why it is so complex to mount a configuration file is
> >>>>> that we
> >>>>> > > don’t fully consider the internal connections among those
> >>>>> resources in
> >>>>> > the
> >>>>> > > existing design.
> >>>>> > >
> >>>>> > > The new abstraction we proposed could solve such a kind of
> >>>>> problem, the
> >>>>> > new
> >>>>> > > Decorator object is as follows:
> >>>>> > >
> >>>>> > > public interface KubernetesStepDecorator {
> >>>>> > >
> >>>>> > >   /**
> >>>>> > >
> >>>>> > >    * Apply transformations to the given FlinkPod in accordance
> >>>>> with this
> >>>>> > > feature. This can include adding
> >>>>> > >
> >>>>> > >    * labels/annotations, mounting volumes, and setting startup
> >>>>> command or
> >>>>> > > parameters, etc.
> >>>>> > >
> >>>>> > >    */
> >>>>> > >
> >>>>> > >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
> >>>>> > >
> >>>>> > >   /**
> >>>>> > >
> >>>>> > >    * Build the accompanying Kubernetes resources that should be
> >>>>> > introduced
> >>>>> > > to support this feature. This could
> >>>>> > >
> >>>>> > >    * only applicable to the client-side submission process.
> >>>>> > >
> >>>>> > >    */
> >>>>> > >
> >>>>> > >   List<HasMetadata> buildAccompanyingKubernetesResources() throws
> >>>>> > > IOException;
> >>>>> > >
> >>>>> > > }
> >>>>> > >
> >>>>> > > The FlinkPod is a composition of the Pod, the main Container, the
> >>>>> init
> >>>>> > > Container, and the sidecar Container.
> >>>>> > >
> >>>>> > > Next, we introduce a KubernetesStepDecorator implementation, the
> >>>>> method
> >>>>> > of
> >>>>> > > buildAccompanyingKubernetesResources creates the corresponding
> >>>>> ConfigMap,
> >>>>> > > and the method of decorateFlinkPod configures the Volume for the
> >>>>> Pod and
> >>>>> > > all the Containers.
> >>>>> > >
> >>>>> > > So, for the scenario of mounting a configuration file, the
> >>>>> advantages of
> >>>>> > > this new architecture are as follows:
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    One dedicated KubernetesStepDecorator implementation is enough
> >>>>> to
> >>>>> > finish
> >>>>> > >    all the mounting work, meanwhile, this class would be shared
> >>>>> between
> >>>>> > the
> >>>>> > >    client-side and the master-side. Besides that, the number of
> >>>>> lines of
> >>>>> > > code
> >>>>> > >    in that class will not exceed 300 lines which facilitate code
> >>>>> > > readability
> >>>>> > >    and maintenance.
> >>>>> > >    2.
> >>>>> > >
> >>>>> > >    Testing becomes an easy thing now, as we can add a dedicated
> >>>>> test
> >>>>> > class
> >>>>> > >    for only this class, the test would never rely on the
> >>>>> construction of
> >>>>> > > other
> >>>>> > >    components such as the Deployment.
> >>>>> > >    3.
> >>>>> > >
> >>>>> > >    It’s quite convenient to mount a new configuration file via
> the
> >>>>> newly
> >>>>> > >    dedicated KubernetesStepDecorator implementation.
> >>>>> > >
> >>>>> > >
> >>>>> > > Question 2: How do we construct the Pod?
> >>>>> > >
> >>>>> > > For the existing design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    The FlinkMasterDeploymentDecorator is responsible for building
> >>>>> the
> >>>>> > >    Deployment and configuring the Pod and the Containers, while
> the
> >>>>> > >    TaskManagerPodDecorator is responsible for building the
> >>>>> TaskManager
> >>>>> > Pod.
> >>>>> > >    Take FlinkMasterDeploymentDecorator as an example, let’s see
> >>>>> what it
> >>>>> > has
> >>>>> > >    done.
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >       Configure the main Container, including the name, command,
> >>>>> args,
> >>>>> > >       image, image pull policy, resource requirements, ports, all
> >>>>> kinds
> >>>>> > of
> >>>>> > >       environment variables, all kinds of volume mounts, etc.
> >>>>> > >       2.
> >>>>> > >
> >>>>> > >       Configure the Pod, including service account, all kinds of
> >>>>> volumes,
> >>>>> > >       and attach all kinds of Container, including the main
> >>>>> Container,
> >>>>> > the
> >>>>> > > init
> >>>>> > >       Container, and the sidecar Container.
> >>>>> > >       3.
> >>>>> > >
> >>>>> > >       Configure the Deployment.
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    The InitializerDecorator and the OwnerReferenceDecorator have
> >>>>> basic
> >>>>> > >    logic so that the most complex work is completed in the
> >>>>> > >    FlinkMasterDeploymentDecorator and the
> TaskManagerPodDecorator.
> >>>>> With
> >>>>> > the
> >>>>> > >    introduction of new features for the Pod, such as customized
> >>>>> volume
> >>>>> > > mounts,
> >>>>> > >    Hadoop configuration support, Kerberized HDFS support, secret
> >>>>> mounts,
> >>>>> > >    Python support, etc. the construction process could become far
> >>>>> more
> >>>>> > >    complicated, and the functionality of a single class could
> >>>>> explode,
> >>>>> > > which
> >>>>> > >    hurts code readability, writability, and testability. Besides
> >>>>> that,
> >>>>> > both
> >>>>> > >    the client-side and the master-side shares much of the Pod
> >>>>> > construction
> >>>>> > >    logic.
> >>>>> > >
> >>>>> > >
> >>>>> > > So the problems are as follows:
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    We don’t have a consistent abstraction that is applicable to
> >>>>> both the
> >>>>> > >    client-side and the master-side for Pod construction
> (including
> >>>>> the
> >>>>> > >    Containers) so that we have to share some of the code via tool
> >>>>> classes
> >>>>> > >    which is a trick solution, however, we can’t share most of the
> >>>>> code as
> >>>>> > > much
> >>>>> > >    as possible.
> >>>>> > >    2.
> >>>>> > >
> >>>>> > >    We don’t have a step-by-step orchestrator architecture to help
> >>>>> the Pod
> >>>>> > >    construction.
> >>>>> > >
> >>>>> > >
> >>>>> > > For the proposed design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    The problems above are all solved: we have a consistent
> >>>>> abstraction
> >>>>> > that
> >>>>> > >    leads to a monadic-step orchestrator architecture to construct
> >>>>> the Pod
> >>>>> > > step
> >>>>> > >    by step. One step is responsible for exactly one thing,
> >>>>> following that
> >>>>> > > is
> >>>>> > >    the fact that every step is well testable, because each unit
> >>>>> can be
> >>>>> > >    considerably small, and the number of branches to test in any
> >>>>> given
> >>>>> > > step is
> >>>>> > >    limited. Moreover, much of the step could be shared between
> the
> >>>>> > > client-side
> >>>>> > >    and the master-side, such as configuration files mount, etc.
> >>>>> > >
> >>>>> > >
> >>>>> > > Question 3: Could all the resources share the same orchestrator
> >>>>> > > architecture?
> >>>>> > >
> >>>>> > > For the existing design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    We don’t have a unified orchestrator architecture for the
> >>>>> construction
> >>>>> > >    of all the Kubernetes resources, therefore, we need a
> Decorator
> >>>>> chain
> >>>>> > > for
> >>>>> > >    every Kubernetes resource.
> >>>>> > >
> >>>>> > >
> >>>>> > > For the proposed design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    Following Question 1 ~ 2 and the design doc[1], we have
> >>>>> introduced a
> >>>>> > >    monadic-step orchestrator architecture to construct the Pod
> and
> >>>>> all
> >>>>> > the
> >>>>> > >    Containers. Besides that, this architecture also works for the
> >>>>> other
> >>>>> > >    resources, such as the Services and the ConfigMaps. For
> >>>>> example, if we
> >>>>> > > need
> >>>>> > >    a new Service or a ConfigMap, just introduce a new step.
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > > Question 4: How do we introduce the InitContainer or the side-car
> >>>>> > > Container?
> >>>>> > >
> >>>>> > > For the existing design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    Both the client-side and the master-side introduce some new
> >>>>> > Decorators,
> >>>>> > >
> >>>>> > > the client-side chain could be:
> >>>>> > >
> >>>>> > > InitializerDecorator -> OwnerReferenceDecorator ->
> >>>>> > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> >>>>> > > SidecarDecorator -> etc
> >>>>> > >
> >>>>> > > -
> >>>>> > >
> >>>>> > > and the master-side could be:
> >>>>> > >
> >>>>> > > InitializerDecorator -> OwnerReferenceDecorator  ->
> >>>>> > TaskManagerPodDecorator
> >>>>> > > -> InitContainerDecorator -> SidecarDecorator -> etc
> >>>>> > >
> >>>>> > > As we can see, the FlinkMasterDeploymentDecorator or the
> >>>>> > > TaskManagerPodDecorator is designed for the Pod construction,
> >>>>> including
> >>>>> > the
> >>>>> > > Containers, so we don’t need to treat the init Container and the
> >>>>> sidecar
> >>>>> > > Container as special cases different from the main Container by
> >>>>> > introducing
> >>>>> > > a dedicated Decorator for every one of them. Such kind of trick
> >>>>> solution
> >>>>> > > confuses me, maybe to the other developers.
> >>>>> > >
> >>>>> > > For the proposed design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    Following Question 1 ~ 4 and the design doc [1], the central
> >>>>> premise
> >>>>> > of
> >>>>> > >    the proposed orchestrator architecture is that the Pod, the
> main
> >>>>> > > Container,
> >>>>> > >    the init Container, the sidecar Container, the Services, and
> the
> >>>>> > > ConfigMaps
> >>>>> > >    all sit on an equal footing. For every one of the Pod, the
> main
> >>>>> > > Container,
> >>>>> > >    the init Container and the sidecar Container, people could
> >>>>> introduce
> >>>>> > any
> >>>>> > >    number of steps to finish its construction; we attach all the
> >>>>> > > Containers to
> >>>>> > >    the Pod in the Builder tool classes after the orchestrator
> >>>>> > architecture
> >>>>> > >    constructs them.
> >>>>> > >
> >>>>> > >
> >>>>> > > Question 5: What is the relation between the Decorators?
> >>>>> > >
> >>>>> > > For the existing design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    The Decorators are not independent; most of them have a strict
> >>>>> order
> >>>>> > in
> >>>>> > >    the chain.
> >>>>> > >    2.
> >>>>> > >
> >>>>> > >    The Decorators do not share common APIs in essence, as their
> >>>>> input
> >>>>> > type
> >>>>> > >    could be different so that we can’t finish the construction
> of a
> >>>>> > > Kubernetes
> >>>>> > >    resource such as the Deployment and the TaskManager Pod
> through
> >>>>> a
> >>>>> > > one-time
> >>>>> > >    traversal, there are boxing and unboxing overheads among some
> >>>>> of the
> >>>>> > >    neighboring Decorators.
> >>>>> > >
> >>>>> > >
> >>>>> > > For the proposed design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    The steps are completely independent so that they could have
> >>>>> arbitrary
> >>>>> > >    order in the chain; The only rule is that the Hadoop
> >>>>> configuration
> >>>>> > >    Decorator should be the last node in the chain.
> >>>>> > >    2.
> >>>>> > >
> >>>>> > >    All the steps share common APIs, with the same input type of
> >>>>> all the
> >>>>> > >    methods, so we can construct a Kubernetes resource via a
> >>>>> one-time
> >>>>> > > traversal.
> >>>>> > >
> >>>>> > >
> >>>>> > > Question 6: What is the number of Decorators chains?
> >>>>> > >
> >>>>> > > For the existing design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    People have to introduce a new chain once they introduce a new
> >>>>> > >    Kubernetes resource. For example, a new ConfigMap Decorator
> >>>>> chain for
> >>>>> > >    mounting a new configuration file. At this moment, we already
> >>>>> have
> >>>>> > five
> >>>>> > >    Decorator chains.
> >>>>> > >
> >>>>> > >
> >>>>> > > For the proposed design,
> >>>>> > >
> >>>>> > >    1.
> >>>>> > >
> >>>>> > >    There are always two chains, one is for constructing all the
> >>>>> > Kubernetes
> >>>>> > >    resources on the client-side, including the JobManager Pod,
> the
> >>>>> > > Containers,
> >>>>> > >    the ConfigMap(s), and the Services(s); the other one is for
> >>>>> > constructing
> >>>>> > >    the TaskManager Pod and the Containers.
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > > Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
> >>>>> > >
> >>>>> > > >  Hi Canbing,
> >>>>> > > >
> >>>>> > > >
> >>>>> > > > Thanks a lot for sharing your thoughts to improve the Flink on
> >>>>> K8s
> >>>>> > native
> >>>>> > > > integration.
> >>>>> > > > Frankly speaking, your discussion title confuses me and i am
> >>>>> wondering
> >>>>> > > > whether you
> >>>>> > > > want to refactor the whole design. However, after i dive into
> the
> >>>>> > details
> >>>>> > > > and the provided
> >>>>> > > > document, i realize that mostly you want to improve the
> >>>>> implementation.
> >>>>> > > >
> >>>>> > > >
> >>>>> > > > Regarding your two main points.
> >>>>> > > >
> >>>>> > > > >> Introduce a unified monadic-step based orchestrator
> >>>>> architecture
> >>>>> > that
> >>>>> > > > has a better,
> >>>>> > > > cleaner and consistent abstraction for the Kubernetes resources
> >>>>> > > > construction process,
> >>>>> > > > both applicable to the client side and the master side.
> >>>>> > > >
> >>>>> > > > When i introduce the decorator for the K8s in Flink, there is
> >>>>> always a
> >>>>> > > > guideline in my mind
> >>>>> > > > that it should be easy for extension and adding new features.
> >>>>> Just as
> >>>>> > you
> >>>>> > > > say, we have lots
> >>>>> > > > of functions to support and the K8s is also evolving very fast.
> >>>>> The
> >>>>> > > current
> >>>>> > > > `ConfigMapDecorator`,
> >>>>> > > > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is
> a
> >>>>> basic
> >>>>> > > > implementation with
> >>>>> > > > some prerequisite parameters. Of course we could chain more
> >>>>> decorators
> >>>>> > to
> >>>>> > > > construct the K8s
> >>>>> > > > resources. For example, InitializerDecorator ->
> >>>>> OwnerReferenceDecorator
> >>>>> > > ->
> >>>>> > > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> >>>>> > > > SidecarDecorator -> etc.
> >>>>> > > >
> >>>>> > > > I am little sceptical about splitting every parameter into a
> >>>>> single
> >>>>> > > > decorator.  Since it does
> >>>>> > > > not take too much benefits. But i agree with moving some common
> >>>>> > > parameters
> >>>>> > > > into separate
> >>>>> > > > decorators(e.g. volume mount). Also introducing the `~Builder`
> >>>>> class
> >>>>> > and
> >>>>> > > > spinning off the chaining
> >>>>> > > > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
> >>>>> > > >
> >>>>> > > >
> >>>>> > > >
> >>>>> > > > >> Add some dedicated tools for centrally parsing, verifying,
> and
> >>>>> > > managing
> >>>>> > > > the Kubernetes parameters.
> >>>>> > > >
> >>>>> > > > Currently, we always get the parameters directly from Flink
> >>>>> > > configuration(
> >>>>> > > > e.g.
> >>>>> > `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`).
> I
> >>>>> > > > think it could be improved
> >>>>> > > > by introducing some dedicated conf parser classes. It is a good
> >>>>> idea.
> >>>>> > > >
> >>>>> > > >
> >>>>> > > > Best,
> >>>>> > > > Yang
> >>>>> > > >
> >>>>> > > >
> >>>>> > > >
> >>>>> > > >
> >>>>> > > > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五
> >>>>> 上午9:31写道:
> >>>>> > > >
> >>>>> > > > > Hi everyone,
> >>>>> > > > >
> >>>>> > > > > I would like to kick off a discussion on refactoring the
> >>>>> existing
> >>>>> > > > > Kubernetes resources construction architecture design.
> >>>>> > > > >
> >>>>> > > > > I created a design document [1] that clarifies our motivation
> >>>>> to do
> >>>>> > > this
> >>>>> > > > > and some improvement proposals for the new design.
> >>>>> > > > >
> >>>>> > > > > Briefly, we would like to
> >>>>> > > > > 1. Introduce a unified monadic-step based orchestrator
> >>>>> architecture
> >>>>> > > that
> >>>>> > > > > has a better, cleaner and consistent abstraction for the
> >>>>> Kubernetes
> >>>>> > > > > resources construction process, both applicable to the client
> >>>>> side
> >>>>> > and
> >>>>> > > > the
> >>>>> > > > > master side.
> >>>>> > > > > 2. Add some dedicated tools for centrally parsing, verifying,
> >>>>> and
> >>>>> > > > managing
> >>>>> > > > > the Kubernetes parameters.
> >>>>> > > > >
> >>>>> > > > > It would be great to start the efforts before adding big
> >>>>> features for
> >>>>> > > the
> >>>>> > > > > native Kubernetes submodule, and Tison and I plan to work
> >>>>> together
> >>>>> > for
> >>>>> > > > this
> >>>>> > > > > issue.
> >>>>> > > > >
> >>>>> > > > > Please let me know your thoughts.
> >>>>> > > > >
> >>>>> > > > > Regards,
> >>>>> > > > > Canbin Zheng
> >>>>> > > > >
> >>>>> > > > > [1]
> >>>>> > > > >
> >>>>> > > > >
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>>
> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
> >>>>> > > > > [2] https://issues.apache.org/jira/browse/FLINK-16194
> >>>>> > > > >
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>>
> >>>>
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Canbin Zheng <fe...@gmail.com>.
Thanks, Yang Wang and tison,

Since we have reached a consensus on the decorator design evolution and the
parameters parser, and after an offline discussion with tison and Yang
Wang, we have agreed on attaching one PR instead of two to avoid repeated
code adapting and reviewing.

Thanks for all the feedback.

tison <wa...@gmail.com> 于2020年2月24日周一 下午10:14写道:

> Thank Canbin for starting the discussion and thanks for participating so
> far.
>
> I agree that we can start with the "Parameters Parser" part and actually
> split it into a
> self-contained issue.
>
> Best,
> tison.
>
>
> Yang Wang <da...@gmail.com> 于2020年2月24日周一 下午8:38写道:
>
>> It seems that we could benefit a lot from the "Parameters Parser". Let's
>> start to
>> add the dedicated jobmanager/taskmanager config parser classes.
>>
>>
>> Best,
>> Yang
>>
>> Canbin Zheng <fe...@gmail.com> 于2020年2月24日周一 上午10:39写道:
>>
>>> Hi, Yang Wang,
>>>
>>> Thanks for the feedback.
>>>
>>> > Parameters Parser
>>>
>>> I can think of many benefits of the dedicated parameter
>>> parsing/verifying tools, here is some of them:
>>> 1. Reuse the parameters parsing and verifying code.
>>> 2. Ensure consistent handling logic for the same setting.
>>> 3. Simplify some of the method's signature because we can avoid defining
>>> too many unnecessary input parameters.
>>>
>>> > BTW, the fabric8 kubernetes-client and Kubernetes api server already
>>> has
>>> the parameters check before starting to create the resources. I think
>>> the exceptions
>>> are usually meaning and enough for the users to get the root cause.
>>>
>>> 1. On the one hand, there are gaps between the declarative model used by
>>> Kubernetes and the configuration model used by Flink. Indeed, the
>>> Kubernetes client/server helps check the Kubernetes side parameters and
>>> throw exceptions in case of failures; however, the exception messages are
>>> not always easily understood from the perspective of Flink users.
>>> Independent of the solutions, what we should make sure is that the
>>> exception information in the Flink configuration model is meaningful enough
>>> for the Flink users.
>>> 2. On the other hand, some prechecking is beneficial in some scenarios
>>> since we can avoid useless effort on Kubernetes resource creation in case
>>> of failures leading to clean up all the created resources.
>>>
>>>
>>> Regards,
>>> Canbin Zheng
>>>
>>> Yang Wang <da...@gmail.com> 于2020年2月23日周日 下午10:37写道:
>>>
>>>> > The new introduced decorator
>>>> After some offline discussion with Canbin and tison, i totally
>>>> understand
>>>> the evolved decorator design. Each decorator will be self-contained and
>>>> is responsible for just one thing. Currently, if we want to mount a new
>>>> config
>>>> file to jobmanager/taskmanager, then both `ConfigMapDecorator`,
>>>> `JobManagerDecorator`, `TaskManagerDecorator` needs to be updated.
>>>> It is not convenient for the new contributors to do this. In the new
>>>> design,
>>>> by leveraging the accompanying kubernetes resources in decorator, we
>>>> could finish the creating and mounting config map in one decorator.
>>>>
>>>> Since now we just have a basic implementation for native Kubernetes
>>>> integration and lots of features need to be developed. And many users
>>>> want to participate in and contribute to the integration. So i agree to
>>>> refactor
>>>> the current decorator implementation and make it easier for the new
>>>> contributor.
>>>>
>>>> For the detailed divergences(naming, etc.), i think we could discuss in
>>>> the PR.
>>>>
>>>> > Parameters Parser
>>>> Currently, the decorator directly use Flink configuration to get the
>>>> parameters
>>>> to build the Kubernetes resource. It is straightforward, however we
>>>> could not
>>>> have a unified parameters check. So i am not sure whether you will
>>>> introduce
>>>> a tool to check the parameters or just simply have our own basic check.
>>>>
>>>> BTW, the fabric8 kubernetes-client and Kubernetes api server already has
>>>> the parameters check before starting to create the resources. I think
>>>> the exceptions
>>>> are usually meaning and enough for the users to get the root cause.
>>>>
>>>>
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>>
>>>>
>>>>
>>>> felixzheng zheng <fe...@gmail.com> 于2020年2月22日周六 上午10:54写道:
>>>>
>>>>> Great thanks for the quick feedback Till. You are right; it is not a
>>>>> fundamentally different approach compared to
>>>>> what we have right now, all the Kubernetes resources created are the
>>>>> same,
>>>>> we aim to evolve the existing decorator approach so that,
>>>>> 1. the decorators are monadic and smaller in size and functionality.
>>>>> 2. the new decorator design allows reusing the decorators between the
>>>>> client and the cluster as much as possible.
>>>>> 3. all the decorators are independent with each other, and they could
>>>>> have
>>>>> arbitrary order in the chain, they share the same APIs and follow a
>>>>> unified
>>>>> orchestrator architecture so that new developers could quickly
>>>>> understand
>>>>> what should be done to introduce a new feature.
>>>>>
>>>>> Besides that, the new approach allows us adding tests for every
>>>>> decorator
>>>>> alone instead of doing a final test of all the decorators in the
>>>>> Fabric8ClientTest.java.
>>>>>
>>>>> Cheers,
>>>>> Canbin Zheng
>>>>>
>>>>> Till Rohrmann <tr...@apache.org> 于2020年2月22日周六 上午12:28写道:
>>>>>
>>>>> > Thanks for starting this discussion Canbin. If I understand your
>>>>> proposal
>>>>> > correctly, then you would like to evolve the existing decorator
>>>>> approach so
>>>>> > that decorators are monadic and smaller in size and functionality.
>>>>> The
>>>>> > latter aspect will allow to reuse them between the client and the
>>>>> cluster.
>>>>> > Just to make sure, it is not a fundamentally different approach
>>>>> compared to
>>>>> > what we have right now, is it?
>>>>> >
>>>>> > If this is the case, then I think it makes sense to reuse code as
>>>>> much as
>>>>> > possible and to create small code units which are easier to test.
>>>>> >
>>>>> > Cheers,
>>>>> > Till
>>>>> >
>>>>> > On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <
>>>>> felixzhengcb@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks for the feedback @Yang Wang. I would like to discuss some
>>>>> of the
>>>>> > > details in depth about why I am confused about the existing design.
>>>>> > >
>>>>> > > Question 1: How do we mount a configuration file?
>>>>> > >
>>>>> > > For the existing design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    We need several classes to finish it:
>>>>> > >    1.
>>>>> > >
>>>>> > >       InitializerDecorator
>>>>> > >       2.
>>>>> > >
>>>>> > >       OwnerReferenceDecorator
>>>>> > >       3.
>>>>> > >
>>>>> > >       ConfigMapDecorator
>>>>> > >       4.
>>>>> > >
>>>>> > >       KubernetesUtils: providing the getConfigMapVolume method to
>>>>> share
>>>>> > for
>>>>> > >       the FlinkMasterDeploymentDecorator and the
>>>>> TaskManagerPodDecorator.
>>>>> > >       5.
>>>>> > >
>>>>> > >       FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
>>>>> > >       6.
>>>>> > >
>>>>> > >       TaskManagerPodDecorator: mounts the ConfigMap volume.
>>>>> > >       7.
>>>>> > >
>>>>> > >       If in the future, someone would like to introduce an init
>>>>> > Container,
>>>>> > >       the InitContainerDecorator has to mount the ConfigMap volume
>>>>> too.
>>>>> > >
>>>>> > >
>>>>> > > I am confused about the current solution to mounting a
>>>>> configuration
>>>>> > file:
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    Actually, we do not need so many Decorators for mounting a file.
>>>>> > >    2.
>>>>> > >
>>>>> > >    If we would like to mount a new file, we have no choice but to
>>>>> repeat
>>>>> > >    the same tedious and scattered routine.
>>>>> > >    3.
>>>>> > >
>>>>> > >    There’s no easy way to test the file mounting functionality
>>>>> alone; we
>>>>> > >    have to construct the ConfigMap, the Deployment or the
>>>>> TaskManagerPod
>>>>> > > first
>>>>> > >    and then do a final test.
>>>>> > >
>>>>> > >
>>>>> > > The reason why it is so complex to mount a configuration file is
>>>>> that we
>>>>> > > don’t fully consider the internal connections among those
>>>>> resources in
>>>>> > the
>>>>> > > existing design.
>>>>> > >
>>>>> > > The new abstraction we proposed could solve such a kind of
>>>>> problem, the
>>>>> > new
>>>>> > > Decorator object is as follows:
>>>>> > >
>>>>> > > public interface KubernetesStepDecorator {
>>>>> > >
>>>>> > >   /**
>>>>> > >
>>>>> > >    * Apply transformations to the given FlinkPod in accordance
>>>>> with this
>>>>> > > feature. This can include adding
>>>>> > >
>>>>> > >    * labels/annotations, mounting volumes, and setting startup
>>>>> command or
>>>>> > > parameters, etc.
>>>>> > >
>>>>> > >    */
>>>>> > >
>>>>> > >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
>>>>> > >
>>>>> > >   /**
>>>>> > >
>>>>> > >    * Build the accompanying Kubernetes resources that should be
>>>>> > introduced
>>>>> > > to support this feature. This could
>>>>> > >
>>>>> > >    * only applicable to the client-side submission process.
>>>>> > >
>>>>> > >    */
>>>>> > >
>>>>> > >   List<HasMetadata> buildAccompanyingKubernetesResources() throws
>>>>> > > IOException;
>>>>> > >
>>>>> > > }
>>>>> > >
>>>>> > > The FlinkPod is a composition of the Pod, the main Container, the
>>>>> init
>>>>> > > Container, and the sidecar Container.
>>>>> > >
>>>>> > > Next, we introduce a KubernetesStepDecorator implementation, the
>>>>> method
>>>>> > of
>>>>> > > buildAccompanyingKubernetesResources creates the corresponding
>>>>> ConfigMap,
>>>>> > > and the method of decorateFlinkPod configures the Volume for the
>>>>> Pod and
>>>>> > > all the Containers.
>>>>> > >
>>>>> > > So, for the scenario of mounting a configuration file, the
>>>>> advantages of
>>>>> > > this new architecture are as follows:
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    One dedicated KubernetesStepDecorator implementation is enough
>>>>> to
>>>>> > finish
>>>>> > >    all the mounting work, meanwhile, this class would be shared
>>>>> between
>>>>> > the
>>>>> > >    client-side and the master-side. Besides that, the number of
>>>>> lines of
>>>>> > > code
>>>>> > >    in that class will not exceed 300 lines which facilitate code
>>>>> > > readability
>>>>> > >    and maintenance.
>>>>> > >    2.
>>>>> > >
>>>>> > >    Testing becomes an easy thing now, as we can add a dedicated
>>>>> test
>>>>> > class
>>>>> > >    for only this class, the test would never rely on the
>>>>> construction of
>>>>> > > other
>>>>> > >    components such as the Deployment.
>>>>> > >    3.
>>>>> > >
>>>>> > >    It’s quite convenient to mount a new configuration file via the
>>>>> newly
>>>>> > >    dedicated KubernetesStepDecorator implementation.
>>>>> > >
>>>>> > >
>>>>> > > Question 2: How do we construct the Pod?
>>>>> > >
>>>>> > > For the existing design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    The FlinkMasterDeploymentDecorator is responsible for building
>>>>> the
>>>>> > >    Deployment and configuring the Pod and the Containers, while the
>>>>> > >    TaskManagerPodDecorator is responsible for building the
>>>>> TaskManager
>>>>> > Pod.
>>>>> > >    Take FlinkMasterDeploymentDecorator as an example, let’s see
>>>>> what it
>>>>> > has
>>>>> > >    done.
>>>>> > >    1.
>>>>> > >
>>>>> > >       Configure the main Container, including the name, command,
>>>>> args,
>>>>> > >       image, image pull policy, resource requirements, ports, all
>>>>> kinds
>>>>> > of
>>>>> > >       environment variables, all kinds of volume mounts, etc.
>>>>> > >       2.
>>>>> > >
>>>>> > >       Configure the Pod, including service account, all kinds of
>>>>> volumes,
>>>>> > >       and attach all kinds of Container, including the main
>>>>> Container,
>>>>> > the
>>>>> > > init
>>>>> > >       Container, and the sidecar Container.
>>>>> > >       3.
>>>>> > >
>>>>> > >       Configure the Deployment.
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    The InitializerDecorator and the OwnerReferenceDecorator have
>>>>> basic
>>>>> > >    logic so that the most complex work is completed in the
>>>>> > >    FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
>>>>> With
>>>>> > the
>>>>> > >    introduction of new features for the Pod, such as customized
>>>>> volume
>>>>> > > mounts,
>>>>> > >    Hadoop configuration support, Kerberized HDFS support, secret
>>>>> mounts,
>>>>> > >    Python support, etc. the construction process could become far
>>>>> more
>>>>> > >    complicated, and the functionality of a single class could
>>>>> explode,
>>>>> > > which
>>>>> > >    hurts code readability, writability, and testability. Besides
>>>>> that,
>>>>> > both
>>>>> > >    the client-side and the master-side shares much of the Pod
>>>>> > construction
>>>>> > >    logic.
>>>>> > >
>>>>> > >
>>>>> > > So the problems are as follows:
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    We don’t have a consistent abstraction that is applicable to
>>>>> both the
>>>>> > >    client-side and the master-side for Pod construction (including
>>>>> the
>>>>> > >    Containers) so that we have to share some of the code via tool
>>>>> classes
>>>>> > >    which is a trick solution, however, we can’t share most of the
>>>>> code as
>>>>> > > much
>>>>> > >    as possible.
>>>>> > >    2.
>>>>> > >
>>>>> > >    We don’t have a step-by-step orchestrator architecture to help
>>>>> the Pod
>>>>> > >    construction.
>>>>> > >
>>>>> > >
>>>>> > > For the proposed design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    The problems above are all solved: we have a consistent
>>>>> abstraction
>>>>> > that
>>>>> > >    leads to a monadic-step orchestrator architecture to construct
>>>>> the Pod
>>>>> > > step
>>>>> > >    by step. One step is responsible for exactly one thing,
>>>>> following that
>>>>> > > is
>>>>> > >    the fact that every step is well testable, because each unit
>>>>> can be
>>>>> > >    considerably small, and the number of branches to test in any
>>>>> given
>>>>> > > step is
>>>>> > >    limited. Moreover, much of the step could be shared between the
>>>>> > > client-side
>>>>> > >    and the master-side, such as configuration files mount, etc.
>>>>> > >
>>>>> > >
>>>>> > > Question 3: Could all the resources share the same orchestrator
>>>>> > > architecture?
>>>>> > >
>>>>> > > For the existing design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    We don’t have a unified orchestrator architecture for the
>>>>> construction
>>>>> > >    of all the Kubernetes resources, therefore, we need a Decorator
>>>>> chain
>>>>> > > for
>>>>> > >    every Kubernetes resource.
>>>>> > >
>>>>> > >
>>>>> > > For the proposed design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    Following Question 1 ~ 2 and the design doc[1], we have
>>>>> introduced a
>>>>> > >    monadic-step orchestrator architecture to construct the Pod and
>>>>> all
>>>>> > the
>>>>> > >    Containers. Besides that, this architecture also works for the
>>>>> other
>>>>> > >    resources, such as the Services and the ConfigMaps. For
>>>>> example, if we
>>>>> > > need
>>>>> > >    a new Service or a ConfigMap, just introduce a new step.
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > > Question 4: How do we introduce the InitContainer or the side-car
>>>>> > > Container?
>>>>> > >
>>>>> > > For the existing design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    Both the client-side and the master-side introduce some new
>>>>> > Decorators,
>>>>> > >
>>>>> > > the client-side chain could be:
>>>>> > >
>>>>> > > InitializerDecorator -> OwnerReferenceDecorator ->
>>>>> > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>>>>> > > SidecarDecorator -> etc
>>>>> > >
>>>>> > > -
>>>>> > >
>>>>> > > and the master-side could be:
>>>>> > >
>>>>> > > InitializerDecorator -> OwnerReferenceDecorator  ->
>>>>> > TaskManagerPodDecorator
>>>>> > > -> InitContainerDecorator -> SidecarDecorator -> etc
>>>>> > >
>>>>> > > As we can see, the FlinkMasterDeploymentDecorator or the
>>>>> > > TaskManagerPodDecorator is designed for the Pod construction,
>>>>> including
>>>>> > the
>>>>> > > Containers, so we don’t need to treat the init Container and the
>>>>> sidecar
>>>>> > > Container as special cases different from the main Container by
>>>>> > introducing
>>>>> > > a dedicated Decorator for every one of them. Such kind of trick
>>>>> solution
>>>>> > > confuses me, maybe to the other developers.
>>>>> > >
>>>>> > > For the proposed design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    Following Question 1 ~ 4 and the design doc [1], the central
>>>>> premise
>>>>> > of
>>>>> > >    the proposed orchestrator architecture is that the Pod, the main
>>>>> > > Container,
>>>>> > >    the init Container, the sidecar Container, the Services, and the
>>>>> > > ConfigMaps
>>>>> > >    all sit on an equal footing. For every one of the Pod, the main
>>>>> > > Container,
>>>>> > >    the init Container and the sidecar Container, people could
>>>>> introduce
>>>>> > any
>>>>> > >    number of steps to finish its construction; we attach all the
>>>>> > > Containers to
>>>>> > >    the Pod in the Builder tool classes after the orchestrator
>>>>> > architecture
>>>>> > >    constructs them.
>>>>> > >
>>>>> > >
>>>>> > > Question 5: What is the relation between the Decorators?
>>>>> > >
>>>>> > > For the existing design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    The Decorators are not independent; most of them have a strict
>>>>> order
>>>>> > in
>>>>> > >    the chain.
>>>>> > >    2.
>>>>> > >
>>>>> > >    The Decorators do not share common APIs in essence, as their
>>>>> input
>>>>> > type
>>>>> > >    could be different so that we can’t finish the construction of a
>>>>> > > Kubernetes
>>>>> > >    resource such as the Deployment and the TaskManager Pod through
>>>>> a
>>>>> > > one-time
>>>>> > >    traversal, there are boxing and unboxing overheads among some
>>>>> of the
>>>>> > >    neighboring Decorators.
>>>>> > >
>>>>> > >
>>>>> > > For the proposed design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    The steps are completely independent so that they could have
>>>>> arbitrary
>>>>> > >    order in the chain; The only rule is that the Hadoop
>>>>> configuration
>>>>> > >    Decorator should be the last node in the chain.
>>>>> > >    2.
>>>>> > >
>>>>> > >    All the steps share common APIs, with the same input type of
>>>>> all the
>>>>> > >    methods, so we can construct a Kubernetes resource via a
>>>>> one-time
>>>>> > > traversal.
>>>>> > >
>>>>> > >
>>>>> > > Question 6: What is the number of Decorators chains?
>>>>> > >
>>>>> > > For the existing design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    People have to introduce a new chain once they introduce a new
>>>>> > >    Kubernetes resource. For example, a new ConfigMap Decorator
>>>>> chain for
>>>>> > >    mounting a new configuration file. At this moment, we already
>>>>> have
>>>>> > five
>>>>> > >    Decorator chains.
>>>>> > >
>>>>> > >
>>>>> > > For the proposed design,
>>>>> > >
>>>>> > >    1.
>>>>> > >
>>>>> > >    There are always two chains, one is for constructing all the
>>>>> > Kubernetes
>>>>> > >    resources on the client-side, including the JobManager Pod, the
>>>>> > > Containers,
>>>>> > >    the ConfigMap(s), and the Services(s); the other one is for
>>>>> > constructing
>>>>> > >    the TaskManager Pod and the Containers.
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > > Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
>>>>> > >
>>>>> > > >  Hi Canbing,
>>>>> > > >
>>>>> > > >
>>>>> > > > Thanks a lot for sharing your thoughts to improve the Flink on
>>>>> K8s
>>>>> > native
>>>>> > > > integration.
>>>>> > > > Frankly speaking, your discussion title confuses me and i am
>>>>> wondering
>>>>> > > > whether you
>>>>> > > > want to refactor the whole design. However, after i dive into the
>>>>> > details
>>>>> > > > and the provided
>>>>> > > > document, i realize that mostly you want to improve the
>>>>> implementation.
>>>>> > > >
>>>>> > > >
>>>>> > > > Regarding your two main points.
>>>>> > > >
>>>>> > > > >> Introduce a unified monadic-step based orchestrator
>>>>> architecture
>>>>> > that
>>>>> > > > has a better,
>>>>> > > > cleaner and consistent abstraction for the Kubernetes resources
>>>>> > > > construction process,
>>>>> > > > both applicable to the client side and the master side.
>>>>> > > >
>>>>> > > > When i introduce the decorator for the K8s in Flink, there is
>>>>> always a
>>>>> > > > guideline in my mind
>>>>> > > > that it should be easy for extension and adding new features.
>>>>> Just as
>>>>> > you
>>>>> > > > say, we have lots
>>>>> > > > of functions to support and the K8s is also evolving very fast.
>>>>> The
>>>>> > > current
>>>>> > > > `ConfigMapDecorator`,
>>>>> > > > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a
>>>>> basic
>>>>> > > > implementation with
>>>>> > > > some prerequisite parameters. Of course we could chain more
>>>>> decorators
>>>>> > to
>>>>> > > > construct the K8s
>>>>> > > > resources. For example, InitializerDecorator ->
>>>>> OwnerReferenceDecorator
>>>>> > > ->
>>>>> > > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>>>>> > > > SidecarDecorator -> etc.
>>>>> > > >
>>>>> > > > I am little sceptical about splitting every parameter into a
>>>>> single
>>>>> > > > decorator.  Since it does
>>>>> > > > not take too much benefits. But i agree with moving some common
>>>>> > > parameters
>>>>> > > > into separate
>>>>> > > > decorators(e.g. volume mount). Also introducing the `~Builder`
>>>>> class
>>>>> > and
>>>>> > > > spinning off the chaining
>>>>> > > > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
>>>>> > > >
>>>>> > > >
>>>>> > > >
>>>>> > > > >> Add some dedicated tools for centrally parsing, verifying, and
>>>>> > > managing
>>>>> > > > the Kubernetes parameters.
>>>>> > > >
>>>>> > > > Currently, we always get the parameters directly from Flink
>>>>> > > configuration(
>>>>> > > > e.g.
>>>>> > `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
>>>>> > > > think it could be improved
>>>>> > > > by introducing some dedicated conf parser classes. It is a good
>>>>> idea.
>>>>> > > >
>>>>> > > >
>>>>> > > > Best,
>>>>> > > > Yang
>>>>> > > >
>>>>> > > >
>>>>> > > >
>>>>> > > >
>>>>> > > > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五
>>>>> 上午9:31写道:
>>>>> > > >
>>>>> > > > > Hi everyone,
>>>>> > > > >
>>>>> > > > > I would like to kick off a discussion on refactoring the
>>>>> existing
>>>>> > > > > Kubernetes resources construction architecture design.
>>>>> > > > >
>>>>> > > > > I created a design document [1] that clarifies our motivation
>>>>> to do
>>>>> > > this
>>>>> > > > > and some improvement proposals for the new design.
>>>>> > > > >
>>>>> > > > > Briefly, we would like to
>>>>> > > > > 1. Introduce a unified monadic-step based orchestrator
>>>>> architecture
>>>>> > > that
>>>>> > > > > has a better, cleaner and consistent abstraction for the
>>>>> Kubernetes
>>>>> > > > > resources construction process, both applicable to the client
>>>>> side
>>>>> > and
>>>>> > > > the
>>>>> > > > > master side.
>>>>> > > > > 2. Add some dedicated tools for centrally parsing, verifying,
>>>>> and
>>>>> > > > managing
>>>>> > > > > the Kubernetes parameters.
>>>>> > > > >
>>>>> > > > > It would be great to start the efforts before adding big
>>>>> features for
>>>>> > > the
>>>>> > > > > native Kubernetes submodule, and Tison and I plan to work
>>>>> together
>>>>> > for
>>>>> > > > this
>>>>> > > > > issue.
>>>>> > > > >
>>>>> > > > > Please let me know your thoughts.
>>>>> > > > >
>>>>> > > > > Regards,
>>>>> > > > > Canbin Zheng
>>>>> > > > >
>>>>> > > > > [1]
>>>>> > > > >
>>>>> > > > >
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
>>>>> > > > > [2] https://issues.apache.org/jira/browse/FLINK-16194
>>>>> > > > >
>>>>> > > >
>>>>> > >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by tison <wa...@gmail.com>.
Thank Canbin for starting the discussion and thanks for participating so
far.

I agree that we can start with the "Parameters Parser" part and actually
split it into a
self-contained issue.

Best,
tison.


Yang Wang <da...@gmail.com> 于2020年2月24日周一 下午8:38写道:

> It seems that we could benefit a lot from the "Parameters Parser". Let's
> start to
> add the dedicated jobmanager/taskmanager config parser classes.
>
>
> Best,
> Yang
>
> Canbin Zheng <fe...@gmail.com> 于2020年2月24日周一 上午10:39写道:
>
>> Hi, Yang Wang,
>>
>> Thanks for the feedback.
>>
>> > Parameters Parser
>>
>> I can think of many benefits of the dedicated parameter parsing/verifying
>> tools, here is some of them:
>> 1. Reuse the parameters parsing and verifying code.
>> 2. Ensure consistent handling logic for the same setting.
>> 3. Simplify some of the method's signature because we can avoid defining
>> too many unnecessary input parameters.
>>
>> > BTW, the fabric8 kubernetes-client and Kubernetes api server already has
>> the parameters check before starting to create the resources. I think the
>> exceptions
>> are usually meaning and enough for the users to get the root cause.
>>
>> 1. On the one hand, there are gaps between the declarative model used by
>> Kubernetes and the configuration model used by Flink. Indeed, the
>> Kubernetes client/server helps check the Kubernetes side parameters and
>> throw exceptions in case of failures; however, the exception messages are
>> not always easily understood from the perspective of Flink users.
>> Independent of the solutions, what we should make sure is that the
>> exception information in the Flink configuration model is meaningful enough
>> for the Flink users.
>> 2. On the other hand, some prechecking is beneficial in some scenarios
>> since we can avoid useless effort on Kubernetes resource creation in case
>> of failures leading to clean up all the created resources.
>>
>>
>> Regards,
>> Canbin Zheng
>>
>> Yang Wang <da...@gmail.com> 于2020年2月23日周日 下午10:37写道:
>>
>>> > The new introduced decorator
>>> After some offline discussion with Canbin and tison, i totally understand
>>> the evolved decorator design. Each decorator will be self-contained and
>>> is responsible for just one thing. Currently, if we want to mount a new
>>> config
>>> file to jobmanager/taskmanager, then both `ConfigMapDecorator`,
>>> `JobManagerDecorator`, `TaskManagerDecorator` needs to be updated.
>>> It is not convenient for the new contributors to do this. In the new
>>> design,
>>> by leveraging the accompanying kubernetes resources in decorator, we
>>> could finish the creating and mounting config map in one decorator.
>>>
>>> Since now we just have a basic implementation for native Kubernetes
>>> integration and lots of features need to be developed. And many users
>>> want to participate in and contribute to the integration. So i agree to
>>> refactor
>>> the current decorator implementation and make it easier for the new
>>> contributor.
>>>
>>> For the detailed divergences(naming, etc.), i think we could discuss in
>>> the PR.
>>>
>>> > Parameters Parser
>>> Currently, the decorator directly use Flink configuration to get the
>>> parameters
>>> to build the Kubernetes resource. It is straightforward, however we
>>> could not
>>> have a unified parameters check. So i am not sure whether you will
>>> introduce
>>> a tool to check the parameters or just simply have our own basic check.
>>>
>>> BTW, the fabric8 kubernetes-client and Kubernetes api server already has
>>> the parameters check before starting to create the resources. I think
>>> the exceptions
>>> are usually meaning and enough for the users to get the root cause.
>>>
>>>
>>>
>>> Best,
>>> Yang
>>>
>>>
>>>
>>>
>>> felixzheng zheng <fe...@gmail.com> 于2020年2月22日周六 上午10:54写道:
>>>
>>>> Great thanks for the quick feedback Till. You are right; it is not a
>>>> fundamentally different approach compared to
>>>> what we have right now, all the Kubernetes resources created are the
>>>> same,
>>>> we aim to evolve the existing decorator approach so that,
>>>> 1. the decorators are monadic and smaller in size and functionality.
>>>> 2. the new decorator design allows reusing the decorators between the
>>>> client and the cluster as much as possible.
>>>> 3. all the decorators are independent with each other, and they could
>>>> have
>>>> arbitrary order in the chain, they share the same APIs and follow a
>>>> unified
>>>> orchestrator architecture so that new developers could quickly
>>>> understand
>>>> what should be done to introduce a new feature.
>>>>
>>>> Besides that, the new approach allows us adding tests for every
>>>> decorator
>>>> alone instead of doing a final test of all the decorators in the
>>>> Fabric8ClientTest.java.
>>>>
>>>> Cheers,
>>>> Canbin Zheng
>>>>
>>>> Till Rohrmann <tr...@apache.org> 于2020年2月22日周六 上午12:28写道:
>>>>
>>>> > Thanks for starting this discussion Canbin. If I understand your
>>>> proposal
>>>> > correctly, then you would like to evolve the existing decorator
>>>> approach so
>>>> > that decorators are monadic and smaller in size and functionality. The
>>>> > latter aspect will allow to reuse them between the client and the
>>>> cluster.
>>>> > Just to make sure, it is not a fundamentally different approach
>>>> compared to
>>>> > what we have right now, is it?
>>>> >
>>>> > If this is the case, then I think it makes sense to reuse code as
>>>> much as
>>>> > possible and to create small code units which are easier to test.
>>>> >
>>>> > Cheers,
>>>> > Till
>>>> >
>>>> > On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <
>>>> felixzhengcb@gmail.com>
>>>> > wrote:
>>>> >
>>>> > > Thanks for the feedback @Yang Wang. I would like to discuss some of
>>>> the
>>>> > > details in depth about why I am confused about the existing design.
>>>> > >
>>>> > > Question 1: How do we mount a configuration file?
>>>> > >
>>>> > > For the existing design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    We need several classes to finish it:
>>>> > >    1.
>>>> > >
>>>> > >       InitializerDecorator
>>>> > >       2.
>>>> > >
>>>> > >       OwnerReferenceDecorator
>>>> > >       3.
>>>> > >
>>>> > >       ConfigMapDecorator
>>>> > >       4.
>>>> > >
>>>> > >       KubernetesUtils: providing the getConfigMapVolume method to
>>>> share
>>>> > for
>>>> > >       the FlinkMasterDeploymentDecorator and the
>>>> TaskManagerPodDecorator.
>>>> > >       5.
>>>> > >
>>>> > >       FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
>>>> > >       6.
>>>> > >
>>>> > >       TaskManagerPodDecorator: mounts the ConfigMap volume.
>>>> > >       7.
>>>> > >
>>>> > >       If in the future, someone would like to introduce an init
>>>> > Container,
>>>> > >       the InitContainerDecorator has to mount the ConfigMap volume
>>>> too.
>>>> > >
>>>> > >
>>>> > > I am confused about the current solution to mounting a configuration
>>>> > file:
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    Actually, we do not need so many Decorators for mounting a file.
>>>> > >    2.
>>>> > >
>>>> > >    If we would like to mount a new file, we have no choice but to
>>>> repeat
>>>> > >    the same tedious and scattered routine.
>>>> > >    3.
>>>> > >
>>>> > >    There’s no easy way to test the file mounting functionality
>>>> alone; we
>>>> > >    have to construct the ConfigMap, the Deployment or the
>>>> TaskManagerPod
>>>> > > first
>>>> > >    and then do a final test.
>>>> > >
>>>> > >
>>>> > > The reason why it is so complex to mount a configuration file is
>>>> that we
>>>> > > don’t fully consider the internal connections among those resources
>>>> in
>>>> > the
>>>> > > existing design.
>>>> > >
>>>> > > The new abstraction we proposed could solve such a kind of problem,
>>>> the
>>>> > new
>>>> > > Decorator object is as follows:
>>>> > >
>>>> > > public interface KubernetesStepDecorator {
>>>> > >
>>>> > >   /**
>>>> > >
>>>> > >    * Apply transformations to the given FlinkPod in accordance with
>>>> this
>>>> > > feature. This can include adding
>>>> > >
>>>> > >    * labels/annotations, mounting volumes, and setting startup
>>>> command or
>>>> > > parameters, etc.
>>>> > >
>>>> > >    */
>>>> > >
>>>> > >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
>>>> > >
>>>> > >   /**
>>>> > >
>>>> > >    * Build the accompanying Kubernetes resources that should be
>>>> > introduced
>>>> > > to support this feature. This could
>>>> > >
>>>> > >    * only applicable to the client-side submission process.
>>>> > >
>>>> > >    */
>>>> > >
>>>> > >   List<HasMetadata> buildAccompanyingKubernetesResources() throws
>>>> > > IOException;
>>>> > >
>>>> > > }
>>>> > >
>>>> > > The FlinkPod is a composition of the Pod, the main Container, the
>>>> init
>>>> > > Container, and the sidecar Container.
>>>> > >
>>>> > > Next, we introduce a KubernetesStepDecorator implementation, the
>>>> method
>>>> > of
>>>> > > buildAccompanyingKubernetesResources creates the corresponding
>>>> ConfigMap,
>>>> > > and the method of decorateFlinkPod configures the Volume for the
>>>> Pod and
>>>> > > all the Containers.
>>>> > >
>>>> > > So, for the scenario of mounting a configuration file, the
>>>> advantages of
>>>> > > this new architecture are as follows:
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    One dedicated KubernetesStepDecorator implementation is enough to
>>>> > finish
>>>> > >    all the mounting work, meanwhile, this class would be shared
>>>> between
>>>> > the
>>>> > >    client-side and the master-side. Besides that, the number of
>>>> lines of
>>>> > > code
>>>> > >    in that class will not exceed 300 lines which facilitate code
>>>> > > readability
>>>> > >    and maintenance.
>>>> > >    2.
>>>> > >
>>>> > >    Testing becomes an easy thing now, as we can add a dedicated test
>>>> > class
>>>> > >    for only this class, the test would never rely on the
>>>> construction of
>>>> > > other
>>>> > >    components such as the Deployment.
>>>> > >    3.
>>>> > >
>>>> > >    It’s quite convenient to mount a new configuration file via the
>>>> newly
>>>> > >    dedicated KubernetesStepDecorator implementation.
>>>> > >
>>>> > >
>>>> > > Question 2: How do we construct the Pod?
>>>> > >
>>>> > > For the existing design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    The FlinkMasterDeploymentDecorator is responsible for building
>>>> the
>>>> > >    Deployment and configuring the Pod and the Containers, while the
>>>> > >    TaskManagerPodDecorator is responsible for building the
>>>> TaskManager
>>>> > Pod.
>>>> > >    Take FlinkMasterDeploymentDecorator as an example, let’s see
>>>> what it
>>>> > has
>>>> > >    done.
>>>> > >    1.
>>>> > >
>>>> > >       Configure the main Container, including the name, command,
>>>> args,
>>>> > >       image, image pull policy, resource requirements, ports, all
>>>> kinds
>>>> > of
>>>> > >       environment variables, all kinds of volume mounts, etc.
>>>> > >       2.
>>>> > >
>>>> > >       Configure the Pod, including service account, all kinds of
>>>> volumes,
>>>> > >       and attach all kinds of Container, including the main
>>>> Container,
>>>> > the
>>>> > > init
>>>> > >       Container, and the sidecar Container.
>>>> > >       3.
>>>> > >
>>>> > >       Configure the Deployment.
>>>> > >
>>>> > >
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    The InitializerDecorator and the OwnerReferenceDecorator have
>>>> basic
>>>> > >    logic so that the most complex work is completed in the
>>>> > >    FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
>>>> With
>>>> > the
>>>> > >    introduction of new features for the Pod, such as customized
>>>> volume
>>>> > > mounts,
>>>> > >    Hadoop configuration support, Kerberized HDFS support, secret
>>>> mounts,
>>>> > >    Python support, etc. the construction process could become far
>>>> more
>>>> > >    complicated, and the functionality of a single class could
>>>> explode,
>>>> > > which
>>>> > >    hurts code readability, writability, and testability. Besides
>>>> that,
>>>> > both
>>>> > >    the client-side and the master-side shares much of the Pod
>>>> > construction
>>>> > >    logic.
>>>> > >
>>>> > >
>>>> > > So the problems are as follows:
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    We don’t have a consistent abstraction that is applicable to
>>>> both the
>>>> > >    client-side and the master-side for Pod construction (including
>>>> the
>>>> > >    Containers) so that we have to share some of the code via tool
>>>> classes
>>>> > >    which is a trick solution, however, we can’t share most of the
>>>> code as
>>>> > > much
>>>> > >    as possible.
>>>> > >    2.
>>>> > >
>>>> > >    We don’t have a step-by-step orchestrator architecture to help
>>>> the Pod
>>>> > >    construction.
>>>> > >
>>>> > >
>>>> > > For the proposed design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    The problems above are all solved: we have a consistent
>>>> abstraction
>>>> > that
>>>> > >    leads to a monadic-step orchestrator architecture to construct
>>>> the Pod
>>>> > > step
>>>> > >    by step. One step is responsible for exactly one thing,
>>>> following that
>>>> > > is
>>>> > >    the fact that every step is well testable, because each unit can
>>>> be
>>>> > >    considerably small, and the number of branches to test in any
>>>> given
>>>> > > step is
>>>> > >    limited. Moreover, much of the step could be shared between the
>>>> > > client-side
>>>> > >    and the master-side, such as configuration files mount, etc.
>>>> > >
>>>> > >
>>>> > > Question 3: Could all the resources share the same orchestrator
>>>> > > architecture?
>>>> > >
>>>> > > For the existing design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    We don’t have a unified orchestrator architecture for the
>>>> construction
>>>> > >    of all the Kubernetes resources, therefore, we need a Decorator
>>>> chain
>>>> > > for
>>>> > >    every Kubernetes resource.
>>>> > >
>>>> > >
>>>> > > For the proposed design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    Following Question 1 ~ 2 and the design doc[1], we have
>>>> introduced a
>>>> > >    monadic-step orchestrator architecture to construct the Pod and
>>>> all
>>>> > the
>>>> > >    Containers. Besides that, this architecture also works for the
>>>> other
>>>> > >    resources, such as the Services and the ConfigMaps. For example,
>>>> if we
>>>> > > need
>>>> > >    a new Service or a ConfigMap, just introduce a new step.
>>>> > >
>>>> > >
>>>> > >
>>>> > > Question 4: How do we introduce the InitContainer or the side-car
>>>> > > Container?
>>>> > >
>>>> > > For the existing design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    Both the client-side and the master-side introduce some new
>>>> > Decorators,
>>>> > >
>>>> > > the client-side chain could be:
>>>> > >
>>>> > > InitializerDecorator -> OwnerReferenceDecorator ->
>>>> > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>>>> > > SidecarDecorator -> etc
>>>> > >
>>>> > > -
>>>> > >
>>>> > > and the master-side could be:
>>>> > >
>>>> > > InitializerDecorator -> OwnerReferenceDecorator  ->
>>>> > TaskManagerPodDecorator
>>>> > > -> InitContainerDecorator -> SidecarDecorator -> etc
>>>> > >
>>>> > > As we can see, the FlinkMasterDeploymentDecorator or the
>>>> > > TaskManagerPodDecorator is designed for the Pod construction,
>>>> including
>>>> > the
>>>> > > Containers, so we don’t need to treat the init Container and the
>>>> sidecar
>>>> > > Container as special cases different from the main Container by
>>>> > introducing
>>>> > > a dedicated Decorator for every one of them. Such kind of trick
>>>> solution
>>>> > > confuses me, maybe to the other developers.
>>>> > >
>>>> > > For the proposed design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    Following Question 1 ~ 4 and the design doc [1], the central
>>>> premise
>>>> > of
>>>> > >    the proposed orchestrator architecture is that the Pod, the main
>>>> > > Container,
>>>> > >    the init Container, the sidecar Container, the Services, and the
>>>> > > ConfigMaps
>>>> > >    all sit on an equal footing. For every one of the Pod, the main
>>>> > > Container,
>>>> > >    the init Container and the sidecar Container, people could
>>>> introduce
>>>> > any
>>>> > >    number of steps to finish its construction; we attach all the
>>>> > > Containers to
>>>> > >    the Pod in the Builder tool classes after the orchestrator
>>>> > architecture
>>>> > >    constructs them.
>>>> > >
>>>> > >
>>>> > > Question 5: What is the relation between the Decorators?
>>>> > >
>>>> > > For the existing design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    The Decorators are not independent; most of them have a strict
>>>> order
>>>> > in
>>>> > >    the chain.
>>>> > >    2.
>>>> > >
>>>> > >    The Decorators do not share common APIs in essence, as their
>>>> input
>>>> > type
>>>> > >    could be different so that we can’t finish the construction of a
>>>> > > Kubernetes
>>>> > >    resource such as the Deployment and the TaskManager Pod through a
>>>> > > one-time
>>>> > >    traversal, there are boxing and unboxing overheads among some of
>>>> the
>>>> > >    neighboring Decorators.
>>>> > >
>>>> > >
>>>> > > For the proposed design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    The steps are completely independent so that they could have
>>>> arbitrary
>>>> > >    order in the chain; The only rule is that the Hadoop
>>>> configuration
>>>> > >    Decorator should be the last node in the chain.
>>>> > >    2.
>>>> > >
>>>> > >    All the steps share common APIs, with the same input type of all
>>>> the
>>>> > >    methods, so we can construct a Kubernetes resource via a one-time
>>>> > > traversal.
>>>> > >
>>>> > >
>>>> > > Question 6: What is the number of Decorators chains?
>>>> > >
>>>> > > For the existing design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    People have to introduce a new chain once they introduce a new
>>>> > >    Kubernetes resource. For example, a new ConfigMap Decorator
>>>> chain for
>>>> > >    mounting a new configuration file. At this moment, we already
>>>> have
>>>> > five
>>>> > >    Decorator chains.
>>>> > >
>>>> > >
>>>> > > For the proposed design,
>>>> > >
>>>> > >    1.
>>>> > >
>>>> > >    There are always two chains, one is for constructing all the
>>>> > Kubernetes
>>>> > >    resources on the client-side, including the JobManager Pod, the
>>>> > > Containers,
>>>> > >    the ConfigMap(s), and the Services(s); the other one is for
>>>> > constructing
>>>> > >    the TaskManager Pod and the Containers.
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > > Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
>>>> > >
>>>> > > >  Hi Canbing,
>>>> > > >
>>>> > > >
>>>> > > > Thanks a lot for sharing your thoughts to improve the Flink on K8s
>>>> > native
>>>> > > > integration.
>>>> > > > Frankly speaking, your discussion title confuses me and i am
>>>> wondering
>>>> > > > whether you
>>>> > > > want to refactor the whole design. However, after i dive into the
>>>> > details
>>>> > > > and the provided
>>>> > > > document, i realize that mostly you want to improve the
>>>> implementation.
>>>> > > >
>>>> > > >
>>>> > > > Regarding your two main points.
>>>> > > >
>>>> > > > >> Introduce a unified monadic-step based orchestrator
>>>> architecture
>>>> > that
>>>> > > > has a better,
>>>> > > > cleaner and consistent abstraction for the Kubernetes resources
>>>> > > > construction process,
>>>> > > > both applicable to the client side and the master side.
>>>> > > >
>>>> > > > When i introduce the decorator for the K8s in Flink, there is
>>>> always a
>>>> > > > guideline in my mind
>>>> > > > that it should be easy for extension and adding new features.
>>>> Just as
>>>> > you
>>>> > > > say, we have lots
>>>> > > > of functions to support and the K8s is also evolving very fast.
>>>> The
>>>> > > current
>>>> > > > `ConfigMapDecorator`,
>>>> > > > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a
>>>> basic
>>>> > > > implementation with
>>>> > > > some prerequisite parameters. Of course we could chain more
>>>> decorators
>>>> > to
>>>> > > > construct the K8s
>>>> > > > resources. For example, InitializerDecorator ->
>>>> OwnerReferenceDecorator
>>>> > > ->
>>>> > > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>>>> > > > SidecarDecorator -> etc.
>>>> > > >
>>>> > > > I am little sceptical about splitting every parameter into a
>>>> single
>>>> > > > decorator.  Since it does
>>>> > > > not take too much benefits. But i agree with moving some common
>>>> > > parameters
>>>> > > > into separate
>>>> > > > decorators(e.g. volume mount). Also introducing the `~Builder`
>>>> class
>>>> > and
>>>> > > > spinning off the chaining
>>>> > > > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > >> Add some dedicated tools for centrally parsing, verifying, and
>>>> > > managing
>>>> > > > the Kubernetes parameters.
>>>> > > >
>>>> > > > Currently, we always get the parameters directly from Flink
>>>> > > configuration(
>>>> > > > e.g.
>>>> > `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
>>>> > > > think it could be improved
>>>> > > > by introducing some dedicated conf parser classes. It is a good
>>>> idea.
>>>> > > >
>>>> > > >
>>>> > > > Best,
>>>> > > > Yang
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:
>>>> > > >
>>>> > > > > Hi everyone,
>>>> > > > >
>>>> > > > > I would like to kick off a discussion on refactoring the
>>>> existing
>>>> > > > > Kubernetes resources construction architecture design.
>>>> > > > >
>>>> > > > > I created a design document [1] that clarifies our motivation
>>>> to do
>>>> > > this
>>>> > > > > and some improvement proposals for the new design.
>>>> > > > >
>>>> > > > > Briefly, we would like to
>>>> > > > > 1. Introduce a unified monadic-step based orchestrator
>>>> architecture
>>>> > > that
>>>> > > > > has a better, cleaner and consistent abstraction for the
>>>> Kubernetes
>>>> > > > > resources construction process, both applicable to the client
>>>> side
>>>> > and
>>>> > > > the
>>>> > > > > master side.
>>>> > > > > 2. Add some dedicated tools for centrally parsing, verifying,
>>>> and
>>>> > > > managing
>>>> > > > > the Kubernetes parameters.
>>>> > > > >
>>>> > > > > It would be great to start the efforts before adding big
>>>> features for
>>>> > > the
>>>> > > > > native Kubernetes submodule, and Tison and I plan to work
>>>> together
>>>> > for
>>>> > > > this
>>>> > > > > issue.
>>>> > > > >
>>>> > > > > Please let me know your thoughts.
>>>> > > > >
>>>> > > > > Regards,
>>>> > > > > Canbin Zheng
>>>> > > > >
>>>> > > > > [1]
>>>> > > > >
>>>> > > > >
>>>> > > >
>>>> > >
>>>> >
>>>> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
>>>> > > > > [2] https://issues.apache.org/jira/browse/FLINK-16194
>>>> > > > >
>>>> > > >
>>>> > >
>>>> >
>>>>
>>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Yang Wang <da...@gmail.com>.
It seems that we could benefit a lot from the "Parameters Parser". Let's
start to
add the dedicated jobmanager/taskmanager config parser classes.


Best,
Yang

Canbin Zheng <fe...@gmail.com> 于2020年2月24日周一 上午10:39写道:

> Hi, Yang Wang,
>
> Thanks for the feedback.
>
> > Parameters Parser
>
> I can think of many benefits of the dedicated parameter parsing/verifying
> tools, here is some of them:
> 1. Reuse the parameters parsing and verifying code.
> 2. Ensure consistent handling logic for the same setting.
> 3. Simplify some of the method's signature because we can avoid defining
> too many unnecessary input parameters.
>
> > BTW, the fabric8 kubernetes-client and Kubernetes api server already has
> the parameters check before starting to create the resources. I think the
> exceptions
> are usually meaning and enough for the users to get the root cause.
>
> 1. On the one hand, there are gaps between the declarative model used by
> Kubernetes and the configuration model used by Flink. Indeed, the
> Kubernetes client/server helps check the Kubernetes side parameters and
> throw exceptions in case of failures; however, the exception messages are
> not always easily understood from the perspective of Flink users.
> Independent of the solutions, what we should make sure is that the
> exception information in the Flink configuration model is meaningful enough
> for the Flink users.
> 2. On the other hand, some prechecking is beneficial in some scenarios
> since we can avoid useless effort on Kubernetes resource creation in case
> of failures leading to clean up all the created resources.
>
>
> Regards,
> Canbin Zheng
>
> Yang Wang <da...@gmail.com> 于2020年2月23日周日 下午10:37写道:
>
>> > The new introduced decorator
>> After some offline discussion with Canbin and tison, i totally understand
>> the evolved decorator design. Each decorator will be self-contained and
>> is responsible for just one thing. Currently, if we want to mount a new
>> config
>> file to jobmanager/taskmanager, then both `ConfigMapDecorator`,
>> `JobManagerDecorator`, `TaskManagerDecorator` needs to be updated.
>> It is not convenient for the new contributors to do this. In the new
>> design,
>> by leveraging the accompanying kubernetes resources in decorator, we
>> could finish the creating and mounting config map in one decorator.
>>
>> Since now we just have a basic implementation for native Kubernetes
>> integration and lots of features need to be developed. And many users
>> want to participate in and contribute to the integration. So i agree to
>> refactor
>> the current decorator implementation and make it easier for the new
>> contributor.
>>
>> For the detailed divergences(naming, etc.), i think we could discuss in
>> the PR.
>>
>> > Parameters Parser
>> Currently, the decorator directly use Flink configuration to get the
>> parameters
>> to build the Kubernetes resource. It is straightforward, however we could
>> not
>> have a unified parameters check. So i am not sure whether you will
>> introduce
>> a tool to check the parameters or just simply have our own basic check.
>>
>> BTW, the fabric8 kubernetes-client and Kubernetes api server already has
>> the parameters check before starting to create the resources. I think the
>> exceptions
>> are usually meaning and enough for the users to get the root cause.
>>
>>
>>
>> Best,
>> Yang
>>
>>
>>
>>
>> felixzheng zheng <fe...@gmail.com> 于2020年2月22日周六 上午10:54写道:
>>
>>> Great thanks for the quick feedback Till. You are right; it is not a
>>> fundamentally different approach compared to
>>> what we have right now, all the Kubernetes resources created are the
>>> same,
>>> we aim to evolve the existing decorator approach so that,
>>> 1. the decorators are monadic and smaller in size and functionality.
>>> 2. the new decorator design allows reusing the decorators between the
>>> client and the cluster as much as possible.
>>> 3. all the decorators are independent with each other, and they could
>>> have
>>> arbitrary order in the chain, they share the same APIs and follow a
>>> unified
>>> orchestrator architecture so that new developers could quickly understand
>>> what should be done to introduce a new feature.
>>>
>>> Besides that, the new approach allows us adding tests for every decorator
>>> alone instead of doing a final test of all the decorators in the
>>> Fabric8ClientTest.java.
>>>
>>> Cheers,
>>> Canbin Zheng
>>>
>>> Till Rohrmann <tr...@apache.org> 于2020年2月22日周六 上午12:28写道:
>>>
>>> > Thanks for starting this discussion Canbin. If I understand your
>>> proposal
>>> > correctly, then you would like to evolve the existing decorator
>>> approach so
>>> > that decorators are monadic and smaller in size and functionality. The
>>> > latter aspect will allow to reuse them between the client and the
>>> cluster.
>>> > Just to make sure, it is not a fundamentally different approach
>>> compared to
>>> > what we have right now, is it?
>>> >
>>> > If this is the case, then I think it makes sense to reuse code as much
>>> as
>>> > possible and to create small code units which are easier to test.
>>> >
>>> > Cheers,
>>> > Till
>>> >
>>> > On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <
>>> felixzhengcb@gmail.com>
>>> > wrote:
>>> >
>>> > > Thanks for the feedback @Yang Wang. I would like to discuss some of
>>> the
>>> > > details in depth about why I am confused about the existing design.
>>> > >
>>> > > Question 1: How do we mount a configuration file?
>>> > >
>>> > > For the existing design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    We need several classes to finish it:
>>> > >    1.
>>> > >
>>> > >       InitializerDecorator
>>> > >       2.
>>> > >
>>> > >       OwnerReferenceDecorator
>>> > >       3.
>>> > >
>>> > >       ConfigMapDecorator
>>> > >       4.
>>> > >
>>> > >       KubernetesUtils: providing the getConfigMapVolume method to
>>> share
>>> > for
>>> > >       the FlinkMasterDeploymentDecorator and the
>>> TaskManagerPodDecorator.
>>> > >       5.
>>> > >
>>> > >       FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
>>> > >       6.
>>> > >
>>> > >       TaskManagerPodDecorator: mounts the ConfigMap volume.
>>> > >       7.
>>> > >
>>> > >       If in the future, someone would like to introduce an init
>>> > Container,
>>> > >       the InitContainerDecorator has to mount the ConfigMap volume
>>> too.
>>> > >
>>> > >
>>> > > I am confused about the current solution to mounting a configuration
>>> > file:
>>> > >
>>> > >    1.
>>> > >
>>> > >    Actually, we do not need so many Decorators for mounting a file.
>>> > >    2.
>>> > >
>>> > >    If we would like to mount a new file, we have no choice but to
>>> repeat
>>> > >    the same tedious and scattered routine.
>>> > >    3.
>>> > >
>>> > >    There’s no easy way to test the file mounting functionality
>>> alone; we
>>> > >    have to construct the ConfigMap, the Deployment or the
>>> TaskManagerPod
>>> > > first
>>> > >    and then do a final test.
>>> > >
>>> > >
>>> > > The reason why it is so complex to mount a configuration file is
>>> that we
>>> > > don’t fully consider the internal connections among those resources
>>> in
>>> > the
>>> > > existing design.
>>> > >
>>> > > The new abstraction we proposed could solve such a kind of problem,
>>> the
>>> > new
>>> > > Decorator object is as follows:
>>> > >
>>> > > public interface KubernetesStepDecorator {
>>> > >
>>> > >   /**
>>> > >
>>> > >    * Apply transformations to the given FlinkPod in accordance with
>>> this
>>> > > feature. This can include adding
>>> > >
>>> > >    * labels/annotations, mounting volumes, and setting startup
>>> command or
>>> > > parameters, etc.
>>> > >
>>> > >    */
>>> > >
>>> > >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
>>> > >
>>> > >   /**
>>> > >
>>> > >    * Build the accompanying Kubernetes resources that should be
>>> > introduced
>>> > > to support this feature. This could
>>> > >
>>> > >    * only applicable to the client-side submission process.
>>> > >
>>> > >    */
>>> > >
>>> > >   List<HasMetadata> buildAccompanyingKubernetesResources() throws
>>> > > IOException;
>>> > >
>>> > > }
>>> > >
>>> > > The FlinkPod is a composition of the Pod, the main Container, the
>>> init
>>> > > Container, and the sidecar Container.
>>> > >
>>> > > Next, we introduce a KubernetesStepDecorator implementation, the
>>> method
>>> > of
>>> > > buildAccompanyingKubernetesResources creates the corresponding
>>> ConfigMap,
>>> > > and the method of decorateFlinkPod configures the Volume for the Pod
>>> and
>>> > > all the Containers.
>>> > >
>>> > > So, for the scenario of mounting a configuration file, the
>>> advantages of
>>> > > this new architecture are as follows:
>>> > >
>>> > >    1.
>>> > >
>>> > >    One dedicated KubernetesStepDecorator implementation is enough to
>>> > finish
>>> > >    all the mounting work, meanwhile, this class would be shared
>>> between
>>> > the
>>> > >    client-side and the master-side. Besides that, the number of
>>> lines of
>>> > > code
>>> > >    in that class will not exceed 300 lines which facilitate code
>>> > > readability
>>> > >    and maintenance.
>>> > >    2.
>>> > >
>>> > >    Testing becomes an easy thing now, as we can add a dedicated test
>>> > class
>>> > >    for only this class, the test would never rely on the
>>> construction of
>>> > > other
>>> > >    components such as the Deployment.
>>> > >    3.
>>> > >
>>> > >    It’s quite convenient to mount a new configuration file via the
>>> newly
>>> > >    dedicated KubernetesStepDecorator implementation.
>>> > >
>>> > >
>>> > > Question 2: How do we construct the Pod?
>>> > >
>>> > > For the existing design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    The FlinkMasterDeploymentDecorator is responsible for building the
>>> > >    Deployment and configuring the Pod and the Containers, while the
>>> > >    TaskManagerPodDecorator is responsible for building the
>>> TaskManager
>>> > Pod.
>>> > >    Take FlinkMasterDeploymentDecorator as an example, let’s see what
>>> it
>>> > has
>>> > >    done.
>>> > >    1.
>>> > >
>>> > >       Configure the main Container, including the name, command,
>>> args,
>>> > >       image, image pull policy, resource requirements, ports, all
>>> kinds
>>> > of
>>> > >       environment variables, all kinds of volume mounts, etc.
>>> > >       2.
>>> > >
>>> > >       Configure the Pod, including service account, all kinds of
>>> volumes,
>>> > >       and attach all kinds of Container, including the main
>>> Container,
>>> > the
>>> > > init
>>> > >       Container, and the sidecar Container.
>>> > >       3.
>>> > >
>>> > >       Configure the Deployment.
>>> > >
>>> > >
>>> > >
>>> > >    1.
>>> > >
>>> > >    The InitializerDecorator and the OwnerReferenceDecorator have
>>> basic
>>> > >    logic so that the most complex work is completed in the
>>> > >    FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
>>> With
>>> > the
>>> > >    introduction of new features for the Pod, such as customized
>>> volume
>>> > > mounts,
>>> > >    Hadoop configuration support, Kerberized HDFS support, secret
>>> mounts,
>>> > >    Python support, etc. the construction process could become far
>>> more
>>> > >    complicated, and the functionality of a single class could
>>> explode,
>>> > > which
>>> > >    hurts code readability, writability, and testability. Besides
>>> that,
>>> > both
>>> > >    the client-side and the master-side shares much of the Pod
>>> > construction
>>> > >    logic.
>>> > >
>>> > >
>>> > > So the problems are as follows:
>>> > >
>>> > >    1.
>>> > >
>>> > >    We don’t have a consistent abstraction that is applicable to both
>>> the
>>> > >    client-side and the master-side for Pod construction (including
>>> the
>>> > >    Containers) so that we have to share some of the code via tool
>>> classes
>>> > >    which is a trick solution, however, we can’t share most of the
>>> code as
>>> > > much
>>> > >    as possible.
>>> > >    2.
>>> > >
>>> > >    We don’t have a step-by-step orchestrator architecture to help
>>> the Pod
>>> > >    construction.
>>> > >
>>> > >
>>> > > For the proposed design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    The problems above are all solved: we have a consistent
>>> abstraction
>>> > that
>>> > >    leads to a monadic-step orchestrator architecture to construct
>>> the Pod
>>> > > step
>>> > >    by step. One step is responsible for exactly one thing, following
>>> that
>>> > > is
>>> > >    the fact that every step is well testable, because each unit can
>>> be
>>> > >    considerably small, and the number of branches to test in any
>>> given
>>> > > step is
>>> > >    limited. Moreover, much of the step could be shared between the
>>> > > client-side
>>> > >    and the master-side, such as configuration files mount, etc.
>>> > >
>>> > >
>>> > > Question 3: Could all the resources share the same orchestrator
>>> > > architecture?
>>> > >
>>> > > For the existing design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    We don’t have a unified orchestrator architecture for the
>>> construction
>>> > >    of all the Kubernetes resources, therefore, we need a Decorator
>>> chain
>>> > > for
>>> > >    every Kubernetes resource.
>>> > >
>>> > >
>>> > > For the proposed design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    Following Question 1 ~ 2 and the design doc[1], we have
>>> introduced a
>>> > >    monadic-step orchestrator architecture to construct the Pod and
>>> all
>>> > the
>>> > >    Containers. Besides that, this architecture also works for the
>>> other
>>> > >    resources, such as the Services and the ConfigMaps. For example,
>>> if we
>>> > > need
>>> > >    a new Service or a ConfigMap, just introduce a new step.
>>> > >
>>> > >
>>> > >
>>> > > Question 4: How do we introduce the InitContainer or the side-car
>>> > > Container?
>>> > >
>>> > > For the existing design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    Both the client-side and the master-side introduce some new
>>> > Decorators,
>>> > >
>>> > > the client-side chain could be:
>>> > >
>>> > > InitializerDecorator -> OwnerReferenceDecorator ->
>>> > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>>> > > SidecarDecorator -> etc
>>> > >
>>> > > -
>>> > >
>>> > > and the master-side could be:
>>> > >
>>> > > InitializerDecorator -> OwnerReferenceDecorator  ->
>>> > TaskManagerPodDecorator
>>> > > -> InitContainerDecorator -> SidecarDecorator -> etc
>>> > >
>>> > > As we can see, the FlinkMasterDeploymentDecorator or the
>>> > > TaskManagerPodDecorator is designed for the Pod construction,
>>> including
>>> > the
>>> > > Containers, so we don’t need to treat the init Container and the
>>> sidecar
>>> > > Container as special cases different from the main Container by
>>> > introducing
>>> > > a dedicated Decorator for every one of them. Such kind of trick
>>> solution
>>> > > confuses me, maybe to the other developers.
>>> > >
>>> > > For the proposed design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    Following Question 1 ~ 4 and the design doc [1], the central
>>> premise
>>> > of
>>> > >    the proposed orchestrator architecture is that the Pod, the main
>>> > > Container,
>>> > >    the init Container, the sidecar Container, the Services, and the
>>> > > ConfigMaps
>>> > >    all sit on an equal footing. For every one of the Pod, the main
>>> > > Container,
>>> > >    the init Container and the sidecar Container, people could
>>> introduce
>>> > any
>>> > >    number of steps to finish its construction; we attach all the
>>> > > Containers to
>>> > >    the Pod in the Builder tool classes after the orchestrator
>>> > architecture
>>> > >    constructs them.
>>> > >
>>> > >
>>> > > Question 5: What is the relation between the Decorators?
>>> > >
>>> > > For the existing design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    The Decorators are not independent; most of them have a strict
>>> order
>>> > in
>>> > >    the chain.
>>> > >    2.
>>> > >
>>> > >    The Decorators do not share common APIs in essence, as their input
>>> > type
>>> > >    could be different so that we can’t finish the construction of a
>>> > > Kubernetes
>>> > >    resource such as the Deployment and the TaskManager Pod through a
>>> > > one-time
>>> > >    traversal, there are boxing and unboxing overheads among some of
>>> the
>>> > >    neighboring Decorators.
>>> > >
>>> > >
>>> > > For the proposed design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    The steps are completely independent so that they could have
>>> arbitrary
>>> > >    order in the chain; The only rule is that the Hadoop configuration
>>> > >    Decorator should be the last node in the chain.
>>> > >    2.
>>> > >
>>> > >    All the steps share common APIs, with the same input type of all
>>> the
>>> > >    methods, so we can construct a Kubernetes resource via a one-time
>>> > > traversal.
>>> > >
>>> > >
>>> > > Question 6: What is the number of Decorators chains?
>>> > >
>>> > > For the existing design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    People have to introduce a new chain once they introduce a new
>>> > >    Kubernetes resource. For example, a new ConfigMap Decorator chain
>>> for
>>> > >    mounting a new configuration file. At this moment, we already have
>>> > five
>>> > >    Decorator chains.
>>> > >
>>> > >
>>> > > For the proposed design,
>>> > >
>>> > >    1.
>>> > >
>>> > >    There are always two chains, one is for constructing all the
>>> > Kubernetes
>>> > >    resources on the client-side, including the JobManager Pod, the
>>> > > Containers,
>>> > >    the ConfigMap(s), and the Services(s); the other one is for
>>> > constructing
>>> > >    the TaskManager Pod and the Containers.
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
>>> > >
>>> > > >  Hi Canbing,
>>> > > >
>>> > > >
>>> > > > Thanks a lot for sharing your thoughts to improve the Flink on K8s
>>> > native
>>> > > > integration.
>>> > > > Frankly speaking, your discussion title confuses me and i am
>>> wondering
>>> > > > whether you
>>> > > > want to refactor the whole design. However, after i dive into the
>>> > details
>>> > > > and the provided
>>> > > > document, i realize that mostly you want to improve the
>>> implementation.
>>> > > >
>>> > > >
>>> > > > Regarding your two main points.
>>> > > >
>>> > > > >> Introduce a unified monadic-step based orchestrator architecture
>>> > that
>>> > > > has a better,
>>> > > > cleaner and consistent abstraction for the Kubernetes resources
>>> > > > construction process,
>>> > > > both applicable to the client side and the master side.
>>> > > >
>>> > > > When i introduce the decorator for the K8s in Flink, there is
>>> always a
>>> > > > guideline in my mind
>>> > > > that it should be easy for extension and adding new features. Just
>>> as
>>> > you
>>> > > > say, we have lots
>>> > > > of functions to support and the K8s is also evolving very fast. The
>>> > > current
>>> > > > `ConfigMapDecorator`,
>>> > > > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a
>>> basic
>>> > > > implementation with
>>> > > > some prerequisite parameters. Of course we could chain more
>>> decorators
>>> > to
>>> > > > construct the K8s
>>> > > > resources. For example, InitializerDecorator ->
>>> OwnerReferenceDecorator
>>> > > ->
>>> > > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>>> > > > SidecarDecorator -> etc.
>>> > > >
>>> > > > I am little sceptical about splitting every parameter into a single
>>> > > > decorator.  Since it does
>>> > > > not take too much benefits. But i agree with moving some common
>>> > > parameters
>>> > > > into separate
>>> > > > decorators(e.g. volume mount). Also introducing the `~Builder`
>>> class
>>> > and
>>> > > > spinning off the chaining
>>> > > > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
>>> > > >
>>> > > >
>>> > > >
>>> > > > >> Add some dedicated tools for centrally parsing, verifying, and
>>> > > managing
>>> > > > the Kubernetes parameters.
>>> > > >
>>> > > > Currently, we always get the parameters directly from Flink
>>> > > configuration(
>>> > > > e.g.
>>> > `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
>>> > > > think it could be improved
>>> > > > by introducing some dedicated conf parser classes. It is a good
>>> idea.
>>> > > >
>>> > > >
>>> > > > Best,
>>> > > > Yang
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:
>>> > > >
>>> > > > > Hi everyone,
>>> > > > >
>>> > > > > I would like to kick off a discussion on refactoring the existing
>>> > > > > Kubernetes resources construction architecture design.
>>> > > > >
>>> > > > > I created a design document [1] that clarifies our motivation to
>>> do
>>> > > this
>>> > > > > and some improvement proposals for the new design.
>>> > > > >
>>> > > > > Briefly, we would like to
>>> > > > > 1. Introduce a unified monadic-step based orchestrator
>>> architecture
>>> > > that
>>> > > > > has a better, cleaner and consistent abstraction for the
>>> Kubernetes
>>> > > > > resources construction process, both applicable to the client
>>> side
>>> > and
>>> > > > the
>>> > > > > master side.
>>> > > > > 2. Add some dedicated tools for centrally parsing, verifying, and
>>> > > > managing
>>> > > > > the Kubernetes parameters.
>>> > > > >
>>> > > > > It would be great to start the efforts before adding big
>>> features for
>>> > > the
>>> > > > > native Kubernetes submodule, and Tison and I plan to work
>>> together
>>> > for
>>> > > > this
>>> > > > > issue.
>>> > > > >
>>> > > > > Please let me know your thoughts.
>>> > > > >
>>> > > > > Regards,
>>> > > > > Canbin Zheng
>>> > > > >
>>> > > > > [1]
>>> > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
>>> > > > > [2] https://issues.apache.org/jira/browse/FLINK-16194
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Canbin Zheng <fe...@gmail.com>.
Hi, Yang Wang,

Thanks for the feedback.

> Parameters Parser

I can think of many benefits of the dedicated parameter parsing/verifying
tools, here is some of them:
1. Reuse the parameters parsing and verifying code.
2. Ensure consistent handling logic for the same setting.
3. Simplify some of the method's signature because we can avoid defining
too many unnecessary input parameters.

> BTW, the fabric8 kubernetes-client and Kubernetes api server already has
the parameters check before starting to create the resources. I think the
exceptions
are usually meaning and enough for the users to get the root cause.

1. On the one hand, there are gaps between the declarative model used by
Kubernetes and the configuration model used by Flink. Indeed, the
Kubernetes client/server helps check the Kubernetes side parameters and
throw exceptions in case of failures; however, the exception messages are
not always easily understood from the perspective of Flink users.
Independent of the solutions, what we should make sure is that the
exception information in the Flink configuration model is meaningful enough
for the Flink users.
2. On the other hand, some prechecking is beneficial in some scenarios
since we can avoid useless effort on Kubernetes resource creation in case
of failures leading to clean up all the created resources.


Regards,
Canbin Zheng

Yang Wang <da...@gmail.com> 于2020年2月23日周日 下午10:37写道:

> > The new introduced decorator
> After some offline discussion with Canbin and tison, i totally understand
> the evolved decorator design. Each decorator will be self-contained and
> is responsible for just one thing. Currently, if we want to mount a new
> config
> file to jobmanager/taskmanager, then both `ConfigMapDecorator`,
> `JobManagerDecorator`, `TaskManagerDecorator` needs to be updated.
> It is not convenient for the new contributors to do this. In the new
> design,
> by leveraging the accompanying kubernetes resources in decorator, we
> could finish the creating and mounting config map in one decorator.
>
> Since now we just have a basic implementation for native Kubernetes
> integration and lots of features need to be developed. And many users
> want to participate in and contribute to the integration. So i agree to
> refactor
> the current decorator implementation and make it easier for the new
> contributor.
>
> For the detailed divergences(naming, etc.), i think we could discuss in
> the PR.
>
> > Parameters Parser
> Currently, the decorator directly use Flink configuration to get the
> parameters
> to build the Kubernetes resource. It is straightforward, however we could
> not
> have a unified parameters check. So i am not sure whether you will
> introduce
> a tool to check the parameters or just simply have our own basic check.
>
> BTW, the fabric8 kubernetes-client and Kubernetes api server already has
> the parameters check before starting to create the resources. I think the
> exceptions
> are usually meaning and enough for the users to get the root cause.
>
>
>
> Best,
> Yang
>
>
>
>
> felixzheng zheng <fe...@gmail.com> 于2020年2月22日周六 上午10:54写道:
>
>> Great thanks for the quick feedback Till. You are right; it is not a
>> fundamentally different approach compared to
>> what we have right now, all the Kubernetes resources created are the same,
>> we aim to evolve the existing decorator approach so that,
>> 1. the decorators are monadic and smaller in size and functionality.
>> 2. the new decorator design allows reusing the decorators between the
>> client and the cluster as much as possible.
>> 3. all the decorators are independent with each other, and they could have
>> arbitrary order in the chain, they share the same APIs and follow a
>> unified
>> orchestrator architecture so that new developers could quickly understand
>> what should be done to introduce a new feature.
>>
>> Besides that, the new approach allows us adding tests for every decorator
>> alone instead of doing a final test of all the decorators in the
>> Fabric8ClientTest.java.
>>
>> Cheers,
>> Canbin Zheng
>>
>> Till Rohrmann <tr...@apache.org> 于2020年2月22日周六 上午12:28写道:
>>
>> > Thanks for starting this discussion Canbin. If I understand your
>> proposal
>> > correctly, then you would like to evolve the existing decorator
>> approach so
>> > that decorators are monadic and smaller in size and functionality. The
>> > latter aspect will allow to reuse them between the client and the
>> cluster.
>> > Just to make sure, it is not a fundamentally different approach
>> compared to
>> > what we have right now, is it?
>> >
>> > If this is the case, then I think it makes sense to reuse code as much
>> as
>> > possible and to create small code units which are easier to test.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <
>> felixzhengcb@gmail.com>
>> > wrote:
>> >
>> > > Thanks for the feedback @Yang Wang. I would like to discuss some of
>> the
>> > > details in depth about why I am confused about the existing design.
>> > >
>> > > Question 1: How do we mount a configuration file?
>> > >
>> > > For the existing design,
>> > >
>> > >    1.
>> > >
>> > >    We need several classes to finish it:
>> > >    1.
>> > >
>> > >       InitializerDecorator
>> > >       2.
>> > >
>> > >       OwnerReferenceDecorator
>> > >       3.
>> > >
>> > >       ConfigMapDecorator
>> > >       4.
>> > >
>> > >       KubernetesUtils: providing the getConfigMapVolume method to
>> share
>> > for
>> > >       the FlinkMasterDeploymentDecorator and the
>> TaskManagerPodDecorator.
>> > >       5.
>> > >
>> > >       FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
>> > >       6.
>> > >
>> > >       TaskManagerPodDecorator: mounts the ConfigMap volume.
>> > >       7.
>> > >
>> > >       If in the future, someone would like to introduce an init
>> > Container,
>> > >       the InitContainerDecorator has to mount the ConfigMap volume
>> too.
>> > >
>> > >
>> > > I am confused about the current solution to mounting a configuration
>> > file:
>> > >
>> > >    1.
>> > >
>> > >    Actually, we do not need so many Decorators for mounting a file.
>> > >    2.
>> > >
>> > >    If we would like to mount a new file, we have no choice but to
>> repeat
>> > >    the same tedious and scattered routine.
>> > >    3.
>> > >
>> > >    There’s no easy way to test the file mounting functionality alone;
>> we
>> > >    have to construct the ConfigMap, the Deployment or the
>> TaskManagerPod
>> > > first
>> > >    and then do a final test.
>> > >
>> > >
>> > > The reason why it is so complex to mount a configuration file is that
>> we
>> > > don’t fully consider the internal connections among those resources in
>> > the
>> > > existing design.
>> > >
>> > > The new abstraction we proposed could solve such a kind of problem,
>> the
>> > new
>> > > Decorator object is as follows:
>> > >
>> > > public interface KubernetesStepDecorator {
>> > >
>> > >   /**
>> > >
>> > >    * Apply transformations to the given FlinkPod in accordance with
>> this
>> > > feature. This can include adding
>> > >
>> > >    * labels/annotations, mounting volumes, and setting startup
>> command or
>> > > parameters, etc.
>> > >
>> > >    */
>> > >
>> > >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
>> > >
>> > >   /**
>> > >
>> > >    * Build the accompanying Kubernetes resources that should be
>> > introduced
>> > > to support this feature. This could
>> > >
>> > >    * only applicable to the client-side submission process.
>> > >
>> > >    */
>> > >
>> > >   List<HasMetadata> buildAccompanyingKubernetesResources() throws
>> > > IOException;
>> > >
>> > > }
>> > >
>> > > The FlinkPod is a composition of the Pod, the main Container, the init
>> > > Container, and the sidecar Container.
>> > >
>> > > Next, we introduce a KubernetesStepDecorator implementation, the
>> method
>> > of
>> > > buildAccompanyingKubernetesResources creates the corresponding
>> ConfigMap,
>> > > and the method of decorateFlinkPod configures the Volume for the Pod
>> and
>> > > all the Containers.
>> > >
>> > > So, for the scenario of mounting a configuration file, the advantages
>> of
>> > > this new architecture are as follows:
>> > >
>> > >    1.
>> > >
>> > >    One dedicated KubernetesStepDecorator implementation is enough to
>> > finish
>> > >    all the mounting work, meanwhile, this class would be shared
>> between
>> > the
>> > >    client-side and the master-side. Besides that, the number of lines
>> of
>> > > code
>> > >    in that class will not exceed 300 lines which facilitate code
>> > > readability
>> > >    and maintenance.
>> > >    2.
>> > >
>> > >    Testing becomes an easy thing now, as we can add a dedicated test
>> > class
>> > >    for only this class, the test would never rely on the construction
>> of
>> > > other
>> > >    components such as the Deployment.
>> > >    3.
>> > >
>> > >    It’s quite convenient to mount a new configuration file via the
>> newly
>> > >    dedicated KubernetesStepDecorator implementation.
>> > >
>> > >
>> > > Question 2: How do we construct the Pod?
>> > >
>> > > For the existing design,
>> > >
>> > >    1.
>> > >
>> > >    The FlinkMasterDeploymentDecorator is responsible for building the
>> > >    Deployment and configuring the Pod and the Containers, while the
>> > >    TaskManagerPodDecorator is responsible for building the TaskManager
>> > Pod.
>> > >    Take FlinkMasterDeploymentDecorator as an example, let’s see what
>> it
>> > has
>> > >    done.
>> > >    1.
>> > >
>> > >       Configure the main Container, including the name, command, args,
>> > >       image, image pull policy, resource requirements, ports, all
>> kinds
>> > of
>> > >       environment variables, all kinds of volume mounts, etc.
>> > >       2.
>> > >
>> > >       Configure the Pod, including service account, all kinds of
>> volumes,
>> > >       and attach all kinds of Container, including the main Container,
>> > the
>> > > init
>> > >       Container, and the sidecar Container.
>> > >       3.
>> > >
>> > >       Configure the Deployment.
>> > >
>> > >
>> > >
>> > >    1.
>> > >
>> > >    The InitializerDecorator and the OwnerReferenceDecorator have basic
>> > >    logic so that the most complex work is completed in the
>> > >    FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
>> With
>> > the
>> > >    introduction of new features for the Pod, such as customized volume
>> > > mounts,
>> > >    Hadoop configuration support, Kerberized HDFS support, secret
>> mounts,
>> > >    Python support, etc. the construction process could become far more
>> > >    complicated, and the functionality of a single class could explode,
>> > > which
>> > >    hurts code readability, writability, and testability. Besides that,
>> > both
>> > >    the client-side and the master-side shares much of the Pod
>> > construction
>> > >    logic.
>> > >
>> > >
>> > > So the problems are as follows:
>> > >
>> > >    1.
>> > >
>> > >    We don’t have a consistent abstraction that is applicable to both
>> the
>> > >    client-side and the master-side for Pod construction (including the
>> > >    Containers) so that we have to share some of the code via tool
>> classes
>> > >    which is a trick solution, however, we can’t share most of the
>> code as
>> > > much
>> > >    as possible.
>> > >    2.
>> > >
>> > >    We don’t have a step-by-step orchestrator architecture to help the
>> Pod
>> > >    construction.
>> > >
>> > >
>> > > For the proposed design,
>> > >
>> > >    1.
>> > >
>> > >    The problems above are all solved: we have a consistent abstraction
>> > that
>> > >    leads to a monadic-step orchestrator architecture to construct the
>> Pod
>> > > step
>> > >    by step. One step is responsible for exactly one thing, following
>> that
>> > > is
>> > >    the fact that every step is well testable, because each unit can be
>> > >    considerably small, and the number of branches to test in any given
>> > > step is
>> > >    limited. Moreover, much of the step could be shared between the
>> > > client-side
>> > >    and the master-side, such as configuration files mount, etc.
>> > >
>> > >
>> > > Question 3: Could all the resources share the same orchestrator
>> > > architecture?
>> > >
>> > > For the existing design,
>> > >
>> > >    1.
>> > >
>> > >    We don’t have a unified orchestrator architecture for the
>> construction
>> > >    of all the Kubernetes resources, therefore, we need a Decorator
>> chain
>> > > for
>> > >    every Kubernetes resource.
>> > >
>> > >
>> > > For the proposed design,
>> > >
>> > >    1.
>> > >
>> > >    Following Question 1 ~ 2 and the design doc[1], we have introduced
>> a
>> > >    monadic-step orchestrator architecture to construct the Pod and all
>> > the
>> > >    Containers. Besides that, this architecture also works for the
>> other
>> > >    resources, such as the Services and the ConfigMaps. For example,
>> if we
>> > > need
>> > >    a new Service or a ConfigMap, just introduce a new step.
>> > >
>> > >
>> > >
>> > > Question 4: How do we introduce the InitContainer or the side-car
>> > > Container?
>> > >
>> > > For the existing design,
>> > >
>> > >    1.
>> > >
>> > >    Both the client-side and the master-side introduce some new
>> > Decorators,
>> > >
>> > > the client-side chain could be:
>> > >
>> > > InitializerDecorator -> OwnerReferenceDecorator ->
>> > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>> > > SidecarDecorator -> etc
>> > >
>> > > -
>> > >
>> > > and the master-side could be:
>> > >
>> > > InitializerDecorator -> OwnerReferenceDecorator  ->
>> > TaskManagerPodDecorator
>> > > -> InitContainerDecorator -> SidecarDecorator -> etc
>> > >
>> > > As we can see, the FlinkMasterDeploymentDecorator or the
>> > > TaskManagerPodDecorator is designed for the Pod construction,
>> including
>> > the
>> > > Containers, so we don’t need to treat the init Container and the
>> sidecar
>> > > Container as special cases different from the main Container by
>> > introducing
>> > > a dedicated Decorator for every one of them. Such kind of trick
>> solution
>> > > confuses me, maybe to the other developers.
>> > >
>> > > For the proposed design,
>> > >
>> > >    1.
>> > >
>> > >    Following Question 1 ~ 4 and the design doc [1], the central
>> premise
>> > of
>> > >    the proposed orchestrator architecture is that the Pod, the main
>> > > Container,
>> > >    the init Container, the sidecar Container, the Services, and the
>> > > ConfigMaps
>> > >    all sit on an equal footing. For every one of the Pod, the main
>> > > Container,
>> > >    the init Container and the sidecar Container, people could
>> introduce
>> > any
>> > >    number of steps to finish its construction; we attach all the
>> > > Containers to
>> > >    the Pod in the Builder tool classes after the orchestrator
>> > architecture
>> > >    constructs them.
>> > >
>> > >
>> > > Question 5: What is the relation between the Decorators?
>> > >
>> > > For the existing design,
>> > >
>> > >    1.
>> > >
>> > >    The Decorators are not independent; most of them have a strict
>> order
>> > in
>> > >    the chain.
>> > >    2.
>> > >
>> > >    The Decorators do not share common APIs in essence, as their input
>> > type
>> > >    could be different so that we can’t finish the construction of a
>> > > Kubernetes
>> > >    resource such as the Deployment and the TaskManager Pod through a
>> > > one-time
>> > >    traversal, there are boxing and unboxing overheads among some of
>> the
>> > >    neighboring Decorators.
>> > >
>> > >
>> > > For the proposed design,
>> > >
>> > >    1.
>> > >
>> > >    The steps are completely independent so that they could have
>> arbitrary
>> > >    order in the chain; The only rule is that the Hadoop configuration
>> > >    Decorator should be the last node in the chain.
>> > >    2.
>> > >
>> > >    All the steps share common APIs, with the same input type of all
>> the
>> > >    methods, so we can construct a Kubernetes resource via a one-time
>> > > traversal.
>> > >
>> > >
>> > > Question 6: What is the number of Decorators chains?
>> > >
>> > > For the existing design,
>> > >
>> > >    1.
>> > >
>> > >    People have to introduce a new chain once they introduce a new
>> > >    Kubernetes resource. For example, a new ConfigMap Decorator chain
>> for
>> > >    mounting a new configuration file. At this moment, we already have
>> > five
>> > >    Decorator chains.
>> > >
>> > >
>> > > For the proposed design,
>> > >
>> > >    1.
>> > >
>> > >    There are always two chains, one is for constructing all the
>> > Kubernetes
>> > >    resources on the client-side, including the JobManager Pod, the
>> > > Containers,
>> > >    the ConfigMap(s), and the Services(s); the other one is for
>> > constructing
>> > >    the TaskManager Pod and the Containers.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
>> > >
>> > > >  Hi Canbing,
>> > > >
>> > > >
>> > > > Thanks a lot for sharing your thoughts to improve the Flink on K8s
>> > native
>> > > > integration.
>> > > > Frankly speaking, your discussion title confuses me and i am
>> wondering
>> > > > whether you
>> > > > want to refactor the whole design. However, after i dive into the
>> > details
>> > > > and the provided
>> > > > document, i realize that mostly you want to improve the
>> implementation.
>> > > >
>> > > >
>> > > > Regarding your two main points.
>> > > >
>> > > > >> Introduce a unified monadic-step based orchestrator architecture
>> > that
>> > > > has a better,
>> > > > cleaner and consistent abstraction for the Kubernetes resources
>> > > > construction process,
>> > > > both applicable to the client side and the master side.
>> > > >
>> > > > When i introduce the decorator for the K8s in Flink, there is
>> always a
>> > > > guideline in my mind
>> > > > that it should be easy for extension and adding new features. Just
>> as
>> > you
>> > > > say, we have lots
>> > > > of functions to support and the K8s is also evolving very fast. The
>> > > current
>> > > > `ConfigMapDecorator`,
>> > > > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a
>> basic
>> > > > implementation with
>> > > > some prerequisite parameters. Of course we could chain more
>> decorators
>> > to
>> > > > construct the K8s
>> > > > resources. For example, InitializerDecorator ->
>> OwnerReferenceDecorator
>> > > ->
>> > > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
>> > > > SidecarDecorator -> etc.
>> > > >
>> > > > I am little sceptical about splitting every parameter into a single
>> > > > decorator.  Since it does
>> > > > not take too much benefits. But i agree with moving some common
>> > > parameters
>> > > > into separate
>> > > > decorators(e.g. volume mount). Also introducing the `~Builder` class
>> > and
>> > > > spinning off the chaining
>> > > > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
>> > > >
>> > > >
>> > > >
>> > > > >> Add some dedicated tools for centrally parsing, verifying, and
>> > > managing
>> > > > the Kubernetes parameters.
>> > > >
>> > > > Currently, we always get the parameters directly from Flink
>> > > configuration(
>> > > > e.g.
>> > `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
>> > > > think it could be improved
>> > > > by introducing some dedicated conf parser classes. It is a good
>> idea.
>> > > >
>> > > >
>> > > > Best,
>> > > > Yang
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:
>> > > >
>> > > > > Hi everyone,
>> > > > >
>> > > > > I would like to kick off a discussion on refactoring the existing
>> > > > > Kubernetes resources construction architecture design.
>> > > > >
>> > > > > I created a design document [1] that clarifies our motivation to
>> do
>> > > this
>> > > > > and some improvement proposals for the new design.
>> > > > >
>> > > > > Briefly, we would like to
>> > > > > 1. Introduce a unified monadic-step based orchestrator
>> architecture
>> > > that
>> > > > > has a better, cleaner and consistent abstraction for the
>> Kubernetes
>> > > > > resources construction process, both applicable to the client side
>> > and
>> > > > the
>> > > > > master side.
>> > > > > 2. Add some dedicated tools for centrally parsing, verifying, and
>> > > > managing
>> > > > > the Kubernetes parameters.
>> > > > >
>> > > > > It would be great to start the efforts before adding big features
>> for
>> > > the
>> > > > > native Kubernetes submodule, and Tison and I plan to work together
>> > for
>> > > > this
>> > > > > issue.
>> > > > >
>> > > > > Please let me know your thoughts.
>> > > > >
>> > > > > Regards,
>> > > > > Canbin Zheng
>> > > > >
>> > > > > [1]
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
>> > > > > [2] https://issues.apache.org/jira/browse/FLINK-16194
>> > > > >
>> > > >
>> > >
>> >
>>
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Yang Wang <da...@gmail.com>.
 > The new introduced decorator
After some offline discussion with Canbin and tison, i totally understand
the evolved decorator design. Each decorator will be self-contained and
is responsible for just one thing. Currently, if we want to mount a new
config
file to jobmanager/taskmanager, then both `ConfigMapDecorator`,
`JobManagerDecorator`, `TaskManagerDecorator` needs to be updated.
It is not convenient for the new contributors to do this. In the new design,
by leveraging the accompanying kubernetes resources in decorator, we
could finish the creating and mounting config map in one decorator.

Since now we just have a basic implementation for native Kubernetes
integration and lots of features need to be developed. And many users
want to participate in and contribute to the integration. So i agree to
refactor
the current decorator implementation and make it easier for the new
contributor.

For the detailed divergences(naming, etc.), i think we could discuss in the
PR.

> Parameters Parser
Currently, the decorator directly use Flink configuration to get the
parameters
to build the Kubernetes resource. It is straightforward, however we could
not
have a unified parameters check. So i am not sure whether you will introduce
a tool to check the parameters or just simply have our own basic check.

BTW, the fabric8 kubernetes-client and Kubernetes api server already has
the parameters check before starting to create the resources. I think the
exceptions
are usually meaning and enough for the users to get the root cause.



Best,
Yang




felixzheng zheng <fe...@gmail.com> 于2020年2月22日周六 上午10:54写道:

> Great thanks for the quick feedback Till. You are right; it is not a
> fundamentally different approach compared to
> what we have right now, all the Kubernetes resources created are the same,
> we aim to evolve the existing decorator approach so that,
> 1. the decorators are monadic and smaller in size and functionality.
> 2. the new decorator design allows reusing the decorators between the
> client and the cluster as much as possible.
> 3. all the decorators are independent with each other, and they could have
> arbitrary order in the chain, they share the same APIs and follow a unified
> orchestrator architecture so that new developers could quickly understand
> what should be done to introduce a new feature.
>
> Besides that, the new approach allows us adding tests for every decorator
> alone instead of doing a final test of all the decorators in the
> Fabric8ClientTest.java.
>
> Cheers,
> Canbin Zheng
>
> Till Rohrmann <tr...@apache.org> 于2020年2月22日周六 上午12:28写道:
>
> > Thanks for starting this discussion Canbin. If I understand your proposal
> > correctly, then you would like to evolve the existing decorator approach
> so
> > that decorators are monadic and smaller in size and functionality. The
> > latter aspect will allow to reuse them between the client and the
> cluster.
> > Just to make sure, it is not a fundamentally different approach compared
> to
> > what we have right now, is it?
> >
> > If this is the case, then I think it makes sense to reuse code as much as
> > possible and to create small code units which are easier to test.
> >
> > Cheers,
> > Till
> >
> > On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <felixzhengcb@gmail.com
> >
> > wrote:
> >
> > > Thanks for the feedback @Yang Wang. I would like to discuss some of the
> > > details in depth about why I am confused about the existing design.
> > >
> > > Question 1: How do we mount a configuration file?
> > >
> > > For the existing design,
> > >
> > >    1.
> > >
> > >    We need several classes to finish it:
> > >    1.
> > >
> > >       InitializerDecorator
> > >       2.
> > >
> > >       OwnerReferenceDecorator
> > >       3.
> > >
> > >       ConfigMapDecorator
> > >       4.
> > >
> > >       KubernetesUtils: providing the getConfigMapVolume method to share
> > for
> > >       the FlinkMasterDeploymentDecorator and the
> TaskManagerPodDecorator.
> > >       5.
> > >
> > >       FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
> > >       6.
> > >
> > >       TaskManagerPodDecorator: mounts the ConfigMap volume.
> > >       7.
> > >
> > >       If in the future, someone would like to introduce an init
> > Container,
> > >       the InitContainerDecorator has to mount the ConfigMap volume too.
> > >
> > >
> > > I am confused about the current solution to mounting a configuration
> > file:
> > >
> > >    1.
> > >
> > >    Actually, we do not need so many Decorators for mounting a file.
> > >    2.
> > >
> > >    If we would like to mount a new file, we have no choice but to
> repeat
> > >    the same tedious and scattered routine.
> > >    3.
> > >
> > >    There’s no easy way to test the file mounting functionality alone;
> we
> > >    have to construct the ConfigMap, the Deployment or the
> TaskManagerPod
> > > first
> > >    and then do a final test.
> > >
> > >
> > > The reason why it is so complex to mount a configuration file is that
> we
> > > don’t fully consider the internal connections among those resources in
> > the
> > > existing design.
> > >
> > > The new abstraction we proposed could solve such a kind of problem, the
> > new
> > > Decorator object is as follows:
> > >
> > > public interface KubernetesStepDecorator {
> > >
> > >   /**
> > >
> > >    * Apply transformations to the given FlinkPod in accordance with
> this
> > > feature. This can include adding
> > >
> > >    * labels/annotations, mounting volumes, and setting startup command
> or
> > > parameters, etc.
> > >
> > >    */
> > >
> > >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
> > >
> > >   /**
> > >
> > >    * Build the accompanying Kubernetes resources that should be
> > introduced
> > > to support this feature. This could
> > >
> > >    * only applicable to the client-side submission process.
> > >
> > >    */
> > >
> > >   List<HasMetadata> buildAccompanyingKubernetesResources() throws
> > > IOException;
> > >
> > > }
> > >
> > > The FlinkPod is a composition of the Pod, the main Container, the init
> > > Container, and the sidecar Container.
> > >
> > > Next, we introduce a KubernetesStepDecorator implementation, the method
> > of
> > > buildAccompanyingKubernetesResources creates the corresponding
> ConfigMap,
> > > and the method of decorateFlinkPod configures the Volume for the Pod
> and
> > > all the Containers.
> > >
> > > So, for the scenario of mounting a configuration file, the advantages
> of
> > > this new architecture are as follows:
> > >
> > >    1.
> > >
> > >    One dedicated KubernetesStepDecorator implementation is enough to
> > finish
> > >    all the mounting work, meanwhile, this class would be shared between
> > the
> > >    client-side and the master-side. Besides that, the number of lines
> of
> > > code
> > >    in that class will not exceed 300 lines which facilitate code
> > > readability
> > >    and maintenance.
> > >    2.
> > >
> > >    Testing becomes an easy thing now, as we can add a dedicated test
> > class
> > >    for only this class, the test would never rely on the construction
> of
> > > other
> > >    components such as the Deployment.
> > >    3.
> > >
> > >    It’s quite convenient to mount a new configuration file via the
> newly
> > >    dedicated KubernetesStepDecorator implementation.
> > >
> > >
> > > Question 2: How do we construct the Pod?
> > >
> > > For the existing design,
> > >
> > >    1.
> > >
> > >    The FlinkMasterDeploymentDecorator is responsible for building the
> > >    Deployment and configuring the Pod and the Containers, while the
> > >    TaskManagerPodDecorator is responsible for building the TaskManager
> > Pod.
> > >    Take FlinkMasterDeploymentDecorator as an example, let’s see what it
> > has
> > >    done.
> > >    1.
> > >
> > >       Configure the main Container, including the name, command, args,
> > >       image, image pull policy, resource requirements, ports, all kinds
> > of
> > >       environment variables, all kinds of volume mounts, etc.
> > >       2.
> > >
> > >       Configure the Pod, including service account, all kinds of
> volumes,
> > >       and attach all kinds of Container, including the main Container,
> > the
> > > init
> > >       Container, and the sidecar Container.
> > >       3.
> > >
> > >       Configure the Deployment.
> > >
> > >
> > >
> > >    1.
> > >
> > >    The InitializerDecorator and the OwnerReferenceDecorator have basic
> > >    logic so that the most complex work is completed in the
> > >    FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator. With
> > the
> > >    introduction of new features for the Pod, such as customized volume
> > > mounts,
> > >    Hadoop configuration support, Kerberized HDFS support, secret
> mounts,
> > >    Python support, etc. the construction process could become far more
> > >    complicated, and the functionality of a single class could explode,
> > > which
> > >    hurts code readability, writability, and testability. Besides that,
> > both
> > >    the client-side and the master-side shares much of the Pod
> > construction
> > >    logic.
> > >
> > >
> > > So the problems are as follows:
> > >
> > >    1.
> > >
> > >    We don’t have a consistent abstraction that is applicable to both
> the
> > >    client-side and the master-side for Pod construction (including the
> > >    Containers) so that we have to share some of the code via tool
> classes
> > >    which is a trick solution, however, we can’t share most of the code
> as
> > > much
> > >    as possible.
> > >    2.
> > >
> > >    We don’t have a step-by-step orchestrator architecture to help the
> Pod
> > >    construction.
> > >
> > >
> > > For the proposed design,
> > >
> > >    1.
> > >
> > >    The problems above are all solved: we have a consistent abstraction
> > that
> > >    leads to a monadic-step orchestrator architecture to construct the
> Pod
> > > step
> > >    by step. One step is responsible for exactly one thing, following
> that
> > > is
> > >    the fact that every step is well testable, because each unit can be
> > >    considerably small, and the number of branches to test in any given
> > > step is
> > >    limited. Moreover, much of the step could be shared between the
> > > client-side
> > >    and the master-side, such as configuration files mount, etc.
> > >
> > >
> > > Question 3: Could all the resources share the same orchestrator
> > > architecture?
> > >
> > > For the existing design,
> > >
> > >    1.
> > >
> > >    We don’t have a unified orchestrator architecture for the
> construction
> > >    of all the Kubernetes resources, therefore, we need a Decorator
> chain
> > > for
> > >    every Kubernetes resource.
> > >
> > >
> > > For the proposed design,
> > >
> > >    1.
> > >
> > >    Following Question 1 ~ 2 and the design doc[1], we have introduced a
> > >    monadic-step orchestrator architecture to construct the Pod and all
> > the
> > >    Containers. Besides that, this architecture also works for the other
> > >    resources, such as the Services and the ConfigMaps. For example, if
> we
> > > need
> > >    a new Service or a ConfigMap, just introduce a new step.
> > >
> > >
> > >
> > > Question 4: How do we introduce the InitContainer or the side-car
> > > Container?
> > >
> > > For the existing design,
> > >
> > >    1.
> > >
> > >    Both the client-side and the master-side introduce some new
> > Decorators,
> > >
> > > the client-side chain could be:
> > >
> > > InitializerDecorator -> OwnerReferenceDecorator ->
> > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> > > SidecarDecorator -> etc
> > >
> > > -
> > >
> > > and the master-side could be:
> > >
> > > InitializerDecorator -> OwnerReferenceDecorator  ->
> > TaskManagerPodDecorator
> > > -> InitContainerDecorator -> SidecarDecorator -> etc
> > >
> > > As we can see, the FlinkMasterDeploymentDecorator or the
> > > TaskManagerPodDecorator is designed for the Pod construction, including
> > the
> > > Containers, so we don’t need to treat the init Container and the
> sidecar
> > > Container as special cases different from the main Container by
> > introducing
> > > a dedicated Decorator for every one of them. Such kind of trick
> solution
> > > confuses me, maybe to the other developers.
> > >
> > > For the proposed design,
> > >
> > >    1.
> > >
> > >    Following Question 1 ~ 4 and the design doc [1], the central premise
> > of
> > >    the proposed orchestrator architecture is that the Pod, the main
> > > Container,
> > >    the init Container, the sidecar Container, the Services, and the
> > > ConfigMaps
> > >    all sit on an equal footing. For every one of the Pod, the main
> > > Container,
> > >    the init Container and the sidecar Container, people could introduce
> > any
> > >    number of steps to finish its construction; we attach all the
> > > Containers to
> > >    the Pod in the Builder tool classes after the orchestrator
> > architecture
> > >    constructs them.
> > >
> > >
> > > Question 5: What is the relation between the Decorators?
> > >
> > > For the existing design,
> > >
> > >    1.
> > >
> > >    The Decorators are not independent; most of them have a strict order
> > in
> > >    the chain.
> > >    2.
> > >
> > >    The Decorators do not share common APIs in essence, as their input
> > type
> > >    could be different so that we can’t finish the construction of a
> > > Kubernetes
> > >    resource such as the Deployment and the TaskManager Pod through a
> > > one-time
> > >    traversal, there are boxing and unboxing overheads among some of the
> > >    neighboring Decorators.
> > >
> > >
> > > For the proposed design,
> > >
> > >    1.
> > >
> > >    The steps are completely independent so that they could have
> arbitrary
> > >    order in the chain; The only rule is that the Hadoop configuration
> > >    Decorator should be the last node in the chain.
> > >    2.
> > >
> > >    All the steps share common APIs, with the same input type of all the
> > >    methods, so we can construct a Kubernetes resource via a one-time
> > > traversal.
> > >
> > >
> > > Question 6: What is the number of Decorators chains?
> > >
> > > For the existing design,
> > >
> > >    1.
> > >
> > >    People have to introduce a new chain once they introduce a new
> > >    Kubernetes resource. For example, a new ConfigMap Decorator chain
> for
> > >    mounting a new configuration file. At this moment, we already have
> > five
> > >    Decorator chains.
> > >
> > >
> > > For the proposed design,
> > >
> > >    1.
> > >
> > >    There are always two chains, one is for constructing all the
> > Kubernetes
> > >    resources on the client-side, including the JobManager Pod, the
> > > Containers,
> > >    the ConfigMap(s), and the Services(s); the other one is for
> > constructing
> > >    the TaskManager Pod and the Containers.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
> > >
> > > >  Hi Canbing,
> > > >
> > > >
> > > > Thanks a lot for sharing your thoughts to improve the Flink on K8s
> > native
> > > > integration.
> > > > Frankly speaking, your discussion title confuses me and i am
> wondering
> > > > whether you
> > > > want to refactor the whole design. However, after i dive into the
> > details
> > > > and the provided
> > > > document, i realize that mostly you want to improve the
> implementation.
> > > >
> > > >
> > > > Regarding your two main points.
> > > >
> > > > >> Introduce a unified monadic-step based orchestrator architecture
> > that
> > > > has a better,
> > > > cleaner and consistent abstraction for the Kubernetes resources
> > > > construction process,
> > > > both applicable to the client side and the master side.
> > > >
> > > > When i introduce the decorator for the K8s in Flink, there is always
> a
> > > > guideline in my mind
> > > > that it should be easy for extension and adding new features. Just as
> > you
> > > > say, we have lots
> > > > of functions to support and the K8s is also evolving very fast. The
> > > current
> > > > `ConfigMapDecorator`,
> > > > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a
> basic
> > > > implementation with
> > > > some prerequisite parameters. Of course we could chain more
> decorators
> > to
> > > > construct the K8s
> > > > resources. For example, InitializerDecorator ->
> OwnerReferenceDecorator
> > > ->
> > > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> > > > SidecarDecorator -> etc.
> > > >
> > > > I am little sceptical about splitting every parameter into a single
> > > > decorator.  Since it does
> > > > not take too much benefits. But i agree with moving some common
> > > parameters
> > > > into separate
> > > > decorators(e.g. volume mount). Also introducing the `~Builder` class
> > and
> > > > spinning off the chaining
> > > > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
> > > >
> > > >
> > > >
> > > > >> Add some dedicated tools for centrally parsing, verifying, and
> > > managing
> > > > the Kubernetes parameters.
> > > >
> > > > Currently, we always get the parameters directly from Flink
> > > configuration(
> > > > e.g.
> > `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
> > > > think it could be improved
> > > > by introducing some dedicated conf parser classes. It is a good idea.
> > > >
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > >
> > > >
> > > >
> > > > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I would like to kick off a discussion on refactoring the existing
> > > > > Kubernetes resources construction architecture design.
> > > > >
> > > > > I created a design document [1] that clarifies our motivation to do
> > > this
> > > > > and some improvement proposals for the new design.
> > > > >
> > > > > Briefly, we would like to
> > > > > 1. Introduce a unified monadic-step based orchestrator architecture
> > > that
> > > > > has a better, cleaner and consistent abstraction for the Kubernetes
> > > > > resources construction process, both applicable to the client side
> > and
> > > > the
> > > > > master side.
> > > > > 2. Add some dedicated tools for centrally parsing, verifying, and
> > > > managing
> > > > > the Kubernetes parameters.
> > > > >
> > > > > It would be great to start the efforts before adding big features
> for
> > > the
> > > > > native Kubernetes submodule, and Tison and I plan to work together
> > for
> > > > this
> > > > > issue.
> > > > >
> > > > > Please let me know your thoughts.
> > > > >
> > > > > Regards,
> > > > > Canbin Zheng
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
> > > > > [2] https://issues.apache.org/jira/browse/FLINK-16194
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by felixzheng zheng <fe...@gmail.com>.
Great thanks for the quick feedback Till. You are right; it is not a
fundamentally different approach compared to
what we have right now, all the Kubernetes resources created are the same,
we aim to evolve the existing decorator approach so that,
1. the decorators are monadic and smaller in size and functionality.
2. the new decorator design allows reusing the decorators between the
client and the cluster as much as possible.
3. all the decorators are independent with each other, and they could have
arbitrary order in the chain, they share the same APIs and follow a unified
orchestrator architecture so that new developers could quickly understand
what should be done to introduce a new feature.

Besides that, the new approach allows us adding tests for every decorator
alone instead of doing a final test of all the decorators in the
Fabric8ClientTest.java.

Cheers,
Canbin Zheng

Till Rohrmann <tr...@apache.org> 于2020年2月22日周六 上午12:28写道:

> Thanks for starting this discussion Canbin. If I understand your proposal
> correctly, then you would like to evolve the existing decorator approach so
> that decorators are monadic and smaller in size and functionality. The
> latter aspect will allow to reuse them between the client and the cluster.
> Just to make sure, it is not a fundamentally different approach compared to
> what we have right now, is it?
>
> If this is the case, then I think it makes sense to reuse code as much as
> possible and to create small code units which are easier to test.
>
> Cheers,
> Till
>
> On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <fe...@gmail.com>
> wrote:
>
> > Thanks for the feedback @Yang Wang. I would like to discuss some of the
> > details in depth about why I am confused about the existing design.
> >
> > Question 1: How do we mount a configuration file?
> >
> > For the existing design,
> >
> >    1.
> >
> >    We need several classes to finish it:
> >    1.
> >
> >       InitializerDecorator
> >       2.
> >
> >       OwnerReferenceDecorator
> >       3.
> >
> >       ConfigMapDecorator
> >       4.
> >
> >       KubernetesUtils: providing the getConfigMapVolume method to share
> for
> >       the FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
> >       5.
> >
> >       FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
> >       6.
> >
> >       TaskManagerPodDecorator: mounts the ConfigMap volume.
> >       7.
> >
> >       If in the future, someone would like to introduce an init
> Container,
> >       the InitContainerDecorator has to mount the ConfigMap volume too.
> >
> >
> > I am confused about the current solution to mounting a configuration
> file:
> >
> >    1.
> >
> >    Actually, we do not need so many Decorators for mounting a file.
> >    2.
> >
> >    If we would like to mount a new file, we have no choice but to repeat
> >    the same tedious and scattered routine.
> >    3.
> >
> >    There’s no easy way to test the file mounting functionality alone; we
> >    have to construct the ConfigMap, the Deployment or the TaskManagerPod
> > first
> >    and then do a final test.
> >
> >
> > The reason why it is so complex to mount a configuration file is that we
> > don’t fully consider the internal connections among those resources in
> the
> > existing design.
> >
> > The new abstraction we proposed could solve such a kind of problem, the
> new
> > Decorator object is as follows:
> >
> > public interface KubernetesStepDecorator {
> >
> >   /**
> >
> >    * Apply transformations to the given FlinkPod in accordance with this
> > feature. This can include adding
> >
> >    * labels/annotations, mounting volumes, and setting startup command or
> > parameters, etc.
> >
> >    */
> >
> >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
> >
> >   /**
> >
> >    * Build the accompanying Kubernetes resources that should be
> introduced
> > to support this feature. This could
> >
> >    * only applicable to the client-side submission process.
> >
> >    */
> >
> >   List<HasMetadata> buildAccompanyingKubernetesResources() throws
> > IOException;
> >
> > }
> >
> > The FlinkPod is a composition of the Pod, the main Container, the init
> > Container, and the sidecar Container.
> >
> > Next, we introduce a KubernetesStepDecorator implementation, the method
> of
> > buildAccompanyingKubernetesResources creates the corresponding ConfigMap,
> > and the method of decorateFlinkPod configures the Volume for the Pod and
> > all the Containers.
> >
> > So, for the scenario of mounting a configuration file, the advantages of
> > this new architecture are as follows:
> >
> >    1.
> >
> >    One dedicated KubernetesStepDecorator implementation is enough to
> finish
> >    all the mounting work, meanwhile, this class would be shared between
> the
> >    client-side and the master-side. Besides that, the number of lines of
> > code
> >    in that class will not exceed 300 lines which facilitate code
> > readability
> >    and maintenance.
> >    2.
> >
> >    Testing becomes an easy thing now, as we can add a dedicated test
> class
> >    for only this class, the test would never rely on the construction of
> > other
> >    components such as the Deployment.
> >    3.
> >
> >    It’s quite convenient to mount a new configuration file via the newly
> >    dedicated KubernetesStepDecorator implementation.
> >
> >
> > Question 2: How do we construct the Pod?
> >
> > For the existing design,
> >
> >    1.
> >
> >    The FlinkMasterDeploymentDecorator is responsible for building the
> >    Deployment and configuring the Pod and the Containers, while the
> >    TaskManagerPodDecorator is responsible for building the TaskManager
> Pod.
> >    Take FlinkMasterDeploymentDecorator as an example, let’s see what it
> has
> >    done.
> >    1.
> >
> >       Configure the main Container, including the name, command, args,
> >       image, image pull policy, resource requirements, ports, all kinds
> of
> >       environment variables, all kinds of volume mounts, etc.
> >       2.
> >
> >       Configure the Pod, including service account, all kinds of volumes,
> >       and attach all kinds of Container, including the main Container,
> the
> > init
> >       Container, and the sidecar Container.
> >       3.
> >
> >       Configure the Deployment.
> >
> >
> >
> >    1.
> >
> >    The InitializerDecorator and the OwnerReferenceDecorator have basic
> >    logic so that the most complex work is completed in the
> >    FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator. With
> the
> >    introduction of new features for the Pod, such as customized volume
> > mounts,
> >    Hadoop configuration support, Kerberized HDFS support, secret mounts,
> >    Python support, etc. the construction process could become far more
> >    complicated, and the functionality of a single class could explode,
> > which
> >    hurts code readability, writability, and testability. Besides that,
> both
> >    the client-side and the master-side shares much of the Pod
> construction
> >    logic.
> >
> >
> > So the problems are as follows:
> >
> >    1.
> >
> >    We don’t have a consistent abstraction that is applicable to both the
> >    client-side and the master-side for Pod construction (including the
> >    Containers) so that we have to share some of the code via tool classes
> >    which is a trick solution, however, we can’t share most of the code as
> > much
> >    as possible.
> >    2.
> >
> >    We don’t have a step-by-step orchestrator architecture to help the Pod
> >    construction.
> >
> >
> > For the proposed design,
> >
> >    1.
> >
> >    The problems above are all solved: we have a consistent abstraction
> that
> >    leads to a monadic-step orchestrator architecture to construct the Pod
> > step
> >    by step. One step is responsible for exactly one thing, following that
> > is
> >    the fact that every step is well testable, because each unit can be
> >    considerably small, and the number of branches to test in any given
> > step is
> >    limited. Moreover, much of the step could be shared between the
> > client-side
> >    and the master-side, such as configuration files mount, etc.
> >
> >
> > Question 3: Could all the resources share the same orchestrator
> > architecture?
> >
> > For the existing design,
> >
> >    1.
> >
> >    We don’t have a unified orchestrator architecture for the construction
> >    of all the Kubernetes resources, therefore, we need a Decorator chain
> > for
> >    every Kubernetes resource.
> >
> >
> > For the proposed design,
> >
> >    1.
> >
> >    Following Question 1 ~ 2 and the design doc[1], we have introduced a
> >    monadic-step orchestrator architecture to construct the Pod and all
> the
> >    Containers. Besides that, this architecture also works for the other
> >    resources, such as the Services and the ConfigMaps. For example, if we
> > need
> >    a new Service or a ConfigMap, just introduce a new step.
> >
> >
> >
> > Question 4: How do we introduce the InitContainer or the side-car
> > Container?
> >
> > For the existing design,
> >
> >    1.
> >
> >    Both the client-side and the master-side introduce some new
> Decorators,
> >
> > the client-side chain could be:
> >
> > InitializerDecorator -> OwnerReferenceDecorator ->
> > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> > SidecarDecorator -> etc
> >
> > -
> >
> > and the master-side could be:
> >
> > InitializerDecorator -> OwnerReferenceDecorator  ->
> TaskManagerPodDecorator
> > -> InitContainerDecorator -> SidecarDecorator -> etc
> >
> > As we can see, the FlinkMasterDeploymentDecorator or the
> > TaskManagerPodDecorator is designed for the Pod construction, including
> the
> > Containers, so we don’t need to treat the init Container and the sidecar
> > Container as special cases different from the main Container by
> introducing
> > a dedicated Decorator for every one of them. Such kind of trick solution
> > confuses me, maybe to the other developers.
> >
> > For the proposed design,
> >
> >    1.
> >
> >    Following Question 1 ~ 4 and the design doc [1], the central premise
> of
> >    the proposed orchestrator architecture is that the Pod, the main
> > Container,
> >    the init Container, the sidecar Container, the Services, and the
> > ConfigMaps
> >    all sit on an equal footing. For every one of the Pod, the main
> > Container,
> >    the init Container and the sidecar Container, people could introduce
> any
> >    number of steps to finish its construction; we attach all the
> > Containers to
> >    the Pod in the Builder tool classes after the orchestrator
> architecture
> >    constructs them.
> >
> >
> > Question 5: What is the relation between the Decorators?
> >
> > For the existing design,
> >
> >    1.
> >
> >    The Decorators are not independent; most of them have a strict order
> in
> >    the chain.
> >    2.
> >
> >    The Decorators do not share common APIs in essence, as their input
> type
> >    could be different so that we can’t finish the construction of a
> > Kubernetes
> >    resource such as the Deployment and the TaskManager Pod through a
> > one-time
> >    traversal, there are boxing and unboxing overheads among some of the
> >    neighboring Decorators.
> >
> >
> > For the proposed design,
> >
> >    1.
> >
> >    The steps are completely independent so that they could have arbitrary
> >    order in the chain; The only rule is that the Hadoop configuration
> >    Decorator should be the last node in the chain.
> >    2.
> >
> >    All the steps share common APIs, with the same input type of all the
> >    methods, so we can construct a Kubernetes resource via a one-time
> > traversal.
> >
> >
> > Question 6: What is the number of Decorators chains?
> >
> > For the existing design,
> >
> >    1.
> >
> >    People have to introduce a new chain once they introduce a new
> >    Kubernetes resource. For example, a new ConfigMap Decorator chain for
> >    mounting a new configuration file. At this moment, we already have
> five
> >    Decorator chains.
> >
> >
> > For the proposed design,
> >
> >    1.
> >
> >    There are always two chains, one is for constructing all the
> Kubernetes
> >    resources on the client-side, including the JobManager Pod, the
> > Containers,
> >    the ConfigMap(s), and the Services(s); the other one is for
> constructing
> >    the TaskManager Pod and the Containers.
> >
> >
> >
> >
> >
> >
> >
> > Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
> >
> > >  Hi Canbing,
> > >
> > >
> > > Thanks a lot for sharing your thoughts to improve the Flink on K8s
> native
> > > integration.
> > > Frankly speaking, your discussion title confuses me and i am wondering
> > > whether you
> > > want to refactor the whole design. However, after i dive into the
> details
> > > and the provided
> > > document, i realize that mostly you want to improve the implementation.
> > >
> > >
> > > Regarding your two main points.
> > >
> > > >> Introduce a unified monadic-step based orchestrator architecture
> that
> > > has a better,
> > > cleaner and consistent abstraction for the Kubernetes resources
> > > construction process,
> > > both applicable to the client side and the master side.
> > >
> > > When i introduce the decorator for the K8s in Flink, there is always a
> > > guideline in my mind
> > > that it should be easy for extension and adding new features. Just as
> you
> > > say, we have lots
> > > of functions to support and the K8s is also evolving very fast. The
> > current
> > > `ConfigMapDecorator`,
> > > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a basic
> > > implementation with
> > > some prerequisite parameters. Of course we could chain more decorators
> to
> > > construct the K8s
> > > resources. For example, InitializerDecorator -> OwnerReferenceDecorator
> > ->
> > > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> > > SidecarDecorator -> etc.
> > >
> > > I am little sceptical about splitting every parameter into a single
> > > decorator.  Since it does
> > > not take too much benefits. But i agree with moving some common
> > parameters
> > > into separate
> > > decorators(e.g. volume mount). Also introducing the `~Builder` class
> and
> > > spinning off the chaining
> > > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
> > >
> > >
> > >
> > > >> Add some dedicated tools for centrally parsing, verifying, and
> > managing
> > > the Kubernetes parameters.
> > >
> > > Currently, we always get the parameters directly from Flink
> > configuration(
> > > e.g.
> `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
> > > think it could be improved
> > > by introducing some dedicated conf parser classes. It is a good idea.
> > >
> > >
> > > Best,
> > > Yang
> > >
> > >
> > >
> > >
> > > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:
> > >
> > > > Hi everyone,
> > > >
> > > > I would like to kick off a discussion on refactoring the existing
> > > > Kubernetes resources construction architecture design.
> > > >
> > > > I created a design document [1] that clarifies our motivation to do
> > this
> > > > and some improvement proposals for the new design.
> > > >
> > > > Briefly, we would like to
> > > > 1. Introduce a unified monadic-step based orchestrator architecture
> > that
> > > > has a better, cleaner and consistent abstraction for the Kubernetes
> > > > resources construction process, both applicable to the client side
> and
> > > the
> > > > master side.
> > > > 2. Add some dedicated tools for centrally parsing, verifying, and
> > > managing
> > > > the Kubernetes parameters.
> > > >
> > > > It would be great to start the efforts before adding big features for
> > the
> > > > native Kubernetes submodule, and Tison and I plan to work together
> for
> > > this
> > > > issue.
> > > >
> > > > Please let me know your thoughts.
> > > >
> > > > Regards,
> > > > Canbin Zheng
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
> > > > [2] https://issues.apache.org/jira/browse/FLINK-16194
> > > >
> > >
> >
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Till Rohrmann <tr...@apache.org>.
Thanks for starting this discussion Canbin. If I understand your proposal
correctly, then you would like to evolve the existing decorator approach so
that decorators are monadic and smaller in size and functionality. The
latter aspect will allow to reuse them between the client and the cluster.
Just to make sure, it is not a fundamentally different approach compared to
what we have right now, is it?

If this is the case, then I think it makes sense to reuse code as much as
possible and to create small code units which are easier to test.

Cheers,
Till

On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <fe...@gmail.com>
wrote:

> Thanks for the feedback @Yang Wang. I would like to discuss some of the
> details in depth about why I am confused about the existing design.
>
> Question 1: How do we mount a configuration file?
>
> For the existing design,
>
>    1.
>
>    We need several classes to finish it:
>    1.
>
>       InitializerDecorator
>       2.
>
>       OwnerReferenceDecorator
>       3.
>
>       ConfigMapDecorator
>       4.
>
>       KubernetesUtils: providing the getConfigMapVolume method to share for
>       the FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
>       5.
>
>       FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
>       6.
>
>       TaskManagerPodDecorator: mounts the ConfigMap volume.
>       7.
>
>       If in the future, someone would like to introduce an init Container,
>       the InitContainerDecorator has to mount the ConfigMap volume too.
>
>
> I am confused about the current solution to mounting a configuration file:
>
>    1.
>
>    Actually, we do not need so many Decorators for mounting a file.
>    2.
>
>    If we would like to mount a new file, we have no choice but to repeat
>    the same tedious and scattered routine.
>    3.
>
>    There’s no easy way to test the file mounting functionality alone; we
>    have to construct the ConfigMap, the Deployment or the TaskManagerPod
> first
>    and then do a final test.
>
>
> The reason why it is so complex to mount a configuration file is that we
> don’t fully consider the internal connections among those resources in the
> existing design.
>
> The new abstraction we proposed could solve such a kind of problem, the new
> Decorator object is as follows:
>
> public interface KubernetesStepDecorator {
>
>   /**
>
>    * Apply transformations to the given FlinkPod in accordance with this
> feature. This can include adding
>
>    * labels/annotations, mounting volumes, and setting startup command or
> parameters, etc.
>
>    */
>
>   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
>
>   /**
>
>    * Build the accompanying Kubernetes resources that should be introduced
> to support this feature. This could
>
>    * only applicable to the client-side submission process.
>
>    */
>
>   List<HasMetadata> buildAccompanyingKubernetesResources() throws
> IOException;
>
> }
>
> The FlinkPod is a composition of the Pod, the main Container, the init
> Container, and the sidecar Container.
>
> Next, we introduce a KubernetesStepDecorator implementation, the method of
> buildAccompanyingKubernetesResources creates the corresponding ConfigMap,
> and the method of decorateFlinkPod configures the Volume for the Pod and
> all the Containers.
>
> So, for the scenario of mounting a configuration file, the advantages of
> this new architecture are as follows:
>
>    1.
>
>    One dedicated KubernetesStepDecorator implementation is enough to finish
>    all the mounting work, meanwhile, this class would be shared between the
>    client-side and the master-side. Besides that, the number of lines of
> code
>    in that class will not exceed 300 lines which facilitate code
> readability
>    and maintenance.
>    2.
>
>    Testing becomes an easy thing now, as we can add a dedicated test class
>    for only this class, the test would never rely on the construction of
> other
>    components such as the Deployment.
>    3.
>
>    It’s quite convenient to mount a new configuration file via the newly
>    dedicated KubernetesStepDecorator implementation.
>
>
> Question 2: How do we construct the Pod?
>
> For the existing design,
>
>    1.
>
>    The FlinkMasterDeploymentDecorator is responsible for building the
>    Deployment and configuring the Pod and the Containers, while the
>    TaskManagerPodDecorator is responsible for building the TaskManager Pod.
>    Take FlinkMasterDeploymentDecorator as an example, let’s see what it has
>    done.
>    1.
>
>       Configure the main Container, including the name, command, args,
>       image, image pull policy, resource requirements, ports, all kinds of
>       environment variables, all kinds of volume mounts, etc.
>       2.
>
>       Configure the Pod, including service account, all kinds of volumes,
>       and attach all kinds of Container, including the main Container, the
> init
>       Container, and the sidecar Container.
>       3.
>
>       Configure the Deployment.
>
>
>
>    1.
>
>    The InitializerDecorator and the OwnerReferenceDecorator have basic
>    logic so that the most complex work is completed in the
>    FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator. With the
>    introduction of new features for the Pod, such as customized volume
> mounts,
>    Hadoop configuration support, Kerberized HDFS support, secret mounts,
>    Python support, etc. the construction process could become far more
>    complicated, and the functionality of a single class could explode,
> which
>    hurts code readability, writability, and testability. Besides that, both
>    the client-side and the master-side shares much of the Pod construction
>    logic.
>
>
> So the problems are as follows:
>
>    1.
>
>    We don’t have a consistent abstraction that is applicable to both the
>    client-side and the master-side for Pod construction (including the
>    Containers) so that we have to share some of the code via tool classes
>    which is a trick solution, however, we can’t share most of the code as
> much
>    as possible.
>    2.
>
>    We don’t have a step-by-step orchestrator architecture to help the Pod
>    construction.
>
>
> For the proposed design,
>
>    1.
>
>    The problems above are all solved: we have a consistent abstraction that
>    leads to a monadic-step orchestrator architecture to construct the Pod
> step
>    by step. One step is responsible for exactly one thing, following that
> is
>    the fact that every step is well testable, because each unit can be
>    considerably small, and the number of branches to test in any given
> step is
>    limited. Moreover, much of the step could be shared between the
> client-side
>    and the master-side, such as configuration files mount, etc.
>
>
> Question 3: Could all the resources share the same orchestrator
> architecture?
>
> For the existing design,
>
>    1.
>
>    We don’t have a unified orchestrator architecture for the construction
>    of all the Kubernetes resources, therefore, we need a Decorator chain
> for
>    every Kubernetes resource.
>
>
> For the proposed design,
>
>    1.
>
>    Following Question 1 ~ 2 and the design doc[1], we have introduced a
>    monadic-step orchestrator architecture to construct the Pod and all the
>    Containers. Besides that, this architecture also works for the other
>    resources, such as the Services and the ConfigMaps. For example, if we
> need
>    a new Service or a ConfigMap, just introduce a new step.
>
>
>
> Question 4: How do we introduce the InitContainer or the side-car
> Container?
>
> For the existing design,
>
>    1.
>
>    Both the client-side and the master-side introduce some new Decorators,
>
> the client-side chain could be:
>
> InitializerDecorator -> OwnerReferenceDecorator ->
> FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> SidecarDecorator -> etc
>
> -
>
> and the master-side could be:
>
> InitializerDecorator -> OwnerReferenceDecorator  -> TaskManagerPodDecorator
> -> InitContainerDecorator -> SidecarDecorator -> etc
>
> As we can see, the FlinkMasterDeploymentDecorator or the
> TaskManagerPodDecorator is designed for the Pod construction, including the
> Containers, so we don’t need to treat the init Container and the sidecar
> Container as special cases different from the main Container by introducing
> a dedicated Decorator for every one of them. Such kind of trick solution
> confuses me, maybe to the other developers.
>
> For the proposed design,
>
>    1.
>
>    Following Question 1 ~ 4 and the design doc [1], the central premise of
>    the proposed orchestrator architecture is that the Pod, the main
> Container,
>    the init Container, the sidecar Container, the Services, and the
> ConfigMaps
>    all sit on an equal footing. For every one of the Pod, the main
> Container,
>    the init Container and the sidecar Container, people could introduce any
>    number of steps to finish its construction; we attach all the
> Containers to
>    the Pod in the Builder tool classes after the orchestrator architecture
>    constructs them.
>
>
> Question 5: What is the relation between the Decorators?
>
> For the existing design,
>
>    1.
>
>    The Decorators are not independent; most of them have a strict order in
>    the chain.
>    2.
>
>    The Decorators do not share common APIs in essence, as their input type
>    could be different so that we can’t finish the construction of a
> Kubernetes
>    resource such as the Deployment and the TaskManager Pod through a
> one-time
>    traversal, there are boxing and unboxing overheads among some of the
>    neighboring Decorators.
>
>
> For the proposed design,
>
>    1.
>
>    The steps are completely independent so that they could have arbitrary
>    order in the chain; The only rule is that the Hadoop configuration
>    Decorator should be the last node in the chain.
>    2.
>
>    All the steps share common APIs, with the same input type of all the
>    methods, so we can construct a Kubernetes resource via a one-time
> traversal.
>
>
> Question 6: What is the number of Decorators chains?
>
> For the existing design,
>
>    1.
>
>    People have to introduce a new chain once they introduce a new
>    Kubernetes resource. For example, a new ConfigMap Decorator chain for
>    mounting a new configuration file. At this moment, we already have five
>    Decorator chains.
>
>
> For the proposed design,
>
>    1.
>
>    There are always two chains, one is for constructing all the Kubernetes
>    resources on the client-side, including the JobManager Pod, the
> Containers,
>    the ConfigMap(s), and the Services(s); the other one is for constructing
>    the TaskManager Pod and the Containers.
>
>
>
>
>
>
>
> Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:
>
> >  Hi Canbing,
> >
> >
> > Thanks a lot for sharing your thoughts to improve the Flink on K8s native
> > integration.
> > Frankly speaking, your discussion title confuses me and i am wondering
> > whether you
> > want to refactor the whole design. However, after i dive into the details
> > and the provided
> > document, i realize that mostly you want to improve the implementation.
> >
> >
> > Regarding your two main points.
> >
> > >> Introduce a unified monadic-step based orchestrator architecture that
> > has a better,
> > cleaner and consistent abstraction for the Kubernetes resources
> > construction process,
> > both applicable to the client side and the master side.
> >
> > When i introduce the decorator for the K8s in Flink, there is always a
> > guideline in my mind
> > that it should be easy for extension and adding new features. Just as you
> > say, we have lots
> > of functions to support and the K8s is also evolving very fast. The
> current
> > `ConfigMapDecorator`,
> > `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a basic
> > implementation with
> > some prerequisite parameters. Of course we could chain more decorators to
> > construct the K8s
> > resources. For example, InitializerDecorator -> OwnerReferenceDecorator
> ->
> > FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> > SidecarDecorator -> etc.
> >
> > I am little sceptical about splitting every parameter into a single
> > decorator.  Since it does
> > not take too much benefits. But i agree with moving some common
> parameters
> > into separate
> > decorators(e.g. volume mount). Also introducing the `~Builder` class and
> > spinning off the chaining
> > decorator calls from `Fabric8FlinkKubeClient` make sense to me.
> >
> >
> >
> > >> Add some dedicated tools for centrally parsing, verifying, and
> managing
> > the Kubernetes parameters.
> >
> > Currently, we always get the parameters directly from Flink
> configuration(
> > e.g. `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
> > think it could be improved
> > by introducing some dedicated conf parser classes. It is a good idea.
> >
> >
> > Best,
> > Yang
> >
> >
> >
> >
> > felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:
> >
> > > Hi everyone,
> > >
> > > I would like to kick off a discussion on refactoring the existing
> > > Kubernetes resources construction architecture design.
> > >
> > > I created a design document [1] that clarifies our motivation to do
> this
> > > and some improvement proposals for the new design.
> > >
> > > Briefly, we would like to
> > > 1. Introduce a unified monadic-step based orchestrator architecture
> that
> > > has a better, cleaner and consistent abstraction for the Kubernetes
> > > resources construction process, both applicable to the client side and
> > the
> > > master side.
> > > 2. Add some dedicated tools for centrally parsing, verifying, and
> > managing
> > > the Kubernetes parameters.
> > >
> > > It would be great to start the efforts before adding big features for
> the
> > > native Kubernetes submodule, and Tison and I plan to work together for
> > this
> > > issue.
> > >
> > > Please let me know your thoughts.
> > >
> > > Regards,
> > > Canbin Zheng
> > >
> > > [1]
> > >
> > >
> >
> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
> > > [2] https://issues.apache.org/jira/browse/FLINK-16194
> > >
> >
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by felixzheng zheng <fe...@gmail.com>.
Thanks for the feedback @Yang Wang. I would like to discuss some of the
details in depth about why I am confused about the existing design.

Question 1: How do we mount a configuration file?

For the existing design,

   1.

   We need several classes to finish it:
   1.

      InitializerDecorator
      2.

      OwnerReferenceDecorator
      3.

      ConfigMapDecorator
      4.

      KubernetesUtils: providing the getConfigMapVolume method to share for
      the FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
      5.

      FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
      6.

      TaskManagerPodDecorator: mounts the ConfigMap volume.
      7.

      If in the future, someone would like to introduce an init Container,
      the InitContainerDecorator has to mount the ConfigMap volume too.


I am confused about the current solution to mounting a configuration file:

   1.

   Actually, we do not need so many Decorators for mounting a file.
   2.

   If we would like to mount a new file, we have no choice but to repeat
   the same tedious and scattered routine.
   3.

   There’s no easy way to test the file mounting functionality alone; we
   have to construct the ConfigMap, the Deployment or the TaskManagerPod first
   and then do a final test.


The reason why it is so complex to mount a configuration file is that we
don’t fully consider the internal connections among those resources in the
existing design.

The new abstraction we proposed could solve such a kind of problem, the new
Decorator object is as follows:

public interface KubernetesStepDecorator {

  /**

   * Apply transformations to the given FlinkPod in accordance with this
feature. This can include adding

   * labels/annotations, mounting volumes, and setting startup command or
parameters, etc.

   */

  FlinkPod decorateFlinkPod(FlinkPod flinkPod);

  /**

   * Build the accompanying Kubernetes resources that should be introduced
to support this feature. This could

   * only applicable to the client-side submission process.

   */

  List<HasMetadata> buildAccompanyingKubernetesResources() throws
IOException;

}

The FlinkPod is a composition of the Pod, the main Container, the init
Container, and the sidecar Container.

Next, we introduce a KubernetesStepDecorator implementation, the method of
buildAccompanyingKubernetesResources creates the corresponding ConfigMap,
and the method of decorateFlinkPod configures the Volume for the Pod and
all the Containers.

So, for the scenario of mounting a configuration file, the advantages of
this new architecture are as follows:

   1.

   One dedicated KubernetesStepDecorator implementation is enough to finish
   all the mounting work, meanwhile, this class would be shared between the
   client-side and the master-side. Besides that, the number of lines of code
   in that class will not exceed 300 lines which facilitate code readability
   and maintenance.
   2.

   Testing becomes an easy thing now, as we can add a dedicated test class
   for only this class, the test would never rely on the construction of other
   components such as the Deployment.
   3.

   It’s quite convenient to mount a new configuration file via the newly
   dedicated KubernetesStepDecorator implementation.


Question 2: How do we construct the Pod?

For the existing design,

   1.

   The FlinkMasterDeploymentDecorator is responsible for building the
   Deployment and configuring the Pod and the Containers, while the
   TaskManagerPodDecorator is responsible for building the TaskManager Pod.
   Take FlinkMasterDeploymentDecorator as an example, let’s see what it has
   done.
   1.

      Configure the main Container, including the name, command, args,
      image, image pull policy, resource requirements, ports, all kinds of
      environment variables, all kinds of volume mounts, etc.
      2.

      Configure the Pod, including service account, all kinds of volumes,
      and attach all kinds of Container, including the main Container, the init
      Container, and the sidecar Container.
      3.

      Configure the Deployment.



   1.

   The InitializerDecorator and the OwnerReferenceDecorator have basic
   logic so that the most complex work is completed in the
   FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator. With the
   introduction of new features for the Pod, such as customized volume mounts,
   Hadoop configuration support, Kerberized HDFS support, secret mounts,
   Python support, etc. the construction process could become far more
   complicated, and the functionality of a single class could explode, which
   hurts code readability, writability, and testability. Besides that, both
   the client-side and the master-side shares much of the Pod construction
   logic.


So the problems are as follows:

   1.

   We don’t have a consistent abstraction that is applicable to both the
   client-side and the master-side for Pod construction (including the
   Containers) so that we have to share some of the code via tool classes
   which is a trick solution, however, we can’t share most of the code as much
   as possible.
   2.

   We don’t have a step-by-step orchestrator architecture to help the Pod
   construction.


For the proposed design,

   1.

   The problems above are all solved: we have a consistent abstraction that
   leads to a monadic-step orchestrator architecture to construct the Pod step
   by step. One step is responsible for exactly one thing, following that is
   the fact that every step is well testable, because each unit can be
   considerably small, and the number of branches to test in any given step is
   limited. Moreover, much of the step could be shared between the client-side
   and the master-side, such as configuration files mount, etc.


Question 3: Could all the resources share the same orchestrator
architecture?

For the existing design,

   1.

   We don’t have a unified orchestrator architecture for the construction
   of all the Kubernetes resources, therefore, we need a Decorator chain for
   every Kubernetes resource.


For the proposed design,

   1.

   Following Question 1 ~ 2 and the design doc[1], we have introduced a
   monadic-step orchestrator architecture to construct the Pod and all the
   Containers. Besides that, this architecture also works for the other
   resources, such as the Services and the ConfigMaps. For example, if we need
   a new Service or a ConfigMap, just introduce a new step.



Question 4: How do we introduce the InitContainer or the side-car Container?

For the existing design,

   1.

   Both the client-side and the master-side introduce some new Decorators,

the client-side chain could be:

InitializerDecorator -> OwnerReferenceDecorator ->
FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
SidecarDecorator -> etc

-

and the master-side could be:

InitializerDecorator -> OwnerReferenceDecorator  -> TaskManagerPodDecorator
-> InitContainerDecorator -> SidecarDecorator -> etc

As we can see, the FlinkMasterDeploymentDecorator or the
TaskManagerPodDecorator is designed for the Pod construction, including the
Containers, so we don’t need to treat the init Container and the sidecar
Container as special cases different from the main Container by introducing
a dedicated Decorator for every one of them. Such kind of trick solution
confuses me, maybe to the other developers.

For the proposed design,

   1.

   Following Question 1 ~ 4 and the design doc [1], the central premise of
   the proposed orchestrator architecture is that the Pod, the main Container,
   the init Container, the sidecar Container, the Services, and the ConfigMaps
   all sit on an equal footing. For every one of the Pod, the main Container,
   the init Container and the sidecar Container, people could introduce any
   number of steps to finish its construction; we attach all the Containers to
   the Pod in the Builder tool classes after the orchestrator architecture
   constructs them.


Question 5: What is the relation between the Decorators?

For the existing design,

   1.

   The Decorators are not independent; most of them have a strict order in
   the chain.
   2.

   The Decorators do not share common APIs in essence, as their input type
   could be different so that we can’t finish the construction of a Kubernetes
   resource such as the Deployment and the TaskManager Pod through a one-time
   traversal, there are boxing and unboxing overheads among some of the
   neighboring Decorators.


For the proposed design,

   1.

   The steps are completely independent so that they could have arbitrary
   order in the chain; The only rule is that the Hadoop configuration
   Decorator should be the last node in the chain.
   2.

   All the steps share common APIs, with the same input type of all the
   methods, so we can construct a Kubernetes resource via a one-time traversal.


Question 6: What is the number of Decorators chains?

For the existing design,

   1.

   People have to introduce a new chain once they introduce a new
   Kubernetes resource. For example, a new ConfigMap Decorator chain for
   mounting a new configuration file. At this moment, we already have five
   Decorator chains.


For the proposed design,

   1.

   There are always two chains, one is for constructing all the Kubernetes
   resources on the client-side, including the JobManager Pod, the Containers,
   the ConfigMap(s), and the Services(s); the other one is for constructing
   the TaskManager Pod and the Containers.







Yang Wang <da...@gmail.com> 于2020年2月21日周五 下午2:05写道:

>  Hi Canbing,
>
>
> Thanks a lot for sharing your thoughts to improve the Flink on K8s native
> integration.
> Frankly speaking, your discussion title confuses me and i am wondering
> whether you
> want to refactor the whole design. However, after i dive into the details
> and the provided
> document, i realize that mostly you want to improve the implementation.
>
>
> Regarding your two main points.
>
> >> Introduce a unified monadic-step based orchestrator architecture that
> has a better,
> cleaner and consistent abstraction for the Kubernetes resources
> construction process,
> both applicable to the client side and the master side.
>
> When i introduce the decorator for the K8s in Flink, there is always a
> guideline in my mind
> that it should be easy for extension and adding new features. Just as you
> say, we have lots
> of functions to support and the K8s is also evolving very fast. The current
> `ConfigMapDecorator`,
> `FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a basic
> implementation with
> some prerequisite parameters. Of course we could chain more decorators to
> construct the K8s
> resources. For example, InitializerDecorator -> OwnerReferenceDecorator ->
> FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> SidecarDecorator -> etc.
>
> I am little sceptical about splitting every parameter into a single
> decorator.  Since it does
> not take too much benefits. But i agree with moving some common parameters
> into separate
> decorators(e.g. volume mount). Also introducing the `~Builder` class and
> spinning off the chaining
> decorator calls from `Fabric8FlinkKubeClient` make sense to me.
>
>
>
> >> Add some dedicated tools for centrally parsing, verifying, and managing
> the Kubernetes parameters.
>
> Currently, we always get the parameters directly from Flink configuration(
> e.g. `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
> think it could be improved
> by introducing some dedicated conf parser classes. It is a good idea.
>
>
> Best,
> Yang
>
>
>
>
> felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:
>
> > Hi everyone,
> >
> > I would like to kick off a discussion on refactoring the existing
> > Kubernetes resources construction architecture design.
> >
> > I created a design document [1] that clarifies our motivation to do this
> > and some improvement proposals for the new design.
> >
> > Briefly, we would like to
> > 1. Introduce a unified monadic-step based orchestrator architecture that
> > has a better, cleaner and consistent abstraction for the Kubernetes
> > resources construction process, both applicable to the client side and
> the
> > master side.
> > 2. Add some dedicated tools for centrally parsing, verifying, and
> managing
> > the Kubernetes parameters.
> >
> > It would be great to start the efforts before adding big features for the
> > native Kubernetes submodule, and Tison and I plan to work together for
> this
> > issue.
> >
> > Please let me know your thoughts.
> >
> > Regards,
> > Canbin Zheng
> >
> > [1]
> >
> >
> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
> > [2] https://issues.apache.org/jira/browse/FLINK-16194
> >
>

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

Posted by Yang Wang <da...@gmail.com>.
 Hi Canbing,


Thanks a lot for sharing your thoughts to improve the Flink on K8s native
integration.
Frankly speaking, your discussion title confuses me and i am wondering
whether you
want to refactor the whole design. However, after i dive into the details
and the provided
document, i realize that mostly you want to improve the implementation.


Regarding your two main points.

>> Introduce a unified monadic-step based orchestrator architecture that
has a better,
cleaner and consistent abstraction for the Kubernetes resources
construction process,
both applicable to the client side and the master side.

When i introduce the decorator for the K8s in Flink, there is always a
guideline in my mind
that it should be easy for extension and adding new features. Just as you
say, we have lots
of functions to support and the K8s is also evolving very fast. The current
`ConfigMapDecorator`,
`FlinkMasterDeploymentDecorator`, `TaskManagerPodDecorator` is a basic
implementation with
some prerequisite parameters. Of course we could chain more decorators to
construct the K8s
resources. For example, InitializerDecorator -> OwnerReferenceDecorator ->
FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
SidecarDecorator -> etc.

I am little sceptical about splitting every parameter into a single
decorator.  Since it does
not take too much benefits. But i agree with moving some common parameters
into separate
decorators(e.g. volume mount). Also introducing the `~Builder` class and
spinning off the chaining
decorator calls from `Fabric8FlinkKubeClient` make sense to me.



>> Add some dedicated tools for centrally parsing, verifying, and managing
the Kubernetes parameters.

Currently, we always get the parameters directly from Flink configuration(
e.g. `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I
think it could be improved
by introducing some dedicated conf parser classes. It is a good idea.


Best,
Yang




felixzheng zheng <fe...@gmail.com> 于2020年2月21日周五 上午9:31写道:

> Hi everyone,
>
> I would like to kick off a discussion on refactoring the existing
> Kubernetes resources construction architecture design.
>
> I created a design document [1] that clarifies our motivation to do this
> and some improvement proposals for the new design.
>
> Briefly, we would like to
> 1. Introduce a unified monadic-step based orchestrator architecture that
> has a better, cleaner and consistent abstraction for the Kubernetes
> resources construction process, both applicable to the client side and the
> master side.
> 2. Add some dedicated tools for centrally parsing, verifying, and managing
> the Kubernetes parameters.
>
> It would be great to start the efforts before adding big features for the
> native Kubernetes submodule, and Tison and I plan to work together for this
> issue.
>
> Please let me know your thoughts.
>
> Regards,
> Canbin Zheng
>
> [1]
>
> https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
> [2] https://issues.apache.org/jira/browse/FLINK-16194
>