Posted to user@flink.apache.org by Sweta Kalakuntla <sk...@bandwidth.com> on 2021/02/23 21:56:12 UTC

Flink jobs organization and maintainability

Hi,

I need to implement many similar jobs. I would appreciate any guidance or
examples you may have for organizing them in a single Git repository,
rather than having one repo per job.

Thanks,
SK

--

Re: Flink jobs organization and maintainability

Posted by yidan zhao <hi...@gmail.com>.

I use a YAML config file to describe my jobs, and I run 'start xxxJobName'
to start a job. The start command is implemented with shell scripts.
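
As a rough illustration of the idea above (not the poster's actual scripts),
here is a minimal Java sketch that turns a job descriptor into a `flink run`
command line. The `JobSpec` and `JobLauncher` names, the job name, and the
parameter keys are hypothetical; the in-memory map stands in for the parsed
YAML file, since YAML parsing would need a third-party library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical descriptor, standing in for one job entry of the YAML file. */
final class JobSpec {
    final String mainClass;
    final String jarPath;
    final Map<String, String> params;

    JobSpec(String mainClass, String jarPath, Map<String, String> params) {
        this.mainClass = mainClass;
        this.jarPath = jarPath;
        this.params = params;
    }
}

final class JobLauncher {
    /** Builds the `flink run` command line for the job with the given name. */
    static List<String> startCommand(Map<String, JobSpec> jobs, String jobName) {
        JobSpec spec = jobs.get(jobName);
        if (spec == null) {
            throw new IllegalArgumentException("Unknown job: " + jobName);
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("flink");
        cmd.add("run");
        cmd.add("-d");            // detached mode
        cmd.add("-c");            // entry-point class inside the jar
        cmd.add(spec.mainClass);
        cmd.add(spec.jarPath);
        // Sort the parameters by key so the generated command is deterministic.
        for (Map.Entry<String, String> e : new TreeMap<>(spec.params).entrySet()) {
            cmd.add("--" + e.getKey());
            cmd.add(e.getValue());
        }
        return cmd;
    }
}
```

The resulting list could be handed to `ProcessBuilder` or printed for a shell
wrapper; either way, adding a new job only means adding a new descriptor.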

On Wed, Feb 24, 2021 at 10:09 PM, Arvid Heise <ar...@apache.org> wrote:

> If you have many similar jobs, they should be in the same repo (especially
> if they have the same development cycle).
>
> First, how different are the jobs?
> A) If they are very similar, go with just one job and configure it
> differently for each application. Then you can use different deployments of
> the same jar with different parameters/config. If you define your
> deployments as code, then you will have all deployment files in a dedicated
> deploy directory at the repository root.
> B) If they are somewhat similar, go with one Maven/Gradle project having
> several modules. Shared code should go into a *common* module. You should
> have a deploy directory per module.
>
> Note that I'd recommend the Table API for implementing the jobs, as it lets
> you stay with the simpler Option A for much longer. You can easily make it
> configurable to: a) join from multiple sources, b) group by a varying number
> of fields, c) have different aggregation functions, d) use different
> transformations...

Re: Flink jobs organization and maintainability

Posted by Arvid Heise <ar...@apache.org>.

If you have many similar jobs, they should be in the same repo (especially
if they have the same development cycle).

First, how different are the jobs?
A) If they are very similar, go with just one job and configure it
differently for each application. Then you can use different deployments of
the same jar with different parameters/config. If you define your
deployments as code, then you will have all deployment files in a dedicated
deploy directory at the repository root.
B) If they are somewhat similar, go with one Maven/Gradle project having
several modules. Shared code should go into a *common* module. You should
have a deploy directory per module.
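
To make Option A a bit more concrete, here is a minimal sketch of a single
entry point whose behavior is driven entirely by `--key value` arguments, so
each deployment of the same jar differs only in its parameters. The
`JobConfig` class and the parameter names (`job-name`, `source-topic`,
`sink-topic`, `parallelism`) are assumptions made up for illustration. In a
real job you would probably use Flink's `ParameterTool.fromArgs` instead; the
parsing is done by hand here only to keep the sketch free of dependencies.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-deployment settings for one shared job implementation. */
final class JobConfig {
    final String jobName;
    final String sourceTopic;
    final String sinkTopic;
    final int parallelism;

    private JobConfig(String jobName, String sourceTopic, String sinkTopic, int parallelism) {
        this.jobName = jobName;
        this.sourceTopic = sourceTopic;
        this.sinkTopic = sinkTopic;
        this.parallelism = parallelism;
    }

    /** Parses arguments of the form: --key value --key value ... */
    static JobConfig fromArgs(String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Arguments must come in --key value pairs");
        }
        Map<String, String> kv = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Expected --key, got: " + args[i]);
            }
            kv.put(args[i].substring(2), args[i + 1]);
        }
        return new JobConfig(
                require(kv, "job-name"),
                require(kv, "source-topic"),
                require(kv, "sink-topic"),
                // Parallelism is optional and defaults to 1 in this sketch.
                Integer.parseInt(kv.getOrDefault("parallelism", "1")));
    }

    private static String require(Map<String, String> kv, String key) {
        String value = kv.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing required parameter: --" + key);
        }
        return value;
    }
}
```

The job's `main` method would call `JobConfig.fromArgs(args)` once and build
the pipeline from the result. Each file in the deploy directory then only
needs to list the arguments for one deployment.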

Note that I'd recommend the Table API for implementing the jobs, as it lets
you stay with the simpler Option A for much longer. You can easily make it
configurable to: a) join from multiple sources, b) group by a varying number
of fields, c) have different aggregation functions, d) use different
transformations...
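
As one possible illustration of points b) and c), the sketch below builds a
SQL aggregation query from configuration values. The `AggregationQuery`
class, the `agg_value` alias, and the table and column names in the usage
example are hypothetical. In a Flink job, the resulting string could be
passed to `TableEnvironment.sqlQuery`; the sketch itself is plain Java, so it
runs without Flink on the classpath.

```java
import java.util.List;
import java.util.regex.Pattern;

final class AggregationQuery {
    // Only plain identifiers are accepted, since the values are spliced into SQL text.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Builds: SELECT keys, FUNC(value) AS agg_value FROM table GROUP BY keys */
    static String build(String table, List<String> groupByFields,
                        String aggFunction, String valueField) {
        if (groupByFields.isEmpty()) {
            throw new IllegalArgumentException("At least one group-by field is required");
        }
        check(table);
        check(aggFunction);
        check(valueField);
        for (String field : groupByFields) {
            check(field);
        }
        String keys = String.join(", ", groupByFields);
        return "SELECT " + keys + ", " + aggFunction + "(" + valueField + ") AS agg_value"
                + " FROM " + table
                + " GROUP BY " + keys;
    }

    private static void check(String identifier) {
        if (!IDENTIFIER.matcher(identifier).matches()) {
            throw new IllegalArgumentException("Not a plain identifier: " + identifier);
        }
    }
}
```

For example, `AggregationQuery.build("orders", List.of("customer_id",
"country"), "SUM", "amount")` groups by two fields. Another deployment of the
same jar could pass a single field and `MAX` without any code change.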
