Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/05/28 10:27:23 UTC

[GitHub] [incubator-seatunnel] dijiekstra opened a new issue, #1968: [Feature][seatunnel-server] Architectural Design

dijiekstra opened a new issue, #1968:
URL: https://github.com/apache/incubator-seatunnel/issues/1968

   ### Search before asking
   
   - [X] I had searched in the [feature](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22Feature%22) and found no similar feature requirement.
   
   
   ### Description
   
   My English is not very good, so I am posting in Chinese first; I will add an English version later.
   
   ## High-Level Design
   
   
   ### Roles
   Working top-down from the features of the Web page, we can infer which modules and functions a complete seatunnel-server will have.
   
   #### Web
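   The main function of the Web is to provide visual task editing and development as well as task maintenance, so the main `menu` entries of the Web are: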
   - development
   - maintenance
   
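   On this basis, to make development and maintenance more convenient when editing tasks:
   1. If a data source can be filled in once and used many times, the task editing process becomes much simpler.
   2. When our Scheduler (described below) runs in embedded mode, the stability of the Server (described below) is something we care about a great deal.
   3. In addition, if operations such as development and maintenance have no permission control, everything becomes very dangerous.
   
   From the above we can infer that we also need:
   - governance (management of data sources and permissions)
   - monitor (monitoring the health of the service itself and of the external services it depends on)
   
   Of course, this will not be the final design. I believe that as seatunnel develops, users will demand more and more of seatunnel web and more requirements will emerge; we will then extend the functionality according to those requirements. For now, these 4 `menu` entries are sufficient.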
   
   #### Server
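   The Server is the bridge between the Web and the Scheduler; it is mainly responsible for translating and forwarding Web requests.
   When users develop tasks in the Web, there are usually three modes:
   1. Wizard mode
   2. Script mode
   3. Canvas mode
   
   Canvas mode is set aside for now, since it is a feature for much later. Let us look at wizard mode and script mode first.
   Wizard mode essentially offers a simpler, more convenient way to develop a seatunnel task, so it abstracts a task into several modules such as `source`, `destination`, and `mapping`. Once these modules are assembled, they form a dedicated JSON or DSL script; after this DSL script is parsed, it is finally handed to the Scheduler for execution.
   Why does the DSL exist? Why not assemble the seatunnel script entirely on the Web side?
   - The DSL provides a generic JSON template, so that the front end does not need to repeat development for each data source (so-called silo-style development) when building the Web pages.
   - If the front end had to assemble seatunnel's own execution script, the demands on front-end developers would be rather high. The front end does not need to care what the final execution script looks like; it only needs to build the page parameters into JSON and hand it to the Server for processing.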
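The translation step can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the JSON layout (`source`, `destination`, `type`, `options`), the function names, and the plugin names are hypothetical, and the output is only loosely modeled on a block-style seatunnel script. The `mapping` module is left out to keep the example short.

```python
import json


def render_block(name, plugin, options):
    """Render one top-level block (for example `source`) holding a single plugin."""
    lines = [f"{name} {{", f"  {plugin} {{"]
    for key, value in options.items():
        # json.dumps quotes strings and leaves numbers bare.
        lines.append(f"    {key} = {json.dumps(value)}")
    lines += ["  }", "}"]
    return "\n".join(lines)


def translate(dsl_json):
    """Translate the generic wizard JSON (hypothetical layout) into script text."""
    dsl = json.loads(dsl_json)
    blocks = [
        render_block("source", dsl["source"]["type"], dsl["source"]["options"]),
        render_block("sink", dsl["destination"]["type"], dsl["destination"]["options"]),
    ]
    return "\n".join(blocks)


# What the Web might send: the same JSON shape regardless of the data source.
wizard_json = json.dumps({
    "source": {"type": "Jdbc", "options": {"table": "orders"}},
    "destination": {"type": "Console", "options": {"limit": 10}},
})

script = translate(wizard_json)
print(script)
```

The point of the sketch is that the front end only fills in a uniform JSON shape, while knowledge of the script syntax stays on the Server side.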
   
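   That covers translation; next, forwarding.
   
   What kinds of requests are forwarded?
   Requests that need a task's runtime information, status, or results are forwarded to the Scheduler. The Server itself does not maintain or store this information, because the Server only stores information from before a task is submitted. To put it more precisely, a task is called a `job` while it is being developed; after it is published to the Scheduler, each execution instance on the Scheduler side is called a `task-instance`. The relationship is like that between a mold and the products made from it.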
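The `job` / `task-instance` distinction can be sketched as two small types. This is an illustrative model only: the class names, fields, and status values are assumptions, not part of the proposed design.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Job:
    """Definition kept by the Server before submission (the mold)."""
    job_id: str
    script: str


@dataclass
class TaskInstance:
    """One execution of a job on the Scheduler side (the product)."""
    instance_id: int
    job_id: str
    status: str = "SUBMITTED"


class Scheduler:
    """Toy scheduler that owns all runtime state; the Server keeps none of it."""

    def __init__(self):
        self._ids = count(1)
        self._instances = {}

    def run(self, job):
        # Every run of the same job yields a new, separately tracked instance.
        instance = TaskInstance(next(self._ids), job.job_id)
        self._instances[instance.instance_id] = instance
        return instance

    def instances_of(self, job_id):
        return [i for i in self._instances.values() if i.job_id == job_id]


job = Job("orders-sync", "source { ... } sink { ... }")
scheduler = Scheduler()
first = scheduler.run(job)
second = scheduler.run(job)
print(len(scheduler.instances_of("orders-sync")))
```

One job definition produces many instances, which is why runtime queries have to go to the Scheduler rather than to the Server's own storage.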
   
   The remaining functions are mainly simple CRUD, such as data source management and the collection and reporting of monitoring information.
   
   #### Scheduler
   
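   The Scheduler mainly consists of three parts:
   
   1. Unified scheduling abstraction layer
   Similar to a scheduler-proxy; it defines the interfaces for all scheduling and execution and for all their related functions.
   2. Embedded Scheduler engine
   Provides simple task execution and scheduling; some capabilities are not supported, such as dependency triggering and workflow models.
   3. Third-party Scheduler engines
   Wraps the APIs or integrates the SDKs of third-party Scheduler engines and overrides the abstraction layer's interfaces to complete the integration with other Scheduler engines.
   
   The specific interface definitions and implementation design will be presented in [Detail Design].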
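Until the detailed design is available, the three parts can be pictured roughly as follows. This is a sketch under assumptions: the interface name `SchedulerProxy`, its two methods, and the client methods `create_run` / `get_state` are all hypothetical and do not belong to any real scheduler SDK.

```python
from abc import ABC, abstractmethod


class SchedulerProxy(ABC):
    """Hypothetical abstraction layer: the only surface the Server talks to."""

    @abstractmethod
    def submit(self, script: str) -> str:
        """Submit a script and return an instance id."""

    @abstractmethod
    def status(self, instance_id: str) -> str:
        """Return the current state of an instance."""


class EmbeddedScheduler(SchedulerProxy):
    """Embedded engine: simple execution only, no dependencies or workflows."""

    def __init__(self):
        self._status = {}

    def submit(self, script):
        instance_id = f"embedded-{len(self._status) + 1}"
        self._status[instance_id] = "RUNNING"
        return instance_id

    def status(self, instance_id):
        return self._status[instance_id]


class ThirdPartyScheduler(SchedulerProxy):
    """Adapter that delegates to a third-party client (hypothetical API)."""

    def __init__(self, client):
        self._client = client

    def submit(self, script):
        return self._client.create_run(script)

    def status(self, instance_id):
        return self._client.get_state(instance_id)


class FakeClient:
    """Stand-in for a third-party SDK, used only to exercise the adapter."""

    def create_run(self, script):
        return "remote-1"

    def get_state(self, run_id):
        return "QUEUED"


for engine in (EmbeddedScheduler(), ThirdPartyScheduler(FakeClient())):
    instance_id = engine.submit("source { ... } sink { ... }")
    print(instance_id, engine.status(instance_id))
```

Because callers depend only on the abstract interface, switching between the embedded engine and a third-party one would not change the Server code.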
   
   ### Interaction Flow
   #### Flowcharts
   Task save
   ![image](https://user-images.githubusercontent.com/19817318/170811657-5820d729-8813-46ed-a5d3-908b3de495f4.png)
   Temporary task execution
   ![image](https://user-images.githubusercontent.com/19817318/170811696-30bdce4c-5e47-4a77-bdae-d5867c963372.png)
   Task maintenance
   ![image](https://user-images.githubusercontent.com/19817318/170811776-56611838-5536-46f5-b8eb-5a5b94c767e5.png)
   
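   #### Brief Description
   - Task save
   1. After finishing the task configuration in the Web, the user sends the JSON to the Server.
   2. On receiving a save operation, the Server first stores the JSON, then translates the JSON and its configuration into the script that seatunnel needs for execution, and synchronizes it to the Scheduler.
   3. The Scheduler, based on the configured information, hands it to the actual engine to be saved.
   - Temporary task execution
   1. The user runs a task temporarily in the Web, and the execution information is sent to the Server.
   2. After receiving the request, the Server parses the task JSON and forwards it to the Scheduler.
   3. After receiving the temporary execution request, the Scheduler submits the final execution script to the corresponding Scheduler engine.
   4. Once this series of operations succeeds, the Web shows that the execution succeeded and starts requesting the Server repeatedly to fetch logs and execution results.
   5. The Server forwards the query to the Scheduler, the Scheduler forwards it to the Scheduler engine, and the result is returned.
   6. After several rounds of polling, the result is displayed in the front end.
   - Task maintenance
   1. The user opens the maintenance center and chooses to view the execution history.
   2. The Web then requests the Server with the filter conditions and other related information.
   3. The Server forwards the request directly on receipt.
   4. The Scheduler, on receipt, queries the engine and returns the result.
   5. The Server receives the result and returns it to the Web.
   6. The Web displays the execution history.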
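The temporary-execution flow, including the polling in its later steps, can be sketched end to end. All class and method names here are hypothetical, the translation is a placeholder string, and the engine is a fake that reports success after a fixed number of queries; a real client would also wait between polls.

```python
class FakeEngine:
    """Stand-in engine: a run finishes after a fixed number of status queries."""

    def __init__(self, polls_until_done=3):
        self._polls = polls_until_done
        self._remaining = {}

    def submit(self, script):
        run_id = f"run-{len(self._remaining) + 1}"
        self._remaining[run_id] = self._polls
        return run_id

    def query(self, run_id):
        self._remaining[run_id] -= 1
        return "SUCCESS" if self._remaining[run_id] <= 0 else "RUNNING"


class Scheduler:
    """Passes submissions and queries on to the engine."""

    def __init__(self, engine):
        self._engine = engine

    def execute(self, script):
        return self._engine.submit(script)

    def query(self, run_id):
        return self._engine.query(run_id)


class Server:
    """Translates the task JSON, then forwards; it stores no runtime state."""

    def __init__(self, scheduler):
        self._scheduler = scheduler

    def execute_temporarily(self, task_json):
        script = f"# translated from {task_json}"  # placeholder translation
        return self._scheduler.execute(script)

    def query(self, run_id):
        return self._scheduler.query(run_id)


def web_run_and_poll(server, task_json, max_polls=10):
    """Web side: submit once, then poll the Server until the run leaves RUNNING."""
    run_id = server.execute_temporarily(task_json)
    history = []
    for _ in range(max_polls):
        state = server.query(run_id)
        history.append(state)
        if state != "RUNNING":
            break
    return history


history = web_run_and_poll(Server(Scheduler(FakeEngine())), '{"source": "..."}')
print(history)
```

The maintenance flow follows the same forwarding path, except that the Web sends filter conditions instead of a task to execute.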
   
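   ### Other Notes
   
   - This design covers only development and maintenance; features described in the overview such as `schema evolution` and `data time` are not designed or implemented for now.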
   
   
   ### Usage Scenario
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] dijiekstra commented on issue #1968: [Feature][seatunnel-server] Architectural Design

Posted by GitBox <gi...@apache.org>.
dijiekstra commented on issue #1968:
URL: https://github.com/apache/incubator-seatunnel/issues/1968#issuecomment-1141025669

   > Very good design. It would be better if you could convert it into English.
   
   I will translate it before 6.3




[GitHub] [incubator-seatunnel] chenhu commented on issue #1968: [Feature][seatunnel-server] Architectural Design

Posted by GitBox <gi...@apache.org>.
chenhu commented on issue #1968:
URL: https://github.com/apache/incubator-seatunnel/issues/1968#issuecomment-1172216319

   good design




[GitHub] [incubator-seatunnel] zyd915 commented on issue #1968: [Feature][seatunnel-server] Architectural Design

Posted by GitBox <gi...@apache.org>.
zyd915 commented on issue #1968:
URL: https://github.com/apache/incubator-seatunnel/issues/1968#issuecomment-1151981677

   Very good design!!




[GitHub] [incubator-seatunnel] gaojun2048 commented on issue #1968: [Feature][seatunnel-server] Architectural Design

Posted by GitBox <gi...@apache.org>.
gaojun2048 commented on issue #1968:
URL: https://github.com/apache/incubator-seatunnel/issues/1968#issuecomment-1140620870

   Very good design. It would be better if you could convert it into English.
   




[GitHub] [incubator-seatunnel] davidzollo commented on issue #1968: [Feature][seatunnel-server] Architectural Design

Posted by GitBox <gi...@apache.org>.
davidzollo commented on issue #1968:
URL: https://github.com/apache/incubator-seatunnel/issues/1968#issuecomment-1161336071

   good design




[GitHub] [incubator-seatunnel] gaojun2048 commented on issue #1968: [Feature][seatunnel-server] Architectural Design

Posted by GitBox <gi...@apache.org>.
gaojun2048 commented on issue #1968:
URL: https://github.com/apache/incubator-seatunnel/issues/1968#issuecomment-1148530109

   > > Very good design. It would be better if you could convert it into English.
   > 
   > I will translate it before 6.3
   
   Thank you very much!

