Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2020/04/15 11:51:17 UTC

[GitHub] [incubator-dolphinscheduler] Rubik-W edited a comment on issue #249: Lineage data kinship(Lineage数据血缘关系)

Rubik-W edited a comment on issue #249: Lineage data kinship(Lineage数据血缘关系)
URL: https://github.com/apache/incubator-dolphinscheduler/issues/249#issuecomment-613854384
 
 
   I plan to implement a table-level data lineage function.
   
   - SQL nodes and ETL nodes automatically parse their dependency (source) tables and target tables.
   - The frontend provides a switch that controls whether dependency detection is enabled.
   - The master server automatically injects dependent nodes, creating them from the parsed dependencies.
   - The generated dependent nodes are given a default number of retries.
   - Nodes with dependency detection enabled no longer need to be connected manually.
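
To make the idea concrete, here is a minimal sketch of how source and target tables could be pulled out of a node's SQL and turned into a scheduling order. It is not DolphinScheduler code: `parse_tables`, `schedule_order`, and the two regular expressions are hypothetical names made up for illustration, and the regex matching is a deliberate simplification (a real implementation would likely need a proper SQL parser). It assumes Python 3.9+ for `graphlib`.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical, simplified patterns; real SQL needs a proper parser.
_TARGET = re.compile(
    r"\binsert\s+(?:into|overwrite)\s+(?:table\s+)?([\w.]+)", re.IGNORECASE
)
_SOURCE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def parse_tables(sql):
    """Return (source_tables, target_tables) found in one SQL statement."""
    targets = set(_TARGET.findall(sql))
    # A table that the statement writes to is not treated as its own source.
    sources = set(_SOURCE.findall(sql)) - targets
    return sources, targets


def schedule_order(nodes):
    """Order node names so that each table's writer runs before its readers.

    `nodes` maps a node name to its SQL text (an assumed input shape).
    """
    parsed = {name: parse_tables(sql) for name, sql in nodes.items()}

    # Map each target table to the nodes that write it.
    writers = {}
    for name, (_, targets) in parsed.items():
        for table in targets:
            writers.setdefault(table, set()).add(name)

    # A node depends on every other node that writes one of its source tables.
    graph = {
        name: {w for t in sources for w in writers.get(t, ()) if w != name}
        for name, (sources, _) in parsed.items()
    }
    return list(TopologicalSorter(graph).static_order())
```

With this sketch, a node that reads `dw.orders` is ordered after the node that inserts into `dw.orders`, without any manual connection between the two.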
   
   I have already started a discussion on the mailing list.
   
   ---
   
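   - SQL nodes and ETL nodes automatically analyze the dependencies between tables by parsing the tables that the SQL selects from and inserts into; for other node types, the insert tables can be maintained manually (if such a need exists).
   - A dependency detection switch in the frontend controls whether the master performs dependency parsing during task scheduling.
   - The master server automatically injects dependent nodes (virtual dependent nodes are generated at runtime from the dependencies; the workflow definition data is not modified).
   - The generated dependent nodes are given a default number of failure retries, for example checking once every 5 minutes.
   - Once the dependency parsing switch is turned on, nodes no longer need to be connected manually; the master schedules the nodes in the order given by their dependencies.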
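
The runtime injection and retry behaviour described in the list could look roughly like the sketch below. Everything in it is an assumption made for illustration rather than DolphinScheduler's actual API: the `DependentNode` class, the `inject_dependent_nodes` function, the `dependency_detection` flag, the dictionary-shaped workflow definition, and the default of 3 retries are all hypothetical. Only the 5-minute check interval and the rule that the stored definition is not modified come from the proposal.

```python
import copy
import time
from dataclasses import dataclass


@dataclass
class DependentNode:
    """Hypothetical virtual node that waits for an upstream table to be ready."""

    table: str
    max_retries: int = 3            # assumed default, not stated in the proposal
    retry_interval_s: float = 300   # "check once every 5 minutes"

    def wait(self, is_ready, sleep=time.sleep):
        """Poll is_ready(table); True once ready, False when retries run out."""
        for attempt in range(self.max_retries + 1):
            if is_ready(self.table):
                return True
            if attempt < self.max_retries:
                sleep(self.retry_interval_s)
        return False


def inject_dependent_nodes(definition, sources_by_node):
    """Return a runtime copy of the workflow with virtual dependent nodes added.

    `definition` is an assumed dict-shaped workflow definition and
    `sources_by_node` maps a node name to its parsed source tables.
    The stored definition itself is left untouched.
    """
    runtime = copy.deepcopy(definition)
    runtime["dependents"] = {}
    # Only inject when the (hypothetical) detection switch is turned on.
    if runtime.get("dependency_detection", False):
        for node, tables in sources_by_node.items():
            runtime["dependents"][node] = [
                DependentNode(table) for table in sorted(tables)
            ]
    return runtime
```

Working on a deep copy keeps the injected nodes purely a scheduling-time concern, so turning the switch off again leaves the saved workflow exactly as the user defined it.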
   
   E.g.
   ![image](https://user-images.githubusercontent.com/39549317/79307246-305cdb80-7f29-11ea-9c55-96ede9778a5e.png)
   ![image](https://user-images.githubusercontent.com/39549317/79307253-3488f900-7f29-11ea-8342-df7a52fac39e.png)
   ![image](https://user-images.githubusercontent.com/39549317/79307266-381c8000-7f29-11ea-9258-1572e6997215.png)
   
