Posted to notifications@shardingsphere.apache.org by GitBox <gi...@apache.org> on 2020/07/13 06:28:41 UTC

[GitHub] [shardingsphere-elasticjob-lite] coodajingang opened a new pull request #1040: Add dag to lite

coodajingang opened a new pull request #1040:
URL: https://github.com/apache/shardingsphere-elasticjob-lite/pull/1040


   Fixes #ISSUSE_ID.
   
   Changes proposed in this pull request:
   -
   -
   -
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere-elasticjob-lite] coodajingang commented on pull request #1040: Add dag to lite

Posted by GitBox <gi...@apache.org>.
coodajingang commented on pull request #1040:
URL: https://github.com/apache/shardingsphere-elasticjob-lite/pull/1040#issuecomment-657946352


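   ## DAG
   ### DAG job self-registration
   A `JobDagConfig` class is added to the job configuration class `JobConfiguration` to hold the DAG-related settings: the group the DAG belongs to (dagGroup), the names of the jobs this job depends on, the job retry count and retry interval, and so on.
   When a job registers itself under the ZK node {jobName}/config, its DAG configuration is registered beneath that node as well. The DAG job also registers itself under the {dagGroup}/config/{jobName} node, whose value is the job's dependencies.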
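A minimal sketch of what the self-registration step might write, in plain Java with no ZooKeeper client. The class name `JobDagConfigSketch`, its fields, and both methods are hypothetical illustrations rather than the PR's actual API; the `/dag/` path segment is taken from the path list further down in this comment.

```java
import java.util.List;

// Hypothetical stand-in for the PR's JobDagConfig; every name here is an assumption.
final class JobDagConfigSketch {
    final String dagGroup;
    final List<String> dependencies;   // names of the jobs this job depends on
    final int retryTimes;
    final long retryIntervalMillis;

    JobDagConfigSketch(String dagGroup, List<String> dependencies, int retryTimes, long retryIntervalMillis) {
        this.dagGroup = dagGroup;
        this.dependencies = dependencies;
        this.retryTimes = retryTimes;
        this.retryIntervalMillis = retryIntervalMillis;
    }

    // Path of the node the job writes under its DAG group on self-registration.
    String groupConfigPath(String namespace, String jobName) {
        return "/" + namespace + "/dag/" + dagGroup + "/config/" + jobName;
    }

    // Value of that node: the dependency names, comma-separated.
    String groupConfigValue() {
        return String.join(",", dependencies);
    }
}
```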
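   ### Trigger mechanism
   After registration, the DAG job starts a `PathChildrenCacheListener` that watches the path {jobName}/state. When the job finishes running, its success or failure state is written under that path. On receiving the event, the listener either triggers the next dependent jobs or computes the state of the whole DAG. This is what drives the DAG's self-triggering.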
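The decision the listener has to make on each event can be sketched as a pure function over an in-memory graph. This is an illustration only: `DagTriggerSketch` and `nextJobs` are hypothetical names, and treating a job as triggerable once all of its dependencies are in a `satisfied` set is an assumption about the rule, not something the PR states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DagTriggerSketch {
    // graph maps each job name to the names of the jobs it depends on.
    // Returns the jobs that could be triggered now: not yet started, not a
    // root job (dependency "self"), and with every dependency satisfied.
    static List<String> nextJobs(Map<String, List<String>> graph, Set<String> satisfied, Set<String> started) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : graph.entrySet()) {
            String job = entry.getKey();
            List<String> deps = entry.getValue();
            boolean isRoot = deps.size() == 1 && "self".equals(deps.get(0));
            if (isRoot || started.contains(job)) {
                continue; // root jobs are started by their cron, not by the listener
            }
            if (satisfied.containsAll(deps)) {
                result.add(job);
            }
        }
        return result;
    }
}
```

When `nextJobs` returns an empty list and nothing is still running, the listener would move on to computing the overall DAG state.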
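   ### Job state statistics
   The ZK structure is as follows:
   * /{namespace}	/{jobName}	/state/state	    Running state: success, fail, or running
   * /{namespace}	/{jobName}	/proc/succ/{item}	Sharding items that succeeded
   * /{namespace}	/{jobName}	/proc/fail/{item}	Sharding items that failed
   
   When a job starts running, its state is set to running.
   When a sharding item finishes, it is recorded under proc/succ or proc/fail according to its outcome. Once every sharding item has reached a final state, the value of /state/state is updated.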
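A minimal sketch of how the per-item records could be folded into the job state. The names `JobStateSketch` and `resolve` are hypothetical, and the rule that a single failed sharding item makes the whole job fail is an assumption; the description above only says the state is updated once all items are final.

```java
import java.util.Set;

final class JobStateSketch {
    enum State { RUNNING, SUCCESS, FAIL }

    // succ and fail hold the sharding item numbers recorded under
    // proc/succ and proc/fail; totalItems is the sharding total count.
    static State resolve(int totalItems, Set<Integer> succ, Set<Integer> fail) {
        if (succ.size() + fail.size() < totalItems) {
            return State.RUNNING; // some item has not reached a final state yet
        }
        // Assumption: any failed item fails the job.
        return fail.isEmpty() ? State.SUCCESS : State.FAIL;
    }
}
```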
   
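   ### DAG-related ZK paths
   * /{namespace}    /dag    /{groupName}    /config     /{jobName}  Value is the jobs this job depends on, comma-separated; written when the job self-registers
   * /{namespace}    /dag    /{groupName}    /graph      /{jobName}  Value is the jobs this job depends on, comma-separated; generated from config when a root job runs, and can be thought of as the execution plan
   * /{namespace}    /dag    /{groupName}    /graph      /{jobName}/retry  Value is the current retry count
   * /{namespace}    /dag    /{groupName}    /states                 Value is the DAG state: success, fail, running, or pause
   * /{namespace}    /dag    /{groupName}    /running     /{jobName} Jobs currently running; a job is recorded here when it is triggered
   * /{namespace}    /dag    /{groupName}    /success     /{jobName} Jobs that succeeded; a job is moved here from running after it succeeds
   * /{namespace}    /dag    /{groupName}    /fail        /{jobName} Jobs that failed; a job is moved here from running after it fails
   * /{namespace}    /dag    /{groupName}    /skip        /{jobName} Skipped jobs; a failed job that is marked as skippable is moved here from running
   * /{namespace}    /dag    /{groupName}    /retry       /{jobName} Jobs waiting to be retried; a failed job that is to be retried according to its retry parameters is recorded here
   * /{namespace}    /daglatch   /{groupName}                        Leader election path for root jobs
   * /{namespace}    /dagretry   /{groupName}   /{jobName}           Delay queue for job retries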
   
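   ### DAG job retry
   A DAG job can be configured with retry parameters. When it fails, those parameters are checked; if a retry is due, the job is put into a ZK delay queue.
   When the delay expires, the job is triggered to run again.
   To prevent duplicate triggers on retry, two operations run in a single transaction: deleting the job from under dag retry and writing the instances trigger. This guarantees the trigger is written only once.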
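The "delete and write together, so the trigger fires once" idea can be illustrated without ZooKeeper. In the sketch below an atomic `ConcurrentHashMap.remove` stands in for the ZK transaction: only the caller whose removal succeeds gets to write the trigger. `RetrySketch` and its methods are hypothetical names, the delay itself is left out, and the `currentRetry < maxRetry` check is an assumed reading of the retry parameters.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

final class RetrySketch {
    private final Map<String, Integer> retryQueue = new ConcurrentHashMap<>(); // job -> attempt number
    private final List<String> triggers = new CopyOnWriteArrayList<>();        // stands in for the instances trigger

    // Called when a job fails: queue it for retry unless the limit is reached.
    boolean scheduleRetry(String job, int currentRetry, int maxRetry) {
        if (currentRetry >= maxRetry) {
            return false;
        }
        retryQueue.put(job, currentRetry + 1);
        return true;
    }

    // Called when the delay expires, possibly by several instances at once.
    // Only the call that actually removes the queue entry writes the trigger.
    boolean fire(String job) {
        if (retryQueue.remove(job) == null) {
            return false; // someone else already consumed this retry
        }
        triggers.add(job);
        return true;
    }

    List<String> triggers() {
        return triggers;
    }
}
```

In a real ZooKeeper transaction the delete fails if the node is already gone, which aborts the accompanying write; the map removal models that same effect.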
   
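   ### About graph
   The information under graph is the same as under config. It was introduced for two reasons:
   1. The DAG must be checked for cycles, and the information under config may change while the DAG is running. The DAG graph is therefore generated before the run, and for the duration of that run the graph is authoritative no matter how config changes.
   2. graph stores the current retry count. It is regenerated on each execution if the DAG has reached a final state, which makes it a convenient place to keep run-time parameters.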
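One way to perform the cycle check on a snapshot of config is Kahn's topological ordering, sketched below. This is not necessarily how the PR does it; `DagCycleSketch` and `hasCycle` are hypothetical names, and the dependency value `self` is treated as "no dependency".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DagCycleSketch {
    // config maps each job name to the names of the jobs it depends on.
    // Returns true when the jobs cannot all be ordered, which happens when
    // there is a cycle (or a dependency on a job that is not registered).
    static boolean hasCycle(Map<String, List<String>> config) {
        Map<String, Integer> pending = new HashMap<>();          // unresolved dependency count per job
        Map<String, List<String>> dependents = new HashMap<>();  // job -> jobs waiting on it
        for (Map.Entry<String, List<String>> entry : config.entrySet()) {
            int count = 0;
            for (String dep : entry.getValue()) {
                if ("self".equals(dep)) {
                    continue; // root job marker, not a real dependency
                }
                count++;
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(entry.getKey());
            }
            pending.put(entry.getKey(), count);
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        int visited = 0;
        while (!ready.isEmpty()) {
            String job = ready.poll();
            visited++;
            for (String next : dependents.getOrDefault(job, Collections.<String>emptyList())) {
                if (pending.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        return visited != pending.size();
    }
}
```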
   
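   ### DAG state
   A DAG has four states: running, pause, fail, and success. The transitions are:
   running: becomes pause via the console's pause button, after which no further dependent jobs are triggered; becomes fail or success when no job is left to trigger;
   pause: becomes running via the console's resume button, and the subsequent dependent jobs are triggered;
   fail: set when no job is left to trigger and at least one job has failed;
   success: set when no job is left to trigger and every job in the graph has succeeded;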
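The transitions listed above can be written down as a small state function. This is a sketch under assumptions: `DagStateSketch` and its method names are hypothetical, and leaving the state unchanged when nothing failed but not every job succeeded (for example, when some were skipped) is a guess, since the description does not cover that case.

```java
import java.util.Set;

final class DagStateSketch {
    enum DagState { RUNNING, PAUSE, FAIL, SUCCESS }

    // Console pause button: only a running DAG can be paused.
    static DagState pause(DagState current) {
        return current == DagState.RUNNING ? DagState.PAUSE : current;
    }

    // Console resume button: only a paused DAG can be resumed.
    static DagState resume(DagState current) {
        return current == DagState.PAUSE ? DagState.RUNNING : current;
    }

    // Called after a job finishes; anyTriggerable says whether some job can still be triggered.
    static DagState settle(DagState current, boolean anyTriggerable,
                           Set<String> graphJobs, Set<String> succeeded, Set<String> failed) {
        if (current != DagState.RUNNING || anyTriggerable) {
            return current;
        }
        if (!failed.isEmpty()) {
            return DagState.FAIL;
        }
        if (succeeded.containsAll(graphJobs)) {
            return DagState.SUCCESS;
        }
        return current; // assumption: undecided cases keep the current state
    }
}
```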
   
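   ### DAG job cron
   A root job is a job whose dependency is `self`; a dagGroup may contain several of them. Its cron expression is set as the normal schedule requires.
   Any other job, meaning one that depends on other jobs, should not be started by the schedule but by the jobs it depends on; set its cron to `1/59 * * * * ? 2099`.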
   
   





[GitHub] [shardingsphere-elasticjob-lite] terrymanu commented on pull request #1040: Add dag to lite

Posted by GitBox <gi...@apache.org>.
terrymanu commented on pull request #1040:
URL: https://github.com/apache/shardingsphere-elasticjob-lite/pull/1040#issuecomment-657415689


   This PR is too big. Could you provide the idea and a design document for implementing the PR?

