Posted to dev@dolphinscheduler.apache.org by Jave-Chen <ke...@foxmail.com> on 2020/07/05 09:10:45 UTC

Re: [discuss] Force Task Success

Good ideas. Could you go into more detail?




------------------ Original Message ------------------
From: "1606079777" <1606079777@qq.com>;
Sent: Saturday, July 4, 2020, 10:26 AM
To: "dev" <dev@dolphinscheduler.apache.org>;

Subject: [discuss] Force Task Success



Hi everyone,


These are my ideas on the project "force task success".
(I will abbreviate it as FTS in the following text)


# Functional Requirements analysis


A process has two kinds of failure strategy: one is "continue" and the other is "end". FTS has to handle both.


1) "end"
In this case, failure of just one task can stop and end the whole process. So, here FTS should
mark the failed node and then let other nodes continue to execute.


2) "continue"
In this case, a failed process instance may contain several failed nodes. I think we should
let the user select which failed nodes FTS is applied to.


## Some details in the form of Q&A


Q: What exactly does FTS mean?
A: Briefly speaking, the system will change the states of the failed task instances, make the
process continue, and log each execution of the continuing tasks.


Q: After requesting FTS, are the operating parameters (such as failure strategy, notification
strategy, priority...) inherited or can they be reset?
A: They are inherited; FTS just reuses the previous parameters.


Q: When a sub_process fails, should the target of FTS be the sub_process or the failed node inside it?
A: The whole sub_process.
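
To make the state change from the first answer more concrete, here is a minimal sketch in Java. Everything in it is hypothetical and only for illustration: the names TaskState, FORCED_SUCCESS, TaskInstance and ForceTaskSuccess are not taken from the existing code base, and the real task instance model has many more fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified task states (not the project's real status enum).
enum TaskState { SUCCESS, FAILURE, FORCED_SUCCESS }

// Hypothetical minimal task instance: a node name plus its current state.
final class TaskInstance {
    final String nodeName;
    TaskState state;

    TaskInstance(String nodeName, TaskState state) {
        this.nodeName = nodeName;
        this.state = state;
    }
}

final class ForceTaskSuccess {
    // Marks every selected task that is currently FAILURE as FORCED_SUCCESS
    // and returns the names of the nodes whose state was changed.
    // Tasks that are not failed, or not selected, are left untouched.
    static List<String> apply(List<TaskInstance> tasks, Set<String> selectedNodes) {
        List<String> changed = new ArrayList<>();
        for (TaskInstance task : tasks) {
            if (task.state == TaskState.FAILURE && selectedNodes.contains(task.nodeName)) {
                task.state = TaskState.FORCED_SUCCESS;
                changed.add(task.nodeName);
            }
        }
        return changed;
    }
}
```

Using a separate state rather than plain SUCCESS is only one possible choice; it would keep forced tasks distinguishable from tasks that really succeeded. The returned list is what the operation log could record.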



# Implementation ideas


- For the api-server module, which needs to provide an API for the UI module:


1) add a new interface to handle FTS, where the user can choose certain failed nodes through
parameters. This interface will insert a new type of command into the database.


2) change one existing interface (/projects/{projectName}/instances/list-pagings), because
cmdType needs one more value.
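
As a rough illustration of point 1), the new command could look something like the sketch below. It is a sketch under assumptions, not the actual API: FtsCommandType, FtsCommand and InMemoryCommandStore are hypothetical names, and an in-memory list stands in for the command table in the database.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical command types; FORCE_TASK_SUCCESS is the proposed new value.
enum FtsCommandType { START_PROCESS, FORCE_TASK_SUCCESS }

// Hypothetical command row: which process instance, and which failed nodes to force.
final class FtsCommand {
    final FtsCommandType type;
    final int processInstanceId;
    final List<String> nodeNames;

    FtsCommand(FtsCommandType type, int processInstanceId, List<String> nodeNames) {
        this.type = type;
        this.processInstanceId = processInstanceId;
        // Defensive copy so later changes to the caller's list do not alter the command.
        this.nodeNames = Collections.unmodifiableList(new ArrayList<>(nodeNames));
    }
}

// In-memory stand-in for the command table (an assumption for this sketch).
final class InMemoryCommandStore {
    private final List<FtsCommand> rows = new ArrayList<>();

    // What the new API interface would do: validate the selection, then insert a command.
    FtsCommand submitForceTaskSuccess(int processInstanceId, List<String> failedNodeNames) {
        if (failedNodeNames == null || failedNodeNames.isEmpty()) {
            throw new IllegalArgumentException("at least one failed node must be selected");
        }
        FtsCommand command =
            new FtsCommand(FtsCommandType.FORCE_TASK_SUCCESS, processInstanceId, failedNodeNames);
        rows.add(command);
        return command;
    }

    int size() {
        return rows.size();
    }
}
```

The master-server would then pick up commands of this type and apply the state change to the listed nodes.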


- For the module master-server:


It will change the state of the failed nodes, log this operation, and send new task instances to the worker-server.


- For the module worker-server:


It will execute new task instances and log the execution.


Really looking forward to your advice~~