Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/12/16 03:56:50 UTC

[GitHub] [dolphinscheduler] hzyangkai opened a new pull request, #13202: [Feature-12968][Master]improvement failover process

hzyangkai opened a new pull request, #13202:
URL: https://github.com/apache/dolphinscheduler/pull/13202

   Achieves the basic goals of the design document in issue https://github.com/apache/dolphinscheduler/issues/12968:
   
   When the worker crashes, tasks running on YARN keep running; all other tasks are killed and restarted.
   When the master crashes, all tasks keep running.
   When both the master and the worker crash, tasks running on YARN keep running; all other tasks are killed and restarted.
   
   ## Purpose of the pull request
   
   
   ## Brief change log
   
   Adds two methods to the class AbstractTask.
   
   1. AbstractTask#oneAppIdPerTask: declares that the task generates exactly one appid. This method affects fault tolerance.
     1. If a task subclass implements oneAppIdPerTask=true, its appid can be collected and reported when the task starts, and fault tolerance is then performed based on that appid. By default, AbstractYarnTask#oneAppIdPerTask=true. The original FlinkStreamTask implementation confuses the appid with the jobid, so FlinkStreamTask#oneAppIdPerTask=false for now; FlinkStreamTask should be reworked later so that it can set oneAppIdPerTask=true.
     2. If a task subclass does not implement oneAppIdPerTask, the default oneAppIdPerTask=false applies and no appid is collected when the task starts. When the worker crashes, the task is killed remotely over ssh with kill -9 processId and a new task is started.
     
   2. AbstractTask#exitAfterSubmitTask: the submitting process exits immediately after the task is submitted. This method is an optional optimization of the submission path. The default value is false. Currently, only the Spark cluster-mode task returns true.
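
As a rough illustration of the two hooks and their defaults described above, here is a minimal sketch. It is not the code from this pull request: the class bodies are stripped down, SparkClusterTask and ShellTask are hypothetical names used only for the example, and the assumption that FlinkStreamTask sits under AbstractYarnTask is mine.

```java
// Hypothetical sketch of the two hooks; simplified, not the PR's actual classes.
abstract class AbstractTask {
    // Default false: the task does not promise a single appid,
    // so no appid is collected when the task starts.
    public boolean oneAppIdPerTask() {
        return false;
    }

    // Default false: the submitting process stays alive after submission.
    public boolean exitAfterSubmitTask() {
        return false;
    }
}

// YARN-backed tasks default to one appid per task.
abstract class AbstractYarnTask extends AbstractTask {
    @Override
    public boolean oneAppIdPerTask() {
        return true;
    }
}

// Opts out until appid and jobid handling are separated (hierarchy assumed).
class FlinkStreamTask extends AbstractYarnTask {
    @Override
    public boolean oneAppIdPerTask() {
        return false;
    }
}

// Hypothetical name standing in for the Spark cluster-mode task.
class SparkClusterTask extends AbstractYarnTask {
    @Override
    public boolean exitAfterSubmitTask() {
        return true;
    }
}

// Hypothetical non-YARN task that keeps both defaults.
class ShellTask extends AbstractTask {
}
```

Under these assumptions a subclass only overrides a hook when it differs from its parent, which is why ShellTask needs no body at all.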
   
   ## Verify this pull request
   
   Master crashes:
   1. When the master crashes and is then restarted, the channel to the worker is rebuilt for all types of tasks, and they keep running.
   
   Worker crashes:
   1. When the worker crashes, tasks implementing oneAppIdPerTask=true keep running. All other tasks are killed and restarted.
   
   Master & Worker crash:
   1. When both the master and the worker crash, tasks implementing oneAppIdPerTask=true keep running. All other tasks are killed and restarted.
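
The three verification cases above reduce to a small decision rule, sketched below. All names here (CrashScenario, FailoverAction, FailoverPolicy) are hypothetical and exist only for illustration; the actual failover code in DolphinScheduler is structured differently.

```java
// Hypothetical names; this only encodes the expected outcomes listed above.
enum CrashScenario { MASTER, WORKER, MASTER_AND_WORKER }

enum FailoverAction { KEEP_RUNNING, KILL_AND_RESTART }

final class FailoverPolicy {
    private FailoverPolicy() {
    }

    // Returns the expected outcome for one task after the given crash.
    static FailoverAction decide(CrashScenario scenario, boolean oneAppIdPerTask) {
        if (scenario == CrashScenario.MASTER) {
            // Only the master went down: every task keeps running.
            return FailoverAction.KEEP_RUNNING;
        }
        // The worker went down (alone or together with the master):
        // only tasks that can be tracked by a single appid survive.
        return oneAppIdPerTask
                ? FailoverAction.KEEP_RUNNING
                : FailoverAction.KILL_AND_RESTART;
    }
}
```

For example, `FailoverPolicy.decide(CrashScenario.WORKER, false)` yields `KILL_AND_RESTART`, matching the worker-crash case for a task that keeps the default oneAppIdPerTask=false.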


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] hzyangkai commented on pull request #13202: [Feature-12968][Master]improvement failover process

Posted by GitBox <gi...@apache.org>.
hzyangkai commented on PR #13202:
URL: https://github.com/apache/dolphinscheduler/pull/13202#issuecomment-1354345887

   > @hzyangkai Could u please use git rebase and fix the git history? Or you could close this PR and open a new one. See: https://github.com/sevntu-checkstyle/sevntu.checkstyle/wiki/Development-workflow-with-Git:-Fork,-Branching,-Commits,-and-Pull-Request
   
   @EricGao888 Thanks. I will open a new one.




[GitHub] [dolphinscheduler] caishunfeng commented on pull request #13202: [Feature-12968][Master]improvement failover process

Posted by GitBox <gi...@apache.org>.
caishunfeng commented on PR #13202:
URL: https://github.com/apache/dolphinscheduler/pull/13202#issuecomment-1356749796

   > > @hzyangkai Could u please use git rebase and fix the git history? Or you could close this PR and open a new one. See: https://github.com/sevntu-checkstyle/sevntu.checkstyle/wiki/Development-workflow-with-Git:-Fork,-Branching,-Commits,-and-Pull-Request
   > 
   > @EricGao888 thanks. i will open a new.
   
   I will close this pr.




[GitHub] [dolphinscheduler] EricGao888 commented on pull request #13202: [Feature-12968][Master]improvement failover process

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on PR #13202:
URL: https://github.com/apache/dolphinscheduler/pull/13202#issuecomment-1354224544

   @hzyangkai Could u please use git rebase and fix the git history? Or you could close this PR and open a new one. See: https://github.com/sevntu-checkstyle/sevntu.checkstyle/wiki/Development-workflow-with-Git:-Fork,-Branching,-Commits,-and-Pull-Request




[GitHub] [dolphinscheduler] caishunfeng closed pull request #13202: [Feature-12968][Master]improvement failover process

Posted by GitBox <gi...@apache.org>.
caishunfeng closed pull request #13202: [Feature-12968][Master]improvement failover process
URL: https://github.com/apache/dolphinscheduler/pull/13202

