Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2021/11/04 02:03:21 UTC

[GitHub] [dolphinscheduler] codexun opened a new issue #6687: About controlling the concurrent number and priority of running tasks

codexun opened a new issue #6687:
URL: https://github.com/apache/dolphinscheduler/issues/6687


   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
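   I am currently trialing version 2.0 with 3 worker nodes. Each worker is configured with 20 threads, for a total limit of 60, and every day between 12:00 midnight and 9:00 a.m. we need to complete 2k+ tasks. In practice, the number of tasks in the running state far exceeds 60 (dependent-type tasks already excluded), but Hadoop platform resources are limited and the cluster actually allows only 40 tasks. In addition, the current priority control sorts by the time a task is actually generated, not by the crontab time configured on the workflow that contains the task, so we cannot control the priority of our batch tasks. For example, a batch task configured with the highest priority and triggered daily at 1:00 may not be generated and run until 3:00 because it is waiting on upstream dependent tasks. Between 1:00 and 3:00, however, a large number of normal-priority tasks are generated, so the high-priority task has to wait a long time before it runs.
   So, I would like:
   1 The ability to control the number of tasks that are actually running concurrently.
   2 Priority ordering logic: sort by priority and by the crontab time configured for the workflow/task. There are two possible schemes:
       Option 1: the configured "priority" takes precedence over everything else; for the same crontab time, sort by the task's priority.
       Option 2: sort by crontab time first, with earlier times ranking higher; for the same crontab time, sort by the configured "priority".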
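
The two ordering schemes requested above can be compared with a small sketch. This is not DolphinScheduler code: `PendingTask`, its fields, and the convention that a lower `priority` number means a higher priority are all hypothetical, and Option 1 is read here as "priority first, then crontab time", which is an assumption about the reporter's intent. The `created_at` ordering stands in for the behaviour the reporter describes, not for the actual scheduler implementation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PendingTask:
    """Hypothetical task record; not a DolphinScheduler class."""
    name: str
    priority: int          # assumed encoding: 0 = highest, larger = lower
    cron_time: datetime    # trigger time configured in the workflow's crontab
    created_at: datetime   # time the task instance was actually generated


def priority_first(task):
    # Option 1 (as interpreted here): configured priority, then crontab time.
    return (task.priority, task.cron_time)


def cron_first(task):
    # Option 2: crontab time first, then configured priority.
    return (task.cron_time, task.priority)


def generation_order(task):
    # Ordering by instance generation time, as described in the report.
    return task.created_at


def order(tasks, key):
    """Return task names in the order they would be dispatched."""
    return [t.name for t in sorted(tasks, key=key)]


def at(hour, minute=0):
    # Arbitrary date; only the time of day matters for the example.
    return datetime(2021, 11, 4, hour, minute)


tasks = [
    # Highest priority, scheduled 1:00, but generated at 3:00 after waiting
    # for upstream dependencies.
    PendingTask("A", priority=0, cron_time=at(1), created_at=at(3)),
    PendingTask("B", priority=2, cron_time=at(2), created_at=at(2)),
    PendingTask("C", priority=2, cron_time=at(1), created_at=at(1, 30)),
    PendingTask("D", priority=0, cron_time=at(2), created_at=at(2, 10)),
]

print(order(tasks, priority_first))    # ['A', 'D', 'C', 'B']
print(order(tasks, cron_first))        # ['A', 'C', 'D', 'B']
print(order(tasks, generation_order))  # ['C', 'B', 'D', 'A']
```

Under both proposed schemes task A is dispatched first, while ordering by generation time puts it last. The two schemes differ only in whether a later high-priority task (D) overtakes an earlier normal-priority one (C).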
   
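   I believe sorting by crontab time is better than sorting by task instance generation time (task instance sequence number), because the configured crontab time is deterministic while the generation time of a task instance is not.
   I am not sure whether my understanding of the DS 2 logic described above is correct, and I look forward to a reply and discussion. During our trial we were unable to finish our data warehouse tasks within the required time window: with the Hadoop cluster resources unchanged, completion took 4 hours longer than with our original scheduling system.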
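
The first request, capping real concurrency at what the cluster can handle regardless of how many worker threads exist, can be sketched with a shared semaphore. This is a minimal illustration and not how DolphinScheduler dispatches tasks: `run_with_cap` and its parameters are hypothetical names, and the numbers (60 worker threads, a cluster limit of 40) are taken from the report.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_cap(num_tasks, worker_threads, cluster_limit, task_seconds=0.01):
    """Run dummy tasks on a thread pool and return the peak number that
    were executing at the same time. A semaphore enforces cluster_limit
    even when worker_threads is larger."""
    slots = threading.BoundedSemaphore(cluster_limit)
    lock = threading.Lock()
    running = 0
    peak = 0

    def task():
        nonlocal running, peak
        with slots:                      # wait for a free cluster slot
            with lock:
                running += 1
                peak = max(peak, running)
            time.sleep(task_seconds)     # stand-in for the real job
            with lock:
                running -= 1

    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        futures = [pool.submit(task) for _ in range(num_tasks)]
        for future in futures:
            future.result()              # re-raise any task exception
    return peak


# 3 workers x 20 threads = 60 threads, but the cluster tolerates only 40.
peak = run_with_cap(num_tasks=200, worker_threads=60, cluster_limit=40)
print(peak <= 40)
```

The pool size models the per-worker thread setting, while the semaphore models a separate global limit. Keeping the two apart is the point of the request: thread counts alone do not bound how many tasks hit the Hadoop cluster.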
   
   ### Use case
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #6687: About controlling the concurrent number and priority of running tasks

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #6687:
URL: https://github.com/apache/dolphinscheduler/issues/6687#issuecomment-960368974





