Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/09/28 12:21:41 UTC

[GitHub] [dolphinscheduler] HomminLee opened a new issue, #12196: [Bug] [Worker] Start too many tasks that consume memory will cause the Worker to crash

HomminLee opened a new issue, #12196:
URL: https://github.com/apache/dolphinscheduler/issues/12196

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   If too many tasks are started at once and any one of them consumes too much memory, the Worker will crash.
   
   The Worker reports maxCpuloadAvg and reservedMemory to the Master through its heartbeat. When dispatching a task, the Master checks the Worker's status; if the Worker does not have enough memory, the Master puts the task back into taskPriorityQueue. However, the heartbeat has a 10-second interval. Once the Worker's status changes to available, the Master dispatches all the tasks in the queue to the Worker, and the Worker then crashes.
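
The burst described above can be illustrated with a small simulation. This is only a sketch and not DolphinScheduler code: the function names, the 1.0 GB per-task footprint, and the 4.0 GB snapshot are assumptions made for illustration. It shows that when every queued task is checked against the same heartbeat snapshot, all of them pass as soon as the snapshot looks healthy.

```python
from collections import deque

RESERVED_MEMORY_GB = 1.5  # worker.reserved.memory from the reproduction steps
TASK_MEMORY_GB = 1.0      # assumed per-task footprint, for illustration only


def dispatch_with_stale_snapshot(queued_tasks, snapshot_free_gb):
    """Master-side view (hypothetical): each queued task is checked against
    the last heartbeat snapshot, which does not change between dispatches."""
    queue = deque(range(queued_tasks))
    dispatched = 0
    while queue:
        if snapshot_free_gb < RESERVED_MEMORY_GB:
            break  # worker looks overloaded, so tasks stay in the queue
        queue.popleft()
        dispatched += 1  # note: snapshot_free_gb is never reduced here
    return dispatched


def real_free_memory_after(dispatched, actual_free_gb):
    """Worker-side reality: every dispatched task actually consumes memory."""
    return actual_free_gb - dispatched * TASK_MEMORY_GB
```

With 20 queued tasks and a snapshot reporting 4.0 GB free, `dispatch_with_stale_snapshot(20, 4.0)` returns 20, while `real_free_memory_after(20, 4.0)` is -16.0, i.e. the Worker is far over its real capacity before the next heartbeat arrives.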
   
   ### What you expected to happen
   
   The Worker should also check loadAverage and reservedMemory before running a task.
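
A worker-side check of this kind might look roughly like the sketch below. It is not the project's implementation: `should_accept_task` and its parameters are hypothetical names, and the comparison rules (free memory must be at least the reserved amount, load average must not exceed the limit) are an assumption about how the thresholds are meant to be applied. The readers are passed in as callables so that the check always uses live values rather than the last heartbeat.

```python
from typing import Callable


def should_accept_task(
    read_free_memory_gb: Callable[[], float],
    read_load_average: Callable[[], float],
    reserved_memory_gb: float,
    max_load_average: float,
) -> bool:
    """Hypothetical admission check run by the Worker just before it starts
    a task. Both values are read at call time, not taken from a heartbeat."""
    has_memory = read_free_memory_gb() >= reserved_memory_gb
    has_cpu = read_load_average() <= max_load_average
    return has_memory and has_cpu
```

A Worker that gets `False` here could reject the task so that the Master puts it back into the queue, instead of starting it and running out of memory.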
   
   ### How to reproduce
   
   Set worker.reserved.memory = 1.5
   
   Create a workflow that contains several tasks, each of which consumes a lot of memory and runs for a long time, e.g. 1G of memory and 200 seconds.
   
   Schedule the workflow to run every few seconds.
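
For the memory-heavy task in the steps above, a script along these lines could be used. It is a sketch rather than the script from the original report: `hold_memory` is a hypothetical helper, and the 4096-byte page size is an assumption. It allocates a buffer, touches one byte per page so the memory is actually resident, and then sleeps.

```python
import time

PAGE_SIZE = 4096  # assumed page size; one byte per page is written below


def hold_memory(size_bytes, seconds):
    """Allocate size_bytes, touch every page, then hold it for `seconds`."""
    block = bytearray(size_bytes)  # zero-filled buffer
    for offset in range(0, size_bytes, PAGE_SIZE):
        block[offset] = 1  # force the page to be backed by real memory
    time.sleep(seconds)
    return len(block)
```

Calling `hold_memory(1024 ** 3, 200)` in each task corresponds to the 1G and 200 seconds mentioned in the steps.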
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   3.0.x
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] HomminLee commented on issue #12196: [Question] [Worker] Start too many tasks that consume memory will cause the Worker to crash

Posted by GitBox <gi...@apache.org>.
HomminLee commented on issue #12196:
URL: https://github.com/apache/dolphinscheduler/issues/12196#issuecomment-1272294231

   @SbloodyS  It has nothing to do with the memory of a single task. My point is that the Master can't get the Worker's status in real time, so it dispatches countless tasks to the Worker.
   
   The following picture is the master log:
   
   <img width="1412" alt="image" src="https://user-images.githubusercontent.com/25881185/194704265-83a1dbe0-1c68-4b1c-9fde-3b14ee156e0c.png">
   
   At time 1, the Worker does not have enough memory, and the Master fails to dispatch tasks to it.
   
   ![image](https://user-images.githubusercontent.com/25881185/194704191-15f08711-0c32-4903-ac4e-283eb5bb058a.png)
   
   At time 2, the Worker has finished some tasks, and the Master learns from the heartbeat that the Worker has enough memory. The Master then dispatches all the tasks in the queue to the Worker, and the Worker runs them without any further check.
   
   




[GitHub] [dolphinscheduler] HomminLee closed issue #12196: [Question] [Worker] Start too many tasks that consume memory will cause the Worker to crash

Posted by GitBox <gi...@apache.org>.
HomminLee closed issue #12196: [Question] [Worker] Start too many tasks that consume memory will cause the Worker to crash
URL: https://github.com/apache/dolphinscheduler/issues/12196




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #12196: [Bug] [Worker] Start too many tasks that consume memory will cause the Worker to crash

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #12196:
URL: https://github.com/apache/dolphinscheduler/issues/12196#issuecomment-1261047502

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to channel `#troubleshooting`




[GitHub] [dolphinscheduler] SbloodyS commented on issue #12196: [Bug] [Worker] Start too many tasks that consume memory will cause the Worker to crash

Posted by GitBox <gi...@apache.org>.
SbloodyS commented on issue #12196:
URL: https://github.com/apache/dolphinscheduler/issues/12196#issuecomment-1261647720

   This problem can be solved by limiting the maximum execution resources of a single task. This has been implemented in #10373 and will be released in 3.1.0-release.

