Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/08/30 11:25:50 UTC

[GitHub] [dolphinscheduler] cadl opened a new issue, #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

cadl opened a new issue, #11704:
URL: https://github.com/apache/dolphinscheduler/issues/11704

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   When my shell task times out, the workflow instance is marked as failed, but the shell process keeps running.
   
   
   https://github.com/apache/dolphinscheduler/pull/5212 fixed TaskKillProcessor, but the shell task timeout path still issues the kill action directly against the parent PID.
   
   https://github.com/apache/dolphinscheduler/issues/11051 may be the same issue.
   
   worker log:
   ```
   TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[290] - task run command: sudo -u xcfapp sh /jfs/dolphinscheduler/exec/process/6035996488480/6708908920992_2/344166/399310/344166_399310.command
   [INFO] 2022-08-30 17:01:01.281 [taskAppId=TASK-20220830-6708908920992_2-344166-399310] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[181] - process start, process id is: 27646
   [ERROR] 2022-08-30 17:31:01.281 [taskAppId=TASK-20220830-6708908920992_2-344166-399310] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[103] - shell task error
   java.lang.IllegalThreadStateException: process hasn't exited
   	at java.base/java.lang.ProcessImpl.exitValue(ProcessImpl.java:521)
   	at org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:200)
   	at org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
   	at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:182)
   	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
   	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
   	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   	at java.base/java.lang.Thread.run(Thread.java:829)
   [INFO] 2022-08-30 17:31:01.281 [taskAppId=TASK-20220830-6708908920992_2-344166-399310] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[57] - FINALIZE_SESSION
   [INFO] 2022-08-30 17:31:01.281 [taskAppId=TASK-20220830-6708908920992_2-344166-399310] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[230] - cancel process: 27646
   [INFO] 2022-08-30 17:31:01.282 [taskAppId=TASK-20220830-6708908920992_2-344166-399310] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[259] - soft kill task:344166_399310, process id:27646, cmd:sudo -u xcfapp kill 27646
   [INFO] 2022-08-30 18:08:31.636 [taskAppId=TASK-20220830-6708908920992_2-344166-399310] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[57] - FINALIZE_SESSION
   ```
   
   ### What you expected to happen
   
   The shell task should successfully kill the process when it times out.
   
   ### How to reproduce
   
   - prepare a tenant named `foo`
   - create a workflow and shell task, set shell command as `sleep 300`, and set timeout to 1 minute
   - schedule the workflow
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   3.0.0
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] cadl commented on issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

Posted by GitBox <gi...@apache.org>.
cadl commented on issue #11704:
URL: https://github.com/apache/dolphinscheduler/issues/11704#issuecomment-1232374160

   > 
   
   Got it. Thank you very much for your patient explanation. 👍 
   At first glance, I thought it was just about the log changes in #11099 😅
   
   Closing this issue




[GitHub] [dolphinscheduler] cadl closed issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

Posted by GitBox <gi...@apache.org>.
cadl closed issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`
URL: https://github.com/apache/dolphinscheduler/issues/11704




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11704:
URL: https://github.com/apache/dolphinscheduler/issues/11704#issuecomment-1233649508

   Oh, you are correct. It will kill all subprocesses. It just throws an ExitCodeException, but the process can exit.
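
   For illustration, here is a minimal sketch of what "kill all subprocesses" can look like: collect the PID of the parent and its live descendants, then build a single kill command for all of them. This is not the `ProcessUtils.kill` implementation. The class and method names are hypothetical, `ProcessHandle` requires Java 9+, and the tenant name is just a parameter.

   ```java
   import java.io.IOException;
   import java.io.UncheckedIOException;
   import java.util.ArrayList;
   import java.util.List;
   import java.util.stream.Collectors;

   // Hypothetical sketch, not DolphinScheduler code.
   public class KillTreeSketch {

       /** Returns the PID of the root process followed by the PIDs of its live descendants. */
       public static List<Long> collectTreePids(ProcessHandle root) {
           List<Long> pids = new ArrayList<>();
           pids.add(root.pid());
           root.descendants().forEach(child -> pids.add(child.pid()));
           return pids;
       }

       /** Builds (but does not run) a hard-kill command for the given PIDs, run as the tenant user. */
       public static String buildKillCommand(String tenant, List<Long> pids) {
           String joined = pids.stream().map(String::valueOf).collect(Collectors.joining(" "));
           return "sudo -u " + tenant + " kill -9 " + joined;
       }

       public static void main(String[] args) {
           Process process;
           try {
               // Assumes a Unix-like host with `sleep` on the PATH.
               process = new ProcessBuilder("sleep", "5").start();
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           try {
               // Print the command that would target the whole tree.
               System.out.println(buildKillCommand("xcfapp", collectTreePids(process.toHandle())));
           } finally {
               // Clean up the demo process itself.
               process.destroyForcibly();
           }
       }
   }
   ```

   The command is only built here, not executed. As noted elsewhere in this thread, the kill can still fail for an individual PID with `Operation not permitted`.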




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11704:
URL: https://github.com/apache/dolphinscheduler/issues/11704#issuecomment-1232857926

   @cadl @zhuangchong 
   Sorry, but I think this problem still exists. The log shows that the kill command (soft kill) was run, whereas ProcessUtils.kill uses a hard kill. But I think ProcessUtils.kill still cannot kill the process,
   because `sudo -u xxx kill -9 PID` cannot kill the `sudo -u xxx sh ...command` process. Please check the following screenshots.
   ![image](https://user-images.githubusercontent.com/20518339/187675765-6d5158fd-49ee-4974-9a48-1d7ca76f9293.png)
   ![image](https://user-images.githubusercontent.com/20518339/187675824-35d700e3-ad79-42f1-9b32-3182dd0299d2.png)
   
   
   




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11704:
URL: https://github.com/apache/dolphinscheduler/issues/11704#issuecomment-1231556534

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to channel `#troubleshooting`




[GitHub] [dolphinscheduler] zhuangchong commented on issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

Posted by GitBox <gi...@apache.org>.
zhuangchong commented on issue #11704:
URL: https://github.com/apache/dolphinscheduler/issues/11704#issuecomment-1232362419

   @cadl 
   
   The reason for the exception:
   
   The shell process execution timed out, so the following call returned `false`:
   
   ```
   boolean status = process.waitFor(remainTime, TimeUnit.SECONDS);
   ```
   
   With `status=false`, the code then calls `process.exitValue()`. Because the process has not ended, this method throws `IllegalThreadStateException`:
   
   ```
   logger.error("process has failure , exitStatusCode:{}, processExitValue:{}, ready to kill ...",
           result.getExitStatusCode(), process.exitValue());
   ```
   
   This was fixed in PR #11099, which is currently on the dev branch and has not been released to the 3.0.0 branch.
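
   The exception can be reproduced outside DolphinScheduler. Below is a minimal sketch, assuming a Unix-like host with `sleep` and `true` on the PATH. The class and method names are hypothetical, and this is not the code from #11099; it only shows that `exitValue()` is safe to call after `waitFor(...)` returned `true` and throws otherwise.

   ```java
   import java.io.IOException;
   import java.io.UncheckedIOException;
   import java.util.concurrent.TimeUnit;

   // Hypothetical sketch, not DolphinScheduler code.
   public class TimeoutExitValueSketch {

       /** Runs a command with a timeout and reports what happened as a short string. */
       public static String runWithTimeout(long timeoutMillis, String... command) {
           Process process;
           try {
               process = new ProcessBuilder(command).start();
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           try {
               boolean finished = process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS);
               if (finished) {
                   // Safe: the process has exited, so an exit value exists.
                   return "exited:" + process.exitValue();
               }
               try {
                   // Unsafe: the process is still running, so this call throws.
                   process.exitValue();
                   return "unexpected";
               } catch (IllegalThreadStateException e) {
                   return "timeout:" + e.getClass().getSimpleName();
               }
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
               throw new IllegalStateException("interrupted while waiting", e);
           } finally {
               // Clean up the demo process; this has no effect if it already exited.
               process.destroyForcibly();
           }
       }

       public static void main(String[] args) {
           System.out.println(runWithTimeout(200, "sleep", "5"));
           System.out.println(runWithTimeout(5000, "true"));
       }
   }
   ```

   Guarding the `exitValue()` call with the `waitFor` result (or with `process.isAlive()`) keeps the error-logging path from throwing before the kill logic runs.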
   
   
   With `status=false`, `ProcessUtils.kill(taskRequest)` is executed. It uses `kill -9 xxx` to kill the shell process directly, without performing a soft kill first.
   
   3.0.0:
   
   https://github.com/apache/dolphinscheduler/blob/9badb2d2fbc6b79efca96247dbc49a39ca6868f7/dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/main/java/org/apache/dolphinscheduler/plugin/task/api/AbstractCommandExecutor.java#L187-L207
   
   dev:
   
   https://github.com/apache/dolphinscheduler/blob/5fabce783a7e90b834d9842eebc4a20da6198a0e/dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/main/java/org/apache/dolphinscheduler/plugin/task/api/AbstractCommandExecutor.java#L224-L240
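
   For comparison, the difference between a soft kill and a hard kill can be sketched with the plain JDK `Process` API. This is only an illustration under assumptions: the class and method names are hypothetical, `destroy()` typically sends SIGTERM and `destroyForcibly()` SIGKILL on Linux, and both signal only the direct child as the JVM's own user, so this does not by itself address the `sudo` case.

   ```java
   import java.io.IOException;
   import java.io.UncheckedIOException;
   import java.util.concurrent.TimeUnit;

   // Hypothetical sketch, not DolphinScheduler code.
   public class SoftThenHardKillSketch {

       /** Starts a command, converting the checked IOException to an unchecked one. */
       public static Process start(String... command) {
           try {
               return new ProcessBuilder(command).start();
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
       }

       /** Tries a soft kill first, then escalates; returns which step ended the process. */
       public static String terminate(Process process, long graceMillis) {
           try {
               if (!process.isAlive()) {
                   return "already-exited";
               }
               process.destroy(); // soft kill request
               if (process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                   return "soft";
               }
               process.destroyForcibly(); // hard kill request
               if (process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                   return "hard";
               }
               // Neither request ended the process within the grace period.
               return "still-running";
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
               throw new IllegalStateException("interrupted while waiting for process exit", e);
           }
       }

       public static void main(String[] args) {
           // Assumes a Unix-like host with `sleep` on the PATH.
           Process process = start("sleep", "30");
           System.out.println(terminate(process, 2000));
       }
   }
   ```

   The `still-running` result corresponds to the situation reported in this issue: the kill request is issued, but the process survives it.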
   




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11704: [Bug] [TaskPlugin] Shell task timeout can't terminate process when `sudo.enable: true`

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11704:
URL: https://github.com/apache/dolphinscheduler/issues/11704#issuecomment-1232360986

   Because the kill command will fail, for example with `Operation not permitted`.

