Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2020/12/23 05:56:23 UTC

[GitHub] [incubator-dolphinscheduler] gaoyu2016 opened a new issue #4296: [Bug][Module Name] After the workflow instance is stopped manually, it stays in the ready-to-stop state while the task instance keeps running.

gaoyu2016 opened a new issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296


   After the workflow instance started running, I stopped it manually, but the workflow has stayed in the ready-to-stop state ever since.
   The corresponding task (a submitted Spark job) is still in the running state; even though the applicationID submitted to YARN has already finished, the task instance still shows as running.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] gaoyu2016 commented on issue #4296: [Bug] After the workflow instance is stopped manually, it stays in the ready-to-stop state; the corresponding task (a submitted Spark job) remains in the running state even though its applicationID on YARN has already finished.

Posted by GitBox <gi...@apache.org>.
gaoyu2016 commented on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-749968391


   That is, the workflow is named test_spark and the workflow instance is test_spark-0-1608691747369.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] gaoyu2016 commented on issue #4296: [Bug] After the workflow instance is stopped manually, it stays in the ready-to-stop state; the corresponding task (a submitted Spark job) remains in the running state even though its applicationID on YARN has already finished.

Posted by GitBox <gi...@apache.org>.
gaoyu2016 commented on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-749971724


   The master issued a kill command for the task, and the result it logged was: 2020-12-23 10:50:09.510 org.apache.dolphinscheduler.server.master.processor.TaskKillResponseProcessor:[49] - received task kill response command : TaskKillResponseCommand{taskInstanceId=631654, host='10.11.54.133:1234', status=6, processId=215490, appIds=[]}
   However, the task is actually still running, and the worker log shows no sign that the kill command was ever received.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] xingchun-chen closed issue #4296: [Bug] After the workflow instance is stopped manually, it stays in the ready-to-stop state; the corresponding task (a submitted Spark job) remains in the running state even though its applicationID on YARN has already finished.

Posted by GitBox <gi...@apache.org>.
xingchun-chen closed issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] xloya edited a comment on issue #4296: [Bug] After the workflow instance is stopped manually, it stays in the ready-to-stop state; the corresponding task (a submitted Spark job) remains in the running state even though its applicationID on YARN has already finished.

Posted by GitBox <gi...@apache.org>.
xloya edited a comment on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-750782976


   I ran into this problem last month as well, and I checked the code in TaskKillProcessor.java. When the yarn job is killed, its process() method is executed, and you can see that the taskCallbackService.sendResult() method closes the netty channel.
   
   TaskKillProcessor.java/process()
   `taskCallbackService.sendResult(taskKillResponseCommand.getTaskInstanceId(), taskKillResponseCommand.convert2Command());`
   
   TaskCallbackService.java/sendResult()
   `public void sendResult(int taskInstanceId, Command command){
           NettyRemoteChannel nettyRemoteChannel = getRemoteChannel(taskInstanceId);
           nettyRemoteChannel.writeAndFlush(command).addListener(new ChannelFutureListener(){
               @Override
               public void operationComplete(ChannelFuture future) throws Exception {
                   if(future.isSuccess()){
                       remove(taskInstanceId);
                       return;
                   }
               }
           });
       }`
   
   So after I changed the call from taskCallbackService.sendResult() to taskCallbackService.sendAck(), the task was killed normally. I'm not sure whether this is a bug; I hope this helps.
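
   A minimal sketch of the substitution this comment describes follows; it is not the project's actual patch, and it assumes taskCallbackService.sendAck() takes the same (taskInstanceId, Command) arguments as the sendResult() call quoted above:

   ```java
   // Hypothetical sketch inside TaskKillProcessor#process(): reply to the kill request
   // with sendAck(...) so the cached callback channel is not removed after the write,
   // as happens in the quoted sendResult() listener.

   // before (as quoted above):
   // taskCallbackService.sendResult(taskKillResponseCommand.getTaskInstanceId(),
   //         taskKillResponseCommand.convert2Command());

   // after (the change this comment reports trying):
   taskCallbackService.sendAck(taskKillResponseCommand.getTaskInstanceId(),
           taskKillResponseCommand.convert2Command());
   ```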


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] gaoyu2016 commented on issue #4296: [Bug] After the workflow instance is stopped manually, it stays in the ready-to-stop state; the corresponding task (a submitted Spark job) remains in the running state even though its applicationID on YARN has already finished.

Posted by GitBox <gi...@apache.org>.
gaoyu2016 commented on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-749967583


   The master log is as follows:
   [INFO] 2020-12-23 10:49:07.381 org.apache.dolphinscheduler.server.master.runner.MasterExecThread:[939] - remove task from stand by list: test_spark
   [INFO] 2020-12-23 10:49:07.383 org.apache.dolphinscheduler.service.process.ProcessService:[833] - start submit task : test_spark, instance id:622681, state: RUNNING_EXEUTION
   [INFO] 2020-12-23 10:49:07.385 org.apache.dolphinscheduler.service.process.ProcessService:[846] - end submit task to db successfully:test_spark state:SUBMITTED_SUCCESS complete, instance id:622681 state: RUNNING_EXEUTION
   [INFO] 2020-12-23 10:49:07.387 org.apache.dolphinscheduler.server.master.runner.MasterTaskExecThread:[179] - task ready to submit: TaskInstance{id=631654, name='test_spark', taskType='SHELL', processDefinitionId=292, processInstanceId=622681, processInstanceName='null', taskJson='{"conditionResult":"{\"successNode\":[\"\"],\"failedNode\":[\"\"]}","conditionsTask":false,"depList":[],"dependence":"{}","forbidden":false,"id":"tasks-79700","maxRetryTimes":0,"name":"test_spark","params":"{\"rawScript\":\"spark-submit --master yarn --deploy-mode cluster --num-executors 6 --driver-memory 1g --executor-memory 4g --executor-cores 2 test_spark.py\",\"localParams\":[],\"resourceList\":[{\"res\":\"test_spark.py\",\"name\":\"test_spark.py\",\"id\":143}]}","preTasks":"[]","retryInterval":1,"runFlag":"NORMAL","taskInstancePriority":"MEDIUM","taskTimeoutParameter":{"enable":false,"interval":0},"timeout":"{\"enable\":false,\"strategy\":\"\"}","type":"SHELL","workerGroup":"default"}', state=SUBMITTED_SUCCESS, submitTime=Wed Dec 23 10:49:07 CST 2020, startTime=Wed Dec 23 10:49:07 CST 2020, endTime=null, host='null', executePath='null', logPath='null', retryTimes=0, alertFlag=NO, processInstance=null, processDefine=null, pid=0, appLink='null', flag=YES, dependency='null', duration=null, maxRetryTimes=0, retryInterval=1, taskInstancePriority=MEDIUM, processInstancePriority=MEDIUM, dependentResult='null', workerGroup='default', executorId=31, executorName='null'}
   [INFO] 2020-12-23 10:49:07.387 org.apache.dolphinscheduler.server.master.runner.MasterTaskExecThread:[190] - master submit success, task : test_spark
   [INFO] 2020-12-23 10:49:07.388 org.apache.dolphinscheduler.server.master.runner.MasterTaskExecThread:[123] - wait task: process id: 622681, task id:631654, task name:test_spark complete
   [INFO] 2020-12-23 10:49:07.528 org.apache.dolphinscheduler.server.master.processor.TaskAckProcessor:[81] - taskAckCommand : TaskExecuteAckCommand{taskInstanceId=631654, startTime=Wed Dec 23 10:49:07 CST 2020, host='10.11.54.133:1234', status=1, logPath='/home/shared/opt/soft/ds/logs/292/622681/631654.log', executePath='/home/shared/opt/soft/ds/tmp/dolphinscheduler/exec/process/32/292/622681/631654'}
   [INFO] 2020-12-23 10:50:00.004 org.apache.dolphinscheduler.service.quartz.ProcessScheduleJob:[75] - scheduled fire time :Wed Dec 23 10:50:00 CST 2020, fire time :Wed Dec 23 10:50:00 CST 2020, process id :26
   [WARN] 2020-12-23 10:50:00.006 org.apache.dolphinscheduler.service.quartz.ProcessScheduleJob:[90] - process definition does not exist in db or offline,need not to create command, projectId:14, processId:26
   [INFO] 2020-12-23 10:50:08.466 org.apache.dolphinscheduler.server.master.runner.MasterExecThread:[902] - work flow process instance [id: 622681, name:test_spark-0-1608691747369], state change from RUNNING_EXEUTION to READY_STOP, cmd type: START_PROCESS
   [INFO] 2020-12-23 10:50:09.503 org.apache.dolphinscheduler.server.master.runner.MasterTaskExecThread:[225] - master kill taskInstance name :test_spark taskInstance id:631654
   [INFO] 2020-12-23 10:50:09.510 org.apache.dolphinscheduler.server.master.processor.TaskKillResponseProcessor:[49] - received task kill response command : TaskKillResponseCommand{taskInstanceId=631654, host='10.11.54.133:1234', status=6, processId=215490, appIds=[]}


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] xingchun-chen commented on issue #4296: [Bug][Module Name] After the workflow instance is stopped manually, it stays in the ready-to-stop state while the task instance keeps running.

Posted by GitBox <gi...@apache.org>.
xingchun-chen commented on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-749952904


   Please update the title in English.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] xloya commented on issue #4296: [Bug] After the workflow instance is stopped manually, it stays in the ready-to-stop state; the corresponding task (a submitted Spark job) remains in the running state even though its applicationID on YARN has already finished.

Posted by GitBox <gi...@apache.org>.
xloya commented on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-750782976


   I ran into this problem last month as well, and I checked the code in TaskKillProcessor.java. When the yarn job is killed, its process() method is executed, and you can see that the taskCallbackService.sendResult() method closes the netty channel.
   
   TaskKillProcessor.java/process()
   `taskCallbackService.sendResult(taskKillResponseCommand.getTaskInstanceId(), taskKillResponseCommand.convert2Command());`
   
   TaskCallbackService.java/sendResult()
   `    public void sendResult(int taskInstanceId, Command command){
           NettyRemoteChannel nettyRemoteChannel = getRemoteChannel(taskInstanceId);
           nettyRemoteChannel.writeAndFlush(command).addListener(new ChannelFutureListener(){
   
               @Override
               public void operationComplete(ChannelFuture future) throws Exception {
                   if(future.isSuccess()){
                       remove(taskInstanceId);
                       return;
                   }
               }
           });
       }`
   
   So after I changed the call from taskCallbackService.sendResult() to taskCallbackService.sendAck(), the task was killed normally. I'm not sure whether this is a bug; I hope this helps.
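
   To make the channel-cache behaviour described here easier to follow, below is a small self-contained toy model in plain Java (no Netty; none of these class or method names come from DolphinScheduler). Its sendResult() mirrors the quoted code by dropping the cached channel after the write, while its sendAck() is assumed, per this comment, to leave the channel in place so later messages for the same task instance can still be delivered:

   ```java
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;

   // Toy model only: shows why removing the cached channel after the kill response
   // could leave later messages for that task instance undeliverable.
   public class CallbackChannelCacheDemo {

       // Stand-in for the cached netty channel.
       static final class FakeChannel {
           void writeAndFlush(String command) {
               System.out.println("sent: " + command);
           }
       }

       private final Map<Integer, FakeChannel> channels = new ConcurrentHashMap<>();

       void register(int taskInstanceId) {
           channels.put(taskInstanceId, new FakeChannel());
       }

       // Mirrors the quoted sendResult(): write, then drop the cached channel.
       void sendResult(int taskInstanceId, String command) {
           channels.get(taskInstanceId).writeAndFlush(command);
           channels.remove(taskInstanceId); // nothing more can be sent for this task
       }

       // Assumed behaviour of sendAck(): write but keep the cached channel.
       void sendAck(int taskInstanceId, String command) {
           channels.get(taskInstanceId).writeAndFlush(command);
       }

       public static void main(String[] args) {
           CallbackChannelCacheDemo cache = new CallbackChannelCacheDemo();
           cache.register(631654);

           cache.sendAck(631654, "kill response");    // channel survives the ack
           System.out.println("cached after sendAck: " + cache.channels.containsKey(631654));

           cache.sendResult(631654, "kill response"); // channel is removed afterwards
           System.out.println("cached after sendResult: " + cache.channels.containsKey(631654));
       }
   }
   ```

   Running it prints true after sendAck and false after sendResult, which is the gap in behaviour this comment points at.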


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] xingchun-chen commented on issue #4296: [Bug][Module Name] After the workflow instance is stopped manually, it stays in the ready-to-stop state while the task instance keeps running.

Posted by GitBox <gi...@apache.org>.
xingchun-chen commented on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-749953273


   Please check the master log


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] xingchun-chen commented on issue #4296: [Bug] After the workflow instance is stopped manually, it stays in the ready-to-stop state; the corresponding task (a submitted Spark job) remains in the running state even though its applicationID on YARN has already finished.

Posted by GitBox <gi...@apache.org>.
xingchun-chen commented on issue #4296:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4296#issuecomment-750701305


   This PR has solved the problem: https://github.com/apache/incubator-dolphinscheduler/pull/2965


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org