Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/10/28 03:33:55 UTC

[GitHub] [dolphinscheduler] dahai1996 opened a new pull request, #12584: check for duplicate msg about rob taskGroup

dahai1996 opened a new pull request, #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584

   
   ## Purpose of the pull request
   When using a task group for jobs, we hit a bug. Here is the log:
   ```
   [INFO] 2022-10-25 09:00:00.140 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2929] - [WorkflowInstance-798][TaskInstance-28719] - Failed to rob taskGroup, taskInstanceId: 28719, taskGroupId: 26497
   [INFO] 2022-10-25 09:00:00.140 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable:[269] - [WorkflowInstance-798][TaskInstance-28719] - Begin to handle state event, StateEvent(key=798-28719, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=28719, taskCode=0, processInstanceId=798, context=null, channel=null)
   [INFO] 2022-10-25 09:00:00.159 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2929] - [WorkflowInstance-798][TaskInstance-28719] - Failed to rob taskGroup, taskInstanceId: 28719, taskGroupId: 26497
   [INFO] 2022-10-25 09:00:00.159 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable:[269] - [WorkflowInstance-798][TaskInstance-28719] - Begin to handle state event, StateEvent(key=798-28719, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=28719, taskCode=0, processInstanceId=798, context=null, channel=null)
   [INFO] 2022-10-25 09:00:00.178 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2929] - [WorkflowInstance-798][TaskInstance-28719] - Failed to rob taskGroup, taskInstanceId: 28719, taskGroupId: 26497
   [INFO] 2022-10-25 09:00:00.179 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable:[269] - [WorkflowInstance-798][TaskInstance-28719] - Begin to handle state event, StateEvent(key=798-28719, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=28719, taskCode=0, processInstanceId=798, context=null, channel=null)
   ```
   These two log lines keep recurring for the same task instance.
   As a result, the task group ends up with the wrong status for the task instance.
   I found that this is caused by duplicate messages, so this PR adds a check for duplicates.
   
   ## Verify this pull request
   This bug is hard to reproduce: it only happens when the "rob taskGroup" message is received repeatedly (perhaps the duplicate message is itself a bug?).
   After a few days, I captured these logs from an actual run:
   ```
   [INFO] 2022-10-28 04:02:51.179 +0800 org.apache.dolphinscheduler.server.master.processor.TaskEventProcessor:[64] - [WorkflowInstance-852][TaskInstance-32214] - Received task event change command, event: StateEvent(key=852-32214, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=32214, taskCode=0, processInstanceId=852, context=null, channel=null)
   [INFO] 2022-10-28 04:02:51.179 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThreadPool:[98] - [WorkflowInstance-852][TaskInstance-32214] - Submit state event success, stateEvent: StateEvent(key=852-32214, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=32214, taskCode=0, processInstanceId=852, context=null, channel=null)
   [INFO] 2022-10-28 04:02:51.197 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2942] - [WorkflowInstance-854][TaskInstance-32211] - This is a duplicate message,will not rob taskGroup, taskInstanceId: 32211, taskGroupId: 29988
   [INFO] 2022-10-28 04:02:51.198 +0800 TaskLogLogger-class org.apache.dolphinscheduler.server.master.runner.task.CommonTaskProcessor:[94] - [WorkflowInstance-854][TaskInstance-32211] - task ready to dispatch to worker: taskInstanceId: 32211
   ```
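
As an illustration only, the duplicate check described in this pull request might look like the sketch below. It is not the code from the PR: the class `TaskGroupRobSketch`, the `RobResult` values, and the in-memory map standing in for the task group queue table are all hypothetical, and the `ACQUIRE_SUCCESS` status name is an assumption. The idea is that a second "rob" request for an entry that already holds a slot is reported as a duplicate instead of a failure, so the caller has no reason to retry it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the task group bookkeeping; not a DolphinScheduler class.
class TaskGroupRobSketch {

    // Assumed status names for a queue entry.
    enum QueueStatus { WAIT_QUEUE, ACQUIRE_SUCCESS }

    // Three outcomes, so a duplicate can be told apart from a real failure.
    enum RobResult { ACQUIRED, DUPLICATE, FAILED }

    private final int groupSize;   // maximum number of slots in the group
    private int useSize = 0;       // slots currently taken
    // Queue entry id -> status, standing in for the task group queue table.
    private final Map<Integer, QueueStatus> queue = new HashMap<>();

    TaskGroupRobSketch(int groupSize) {
        this.groupSize = groupSize;
    }

    void enqueue(int queueId) {
        // Re-enqueueing an existing entry must not reset its status.
        queue.putIfAbsent(queueId, QueueStatus.WAIT_QUEUE);
    }

    RobResult rob(int queueId) {
        QueueStatus status = queue.get(queueId);
        if (status == QueueStatus.ACQUIRE_SUCCESS) {
            // The entry already holds a slot, so this request is a duplicate.
            return RobResult.DUPLICATE;
        }
        if (status == QueueStatus.WAIT_QUEUE && useSize < groupSize) {
            useSize++;
            queue.put(queueId, QueueStatus.ACQUIRE_SUCCESS);
            return RobResult.ACQUIRED;
        }
        // Unknown entry, or the group is full.
        return RobResult.FAILED;
    }

    public static void main(String[] args) {
        TaskGroupRobSketch group = new TaskGroupRobSketch(1);
        group.enqueue(28719);
        System.out.println(group.rob(28719)); // first request
        System.out.println(group.rob(28719)); // repeated request for the same entry
    }
}
```

Returning a distinct result for duplicates is a design choice of this sketch; the PR itself may simply treat a duplicate as already handled.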
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] caishunfeng commented on a diff in pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
caishunfeng commented on code in PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#discussion_r1007883331


##########
dolphinscheduler-service/src/main/java/org/apache/dolphinscheduler/service/process/ProcessServiceImpl.java:
##########
@@ -2488,6 +2488,11 @@ public boolean robTaskGroupResource(TaskGroupQueue taskGroupQueue) {
             this.taskGroupQueueMapper.updateById(taskGroupQueue);
             this.taskGroupQueueMapper.updateInQueue(Flag.NO.getCode(), taskGroupQueue.getId());
             return true;
+        }else if (taskGroupQueueMapper.selectCountByTaskIdAndStatus(taskGroupQueue.getId(),

Review Comment:
   Please check the code style; see https://dolphinscheduler.apache.org/en-us/docs/dev/user_doc/contribute/development-environment-setup.html





[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296807446

   > Please use English to describe. Thanks. @dahai1996
   
   This looks like #11399.
   I found that the event queue receives duplicate "rob task group" messages. For such a message the method "robTaskGroupResource" returns "false", and the queue does not remove a message when it gets "false", so the same message is handled again and we get the duplicate log "Failed to rob taskGroup...".
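
The retry behavior described above can be sketched as follows. This is a simplified, hypothetical model rather than the master server's actual event loop: `StateEventLoopSketch`, `drain`, and the `maxPasses` cap are names made up for this example. An event is removed from the queue only when its handler returns true, so a handler that always returns false keeps the same event at the head and is called again on every pass.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Hypothetical model of an event queue that retries unhandled events.
class StateEventLoopSketch {

    // Handles the head event repeatedly and removes it only on success.
    // Returns the number of handler calls made; maxPasses caps the loop so
    // that a handler which never succeeds cannot spin forever in this demo.
    static int drain(Deque<String> events, Predicate<String> handler, int maxPasses) {
        int calls = 0;
        while (!events.isEmpty() && calls < maxPasses) {
            String head = events.peek();
            calls++;
            if (handler.test(head)) {
                events.poll(); // handled: drop the event
            }
            // Not handled: the event stays at the head and is retried next pass.
        }
        return calls;
    }

    public static void main(String[] args) {
        Deque<String> events = new ArrayDeque<>();
        events.add("798-28719");
        // A handler that always fails, like a rob attempt that can never succeed.
        int calls = drain(events, key -> false, 3);
        System.out.println("handler calls: " + calls + ", events left: " + events.size());
    }
}
```

In this model, reporting a duplicate message as handled is what lets the queue drop it and stops the repeated log line.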
   




[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296464165

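   > Hi @dahai1996, which version does this fix target, 3.0.x or 3.1.x? Please link an issue for this PR.
   
   3.0.1. I will look for an existing issue that matches. Also, the bug reproduced again today; the cause was that my duplicate check did not cover entries whose status is WAIT_QUEUE. From what I have observed so far, this only happens when the event message is duplicated.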




[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296798237

   > Please use English to describe. Thanks. @dahai1996
   
   For version 3.0.1.




[GitHub] [dolphinscheduler] SbloodyS commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
SbloodyS commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296792486

   Please use English to describe. Thanks. @dahai1996




[GitHub] [dolphinscheduler] sonarcloud[bot] commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296894934

   SonarCloud Quality Gate failed. Dashboard: https://sonarcloud.io/dashboard?id=apache-dolphinscheduler&pullRequest=12584
   
   - 0 Bugs (rating A)
   - 0 Vulnerabilities (rating A)
   - 0 Security Hotspots (rating A)
   - 5 Code Smells (rating A)
   - 12.5% Coverage
   - 0.0% Duplication
   
   




[GitHub] [dolphinscheduler] caishunfeng commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
caishunfeng commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1294785840

   Hi @dahai1996, which version does this fix target, 3.0.x or 3.1.x? Please link an issue for this PR.




[GitHub] [dolphinscheduler] sonarcloud[bot] commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296889634

   SonarCloud Quality Gate failed. Dashboard: https://sonarcloud.io/dashboard?id=apache-dolphinscheduler&pullRequest=12584
   
   - 0 Bugs (rating A)
   - 0 Vulnerabilities (rating A)
   - 0 Security Hotspots (rating A)
   - 5 Code Smells (rating A)
   - 12.5% Coverage
   - 0.0% Duplication
   
   




[GitHub] [dolphinscheduler] dahai1996 closed pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
dahai1996 closed pull request #12584: check for duplicate msg about rob taskGroup
URL: https://github.com/apache/dolphinscheduler/pull/12584




[GitHub] [dolphinscheduler] dahai1996 commented on a diff in pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
dahai1996 commented on code in PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#discussion_r1008985811


##########
dolphinscheduler-service/src/main/java/org/apache/dolphinscheduler/service/process/ProcessServiceImpl.java:
##########
@@ -2488,6 +2488,11 @@ public boolean robTaskGroupResource(TaskGroupQueue taskGroupQueue) {
             this.taskGroupQueueMapper.updateById(taskGroupQueue);
             this.taskGroupQueueMapper.updateInQueue(Flag.NO.getCode(), taskGroupQueue.getId());
             return true;
+        }else if (taskGroupQueueMapper.selectCountByTaskIdAndStatus(taskGroupQueue.getId(),

Review Comment:
   ok~ 





[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup

Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296583838

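   > Hi @dahai1996, which version does this fix target, 3.0.x or 3.1.x? Please link an issue for this PR.
   
   The front-end behavior is the same as in issue #11399 (I have not seen that reporter's server logs): a single task in the workflow shows as finished, but the result is not propagated to the task group, and the status of the whole workflow cannot be updated afterwards, so the workflow is stuck. The log keeps printing "Failed to rob taskGroup..." in a loop.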

