Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/10/28 03:33:55 UTC
[GitHub] [dolphinscheduler] dahai1996 opened a new pull request, #12584: check for duplicate msg about rob taskGroup
dahai1996 opened a new pull request, #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584
## Purpose of the pull request
When using a task group for jobs, we hit a bug. Here is the log:
```
[INFO] 2022-10-25 09:00:00.140 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2929] - [WorkflowInstance-798][TaskInstance-28719] - Failed to rob taskGroup, taskInstanceId: 28719, taskGroupId: 26497
[INFO] 2022-10-25 09:00:00.140 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable:[269] - [WorkflowInstance-798][TaskInstance-28719] - Begin to handle state event, StateEvent(key=798-28719, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=28719, taskCode=0, processInstanceId=798, context=null, channel=null)
[INFO] 2022-10-25 09:00:00.159 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2929] - [WorkflowInstance-798][TaskInstance-28719] - Failed to rob taskGroup, taskInstanceId: 28719, taskGroupId: 26497
[INFO] 2022-10-25 09:00:00.159 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable:[269] - [WorkflowInstance-798][TaskInstance-28719] - Begin to handle state event, StateEvent(key=798-28719, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=28719, taskCode=0, processInstanceId=798, context=null, channel=null)
[INFO] 2022-10-25 09:00:00.178 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2929] - [WorkflowInstance-798][TaskInstance-28719] - Failed to rob taskGroup, taskInstanceId: 28719, taskGroupId: 26497
[INFO] 2022-10-25 09:00:00.179 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable:[269] - [WorkflowInstance-798][TaskInstance-28719] - Begin to handle state event, StateEvent(key=798-28719, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=28719, taskCode=0, processInstanceId=798, context=null, channel=null)
```
The log keeps recurring.
As a result, the task group ends up with the wrong task instance status.
I found that this is caused by duplicate messages, so this PR adds code to check for duplicates.
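
To make the idea of the duplicate check concrete, here is a minimal, self-contained sketch. It is not the code in this PR: the class `DuplicateRobCheck`, its `Status` enum, and its methods are hypothetical, and the task group is modeled as an in-memory map instead of the real queue table. It also assumes that a duplicate message should be reported as handled without taking a second slot.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: the class, enum, and method names below are
// hypothetical and are not taken from the DolphinScheduler code base.
final class DuplicateRobCheck {

    enum Status { WAIT_QUEUE, ACQUIRED }

    // Task instance id -> current status in the (modeled) task group queue.
    private final Map<Integer, Status> statusByTaskId = new HashMap<>();
    private int freeSlots;

    DuplicateRobCheck(int freeSlots) {
        this.freeSlots = freeSlots;
    }

    /** Registers a task instance that is waiting for a task group slot. */
    void enqueue(int taskId) {
        statusByTaskId.put(taskId, Status.WAIT_QUEUE);
    }

    /**
     * Handles one "rob task group" message. Returns true when the message is
     * done with (slot acquired now, or already acquired by an earlier copy of
     * the same message) and false when the task must keep waiting.
     */
    boolean rob(int taskId) {
        Status status = statusByTaskId.get(taskId);
        if (status == Status.ACQUIRED) {
            // Duplicate message: do not take a second slot, but report success
            // so the caller can drop the message instead of retrying it.
            return true;
        }
        if (status == Status.WAIT_QUEUE && freeSlots > 0) {
            freeSlots--;
            statusByTaskId.put(taskId, Status.ACQUIRED);
            return true;
        }
        return false; // unknown task, or no free slot yet
    }

    int freeSlots() {
        return freeSlots;
    }

    public static void main(String[] args) {
        DuplicateRobCheck group = new DuplicateRobCheck(1);
        group.enqueue(28719);
        System.out.println(group.rob(28719)); // first message
        System.out.println(group.rob(28719)); // duplicate of the same message
        System.out.println(group.freeSlots());
    }
}
```

In this model the second call for the same task instance is recognized as a duplicate and does not consume another slot, which is the behavior the check is meant to provide.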
## Verify this pull request
This bug is hard to reproduce: it only happens when we repeatedly receive the "rob taskGroup" message (maybe the duplicate message is itself a bug?).
After a few days, I got these logs from an actual run:
```
[INFO] 2022-10-28 04:02:51.179 +0800 org.apache.dolphinscheduler.server.master.processor.TaskEventProcessor:[64] - [WorkflowInstance-852][TaskInstance-32214] - Received task event change command, event: StateEvent(key=852-32214, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=32214, taskCode=0, processInstanceId=852, context=null, channel=null)
[INFO] 2022-10-28 04:02:51.179 +0800 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThreadPool:[98] - [WorkflowInstance-852][TaskInstance-32214] - Submit state event success, stateEvent: StateEvent(key=852-32214, type=WAIT_TASK_GROUP, executionStatus=null, taskInstanceId=32214, taskCode=0, processInstanceId=852, context=null, channel=null)
[INFO] 2022-10-28 04:02:51.197 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[2942] - [WorkflowInstance-854][TaskInstance-32211] - This is a duplicate message,will not rob taskGroup, taskInstanceId: 32211, taskGroupId: 29988
[INFO] 2022-10-28 04:02:51.198 +0800 TaskLogLogger-class org.apache.dolphinscheduler.server.master.runner.task.CommonTaskProcessor:[94] - [WorkflowInstance-854][TaskInstance-32211] - task ready to dispatch to worker: taskInstanceId: 32211
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [dolphinscheduler] caishunfeng commented on a diff in pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
caishunfeng commented on code in PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#discussion_r1007883331
##########
dolphinscheduler-service/src/main/java/org/apache/dolphinscheduler/service/process/ProcessServiceImpl.java:
##########
@@ -2488,6 +2488,11 @@ public boolean robTaskGroupResource(TaskGroupQueue taskGroupQueue) {
this.taskGroupQueueMapper.updateById(taskGroupQueue);
this.taskGroupQueueMapper.updateInQueue(Flag.NO.getCode(), taskGroupQueue.getId());
return true;
+ }else if (taskGroupQueueMapper.selectCountByTaskIdAndStatus(taskGroupQueue.getId(),
Review Comment:
Please check the code style; see https://dolphinscheduler.apache.org/en-us/docs/dev/user_doc/contribute/development-environment-setup.html
[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296807446
> Please use english to describe. Thanks. @dahai1996
It looks like #11399.
I found that the event queue receives duplicate "rob task group" messages. For a duplicate, the method "robTaskGroupResource" in ProcessServiceImpl returns "false", and the queue does not remove a message when it gets "false", so we keep getting the log "Failed to rob taskGroup...".
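
The loop described above can be sketched with a simplified model. This is an assumption-laden illustration, not the master server's event handling: `RetryingEventQueue` and `drain` are hypothetical names, and the cap on attempts exists only so that the demo terminates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Simplified, hypothetical model of a queue that keeps an event until its
// handler returns true. Not taken from the DolphinScheduler code base.
final class RetryingEventQueue {

    /**
     * Processes events from the head of the queue. An event is removed only
     * when the handler returns true; otherwise it stays at the head and is
     * handled again. Returns the number of handler invocations, capped at
     * maxAttempts so the demo cannot spin forever.
     */
    static <E> int drain(Deque<E> events, Predicate<E> handler, int maxAttempts) {
        int attempts = 0;
        while (!events.isEmpty() && attempts < maxAttempts) {
            attempts++;
            if (handler.test(events.peek())) {
                events.poll(); // handled: remove the event
            }
            // On false the event is left in place, so it is retried.
        }
        return attempts;
    }

    public static void main(String[] args) {
        Deque<String> events = new ArrayDeque<>();
        events.add("798-28719");
        events.add("798-28719"); // duplicate of the first event

        // Handler without a duplicate check: only the first copy succeeds,
        // the second copy fails on every retry.
        Set<String> robbed = new HashSet<>();
        int attempts = drain(events, robbed::add, 10);
        System.out.println("attempts=" + attempts + ", left=" + events.size());
    }
}
```

With a handler that reports a duplicate as already handled (returns true), the same queue empties after two invocations instead of retrying the second copy until the cap.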
[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296464165
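> Hi @dahai1996 which version was fixed? 3.0.x or 3.1.x? Please link an issue for this PR.
3.0.1. I will look for a matching issue. Also, the bug reproduced again today, because my duplicate check did not cover entries whose status is WAIT_QUEUE. From what I have observed so far, the problem only occurs when the event message is delivered more than once.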
[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296798237
> Please use english to describe. Thanks. @dahai1996
This is for version 3.0.1.
[GitHub] [dolphinscheduler] SbloodyS commented on pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
SbloodyS commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296792486
Please use english to describe. Thanks. @dahai1996
[GitHub] [dolphinscheduler] sonarcloud[bot] commented on pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296894934
SonarCloud Quality Gate failed: https://sonarcloud.io/dashboard?id=apache-dolphinscheduler&pullRequest=12584
[0 Bugs](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12584&resolved=false&types=BUG) (rating A)
[0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12584&resolved=false&types=VULNERABILITY) (rating A)
[0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache-dolphinscheduler&pullRequest=12584&resolved=false&types=SECURITY_HOTSPOT) (rating A)
[5 Code Smells](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12584&resolved=false&types=CODE_SMELL) (rating A)
[12.5% Coverage](https://sonarcloud.io/component_measures?id=apache-dolphinscheduler&pullRequest=12584&metric=new_coverage&view=list)
[0.0% Duplication](https://sonarcloud.io/component_measures?id=apache-dolphinscheduler&pullRequest=12584&metric=new_duplicated_lines_density&view=list)
[GitHub] [dolphinscheduler] caishunfeng commented on pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
caishunfeng commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1294785840
Hi @dahai1996 which version was fixed? 3.0.x or 3.1.x? Please link an issue for this PR.
[GitHub] [dolphinscheduler] dahai1996 closed pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
dahai1996 closed pull request #12584: check for duplicate msg about rob taskGroup
URL: https://github.com/apache/dolphinscheduler/pull/12584
[GitHub] [dolphinscheduler] dahai1996 commented on a diff in pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
dahai1996 commented on code in PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#discussion_r1008985811
##########
dolphinscheduler-service/src/main/java/org/apache/dolphinscheduler/service/process/ProcessServiceImpl.java:
##########
@@ -2488,6 +2488,11 @@ public boolean robTaskGroupResource(TaskGroupQueue taskGroupQueue) {
this.taskGroupQueueMapper.updateById(taskGroupQueue);
this.taskGroupQueueMapper.updateInQueue(Flag.NO.getCode(), taskGroupQueue.getId());
return true;
+ }else if (taskGroupQueueMapper.selectCountByTaskIdAndStatus(taskGroupQueue.getId(),
Review Comment:
ok~
[GitHub] [dolphinscheduler] dahai1996 commented on pull request #12584: check for duplicate msg about rob taskGroup
Posted by GitBox <gi...@apache.org>.
dahai1996 commented on PR #12584:
URL: https://github.com/apache/dolphinscheduler/pull/12584#issuecomment-1296583838
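> Hi @dahai1996 which version was fixed? 3.0.x or 3.1.x? Please link an issue for this PR.
The front-end behavior is the same as in issue #11399 (I have not seen that reporter's server logs): a single task in the workflow shows as finished, but this cannot be updated in the task group, the state of the whole workflow cannot be updated afterwards, and the workflow hangs. The log keeps printing "Failed to rob taskGroup..." in a loop.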