Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/10/21 07:52:01 UTC

[GitHub] [dolphinscheduler] wcmolin opened a new issue, #12482: [Bug] [Master] check for dependency task instance need failover is error

wcmolin opened a new issue, #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   Failover (fault-tolerant execution) is triggered while a dependent task instance is running. During failover, `MasterRegistryClient.checkTaskAfterWorkerStart` has to determine whether the task instance was started after the worker node started. The check returns the wrong result, so the task is failed over when it should not be.
   
   Code in `org.apache.dolphinscheduler.server.master.registry.MasterRegistryClient`:
   ```java
       private Date getServerStartupTime(List<Server> servers, String host) {
           if (CollectionUtils.isEmpty(servers)) {
               return null;
           }
           Date serverStartupTime = null;
           for (Server server : servers) {
               logger.error("host: {}, server host: {}", host, server.getHost() + Constants.COLON + server.getPort());
               if (host.equals(server.getHost() + Constants.COLON + server.getPort())) {
                   serverStartupTime = server.getCreateTime();
                   break;
               }
           }
           return serverStartupTime;
       }
   ```
   The log line it prints:
   ```log
   [ERROR] 2022-10-21 15:49:37.219 org.apache.dolphinscheduler.server.master.registry.MasterRegistryClient:[338] - host: 10.66.76.129:5678, server host: 10.66.76.129:1234
   ```
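   
   The mismatch in that log line can be reproduced in isolation. The following is a minimal sketch, not the DolphinScheduler implementation: the `Server` class is a hypothetical stand-in for the registry entry, and `startedAfterServer` is an assumed simplification of what `checkTaskAfterWorkerStart` does with the lookup result. It assumes, as the log suggests, that the task instance's host carries port 5678 while the only registered server entry carries port 1234.
   
   ```java
   import java.util.Date;
   import java.util.List;
   
   class FailoverHostCheck {
   
       // Hypothetical stand-in for a registry server entry (not the real Server class).
       static final class Server {
           final String host;
           final int port;
           final Date createTime;
   
           Server(String host, int port, Date createTime) {
               this.host = host;
               this.port = port;
               this.createTime = createTime;
           }
       }
   
       // Same matching rule as the reported method: compare "host:port" strings.
       static Date getServerStartupTime(List<Server> servers, String hostAndPort) {
           if (servers == null || servers.isEmpty()) {
               return null;
           }
           for (Server server : servers) {
               if (hostAndPort.equals(server.host + ":" + server.port)) {
                   return server.createTime;
               }
           }
           return null;
       }
   
       // Assumed simplification: with no matching server, the task is never
       // considered "started after the server", whatever its start time is.
       static boolean startedAfterServer(List<Server> servers, String taskHost, Date taskStart) {
           Date startup = getServerStartupTime(servers, taskHost);
           return startup != null && taskStart.after(startup);
       }
   
       public static void main(String[] args) {
           Date serverStartup = new Date(1_000L);
           Date taskStart = new Date(2_000L);
           List<Server> servers = List.of(new Server("10.66.76.129", 1234, serverStartup));
   
           // Ports differ, so the lookup finds nothing.
           System.out.println(startedAfterServer(servers, "10.66.76.129:5678", taskStart)); // false
           // Same host and port, and the task started later.
           System.out.println(startedAfterServer(servers, "10.66.76.129:1234", taskStart)); // true
       }
   }
   ```
   
   Under these assumptions, a task whose host never appears in the server list always falls through to the "not started after the server" branch, which matches the behavior described above.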
   
   ### What you expected to happen
   
   The dependent task instance should not be failed over.
   
   ### How to reproduce
   
   Add a dependent node that depends on a process that has not been executed yet. Once the failover thread has run, the logs show that the dependent node instance has been failed over:
   
   ```
   
   [INFO] 2022-10-21 05:00:55.308 org.apache.dolphinscheduler.server.master.runner.FailoverExecuteThread:[68] - failover execute started
   [INFO] 2022-10-21 05:00:55.314 org.apache.dolphinscheduler.server.master.runner.FailoverExecuteThread:[74] - need failover hosts:[192.168.191.2:5678]
   [INFO] 2022-10-21 05:00:55.340 org.apache.dolphinscheduler.server.master.registry.MasterRegistryClient:[424] - start master[192.168.191.2:5678] failover, process list size:3
   [INFO] 2022-10-21 05:00:55.346 org.apache.dolphinscheduler.server.master.registry.MasterRegistryClient:[442] - failover task instance id: 8758, process instance id: 6178
   [INFO] 2022-10-21 05:00:55.375 org.apache.dolphinscheduler.server.master.registry.MasterRegistryClient:[456] - master[192.168.191.2:5678] failover end, useTime:48ms
   [INFO] 2022-10-21 05:00:56.007 org.apache.dolphinscheduler.server.master.runner.EventExecuteService:[127] - handle process instance : 6178 , events count:1
   [INFO] 2022-10-21 05:00:56.007 org.apache.dolphinscheduler.server.master.runner.EventExecuteService:[130] - already exists handler process size:0
   [INFO] 2022-10-21 05:00:56.008 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[308] - process event: State Event :key: null type: TASK_STATE_CHANGE executeStatus: NEED_FAULT_TOLERANCE task instance id: 8758 process instance id: 6178 context: null
   [INFO] 2022-10-21 05:00:56.010 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[422] - work flow 6178 task 8758 state:NEED_FAULT_TOLERANCE 
   [INFO] 2022-10-21 05:00:56.010 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[424] - resubmit NEED_FAULT_TOLERANCE dependent task
   ```
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   2.0.x
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] wcmolin commented on issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
wcmolin commented on issue #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482#issuecomment-1287637031

   I can try it, can I just fix the 2.0.x version? Which branch should I submit the PR to?




[GitHub] [dolphinscheduler] wcmolin commented on issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
wcmolin commented on issue #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482#issuecomment-1288348045

   The code structure of the dev version and the 2.x version is inconsistent. If I fix the dev version, will the 2.x version be fixed by someone else performing cherry-pick? @jieguangzhou 




[GitHub] [dolphinscheduler] JinyLeeChina closed issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
JinyLeeChina closed issue #12482: [Bug] [Master] check for dependency task instance need failover is error
URL: https://github.com/apache/dolphinscheduler/issues/12482




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482#issuecomment-1286595233

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * To help us understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to channel `#troubleshooting`




[GitHub] [dolphinscheduler] davidzollo commented on issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
davidzollo commented on issue #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482#issuecomment-1287635716

   hi @wcmolin , can you fix this issue?




[GitHub] [dolphinscheduler] jieguangzhou commented on issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
jieguangzhou commented on issue #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482#issuecomment-1288335191

   > > hi @wcmolin , can you fix this issue?
   > 
   > I can try it, can I just fix the 2.0.x version? Which branch should I submit the PR to? @davidzollo @jieguangzhou
   
   Please check whether the dev branch has this problem; if it does not, submit the PR to 2.0.8-prepare.
   Thanks




[GitHub] [dolphinscheduler] wcmolin commented on issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
wcmolin commented on issue #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482#issuecomment-1287637108

   > hi @wcmolin , can you fix this issue?
   
   I can try it, can I just fix the 2.0.x version? Which branch should I submit the PR to?




[GitHub] [dolphinscheduler] jieguangzhou commented on issue #12482: [Bug] [Master] check for dependency task instance need failover is error

Posted by GitBox <gi...@apache.org>.
jieguangzhou commented on issue #12482:
URL: https://github.com/apache/dolphinscheduler/issues/12482#issuecomment-1288351814

   
   
   > The code structure of the dev version and the 2.x version is inconsistent. If I fix the dev version, will the 2.x version be fixed by someone else performing cherry-pick? @jieguangzhou
   
   The dev branch may already have fixed it, so please check whether dev still has this problem.
   
   If dev has the same problem, you can fix it there, and someone will cherry-pick the fix to 2.x.

