Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2020/12/15 02:32:10 UTC

[GitHub] [incubator-dolphinscheduler] crazycarry opened a new issue #4226: [Bug][master server ] when get exception,dead loop can not get out

crazycarry opened a new issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226


   
   ### After upgrading DS to 1.3.3, I found the same bug as before, in the class MasterSchedulerService
   
   ```
       public void run() {
           logger.info("master scheduler started");
           while (Stopper.isRunning()){
               InterProcessMutex mutex = null;
               try {
                   boolean runCheckFlag = OSUtils.checkResource(masterConfig.getMasterMaxCpuloadAvg(), masterConfig.getMasterReservedMemory());
                   if(!runCheckFlag) {
                       Thread.sleep(Constants.SLEEP_TIME_MILLIS);
                       continue;
                   }
                   if (zkMasterClient.getZkClient().getState() == CuratorFrameworkState.STARTED) {
   
                       mutex = zkMasterClient.blockAcquireMutex();
   
                       int activeCount = masterExecService.getActiveCount();
                       // make sure to scan and delete command  table in one transaction
                       Command command = processService.findOneCommand();
                       if (command != null) {
                           logger.info("find one command: id: {}, type: {}", command.getId(),command.getCommandType());
   
                           try{
   
                               ProcessInstance processInstance = processService.handleCommand(logger,
                                       getLocalAddress(),
                                       this.masterConfig.getMasterExecThreads() - activeCount, command);
                               if (processInstance != null) {
                                   logger.info("start master exec thread , split DAG ...");
                                   masterExecService.execute(new MasterExecThread(processInstance, processService, nettyRemotingClient));
                               }
                           }catch (Exception e){
                               logger.error("scan command error ", e);
                               processService.moveToErrorCommand(command, e.toString());
                           }
                       } else{
                           // no command was found, so sleep for 1s
                           Thread.sleep(Constants.SLEEP_TIME_MILLIS);
                       }
                   }
               } catch (Exception e){
                   logger.error("master scheduler thread error",e);
               } finally{
                   zkMasterClient.releaseMutex(mutex);
               }
           }
       }
   
   ```
   
   When the database returns an error, or some other exception is thrown, the loop has no way to exit.
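
A minimal sketch of one way such a loop could be bounded, offered as an illustration and not as the project's fix. It assumes a hypothetical helper `BoundedRetryLoop.runLoop` (not a DolphinScheduler class) that counts consecutive failures, sleeps with a capped exponential back-off after each one, and gives up once a hypothetical threshold is reached. The threshold and sleep values are assumptions.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of DolphinScheduler.
class BoundedRetryLoop {

    // Upper bound on a single back-off sleep (assumed value).
    private static final long MAX_SLEEP_MILLIS = 30_000L;

    /**
     * Calls step repeatedly until it returns false, or until it has thrown
     * maxConsecutiveFailures times in a row.
     * Returns 0 when step asked to stop, otherwise the number of consecutive
     * failures seen when the loop gave up or the thread was interrupted.
     */
    static int runLoop(Callable<Boolean> step, int maxConsecutiveFailures, long baseSleepMillis) {
        int failures = 0;
        while (true) {
            try {
                boolean keepGoing = step.call();
                failures = 0; // any success resets the failure streak
                if (!keepGoing) {
                    return 0;
                }
            } catch (Exception e) {
                failures++;
                if (failures >= maxConsecutiveFailures) {
                    return failures; // give the caller a chance to alert or shut down
                }
                // Exponential back-off: base, 2*base, 4*base, ... capped at MAX_SLEEP_MILLIS.
                long sleepMillis = Math.min(baseSleepMillis << Math.min(failures - 1, 10), MAX_SLEEP_MILLIS);
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return failures;
                }
            }
        }
    }

    public static void main(String[] args) {
        // A step that always fails, standing in for a database that is down.
        int failures = runLoop(() -> {
            throw new IllegalStateException("simulated database error");
        }, 3, 10L);
        System.out.println("stopped after consecutive failures: " + failures);
    }
}
```

Whether the master should ever give up is a design decision; the same helper with a very large threshold only adds the back-off and never exits in practice.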


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] CalvinKirs commented on issue #4226: [Bug][master server ] when get exception,dead loop can not get out

Posted by GitBox <gi...@apache.org>.
CalvinKirs commented on issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226#issuecomment-751232596


   I don't think it should be closed here. There is no difference between closing it and shutting down the master server.





[GitHub] [incubator-dolphinscheduler] CalvinKirs commented on issue #4226: [Bug][master server ] when get exception,dead loop can not get out

Posted by GitBox <gi...@apache.org>.
CalvinKirs commented on issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226#issuecomment-751608501


   > I will fix it.
   
   Many thanks for your enthusiastic participation. If you are interested, you can look for tasks you like in https://github.com/apache/incubator-dolphinscheduler/issues/4124. If you need any help, you can also leave a message on the issue right away.





[GitHub] [incubator-dolphinscheduler] CalvinKirs closed issue #4226: [Bug][master server ] when get exception,dead loop can not get out

Posted by GitBox <gi...@apache.org>.
CalvinKirs closed issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226


   





[GitHub] [incubator-dolphinscheduler] zt-1997 commented on issue #4226: [Bug][master server ] when get exception,dead loop can not get out

Posted by GitBox <gi...@apache.org>.
zt-1997 commented on issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226#issuecomment-751232246


   I will fix it.





[GitHub] [incubator-dolphinscheduler] CalvinKirs edited a comment on issue #4226: [Bug][master server ] when get exception,dead loop can not get out

Posted by GitBox <gi...@apache.org>.
CalvinKirs edited a comment on issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226#issuecomment-751232596


   I don't think it should be closed here. There is no difference between exiting and shutting down the master server.





[GitHub] [incubator-dolphinscheduler] CalvinKirs commented on issue #4226: [Bug][master server ] when get exception,dead loop can not get out

Posted by GitBox <gi...@apache.org>.
CalvinKirs commented on issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226#issuecomment-751607857


   Many thanks for your enthusiastic participation. Regarding this question, I don't think it is a bug. When such problems occur, what we should consider is whether MySQL has a health problem, whether there is a version compatibility problem, and so on. As you can see, we have a sleep time here, which is in effect a design based on the Circuit Breaker idea. If you quit abruptly, it's the same as shutting down the service.
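
For readers unfamiliar with the Circuit Breaker idea mentioned above, here is a minimal sketch under stated assumptions. `SimpleCircuitBreaker` is a hypothetical class written for illustration, not DolphinScheduler code: after a configurable number of consecutive failures it rejects calls for a fixed cool-down period, then lets a trial call through. The clock is injected so the behaviour can be checked without real waiting.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration of a circuit breaker; not part of DolphinScheduler.
class SimpleCircuitBreaker {
    private final int failureThreshold;   // consecutive failures before opening
    private final long openMillis;        // how long to reject calls once open
    private final LongSupplier clock;     // time source in milliseconds
    private int consecutiveFailures = 0;
    private long openedAt = -1L;          // -1 means the breaker is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Returns true if the caller may attempt the protected operation now. */
    boolean allowRequest() {
        if (openedAt < 0) {
            return true; // closed: calls flow normally
        }
        // Open: reject until the cool-down has elapsed, then allow a trial call.
        return clock.getAsLong() - openedAt >= openMillis;
    }

    /** A success closes the breaker and clears the failure streak. */
    void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = -1L;
    }

    /** A failure at or past the threshold (re)opens the breaker from now. */
    void recordFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = clock.getAsLong();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 1_000L, System::currentTimeMillis);
        breaker.recordFailure();
        breaker.recordFailure();
        System.out.println("allowed right after two failures: " + breaker.allowRequest());
    }
}
```

In a scheduler loop, the thread would call `allowRequest()` before touching the database and sleep when it returns false, so the process stays up while the dependency recovers.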

