Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2021/02/02 03:05:37 UTC

[GitHub] [incubator-dolphinscheduler] felix-thinkingdata commented on a change in pull request #4645: [Improvement][alert ] Multiple instances of alert service

felix-thinkingdata commented on a change in pull request #4645:
URL: https://github.com/apache/incubator-dolphinscheduler/pull/4645#discussion_r568291024



##########
File path: dolphinscheduler-alert/src/main/java/org/apache/dolphinscheduler/alert/AlertServer.java
##########
@@ -129,9 +134,20 @@ private void runSender() {
             if (alertPluginManager == null || alertPluginManager.getAlertChannelMap().size() == 0) {
                 logger.warn("No Alert Plugin . Can not send alert info. ");
             } else {
-                List<Alert> alerts = alertDao.listWaitExecutionAlert();
-                alertSender = new AlertSender(alerts, alertDao, alertPluginManager);
-                alertSender.run();
+                InterProcessMutex mutex = null;
+                try {
+                    logger.error("创建分布式锁 : ");
+                    mutex = zookeeperClient.getAlertLockPath();
+                    mutex.acquire();
+                    List<Alert> alerts = alertDao.listWaitExecutionAlert();
+                    alertSender = new AlertSender(alerts, alertDao, alertPluginManager);
+                    alertSender.run();
+                } catch (Exception e) {
+                    logger.error("alert server with error : ", e);
+                } finally {
+                    zookeeperClient.release(mutex);
+
+                }

Review comment:
   > The distributed lock is so coarse-grained that only one alert-server is active at any one time. If so, deploying multiple alert-server nodes seems unnecessary. The alert-server distributed lock should be at least task-level in granularity.
   
   Yes. At present the alert services take turns querying the database, in order. This is a distributed locking strategy implemented by the framework. For example, with /XXXX-XX-XX-001 and XXXX-XX-XX-002: when 001 releases the lock, the process that owns 002 is notified and continues. This is a simple way to support multiple alert instances, and it adds some reliability. A better solution can be discussed later.
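
To make the ordering described above concrete, here is a minimal in-memory sketch of a sequence-ordered lock queue. It is only an illustration: it does not use ZooKeeper or the framework's lock, and the class name `SequentialLockSketch` and its methods are hypothetical. It assumes each contender receives an increasing sequence number and that the lowest number holds the lock.

```java
import java.util.TreeMap;

// In-memory illustration only: no ZooKeeper client is involved here.
// All names in this class are hypothetical and exist just for this sketch.
class SequentialLockSketch {
    // Contenders keyed by sequence number, lowest first (like sorted lock nodes).
    private final TreeMap<Integer, String> queue = new TreeMap<>();
    private int nextSequence = 1;

    /** Registers a contender and returns its sequence number (1 for -001, 2 for -002, ...). */
    int enqueue(String owner) {
        int seq = nextSequence++;
        queue.put(seq, owner);
        return seq;
    }

    /** The contender with the lowest sequence number holds the lock; null if nobody is queued. */
    String currentHolder() {
        return queue.isEmpty() ? null : queue.firstEntry().getValue();
    }

    /** Removes the given entry and returns the owner that holds the lock afterwards, or null. */
    String release(int seq) {
        queue.remove(seq);
        return currentHolder();
    }

    public static void main(String[] args) {
        SequentialLockSketch lock = new SequentialLockSketch();
        int first = lock.enqueue("alert-server-A");  // plays the role of ...-001
        int second = lock.enqueue("alert-server-B"); // plays the role of ...-002
        System.out.println("holder: " + lock.currentHolder());
        System.out.println("holder after first release: " + lock.release(first));
        lock.release(second);
    }
}
```

Under these assumptions only one alert instance works at a time, which is the coarse granularity the quoted comment points out; a finer-grained scheme would need one lock per alert or task rather than one shared lock.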
   



