Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2022/08/09 03:57:45 UTC

[GitHub] [iotdb] 23931017wu opened a new pull request, #6923: [IOTDB-4038] Add the leader metrics to the cluster

23931017wu opened a new pull request, #6923:
URL: https://github.com/apache/iotdb/pull/6923

   ## Add the leader metrics to the cluster
   
   ### After
   ![image](https://user-images.githubusercontent.com/71131924/183560985-ed94935e-a4cd-4511-b229-9e7549274f9f.png)
   ![image](https://user-images.githubusercontent.com/71131924/183561024-1e316388-6c49-4dc1-9373-dc8471899f24.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [iotdb] 23931017wu commented on pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
23931017wu commented on PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#issuecomment-1211492014

   The modified name attribute now uses just the "IP:port" structure, so I removed the outer "EndPoint()" wrapper.
   ![image](https://user-images.githubusercontent.com/71131924/184055225-fae74106-ec9e-4b4f-a74b-e5bb38c381ea.png)
   ![image](https://user-images.githubusercontent.com/71131924/184055274-051b363a-9bbb-49d4-9237-0c625542e403.png)
   




[GitHub] [iotdb] Beyyes commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
Beyyes commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r940949011


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+    regionGroupCacheMap.forEach(

Review Comment:
   Lines 602-610 can be replaced with `regionGroupCacheMap.forEach((consensusGroupId, regionGroupCache) -> idToCountMap.merge(regionGroupCache.getLeaderDataNodeId(), 1, Integer::sum));`
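
The `Map.merge` idiom suggested in this comment can be sketched in isolation as follows. This is a minimal illustration, not the PR's code: the class name `LeaderCountMergeSketch`, the method `countLeaders`, and the plain `Map<String, Integer>` standing in for `regionGroupCacheMap` are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class LeaderCountMergeSketch {

  // Counts how many consensus groups each DataNode leads.
  // groupIdToLeaderId is a hypothetical stand-in for regionGroupCacheMap:
  // key = consensus group id, value = id of the DataNode that leads it.
  static Map<Integer, Integer> countLeaders(Map<String, Integer> groupIdToLeaderId) {
    // Map<DataNodeId, leaderCount>
    Map<Integer, Integer> idToCountMap = new HashMap<>();
    // merge() stores 1 when the key is absent, otherwise adds 1 to the old value,
    // which replaces the explicit get / null check / put sequence.
    groupIdToLeaderId.forEach(
        (groupId, leaderId) -> idToCountMap.merge(leaderId, 1, Integer::sum));
    return idToCountMap;
  }

  public static void main(String[] args) {
    Map<String, Integer> leaders = new HashMap<>();
    leaders.put("SchemaRegion-0", 1);
    leaders.put("DataRegion-1", 1);
    leaders.put("DataRegion-2", 2);
    System.out.println(countLeaders(leaders));
  }
}
```

In this sketch DataNode 1 leads two groups and DataNode 2 leads one, so the resulting map holds a count of 2 for key 1 and 1 for key 2.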



##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+    regionGroupCacheMap.forEach(
+        (consensusGroupId, regionGroupCache) -> {
+          Integer count = idToCountMap.get(regionGroupCache.getLeaderDataNodeId());
+          if (count == null) {
+            idToCountMap.put(regionGroupCache.getLeaderDataNodeId(), 1);
+          } else {
+            idToCountMap.put(regionGroupCache.getLeaderDataNodeId(), count + 1);
+          }
+        });
+    return idToCountMap;
+  }
+
+  public void addLeaderCount() {
+    Map<Integer, Integer> idToCountMap = getLeadershipCountByDatanode();
+    getNodeManager()
+        .getRegisteredDataNodes(-1)
+        .forEach(
+            dataNodeInfo -> {
+              TDataNodeLocation dataNodeLocation = dataNodeInfo.getLocation();
+              int dataNodeId = dataNodeLocation.getDataNodeId();
+              if (idToCountMap.containsKey(dataNodeId)) {
+                String name =

Review Comment:
   I think this line can be extracted into a common method.
   It is used 5 times.
   
   ![image](https://user-images.githubusercontent.com/6756545/183580729-37d638d8-884d-44cb-816b-9ae353375566.png)
   



##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {

Review Comment:
   Can this method be private?





[GitHub] [iotdb] CRZbulabula commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
CRZbulabula commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r942027241


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();

Review Comment:
   Add a comment to this variable, like:
   
   `// Map<DataNodeId, leaderCount>`
   
   By the way, you can use `Map<Integer, AtomicInteger>` and `map.computeIfAbsent` to simplify the code that follows.
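
The `computeIfAbsent` variant mentioned in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the class `LeaderCountAtomicSketch`, the method `countLeaders`, and the `List<Integer>` of leader ids are hypothetical and do not come from the PR.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class LeaderCountAtomicSketch {

  // leaderIds is a hypothetical input: one entry per consensus group,
  // holding the id of the DataNode that currently leads that group.
  static Map<Integer, AtomicInteger> countLeaders(List<Integer> leaderIds) {
    // Map<DataNodeId, leaderCount>
    Map<Integer, AtomicInteger> idToCountMap = new ConcurrentHashMap<>();
    for (int leaderId : leaderIds) {
      // Create the counter on first sight of this DataNode, then increment it.
      idToCountMap.computeIfAbsent(leaderId, id -> new AtomicInteger()).incrementAndGet();
    }
    return idToCountMap;
  }
}
```

Compared with `merge`, this keeps a mutable counter per key, which may be convenient if other threads later need to adjust an individual count in place.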



##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+    regionGroupCacheMap.forEach(
+        (consensusGroupId, regionGroupCache) -> {
+          Integer count = idToCountMap.get(regionGroupCache.getLeaderDataNodeId());
+          if (count == null) {
+            idToCountMap.put(regionGroupCache.getLeaderDataNodeId(), 1);
+          } else {
+            idToCountMap.put(regionGroupCache.getLeaderDataNodeId(), count + 1);
+          }

Review Comment:
   It would be better to use the getAllLeadership interface to implement this code~





[GitHub] [iotdb] CRZbulabula commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
CRZbulabula commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r942027241


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();

Review Comment:
   You can use `Map<Integer, AtomicInteger>` and `map.computeIfAbsent` to simplify the code that follows.





[GitHub] [iotdb] Beyyes commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
Beyyes commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r940949331


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {

Review Comment:
   Would private be better?





[GitHub] [iotdb] 23931017wu commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
23931017wu commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r941175170


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+    regionGroupCacheMap.forEach(

Review Comment:
   Okay, learned ~





[GitHub] [iotdb] SpriCoder commented on pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
SpriCoder commented on PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#issuecomment-1210046040

   Besides, please update the docs for the new metrics at the same time.




[GitHub] [iotdb] Beyyes commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
Beyyes commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r940971922


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -327,7 +327,8 @@ private void updateNodeLoadStatistic() {
       broadcastLatestRegionRouteMap();
     }
     if (nodeCacheMap.size() == getNodeManager().getRegisteredNodeCount()) {
-      addMetrics();
+      addNodeMetrics();
+      addLeaderCount();

Review Comment:
   The updateNodeLoadStatistic method is executed once per second. Do `addNodeMetrics();` and `addLeaderCount();` need to be executed so frequently?
   
   ![image](https://user-images.githubusercontent.com/6756545/183584881-ef32465c-b765-47a2-bd36-baa150bc9601.png)
   





[GitHub] [iotdb] Beyyes commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
Beyyes commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r943441472


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManagerMetrics.java:
##########
@@ -0,0 +1,277 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iotdb.confignode.manager.load;
+
+import org.apache.iotdb.common.rpc.thrift.TConfigNodeLocation;
+import org.apache.iotdb.common.rpc.thrift.TDataNodeConfiguration;
+import org.apache.iotdb.common.rpc.thrift.TDataNodeLocation;
+import org.apache.iotdb.commons.cluster.NodeStatus;
+import org.apache.iotdb.commons.utils.NodeUrlUtils;
+import org.apache.iotdb.confignode.manager.IManager;
+import org.apache.iotdb.confignode.manager.NodeManager;
+import org.apache.iotdb.db.service.metrics.MetricsService;
+import org.apache.iotdb.db.service.metrics.enums.Metric;
+import org.apache.iotdb.db.service.metrics.enums.Tag;
+import org.apache.iotdb.metrics.config.MetricConfigDescriptor;
+import org.apache.iotdb.metrics.utils.MetricLevel;
+
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+
+/** This class collates metrics about loadManager */
+public class LoadManagerMetrics {
+
+  private final IManager configManager;
+  Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();

Review Comment:
   This variable has not been used.



##########
registerd:
##########
@@ -0,0 +1,2 @@
+[IOTDB-3955 37b6079ef3] registed->G:\deskfile\work\BONC\IoTDB\IoTDB_db\iotdb01\iotdb>git commit -m registed-

Review Comment:
   [important] remove this file



##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManagerMetrics.java:
##########
@@ -0,0 +1,277 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iotdb.confignode.manager.load;
+
+import org.apache.iotdb.common.rpc.thrift.TConfigNodeLocation;
+import org.apache.iotdb.common.rpc.thrift.TDataNodeConfiguration;
+import org.apache.iotdb.common.rpc.thrift.TDataNodeLocation;
+import org.apache.iotdb.commons.cluster.NodeStatus;
+import org.apache.iotdb.commons.utils.NodeUrlUtils;
+import org.apache.iotdb.confignode.manager.IManager;
+import org.apache.iotdb.confignode.manager.NodeManager;
+import org.apache.iotdb.db.service.metrics.MetricsService;
+import org.apache.iotdb.db.service.metrics.enums.Metric;
+import org.apache.iotdb.db.service.metrics.enums.Tag;
+import org.apache.iotdb.metrics.config.MetricConfigDescriptor;
+import org.apache.iotdb.metrics.utils.MetricLevel;
+
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+
+/** This class collates metrics about loadManager */
+public class LoadManagerMetrics {
+
+  private final IManager configManager;
+  Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+  public LoadManagerMetrics(IManager configManager) {
+    this.configManager = configManager;
+  }
+
+  public void addMetrics() {
+    addNodeMetrics();
+    addLeaderCount();
+  }
+
+  private int getRunningConfigNodesNum() {
+    List<TConfigNodeLocation> allConfigNodes =
+        configManager.getLoadManager().getOnlineConfigNodes();
+    if (allConfigNodes == null) {
+      return 0;
+    }
+    for (TConfigNodeLocation configNodeLocation : allConfigNodes) {
+      String name = NodeUrlUtils.convertTEndPointUrl(configNodeLocation.getInternalEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "ConfigNode")
+          .set(1);
+    }
+    return allConfigNodes.size();
+  }
+
+  private int getRunningDataNodesNum() {
+    List<TDataNodeConfiguration> allDataNodes = configManager.getLoadManager().getOnlineDataNodes();
+    if (allDataNodes == null) {
+      return 0;
+    }
+    for (TDataNodeConfiguration dataNodeInfo : allDataNodes) {
+      TDataNodeLocation dataNodeLocation = dataNodeInfo.getLocation();
+      String name = NodeUrlUtils.convertTEndPointUrl(dataNodeLocation.getClientRpcEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "DataNode")
+          .set(1);
+    }
+    return allDataNodes.size();
+  }
+
+  private int getUnknownConfigNodesNum() {
+    List<TConfigNodeLocation> allConfigNodes =
+        configManager.getLoadManager().getUnknownConfigNodes();
+    if (allConfigNodes == null) {
+      return 0;
+    }
+    for (TConfigNodeLocation configNodeLocation : allConfigNodes) {
+      String name = NodeUrlUtils.convertTEndPointUrl(configNodeLocation.getInternalEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "ConfigNode")
+          .set(0);
+    }
+    return allConfigNodes.size();
+  }
+
+  private int getUnknownDataNodesNum() {
+    List<TDataNodeConfiguration> allDataNodes =
+        configManager.getLoadManager().getUnknownDataNodes();
+    if (allDataNodes == null) {
+      return 0;
+    }
+    for (TDataNodeConfiguration dataNodeInfo : allDataNodes) {
+      TDataNodeLocation dataNodeLocation = dataNodeInfo.getLocation();
+      String name = NodeUrlUtils.convertTEndPointUrl(dataNodeLocation.getClientRpcEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "DataNode")
+          .set(0);
+    }
+    return allDataNodes.size();
+  }
+
+  public void addNodeMetrics() {
+    if (MetricConfigDescriptor.getInstance().getMetricConfig().getEnableMetric()) {
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.CONFIG_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getRunningConfigNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Online.toString());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.DATA_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getRunningDataNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Online.toString());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.CONFIG_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getUnknownConfigNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Unknown.toString());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.DATA_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getUnknownDataNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Unknown.toString());
+    }
+  }
+
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Integer getLeadershipCountByDatanode(int DataNodeId) {

Review Comment:
   ```suggestion
     public Integer getLeadershipCountByDatanode(int dataNodeId) {
   ```



##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManagerMetrics.java:
##########
@@ -0,0 +1,277 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iotdb.confignode.manager.load;
+
+import org.apache.iotdb.common.rpc.thrift.TConfigNodeLocation;
+import org.apache.iotdb.common.rpc.thrift.TDataNodeConfiguration;
+import org.apache.iotdb.common.rpc.thrift.TDataNodeLocation;
+import org.apache.iotdb.commons.cluster.NodeStatus;
+import org.apache.iotdb.commons.utils.NodeUrlUtils;
+import org.apache.iotdb.confignode.manager.IManager;
+import org.apache.iotdb.confignode.manager.NodeManager;
+import org.apache.iotdb.db.service.metrics.MetricsService;
+import org.apache.iotdb.db.service.metrics.enums.Metric;
+import org.apache.iotdb.db.service.metrics.enums.Tag;
+import org.apache.iotdb.metrics.config.MetricConfigDescriptor;
+import org.apache.iotdb.metrics.utils.MetricLevel;
+
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+
+/** This class collates metrics about loadManager */
+public class LoadManagerMetrics {
+
+  private final IManager configManager;
+  Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+  public LoadManagerMetrics(IManager configManager) {
+    this.configManager = configManager;
+  }
+
+  public void addMetrics() {
+    addNodeMetrics();
+    addLeaderCount();
+  }
+
+  private int getRunningConfigNodesNum() {
+    List<TConfigNodeLocation> allConfigNodes =
+        configManager.getLoadManager().getOnlineConfigNodes();
+    if (allConfigNodes == null) {
+      return 0;
+    }
+    for (TConfigNodeLocation configNodeLocation : allConfigNodes) {
+      String name = NodeUrlUtils.convertTEndPointUrl(configNodeLocation.getInternalEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "ConfigNode")
+          .set(1);
+    }
+    return allConfigNodes.size();
+  }
+
+  private int getRunningDataNodesNum() {
+    List<TDataNodeConfiguration> allDataNodes = configManager.getLoadManager().getOnlineDataNodes();
+    if (allDataNodes == null) {
+      return 0;
+    }
+    for (TDataNodeConfiguration dataNodeInfo : allDataNodes) {
+      TDataNodeLocation dataNodeLocation = dataNodeInfo.getLocation();
+      String name = NodeUrlUtils.convertTEndPointUrl(dataNodeLocation.getClientRpcEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "DataNode")
+          .set(1);
+    }
+    return allDataNodes.size();
+  }
+
+  private int getUnknownConfigNodesNum() {
+    List<TConfigNodeLocation> allConfigNodes =
+        configManager.getLoadManager().getUnknownConfigNodes();
+    if (allConfigNodes == null) {
+      return 0;
+    }
+    for (TConfigNodeLocation configNodeLocation : allConfigNodes) {
+      String name = NodeUrlUtils.convertTEndPointUrl(configNodeLocation.getInternalEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "ConfigNode")
+          .set(0);
+    }
+    return allConfigNodes.size();
+  }
+
+  private int getUnknownDataNodesNum() {
+    List<TDataNodeConfiguration> allDataNodes =
+        configManager.getLoadManager().getUnknownDataNodes();
+    if (allDataNodes == null) {
+      return 0;
+    }
+    for (TDataNodeConfiguration dataNodeInfo : allDataNodes) {
+      TDataNodeLocation dataNodeLocation = dataNodeInfo.getLocation();
+      String name = NodeUrlUtils.convertTEndPointUrl(dataNodeLocation.getClientRpcEndPoint());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateGauge(
+              Metric.CLUSTER_NODE_STATUS.toString(),
+              MetricLevel.IMPORTANT,
+              Tag.NAME.toString(),
+              name,
+              Tag.TYPE.toString(),
+              "DataNode")
+          .set(0);
+    }
+    return allDataNodes.size();
+  }
+
+  public void addNodeMetrics() {
+    if (MetricConfigDescriptor.getInstance().getMetricConfig().getEnableMetric()) {
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.CONFIG_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getRunningConfigNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Online.toString());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.DATA_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getRunningDataNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Online.toString());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.CONFIG_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getUnknownConfigNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Unknown.toString());
+
+      MetricsService.getInstance()
+          .getMetricManager()
+          .getOrCreateAutoGauge(
+              Metric.DATA_NODE.toString(),
+              MetricLevel.CORE,
+              this,
+              o -> getUnknownDataNodesNum(),
+              Tag.NAME.toString(),
+              "total",
+              Tag.STATUS.toString(),
+              NodeStatus.Unknown.toString());
+    }
+  }
+
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Integer getLeadershipCountByDatanode(int DataNodeId) {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+    configManager
+        .getLoadManager()
+        .getAllLeadership()
+        .forEach((consensusGroupId, nodeId) -> idToCountMap.merge(nodeId, 1, Integer::sum));
+    return idToCountMap.get(DataNodeId);
+  }
+
+  public void addLeaderCount() {
+    getNodeManager()
+        .getRegisteredDataNodes()
+        .forEach(
+            dataNodeInfo -> {
+              TDataNodeLocation dataNodeLocation = dataNodeInfo.getLocation();
+              int dataNodeId = dataNodeLocation.getDataNodeId();
+              String name =
+                  NodeUrlUtils.convertTEndPointUrl(dataNodeLocation.getClientRpcEndPoint());
+
+              MetricsService.getInstance()
+                  .getMetricManager()
+                  .getOrCreateAutoGauge(
+                      Metric.CLUSTER_NODE_LEADER_COUNT.toString(),
+                      MetricLevel.IMPORTANT,
+                      this,
+                      o -> getLeadershipCountByDatanode(dataNodeId),
+                      Tag.NAME.toString(),
+                      name);
+            });
+  }
+
+  public void removeMetrics() {
+    MetricsService.getInstance()
+        .getMetricManager()
+        .removeGauge(
+            Metric.CONFIG_NODE.toString(),
+            Tag.NAME.toString(),
+            "total",
+            Tag.STATUS.toString(),
+            NodeStatus.Online.toString());
+    MetricsService.getInstance()
+        .getMetricManager()
+        .removeGauge(
+            Metric.DATA_NODE.toString(),
+            Tag.NAME.toString(),
+            "total",
+            Tag.STATUS.toString(),
+            NodeStatus.Online.toString());
+    MetricsService.getInstance()
+        .getMetricManager()
+        .removeGauge(
+            Metric.CONFIG_NODE.toString(),
+            Tag.NAME.toString(),
+            "total",
+            Tag.STATUS.toString(),
+            NodeStatus.Unknown.toString());
+    MetricsService.getInstance()
+        .getMetricManager()
+        .removeGauge(
+            Metric.DATA_NODE.toString(),
+            Tag.NAME.toString(),
+            "total",
+            Tag.STATUS.toString(),
+            NodeStatus.Unknown.toString());
+  }
+
+  private LoadManager getLoadManager() {

Review Comment:
   never used





[GitHub] [iotdb] SpriCoder commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
SpriCoder commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r941296800


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -327,7 +327,8 @@ private void updateNodeLoadStatistic() {
       broadcastLatestRegionRouteMap();
     }
     if (nodeCacheMap.size() == getNodeManager().getRegisteredNodeCount()) {
-      addMetrics();
+      addNodeMetrics();
+      addLeaderCount();

Review Comment:
   From my perspective, it would be better to maintain a variable that holds the count you need and to update it only when the value actually changes.
   As for the metric, I think you can use an auto gauge to monitor that variable (see how the getOrCreateAutoGauge() method in MetricsService is used).
   Besides, you could move all the metric operations into a utility class that both holds the variable and sets up all the metrics.
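
A minimal sketch of the approach suggested in this comment, using only JDK classes. The class name `LeaderCountTracker` and the methods `onLeaderChanged`, `leaderCount`, and `gaugeFor` are hypothetical and do not appear in the PR. The `IntSupplier` returned by `gaugeFor` stands in for the callback an auto gauge would poll; registering it with IoTDB's metric manager is deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical utility: holds per-DataNode leader counts, updated only on change. */
final class LeaderCountTracker {
  // regionGroupId -> id of the DataNode currently leading that group
  private final Map<Integer, Integer> leaderByGroup = new ConcurrentHashMap<>();
  // dataNodeId -> number of region groups that DataNode leads
  private final Map<Integer, AtomicInteger> countByNode = new ConcurrentHashMap<>();

  /** Records the leader of a group; counters move only if the leader really changed. */
  synchronized void onLeaderChanged(int groupId, int newLeaderId) {
    Integer oldLeaderId = leaderByGroup.put(groupId, newLeaderId);
    if (oldLeaderId != null && oldLeaderId == newLeaderId) {
      return; // same leader as before, nothing to update
    }
    if (oldLeaderId != null) {
      counter(oldLeaderId).decrementAndGet();
    }
    counter(newLeaderId).incrementAndGet();
  }

  /** Returns the current leader count, or 0 for a node that leads nothing. */
  int leaderCount(int dataNodeId) {
    AtomicInteger count = countByNode.get(dataNodeId);
    return count == null ? 0 : count.get();
  }

  /** Callback that a polling gauge could read; it does no recomputation. */
  IntSupplier gaugeFor(int dataNodeId) {
    return () -> leaderCount(dataNodeId);
  }

  private AtomicInteger counter(int dataNodeId) {
    return countByNode.computeIfAbsent(dataNodeId, id -> new AtomicInteger());
  }
}
```

With this shape, each poll of the gauge is a single map lookup instead of a full pass over all region groups, and a DataNode that leads no group reports 0 rather than a missing value.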





[GitHub] [iotdb] 23931017wu commented on a diff in pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
23931017wu commented on code in PR #6923:
URL: https://github.com/apache/iotdb/pull/6923#discussion_r941177441


##########
confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java:
##########
@@ -590,7 +591,54 @@ public int getUnknownDataNodesNum() {
     return allDataNodes.size();
   }
 
-  public void addMetrics() {
+  /**
+   * Get the LeaderCount of each DataNodeId
+   *
+   * @return Map<DataNodeId, LeaderCount>
+   */
+  public Map<Integer, Integer> getLeadershipCountByDatanode() {
+    Map<Integer, Integer> idToCountMap = new ConcurrentHashMap<>();
+
+    regionGroupCacheMap.forEach(
+        (consensusGroupId, regionGroupCache) -> {
+          Integer count = idToCountMap.get(regionGroupCache.getLeaderDataNodeId());
+          if (count == null) {
+            idToCountMap.put(regionGroupCache.getLeaderDataNodeId(), 1);
+          } else {
+            idToCountMap.put(regionGroupCache.getLeaderDataNodeId(), count + 1);
+          }
+        });
+    return idToCountMap;
+  }
+
+  public void addLeaderCount() {
+    Map<Integer, Integer> idToCountMap = getLeadershipCountByDatanode();
+    getNodeManager()
+        .getRegisteredDataNodes(-1)
+        .forEach(
+            dataNodeInfo -> {
+              TDataNodeLocation dataNodeLocation = dataNodeInfo.getLocation();
+              int dataNodeId = dataNodeLocation.getDataNodeId();
+              if (idToCountMap.containsKey(dataNodeId)) {
+                String name =

Review Comment:
   Okay, I'll add it to the utility class.
   





[GitHub] [iotdb] Beyyes merged pull request #6923: [IOTDB-4038] Add the leader metrics to the cluster

Posted by GitBox <gi...@apache.org>.
Beyyes merged PR #6923:
URL: https://github.com/apache/iotdb/pull/6923

