You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/09/29 01:37:21 UTC

[GitHub] [helix] pkuwm commented on a change in pull request #1383: Add Metrics for External Custom Health Check Requests

pkuwm commented on a change in pull request #1383:
URL: https://github.com/apache/helix/pull/1383#discussion_r496318526



##########
File path: helix-rest/src/main/java/org/apache/helix/rest/common/HelixDataAccessorWrapper.java
##########
@@ -166,12 +187,18 @@ public HelixDataAccessorWrapper(ZKHelixDataAccessor dataAccessor, CustomRestClie
 
   private Map<String, Boolean> getHealthStatusFromRest(String instance, List<String> partitions,
       RESTConfig restConfig, Map<String, String> customPayLoads) {
-    try {
+    MetricRegistry metrics = SharedMetricRegistries.getOrCreate(_namespace);
+    Counter requestsTotal = metrics.counter(CUSTOM_PARTITION_CHECK_HTTP_REQUESTS_TOTAL);
+    Counter errorRequestsTotal = metrics.counter(CUSTOM_PARTITION_CHECK_HTTP_REQUESTS_ERROR_TOTAL);
+    try (final Timer.Context timer = metrics.timer(CUSTOM_PARTITION_CHECK_HTTP_REQUEST_DURATION)

Review comment:
       It won't make the duration statistic too bad, compared with network IO. CPU is really fast to process the exception. Actually we still need to use a `try... finally` block - same as this `try.. with..resource`.

##########
File path: helix-rest/src/main/java/org/apache/helix/rest/common/HelixDataAccessorWrapper.java
##########
@@ -54,15 +60,30 @@
   public static final String IS_HEALTHY_KEY = "IS_HEALTHY";
   public static final String EXPIRY_KEY = "EXPIRE";
 
+  // Metric names for custom partition check
+  private static final String CUSTOM_PARTITION_CHECK_HTTP_REQUESTS_TOTAL =
+      MetricRegistry.name(InstanceService.class, "custom_partition_check_http_requests_total");

Review comment:
       In the object name, there is a `type` property for us to know that: `type=counters` or `type=gauges`. Then we'll use the `type` property.

##########
File path: helix-rest/src/main/java/org/apache/helix/rest/server/service/InstanceServiceImpl.java
##########
@@ -58,9 +63,18 @@
   private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
   private static final ExecutorService POOL = Executors.newCachedThreadPool();
 
+  // Metric names for custom instance check
+  private static final String CUSTOM_INSTANCE_CHECK_HTTP_REQUESTS_TOTAL =
+      MetricRegistry.name(InstanceService.class, "custom_instance_check_http_requests_total");
+  private static final String CUSTOM_INSTANCE_CHECK_HTTP_REQUESTS_ERROR_TOTAL =
+      MetricRegistry.name(InstanceService.class, "custom_instance_check_http_requests_error_total");
+  private static final String CUSTOM_INSTANCE_CHECK_HTTP_REQUEST_DURATION =
+      MetricRegistry.name(InstanceService.class, "custom_instance_check_http_request_duration");
+
   private final HelixDataAccessorWrapper _dataAccessor;

Review comment:
       No :) You got it. The first PR was a draft. A constructor with namespace is passed in, I didn't push the change yet when you saw this. Updated.

##########
File path: helix-rest/src/main/java/org/apache/helix/rest/common/HelixDataAccessorWrapper.java
##########
@@ -54,15 +60,30 @@
   public static final String IS_HEALTHY_KEY = "IS_HEALTHY";
   public static final String EXPIRY_KEY = "EXPIRE";
 
+  // Metric names for custom partition check
+  private static final String CUSTOM_PARTITION_CHECK_HTTP_REQUESTS_TOTAL =
+      MetricRegistry.name(InstanceService.class, "custom_partition_check_http_requests_total");
+  private static final String CUSTOM_PARTITION_CHECK_HTTP_REQUESTS_ERROR_TOTAL = MetricRegistry
+      .name(InstanceService.class, "custom_partition_check_http_requests_error_total");
+  private static final String CUSTOM_PARTITION_CHECK_HTTP_REQUEST_DURATION =
+      MetricRegistry.name(InstanceService.class, "custom_partition_check_http_request_duration");
+
   private final Map<PropertyKey, HelixProperty> _propertyCache = new HashMap<>();
   private final Map<PropertyKey, List<String>> _batchNameCache = new HashMap<>();
+  private String _namespace;
   protected CustomRestClient _restClient;
 
   public HelixDataAccessorWrapper(ZKHelixDataAccessor dataAccessor) {

Review comment:
       I was thinking other callers may use it. Within helix, only a mock class calls it. So I just changed it and deprecate this, assuming this constructor is supposed to be used within helix rest.

##########
File path: helix-rest/src/main/java/org/apache/helix/rest/server/service/InstanceServiceImpl.java
##########
@@ -231,14 +252,22 @@ private StoppableCheck performHelixOwnInstanceCheck(String clusterId, String ins
   private StoppableCheck performCustomInstanceCheck(String clusterId, String instanceName,
       String baseUrl, Map<String, String> customPayLoads) {
     LOG.info("Perform instance level client side health checks for {}/{}", clusterId, instanceName);
-    try {
-      return new StoppableCheck(
-          _customRestClient.getInstanceStoppableCheck(baseUrl, customPayLoads),
+    MetricRegistry metrics = SharedMetricRegistries.getOrCreate(_namespace);
+    Counter requestsTotal = metrics.counter(CUSTOM_INSTANCE_CHECK_HTTP_REQUESTS_TOTAL);
+    Counter errorRequestsTotal = metrics.counter(CUSTOM_INSTANCE_CHECK_HTTP_REQUESTS_ERROR_TOTAL);
+
+    try (final Timer.Context timer = metrics.timer(CUSTOM_INSTANCE_CHECK_HTTP_REQUEST_DURATION)

Review comment:
       Java 8 `try..with..resource` needs this variable, though we won't use it in the block :). Maybe a newer java supports it.

##########
File path: helix-rest/src/main/java/org/apache/helix/rest/server/service/InstanceServiceImpl.java
##########
@@ -58,9 +63,18 @@
   private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
   private static final ExecutorService POOL = Executors.newCachedThreadPool();
 
+  // Metric names for custom instance check
+  private static final String CUSTOM_INSTANCE_CHECK_HTTP_REQUESTS_TOTAL =

Review comment:
       This naming is based on the method/function. It is a generic interface api `getInstanceStoppableCheck`, so no matter it is a health check or else, as long as it calls this api, we record it. What do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org