Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/06/02 23:34:08 UTC

[GitHub] [solr] janhoy commented on a change in pull request #96: SOLR-15056: add circuit breaker for CPU, fix load circuit breaker

janhoy commented on a change in pull request #96:
URL: https://github.com/apache/solr/pull/96#discussion_r644375943



##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java
##########
@@ -66,14 +66,19 @@ protected boolean isEnabled() {
     private final int memCBThreshold;
     private final boolean cpuCBEnabled;
     private final int cpuCBThreshold;
+    private final boolean loadAverageCBEnabled;
+    private final double loadAverageCBThreshold;
 
     public CircuitBreakerConfig(final boolean enabled, final boolean memCBEnabled, final int memCBThreshold,
-                                  final boolean cpuCBEnabled, final int cpuCBThreshold) {
+                                final boolean cpuCBEnabled, final int cpuCBThreshold,

Review comment:
       After this change there will be three CBs available: `memCBEnabled`, `loadAverageCBEnabled` (previously `cpuCBEnabled`), and `cpuCBEnabled` (the new one). So this should be correct, no?

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
##########
@@ -110,6 +113,27 @@ public double getCpuUsageThreshold() {
   }
 
   protected double calculateLiveCPUUsage() {
-    return operatingSystemMXBean.getSystemLoadAverage();
+    Metric metric = this.core
+        .getCoreContainer()
+        .getMetricManager()
+        .registry("solr.jvm")
+        .getMetrics()
+        .get("os.systemCpuLoad");
+
+    if (metric == null) {
+        return -1.0;
+    }
+
+    if (metric instanceof Gauge) {
+      @SuppressWarnings({"rawtypes"})
+          Gauge gauge = (Gauge) metric;
+      // unwrap if needed
+      if (gauge instanceof SolrMetricManager.GaugeWrapper) {
+        gauge = ((SolrMetricManager.GaugeWrapper) gauge).getGauge();
+      }
+      return ((Double) gauge.getValue()).doubleValue();
+    }
+
+    return -1.0;                // Unable to unpack metric

Review comment:
       I checked. The caller looks for values below 0 and warn-logs "Unable to get CPU usage". Still, I don't fancy special return values to signal errors when a dedicated Exception with catch logic could do the job.
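
To illustrate the dedicated-exception idea, here is a minimal sketch, not code from this PR. `CpuUsageUnavailableException` and `CpuUsageCheck` are hypothetical names, the `metric` field merely stands in for the value looked up in the metric registry, and returning `false` when the reading is unavailable is an assumption about the desired behaviour.

```java
// Hypothetical exception type; it does not exist in the PR or in Solr.
class CpuUsageUnavailableException extends Exception {
  CpuUsageUnavailableException(String message) {
    super(message);
  }
}

// Hypothetical stand-in for the breaker under review.
class CpuUsageCheck {
  private final double threshold;
  private final Object metric; // stands in for the looked-up metric; may be null

  CpuUsageCheck(double threshold, Object metric) {
    this.threshold = threshold;
    this.metric = metric;
  }

  // Throws instead of returning -1.0 when no usable reading exists.
  double calculateLiveCPUUsage() throws CpuUsageUnavailableException {
    if (metric == null) {
      throw new CpuUsageUnavailableException("metric is not registered");
    }
    if (!(metric instanceof Number)) {
      throw new CpuUsageUnavailableException("metric is not numeric");
    }
    return ((Number) metric).doubleValue();
  }

  boolean isTripped() {
    try {
      return calculateLiveCPUUsage() >= threshold;
    } catch (CpuUsageUnavailableException e) {
      // Assumption: an unavailable reading is logged and never trips the breaker.
      System.err.println("Unable to get CPU usage: " + e.getMessage());
      return false;
    }
  }
}

class CpuUsageDemo {
  public static void main(String[] args) {
    System.out.println(new CpuUsageCheck(0.75, 0.9).isTripped());
    System.out.println(new CpuUsageCheck(0.75, null).isTripped());
  }
}
```

With this shape the caller no longer has to know that a negative number means "no data"; the failure reason travels in the exception message.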

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/LoadAverageCircuitBreaker.java
##########
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.management.ManagementFactory;
+import java.lang.management.OperatingSystemMXBean;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * <p>
+ * Tracks current system load average and triggers if the specified threshold is breached.
+ *
+ * This circuit breaker gets the load average (length of the run queue) over the last
+ * minute and uses that data to take a decision. We depend on OperatingSystemMXBean which does
+ * not allow a configurable interval of collection of data.
+ * //TODO: Use Codahale Meter to calculate the value locally.
+ * </p>
+ *
+ * <p>
+ * The configuration to define which mode to use and the trigger threshold are defined in
+ * solrconfig.xml
+ * </p>
+ */
+public class LoadAverageCircuitBreaker extends CircuitBreaker {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+  private static final OperatingSystemMXBean operatingSystemMXBean = ManagementFactory.getOperatingSystemMXBean();
+
+  private final boolean enabled;
+  private final double loadAverageThreshold;
+
+  // Assumption -- the value of these parameters will be set correctly before invoking getDebugInfo()
+  private static final ThreadLocal<Double> seenLoadAverage = ThreadLocal.withInitial(() -> 0.0);
+
+  private static final ThreadLocal<Double> allowedLoadAverage = ThreadLocal.withInitial(() -> 0.0);
+
+  public LoadAverageCircuitBreaker(CircuitBreakerConfig config) {
+    super(config);
+
+    this.enabled = config.getLoadAverageCBEnabled();
+    this.loadAverageThreshold = config.getLoadAverageCBThreshold();
+  }
+
+  @Override
+  public boolean isTripped() {
+    if (!isEnabled()) {
+      return false;
+    }
+
+    if (!enabled) {
+      return false;
+    }
+
+    double localAllowedLoadAverage = getLoadAverageThreshold();
+    double localSeenLoadAverage = calculateLiveLoadAverage();
+
+    if (localSeenLoadAverage < 0) {
+      if (log.isWarnEnabled()) {

Review comment:
       We could also defer further refactoring to a followup JIRA to split things up a bit. There may be other larger refactorings for 9.0 so perhaps there are synergies?

##########
File path: solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java
##########
@@ -199,7 +201,63 @@ public void testFakeCPUCircuitBreaker() {
       PluginInfo pluginInfo = h.getCore().getSolrConfig().getPluginInfo(CircuitBreakerManager.class.getName());
 
       CircuitBreaker.CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerManager.buildCBConfig(pluginInfo);
-      CircuitBreaker circuitBreaker = new FakeCPUCircuitBreaker(circuitBreakerConfig);
+      CircuitBreaker circuitBreaker = new FakeCPUCircuitBreaker(circuitBreakerConfig, null);

Review comment:
       Ah, I miss Kotlin and named non-positional arguments with default values! :)

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
##########
@@ -40,12 +41,15 @@
  * solution. There will be a follow up with a SIP for a schema API design.
  */
 public class CircuitBreakerManager implements PluginInfoInitialized {
+  private final SolrCore core;

Review comment:
       Can we make `CircuitBreaker` subclasses that need the core implement `SolrCoreAware`? That would be more elegant than passing all context down through the manager.

   I'd also like to split things up even more to make CBs truly pluggable, rather than managed by a common Manager that has to know about all plugins, with a common config class. But that's for another JIRA, and it is also a breaking change that is better suited for 9.0.
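
A rough sketch of the `SolrCoreAware` suggestion, under stated assumptions: the `SolrCore`, `SolrCoreAware`, and `CircuitBreaker` types below are simplified stand-ins so the example compiles without Solr on the classpath (the stand-in interface mirrors Solr's `inform(SolrCore core)` callback), and `CoreAwareCpuBreaker`, `MemoryBreaker`, and `BreakerRegistry` are hypothetical names that do not appear in the PR.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the Solr types of the same name.
class SolrCore {
  private final String name;
  SolrCore(String name) { this.name = name; }
  String getName() { return name; }
}

interface SolrCoreAware {
  void inform(SolrCore core);
}

abstract class CircuitBreaker {
  abstract boolean isTripped();
}

// Hypothetical breaker that needs the core: it receives it through inform()
// rather than through a constructor argument passed down by the manager.
class CoreAwareCpuBreaker extends CircuitBreaker implements SolrCoreAware {
  private volatile SolrCore core;

  @Override
  public void inform(SolrCore core) {
    this.core = core;
  }

  boolean hasCore() {
    return core != null;
  }

  @Override
  boolean isTripped() {
    // A real breaker would read metrics through the core here; the sketch never trips.
    return false;
  }
}

// Hypothetical breaker that does not need the core at all.
class MemoryBreaker extends CircuitBreaker {
  @Override
  boolean isTripped() {
    return false;
  }
}

// Hypothetical registry: it only knows the CircuitBreaker type and the
// SolrCoreAware callback, not the concrete breaker classes.
class BreakerRegistry {
  private final List<CircuitBreaker> breakers = new ArrayList<>();

  void register(CircuitBreaker breaker, SolrCore core) {
    if (breaker instanceof SolrCoreAware) {
      ((SolrCoreAware) breaker).inform(core);
    }
    breakers.add(breaker);
  }

  int size() {
    return breakers.size();
  }

  boolean anyTripped() {
    for (CircuitBreaker breaker : breakers) {
      if (breaker.isTripped()) {
        return true;
      }
    }
    return false;
  }
}

class CoreAwareDemo {
  public static void main(String[] args) {
    BreakerRegistry registry = new BreakerRegistry();
    registry.register(new CoreAwareCpuBreaker(), new SolrCore("demo"));
    registry.register(new MemoryBreaker(), new SolrCore("demo"));
    System.out.println(registry.anyTripped());
  }
}
```

The point of the design is that test code such as `new FakeCPUCircuitBreaker(circuitBreakerConfig, null)` would no longer need a placeholder core argument; only breakers that opt in to the callback ever see the core.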



