Posted to commits@solr.apache.org by ja...@apache.org on 2023/09/20 23:04:28 UTC

[solr] branch branch_9x updated: SOLR-15056: add circuit breaker for CPU, fix load circuit breaker (#96)

This is an automated email from the ASF dual-hosted git repository.

janhoy pushed a commit to branch branch_9x
in repository https://gitbox.apache.org/repos/asf/solr.git


The following commit(s) were added to refs/heads/branch_9x by this push:
     new a1d0938bbe5 SOLR-15056:  add circuit breaker for CPU, fix load circuit breaker (#96)
a1d0938bbe5 is described below

commit a1d0938bbe5ca31a1cd2906d912c0a67952de040
Author: wrunderwood <wu...@wunderwood.org>
AuthorDate: Wed Sep 20 18:42:57 2023 -0400

    SOLR-15056:  add circuit breaker for CPU, fix load circuit breaker (#96)
    
    Co-authored-by: Jan Høydahl <ja...@users.noreply.github.com>
    (cherry picked from commit 51c1a785c4611d0103f7b73c8adefa028d608bcd)
---
 solr/CHANGES.txt                                   |  4 +
 .../util/circuitbreaker/CPUCircuitBreaker.java     | 92 +++++++++++++++------
 .../util/circuitbreaker/CircuitBreakerManager.java |  9 +-
 .../circuitbreaker/LoadAverageCircuitBreaker.java  | 95 ++++++++++++++++++++++
 .../conf/solrconfig-pluggable-circuitbreaker.xml   |  4 +
 .../apache/solr/util/BaseTestCircuitBreaker.java   | 84 ++++++++-----------
 .../deployment-guide/pages/circuit-breakers.adoc   | 59 ++++++++++++--
 .../pages/major-changes-in-solr-9.adoc             |  3 +
 8 files changed, 265 insertions(+), 85 deletions(-)

diff --git a/solr/CHANGES.txt b/solr/CHANGES.txt
index 457dcae7a03..8138ad6fc00 100644
--- a/solr/CHANGES.txt
+++ b/solr/CHANGES.txt
@@ -12,6 +12,10 @@ New Features
 
 * SOLR-16954: Make Circuit Breakers available for Update Requests (janhoy, Christine Poerschke, Pierre Salagnac)
 
+* SOLR-15056: A new Circuit breaker for percentage of CPU utilization is added. The former "CPU" circuit breaker
+  is now more correctly named LoadAverageCircuitBreaker as it trips on system load average which is not a percentage.
+  Users of legacy CircuitBreakerManager are not affected by this change. (Walter Underwood, janhoy, Christine Poerschke, Atri Sharma)
+
 * SOLR-15771: bin/auth creates reasonable roles and permissions for security: 'search', 'index', 'admin', and 'superadmin' and assigns user superadmin role. (Eric Pugh, janhoy)
 
 * SOLR-15367: Convert "rid" functionality into a default Tracer (Alex Deparvu, David Smiley)
diff --git a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
index 90c86499b3c..4c1ac111c58 100644
--- a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
+++ b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
@@ -17,56 +17,63 @@
 
 package org.apache.solr.util.circuitbreaker;
 
+import com.codahale.metrics.Gauge;
+import com.codahale.metrics.Metric;
 import java.lang.invoke.MethodHandles;
-import java.lang.management.ManagementFactory;
-import java.lang.management.OperatingSystemMXBean;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.metrics.SolrMetricManager;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 /**
  * Tracks current CPU usage and triggers if the specified threshold is breached.
  *
- * <p>This circuit breaker gets the average CPU load over the last minute and uses that data to take
- * a decision. We depend on OperatingSystemMXBean which does not allow a configurable interval of
- * collection of data. //TODO: Use Codahale Meter to calculate the value locally.
- *
- * <p>The configuration to define which mode to use and the trigger threshold are defined in
- * solrconfig.xml
+ * <p>This circuit breaker gets the recent average CPU usage and uses that data to take a decision.
+ * We depend on OperatingSystemMXBean which does not allow a configurable interval of collection of
+ * data.
  */
 public class CPUCircuitBreaker extends CircuitBreaker {
   private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
-  private static final OperatingSystemMXBean operatingSystemMXBean =
-      ManagementFactory.getOperatingSystemMXBean();
 
+  private boolean enabled = true;
   private double cpuUsageThreshold;
+  private final SolrCore core;
 
   private static final ThreadLocal<Double> seenCPUUsage = ThreadLocal.withInitial(() -> 0.0);
 
   private static final ThreadLocal<Double> allowedCPUUsage = ThreadLocal.withInitial(() -> 0.0);
 
-  public CPUCircuitBreaker() {
+  public CPUCircuitBreaker(SolrCore core) {
     super();
-  }
-
-  public void setThreshold(double threshold) {
-    this.cpuUsageThreshold = threshold;
+    this.core = core;
   }
 
   @Override
-  public boolean isTripped() {
-
-    double localAllowedCPUUsage = getCpuUsageThreshold();
+  public void init(NamedList<?> args) {
+    super.init(args);
     double localSeenCPUUsage = calculateLiveCPUUsage();
 
     if (localSeenCPUUsage < 0) {
-      if (log.isWarnEnabled()) {
-        String msg = "Unable to get CPU usage";
-
-        log.warn(msg);
+      String msg =
+          "Initialization failure for CPU circuit breaker. Unable to get 'systemCpuLoad', not supported by the JVM?";
+      if (log.isErrorEnabled()) {
+        log.error(msg);
       }
+      enabled = false;
+    }
+  }
 
+  @Override
+  public boolean isTripped() {
+    if (!enabled) {
+      if (log.isDebugEnabled()) {
+        log.debug("CPU circuit breaker is disabled due to initialization failure.");
+      }
       return false;
     }
+    double localAllowedCPUUsage = getCpuUsageThreshold();
+    double localSeenCPUUsage = calculateLiveCPUUsage();
 
     allowedCPUUsage.set(localAllowedCPUUsage);
 
@@ -84,11 +91,50 @@ public class CPUCircuitBreaker extends CircuitBreaker {
         + allowedCPUUsage.get();
   }
 
+  public void setThreshold(double thresholdValueInPercentage) {
+    if (thresholdValueInPercentage > 100) {
+      throw new IllegalArgumentException("Invalid threshold value.");
+    }
+
+    if (thresholdValueInPercentage <= 0) {
+      throw new IllegalStateException("Threshold cannot be less than or equal to zero");
+    }
+    cpuUsageThreshold = thresholdValueInPercentage;
+  }
+
   public double getCpuUsageThreshold() {
     return cpuUsageThreshold;
   }
 
+  /**
+   * Calculate the CPU usage for the system in percentage.
+   *
+   * @return Percent CPU usage, or -1 if the value could not be obtained.
+   */
   protected double calculateLiveCPUUsage() {
-    return operatingSystemMXBean.getSystemLoadAverage();
+    // TODO: Use Codahale Meter to calculate the value
+    Metric metric =
+        this.core
+            .getCoreContainer()
+            .getMetricManager()
+            .registry("solr.jvm")
+            .getMetrics()
+            .get("os.systemCpuLoad");
+
+    if (metric == null) {
+      return -1.0;
+    }
+
+    if (metric instanceof Gauge) {
+      @SuppressWarnings({"rawtypes"})
+      Gauge gauge = (Gauge) metric;
+      // unwrap if needed
+      if (gauge instanceof SolrMetricManager.GaugeWrapper) {
+        gauge = ((SolrMetricManager.GaugeWrapper) gauge).getGauge();
+      }
+      return (Double) gauge.getValue() * 100;
+    }
+
+    return -1.0; // Unable to unpack metric
   }
 }
diff --git a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
index 5b39217b33b..02e3c7af676 100644
--- a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
+++ b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
@@ -23,8 +23,8 @@ import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 /**
- * Single CircuitBreaker that registers both a Memory and a CPU CircuitBreaker. This is only for
- * backward compatibility with the 9.x versions prior to 9.4.
+ * Single CircuitBreaker that registers both a Memory and a LoadAverage CircuitBreaker. This is only
+ * for backward compatibility with the 9.x versions prior to 9.4.
  *
  * @deprecated Use individual Circuit Breakers instead
  */
@@ -36,7 +36,7 @@ public class CircuitBreakerManager extends CircuitBreaker {
   private int memThreshold = 100;
   private int cpuThreshold = 100;
   private MemoryCircuitBreaker memCB;
-  private CPUCircuitBreaker cpuCB;
+  private LoadAverageCircuitBreaker cpuCB;
 
   public CircuitBreakerManager() {
     super();
@@ -71,7 +71,8 @@ public class CircuitBreakerManager extends CircuitBreaker {
       memCB.setThreshold(memThreshold);
     }
     if (cpuEnabled) {
-      cpuCB = new CPUCircuitBreaker();
+      // In SOLR-15056 CPUCircuitBreaker was renamed to LoadAverageCircuitBreaker, need back-compat
+      cpuCB = new LoadAverageCircuitBreaker();
       cpuCB.setThreshold(cpuThreshold);
     }
   }
diff --git a/solr/core/src/java/org/apache/solr/util/circuitbreaker/LoadAverageCircuitBreaker.java b/solr/core/src/java/org/apache/solr/util/circuitbreaker/LoadAverageCircuitBreaker.java
new file mode 100644
index 00000000000..77772b927b3
--- /dev/null
+++ b/solr/core/src/java/org/apache/solr/util/circuitbreaker/LoadAverageCircuitBreaker.java
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.management.ManagementFactory;
+import java.lang.management.OperatingSystemMXBean;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Tracks current system load average and triggers if the specified threshold is breached.
+ *
+ * <p>This circuit breaker gets the load average (length of the run queue) over the last minute and
+ * uses that data to take a decision. We depend on OperatingSystemMXBean which does not allow a
+ * configurable interval of collection of data.
+ */
+public class LoadAverageCircuitBreaker extends CircuitBreaker {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+  private static final OperatingSystemMXBean operatingSystemMXBean =
+      ManagementFactory.getOperatingSystemMXBean();
+
+  private double loadAverageThreshold;
+
+  // Assumption -- the value of these parameters will be set correctly before invoking
+  // getDebugInfo()
+  private static final ThreadLocal<Double> seenLoadAverage = ThreadLocal.withInitial(() -> 0.0);
+
+  private static final ThreadLocal<Double> allowedLoadAverage = ThreadLocal.withInitial(() -> 0.0);
+
+  public LoadAverageCircuitBreaker() {
+    super();
+  }
+
+  @Override
+  public boolean isTripped() {
+    double localAllowedLoadAverage = getLoadAverageThreshold();
+    double localSeenLoadAverage = calculateLiveLoadAverage();
+
+    if (localSeenLoadAverage < 0) {
+      if (log.isWarnEnabled()) {
+        String msg = "Unable to get load average";
+
+        log.warn(msg);
+      }
+
+      return false;
+    }
+
+    allowedLoadAverage.set(localAllowedLoadAverage);
+
+    seenLoadAverage.set(localSeenLoadAverage);
+
+    return (localSeenLoadAverage >= localAllowedLoadAverage);
+  }
+
+  @Override
+  public String getErrorMessage() {
+    return "Load Average Circuit Breaker triggered as seen load average is above allowed threshold."
+        + " Seen load average "
+        + seenLoadAverage.get()
+        + " and allocated threshold "
+        + allowedLoadAverage.get();
+  }
+
+  public void setThreshold(double thresholdValueUnbounded) {
+    if (thresholdValueUnbounded <= 0) {
+      throw new IllegalStateException("Threshold cannot be less than or equal to zero");
+    }
+    loadAverageThreshold = thresholdValueUnbounded;
+  }
+
+  public double getLoadAverageThreshold() {
+    return loadAverageThreshold;
+  }
+
+  protected double calculateLiveLoadAverage() {
+    return operatingSystemMXBean.getSystemLoadAverage();
+  }
+}
diff --git a/solr/core/src/test-files/solr/collection1/conf/solrconfig-pluggable-circuitbreaker.xml b/solr/core/src/test-files/solr/collection1/conf/solrconfig-pluggable-circuitbreaker.xml
index 8719a00ea7b..52956f60824 100644
--- a/solr/core/src/test-files/solr/collection1/conf/solrconfig-pluggable-circuitbreaker.xml
+++ b/solr/core/src/test-files/solr/collection1/conf/solrconfig-pluggable-circuitbreaker.xml
@@ -98,6 +98,10 @@
     <double  name="threshold">75</double>
   </circuitBreaker>
 
+  <circuitBreaker class="solr.LoadAverageCircuitBreaker">
+    <double  name="threshold">3</double>
+  </circuitBreaker>
+
   <initParams path="/select">
     <lst name="defaults">
       <str name="df">text</str>
diff --git a/solr/core/src/test/org/apache/solr/util/BaseTestCircuitBreaker.java b/solr/core/src/test/org/apache/solr/util/BaseTestCircuitBreaker.java
index 607bf31c617..71c6fe67f8d 100644
--- a/solr/core/src/test/org/apache/solr/util/BaseTestCircuitBreaker.java
+++ b/solr/core/src/test/org/apache/solr/util/BaseTestCircuitBreaker.java
@@ -30,9 +30,11 @@ import org.apache.solr.common.SolrException;
 import org.apache.solr.common.params.CommonParams;
 import org.apache.solr.common.util.ExecutorUtil;
 import org.apache.solr.common.util.SolrNamedThreadFactory;
+import org.apache.solr.core.SolrCore;
 import org.apache.solr.util.circuitbreaker.CPUCircuitBreaker;
 import org.apache.solr.util.circuitbreaker.CircuitBreaker;
 import org.apache.solr.util.circuitbreaker.CircuitBreakerManager;
+import org.apache.solr.util.circuitbreaker.LoadAverageCircuitBreaker;
 import org.apache.solr.util.circuitbreaker.MemoryCircuitBreaker;
 import org.hamcrest.MatcherAssert;
 import org.junit.After;
@@ -120,62 +122,35 @@ public abstract class BaseTestCircuitBreaker extends SolrTestCaseJ4 {
   }
 
   public void testBuildingMemoryPressure() {
-    ExecutorService executor =
-        ExecutorUtil.newMDCAwareCachedThreadPool(new SolrNamedThreadFactory("TestCircuitBreaker"));
-
-    AtomicInteger failureCount = new AtomicInteger();
-
-    try {
-      removeAllExistingCircuitBreakers();
+    MemoryCircuitBreaker circuitBreaker = new BuildingUpMemoryPressureCircuitBreaker();
+    circuitBreaker.setThreshold(75);
 
-      CircuitBreaker circuitBreaker = new BuildingUpMemoryPressureCircuitBreaker();
-      MemoryCircuitBreaker memoryCircuitBreaker = (MemoryCircuitBreaker) circuitBreaker;
-
-      memoryCircuitBreaker.setThreshold(75);
-
-      h.getCore().getCircuitBreakerRegistry().register(circuitBreaker);
+    assertThatHighQueryLoadTrips(circuitBreaker, 1);
+  }
 
-      List<Future<?>> futures = new ArrayList<>();
+  public void testFakeCPUCircuitBreaker() {
+    CPUCircuitBreaker circuitBreaker = new FakeCPUCircuitBreaker(h.getCore());
+    circuitBreaker.setThreshold(75);
 
-      for (int i = 0; i < 5; i++) {
-        Future<?> future =
-            executor.submit(
-                () -> {
-                  try {
-                    h.query(req("name:\"john smith\""));
-                  } catch (SolrException e) {
-                    MatcherAssert.assertThat(
-                        e.getMessage(), containsString("Circuit Breakers tripped"));
-                    failureCount.incrementAndGet();
-                  } catch (Exception e) {
-                    throw new RuntimeException(e.getMessage());
-                  }
-                });
+    assertThatHighQueryLoadTrips(circuitBreaker, 5);
+  }
 
-        futures.add(future);
-      }
+  public void testFakeLoadAverageCircuitBreaker() {
+    LoadAverageCircuitBreaker circuitBreaker = new FakeLoadAverageCircuitBreaker();
+    circuitBreaker.setThreshold(75);
 
-      for (Future<?> future : futures) {
-        try {
-          future.get();
-        } catch (Exception e) {
-          throw new RuntimeException(e.getMessage());
-        }
-      }
-    } finally {
-      ExecutorUtil.shutdownAndAwaitTermination(executor);
-      assertEquals("Number of failed queries is not correct", 1, failureCount.get());
-    }
+    assertThatHighQueryLoadTrips(circuitBreaker, 5);
   }
 
-  public void testFakeCPUCircuitBreaker() {
+  /**
+   * Common assert method to be reused in tests
+   *
+   * @param circuitBreaker the breaker to test
+   * @param numShouldTrip the number of queries that should trip the breaker
+   */
+  private void assertThatHighQueryLoadTrips(CircuitBreaker circuitBreaker, int numShouldTrip) {
     removeAllExistingCircuitBreakers();
 
-    CircuitBreaker circuitBreaker = new FakeCPUCircuitBreaker();
-    CPUCircuitBreaker cpuCircuitBreaker = (CPUCircuitBreaker) circuitBreaker;
-
-    cpuCircuitBreaker.setThreshold(75);
-
     h.getCore().getCircuitBreakerRegistry().register(circuitBreaker);
 
     AtomicInteger failureCount = new AtomicInteger();
@@ -212,7 +187,7 @@ public abstract class BaseTestCircuitBreaker extends SolrTestCaseJ4 {
       }
     } finally {
       ExecutorUtil.shutdownAndAwaitTermination(executor);
-      assertEquals("Number of failed queries is not correct", 5, failureCount.get());
+      assertEquals("Number of failed queries is not correct", numShouldTrip, failureCount.get());
     }
   }
 
@@ -330,9 +305,20 @@ public abstract class BaseTestCircuitBreaker extends SolrTestCaseJ4 {
   }
 
   private static class FakeCPUCircuitBreaker extends CPUCircuitBreaker {
+    public FakeCPUCircuitBreaker(SolrCore core) {
+      super(core);
+    }
+
     @Override
     protected double calculateLiveCPUUsage() {
-      return 92; // Return a value large enough to trigger the circuit breaker
+      return Double.MAX_VALUE;
+    }
+  }
+
+  private static class FakeLoadAverageCircuitBreaker extends LoadAverageCircuitBreaker {
+    @Override
+    protected double calculateLiveLoadAverage() {
+      return Double.MAX_VALUE;
     }
   }
 }
diff --git a/solr/solr-ref-guide/modules/deployment-guide/pages/circuit-breakers.adoc b/solr/solr-ref-guide/modules/deployment-guide/pages/circuit-breakers.adoc
index b517a2dd195..f9f535c6005 100644
--- a/solr/solr-ref-guide/modules/deployment-guide/pages/circuit-breakers.adoc
+++ b/solr/solr-ref-guide/modules/deployment-guide/pages/circuit-breakers.adoc
@@ -24,7 +24,12 @@ resource configuration.
 Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability.
 If circuit breakers are enabled, requests may be rejected under the condition of high node duress with HTTP error code 429 'Too Many Requests'.
 
-It is up to the client to handle this error and potentially build a retrial logic as this should ideally be a transient situation.
+It is up to the client to handle this error and potentially build retry logic as this should be a transient situation.
+
+In a sharded collection, when a circuit breaker trips on one shard, the entire query will fail,
+even if the other shard requests succeed. This will multiply the failures seen by the end users.
+Setting the `shards.tolerant=true` parameter on requests can help with graceful degradation when
+circuit breaker thresholds are reached on some nodes. See the <<shards.tolerant Parameter>> for details.
 
 == Circuit Breaker Configurations
 All circuit breaker configurations are listed as independent `<circuitBreaker>` entries in `solrconfig.xml` as shown below.
@@ -32,6 +37,14 @@ A circuit breaker can register itself to trip for query requests and/or update r
 
 == Currently Supported Circuit Breakers
 
+[NOTE]
+====
+The legacy configuration syntax using `CircuitBreakerManager` is deprecated as of Solr 9.4, but will
+continue to work. The "CPU" circuit breaker used by this legacy plugin when configuring a `cpuThreshold`
+is actually the `LoadAverageCircuitBreaker` described below. Also, the `CircuitBreakerManager` will
+return an HTTP 503 code instead of the HTTP 429 code used by the new circuit breakers.
+====
+
 === JVM Heap Usage
 
 This circuit breaker tracks JVM heap memory usage and rejects incoming requests with a 429 error code if the heap usage exceeds a configured percentage of maximum heap allocated to the JVM (-Xmx).
@@ -48,7 +61,9 @@ To enable and configure the JVM heap usage based circuit breaker, add the follow
 
 The `threshold` is defined as a percentage of the max heap allocated to the JVM.
 
-It does not logically make sense to have a threshold below 50% and above 95% of the max heap allocated to the JVM.
+For the circuit breaker configuration, a value of "0" maps to 0% usage and a value of "100" maps to 100% usage.
+
+It does not logically make sense to have a threshold below 50% or above 95% of the max heap allocated to the JVM.
 Hence, the range of valid values for this parameter is [50, 95], both inclusive.
 
 Consider the following example:
@@ -56,11 +71,13 @@ Consider the following example:
 JVM has been allocated a maximum heap of 5GB (-Xmx) and `threshold` is set to `75`.
 In this scenario, the heap usage at which the circuit breaker will trip is 3.75GB.
 
-=== CPU Utilization
+=== System CPU Usage Circuit Breaker
+This circuit breaker tracks system CPU usage and triggers if the recent CPU usage exceeds a configurable threshold.
 
-This circuit breaker tracks CPU utilization and triggers if the average CPU utilization over the last one minute exceeds a configurable threshold.
-Note that the value used in computation is over the last one minute -- so a sudden spike in traffic that goes down might still cause the circuit breaker to trigger for a short while before it resolves and updates the value.
-For more details of the calculation, please see https://en.wikipedia.org/wiki/Load_(computing)
+This is tracked with the JMX metric `OperatingSystemMXBean.getSystemCpuLoad()`. That measures the
+recent CPU usage for the whole system. This metric is provided by the `com.sun.management` package,
+which is not implemented on all JVMs. If the metric is not available, the circuit breaker will be
+disabled and log an error message. An alternative can then be to use the <<system-load-average-circuit-breaker>>.
 
 To enable and configure the CPU utilization based circuit breaker:
 
@@ -71,7 +88,30 @@ To enable and configure the CPU utilization based circuit breaker:
 </circuitBreaker>
 ----
 
-The `threshold` is defined in units of CPU utilization.
+The triggering threshold is defined in percent CPU usage. A value of "0" maps to 0% usage
+and a value of "100" maps to 100% usage. The example above will trip when the CPU usage is
+equal to or greater than 75%.
+
+=== System Load Average Circuit Breaker
+This circuit breaker tracks system load average and triggers if the recent load average exceeds a configurable threshold.
+
+This is tracked with the JMX metric `OperatingSystemMXBean.getSystemLoadAverage()`. That measures the
+recent load average for the whole system. A "load average" is the number of processes using or waiting for a CPU,
+usually averaged over one minute. Some systems include processes waiting on IO in the load average. Check the
+documentation for your system and JVM to understand this metric. For more information, see the
+https://en.wikipedia.org/wiki/Load_(computing)[Wikipedia page for Load].
+
+To enable and configure the load average based circuit breaker:
+
+[source,xml]
+----
+<circuitBreaker class="org.apache.solr.util.circuitbreaker.LoadAverageCircuitBreaker">
+ <double  name="threshold">8.0</double>
+</circuitBreaker>
+----
+
+The triggering threshold is a floating point number matching load average.
+The example circuit breaker above will trip when the load average is equal to or greater than 8.0.
 
 == Advanced example
 
@@ -99,6 +139,7 @@ This would prevent expensive bulk updates from impacting search. Note also the s
 
 == Performance Considerations
 
-It is worth noting that while JVM or CPU circuit breakers do not add any noticeable overhead per request, having too many circuit breakers checked for a single request can cause a performance overhead.
+While JVM or CPU circuit breakers do not add any noticeable overhead per request, having too many circuit breakers checked for a single request can cause a performance overhead.
 
-In addition, it is a good practice to exponentially back off while retrying requests on a busy node.
+In addition, it is a good practice to exponentially back off while retrying requests on a busy node.
+See the https://en.wikipedia.org/wiki/Exponential_backoff[Wikipedia page for Exponential Backoff].
diff --git a/solr/solr-ref-guide/modules/upgrade-notes/pages/major-changes-in-solr-9.adoc b/solr/solr-ref-guide/modules/upgrade-notes/pages/major-changes-in-solr-9.adoc
index df5a0985263..978e69f8bf0 100644
--- a/solr/solr-ref-guide/modules/upgrade-notes/pages/major-changes-in-solr-9.adoc
+++ b/solr/solr-ref-guide/modules/upgrade-notes/pages/major-changes-in-solr-9.adoc
@@ -80,6 +80,9 @@ Therefore, when using the default settings, nodes that were previously excluded
 * The Embedded Zookeeper can now be configured to listen to (or bind to) more hosts than just `localhost`,
 see the  xref:deployment-guide:securing-solr.adoc#network-configuration[Network Configuration documentation] for more information.
 
+=== Circuit Breaker
+* The Circuit Breakers are now pluggable, and you can define multiple circuit breakers including custom ones. The existing `CircuitBreakerManager` is deprecated, and users are encouraged to switch to the new plugins. While the old `CircuitBreakerManager` returned HTTP 503 when a circuit breaker was tripped, the new plugins return HTTP 429.
+
 === Security
 * Since Solr 8.4.1/8.5.0, the `solr.jetty.ssl.verifyClientHostName` sysProp and `SOLR_SSL_CLIENT_HOSTNAME_VERIFICATION` envVar have been used incorrectly.
 It has instead been used to override the `solr.ssl.checkPeerName` sysProp in the `HTTP2SolrClient`.
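
The threshold rules this commit introduces for the two breakers can be sketched as standalone Java. This is a minimal illustration, not code from the commit: the class name BreakerThresholdSketch and its helper methods are hypothetical, and it throws IllegalArgumentException for every invalid threshold, whereas the committed code uses IllegalStateException for non-positive values. It assumes the same conventions as the patch: CPU thresholds are percentages capped at 100, load average thresholds are unbounded positive numbers, a negative reading means the metric is unavailable, and a breaker trips when the reading is equal to or greater than the threshold.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper, not part of Solr: mirrors the threshold rules described in this commit. */
class BreakerThresholdSketch {

  /** CPU thresholds are percentages, so only values in (0, 100] are accepted. */
  static double validateCpuThreshold(double percent) {
    if (percent <= 0 || percent > 100) {
      throw new IllegalArgumentException("CPU threshold must be in (0, 100], got " + percent);
    }
    return percent;
  }

  /** Load average is a run-queue length, not a percentage, so there is no upper bound. */
  static double validateLoadAverageThreshold(double loadAverage) {
    if (loadAverage <= 0) {
      throw new IllegalArgumentException("Load average threshold must be > 0, got " + loadAverage);
    }
    return loadAverage;
  }

  /** A negative reading means the metric is unavailable, so the breaker never trips on it. */
  static boolean isTripped(double seen, double threshold) {
    if (seen < 0) {
      return false;
    }
    return seen >= threshold;
  }

  public static void main(String[] args) {
    // getSystemLoadAverage() returns a negative value on platforms that do not support it.
    double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    double loadThreshold = validateLoadAverageThreshold(8.0);
    System.out.println("load average " + load + " trips at 8.0: " + isTripped(load, loadThreshold));

    // A fixed CPU reading stands in for the 'os.systemCpuLoad' gauge value multiplied by 100.
    double cpuThreshold = validateCpuThreshold(75);
    System.out.println("CPU 92% trips at 75%: " + isTripped(92.0, cpuThreshold));
  }
}
```

Keeping the two validators separate reflects the point of the rename: a load average of 8.0 is a sensible threshold on a multi-core machine, while the same number read as a CPU percentage would trip almost constantly.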