You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by "EdColeman (via GitHub)" <gi...@apache.org> on 2023/02/13 14:34:31 UTC

[GitHub] [accumulo] EdColeman commented on a diff in pull request #3161: Rename GarbageCollectionLogger, optionally pause scans/compactions until memory usage is lower

EdColeman commented on code in PR #3161:
URL: https://github.com/apache/accumulo/pull/3161#discussion_r1104530543


##########
server/base/src/main/java/org/apache/accumulo/server/mem/LowMemoryDetector.java:
##########
@@ -81,9 +139,19 @@ public void logGCInfo(AccumuloConfiguration conf) {
         gcTimeIncreasedCount = 0;
       } else {
         gcTimeIncreasedCount++;
-        if (gcTimeIncreasedCount > 3 && mem < rt.maxMemory() * 0.05) {
+        if (gcTimeIncreasedCount > 3 && mem < rt.maxMemory() * freeMemoryPercentage) {
+          runningLowOnMemory = true;
           log.warn("Running low on memory");
           gcTimeIncreasedCount = 0;
+        } else {
+          // If we were running low on memory, but are not any longer, than log at warn
+          // so that it shows up in the logs
+          if (runningLowOnMemory) {
+            log.warn("Not running low on memory");
+          } else {
+            log.debug("Not running low on memory");
+          }

Review Comment:
   Would it be clearer if this was warn - recovered from low memory and debug: current memory usage at x, below threshold y
   
   Stating was is not happening seems harder to understand in the log rather that what is occurring.  There used to be other memory stats printed - so maybe the counts here would be redundant? 



##########
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/TabletBase.java:
##########
@@ -229,7 +235,13 @@ public Tablet.LookupResult lookup(List<Range> ranges, List<KVEntry> results,
   Batch nextBatch(SortedKeyValueIterator<Key,Value> iter, Range range, ScanParameters scanParams)
       throws IOException {
 
-    // log.info("In nextBatch..");
+    while (context.getLowMemoryDetector().isRunningLowOnMemory(context, DetectionScope.SCAN, () -> {
+      return isUserTable;
+    }, () -> {
+      log.info("Not starting next batch because low on memory, extent: {}", extent);
+      server.getScanMetrics().incrementScanPausedForLowMemory();
+      Uninterruptibles.sleepUninterruptibly(500, TimeUnit.MILLISECONDS);

Review Comment:
   Would preventing interrupts delay shutdown?  Basically, would it be better to allow interrupts and stop or at least throw it up the stack?   This is a common question for all usages of `sleepUninterruptibly` that follow.



##########
server/base/src/main/java/org/apache/accumulo/server/mem/LowMemoryDetector.java:
##########
@@ -16,32 +16,90 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-package org.apache.accumulo.server;
+package org.apache.accumulo.server.mem;
 
 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;
 import java.util.HashMap;
 import java.util.List;
 import java.util.concurrent.locks.Lock;
 import java.util.concurrent.locks.ReentrantLock;
+import java.util.function.Supplier;
 
 import org.apache.accumulo.core.conf.AccumuloConfiguration;
 import org.apache.accumulo.core.conf.Property;
 import org.apache.accumulo.core.util.Halt;
+import org.apache.accumulo.server.ServerContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-public class GarbageCollectionLogger {
-  private static final Logger log = LoggerFactory.getLogger(GarbageCollectionLogger.class);
+public class LowMemoryDetector {
+
+  @FunctionalInterface
+  public static interface Action {
+    void execute();
+  }
+
+  public enum DetectionScope {
+    MINC, MAJC, SCAN
+  };
+
+  private static final Logger log = LoggerFactory.getLogger(LowMemoryDetector.class);
 
   private final HashMap<String,Long> prevGcTime = new HashMap<>();
   private long lastMemorySize = 0;
   private long gcTimeIncreasedCount = 0;
-  private static long lastMemoryCheckTime = 0;
-  private static final Lock memCheckTimeLock = new ReentrantLock();
+  private long lastMemoryCheckTime = 0;
+  private final Lock memCheckTimeLock = new ReentrantLock();
+  private volatile boolean runningLowOnMemory = false;
+
+  public long getIntervalMillis(AccumuloConfiguration conf) {
+    return conf.getTimeInMillis(Property.GENERAL_LOW_MEM_DETECTOR_INTERVAL);
+  }
+
+  public boolean isRunningLowOnMemory() {
+    return runningLowOnMemory;
+  }
+
+  /**
+   * @param context server context
+   * @param scope whether this is being checked in the context of scan or compact code
+   * @param isUserTable boolean as to whether the table being scanned / compacted is a user table.
+   *        No action is taken for system tables.
+   * @param action Action to perform when this method returns true
+   * @return true if server running low on memory
+   */
+  public boolean isRunningLowOnMemory(ServerContext context, DetectionScope scope,
+      Supplier<Boolean> isUserTable, Action action) {
+    if (isUserTable.get()) {
+      Property p = null;
+      switch (scope) {
+        case SCAN:
+          p = Property.GENERAL_LOW_MEM_SCAN_PROTECTION;
+          break;
+        case MINC:
+          p = Property.GENERAL_LOW_MEM_MINC_PROTECTION;
+          break;
+        case MAJC:
+          p = Property.GENERAL_LOW_MEM_MAJC_PROTECTION;
+          break;
+        default:
+          throw new IllegalArgumentException("Unknown scope: " + scope);
+      }
+      boolean isEnabled = context.getConfiguration().getBoolean(p);
+      // Only incur the penalty of accessing the volatile variable when enabled for this scope
+      if (isEnabled) {
+        action.execute();
+        return runningLowOnMemory;
+      }
+    }
+    return false;
+  }
 
   public void logGCInfo(AccumuloConfiguration conf) {
 
+    Double freeMemoryPercentage = conf.getFraction(Property.GENERAL_LOW_MEM_DETECTOR_THRESHOLD);
+
     memCheckTimeLock.lock();
     try {
       final long now = System.currentTimeMillis();

Review Comment:
   This could be calculated with nanos.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org