Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2022/03/22 21:13:05 UTC

[GitHub] [accumulo] keith-turner commented on a change in pull request #2583: Add trace and debug log to consistency check

keith-turner commented on a change in pull request #2583:
URL: https://github.com/apache/accumulo/pull/2583#discussion_r832628526



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java
##########
@@ -809,27 +814,40 @@ public void run() {
     // Periodically check that metadata of tablets matches what is held in memory
     ThreadPools.watchCriticalScheduledTask(ThreadPools.getServerThreadPools()
         .createGeneralScheduledExecutorService(aconf).scheduleWithFixedDelay(() -> {
-          final SortedMap<KeyExtent,Tablet> onlineTabletsSnapshot = onlineTablets.snapshot();
-
-          Map<KeyExtent,Long> updateCounts = new HashMap<>();
+          Instant start = Instant.now();
+          Span mdScanSpan = TraceUtil.startSpan(this.getClass(), "metadataScan");
+          try (Scope scope = mdScanSpan.makeCurrent()) {
+            final SortedMap<KeyExtent,Tablet> onlineTabletsSnapshot = onlineTablets.snapshot();
 
-          // gather updateCounts for each tablet
-          onlineTabletsSnapshot.forEach((ke, tablet) -> {
-            updateCounts.put(ke, tablet.getUpdateCount());
-          });
+            Map<KeyExtent,Long> updateCounts = new HashMap<>();
 
-          // gather metadata for all tablets readTablets()
-          try (TabletsMetadata tabletsMetadata =
-              getContext().getAmple().readTablets().forTablets(onlineTabletsSnapshot.keySet())
-                  .fetch(FILES, LOGS, ECOMP, PREV_ROW).build()) {
-
-            // for each tablet, compare its metadata to what is held in memory
-            tabletsMetadata.forEach(tabletMetadata -> {
-              KeyExtent extent = tabletMetadata.getExtent();
-              Tablet tablet = onlineTabletsSnapshot.get(extent);
-              Long counter = updateCounts.get(extent);
-              tablet.compareTabletInfo(counter, tabletMetadata);
+            // gather updateCounts for each tablet
+            onlineTabletsSnapshot.forEach((ke, tablet) -> {
+              updateCounts.put(ke, tablet.getUpdateCount());
             });
+
+            // gather metadata for all tablets readTablets()
+            try (TabletsMetadata tabletsMetadata =
+                getContext().getAmple().readTablets().forTablets(onlineTabletsSnapshot.keySet())
+                    .fetch(FILES, LOGS, ECOMP, PREV_ROW).build()) {
+
+              // for each tablet, compare its metadata to what is held in memory
+              tabletsMetadata.forEach(tabletMetadata -> {
+                KeyExtent extent = tabletMetadata.getExtent();
+                Tablet tablet = onlineTabletsSnapshot.get(extent);
+                Long counter = updateCounts.get(extent);
+                tablet.compareTabletInfo(counter, tabletMetadata);
+              });
+            }
+          } finally {
+            mdScanSpan.end();
+            Instant end = Instant.now();
+            Duration duration = Duration.between(start, end);
+            log.debug("Metadata scan took {}ms", duration.toMillis());
+            if (duration.toMinutes() > 5) {
+              log.warn(
+                  "Metadata scan is taking too long. Performance of all activities will be severely hindered!");

Review comment:
       It would be nice if the log message included how long the scan took and how many extents were looked up.
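
A minimal sketch of what such a message could carry, assuming the extent count is the size of `onlineTabletsSnapshot`. The class name, helper name, and wording are hypothetical and not part of the PR; in `TabletServer` the same values would more likely be passed to `log.warn` with `{}` placeholders.

```java
import java.time.Duration;

// Hypothetical helper, for illustration only; not code from the PR.
class ScanLogMessage {

  // Builds a message that reports both the elapsed time and the number of
  // extents that were looked up, so a slow scan can be put in context.
  static String slowScanMessage(Duration duration, int extentCount, Duration threshold) {
    return String.format(
        "Metadata scan of %d extents took %d ms, exceeding the %d ms threshold",
        extentCount, duration.toMillis(), threshold.toMillis());
  }

  public static void main(String[] args) {
    System.out.println(
        slowScanMessage(Duration.ofMinutes(6), 1200, Duration.ofMinutes(5)));
  }
}
```

Reporting the count next to the duration helps distinguish a slow scan over a few extents from an ordinary scan over very many.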

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java
##########
@@ -809,27 +814,40 @@ public void run() {
     // Periodically check that metadata of tablets matches what is held in memory
     ThreadPools.watchCriticalScheduledTask(ThreadPools.getServerThreadPools()
         .createGeneralScheduledExecutorService(aconf).scheduleWithFixedDelay(() -> {
-          final SortedMap<KeyExtent,Tablet> onlineTabletsSnapshot = onlineTablets.snapshot();
-
-          Map<KeyExtent,Long> updateCounts = new HashMap<>();
+          Instant start = Instant.now();

Review comment:
       This lambda is starting to get long; I wonder if it should be pulled out into a method.
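
One possible shape for that refactoring, shown with only JDK classes. The class name, the method name `checkTabletMetadata`, and the counter are hypothetical stand-ins; the real body would hold the snapshot, scan, and compare logic from the lambda, and the real scheduling would still go through `ThreadPools`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the tablet server class; for illustration only.
class ConsistencyCheckSketch {

  // Counts invocations so the sketch has observable behavior.
  final AtomicInteger runs = new AtomicInteger();

  // The former lambda body moves here, which keeps run() short and gives
  // the check a name that shows up in stack traces.
  void checkTabletMetadata() {
    runs.incrementAndGet();
  }

  // Schedules the extracted method with a method reference instead of an
  // inline lambda. The caller is responsible for shutting the pool down.
  ScheduledExecutorService start(long delayMillis) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    pool.scheduleWithFixedDelay(this::checkTabletMetadata, 0, delayMillis,
        TimeUnit.MILLISECONDS);
    return pool;
  }
}
```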

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java
##########
@@ -809,27 +814,40 @@ public void run() {
     // Periodically check that metadata of tablets matches what is held in memory
     ThreadPools.watchCriticalScheduledTask(ThreadPools.getServerThreadPools()
         .createGeneralScheduledExecutorService(aconf).scheduleWithFixedDelay(() -> {
-          final SortedMap<KeyExtent,Tablet> onlineTabletsSnapshot = onlineTablets.snapshot();
-
-          Map<KeyExtent,Long> updateCounts = new HashMap<>();
+          Instant start = Instant.now();
+          Span mdScanSpan = TraceUtil.startSpan(this.getClass(), "metadataScan");
+          try (Scope scope = mdScanSpan.makeCurrent()) {
+            final SortedMap<KeyExtent,Tablet> onlineTabletsSnapshot = onlineTablets.snapshot();
 
-          // gather updateCounts for each tablet
-          onlineTabletsSnapshot.forEach((ke, tablet) -> {
-            updateCounts.put(ke, tablet.getUpdateCount());
-          });
+            Map<KeyExtent,Long> updateCounts = new HashMap<>();
 
-          // gather metadata for all tablets readTablets()
-          try (TabletsMetadata tabletsMetadata =
-              getContext().getAmple().readTablets().forTablets(onlineTabletsSnapshot.keySet())
-                  .fetch(FILES, LOGS, ECOMP, PREV_ROW).build()) {
-
-            // for each tablet, compare its metadata to what is held in memory
-            tabletsMetadata.forEach(tabletMetadata -> {
-              KeyExtent extent = tabletMetadata.getExtent();
-              Tablet tablet = onlineTabletsSnapshot.get(extent);
-              Long counter = updateCounts.get(extent);
-              tablet.compareTabletInfo(counter, tabletMetadata);
+            // gather updateCounts for each tablet
+            onlineTabletsSnapshot.forEach((ke, tablet) -> {
+              updateCounts.put(ke, tablet.getUpdateCount());
             });
+
+            // gather metadata for all tablets readTablets()
+            try (TabletsMetadata tabletsMetadata =
+                getContext().getAmple().readTablets().forTablets(onlineTabletsSnapshot.keySet())
+                    .fetch(FILES, LOGS, ECOMP, PREV_ROW).build()) {
+
+              // for each tablet, compare its metadata to what is held in memory
+              tabletsMetadata.forEach(tabletMetadata -> {
+                KeyExtent extent = tabletMetadata.getExtent();
+                Tablet tablet = onlineTabletsSnapshot.get(extent);
+                Long counter = updateCounts.get(extent);
+                tablet.compareTabletInfo(counter, tabletMetadata);

Review comment:
       This method gets a lock on each tablet. It should not hold things up (the tablet lock is never supposed to be held for long operations like I/O, though I'm not sure that's actually true), but it possibly could. Either way, this could contribute to the time. You may want to time just the readTablets() call in the code above.
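
A rough sketch of timing the two phases separately, using only JDK types. `PhaseTimingSketch`, `Timings`, and `run` are hypothetical names; the supplier stands in for the metadata read and the consumer for the per-tablet comparison. It assumes the read results can be collected before comparing; if the metadata is read lazily during iteration, the two phases overlap and collecting first trades memory for a clean split.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration of per-phase timing; not code from the PR.
class PhaseTimingSketch {

  // Holds the elapsed time of each phase and the number of items processed.
  static final class Timings {
    final Duration read;
    final Duration compare;
    final int count;

    Timings(Duration read, Duration compare, int count) {
      this.read = read;
      this.compare = compare;
      this.count = count;
    }
  }

  static <T> Timings run(Supplier<List<T>> read, Consumer<T> compare) {
    long t0 = System.nanoTime();
    List<T> items = read.get(); // stand-in for the metadata read
    long t1 = System.nanoTime();
    items.forEach(compare); // stand-in for the comparison that takes each tablet lock
    long t2 = System.nanoTime();
    return new Timings(Duration.ofNanos(t1 - t0), Duration.ofNanos(t2 - t1), items.size());
  }
}
```

With separate numbers, a slow run can be attributed to the metadata read or to time spent waiting on tablet locks.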



