Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/12/02 16:06:18 UTC

[GitHub] [accumulo] keith-turner commented on a change in pull request #2320: Periodically verify tablet metadata

keith-turner commented on a change in pull request #2320:
URL: https://github.com/apache/accumulo/pull/2320#discussion_r761226430



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java
##########
@@ -794,6 +802,58 @@ public void run() {
       }
     }, 0, 5000, TimeUnit.MILLISECONDS);
 
+    int tabletCheckFrequency = 30 + random.nextInt(31); // random 30-60 minute delay
+    // Periodically check that metadata of tablets matches what is held in memory
+    ThreadPools.createGeneralScheduledExecutorService(aconf).scheduleWithFixedDelay(() -> {
+      final SortedMap<KeyExtent,Tablet> onlineTabletsSnapshot = onlineTablets.snapshot();
+
+      final SortedSet<KeyExtent> userTablets = new TreeSet<>();
+      final SortedSet<KeyExtent> nonUserTablets = new TreeSet<>();
+
+      // Create subsets of tablets based on DataLevel: one set who's DataLevel is USER and another
+      // containing the remaining tablets (those who's DataLevel is ROOT or METADATA).
+      // This needs to happen so we can use .readTablets() on the DataLevel.USER tablets in order
+      // to reduce RPCs.
+      // TODO: Push this partitioning, based on DataLevel, to ample - accumulo issue #2373
+      onlineTabletsSnapshot.forEach((ke, tablet) -> {
+        if (Ample.DataLevel.of(ke.tableId()) == Ample.DataLevel.USER) {
+          userTablets.add(ke);
+        } else {
+          nonUserTablets.add(ke);
+        }
+      });
+
+      Map<KeyExtent,Long> updateCounts = new HashMap<>();
+
+      // gather updateCounts for each tablet
+      onlineTabletsSnapshot.forEach((ke, tablet) -> {
+        updateCounts.put(ke, tablet.getUpdateCount());
+      });
+
+      List<TabletMetadata> tmdList;
+
+      // gather metadata for all tablets with DataLevel.USER using readTablets()
+      try (TabletsMetadata tabletsMetadata = getContext().getAmple().readTablets()
+          .forTablets(userTablets).fetch(FILES, LOGS, ECOMP, PREV_ROW).build()) {
+        tmdList = tabletsMetadata.stream().collect(Collectors.toList());
+      }
+
+      // gather metadata for all tablets with DataLevel.ROOT or METADATA using readTablet()
+      nonUserTablets.forEach(extent -> {
+        TabletMetadata tabletMetadata =
+            getContext().getAmple().readTablet(extent, FILES, LOGS, ECOMP, PREV_ROW);
+        tmdList.add(tabletMetadata);

Review comment:
       tmdList was created by Collectors.toList() a bit earlier. The [javadoc for that method](https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collectors.html#toList--) says the following.
   
   > Returns a Collector that accumulates the input elements into a new List. There are no guarantees on the type, mutability, serializability, or thread-safety of the List returned; if more control over the returned List is required, use toCollection(Supplier).
   
   Since the javadoc says there is no guarantee of mutability, we probably should not add to that list. I would start tmdList off as an ArrayList and avoid the Collectors.toList() call.
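
A minimal sketch of the two options, using plain strings instead of TabletMetadata so that it runs on its own. The class and method names here are hypothetical and not part of the PR. The first method keeps the stream but collects with `Collectors.toCollection(ArrayList::new)`, which the javadoc quoted above points to; the second starts with an ArrayList and adds to it, as suggested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; strings stand in for TabletMetadata objects.
public class MutableCollectExample {

  // Option 1: keep the stream, but ask explicitly for an ArrayList so the
  // result is guaranteed to be mutable.
  static List<String> collectMutable(Stream<String> source) {
    return source.collect(Collectors.toCollection(ArrayList::new));
  }

  // Option 2: start with an ArrayList and add each element to it, avoiding
  // Collectors.toList() entirely.
  static List<String> collectByAdding(Stream<String> source) {
    List<String> result = new ArrayList<>();
    source.forEach(result::add);
    return result;
  }

  public static void main(String[] args) {
    List<String> tmdList = collectMutable(Stream.of("userTablet1", "userTablet2"));
    // Safe: the list is known to be an ArrayList.
    tmdList.add("rootTablet");
    System.out.println(tmdList); // prints [userTablet1, userTablet2, rootTablet]
  }
}
```

Either way, the later `tmdList.add(...)` calls no longer depend on an unspecified implementation detail of `Collectors.toList()`.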

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/DatafileManager.java
##########
@@ -69,10 +69,11 @@
   // ensure we only have one reader/writer of our bulk file notes at at time
   private final Object bulkFileImportLock = new Object();
 
+  private long updateCount;

Review comment:
       ```suggestion
     // This must be incremented whenever datafileSizes is mutated
     private long updateCount;
   ```
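
To illustrate the invariant that the suggested comment documents, here is a rough, self-contained sketch. The class `CountedFileSizes` and its methods are hypothetical and are not DatafileManager's actual API; the point is only that every code path that mutates the map also bumps the counter under the same lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class that tracks file sizes plus a change counter.
public class CountedFileSizes {
  private final Map<String,Long> datafileSizes = new HashMap<>();

  // This must be incremented whenever datafileSizes is mutated
  private long updateCount;

  // Adds or replaces an entry; always counts as a mutation.
  public synchronized void putFile(String file, long size) {
    datafileSizes.put(file, size);
    updateCount++;
  }

  // Removes an entry; only counts as a mutation if something was removed.
  public synchronized void removeFile(String file) {
    if (datafileSizes.remove(file) != null) {
      updateCount++;
    }
  }

  public synchronized long getUpdateCount() {
    return updateCount;
  }

  // Returns a copy so callers cannot mutate the map without going through
  // the counted methods above.
  public synchronized Map<String,Long> snapshot() {
    return new HashMap<>(datafileSizes);
  }
}
```

Assuming a periodic checker records the count before reading metadata and compares it afterwards, an unchanged count would indicate that the in-memory file set did not change in between.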



