You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/02/02 19:43:02 UTC

[GitHub] [accumulo] milleruntime opened a new pull request #1899: Prevent deleted tablets from being flushed

milleruntime opened a new pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899


   * The CleanUp step of deletes will wait until all tablets of a tablet
   are unassigned. This will stop the memory mgr from flushing if the table
   is being deleted, allowing it to be unassigned faster.
   * Remove deleted tablet from memory reports in TabletServerResourceManager
   so it won't keep trying to flush delete tablets when they are large
   * Add TableState param to Tablet.initiateMinorCompaction() so calling code
   in TabletServerResourceManager only has to get the state once
   * Add debug to Tablet.completeClose() for better insight when waiting


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r568994093



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -556,14 +557,15 @@ private void manageMemory() {
                 continue;
               }
               Tablet tablet = tabletReport.getTablet();
-              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM)) {
-                if (tablet.isClosed()) {
+              var state = context.getTableManager().getTableState(tablet.getExtent().tableId());
+              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM, state)) {
+                if (tablet.isClosed() || state.equals(TableState.DELETING)) {

Review comment:
       If a tablet is in the deleteting state for a long time and holding lots of memory, it seems like it could starve other minor compactions from ever running.  Does this deleting of reports help mitigate that?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r574721044



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -556,14 +557,15 @@ private void manageMemory() {
                 continue;
               }
               Tablet tablet = tabletReport.getTablet();
-              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM)) {
-                if (tablet.isClosed()) {
+              var state = context.getTableManager().getTableState(tablet.getExtent().tableId());
+              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM, state)) {
+                if (tablet.isClosed() || state.equals(TableState.DELETING)) {

Review comment:
       Enums should always be compared using `==` instead of `.equals()`:
   
   ```suggestion
                   if (tablet.isClosed() || state == TableState.DELETING) {
   ```

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/memory/LargestFirstMemoryManager.java
##########
@@ -147,6 +148,11 @@ protected boolean tableExists(TableId tableId) {
     return context.getTableConfiguration(tableId) != null;
   }
 
+  private boolean tableBeingDeleted(TableId tableId) {
+    var state = context.getTableManager().getTableState(tableId);
+    return state.equals(TableState.DELETING);

Review comment:
       Should use `==` to compare enum.
   
   ```suggestion
       return context.getTableManager().getTableState(tableId) == TableState.DELETING;
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#issuecomment-778272456


   > > I think this PR is good to go. I added `isBeingDeleted()` to Tablet because I already found another spot we can call it to prevent extra work on Deletes. I will do this in another PR.
   > 
   > Hold up, one more fix...
   
   OK NOW I think its good to go.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#issuecomment-778239989


   I think this PR is good to go. I added `isBeingDeleted()` to Tablet because I already found another spot we can call it to prevent extra work on Deletes. I will do this in another PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r568994093



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -556,14 +557,15 @@ private void manageMemory() {
                 continue;
               }
               Tablet tablet = tabletReport.getTablet();
-              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM)) {
-                if (tablet.isClosed()) {
+              var state = context.getTableManager().getTableState(tablet.getExtent().tableId());
+              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM, state)) {
+                if (tablet.isClosed() || state.equals(TableState.DELETING)) {

Review comment:
       If a tablet is in the deleteting state for a long time and holding lots of memory, it seems like it could starve other minor compactions from ever running.  Does this deleting of reports help mitigate that?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime merged pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime merged pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r575213059



##########
File path: server/tserver/src/test/java/org/apache/accumulo/tserver/memory/LargestFirstMemoryManagerTest.java
##########
@@ -204,22 +212,35 @@ protected long getMinCIdleThreshold(KeyExtent extent) {
     protected boolean tableExists(TableId tableId) {
       return true;
     }
+
+    @Override
+    protected boolean tableBeingDeleted(TableId tableId) {
+      return false;
+    }
   }
 
   private static class LargestFirstMemoryManagerWithExistenceCheck
       extends LargestFirstMemoryManagerUnderTest {
 
     Function<TableId,Boolean> existenceCheck;
+    Function<TableId,Boolean> deletingCheck;
 
-    public LargestFirstMemoryManagerWithExistenceCheck(Function<TableId,Boolean> existenceCheck) {
+    public LargestFirstMemoryManagerWithExistenceCheck(Function<TableId,Boolean> existenceCheck,
+        Function<TableId,Boolean> deletingCheck) {
       super();
       this.existenceCheck = existenceCheck;
+      this.deletingCheck = deletingCheck;

Review comment:
       It'd be better to use the primitive functional interface that maps to a boolean: `Predicate<TableId>` instead of `Function<TableId,Boolean>`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r568896442



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java
##########
@@ -1899,7 +1906,8 @@ public void checkIfMinorCompactionNeededForLogs(List<DfsLogger> closedLogs) {
 
     if (reason != null) {
       // initiate and log outside of tablet lock
-      initiateMinorCompaction(MinorCompactionReason.SYSTEM);
+      TableState tableState = context.getTableManager().getTableState(extent.tableId());
+      initiateMinorCompaction(MinorCompactionReason.SYSTEM, tableState);
       log.debug("Initiating minor compaction for {} because {}", getExtent(), reason);

Review comment:
       It could get confusing that this debug statement says it's initiating a flush and occurs immediately after the one inside `initiateMinorCompaction()` that says not to flush.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#issuecomment-771930792


   I tested this enhancement on Uno with 2 tservers running 2 RW MultiTable jobs and a Conditional job for increased pressure. It worked well and I saw a few situations where the largest tablet to be flushed was being deleted and dropped from the memory report. I was able to get the cluster up to 200+ tables with 2000+ tablets before the GC started running low on memory.
   
   <pre>
   2021-02-02T13:59:35,144 [tablet.Tablet] DEBUG: Table 6z is being deleted so don't flush 6z;4;3
   2021-02-02T13:59:35,144 [tserver.TabletServerResourceManager] DEBUG: Cleaned up report for closed/deleted tablet 6z;4;3
   2021-02-02T13:59:35,144 [tserver.TabletServerResourceManager] DEBUG: Ignoring memory manager recommendation: not minor compacting closed tablet 6z;4;3
   </pre>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r568899255



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java
##########
@@ -1899,7 +1906,8 @@ public void checkIfMinorCompactionNeededForLogs(List<DfsLogger> closedLogs) {
 
     if (reason != null) {
       // initiate and log outside of tablet lock
-      initiateMinorCompaction(MinorCompactionReason.SYSTEM);
+      TableState tableState = context.getTableManager().getTableState(extent.tableId());
+      initiateMinorCompaction(MinorCompactionReason.SYSTEM, tableState);
       log.debug("Initiating minor compaction for {} because {}", getExtent(), reason);

Review comment:
       Good catch. I'll move the debug before the method call.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#issuecomment-777705235


   After some testing, I think I am going to change the log statement in `LargestFirstMemoryManager` back to trace. My change in dfc579b will skip over the tablet about 4x a second (which should give flushes better throughput) but can get spammy.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r568899255



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java
##########
@@ -1899,7 +1906,8 @@ public void checkIfMinorCompactionNeededForLogs(List<DfsLogger> closedLogs) {
 
     if (reason != null) {
       // initiate and log outside of tablet lock
-      initiateMinorCompaction(MinorCompactionReason.SYSTEM);
+      TableState tableState = context.getTableManager().getTableState(extent.tableId());
+      initiateMinorCompaction(MinorCompactionReason.SYSTEM, tableState);
       log.debug("Initiating minor compaction for {} because {}", getExtent(), reason);

Review comment:
       Good catch. I'll move the debug before the method call.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#issuecomment-777564622


   @ctubbsii I made another improvement to this PR in dfc579b that you may want to look over. I added another check to prevent tablets being deleted from added to the list. I am not sure if I want to include this and/or the previous change where I remove the tablet from the reports.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r568896442



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java
##########
@@ -1899,7 +1906,8 @@ public void checkIfMinorCompactionNeededForLogs(List<DfsLogger> closedLogs) {
 
     if (reason != null) {
       // initiate and log outside of tablet lock
-      initiateMinorCompaction(MinorCompactionReason.SYSTEM);
+      TableState tableState = context.getTableManager().getTableState(extent.tableId());
+      initiateMinorCompaction(MinorCompactionReason.SYSTEM, tableState);
       log.debug("Initiating minor compaction for {} because {}", getExtent(), reason);

Review comment:
       It could get confusing that this debug statement says it's initiating a flush and occurs immediately after the one inside `initiateMinorCompaction()` that says not to flush.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r574573443



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -556,14 +557,15 @@ private void manageMemory() {
                 continue;
               }
               Tablet tablet = tabletReport.getTablet();
-              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM)) {
-                if (tablet.isClosed()) {
+              var state = context.getTableManager().getTableState(tablet.getExtent().tableId());
+              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM, state)) {
+                if (tablet.isClosed() || state.equals(TableState.DELETING)) {

Review comment:
       I added this change in my latest commits. Let me know what you think.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#issuecomment-771930792


   I tested this enhancement on Uno with 2 tservers running 2 RW MultiTable jobs and a Conditional job for increased pressure. It worked well and I saw a few situations where the largest tablet to be flushed was being deleted and dropped from the memory report. I was able to get the cluster up to 200+ tables with 2000+ tablets before the GC started running low on memory.
   
   <pre>
   2021-02-02T13:59:35,144 [tablet.Tablet] DEBUG: Table 6z is being deleted so don't flush 6z;4;3
   2021-02-02T13:59:35,144 [tserver.TabletServerResourceManager] DEBUG: Cleaned up report for closed/deleted tablet 6z;4;3
   2021-02-02T13:59:35,144 [tserver.TabletServerResourceManager] DEBUG: Ignoring memory manager recommendation: not minor compacting closed tablet 6z;4;3
   </pre>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r569678771



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -556,14 +557,15 @@ private void manageMemory() {
                 continue;
               }
               Tablet tablet = tabletReport.getTablet();
-              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM)) {
-                if (tablet.isClosed()) {
+              var state = context.getTableManager().getTableState(tablet.getExtent().tableId());
+              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM, state)) {
+                if (tablet.isClosed() || state.equals(TableState.DELETING)) {

Review comment:
       Yeah. I haven't seen the MinC max threads (tserver.compaction.minor.concurrent.max=4) consumed by 4 deleted tablets after removing the report. But I think it is still possible.
   
   I am wondering now if I should move the check to inside of `LargestFirstMemoryManager.tabletsToMinorCompact()` so that the deleted tablets will be skipped and not returned in `tabletsToMinorCompact`. This would allow flushes up to `tserver.compaction.minor.concurrent.max` without having to make another pass. Right now the `tabletsToMinorCompact` won't utilize all of the MinC threads if some tablets are being deleted. If all large tablets are being deleted, then no MinC will run during that pass.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#issuecomment-778245506


   > I think this PR is good to go. I added `isBeingDeleted()` to Tablet because I already found another spot we can call it to prevent extra work on Deletes. I will do this in another PR.
   
   Hold up, one more fix...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1899: Prevent deleted tablets from being flushed

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1899:
URL: https://github.com/apache/accumulo/pull/1899#discussion_r574573443



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -556,14 +557,15 @@ private void manageMemory() {
                 continue;
               }
               Tablet tablet = tabletReport.getTablet();
-              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM)) {
-                if (tablet.isClosed()) {
+              var state = context.getTableManager().getTableState(tablet.getExtent().tableId());
+              if (!tablet.initiateMinorCompaction(MinorCompactionReason.SYSTEM, state)) {
+                if (tablet.isClosed() || state.equals(TableState.DELETING)) {

Review comment:
       I added this change in my latest commit dfc579b. Let me know what you think.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org