You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/09/08 17:45:47 UTC

[GitHub] [accumulo] milleruntime opened a new pull request #2263: Split compact method into two more specific methods

milleruntime opened a new pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263


   * Split compact method in CompactionService into to 2 methods:
   getCompactionPlan() and submitCompactionJob()
   * Add getCandidates() method to CompactionPlan
   * Rename other compact() in CompactionManager and CompactionService to submitCompaction()
   * Add javadoc comments to various methods
   * Add final to various private members
   * Drop redundant enum check and boolean assignment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r705451949



##########
File path: core/src/main/java/org/apache/accumulo/core/spi/compaction/CompactionPlan.java
##########
@@ -55,5 +56,13 @@ Builder addJob(short priority, CompactionExecutorId executor,
     CompactionPlan build();
   }
 
+  /**
+   * Return the set of compaction candidates.
+   */
+  Set<CompactableFile> getCandidates();

Review comment:
       > There shouldn't ever be a single CompactableFile by itself right?
   
   @EdColeman reminded me of the use case where a user configures the `AgeOffFilter` and runs a user compaction to drop old data. Do you know where this would occur in CompactableImpl? I can't find it in the new code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r705608288



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/compactions/CompactionService.java
##########
@@ -363,11 +374,11 @@ private void planCompaction(CompactionKind kind, Compactable compactable,
 
       if (!jobs.isEmpty()) {
         log.trace("Submitted compaction plan {} id:{} files:{} plan:{}", compactable.getExtent(),
-            myId, files, plan);
+            myId, plan.getCandidates(), plan);

Review comment:
       I reverted the changes to CompactionPlan but here I changed the method to use the `Compactable.Files` object as opposed to the Optional. I think this cleans it up a bit and should print better info than printing the Optional object.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r705293998



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/compactions/CompactionService.java
##########
@@ -223,15 +224,22 @@ private boolean reconcile(Set<CompactionJob> jobs, Collection<SubmittedJob> subm
     return true;
   }
 
-  public void compact(CompactionKind kind, Compactable compactable,
+  /**
+   * Get compaction plan for the provided compactable tablet and possibly submit for compaction.
+   * Plans get added to the planning queue before calling the planningExecutor to get the plan. If
+   * no files are selected, return. Otherwise, submit the compaction job.
+   */
+  public void submitCompaction(CompactionKind kind, Compactable compactable,
       Consumer<Compactable> completionCallback) {
     Objects.requireNonNull(compactable);
 
+    // add tablet to planning queue and use planningExecutor to get the plan
     if (queuedForPlanning.get(kind).putIfAbsent(compactable.getExtent(), compactable) == null) {
       try {
         planningExecutor.execute(() -> {
           try {
-            planCompaction(kind, compactable, completionCallback);
+            Optional<CompactionPlan> plan = getCompactionPlan(kind, compactable);
+            plan.ifPresent(cp -> submitCompactionJob(cp, compactable, completionCallback));

Review comment:
       Thanks. I like this better. I thought getting the files was part of getting the plan but it looks like the plan is more about the jobs.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r705291592



##########
File path: core/src/main/java/org/apache/accumulo/core/spi/compaction/CompactionPlan.java
##########
@@ -55,5 +56,13 @@ Builder addJob(short priority, CompactionExecutorId executor,
     CompactionPlan build();
   }
 
+  /**
+   * Return the set of compaction candidates.
+   */
+  Set<CompactableFile> getCandidates();

Review comment:
       I was thinking since the `CompactionPlan` is the output of the `CompactionPlanner`, it would have the candidates in it. But now I see that `CompactionJob` has files as well. I will drop this from the plan. I am thinking now, it would be nice to be able to differentiate between the files as to when `CompactableFile` actually gets selected for compaction. Maybe we could do something similar to what I did with `TabletFile` and `StoredTabletFile`. 
   
   Also, trying to write javadoc for `CompactableFile` made me wonder if we should make it a self contained collection since you need at least 2 files to compact them into 1. There shouldn't ever be a single CompactableFile by itself right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r705486999



##########
File path: core/src/main/java/org/apache/accumulo/core/spi/compaction/CompactionPlan.java
##########
@@ -55,5 +56,13 @@ Builder addJob(short priority, CompactionExecutorId executor,
     CompactionPlan build();
   }
 
+  /**
+   * Return the set of compaction candidates.
+   */
+  Set<CompactableFile> getCandidates();

Review comment:
       yeah user initiated compactions will compact a tablet even if it has a single file, unless there is a selection plugin that does something else.  The following code is called by CompactableImpl and actually does the selection.
   
   https://github.com/apache/accumulo/blob/77157661af4e6de790e4b0cd34debd02f69bdce2/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableUtils.java#L444




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime merged pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
milleruntime merged pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r704739058



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/compactions/CompactionService.java
##########
@@ -223,15 +224,22 @@ private boolean reconcile(Set<CompactionJob> jobs, Collection<SubmittedJob> subm
     return true;
   }
 
-  public void compact(CompactionKind kind, Compactable compactable,
+  /**
+   * Get compaction plan for the provided compactable tablet and possibly submit for compaction.
+   * Plans get added to the planning queue before calling the planningExecutor to get the plan. If
+   * no files are selected, return. Otherwise, submit the compaction job.
+   */
+  public void submitCompaction(CompactionKind kind, Compactable compactable,
       Consumer<Compactable> completionCallback) {
     Objects.requireNonNull(compactable);
 
+    // add tablet to planning queue and use planningExecutor to get the plan
     if (queuedForPlanning.get(kind).putIfAbsent(compactable.getExtent(), compactable) == null) {
       try {
         planningExecutor.execute(() -> {
           try {
-            planCompaction(kind, compactable, completionCallback);
+            Optional<CompactionPlan> plan = getCompactionPlan(kind, compactable);
+            plan.ifPresent(cp -> submitCompactionJob(cp, compactable, completionCallback));

Review comment:
       As mentioned in other comments, should probably not add candidates to plan and the reason it was added does not actually log all of the info.  Think the following changes may get the needed information where it needs to go w/o modifying the plan interface.
   
   ```suggestion
               var files = compactable.getFiles(...);
               Optional<CompactionPlan> plan = getCompactionPlan(kind, files, compactable.getExtent());
               plan.ifPresent(cp -> submitCompactionJob(cp, files, compactable.getExtent(), completionCallback));
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/spi/compaction/CompactionPlan.java
##########
@@ -55,5 +56,13 @@ Builder addJob(short priority, CompactionExecutorId executor,
     CompactionPlan build();
   }
 
+  /**
+   * Return the set of compaction candidates.
+   */
+  Set<CompactableFile> getCandidates();

Review comment:
       Thinking this should not be added to SPI.  It has nothing to do with the plan for a compaction, but is a subset of the information that went into creating a plan.   Also I suggested other changes that would not require this change.

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/compactions/CompactionService.java
##########
@@ -363,11 +374,11 @@ private void planCompaction(CompactionKind kind, Compactable compactable,
 
       if (!jobs.isEmpty()) {
         log.trace("Submitted compaction plan {} id:{} files:{} plan:{}", compactable.getExtent(),
-            myId, files, plan);
+            myId, plan.getCandidates(), plan);

Review comment:
       With changes in other comment could revert to the following.  Also this changes is logging less information than it used to. Used to log the Compactable.Files obj now it logging a subset of that info.
   
   ```suggestion
               myId, files, plan);
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r705291592



##########
File path: core/src/main/java/org/apache/accumulo/core/spi/compaction/CompactionPlan.java
##########
@@ -55,5 +56,13 @@ Builder addJob(short priority, CompactionExecutorId executor,
     CompactionPlan build();
   }
 
+  /**
+   * Return the set of compaction candidates.
+   */
+  Set<CompactableFile> getCandidates();

Review comment:
       I was thinking since the `CompactionPlan` is the output of the `CompactionPlanner`, it would have the candidates in it. But now I see that `CompactionJob` has files as well. I will drop this from the plan. I am thinking now, it would be nice to be able to differentiate between the files as to when `CompactableFile` actually gets selected for compaction or not. Maybe we could do something similar to what I did with `TabletFile` and `StoredTabletFile`. 
   
   Also, trying to write javadoc for `CompactableFile` made me wonder if we should make it a self contained collection since you need at least 2 files to compact them into 1. There shouldn't ever be a single CompactableFile by itself right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2263: Split compact method into two more specific methods

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2263:
URL: https://github.com/apache/accumulo/pull/2263#discussion_r705617567



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/compactions/CompactionService.java
##########
@@ -363,11 +374,11 @@ private void planCompaction(CompactionKind kind, Compactable compactable,
 
       if (!jobs.isEmpty()) {
         log.trace("Submitted compaction plan {} id:{} files:{} plan:{}", compactable.getExtent(),
-            myId, files, plan);
+            myId, plan.getCandidates(), plan);

Review comment:
       I also noticed printing the `Optional.empty` after we check if its empty just clutters the log message, so I dropped it:
   <pre>
   TRACE: Compactable returned no files 4;row_0003250000;row_0003000000 USER Optional.empty
   TRACE: Compactable returned no files 4;row_0003250000;row_0003000000 CHOP Optional.empty
   TRACE: Compactable returned no files 4;row_0003500000;row_0003250000 SYSTEM Optional.empty
   TRACE: Compactable returned no files 4;row_0003500000;row_0003250000 SELECTOR Optional.empty
   </pre>




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org