Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/03/18 22:49:42 UTC

[GitHub] [accumulo] keith-turner edited a comment on issue #1036: fixes #1033 optimize default compaction strategy

URL: https://github.com/apache/accumulo/pull/1036#issuecomment-474127431
 
 
   I tweaked the simulation a bit.  The simulation does a single compaction after each batch of added files. I changed it to also keep running compactions at the end for as long as they are needed.  I also tracked the number of files per tablet during the simulation and found that the new code has fewer files on average.  Having fewer files on average offsets the increased amount of work.
   
    Files added between compactions | Old rewrite | Old final file count | Old avg files 
   ----------|----|---|----
   1 | 4,412,000 |  8 |  6.30
   2 |  9,184,000 |  5  | 6.38
   3 |  15,240,000 |  6 |  6.35
   4 |  19,336,000 |  7  | 6.32
   5 |  27,040,000 |  6  | 6.31
   6 |  29,250,000 |  9  | 6.37
   7 |  37,499,000 |  9  | 6.34
   8 |  58,576,000 |  6  | 6.43
   9 |  1,138,500,000 |  1  | 6.16
   
    Files added between compactions | New rewrite | New final file count | New avg files
   ----------|----|---|----
   1 | 4,416,000 | 6 | 5.97
   2 | 9,184,000 | 5 | 6.21
   3 | 15,240,000 | 6 | 6.13
   4 | 21,304,000 | 5 | 6.09
   5 | 25,495,000 | 10 | 6.07
   6 | 33,984,000 | 8 | 6.04
   7 | 41,874,000 | 8 | 6.03
   8 | 39,684,000 | 1 | 6.99
   9 | 46,295,000 | 1 | 7.86
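
One way to read the two tables side by side is to compute the relative change per row. The sketch below does only that arithmetic; the numbers are copied from the tables above, and the class and method names (`CompactionComparison`, `percentChange`) are made up for illustration and are not part of the simulation.

```java
public class CompactionComparison {
    // Values copied from the two tables above; index 0 corresponds to n = 1.
    static final long[] OLD_REWRITE = {4_412_000L, 9_184_000L, 15_240_000L, 19_336_000L,
        27_040_000L, 29_250_000L, 37_499_000L, 58_576_000L, 1_138_500_000L};
    static final long[] NEW_REWRITE = {4_416_000L, 9_184_000L, 15_240_000L, 21_304_000L,
        25_495_000L, 33_984_000L, 41_874_000L, 39_684_000L, 46_295_000L};
    static final double[] OLD_AVG = {6.30, 6.38, 6.35, 6.32, 6.31, 6.37, 6.34, 6.43, 6.16};
    static final double[] NEW_AVG = {5.97, 6.21, 6.13, 6.09, 6.07, 6.04, 6.03, 6.99, 7.86};

    /** Percentage change from oldValue to newValue; negative means the new value is smaller. */
    static double percentChange(double oldValue, double newValue) {
        return (newValue - oldValue) / oldValue * 100.0;
    }

    public static void main(String[] args) {
        System.out.println("n  rewrite change  avg files change");
        for (int i = 0; i < OLD_REWRITE.length; i++) {
            System.out.printf("%d  %+.1f%%  %+.1f%%%n", i + 1,
                percentChange(OLD_REWRITE[i], NEW_REWRITE[i]),
                percentChange(OLD_AVG[i], NEW_AVG[i]));
        }
    }
}
```

For example, at 4 files added between compactions the rewrite goes from 19,336,000 to 21,304,000 (about 10% more) while the average file count drops from 6.32 to 6.09. Rows 8 and 9 move the other way: the rewrite drops and the average file count rises.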
   
   Below is the Java code for the simulation that produced these numbers, excluding the simulated tablet code.
   
   ```java
   // Accumulates file-count samples. It is declared outside the loop, so the
   // printed average also includes samples from earlier values of n.
   LongSummaryStatistics lss = new LongSummaryStatistics();

   // n is the number of files added between compactions
   for (int n = 1; n < 10; n++) {
     // create a simulated tablet with max files per compaction of 10 and max files per tablet of 15
     SimulatedTablet simuTablet = new SimulatedTablet(10, 15);

     for (int i = 0; i < 1000; i++) {
       simuTablet.addFiles(n, 1000, 10);

       simuTablet.compact(MajorCompactionReason.NORMAL);

       // sample the file count after each compaction attempt
       lss.accept(simuTablet.getNumFiles());
     }

     // compact tablet until it no longer compacts
     while (simuTablet.compact(MajorCompactionReason.NORMAL) > 0) {
       lss.accept(simuTablet.getNumFiles());
     }

     // columns: n, total read, final file count, average file count
     System.out.printf("%d %,d %d %.2f\n", n, simuTablet.getTotalRead(), simuTablet.getNumFiles(), lss.getAverage());
   }
   ```
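
Since the simulated tablet code is not shown, the loop above cannot be run as posted. The following is a hypothetical stand-in, `ToyTablet`, that makes the same loop shape runnable. Everything about it is an assumption: the two-argument `addFiles(count, size)`, the no-argument `compact()`, and the policy of merging the smallest files once the tablet exceeds its file limit. It is not the compaction strategy under review, so it will not reproduce the numbers in the tables.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.LongSummaryStatistics;

/** Hypothetical stand-in for the omitted simulated tablet; not Accumulo code. */
public class ToyTablet {
    private final List<Long> fileSizes = new ArrayList<>();
    private final int maxFilesPerCompaction;
    private final int maxFilesPerTablet;
    private long totalRead = 0;

    public ToyTablet(int maxFilesPerCompaction, int maxFilesPerTablet) {
        if (maxFilesPerCompaction < 2 || maxFilesPerTablet < 1) {
            throw new IllegalArgumentException("need at least 2 files per compaction and 1 per tablet");
        }
        this.maxFilesPerCompaction = maxFilesPerCompaction;
        this.maxFilesPerTablet = maxFilesPerTablet;
    }

    /** Adds count files, each of the given size (assumed meaning of the arguments). */
    public void addFiles(int count, long size) {
        for (int i = 0; i < count; i++) {
            fileSizes.add(size);
        }
    }

    /**
     * Assumed policy: once the tablet holds more than maxFilesPerTablet files,
     * merge the smallest ones (up to maxFilesPerCompaction) into a single file.
     * Returns the number of files merged, or 0 if no compaction ran.
     */
    public int compact() {
        if (fileSizes.size() <= maxFilesPerTablet) {
            return 0;
        }
        Collections.sort(fileSizes);
        int toMerge = Math.min(maxFilesPerCompaction, fileSizes.size());
        long mergedSize = 0;
        for (int i = 0; i < toMerge; i++) {
            mergedSize += fileSizes.remove(0); // remove the current smallest file
        }
        fileSizes.add(mergedSize);
        totalRead += mergedSize; // every merged byte is read once
        return toMerge;
    }

    public int getNumFiles() { return fileSizes.size(); }

    public long getTotalRead() { return totalRead; }

    public static void main(String[] args) {
        for (int n = 1; n < 10; n++) {
            ToyTablet tablet = new ToyTablet(10, 15);
            // Statistics are kept per n here; that is a choice of this sketch.
            LongSummaryStatistics stats = new LongSummaryStatistics();
            for (int i = 0; i < 1000; i++) {
                tablet.addFiles(n, 1000);
                tablet.compact();
                stats.accept(tablet.getNumFiles());
            }
            // keep compacting until nothing more is selected
            while (tablet.compact() > 0) {
                stats.accept(tablet.getNumFiles());
            }
            System.out.printf("%d %,d %d %.2f%n", n, tablet.getTotalRead(),
                tablet.getNumFiles(), stats.getAverage());
        }
    }
}
```

Swapping in a different selection rule inside `compact()` and rerunning `main` is the kind of comparison the old and new tables represent, with total bytes read and average file count as the two measures.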
