Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/03/18 20:50:51 UTC

[GitHub] [accumulo] keith-turner commented on issue #1036: fixes #1033 optimize default compaction strategy

keith-turner commented on issue #1036: fixes #1033 optimize default compaction strategy
URL: https://github.com/apache/accumulo/pull/1036#issuecomment-474095604
 
 
   The table below shows the results of simulating the old and new compaction strategies.  In each simulation, N 1K files were added to a tablet and the tablet was then compacted; this add-then-compact step was repeated 1000 times.
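
The add-then-compact loop described above can be sketched roughly as follows. This is only an illustration of the harness, not the code used for the results in this comment: `select_files`, `simulate`, and the ratio of 3.0 are hypothetical stand-ins, the selection rule is a simple size-ratio heuristic rather than either actual strategy, and "compacted" is assumed to mean running the rule until it selects nothing. It is not expected to reproduce the numbers in the table.

```python
def select_files(sizes, ratio=3.0):
    """Hypothetical selection rule (an assumption, not Accumulo's code).

    Returns the largest set of files whose total size is at least
    `ratio` times the largest file in the set, or [] if none qualifies.
    """
    candidates = sorted(sizes)
    while len(candidates) > 1:
        if sum(candidates) >= ratio * candidates[-1]:
            return candidates
        candidates.pop()  # drop the largest file and try again
    return []


def simulate(files_per_round, rounds=1000, file_size=1000, ratio=3.0):
    """Add `files_per_round` files, compact, and repeat `rounds` times.

    Returns (total data added, total data rewritten by compactions).
    """
    tablet = []      # sizes of the files currently in the tablet
    rewritten = 0    # running total of data written by compactions
    for _ in range(rounds):
        tablet.extend([file_size] * files_per_round)
        # Assumption: keep compacting until the rule selects nothing.
        while True:
            chosen = select_files(tablet, ratio)
            if not chosen:
                break
            for size in chosen:
                tablet.remove(size)
            merged = sum(chosen)
            tablet.append(merged)
            rewritten += merged
    added = files_per_round * rounds * file_size
    return added, rewritten


if __name__ == "__main__":
    for n in range(1, 10):
        added, rewritten = simulate(n)
        print(n, added, rewritten)
```

Comparing two strategies with a harness like this would mean swapping in a different `select_files` and comparing the `rewritten` totals for the same inputs.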
   
   Files added between compactions | Total data added | Old code rewrite | New code rewrite | Relative work (new rewrite / old rewrite)
   ----------------|-----|----------------|----------------|------
   1 | 1,000,000 | 4,412,000 | 4,416,000 | 1.00
   2 | 2,000,000 | 9,184,000 | 9,184,000 | 1.00
   3 | 3,000,000 | 15,240,000 | 15,240,000 | 1.00
   4 | 4,000,000 | 19,336,000 | 21,304,000 | 1.10
   5 | 5,000,000 | 27,040,000 | 25,495,000 | 0.94
   6 | 6,000,000 | 29,250,000 | 33,984,000 | 1.16
   7 | 7,000,000 | 37,499,000 | 41,874,000 | 1.12
   8 | 8,000,000 | 58,576,000 | 31,507,000 | 0.54
   9 | 9,000,000 | 1,129,500,000 | 34,936,000 | 0.03
   
   The old strategy performed slightly better when adding 4, 6, or 7 files between compactions.  I don't have a solid understanding of why yet and need to dig into it some more.  The case of adding 9 files between compactions is significantly better: the new strategy does only 3% of the work done by the old strategy.  This is the case that motivated this change.
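
As a quick check on the figures, the relative-work column can be recomputed from the table as new rewrite divided by old rewrite. The sketch below also derives a rewrite-to-added ratio for each strategy; that second ratio is not in the table and is added here only for illustration. The names `ROWS` and `relative_work` are made up for this example.

```python
# (files added between compactions, total data added,
#  old code rewrite, new code rewrite), copied from the table.
ROWS = [
    (1, 1_000_000, 4_412_000, 4_416_000),
    (2, 2_000_000, 9_184_000, 9_184_000),
    (3, 3_000_000, 15_240_000, 15_240_000),
    (4, 4_000_000, 19_336_000, 21_304_000),
    (5, 5_000_000, 27_040_000, 25_495_000),
    (6, 6_000_000, 29_250_000, 33_984_000),
    (7, 7_000_000, 37_499_000, 41_874_000),
    (8, 8_000_000, 58_576_000, 31_507_000),
    (9, 9_000_000, 1_129_500_000, 34_936_000),
]


def relative_work(rows):
    """Return one dict per row with the derived ratios."""
    results = []
    for files, added, old, new in rows:
        results.append({
            "files": files,
            "relative": new / old,      # the table's last column
            "old_per_added": old / added,  # data rewritten per unit added
            "new_per_added": new / added,
        })
    return results


if __name__ == "__main__":
    for r in relative_work(ROWS):
        print(f"{r['files']}: relative={r['relative']:.2f} "
              f"old/added={r['old_per_added']:.1f} "
              f"new/added={r['new_per_added']:.1f}")
```

For the 9-file case this gives roughly 125.5 units rewritten per unit added under the old code versus about 3.9 under the new code, which is where the 3% figure comes from.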
