Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/09/04 14:18:22 UTC

[GitHub] [accumulo] ctubbsii opened a new issue #1345: GC forces metadata compaction every cycle

URL: https://github.com/apache/accumulo/issues/1345
 
 
   The SimpleGarbageCollector class calls `.compact(...)` on the metadata every cycle, with both the `flush` and `wait` flags set to `true`. The operating assumption is that the GC made many changes to the metadata, so it would be good to flush and compact.
   
   However, we probably don't need to force a compaction; a flush is probably enough. If the metadata table's compaction ratio is still set to the default of `1` (set during initialization), the flush alone will be enough to trigger a compaction on the changed tablets (but not on the unmodified ones). This is less aggressive than forcing a full table compaction on the metadata, and it also lets users select a different compaction ratio for the metadata tables, if they choose to do so.
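   
   To illustrate why a flush alone should be enough at a ratio of `1`, here is a minimal, self-contained sketch of ratio-based file selection. It is a simplified model, not Accumulo's actual compaction code: the class name `CompactionRatioSketch`, the method `selectFiles`, and the exact boundary condition (`>=`) are assumptions made for illustration, and real selection also honors limits such as a maximum number of files. The rule modeled is: compact a set of files when their total size is at least the ratio times the largest file in the set; otherwise drop the largest file and check again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model of ratio-based compaction selection.
final class CompactionRatioSketch {

    // Returns the file sizes that would be compacted together,
    // or an empty list if no set of two or more files meets the ratio.
    static List<Long> selectFiles(List<Long> fileSizes, double ratio) {
        List<Long> candidates = new ArrayList<>(fileSizes);
        // Sort largest first so the largest file is always at index 0.
        candidates.sort(Collections.reverseOrder());

        while (candidates.size() > 1) {
            long total = 0;
            for (long size : candidates) {
                total += size;
            }
            long largest = candidates.get(0);
            // Assumed boundary: the set qualifies when total >= ratio * largest.
            if (total >= ratio * largest) {
                return candidates;
            }
            // Otherwise drop the largest file and re-check the remainder.
            candidates.remove(0);
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        // Unmodified tablet: one file, nothing to compact.
        System.out.println(selectFiles(Arrays.asList(100L), 1.0));
        // Tablet changed by the GC and then flushed: a small new file appears.
        System.out.println(selectFiles(Arrays.asList(100L, 1L), 1.0));
        // Same tablet with a higher ratio chosen by the user.
        System.out.println(selectFiles(Arrays.asList(100L, 1L), 3.0));
    }
}
```

   Under this model, at a ratio of `1` any tablet that gains a second file from the flush qualifies for compaction, while a tablet that still has a single file is left alone. With a higher ratio, the same flush would not necessarily trigger a compaction, which is the flexibility a forced compaction takes away.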
   
   Some other ideas that could be done as separate issues (if necessary): eliminate the automatic flush from the GC entirely, and put this responsibility on the user, or implement a custom compaction strategy that makes use of summarizers to compact the metadata more efficiently (only when there's a clear benefit).
