Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2020/03/03 22:20:42 UTC

[GitHub] [accumulo] ctubbsii opened a new issue #1543: Use batching strategy for SimpleGarbageCollector candidate memory utilization

ctubbsii opened a new issue #1543: Use batching strategy for SimpleGarbageCollector candidate memory utilization
URL: https://github.com/apache/accumulo/issues/1543
 
 
   The SimpleGarbageCollector (SGC) tries to gather, check, and delete candidate files for garbage collection without running out of memory. It currently does this by monitoring its own memory utilization while filling its internal data structure of candidates. Before memory is full (or nearly full), it processes the candidates gathered so far, clears the data structure, and resumes collecting more candidates.
   
   This lets the SGC handle as many candidates as possible in each pass. However, it also makes memory utilization unstable and harder to predict.
   
   A similar problem occurred while testing Upgrader9to10's memory utilization. It was fixed in #1441 by batching candidates in approximately 8MB chunks instead of monitoring memory at runtime.
   
   The SimpleGarbageCollector should be changed to employ a similar batching strategy. The batch size could be configurable to give system administrators more control over maximum memory utilization by the SGC.
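
   As a rough illustration of the proposed strategy, the sketch below groups candidates into batches bounded by an approximate byte size instead of watching runtime memory. It is not the existing SGC or Upgrader9to10 code: the class name CandidateBatcher, its constructor parameters, and the use of UTF-8 path length as the size estimate are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical helper (not part of Accumulo): wraps an iterator of candidate
 * paths and hands them out in batches of roughly maxBatchBytes each.
 */
final class CandidateBatcher implements Iterator<List<String>> {
  private final Iterator<String> candidates;
  private final long maxBatchBytes;

  CandidateBatcher(Iterator<String> candidates, long maxBatchBytes) {
    if (maxBatchBytes <= 0) {
      throw new IllegalArgumentException("maxBatchBytes must be positive");
    }
    this.candidates = candidates;
    this.maxBatchBytes = maxBatchBytes;
  }

  @Override
  public boolean hasNext() {
    return candidates.hasNext();
  }

  @Override
  public List<String> next() {
    if (!candidates.hasNext()) {
      throw new NoSuchElementException();
    }
    List<String> batch = new ArrayList<>();
    long batchBytes = 0;
    // Keep adding until the running size estimate reaches the limit. The
    // check happens before each add, so a batch always holds at least one
    // candidate and may exceed the limit by at most one candidate's size.
    while (candidates.hasNext() && batchBytes < maxBatchBytes) {
      String candidate = candidates.next();
      batch.add(candidate);
      // Assumption: the UTF-8 length of the path approximates its memory cost.
      batchBytes += candidate.getBytes(StandardCharsets.UTF_8).length;
    }
    return batch;
  }
}
```

   A caller would loop while hasNext() is true, check and delete each returned batch, and then let the batch be garbage collected before requesting the next one. Passing something like 8L * 1024 * 1024 as maxBatchBytes would mirror the approximately 8MB chunks mentioned above, and a configurable batch size would simply be fed into that parameter.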
