Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2017/10/22 19:16:19 UTC

[GitHub] janl commented on issue #909: Compaction daemon should prioritise based on gains

janl commented on issue #909: Compaction daemon should prioritise based on gains
URL: https://github.com/apache/couchdb/issues/909#issuecomment-338501774
 
 
   +1
   
   This should not be too hard a fix. The main compaction daemon loop works off of _all_docs: https://github.com/apache/couchdb/blob/master/src/couch/src/couch_compaction_daemon.erl#L129
   
   The code to calculate which database would benefit most can be scavenged from here: https://github.com/apache/couchdb/blob/master/src/couch/src/couch_compaction_daemon.erl#L318-L323
   
   With both of these entry points, the algorithm in compact_loop should now:
   1. calculate size info for all dbs/shards and put it into a table sorted by expected gain
   2. compact the entry at the top of the table
   3. GOTO 1.
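
The loop above can be sketched roughly as follows. This is an illustration in Python, not the Erlang daemon code: `DbInfo`, `reclaimable`, `fetch_infos`, `compact`, `min_gain` and `max_rounds` are hypothetical names, and measuring the gain as file size minus live data size is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class DbInfo:
    # Hypothetical size record for one db/shard (not a CouchDB structure).
    name: str
    file_size: int  # bytes on disk
    data_size: int  # bytes of live data


def reclaimable(info):
    # Assumed gain metric: bytes that compaction could free.
    return max(info.file_size - info.data_size, 0)


def compact_loop(fetch_infos, compact, min_gain=1, max_rounds=1000):
    """Compact the biggest-gain entry, then recompute the table and repeat."""
    compacted = []
    for _ in range(max_rounds):  # guard in case compaction frees nothing
        # 1. size info for all dbs/shards, sorted by expected gain
        table = sorted(fetch_infos(), key=reclaimable, reverse=True)
        if not table or reclaimable(table[0]) < min_gain:
            break
        # 2. compact the entry at the top of the table
        compact(table[0])
        compacted.append(table[0].name)
        # 3. GOTO 1 (next loop iteration)
    return compacted
```

Recomputing the whole table on every pass is the simplest reading of "GOTO 1"; a real implementation might want to refresh only the entry that was just compacted.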
   
   Bit of handwaving about views. Two options:
   
   1. views go into the main list of things that can be compacted, and the biggest-benefit views go on top, just as a database would.
   2. when calculating database fragmentation, include the view fragmentation as well, and then compact the database and its views (with or without the parallel option as it does now), as a batch before moving on to the next.
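
To make the difference between the two options concrete, here is a small Python sketch of how the sorted table would be built in each case. The dictionary layout and the function names `option1_entries` and `option2_entries` are made up for the illustration, and the gain is again assumed to be file size minus data size.

```python
def gain(item):
    # Assumed gain metric: reclaimable bytes.
    return item["file_size"] - item["data_size"]


def option1_entries(dbs):
    """Option 1: databases and views share one list, each ranked on its own."""
    entries = []
    for db in dbs:
        entries.append((gain(db), "db", db["name"]))
        for view in db["views"]:
            entries.append((gain(view), "view", db["name"] + "/" + view["name"]))
    return sorted(entries, key=lambda e: e[0], reverse=True)


def option2_entries(dbs):
    """Option 2: one entry per database; its gain includes its views,
    and the database and its views are compacted together as a batch."""
    entries = []
    for db in dbs:
        total = gain(db) + sum(gain(view) for view in db["views"])
        entries.append((total, db["name"]))
    return sorted(entries, key=lambda e: e[0], reverse=True)


example = [
    {"name": "a", "file_size": 100, "data_size": 80,
     "views": [{"name": "v1", "file_size": 500, "data_size": 100}]},
    {"name": "b", "file_size": 300, "data_size": 100, "views": []},
]
```

With this sample data, option 1 ranks the view `a/v1` first and database `a` last, while option 2 ranks database `a` (with its view) ahead of `b`.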
   
   My gut feeling is that 1. makes a little more sense, but I would take 2. if the PR is easier, and leave 1. for a later patch.
   
   @calonso wanna give this a shot?
