Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2021/03/04 04:23:26 UTC

[GitHub] [bookkeeper] klwilson227 opened a new issue #2638: Compaction Limits

klwilson227 opened a new issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638


   **FEATURE REQUEST**
   
   Currently the compaction feature supports two compaction levels (minor and major) that define how the system performs compaction. Each of them can be given an interval and a threshold: the interval sets the time between runs, and the threshold sets the usage percentage at or below which an entry log is considered for compaction. There is also a rate limit (on either bytes or entries written) that can be used to balance the compaction load against the system load.
   
   The issue we observe is that there is no limit on the amount of time spent in compaction, nor on the number of logs compacted in a run. This may delay garbage collection, which may in turn affect other aspects of the system. In a heavily loaded system the number of entry logs can grow, which extends the compaction time.
   
   While compaction is useful, the time spent on it relative to processing messages should be a bounded cost. If the system is under heavy load, compaction is delayed as well.
   
   - What is the impact of continually delaying garbage collection? Can (or should) garbage collection be interleaved with compaction of the logs when a compaction run extends past the GC interval?
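
To make the request concrete, here is a minimal sketch of a compaction pass bounded by both a log count and a time budget. It is not BookKeeper code: `EntryLog`, `compact_with_limits`, `max_logs`, `max_seconds` and `compact_one` are hypothetical names, and it assumes a log is eligible when its usage is below the threshold.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class EntryLog:
    log_id: int    # hypothetical: entry log identifier
    usage: float   # fraction of the log still holding live data, 0.0 to 1.0


def compact_with_limits(
    logs: Iterable[EntryLog],
    threshold: float,
    max_logs: int,
    max_seconds: float,
    compact_one: Callable[[EntryLog], None],
    clock: Callable[[], float] = time.monotonic,
) -> List[EntryLog]:
    """Compact eligible logs until either limit is hit.

    Returns the eligible logs that were NOT compacted, so a later run
    can pick them up.
    """
    # Assumption: a log is eligible when its usage is below the threshold.
    eligible = [log for log in logs if log.usage < threshold]
    deadline = clock() + max_seconds
    done = 0
    for log in eligible:
        # Stop as soon as the count limit or the time budget is exhausted.
        if done >= max_logs or clock() >= deadline:
            break
        compact_one(log)
        done += 1
    return eligible[done:]
```

Returning the leftover logs is one possible design: the work that did not fit in this run is carried into the following intervals instead of extending the current one.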
   
   
    
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [bookkeeper] codelipenghui commented on issue #2638: Compaction Limits

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638#issuecomment-790279333


   Thanks @klwilson227, we will take a look at this issue soon.





[GitHub] [bookkeeper] codelipenghui commented on issue #2638: Compaction Limits

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638#issuecomment-791058825


   Yes, agreed. I also did not see anything in the code regarding the ordering of the compaction. I had a short conversation with @jiazhai yesterday; we will work on an enhancement ASAP.





[GitHub] [bookkeeper] klwilson227 commented on issue #2638: Compaction Limits

Posted by GitBox <gi...@apache.org>.
klwilson227 commented on issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638#issuecomment-790657578


   I did not see anything in the code regarding the ordering of the compaction. If the time frame is limited, then the order becomes important as well. By tackling the oldest logs first we may free up more space more quickly than by working backwards from the latest logs.
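
A minimal sketch of the oldest-first ordering, under the assumption that a lower entry log id means an older log. `oldest_first` and its arguments are hypothetical names, not part of BookKeeper.

```python
from typing import Dict, List


def oldest_first(usage_by_log_id: Dict[int, float], threshold: float) -> List[int]:
    """Return ids of logs below the usage threshold, lowest id first.

    Assumption: entry log ids grow over time, so the lowest id is the oldest log.
    """
    return sorted(
        log_id for log_id, usage in usage_by_log_id.items() if usage < threshold
    )


# Logs 1, 3 and 7 are below the 0.3 threshold; log 5 is not.
print(oldest_first({7: 0.1, 3: 0.25, 5: 0.8, 1: 0.05}, 0.3))  # [1, 3, 7]
```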





[GitHub] [bookkeeper] klwilson227 commented on issue #2638: Compaction Limits

Posted by GitBox <gi...@apache.org>.
klwilson227 commented on issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638#issuecomment-791483333


   The other thought I had: should compaction be staggered across the bookies within the interval, so that not all bookies compact at the same time? By staggering the bookies, we avoid adding extra IOPS on all bookies at once, which would cause a larger ripple in the incoming rates.
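
One way staggering could work, sketched under the assumption that each bookie has a stable string id: hash the id to a start offset inside the compaction interval, so no coordination between bookies is needed. `stagger_offset_seconds` is a hypothetical helper, not an existing BookKeeper setting.

```python
import hashlib


def stagger_offset_seconds(bookie_id: str, interval_seconds: int) -> int:
    """Map a bookie id to a stable start offset in [0, interval_seconds)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # A cryptographic hash is used only for its even spread and stability
    # across restarts; Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(bookie_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_seconds
```

Each bookie would delay its first compaction by its own offset and then keep the normal interval. Two bookies can still land on nearby offsets, so this only spreads the load on average.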
   





[GitHub] [bookkeeper] klwilson227 commented on issue #2638: Compaction Limits

Posted by GitBox <gi...@apache.org>.
klwilson227 commented on issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638#issuecomment-791480811


   With regard to compaction: in our latest CHO run with compaction enabled, there was a point at which over 200 logs per bookie were eligible for compaction (usage at .3 and below). The trend on a single node looked something like the following:
        08:37:29.809+0000,,smperfqa37.hpeswlab.net,helmcharts,itomdipulsar-,bookkeeper-2,"08:37:29.809 [GarbageCollectorThread-12-1] INFO  org.apache.bookkeeper.bookie.GarbageCollectorThread - Compaction: entry log usage buckets[10% 20% 30% 40% 50% 60% 70% 80% 90% 100%] = [0,         0, 0, 1, 2, 25, 21, 17, 10, 98]  
        09:57:53.756+0000,,smperfqa37.hpeswlab.net,helmcharts,itomdipulsar-,bookkeeper-2,"09:57:53.756 [GarbageCollectorThread-12-1] INFO  org.apache.bookkeeper.bookie.GarbageCollectorThread - Compaction: entry log usage buckets[10% 20% 30% 40% 50% 60% 70% 80% 90% 100%] = [0,         6, 2, 6, 35, 8, 4, 13, 4, 147] 
        11:10:08.514+0000,,smperfqa37.hpeswlab.net,helmcharts,itomdipulsar-,bookkeeper-2,"11:10:08.514 [GarbageCollectorThread-12-1] INFO  org.apache.bookkeeper.bookie.GarbageCollectorThread - Compaction: entry log usage buckets[10% 20% 30% 40% 50% 60% 70% 80% 90% 100%] = [0,         8, 2, 2, 8, 7, 11, 63, 15, 111]  
        12:36:47.082+0000,,smperfqa37.hpeswlab.net,helmcharts,itomdipulsar-,bookkeeper-2,"12:36:47.082 [GarbageCollectorThread-12-1] INFO  org.apache.bookkeeper.bookie.GarbageCollectorThread - Compaction: entry log usage buckets[10% 20% 30% 40% 50% 60% 70% 80% 90% 100%] = [0,         194, 3, 4, 1, 0, 0, 0, 0, 160]    
   
    201 logs were removed in the last compaction, which took some time and caused other cascading issues in the system. With the proposed change, we could limit the work done during this anomaly, in which many logs became eligible for compaction at the same time, and spread the processing out over the coming intervals. Handling spikes during compaction should not affect the rate of incoming data.





[GitHub] [bookkeeper] klwilson227 commented on issue #2638: Compaction Limits

Posted by GitBox <gi...@apache.org>.
klwilson227 commented on issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638#issuecomment-791513334


   The precipitating event for the above issues may have been a BookKeeper restart, which forced movement of ledger data via auto recovery.





[GitHub] [bookkeeper] klwilson227 edited a comment on issue #2638: Compaction Limits

Posted by GitBox <gi...@apache.org>.
klwilson227 edited a comment on issue #2638:
URL: https://github.com/apache/bookkeeper/issues/2638#issuecomment-791480811


    When 200+ entry logs had to be cleaned up, the compaction runs took anywhere from 10 minutes to over an hour.

