Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/06/28 01:25:54 UTC

[GitHub] [pulsar] rdhabalia opened a new issue #4632: Bookie GC doesn't clean up overreplicated ledgers from entry logs

rdhabalia opened a new issue #4632: Bookie GC doesn't clean up overreplicated ledgers from entry logs
URL: https://github.com/apache/pulsar/issues/4632
 
 
   ### Motivation
   Sometimes, due to overreplication, a bookie holds ledgers that it no longer owns, i.e. the bookie is not part of the ensemble list of those ledgers. In this case, GC identifies the overreplicated ledgers and
   - deletes their index from dbStorage (RocksDB), and
   - tries to delete them from the entry log files.
   
   However, the bookie doesn't delete them from the entry log files because of the change made in [#870](https://github.com/apache/bookkeeper/issues/870), where the bookie avoids deleting a ledger if the znode of that ledger still exists.
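
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not BookKeeper's actual GC code: the class `OverReplicationSketch`, its method names, and the use of plain strings as bookie IDs are all hypothetical. The "desired" rule only restates what this issue asks for and is not necessarily how the linked commit implements it.

```java
import java.util.List;
import java.util.Set;

// Hypothetical model of the GC decision discussed in this issue.
// None of these names exist in BookKeeper; they are for illustration only.
public class OverReplicationSketch {

    // A ledger is overreplicated on a bookie when the bookie still stores
    // its data but appears in none of the ledger's ensembles.
    static boolean isOverReplicated(String bookieId, List<Set<String>> ensembles) {
        for (Set<String> ensemble : ensembles) {
            if (ensemble.contains(bookieId)) {
                return false;
            }
        }
        return true;
    }

    // Behaviour as described in the issue: the ledger is not removed from
    // entry logs while its znode still exists, overreplicated or not.
    static boolean currentGcRemovesFromEntryLog(boolean znodeExists, boolean overReplicated) {
        return !znodeExists;
    }

    // Behaviour the issue asks for (assumption): overreplicated ledgers are
    // also removed from entry logs, even though their znode still exists.
    static boolean desiredGcRemovesFromEntryLog(boolean znodeExists, boolean overReplicated) {
        return !znodeExists || overReplicated;
    }

    public static void main(String[] args) {
        // Ledger whose fragments live on b1..b3; bookie b4 holds a stale copy.
        List<Set<String>> ensembles = List.of(Set.of("b1", "b2"), Set.of("b2", "b3"));
        boolean over = isOverReplicated("b4", ensembles);
        System.out.println("overreplicated on b4: " + over);
        System.out.println("current GC removes:   " + currentGcRemovesFromEntryLog(true, over));
        System.out.println("desired GC removes:   " + desiredGcRemovesFromEntryLog(true, over));
    }
}
```

The gap between the two rules is the case this issue is about: the znode exists (the ledger is live elsewhere), yet this bookie is no longer in any ensemble, so its copy in the entry logs is never reclaimed.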
   
   As a result, the bookie ends up storing a large number of entry log files containing ledgers that are owned by other bookies. It also causes OOM when GC tries to process such a large number of entry log files.
   
   ### Fix
   
   1. The OOM should be addressed by [1949](https://github.com/apache/bookkeeper/pull/1949) or [1938](https://github.com/apache/bookkeeper/pull/1938).
   2. The cleanup of overreplicated ledgers that are owned by other bookies should be fixed by [this commit](https://github.com/rdhabalia/bookkeeper/commit/67a375c28e595d12395754d5244c7ea926d3d62d).

   I will create a PR with this fix.
