Posted to dev@bookkeeper.apache.org by Enrico Olivelli <eo...@gmail.com> on 2019/09/27 08:01:44 UTC

Garbage collection of Journal files

Hello,
I am seeing weird behaviour in the garbage collection of journal files on
a low-traffic bookie.

Let's say that I have a bookie with this data in the journal directory:


drwxrwxr-x. 2      65536 27 set 09.01 .
drwxrwxr-x. 3       4096  9 gen  2019 ..
-rw-rw-r--. 1 1107296256 24 set 18.08 168331bd95f.txn
-rw-rw-r--. 1 1107296256 26 set 07.46 168331bd960.txn
-rw-rw-r--. 1  704643072 26 set 15.21 168331bd961.txn
-rw-rw-r--. 1  301989888 27 set 08.57 168331bd962.txn
-rw-rw-r--. 1   16777216 27 set 08.57 168331bd963.txn
-rw-rw-r--. 1   16777216 27 set 09.01 168331bd964.txn
-rw-rw-r--. 1        208  9 gen  2019 VERSION


The last mark reports:

LastLogMark : Journal Id - 1547045689698(168331bd962.txn), Pos - 301541888

In the logs I see only lines like:

Flush ledger storage at checkpoint CheckpointList{checkpoints=[LogMark:
logFileId - 1547045689698 , logFileOffset - 301541888]}


On **other**, heavy-traffic bookies I sometimes see this line:

garbage collected journal 150cdc02736.txn


As I have maxJournalBackups=1, I would expect the bookie to drop the
files before 168331bd962.txn and release disk space, but this is
not actually happening.
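
To make the expectation concrete, here is a minimal sketch of which files I would expect to be reclaimed. It only relies on the fact, visible in the LastLogMark output above, that a journal file name is the hex form of the journal id plus ".txn". The class and method names are hypothetical, and the retention rule (keep maxJournalBackups files older than the marked journal, delete the rest) is my assumption about the intended behaviour, not a copy of the code in Journal.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of BookKeeper.
class JournalRetentionSketch {

    // A journal file name is the hex journal id followed by ".txn".
    static long parseJournalId(String fileName) {
        String hex = fileName.substring(0, fileName.length() - ".txn".length());
        return Long.parseLong(hex, 16);
    }

    // ASSUMPTION: among the journals older than the marked one, the newest
    // `maxBackups` are kept and everything before them may be deleted.
    static List<String> eligibleForDeletion(List<String> fileNames, long markedId, int maxBackups) {
        List<Long> older = new ArrayList<>();
        for (String name : fileNames) {
            if (!name.endsWith(".txn")) {
                continue; // skip VERSION and other non-journal files
            }
            long id = parseJournalId(name);
            if (id < markedId) {
                older.add(id);
            }
        }
        Collections.sort(older);
        int deletable = Math.max(0, older.size() - maxBackups);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < deletable; i++) {
            result.add(Long.toHexString(older.get(i)) + ".txn");
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
                "168331bd95f.txn", "168331bd960.txn", "168331bd961.txn",
                "168331bd962.txn", "168331bd963.txn", "168331bd964.txn", "VERSION");
        long markedId = 1547045689698L; // Journal Id from LastLogMark
        System.out.println(Long.toHexString(markedId) + ".txn");
        System.out.println(eligibleForDeletion(files, markedId, 1));
    }
}
```

Under that assumption, with the listing above and maxJournalBackups=1, the two oldest files (about 2 GB in total) would be candidates for deletion, while 168331bd961.txn would be kept as the single backup.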


I see that in Journal.java we do not delete data if the checkpoint is
not a LogMarkCheckpoint:
https://github.com/apache/bookkeeper/blob/afd8dbadd8a205fe4de4daa9d0c09680effab49d/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java#L735

It seems to me that the actual class is a CheckpointList that is wrapping a
LogMarkCheckpoint.

I think that a LogMarkCheckpoint can reach that line of code only in
case of a hard flush triggered by reaching the entry log file size limit:

https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/SortedLedgerStorage.java#L302
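
The situation I suspect can be sketched with stand-in types. Everything below is hypothetical and only mimics the shape of the problem: the classes reuse the names from the log output but are not the BookKeeper classes, and the guard is a plain instanceof test as I read it at the linked line, which may not match the real code exactly.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; these are NOT the BookKeeper classes.
class CheckpointWrappingSketch {

    interface Checkpoint {
    }

    static final class LogMarkCheckpoint implements Checkpoint {
        final long logFileId;
        final long logFileOffset;

        LogMarkCheckpoint(long logFileId, long logFileOffset) {
            this.logFileId = logFileId;
            this.logFileOffset = logFileOffset;
        }
    }

    // A wrapper that holds other checkpoints, like the CheckpointList in the log line.
    static final class CheckpointList implements Checkpoint {
        final List<Checkpoint> checkpoints;

        CheckpointList(Checkpoint... checkpoints) {
            this.checkpoints = Arrays.asList(checkpoints);
        }
    }

    // ASSUMPTION: journal cleanup runs only when the guard sees a LogMarkCheckpoint directly.
    static boolean wouldCollectJournals(Checkpoint checkpoint) {
        return checkpoint instanceof LogMarkCheckpoint;
    }

    public static void main(String[] args) {
        Checkpoint direct = new LogMarkCheckpoint(1547045689698L, 301541888L);
        Checkpoint wrapped = new CheckpointList(direct);
        System.out.println(wouldCollectJournals(direct));  // true
        System.out.println(wouldCollectJournals(wrapped)); // false
    }
}
```

If this reading is right, a checkpoint that carries exactly the same log mark is skipped by the cleanup simply because it arrives wrapped in a list.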

If my analysis is correct:
- do we have a bug in Journal.java#L735 (linked above)?
- is there any way to force the Journal to reclaim disk space without
writes (which force a flush of the EntryLogger)?


Enrico