Posted to dev@bookkeeper.apache.org by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org> on 2012/03/26 05:10:47 UTC

[jira] [Commented] (BOOKKEEPER-193) Ledger is garbage collected by mistake.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238075#comment-13238075 ] 

Sijie Guo commented on BOOKKEEPER-193:
--------------------------------------

This issue is a bug in the garbage collection logic. Currently, garbage collection first fetches the list of all ledgers from ZooKeeper, then fetches the active ledgers from the bookie, and finally garbage collects those active ledgers that are not in the ZooKeeper list. There is a window between fetching the list from ZooKeeper and fetching the list from the bookie; if a ledger is created in this window, it is garbage collected by mistake.
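
To make the window concrete, here is a minimal sketch of the current ordering. It is only a model with plain in-memory sets; the class and method names (GcRaceSketch, ledgersToDelete) are made up for illustration and are not BookKeeper code.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of the racy collection order; not BookKeeper code. */
class GcRaceSketch {
    /** Ledgers that would be deleted: active on the bookie but missing from the ZooKeeper listing. */
    static Set<Long> ledgersToDelete(Set<Long> zkLedgers, Set<Long> bookieActiveLedgers) {
        Set<Long> doomed = new TreeSet<>(bookieActiveLedgers);
        doomed.removeAll(zkLedgers);
        return doomed;
    }

    public static void main(String[] args) {
        // Step 1: list all ledgers from ZooKeeper.
        Set<Long> zkListing = new TreeSet<>(Arrays.asList(1L, 2L));
        // In between, a client creates ledger 3 and starts writing to the bookie.
        // Step 2: read the bookie's active ledgers, which now include ledger 3.
        Set<Long> bookieActive = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
        // Ledger 3 is live, but it is selected for deletion because the listing is stale.
        System.out.println(ledgersToDelete(zkListing, bookieActive)); // prints [3]
    }
}
```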

For FlatLedgerManager, this issue can be fixed easily. Since ledgers are created in sequence, we can record the max ledger id when fetching the list of all ledgers from ZooKeeper. During garbage collection, ledgers whose ids are larger than that max ledger id are not garbage collected until the next garbage collection runs.
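
A rough sketch of that guard, again over in-memory sets. The names (MaxIdGuardSketch, maxZkLedgerId) are hypothetical, and the max id is passed in explicitly because how it is obtained from the ledger manager is not settled here.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the max-ledger-id guard; not BookKeeper code. */
class MaxIdGuardSketch {
    /**
     * Selects ledgers to delete, skipping every id above the max id seen
     * when ZooKeeper was listed. Those ledgers may have been created after
     * the listing, so they are left for the next collection run.
     */
    static Set<Long> ledgersToDelete(Set<Long> zkLedgers, long maxZkLedgerId, Set<Long> bookieActive) {
        Set<Long> doomed = new TreeSet<>();
        for (long id : bookieActive) {
            if (id > maxZkLedgerId) {
                continue; // possibly newer than the listing; defer
            }
            if (!zkLedgers.contains(id)) {
                doomed.add(id);
            }
        }
        return doomed;
    }

    public static void main(String[] args) {
        TreeSet<Long> zkListing = new TreeSet<>(Arrays.asList(1L, 3L)); // ledger 2 was deleted
        long maxId = zkListing.last();                                  // 3
        Set<Long> bookieActive = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        // Ledger 2 is collected; ledger 4 is above the max id and is kept for now.
        System.out.println(ledgersToDelete(zkListing, maxId, bookieActive)); // prints [2]
    }
}
```

This only works when ids are handed out in increasing order together with ledger creation, which is the FlatLedgerManager case described above.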

For HierarchicalLedgerManager it is different, because id generation and ledger creation are two separate operations that run asynchronously. One possible solution is to fetch a copy of the active ledgers from the bookie first (ledgers added by requests that arrive after the copy is taken should not be put in the list of active ledgers used for GC), and then fetch the list of all ledgers from ZooKeeper. This ensures that the ZooKeeper list is up to date with respect to every ledger in that copy.
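
The reordering could look roughly like the sketch below. The names (SnapshotFirstSketch, zkLister) are hypothetical, and it assumes a ledger's metadata is already in ZooKeeper before the bookie sees its first entry, so anything in the copy that is missing from the later listing really was deleted.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical sketch of "copy the bookie's ledgers first, then list ZooKeeper"; not BookKeeper code. */
class SnapshotFirstSketch {
    static Set<Long> ledgersToDelete(Set<Long> liveBookieLedgers, Supplier<Set<Long>> zkLister) {
        // Step 1: copy the active ledgers. Ledgers added later are not in this copy.
        Set<Long> snapshot = new TreeSet<>(liveBookieLedgers);
        // Step 2: only now list ZooKeeper, so the listing is newer than the copy.
        Set<Long> zkLedgers = zkLister.get();
        // Step 3: collect only ledgers from the copy that ZooKeeper no longer knows.
        snapshot.removeAll(zkLedgers);
        return snapshot;
    }

    public static void main(String[] args) {
        final Set<Long> live = new TreeSet<>(Arrays.asList(1L, 2L));
        Set<Long> doomed = ledgersToDelete(live, () -> {
            live.add(3L); // ledger 3 reaches the bookie while ZooKeeper is being listed
            return new TreeSet<>(Arrays.asList(1L)); // ledger 2 was deleted; 3 is not listed yet
        });
        // Ledger 3 is not in the copy, so it cannot be collected in this run.
        System.out.println(doomed); // prints [2]
    }
}
```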
                
> Ledger is garbage collected by mistake.
> ---------------------------------------
>
>                 Key: BOOKKEEPER-193
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-193
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>             Fix For: 4.1.0
>
>
> Currently, we have encountered the following case: a ledger is garbage collected by mistake, and subsequent requests fail with NoLedgerException.
> {code}
> 2012-03-23 19:10:47,403 - INFO  [GarbageCollectorThread:GarbageCollectorThread@234] - Garbage collecting deleted ledger index files.
> 2012-03-23 19:10:48,702 - INFO  [GarbageCollectorThread:LedgerCache@544] - Deleting ledgerId: 89408
> 2012-03-23 19:10:48,703 - INFO  [GarbageCollectorThread:LedgerCache@577] - Deleted ledger : 89408
> 2012-03-23 19:11:10,013 - ERROR [NIOServerFactory-3181:BookieServer@361] - Error writing 1@89408
> org.apache.bookkeeper.bookie.Bookie$NoLedgerException: Ledger 89408 not found
>         at org.apache.bookkeeper.bookie.LedgerCache.getFileInfo(LedgerCache.java:228)
>         at org.apache.bookkeeper.bookie.LedgerCache.updatePage(LedgerCache.java:260)
>         at org.apache.bookkeeper.bookie.LedgerCache.putEntryOffset(LedgerCache.java:158)
>         at org.apache.bookkeeper.bookie.LedgerDescriptor.addEntry(LedgerDescriptor.java:135)
>         at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1059)
>         at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1099)
>         at org.apache.bookkeeper.proto.BookieServer.processPacket(BookieServer.java:357)
>         at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.readRequest(NIOServerFactory.java:315)
>         at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.doIO(NIOServerFactory.java:213)
>         at org.apache.bookkeeper.proto.NIOServerFactory.run(NIOServerFactory.java:124)
> {code}
