Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2022/07/27 11:05:42 UTC

[GitHub] [bookkeeper] horizonzy commented on issue #3408: AutoRecovery caused DirectMemory OOM error.

horizonzy commented on issue #3408:
URL: https://github.com/apache/bookkeeper/issues/3408#issuecomment-1196586078

   After a long investigation, we found that the problem is on the bookie side: `ReplicationWorker` sends too many requests.
   
   The bookie that was shut down held many ledgers, so when it went down the `Auditor` marked many ledgers as underreplicated.
   Many `ReplicationWorker` instances then replicate those ledgers. The config `rereplicationEntryBatchSize` is 500, so every `ReplicationWorker` sends 500 read requests to the bookie servers. The bookie servers therefore receive a large number of requests and allocate direct memory for them.
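
   As a rough illustration of the burst described above, the sketch below multiplies the number of workers by the batch size. The worker count of 20 is a hypothetical value chosen for the example; only the batch size of 500 comes from this report, and the estimate assumes every worker has a full batch outstanding at the same time.

```java
// Rough upper bound on read requests outstanding during re-replication.
// Assumption: every worker has one full batch in flight at the same time.
class ReplicationBurstEstimate {

    /** Upper bound on concurrent read requests across all workers. */
    static long maxInFlightReads(int workers, int batchSize) {
        return (long) workers * batchSize;
    }

    public static void main(String[] args) {
        int batchSize = 500; // rereplicationEntryBatchSize from this report
        int workers = 20;    // hypothetical number of ReplicationWorker instances
        System.out.println(maxInFlightReads(workers, batchSize) + " read requests in one round");
    }
}
```

   Under these assumptions the bookies would have to buffer up to 10000 reads at once, so lowering either factor shrinks the burst proportionally.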
   
   Releases do not keep up with allocations, so the number of `PoolChunk` instances keeps growing until direct memory usage reaches maxDirectMemory.
   
   @gaozhangmin supplied two heap dump files: `less` was dumped when replication started, and `more` was dumped after replication had been running for a while.
   
   [less.hprof.zip](https://github.com/apache/bookkeeper/files/9197159/less.hprof.zip)
   [more.hprof.zip](https://github.com/apache/bookkeeper/files/9197161/more.hprof.zip)
   
   I found 244 `PoolChunk` instances in `more` and 120 in `less`. Each `PoolChunk` holds 4M of direct memory in the bookie, so `more` uses 124 * 4M = 496M more direct memory than `less`.  
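
   The difference between the two dumps can be checked with a few lines of arithmetic. This is only a sketch: it assumes that "4M" means 4 MiB per `PoolChunk`, and the class and method names are made up for the example.

```java
// Extra direct memory implied by the PoolChunk counts in the two heap dumps.
// Assumption: each PoolChunk holds 4 MiB of direct memory.
class PoolChunkGrowth {
    static final long CHUNK_BYTES = 4L * 1024 * 1024;

    /** Bytes of direct memory added when the chunk count grows from before to after. */
    static long extraDirectBytes(int chunksBefore, int chunksAfter) {
        return (long) (chunksAfter - chunksBefore) * CHUNK_BYTES;
    }

    public static void main(String[] args) {
        long extra = extraDirectBytes(120, 244); // counts from `less` and `more`
        System.out.println(extra / (1024 * 1024) + "M"); // prints "496M"
    }
}
```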
   
   We also found another issue: if the user configures `DbLedgerStorage`, then at startup it takes 1/2 of the direct memory for the readCache and writeCache. This memory is unpooled, but it still occupies direct memory.
   
   As a result, the direct memory pool has only 1/2 of the direct memory left to allocate from, which makes OOM more likely.
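
   To make the remaining headroom concrete, the sketch below computes how many 4M chunks fit into the half of direct memory that the caches leave free. The 2048M limit is a hypothetical value for illustration, "4M" is assumed to mean 4 MiB, and the class and method names are invented for the example.

```java
// Headroom left for pooled allocations once the read and write caches take half.
// Assumptions: the caches take exactly 1/2 of the direct memory limit and
// each PoolChunk is 4 MiB. The 2048 MiB limit below is hypothetical.
class DirectMemoryBudget {
    static final long MIB = 1024L * 1024;

    /** Bytes left for the pool after the caches reserve half of the limit. */
    static long poolBudgetBytes(long maxDirectBytes) {
        return maxDirectBytes - maxDirectBytes / 2;
    }

    /** Number of whole chunks that fit into the pool budget. */
    static long chunksThatFit(long poolBudgetBytes, long chunkBytes) {
        return poolBudgetBytes / chunkBytes;
    }

    public static void main(String[] args) {
        long maxDirect = 2048 * MIB; // hypothetical direct memory limit
        long pool = poolBudgetBytes(maxDirect);
        System.out.println("pool budget: " + pool / MIB + "M");
        System.out.println("chunks that fit: " + chunksThatFit(pool, 4 * MIB));
    }
}
```

   With these example numbers the pool can hold 256 chunks, half of what the same limit would allow without the caches.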
   
    

