Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2023/01/24 22:10:00 UTC

[jira] [Created] (HBASE-27587) L1 cache leaks index blocks over time when under subscribed

Bryan Beaudreault created HBASE-27587:
-----------------------------------------

             Summary: L1 cache leaks index blocks over time when under subscribed
                 Key: HBASE-27587
                 URL: https://issues.apache.org/jira/browse/HBASE-27587
             Project: HBase
          Issue Type: Bug
            Reporter: Bryan Beaudreault


Let's say you have CombinedBlockCache enabled, so DATA blocks go to L2 and INDEX/BLOOM blocks go to L1. Your regionserver has an index size of 2gb and a bloom size of 1gb, so you really only need around 3gb of L1 to fully hold all of the "L1 candidates".

When the data set does not fit into the cache, LRU handles evictions to stay under the max. But in the above scenario, if you configure 6gb for L1 (3gb more than needed), over time you will end up filling that entire 6gb with old INDEX blocks. Only once you reach the max does LRU start evicting the oldest ones.

Since the leak is bounded by the configured max L1 size, this isn't a huge issue, but it results in heap waste. Under high heap allocation, if you haven't left enough buffer outside of memstore, L1, etc., you will start seeing GC pressure. The L1 leak then becomes more problematic, because longer-lived regionservers (which have leaked closer to the max L1 size) end up with less spare buffer than recently restarted regionservers.

The best fix is to set your L1 size appropriately so there is not a lot of excess, but this can be painful to maintain over time as clusters shrink or grow, or as the data shape changes. It'd be a lot better if the L1 did not leak, so you wouldn't have to tune the L1 so finely.
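
As a rough illustration of that tuning, here is a small sketch of the arithmetic. The on-heap L1 is sized as a fraction of the heap via hfile.block.cache.size, so the value has to be recomputed whenever the index/bloom footprint or the heap changes. The 32gb heap and the 10% headroom are assumptions made up for this example, and the class and method names are hypothetical.

```java
// Sketch only: derives an L1 heap fraction from the index/bloom footprint.
// The heap size and headroom used in main() are illustrative assumptions.
final class L1SizingSketch {

    /**
     * Fraction of heap needed to hold all index and bloom blocks,
     * plus a relative headroom (0.10 means 10% extra).
     */
    static double l1HeapFraction(double indexGb, double bloomGb,
                                 double headroom, double heapGb) {
        if (heapGb <= 0) {
            throw new IllegalArgumentException("heapGb must be positive");
        }
        double neededGb = (indexGb + bloomGb) * (1.0 + headroom);
        return neededGb / heapGb;
    }

    public static void main(String[] args) {
        // 2gb index + 1gb bloom (from the scenario above), assumed 32gb heap.
        double fraction = l1HeapFraction(2.0, 1.0, 0.10, 32.0);
        System.out.println(String.format("hfile.block.cache.size ~ %.3f", fraction));
    }
}
```

With these assumed numbers the result is roughly 0.103, versus 0.1875 for the oversized 6gb L1 on the same heap.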

I haven't fully figured out where the leak comes from, but I think it's related to compactions. Perhaps the INDEX blocks are not being evicted as hfiles are compacted away. The leak is very linear over time in our experience.
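
To make the suspected mechanism concrete, here is a toy model of it. This is not HBase's LruBlockCache, only a LinkedHashMap-based LRU under assumed numbers: 3 live hfiles with 10 index blocks each (standing in for the 3gb that is actually needed) and a capacity of 60 blocks (standing in for the 6gb L1). Each simulated compaction replaces one hfile with a new one. All names here are hypothetical, and the "evict on compaction" switch only shows what the cache would look like if blocks of compacted-away files were dropped; it is not a confirmed fix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the hypothesis: blocks of compacted-away hfiles stay in an
// LRU cache until capacity forces them out. Not HBase code.
final class L1LeakSketch {
    static final int BLOCKS_PER_FILE = 10; // assumed index blocks per hfile
    static final int LIVE_FILES = 3;       // live set = 30 blocks ("3gb")
    static final int CAPACITY_BLOCKS = 60; // configured max ("6gb")

    /** Returns the number of cached blocks after the given number of compactions. */
    static int simulate(int compactions, boolean evictOnCompaction) {
        // Access-ordered LinkedHashMap acts as a simple LRU.
        Map<String, Boolean> cache =
            new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Boolean> e) {
                    return size() > CAPACITY_BLOCKS;
                }
            };

        String[] live = new String[LIVE_FILES];
        int nextFileId = 0;
        for (int i = 0; i < LIVE_FILES; i++) {
            live[i] = "hfile-" + nextFileId++;
        }
        readAll(cache, live);

        for (int c = 0; c < compactions; c++) {
            int slot = c % LIVE_FILES;
            String compactedAway = live[slot];
            live[slot] = "hfile-" + nextFileId++;
            if (evictOnCompaction) {
                // Drop every block that belonged to the compacted-away file.
                String prefix = compactedAway + "#";
                cache.keySet().removeIf(key -> key.startsWith(prefix));
            }
            // Reads only ever touch blocks of live files.
            readAll(cache, live);
        }
        return cache.size();
    }

    private static void readAll(Map<String, Boolean> cache, String[] files) {
        for (String file : files) {
            for (int b = 0; b < BLOCKS_PER_FILE; b++) {
                cache.put(file + "#" + b, Boolean.TRUE);
            }
        }
    }

    public static void main(String[] args) {
        for (int c = 0; c <= 5; c++) {
            System.out.println(c + " compactions: leaky=" + simulate(c, false)
                + " evicting=" + simulate(c, true));
        }
    }
}
```

In this model the leaky cache grows by 10 blocks per compaction (30, 40, 50) until it sits at the 60-block max, while the evicting variant stays at the 30 blocks that are actually live. That matches the linear growth up to the configured L1 size described above.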



--
This message was sent by Atlassian Jira
(v8.20.10#820010)