Posted to commits@cassandra.apache.org by "ZhaoYang (Jira)" <ji...@apache.org> on 2020/05/11 16:07:00 UTC

[jira] [Comment Edited] (CASSANDRA-15229) BufferPool Regression

    [ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104629#comment-17104629 ] 

ZhaoYang edited comment on CASSANDRA-15229 at 5/11/20, 4:06 PM:
----------------------------------------------------------------

bq.  I would just be looking to smooth out the random distribution of sizes used with e.g. a handful of queues each containing a single size of buffer and at most a handful of items each. 

It looks simpler in its initial form, but I am wondering whether it will eventually grow/evolve into another buffer pool.

bq. so if you are willing to at least enable the behaviour only for the ChunkCache so this change cannot have any unintended negative effect for those users not expected to benefit, my main concern will be alleviated.

+1, partially freed chunk recirculation is enabled only for the permanent pool, not for the temporary pool.

----

[Patch|https://github.com/apache/cassandra/pull/535/files] / [Circle|https://circleci.com/workflow-run/096afbe1-ec99-4d5f-bdaa-06f538b8280f]:

* Instantiate two buffer pool instances: one for the chunk cache (default 512MB), called {{"Permanent Pool"}}, and one for network (default 128MB), called {{"Temporary Pool"}}, so that they won't interfere with each other.
* Improve buffer pool metrics to track {{"overflowSize"}}, the size of buffers allocated outside of the buffer pool, and {{"UsedSize"}}, the size of buffers that are currently allocated.
* Allow partially freed chunks to be recycled in the Permanent Pool to improve cache utilization, since the chunk cache can hold buffers for arbitrary periods of time. Note that because allocation sizes vary, fragmentation still exists in partially freed chunks.
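
To make the first two bullets concrete, here is a minimal sketch of two independently bounded pools that each track a used size and an overflow size. It is an illustration only, not the code from the patch: {{SketchPool}}, its methods, and the choice to hand out heap buffers on overflow (so that a released buffer can be told apart from a pooled one) are all assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; SketchPool is NOT the BufferPool class from the patch.
final class SketchPool {
    private final String name;
    private final long capacityBytes;
    // Bytes currently handed out from within this pool's own limit.
    private final AtomicLong usedSize = new AtomicLong();
    // Bytes currently handed out beyond the limit, i.e. outside the pool.
    private final AtomicLong overflowSize = new AtomicLong();

    SketchPool(String name, long capacityBytes) {
        this.name = name;
        this.capacityBytes = capacityBytes;
    }

    ByteBuffer get(int size) {
        while (true) {
            long used = usedSize.get();
            if (used + size > capacityBytes) {
                // Assumption of this sketch: overflow buffers live on the heap,
                // which lets put() distinguish them from pooled buffers.
                overflowSize.addAndGet(size);
                return ByteBuffer.allocate(size);
            }
            // Reserve the bytes atomically; retry if another thread raced us.
            if (usedSize.compareAndSet(used, used + size))
                return ByteBuffer.allocateDirect(size);
        }
    }

    void put(ByteBuffer buffer) {
        if (buffer.isDirect())
            usedSize.addAndGet(-buffer.capacity());
        else
            overflowSize.addAndGet(-buffer.capacity());
    }

    String name() { return name; }
    long usedSize() { return usedSize.get(); }
    long overflowSize() { return overflowSize.get(); }
}
```

Creating one {{SketchPool}} named "Permanent Pool" and another named "Temporary Pool" gives each its own limit and its own counters, so exhausting one pool pushes only that pool into overflow and leaves the other untouched.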





> BufferPool Regression
> ---------------------
>
>                 Key: CASSANDRA-15229
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Caching
>            Reporter: Benedict Elliott Smith
>            Assignee: ZhaoYang
>            Priority: Normal
>             Fix For: 4.0, 4.0-beta
>
>         Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, 15229-unsafe.png
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we need to either change our behaviour to handle uncorrelated lifetimes or use something else.  This is particularly important with the default chunk size for compressed sstables being reduced.  If we address the problem, we should also utilise the BufferPool for native transport connections like we do for internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used for things with uncorrelated lifetimes, which essentially boils down to tracking those chunks that have not been freed and re-circulating them when we run out of completely free blocks.  We should probably also permit instantiating separate {{BufferPool}}, so that we can insulate internode messaging from the {{ChunkCache}}, or at least have separate memory bounds for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce the amount of global coordination and per-allocation overhead.  We don’t need 1KiB granularity for allocations, nor 16 byte granularity for tiny allocations.
> -----
> Since CASSANDRA-5863, the chunk cache has been implemented on top of the buffer pool. When a local pool is full, one of its chunks is evicted, and it is only put back into the global pool once all buffers in the evicted chunk have been released. But because of the chunk cache, buffers can be held for long periods of time, preventing an evicted chunk from being recycled even though most of its space is free.
> Two things need to be improved:
> 1. An evicted chunk with free space should be recycled to the global pool, even if it is not fully free. This is doable in 4.0.
> 2. Reduce fragmentation caused by different buffer sizes. With #1, partially freed chunks will be available for allocation, but the "holes" in a partially freed chunk have different sizes. We should consider allocating fixed-size buffers, which is unlikely to fit in 4.0.
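
The two points in the description can be illustrated with a small sketch. It assumes a chunk divided into 64 equal slots tracked by a bitmap, and a global pool with a flag controlling whether partially free chunks are accepted. {{SketchChunk}}, {{SketchGlobalPool}}, and the 64-slot layout are hypothetical names and choices made for this example; they are not taken from the patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: a chunk split into 64 slots, one bit per slot.
final class SketchChunk {
    private long freeSlots = -1L; // bit i set => slot i is free

    private static long mask(int slots) {
        if (slots < 1 || slots > 64)
            throw new IllegalArgumentException("slots must be in 1..64");
        return slots == 64 ? -1L : (1L << slots) - 1;
    }

    // First-fit search for `slots` contiguous free slots; returns the
    // index of the first slot, or -1 if no hole is large enough.
    int allocate(int slots) {
        long mask = mask(slots);
        for (int i = 0; i + slots <= 64; i++) {
            long candidate = mask << i;
            if ((freeSlots & candidate) == candidate) {
                freeSlots &= ~candidate;
                return i;
            }
        }
        return -1;
    }

    void free(int index, int slots) {
        freeSlots |= mask(slots) << index;
    }

    int freeSlotCount() { return Long.bitCount(freeSlots); }
    boolean isFullyFree() { return freeSlots == -1L; }
}

// Hypothetical sketch of the recycling policy for evicted chunks.
final class SketchGlobalPool {
    private final boolean recyclePartiallyFree;
    private final Deque<SketchChunk> available = new ArrayDeque<>();

    SketchGlobalPool(boolean recyclePartiallyFree) {
        this.recyclePartiallyFree = recyclePartiallyFree;
    }

    // Returns true if the evicted chunk becomes available for allocation again.
    boolean recycle(SketchChunk evicted) {
        boolean accept = evicted.isFullyFree()
                      || (recyclePartiallyFree && evicted.freeSlotCount() > 0);
        if (accept)
            available.add(evicted);
        return accept;
    }

    int availableChunks() { return available.size(); }
}
```

With the flag off, a chunk in which even one buffer is still held stays out of circulation; with the flag on, it is reused as soon as it has any free slot. The bitmap also shows the remaining fragmentation problem: a chunk can have half of its slots free and still be unable to serve a request that needs that many contiguous slots.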


