Posted to commits@cassandra.apache.org by "Benedict Elliott Smith (Jira)" <ji...@apache.org> on 2020/01/14 20:43:00 UTC

[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

    [ https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015376#comment-17015376 ] 

Benedict Elliott Smith commented on CASSANDRA-15388:
----------------------------------------------------

The change LGTM, except for a minor compilation breakage when the javaagent isn't used.

I've pushed [here|https://github.com/belliottsmith/cassandra/tree/15388-suggest] some extra tests that I haven't yet had server time available to run.  These are just re-abstractions of tests I wrote for some work I plan to post in the coming days. They let us cover a slightly wider range of partition characteristics (though still probably not as many as we might like), and they are integrated into JMH so we can compare performance as well as allocations.

They are very much complementary to the work you've done: they don't track the end-to-end cost of compaction, only the isolated costs of merge and of deserialization (and not the entire deserialization pipeline, to keep it simple).  But I think they should capture the main areas of expense and improvement, and these tests have been informative in my other work: certain data characteristics can lead to surprising results with a given approach.

We can always file this as a follow-up, though I think it would be nice (if you agree with the tests) to run some comparisons before we commit the follow-up work. A full comparison will take a long time to run (several days), though we can prune the state-space.

> Add compaction allocation measurement test to support compaction gc optimization. 
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15388
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0
>
>
> This adds a test that is able to quickly and accurately measure the effect of potential gc optimizations against a wide range of (synthetic) compaction workloads. This test accurately measures allocation rates from 16 workloads in less than 2 minutes.
> This test uses Google’s {{java-allocation-instrumenter}} agent to measure the workloads. Measurements using this agent are very accurate and pretty repeatable from run to run, with most variance being negligible (1-2 bytes per partition), although workloads with larger but fewer partitions vary a bit more (still less than 0.03%).
> The thinking behind this patch is that with compaction, we’re generally interested in the memory allocated per partition, since garbage scales more or less linearly with the number of partitions compacted. So measuring allocation from a small number of partitions that otherwise represent real world use cases is a good enough approximation.
> In addition to helping with compaction optimizations, this test could be used as a template for future optimization work. This pattern could also be used to set allocation limits on workloads/operations and fail CI if the allocation behavior changes past some threshold. 
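
To make the last point of the description concrete, here is a minimal sketch of a per-partition allocation check of the kind that could fail CI. It is not taken from the patch: the class name AllocationBudget, both method names, the byte counts and the 1% tolerance are all hypothetical, and the total allocated bytes are assumed to come from whatever measurement the test already produces (for example the allocation instrumenter agent).

```java
// Hypothetical helper, not part of the CASSANDRA-15388 patch.
final class AllocationBudget {

    /** Average bytes allocated per partition for one workload run. */
    static double bytesPerPartition(long totalAllocatedBytes, long partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        return (double) totalAllocatedBytes / partitions;
    }

    /**
     * True when the measured per-partition allocation differs from the
     * baseline by more than the given relative tolerance (0.01 means 1%).
     * The check is two-sided, so unexpected drops are flagged as well.
     */
    static boolean exceedsBudget(double baseline, double measured, double tolerance) {
        return Math.abs(measured - baseline) > baseline * tolerance;
    }

    public static void main(String[] args) {
        // Made-up numbers: 1,000 partitions compacted in each run.
        double baseline = bytesPerPartition(4_000_000L, 1_000L);
        double measured = bytesPerPartition(4_002_000L, 1_000L);

        // Assumed tolerance of 1%; a real limit would be tuned per workload.
        boolean failed = exceedsBudget(baseline, measured, 0.01);
        System.out.println("baseline=" + baseline + " measured=" + measured
                           + " failed=" + failed);
    }
}
```

Normalising by partition count follows the reasoning in the description that garbage scales more or less linearly with the number of partitions compacted, so a small run can stand in for a large one. A CI test would call exceedsBudget once per workload and fail on true.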


