Posted to commits@cassandra.apache.org by "Jordan West (Jira)" <ji...@apache.org> on 2020/01/08 19:22:00 UTC

[jira] [Commented] (CASSANDRA-15213) DecayingEstimatedHistogramReservoir Inefficiencies

    [ https://issues.apache.org/jira/browse/CASSANDRA-15213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010956#comment-17010956 ] 

Jordan West commented on CASSANDRA-15213:
-----------------------------------------

I’ve verified the reported {{LongAdder}} memory consumption. I’m seeing 13.5MiB right after startup.

A few questions about your proposed solutions:
 * By the backing array, do you mean the original {{AtomicLongArray}}? Two issues I see with increasing the number of buckets: the maximum we can create is 237 (beyond that the offset overflows), and adding buckets doesn’t necessarily reduce contention, because the resolution of the buckets doesn’t change except for the last bucket (which for microseconds is over 200 days and for nanoseconds is over 5 hours). Or are you suggesting increasing the bucket resolution by decreasing the factor in {{newOffsets}} when adding more buckets? (The first sketch after this list illustrates the overflow and the factor trade-off.)
 * I’m not sure what you mean by “providing multiple buckets”, but I suspect it is what I’m missing to fully understand the proposal. Do you mean having multiple buckets per offset? Also, regarding your {{LongAdder}} reference, are you proposing creating new buckets for the same offset under contention (like {{LongAdder}} does with {{Cell}})? (A rough striping sketch appears after the quoted description below.)
 * Regarding the binary search suggestion, are you suggesting using the approximation to replace the maximum index (e.g. {{binarySearch(offsets, 0, Math.floor(…) + 1, value)}}), or to use the approximation outright? I wasn’t sure of your exact proposal, but it does seem like using the equation you gave as an upper bound estimate and {{Math.floor(Math.log(Math.floor(n / 1.2)) / Math.log(1.2))}} as a lower bound estimate can reduce the portion of the array we have to search by 25-60% (I saw up to 75% in one case with {{DEFAULT_BUCKET_COUNT}} but haven’t reproduced that high a reduction since). I’ve added a QuickTheories test to show this here: [https://github.com/jrwest/cassandra/commit/a4795bc651cd0e0bd582a8df2414e312782b5020]. The second sketch after this list shows how the two bounds would bracket the search.
 * We could also circumvent the search entirely in the lowest buckets with a check like {{if (value < bucketCount && bucketOffsets[value - 1] == value)}}, although given our use case I’m not sure how often we will see sub-165 ns operations; that fast path is folded into the second sketch as well.
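
To make the first question concrete, here is a rough sketch of geometric offset generation. It is entirely mine ({{OffsetSketch}} and its constants are hypothetical, and the real {{newOffsets}} differs in details), but it shows why a factor of 1.2 runs out of {{long}} range after a couple of hundred buckets, and how shrinking the factor trades bucket count for resolution:

{code:java}
import java.util.Arrays;

// Rough sketch only, not Cassandra's actual newOffsets(); it just shows why a
// growth factor of 1.2 caps the number of distinct long offsets at a couple of
// hundred, and how shrinking the factor trades bucket count for resolution.
public final class OffsetSketch
{
    static long[] offsets(int max, double factor)
    {
        long[] result = new long[max];
        long last = 1;
        result[0] = last;
        int count = 1;
        for (int i = 1; i < max; i++)
        {
            if ((double) last * factor >= (double) Long.MAX_VALUE)
                break;                                                  // next offset would overflow a long
            long next = Math.max(Math.round(last * factor), last + 1);  // force progress for tiny offsets
            result[i] = next;
            last = next;
            count++;
        }
        return Arrays.copyOf(result, count);
    }

    public static void main(String[] args)
    {
        long[] f12 = offsets(512, 1.2);
        System.out.println("factor 1.2 buckets before overflow: " + f12.length);   // low hundreds
        System.out.println("largest factor 1.2 offset: " + f12[f12.length - 1]);
        System.out.println("factor 1.1 buckets before overflow: " + offsets(1024, 1.1).length);
    }
}
{code}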

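For the last two questions, this is roughly how I picture combining the direct-index fast path with the log-based bounds around {{Arrays.binarySearch}}. Again the names are mine and this is not the patch; the fall-back to the full range is there only because how tight the bounds really are is exactly what the QuickTheories test linked above checks:

{code:java}
import java.util.Arrays;

// Purely illustrative -- class and method names are mine, not the patch.
// Idea: values that are direct index hits skip the search entirely, otherwise
// the closed-form estimates bracket Arrays.binarySearch instead of searching
// the whole offsets array.
final class BucketIndexSketch
{
    private final long[] bucketOffsets;   // strictly increasing, bucketOffsets[0] >= 1

    BucketIndexSketch(long[] bucketOffsets)
    {
        this.bucketOffsets = bucketOffsets;
    }

    int indexOf(long value)
    {
        if (value <= 0)
            return 0;   // same bucket the full search would pick for non-positive values

        // Fast path from the last bullet: the smallest offsets are 1, 2, 3, ...,
        // so a small value maps directly to index value - 1 with no search.
        if (value < bucketOffsets.length && bucketOffsets[(int) value - 1] == value)
            return (int) value - 1;

        // Upper and lower bound estimates from the discussion above.
        int hi = Math.min(bucketOffsets.length,
                          (int) Math.floor(Math.log(value) / Math.log(1.2)) + 1);
        int lo = Math.max(0,
                          (int) Math.floor(Math.log(Math.floor(value / 1.2)) / Math.log(1.2)));

        // If the estimates do not actually bracket the value, fall back to the
        // full range -- how tight the bounds are is what the QuickTheories test checks.
        boolean bracketed = lo <= hi
                         && (lo == 0 || bucketOffsets[lo - 1] < value)
                         && (hi == bucketOffsets.length || value <= bucketOffsets[hi - 1]);
        if (!bracketed)
        {
            lo = 0;
            hi = bucketOffsets.length;
        }

        int idx = Arrays.binarySearch(bucketOffsets, lo, hi, value);
        return idx < 0 ? -idx - 1 : idx;   // insertion point, as the existing code uses
    }
}
{code}
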
> DecayingEstimatedHistogramReservoir Inefficiencies
> --------------------------------------------------
>
>                 Key: CASSANDRA-15213
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15213
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Metrics
>            Reporter: Benedict Elliott Smith
>            Assignee: Jordan West
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> * {{LongAdder}} introduced to trunk consumes 9MiB of heap without user schemas, and this will grow significantly under contention and user schemas with many tables.  This is because {{LongAdder}} is a very heavy class designed for single contended values.  
>  ** This can likely be improved significantly, without significant loss of performance in the contended case, by simply increasing the size of our primitive backing array and providing multiple buckets, with each thread picking a bucket to increment, or simply multiple backing arrays.  Probably a better way still to do this would be to introduce some competition detection to the update, much like {{LongAdder}} utilises, that increases the number of backing arrays under competition.
>  ** To save memory this approach could partition the space into chunks that are likely to be updated together, so that we do not need to duplicate the entire array under competition.
>  * Similarly, binary search is costly and a measurable cost as a share of the new networking work (without filtering it was > 10% of the CPU used overall).  We can compute an approximation floor(log2 n / log2 1.2) extremely cheaply, to save the random memory access costs.

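For what it’s worth, here is how I currently read the “multiple backing arrays” idea from the description above. It is a purely hypothetical sketch (names mine); real competition detection would presumably grow the stripe count on CAS failure the way {{LongAdder}} grows its {{Cell}} array, rather than hashing on thread id:

{code:java}
import java.util.concurrent.atomic.AtomicLongArray;

// Rough sketch only -- not the proposed patch. It illustrates the idea of
// several plain backing arrays ("stripes"), with each thread incrementing one
// of them, instead of one LongAdder per bucket.
final class StripedBucketsSketch
{
    private final AtomicLongArray[] stripes;

    StripedBucketsSketch(int bucketCount, int stripeCount)
    {
        this.stripes = new AtomicLongArray[stripeCount];
        for (int i = 0; i < stripeCount; i++)
            stripes[i] = new AtomicLongArray(bucketCount);
    }

    void increment(int bucket)
    {
        // Cheap, static stripe selection; LongAdder instead grows its Cell array
        // only when a CAS actually fails, which is the competition detection the
        // description refers to.
        int stripe = (int) (Thread.currentThread().getId() % stripes.length);
        stripes[stripe].incrementAndGet(bucket);
    }

    long count(int bucket)
    {
        long sum = 0;
        for (AtomicLongArray stripe : stripes)
            sum += stripe.get(bucket);   // reads are racy but monotonic, fine for metrics
        return sum;
    }
}
{code}

A handful of primitive-backed stripes like this costs roughly {{stripeCount * bucketCount}} longs, versus a full {{LongAdder}} object (plus its {{Cell}} array under contention) per bucket.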

