Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2014/09/14 14:18:33 UTC

[jira] [Commented] (CASSANDRA-7926) Stress can OOM on merging of timing samples

    [ https://issues.apache.org/jira/browse/CASSANDRA-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133191#comment-14133191 ] 

Benedict commented on CASSANDRA-7926:
-------------------------------------

Patch available [here|https://github.com/belliottsmith/cassandra/tree/7926-stressoom]

I've made a few small changes:
* The number of samples we collect/accumulate at any point is now configurable with the \-samples setting
* When merging multiple samples, we no longer merge them all together and _then_ downsample; instead we downsample as we merge, which keeps our memory use bounded at a much lower level
* Switched to ThreadLocalRandom instead of Random for generating probabilities 
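
As a rough illustration of the downsample-as-we-merge change, here is a minimal sketch using reservoir sampling. It is not the code from the patch: the class BoundedSampleMerge and its merge method are hypothetical names, and the sketch assumes every input value carries equal weight, which would not hold if the inputs had already been downsampled at different rates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper for illustration only; not the actual SampleOfLongs code.
public final class BoundedSampleMerge {
    private BoundedSampleMerge() {}

    /**
     * Merges several sample arrays into at most maxSamples values.
     * Values are fed one at a time into a fixed-size reservoir, so the
     * temporary space is O(maxSamples) rather than O(total input size).
     * Assumption: every input value is equally weighted.
     */
    public static long[] merge(List<long[]> inputs, int maxSamples) {
        if (maxSamples <= 0)
            throw new IllegalArgumentException("maxSamples must be positive");

        long[] reservoir = new long[maxSamples];
        long seen = 0; // number of values consumed so far
        ThreadLocalRandom random = ThreadLocalRandom.current();

        for (long[] input : inputs) {
            for (long value : input) {
                if (seen < maxSamples) {
                    // Reservoir not yet full: keep everything.
                    reservoir[(int) seen] = value;
                } else {
                    // Keep the new value with probability maxSamples / (seen + 1),
                    // replacing a uniformly chosen existing entry.
                    long slot = random.nextLong(seen + 1);
                    if (slot < maxSamples)
                        reservoir[(int) slot] = value;
                }
                seen++;
            }
        }

        // Trim to the number of values actually held, then sort for percentile lookups.
        long[] result = Arrays.copyOf(reservoir, (int) Math.min(seen, maxSamples));
        Arrays.sort(result);
        return result;
    }
}
```

Merging the per-thread arrays with a limit taken from the \-samples setting would then allocate a buffer of at most that many longs, regardless of how many threads contributed.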



> Stress can OOM on merging of timing samples
> -------------------------------------------
>
>                 Key: CASSANDRA-7926
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7926
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>             Fix For: 2.1.1
>
>
> {noformat}
> Exception in thread "StressMetrics:2" java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2343)
>         at org.apache.cassandra.stress.util.SampleOfLongs.merge(SampleOfLongs.java:76)
>         at org.apache.cassandra.stress.util.TimingInterval.merge(TimingInterval.java:95)
>         at org.apache.cassandra.stress.util.Timing.snapInterval(Timing.java:95)
>         at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:124)
>         at org.apache.cassandra.stress.StressMetrics.access$200(StressMetrics.java:36)
>         at org.apache.cassandra.stress.StressMetrics$1.run(StressMetrics.java:72)
>         at java.lang.Thread.run(Thread.java:744)
> {noformat}
> This is partly down to the recent increase in the per-thread sample size, but also because, during merge, we allocate temporary space linear in the total sample size across all threads. This can easily be avoided. We should also scale the per-thread sample size based on the total number of threads, so that we limit total memory use.


