Posted to commits@cassandra.apache.org by "Ben Manes (JIRA)" <ji...@apache.org> on 2010/11/01 05:33:25 UTC

[jira] Commented: (CASSANDRA-975) explore upgrading CLHM

    [ https://issues.apache.org/jira/browse/CASSANDRA-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926815#action_12926815 ] 

Ben Manes commented on CASSANDRA-975:
-------------------------------------

Let's give this another shot. I have finalized v1.1 and am working on providing the release JAR.

As before, a stress test may not show improved performance because it does not demonstrate real-world behavior. The improvements in the official release provide more consistent behavior, without the degradation scenarios seen previously. Unlike the Second Chance implementation currently in use, it supports non-blocking behavior for concurrent writes to different segments. The changes that provide a stricter LRU may add a slight overhead, but due to the non-blocking nature it shouldn't be significant in practice. This version should also reduce the memory overhead compared to v1.0.
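
For readers less familiar with the term, a strict LRU policy always evicts the entry whose most recent access is oldest. The block below is only a minimal single-threaded sketch of that policy using the JDK's LinkedHashMap in access order. It is not how CLHM is implemented (CLHM is concurrent and non-blocking), and the class name LruSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of CLHM or Cassandra.
class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruSketch(int capacity) {
        // accessOrder = true: iteration runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked after each insertion; drop the least recently used entry
        // once the map grows past its capacity.
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting "a" and "b", reading "a", and then inserting "c" evicts "b", since "b" is the entry that was touched least recently.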

I plan to take another stab at the LIRS eviction policy soon. This would again not be noticeable in an artificial stress test, but it would improve the hit rate (thereby improving average performance in real-world usage). Like LRU it can be performed in O(1) time cheaply, but it's more difficult to implement. I think I feel comfortable enough to give it another shot.

Alternatively, Google Guava will provide a variant of this concurrent LRU algorithm in MapMaker (r08). It is currently being used internally by early adopters to wring out the bugs (admittedly we've had a few). The advantage is that it supports additional configuration options (expiration, soft references, memoization) and will be more widely supported. The disadvantage is that it does not provide dynamic resizing (which Cassandra exposes, I believe) and uses per-segment LRU chains. The per-segment nature has some interesting algorithmic trade-offs, but most likely that isn't of concern to users. It also has the backing of Google's Java.
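
To make the per-segment trade-off concrete, here is a rough sketch under stated assumptions. It is not Guava's MapMaker code. The class SegmentedLruSketch, its coarse synchronized methods, and the modulo segment selection are all hypothetical simplifications. Each segment keeps its own LRU chain and its own capacity, so an eviction is decided locally rather than across the whole map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not Guava's MapMaker implementation.
class SegmentedLruSketch<K, V> {
    private final LinkedHashMap<K, V>[] segments;

    @SuppressWarnings("unchecked")
    SegmentedLruSketch(int segmentCount, final int perSegmentCapacity) {
        segments = new LinkedHashMap[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            // Each segment is an independent access-ordered LRU chain.
            segments[i] = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > perSegmentCapacity;
                }
            };
        }
    }

    // Assumed segment selection: hash code modulo the segment count.
    private LinkedHashMap<K, V> segmentFor(Object key) {
        return segments[Math.floorMod(key.hashCode(), segments.length)];
    }

    // Coarse locking keeps the sketch short; a real map would lock per segment.
    synchronized V get(K key) { return segmentFor(key).get(key); }
    synchronized V put(K key, V value) { return segmentFor(key).put(key, value); }
    synchronized boolean containsKey(K key) { return segmentFor(key).containsKey(key); }
}
```

With 2 segments of capacity 2 each, inserting the Integer keys 1, 0, 2, 4 in that order evicts key 0, even though key 1 is the globally least recently used entry and the map holds fewer than its total of 4 entries. That is the kind of algorithmic difference from a single global LRU chain that the per-segment design implies.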

> explore upgrading CLHM
> ----------------------
>
>                 Key: CASSANDRA-975
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-975
>             Project: Cassandra
>          Issue Type: Task
>            Reporter: Jonathan Ellis
>            Assignee: Matthew F. Dennis
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: 0001-trunk-975.patch, clhm_test_results.txt, insertarator.py, readarator.py
>
>
> The new version should be substantially better "on large caches where many entries were read," which is exactly what you see in our row and key caches.
> http://code.google.com/p/concurrentlinkedhashmap/issues/detail?id=9
> Hopefully we can get Digg to help test, since they could reliably break CLHM when it was buggy.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.