Posted to notifications@accumulo.apache.org by "Ben Manes (JIRA)" <ji...@apache.org> on 2017/04/26 05:54:04 UTC

[jira] [Comment Edited] (ACCUMULO-4626) improve cache hit rate via weak reference map

    [ https://issues.apache.org/jira/browse/ACCUMULO-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984217#comment-15984217 ] 

Ben Manes edited comment on ACCUMULO-4626 at 4/26/17 5:54 AM:
--------------------------------------------------------------

I would be interested to know if there was a difference in hit rates between the two caches, prior to your improvement. It tends to evict new and idle arrivals more aggressively, as those are often pollutants. That could be beneficial or a liability, depending on how recency-biased the workload is. We have an adaptive approach that uses hill climbing to tune towards recency or frequency, which corrects for this. I hope to incorporate that after I finish my timer-wheel-based policy (variable expiration).
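
For readers unfamiliar with the term, hill climbing on hit rate can be sketched roughly as follows. This is only an illustration of the general technique, not the policy referred to in the comment: the class name, the notion of a "recency share" of cache capacity, and the fixed step size are all assumptions made for the example.

```java
/**
 * Hypothetical sketch of a hit-rate hill climber. It nudges the share of
 * cache capacity reserved for recently added entries up or down, and
 * reverses direction whenever the sampled hit rate gets worse.
 */
class HitRateHillClimber {
  private double recencyShare;                 // assumed knob, kept in [0, 1]
  private double step;                         // signed step applied each sample
  private double previousHitRate = Double.NaN; // no sample seen yet

  HitRateHillClimber(double initialShare, double step) {
    this.recencyShare = initialShare;
    this.step = step;
  }

  /** Feed the hit rate of the last sample period; returns the new share. */
  double adjust(double hitRate) {
    // If the last move made things worse, walk back the other way.
    if (!Double.isNaN(previousHitRate) && hitRate < previousHitRate) {
      step = -step;
    }
    previousHitRate = hitRate;
    // Clamp so the share always stays a valid fraction of capacity.
    recencyShare = Math.max(0.0, Math.min(1.0, recencyShare + step));
    return recencyShare;
  }
}
```

A real policy would likely also shrink the step over time and ignore changes that are within sampling noise; the sketch omits both.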


was (Author: ben.manes):
I would be interested to know if there was a difference in hit rates between the two caches, prior to your improvement. It tends to evict new and idle arrivals more aggressively, as those are often pollutants. That could be beneficial or a liability, depending on how recency biased the workload is. We have an adaptive approach that uses bill climbing to tune towards recency or frequency, which corrects for this. I hope to incorporate that after I finish my timer wheel based policy (variable expiration).

> improve cache hit rate via weak reference map
> ---------------------------------------------
>
>                 Key: ACCUMULO-4626
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4626
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Adam Fuchs
>              Labels: performance, stability
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> When a single iterator tree references the same RFile blocks in different branches, we sometimes get cache misses for one iterator even though the requested block is held in memory by another iterator. This is particularly important when using something like the IntersectingIterator to intersect many deep copies. Instead of evicting blocks completely, moving them into a WeakReference value map can avoid re-reading blocks that are currently referenced by another deep-copied source iterator.
> We've seen this in the field for some of Sqrrl's queries against very large tablets. The total memory usage for these queries can be equal to the size of all the iterator block reads times the number of readahead threads times the number of files times the number of IntersectingIterator children when cache miss rates are high. This might work out to something like:
> {code}
> 16 readahead threads * 200 deep copied children * 99% cache miss rate * 20 files * 252KB per reader = ~16GB of memory
> {code}
> In most cases, evicting to a weak reference value map changes the cache miss rate from very high to very low and has a dramatic effect on total memory usage.
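
The idea in the description can be sketched as below. This is a minimal illustration under stated assumptions, not the patch attached to the issue: the class name WeakVictimCache, the LinkedHashMap-based LRU, and the entry-count limit are all hypothetical. Only java.lang.ref.WeakReference and LinkedHashMap.removeEldestEntry are real JDK APIs.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: an LRU cache whose evicted values are kept in a
 * weak-reference map, so a block still strongly referenced elsewhere
 * (for example by another iterator) can be found again without a re-read.
 */
class WeakVictimCache<K, V> {
  // Evicted entries; the values can be collected once nothing else holds them.
  private final Map<K, WeakReference<V>> evicted = new HashMap<>();
  private final LinkedHashMap<K, V> lru;

  WeakVictimCache(final int maxEntries) {
    // Access-ordered LinkedHashMap gives simple LRU behaviour.
    this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxEntries) {
          // Demote to the weak map instead of dropping the value outright.
          evicted.put(eldest.getKey(), new WeakReference<>(eldest.getValue()));
          return true;
        }
        return false;
      }
    };
  }

  synchronized void put(K key, V value) {
    evicted.remove(key);
    lru.put(key, value);
  }

  synchronized V get(K key) {
    V value = lru.get(key);
    if (value != null) {
      return value;
    }
    WeakReference<V> ref = evicted.remove(key);
    if (ref == null) {
      return null; // never cached, or already cleaned up
    }
    value = ref.get();
    if (value != null) {
      lru.put(key, value); // still alive elsewhere: promote back into the LRU
    }
    return value; // null if the garbage collector already reclaimed it
  }
}
```

Note that this sketch only drops a stale weak entry when its key is looked up again; a longer-lived version would probably drain a ReferenceQueue to purge cleared references.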


