Posted to dev@samza.apache.org by Jay Kreps <ja...@gmail.com> on 2013/09/10 17:50:35 UTC

soft references for object caching in the key-value storage engine

One idea I had was to use soft references for the object cache in the key-value
store. Currently we use an LRU hashmap, but the drawback of this is that it
needs to be carefully sized based on heap size and the number of
partitions. It is a little hard to know when to add memory to the object
cache vs. the block cache. Plus, since the size depends both on the objects
in the cache and on the per-object overhead, it is pretty much impossible to
calculate the worst-case memory usage of N objects and make this work
properly with a given heap size.
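For reference, the bounded LRU approach being described can be sketched with a plain LinkedHashMap; this is an illustrative sketch only, not the actual Samza cache code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A bounded LRU cache in the spirit of the current approach. Note the bound
// counts entries, not bytes: per-object overhead is exactly what makes
// sizing this against a heap limit hard.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        // accessOrder=true makes iteration order least-recently-accessed first
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once we exceed the bound.
        return size() > maxEntries;
    }
}
```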

Another option would be to use soft references:
http://docs.oracle.com/javase/7/docs/api/java/lang/ref/SoftReference.html

Soft references let you use all available heap space as a cache that
gets gc'd only when no strong references remain and the JVM needs the
memory. They are usually frowned upon for caches due to the
unpredictability of the discard--basically the garbage collector has
some heuristic by which it chooses what to discard (
http://jeremymanson.blogspot.com/2009/07/how-hotspot-decides-to-clear_07.html)
based on how much actual free memory it wants to maintain.
This makes soft references a little dicey for latency-sensitive services.
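A minimal sketch of what such a soft-reference object cache might look like (hypothetical code, not an actual Samza implementation):

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;

// The map holds SoftReferences rather than values, so the collector is free
// to reclaim cached objects under memory pressure.
public class SoftCache<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null; // never cached
        }
        V value = ref.get();
        if (value == null) {
            // The collector cleared the referent; drop the stale map entry.
            map.remove(key);
        }
        return value;
    }
}
```

A production version would also drain a ReferenceQueue so that map entries for cleared referents are purged eagerly instead of lingering until the next lookup.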

But for Samza the caching is really about optimizing throughput, not
reducing the latency of a particular lookup. So using the rest of the free
memory in the heap for caching is actually attractive. It is true that the
garbage collector might occasionally destroy our cache, but that is actually
okay and possibly worth it for orders of magnitude more cache space.

This does seem like the kind of thing that would have odd corner cases.
Anyone have practical experience with these who can tell me why this is a
bad idea?

-Jay

Re: soft references for object caching in the key-value storage engine

Posted by sriram <sr...@gmail.com>.
1. Soft reference collection is largely JVM-dependent. The question
is whether the task needs predictable performance or best effort. If there
were two tasks (both needing state) from the same user, one
memory-intensive and the other not so much, the user would have a hard time
understanding the behavior of his tasks.

2. The JVM needs to do more work on GC. By not bounding the memory,
depending on the heuristic used and the memory characteristics of the
tasks, GC could involve long stalls to clear up all the soft
references. My guess is we would end up tweaking
-XX:SoftRefLRUPolicyMSPerMB depending on the task.

It is worth testing with soft references, but I would argue that we should
test with different types of tasks. If it works well for a large subset,
then we have a win. If not, it is not much different from bounding the
cache size.



Re: soft references for object caching in the key-value storage engine

Posted by sriram <sr...@gmail.com>.
I am not sure about the performance difference between the object cache and
the block cache, so I would leave that decision to you. With respect to the
GC latencies, I do think they could be an issue for long-running
near-real-time systems. Consider a hypothetical case where a long GC pause
of 30 seconds happens once every hour. Over a day, the task does no useful
work for effectively 12 minutes; in other words, it would lag by 12 minutes
per day in processing its input streams. This lag accumulates over time,
and irrespective of how well the task otherwise keeps up, it would
eventually start falling behind. There are use cases for which this lag is
not acceptable. Whether it happens depends on the task semantics, but I
don't think it can be considered a minor issue.
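The arithmetic above can be checked directly (class and method names are just for illustration):

```java
// One 30-second GC pause per hour, accumulated over a 24-hour day.
public class LagEstimate {
    public static long lagMinutesPerDay(long pauseSeconds, long pausesPerDay) {
        long lagSecondsPerDay = pauseSeconds * pausesPerDay; // 30 * 24 = 720 s
        return lagSecondsPerDay / 60;                        // 720 / 60 = 12 min
    }

    public static void main(String[] args) {
        System.out.println(lagMinutesPerDay(30, 24) + " minutes of lag per day");
    }
}
```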



Re: soft references for object caching in the key-value storage engine

Posted by Jay Kreps <ja...@gmail.com>.
Sriram, yes, I think you raise the best criticism of this approach. In the
current design the caches are per task-store combination. This is arguably
a nightmare to tune, and in my experience people never do this kind of
thing right, but you do at least have the ability to say X% of memory for
store A and Y% for store B. Arguably, at the store level, a shared LRU
across stores should be fine (better, even), but between tasks this could
be an issue.

Martin, both you and Sriram raise the possibility of GC latency but that is
actually kind of a minor issue for a stream processing system (certainly in
comparison to a real-time request-response service).

Overall I think both these issues would tend to be minor because this is
just the object cache. LevelDB still has a block cache.

In either case I threw this out there more speculatively to see if anyone
knew of any critical drawbacks.

-Jay





Re: soft references for object caching in the key-value storage engine

Posted by Martin Scholl <m...@funkpopes.org>.
I'm by no means a JVM expert and am not able to give any final judgement on
this, but I can say I remember various problems people ran into when using
SoftReferences as well as WeakReferences.

What a quick search yielded:

"Soft references contribute to memory pressure but throughput collectors
clear them all at once when memory fills up while CMS gradually clears
them, so while you do get this memory sensitive gradual eviction of soft
reference data, you also get increased unpredictability of your garbage
collectors and that's not really what you want with CMS."
-- http://www.javaperformancetuning.com/news/newtips136.shtml

This is a nice argument that would defeat the purpose you outline here,
though I cannot tell whether only CMS shows this behavior.
That said, [1] seems to imply that SoftReferences, like WeakReferences,
are GC'd in an LRU-ish fashion.

My humble suggestion is instead to extend LevelDB to allow expunging data
by time in constant time.


Hope it helps,
Martin

[1]
http://stackoverflow.com/questions/299659/what-is-the-difference-between-a-soft-reference-and-a-weak-reference-in-java



Re: soft references for object caching in the key-value storage engine

Posted by Chris Riccomini <cr...@linkedin.com>.
Hey Jay,

Hmm. This seems cool, but I don't really know much about it. It seems like
it wouldn't be that much effort to patch the cache to try it, though.

One question I'd have is how this affects our heap usage metrics. If it
always appears that you're using 100% of the heap, it'd be nice to get
some measure of non-softly-referenced usage, so we have a view of how close
we are to running out of memory in a given container. It's the same
problem as the OS page cache with top's memory usage statistics.
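For what it's worth, the heap numbers in question come from the standard JMX bean; a sketch of reading them follows. With a soft-reference cache, the "used" figure would include softly reachable cache entries, so it overstates how close the container actually is to OOM:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Reads current heap usage via JMX. Softly reachable objects count as
// "used" until the collector clears them, which is exactly the metrics
// problem described above.
public class HeapUsage {
    public static long usedHeapBytes() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        System.out.println("used heap bytes: " + usedHeapBytes());
    }
}
```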

It seems pretty straightforward to patch locally and try it out. Maybe
we'll learn something from that.

Cheers,
Chris
