Posted to user@cayenne.apache.org by Andrus Adamchik <an...@objectstyle.org> on 2019/11/30 08:28:08 UTC

Watch out for memory leaks with EhCache

Hi,

Wanted to mention a scenario where cache misconfiguration can lead to a hard-to-detect memory leak. More details are available in this Jira [1], but the short story is that when you are using JCache/EhCache, make sure that *all* the cache groups used in queries are *explicitly* configured with desired size limits in ehcache.xml [2], including an entry for the no-cache-group scenario that corresponds to the "cayenne.default.cache" cache name.

Andrus

[1] https://issues.apache.org/jira/browse/CAY-2642

[2] <config
        xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
        xmlns='http://www.ehcache.org/v3'
        xsi:schemaLocation="
        http://www.ehcache.org/v3 http://www.ehcache.org/schema/ehcache-core-3.0.xsd">

    <!-- Used by Cayenne for queries without explicit cache groups. -->
    <cache alias="cayenne.default.cache">
        <expiry>
            <ttl unit="minutes">1</ttl>
        </expiry>
        <heap>100</heap>
    </cache>

    <cache alias="my-cache-group">
        <expiry>
            <ttl unit="minutes">1</ttl>
        </expiry>
        <heap>10</heap>
    </cache>
</config>
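
For illustration, here is roughly how queries line up with those aliases ("Artist" is a hypothetical entity; the group name passed to the query must match an alias above):

// stored under the "my-cache-group" alias
List<Artist> grouped = ObjectSelect.query(Artist.class)
        .localCache("my-cache-group")
        .select(context);

// no cache group: stored under "cayenne.default.cache"
List<Artist> ungrouped = ObjectSelect.query(Artist.class)
        .localCache()
        .select(context);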


Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
On Wed, Dec 11, 2019 at 3:13 PM John Huss <jo...@gmail.com> wrote:

>
>
> On Wed, Dec 11, 2019 at 1:59 PM John Huss <jo...@gmail.com> wrote:
>
>>
>>
>> On Wed, Dec 11, 2019 at 10:59 AM Andrus Adamchik <an...@objectstyle.org>
>> wrote:
>>
>>>
>>>
>>> > On Dec 10, 2019, at 9:39 PM, John Huss <jo...@gmail.com> wrote:
>>> >
>>> >> My projects are set up to use soft references using "
>>> >> cayenne.server.object_retain_strategy=soft". I wouldn't expect this to
>>> >> behave differently than weak under memory pressure, but my unit test
>>> with
>>> >> soft is not releasing this memory. I'll try running the test with
>>> "weak"
>>> >> and see if that changes it. And I'll look at your project (thanks for
>>> >> that).
>>> >>
>>> >
>>> > To clarify for future readers, we're talking about objects being
>>> retained
>>> > by this reference path:
>>> > QueryCache -> cached_object -> ObjectContext ->
>>> > other_objects_that_weren't_cached
>>> >
>>> > Sorry, I was wrong about the cause here. The persistent objects ARE
>>> > released by the context's objectStore (whether using weak or soft).
>>> But the
>>> > context is still retaining a lot of extra memory. I'm having a hard
>>> time
>>> > determining the specific cause. It might be the dataRowCache?
>>>
>>> If you have "Use Shared Cache" unchecked in the modeler, then you get a
>>> single dataRowCache per context, so that would definitely explain it. I am
>>> using the default - one dataRowCache per stack, shared by all contexts, so
>>> that's never a problem.
>>>
>>
>> I've narrowed it down - the "extra" memory being retained by the
>> ObjectContext is: entries in ObjectStore.objectMap - but not DataObjects
>> themselves (those get cleared), it's just the references to those objects:
>> the mapping from ObjectId -> WeakReference. These entries stay present
>> after the WeakReference is cleared. All those ObjectIds (though small) add
>> up to a significant amount of memory over time. It looks like these are
>> supposed to be cleared by ReferenceMap.checkReferenceQueue(), which calls
>> ReferenceQueue.poll() to find the cleared WeakReferences and remove those
>> entries from the ObjectStore's objectMap. However, poll() doesn't ever seem
>> to return any results (a cleared WeakReference). If I add in a call to
>> ReferenceQueue.remove(5) manually (which will block for up to five
>> milliseconds while it finds cleared references), it does return them and
>> clear that memory. I need to read more about ReferenceQueue, but the current
>> implementation of checkReferenceQueue() does not appear to be working, since
>> poll() never returns anything.
>>
>
> This is actually a timing problem, not a problem with poll().
> checkReferenceQueue() would need to be called *after* the context is done
> being used *and* the gc has occurred. As it currently stands,
> checkReferenceQueue() is called while the context is being used, which can
> help clear *some* memory that can be freed while in use, but doesn't help
> after the context is no longer used (like when it is just sitting, referenced
> by an object in the QueryCache).
>

This can be addressed by brute force: check all the ObjectContexts that are
being retained by the QueryCache and touch each one's ObjectStore.objectMap so
that it purges entries whose references have been cleared.

ExecutorService pool = Executors.newFixedThreadPool(4);

for (String cacheName : cacheManager.getCacheNames()) {
    Cache cache = cacheManager.getCache(cacheName);
    for (Object key : cache.getKeys()) {
        Element element = cache.get(key);
        List<Persistent> list = (List<Persistent>) element.getObjectValue();
        if (!list.isEmpty() && list.get(0).getObjectContext() instanceof DataContext) {
            ObjectStore objectStore =
                    ((DataContext) list.get(0).getObjectContext()).getObjectStore();
            pool.submit(() -> {
                // this will trigger ObjectStore.objectMap.checkReferenceQueue()
                objectStore.registeredObjectsCount();
            });
        }
    }
}

pool.shutdown();

...

But this is really a poor solution (it is specific to each QueryCache
implementation and wasteful of resources). The better choice by far would be
to not retain the original ObjectContext at all - either by replacing it
with a fresh context or by nulling it out. What problems would there be in
doing that?


>
>>
>>>
>>> Andrus
>>>
>>>

Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
On Wed, Dec 11, 2019 at 1:59 PM John Huss <jo...@gmail.com> wrote:

>
>
> On Wed, Dec 11, 2019 at 10:59 AM Andrus Adamchik <an...@objectstyle.org>
> wrote:
>
>>
>>
>> > On Dec 10, 2019, at 9:39 PM, John Huss <jo...@gmail.com> wrote:
>> >
>> >> My projects are set up to use soft references using "
>> >> cayenne.server.object_retain_strategy=soft". I wouldn't expect this to
>> >> behave differently than weak under memory pressure, but my unit test
>> with
>> >> soft is not releasing this memory. I'll try running the test with
>> "weak"
>> >> and see if that changes it. And I'll look at your project (thanks for
>> >> that).
>> >>
>> >
>> > To clarify for future readers, we're talking about objects being
>> retained
>> > by this reference path:
>> > QueryCache -> cached_object -> ObjectContext ->
>> > other_objects_that_weren't_cached
>> >
>> > Sorry, I was wrong about the cause here. The persistent objects ARE
>> > released by the context's objectStore (whether using weak or soft). But
>> the
>> > context is still retaining a lot of extra memory. I'm having a hard time
>> > determining the specific cause. It might be the dataRowCache?
>>
>> If you have "Use Shared Cache" unchecked in the modeler, then you get a
>> single dataRowCache per context, so that would definitely explain it. I am
>> using the default - one dataRowCache per stack, shared by all contexts, so
>> that's never a problem.
>>
>
> I've narrowed it down - the "extra" memory being retained by the
> ObjectContext is: entries in ObjectStore.objectMap - but not DataObjects
> themselves (those get cleared), it's just the references to those objects:
> the mapping from ObjectId -> WeakReference. These entries stay present
> after the WeakReference is cleared. All those ObjectIds (though small) add
> up to a significant amount of memory over time. It looks like these are
> supposed to be cleared by ReferenceMap.checkReferenceQueue(), which calls
> ReferenceQueue.poll() to find the cleared WeakReferences and remove those
> entries from the ObjectStore's objectMap. However, poll() doesn't ever seem
> to return any results (a cleared WeakReference). If I add in a call to
> ReferenceQueue.remove(5) manually (which will block for up to five
> milliseconds while it finds cleared references), it does return them and
> clear that memory. I need to read more about ReferenceQueue, but the current
> implementation of checkReferenceQueue() does not appear to be working, since
> poll() never returns anything.
>

This is actually a timing problem, not a problem with poll().
checkReferenceQueue() would need to be called *after* the context is done
being used *and* the gc has occurred. As it currently stands,
checkReferenceQueue() is called while the context is being used, which can
help clear *some* memory that can be freed while in use, but doesn't help
after the context is no longer used (like when it is just sitting, referenced
by an object in the QueryCache).
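
To illustrate the timing, here is a minimal self-contained sketch (plain JDK,
no Cayenne) showing that poll() comes up empty until a GC has actually run,
while remove(timeout) waits for the cleared reference:

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class ReferenceQueueDemo {

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        // keep 'ref' reachable so the Reference object itself can be enqueued
        WeakReference<Object> ref = new WeakReference<>(new Object(), queue);

        // The referent is already unreachable, but nothing is enqueued
        // until the GC actually clears the reference, so this prints null.
        System.out.println("poll() before GC: " + queue.poll());

        // Request a collection (a hint, but usually honored in practice).
        System.gc();

        // remove(timeout) blocks until a cleared reference is enqueued
        // (or the timeout elapses), so it picks up what poll() missed.
        Reference<?> cleared = queue.remove(100);
        System.out.println("remove() after GC: " + cleared);
    }
}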


>
>>
>> Andrus
>>
>>

Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
On Wed, Dec 11, 2019 at 10:59 AM Andrus Adamchik <an...@objectstyle.org>
wrote:

>
>
> > On Dec 10, 2019, at 9:39 PM, John Huss <jo...@gmail.com> wrote:
> >
> >> My projects are set up to use soft references using "
> >> cayenne.server.object_retain_strategy=soft". I wouldn't expect this to
> >> behave differently than weak under memory pressure, but my unit test
> with
> >> soft is not releasing this memory. I'll try running the test with "weak"
> >> and see if that changes it. And I'll look at your project (thanks for
> >> that).
> >>
> >
> > To clarify for future readers, we're talking about objects being retained
> > by this reference path:
> > QueryCache -> cached_object -> ObjectContext ->
> > other_objects_that_weren't_cached
> >
> > Sorry, I was wrong about the cause here. The persistent objects ARE
> > released by the context's objectStore (whether using weak or soft). But
> the
> > context is still retaining a lot of extra memory. I'm having a hard time
> > determining the specific cause. It might be the dataRowCache?
>
> If you have "Use Shared Cache" unchecked in the modeler, then you get a
> single dataRowCache per context, so that would definitely explain it. I am
> using the default - one dataRowCache per stack, shared by all contexts, so
> that's never a problem.
>

I've narrowed it down - the "extra" memory being retained by the
ObjectContext is: entries in ObjectStore.objectMap - but not DataObjects
themselves (those get cleared), it's just the references to those objects:
the mapping from ObjectId -> WeakReference. These entries stay present
after the WeakReference is cleared. All those ObjectIds (though small) add
up to a significant amount of memory over time. It looks like these are
supposed to be cleared by ReferenceMap.checkReferenceQueue(), which calls
ReferenceQueue.poll() to find the cleared WeakReferences and remove those
entries from the ObjectStore's objectMap. However, poll() doesn't ever seem
to return any results (a cleared WeakReference). If I add in a call to
ReferenceQueue.remove(5) manually (which will block for up to five
milliseconds while it finds cleared references), it does return them and
clear that memory. I need to read more about ReferenceQueue, but the current
implementation of checkReferenceQueue() does not appear to be working, since
poll() never returns anything.


>
> Andrus
>
>

Re: Watch out for memory leaks with EhCache

Posted by Andrus Adamchik <an...@objectstyle.org>.

> On Dec 10, 2019, at 9:39 PM, John Huss <jo...@gmail.com> wrote:
> 
>> My projects are set up to use soft references using "
>> cayenne.server.object_retain_strategy=soft". I wouldn't expect this to
>> behave differently than weak under memory pressure, but my unit test with
>> soft is not releasing this memory. I'll try running the test with "weak"
>> and see if that changes it. And I'll look at your project (thanks for
>> that).
>> 
> 
> To clarify for future readers, we're talking about objects being retained
> by this reference path:
> QueryCache -> cached_object -> ObjectContext ->
> other_objects_that_weren't_cached
> 
> Sorry, I was wrong about the cause here. The persistent objects ARE
> released by the context's objectStore (whether using weak or soft). But the
> context is still retaining a lot of extra memory. I'm having a hard time
> determining the specific cause. It might be the dataRowCache?

If you have "Use Shared Cache" unchecked in the modeler, then you get a single dataRowCache per context, so that would definitely explain it. I am using the default - one dataRowCache per stack, shared by all contexts, so that's never a problem.

Andrus


Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
On Tue, Dec 10, 2019 at 9:51 AM John Huss <jo...@gmail.com> wrote:

>
> Thanks for the feedback - I appreciate it.
>
> On Sun, Dec 8, 2019 at 3:17 AM Andrus Adamchik <an...@objectstyle.org>
> wrote:
>
>> Hi John,
>>
>> Thanks for the analysis. You've put significant effort into this issue,
>> but I still feel like we have differences in understanding of the causes
>> (and hence the remedies). Let me try to clarify how I see it.
>>
>> > On Dec 7, 2019, at 12:09 AM, John Huss <jo...@gmail.com> wrote:
>> >
>> > 1) The lifetime of entries in the Local Query Cache exceeds their
>> > availability, which is the life of their ObjectContext. Any cache that
>> is
>> > not expiring entries (or limiting them) will just leak this memory.
>>
>> A properly configured cache will "overcommit" memory, but not "leak"
>> (i.e. it won't expand indefinitely). Misconfigured cache was the cause of
>> the issue in the parent message. Once I added upper limits on the
>> number of entries to all cache groups, the leak disappeared.
>>
>> > 2) Cached query results will retain the ObjectContext they were fetched
>> > into, which in turn may retain a much larger number of objects than
>> > intended. For example, if you use a single ObjectContext to fetch 1
>> million
>> > uncached objects along with 1 cached object, you will retain 1 million
>> and
>> > 1 objects in memory rather than just 1.
>>
>> This doesn't look right. Objects are stored via weak references by the
>> OC, so if there are no other references to them, they will be GC'd even if
>> the context is still around. I wrote a self-contained test project [1] that
>> proves this assumption [2]. If you see unexpected extra objects cached,
>> they are likely retained via prefetched / faulted relationships from the
>> explicitly cached objects (see "testWeakReferences_ToOneRelationships" test
>> in [2]).
>>
>> > This is potentially an issue with both the Shared and Local Query
>> Caches.
>>
>> Shared cache stores snapshots that do not have references to the
>> ObjectContext. So it can't be an issue there.
>>
>> > Also, because the cached objects still reference the ObjectContext, it
>> > appears that the context will not be garbage collected.
>>
>> Correct.
>>
>> > Possible Solutions:
>> >
>> > One solution is to null out the ObjectContext on any objects that are
>> > inserted into the Query Cache. This solves both problems above, and it
>> > seems logical since when the objects are retrieved from the cache they
>> will
>> > be placed into a new context anyway. This should work, but the
>> > implementation has been tricky.
>>
>> This will not work at the framework level, as Cayenne doesn't know who
>> else beside the cache references cached objects. So you'd be breaking the
>> state of the object while it is still exposed to the world.
>>
>
> My approach was to:
> 1) make a localObject copy of the object first
> 2) null the ObjectContext on the copy
> 3) store it in the cache
>
> That way only the cached object is affected. Based on some VERY simple
> tests this seems to work, but I haven't tried with any real data or real
> apps yet.
>
>
>> > What Now?
>> >
>> > I've taken a first stab at implementing both of these solutions and have
>> > had some concerns raised about each of them [1] [2]. I'd like to
>> implement
>> > something to fix this problem directly in Cayenne rather than fixing it
>> > only for myself. I'd love to hear any feedback or suggestions on this
>> > before I go further down what might be the wrong road.
>>
>> I happen to agree that the cache model needs improvement to be more
>> transparent and easy to use. But I'd like to establish some common ground
>> in understanding the problems before we can discuss solutions. So to
>> reiterate what I said in reply to items 1 and 2:
>>
>> * Is there a difference in terminology of what a "leak" is? Do you view
>> it as just an overuse of memory but with a fixed upper boundary, or do you
>> see it as a constant expansion that eventually leads to the app running out
>> of memory no matter how much memory it had?
>>
>
> The constant expansion is the real problem. The overuse of memory is
> undesirable, but not a real problem. What makes it overuse instead of
> constant expansion depends on how the cache is configured (as you said). I
> would prefer to leave my cache groups unbounded for groups I'm using Local
> Caching for, just so that I can have consistent (unsurprising) behavior
> (always hits the cache or not for a given code path). Additionally, like
> this original message to the user's list points out, new or unaware users
> can accidentally cause constant expansion without knowing it - and I think
> we can do better there.
>
>
>> * Objects are stored in the context via weak references, so if they are
>> not directly cached, they can be GC'd, even if their context is retained.
>> Do you observe a behavior that contradicts this?
>>
>
> My projects are set up to use soft references using "
> cayenne.server.object_retain_strategy=soft". I wouldn't expect this to
> behave differently than weak under memory pressure, but my unit test with
> soft is not releasing this memory. I'll try running the test with "weak"
> and see if that changes it. And I'll look at your project (thanks for
> that).
>

To clarify for future readers, we're talking about objects being retained
by this reference path:
QueryCache -> cached_object -> ObjectContext ->
other_objects_that_weren't_cached

Sorry, I was wrong about the cause here. The persistent objects ARE
released by the context's objectStore (whether using weak or soft). But the
context is still retaining a lot of extra memory. I'm having a hard time
determining the specific cause. It might be the dataRowCache?

My test fetches 100 objects and just throws them away (not cached). Then it
fetches a single object (from a different entity) and caches it. It repeats
this over and over with a new context each time. Since the 100 objects
aren't used in any way I wouldn't expect them to be retained in memory, but
the unit test ends with OutOfMemory much sooner (around 100x sooner) when
that fetch for 100 objects is present. So the single cached object is
retaining the ObjectContext, which in turn is retaining a bunch of memory
for things I'm no longer using and don't want cached.
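
In outline, the test loop looks roughly like this ("Artist" and "Painting"
are hypothetical entities, "runtime" an existing ServerRuntime, and the
iteration count arbitrary):

for (int i = 0; i < 1_000_000; i++) {
    ObjectContext context = runtime.newContext();

    // 100 throwaway objects: never cached, never used again
    ObjectSelect.query(Painting.class).limit(100).select(context);

    // one cached object from a different entity; the cache entry retains
    // the context, and the context retains the "extra" memory
    ObjectSelect.query(Artist.class)
            .localCache("my-cache-group")
            .selectFirst(context);
}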


>> * Shared cache can not be the cause of ObjectContext retention. Again, do
>> you observe a behavior that contradicts this?
>>
>
> No, I haven't had issues with the shared cache and have really only been
> looking at the Local Cache. So that was just conjecture, sorry. The Shared
> cache clearly needs expiration or bounding (or both) to be configured by
> the user so it isn't subject to these same considerations anyway.
>
>
>> Andrus
>>
>> [1] https://github.com/andrus/cache-test/
>> [2]
>> https://github.com/andrus/cache-test/blob/master/src/test/java/com/foo/ApplicationTest.java
>>
>>
>>

Re: Watch out for memory leaks with EhCache

Posted by Andrus Adamchik <an...@objectstyle.org>.
> On Dec 23, 2019, at 6:09 PM, John Huss <jo...@gmail.com> wrote:
> 
> I've decided to try using a simple QueryCache subclass I'm calling
> PartitionedQueryCache
> <https://gist.github.com/johnthuss/c5f6b6ce76036afd028199da312ddb08> which
> will enable independent lifetimes for the shared cache versus local cache.
> I'm going to run with this for a while. If there is any interest I can
> contribute it.
> 
> Thanks!

Ah right. There's another dimension to this discussion - separation of long-lived from short-lived caches. This can also be achieved at the query level by using different cache groups. Each cache group maps to its own EhCache Cache object with its own size and expiration settings.
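
As a concrete (purely illustrative) sketch, that split could look like this in
ehcache.xml - the aliases and numbers here are hypothetical:

    <!-- long-lived group for slow-changing reference data -->
    <cache alias="long-lived-group">
        <expiry>
            <ttl unit="hours">12</ttl>
        </expiry>
        <heap>1000</heap>
    </cache>

    <!-- short-lived, size-limited group for request-scope results -->
    <cache alias="short-lived-group">
        <expiry>
            <ttl unit="seconds">30</ttl>
        </expiry>
        <heap>100</heap>
    </cache>

Queries then opt into one or the other, e.g. .localCache("short-lived-group").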


> On Dec 23, 2019, at 6:09 PM, John Huss <jo...@gmail.com> wrote:
> 
>> 2. Set fairly short expiration times. E.g. 10x of your average response
>> time (say 10-30 sec).

BTW, the above statement from my previous email needs some clarification. When implementing a "request-scope cache", you actually don't care if the expiration time is too long, as the cache is not shared between requests (if each request has its own ObjectContext). Eviction happens because of the *size* constraint, not the expiration constraint.

So the conclusion here is that such a cache can be long-lived as well, just size-limited.

Andrus


Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
On Sun, Dec 22, 2019 at 5:47 AM Andrus Adamchik <an...@objectstyle.org>
wrote:

>
>
> > On Dec 11, 2019, at 11:18 PM, John Huss <jo...@gmail.com> wrote:
> >
> > My use case is very limited in scope. I want to have fresh data basically
> > all the time, but not fetch the same data twice in the same request. Once
> > the request is over and the request's context is out of scope, then the
> > local cache for that context can be cleared. So this cached data is
> > extremely short lived, and as a consequence the size of it doesn't really
> > matter (though I'm only caching small query results anyway).
>
> Thanks for clarifying. I've dealt a lot with "request-scope cache"
> scenarios. Here is my approach:
>
> 1. Set the lowest reasonable upper limit on the number of entries in the
> main cache (parent of local caches). E.g. 3x of your average requests / min
> or something.
> 2. Set fairly short expiration times. E.g. 10x of your average response
> time (say 10-30 sec).
>
> (1) ensures quick turnover and prevents leaks
> (2) ensures your request doesn't fetch the same data twice, and will not
> result in any stale data, as every new context will start its own "region"
> for the local cache
>
> So you may end up using a bit more memory than absolutely needed, but you
> avoid the leaks, and get the desired freshness/caching balance.
>
> > I've narrowed it down - the "extra" memory being retained by the
> > ObjectContext is: entries in ObjectStore.objectMap - but not DataObjects
> > themselves (those get cleared), it's just the references to those
> objects:
> > the mapping from ObjectId -> WeakReference. These entries stay present
> > after the WeakReference is cleared. All those ObjectIds (though small)
> add
> > up to a significant amount of memory over time.
>
> Good catch. DataContext getting stuck in the local cache does seem to
> leave a lot of unintended garbage floating around. While it doesn't change
> the equation above, it is still a waste that we may need to address.
>
> So to me the approach above is "good enough". Though we may still come up
> with a design that allows for more control (and hence the lowest possible
> memory footprint). I suspect any such solution would require
> ObjectContext.close() method to allow the user to delineate context scope.
>

 I've decided to try using a simple QueryCache subclass I'm calling
PartitionedQueryCache
<https://gist.github.com/johnthuss/c5f6b6ce76036afd028199da312ddb08> which
will enable independent lifetimes for the shared cache versus local cache.
I'm going to run with this for a while. If there is any interest I can
contribute it.

Thanks!

Re: Watch out for memory leaks with EhCache

Posted by Andrus Adamchik <an...@objectstyle.org>.

> On Dec 11, 2019, at 11:18 PM, John Huss <jo...@gmail.com> wrote:
> 
> My use case is very limited in scope. I want to have fresh data basically
> all the time, but not fetch the same data twice in the same request. Once
> the request is over and the request's context is out of scope, then the
> local cache for that context can be cleared. So this cached data is
> extremely short lived, and as a consequence the size of it doesn't really
> matter (though I'm only caching small query results anyway).

Thanks for clarifying. I've dealt a lot with "request-scope cache" scenarios. Here is my approach:

1. Set the lowest reasonable upper limit on the number of entries in the main cache (parent of local caches). E.g. 3x of your average requests / min or something.
2. Set fairly short expiration times. E.g. 10x of your average response time (say 10-30 sec).

(1) ensures quick turnover and prevents leaks
(2) ensures your request doesn't fetch the same data twice, and will not result in any stale data, as every new context will start its own "region" for the local cache

So you may end up using a bit more memory than absolutely needed, but you avoid the leaks, and get the desired freshness/caching balance. 
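
As a concrete (and again hypothetical) example: for an app averaging 100
requests/min with ~1 sec responses, rules (1) and (2) might translate to:

    <cache alias="request-scope-group">
        <expiry>
            <!-- (2) ~10x the average response time -->
            <ttl unit="seconds">10</ttl>
        </expiry>
        <!-- (1) ~3x the average requests/min -->
        <heap>300</heap>
    </cache>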

> I've narrowed it down - the "extra" memory being retained by the
> ObjectContext is: entries in ObjectStore.objectMap - but not DataObjects
> themselves (those get cleared), it's just the references to those objects:
> the mapping from ObjectId -> WeakReference. These entries stay present
> after the WeakReference is cleared. All those ObjectIds (though small) add
> up to a significant amount of memory over time. 

Good catch. DataContext getting stuck in the local cache does seem to leave a lot of unintended garbage floating around. While it doesn't change the equation above, it is still a waste that we may need to address.

So to me the approach above is "good enough". Though we may still come up with a design that allows for more control (and hence the lowest possible memory footprint). I suspect any such solution would require ObjectContext.close() method to allow the user to delineate context scope. 

Andrus



Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
On Wed, Dec 11, 2019 at 11:38 AM Andrus Adamchik <an...@objectstyle.org>
wrote:

>
>
> > On Dec 10, 2019, at 6:51 PM, John Huss <jo...@gmail.com> wrote:
> >
> >>> One solution is to null out the ObjectContext on any objects that are
> >>> inserted into the Query Cache. This solves both problems above, and it
> >>> seems logical since when the objects are retrieved from the cache they
> >> will
> >>> be placed into a new context anyway. This should work, but the
> >>> implementation has been tricky.
> >>
> >> This will not work at the framework level, as Cayenne doesn't know who
> >> else beside the cache references cached objects. So you'd be breaking
> the
> >> state of the object while it is still exposed to the world.
> >
> > My approach was to:
> > 1) make a localObject copy of the object first
> > 2) null the ObjectContext on the copy
> > 3) store it in the cache
> >
> > That way only the cached object is affected. Based on some VERY simple
> > tests this seems to work, but I haven't tried with any real data or real
> > apps yet.
>
> That's an interesting approach. "localObject" has to be called on some
> context, so I guess you have another hidden context for this operation? And
> I assume you need to call "localObject" again when retrieving that object
> from cache into another context. So since you need to go through all these
> steps, you might as well use built-in shared cache.
>

Yes, obtaining a new ObjectContext to use for localObject is the sketchy
part here.

The point of this little dance is to avoid retaining extra memory other
than the cached objects, so using the shared cache wouldn't improve that
situation.

I'm not actually using this yet, just experimenting. It looks like the
extra memory issue can be addressed more directly.


>
> >> * Is there a difference in terminology of what a "leak" is? Do you view
> it
> >> as just an overuse of memory but with a fixed upper boundary, or do you
> see
> >> it as a constant expansion that eventually leads to the app running out
> of
> >> memory no matter how much memory it had?
> >
> > The constant expansion is the real problem. The overuse of memory is
> > undesirable, but not a real problem.
>
> So we are on the same page about the definition of a leak.
>
> > I would prefer to leave my cache groups unbounded for groups I'm using
> Local
> > Caching for, just so that I can have consistent (unsurprising) behavior
> > (always hits the cache or not for a given code path).
>
> I am starting to understand your thinking here (though still not the
> underlying reasons for it). To me a cache is naturally probabilistic (entries
> expire, or fall off the end of an LRU map). You are talking about
> "consistent cache". So let's delve into that scenario.
>
> If you have a lot of possible cache keys (not cache groups, but rather
> combinations of query permutations), an unbounded cache will always leak.
> With a fixed set of keys and many contexts the unbounded cache will leak as
> well, and to add insult to injury, it will not reuse entries between the
> contexts.
>
> It will NOT leak with a fixed set of keys and a singleton context (but
> will require calling "localObject" on each "get"). And it will not leak
> with a fixed set of keys and a shared cache. Do you see any issue with
> either of these solutions?
>
> And even though the solutions above should solve it, an idea of an
> unbounded, never-expiring cache still seems dangerous. Sooner or later a
> programming error will lead to a memory use explosion. Could you maybe
> expand on why a probabilistic cache is not a good fit for your app?
>

My use case is very limited in scope. I want to have fresh data basically
all the time, but not fetch the same data twice in the same request. Once
the request is over and the request's context is out of scope, then the
local cache for that context can be cleared. So this cached data is
extremely short lived, and as a consequence the size of it doesn't really
matter (though I'm only caching small query results anyway).

I'd like to have methods like:

public MyEntity fetchMyEntity(ObjectContext context, int propValue) {
    return ObjectSelect.query(MyEntity.class)
            .where(MyEntity.PROP_VALUE.eq(propValue))
            .localCache()
            .selectFirst(context);
}

Where I can call this from any code path without having to worry about
whether the context has a lot of objects in it or not (sometimes it is
huge). The localCache will keep it from fetching more than once per request.
This is really just a shorthand for this:

private MyEntity cachedMyEntity;

public MyEntity fetchMyEntity(ObjectContext context, int propValue) {
    if (cachedMyEntity == null) {
        cachedMyEntity = ObjectSelect.query(MyEntity.class)
                .where(MyEntity.PROP_VALUE.eq(propValue))
                .selectFirst(context);
    }
    return cachedMyEntity;
}

But with localCache I don't have to declare a temporary cache variable
every time I want to avoid a fetch.


> Andrus
>
>

Re: Watch out for memory leaks with EhCache

Posted by Andrus Adamchik <an...@objectstyle.org>.

> On Dec 10, 2019, at 6:51 PM, John Huss <jo...@gmail.com> wrote:
> 
>>> One solution is to null out the ObjectContext on any objects that are
>>> inserted into the Query Cache. This solves both problems above, and it
>>> seems logical since when the objects are retrieved from the cache they
>> will
>>> be placed into a new context anyway. This should work, but the
>>> implementation has been tricky.
>> 
>> This will not work at the framework level, as Cayenne doesn't know who
>> else beside the cache references cached objects. So you'd be breaking the
>> state of the object while it is still exposed to the world.
> 
> My approach was to:
> 1) make a localObject copy of the object first
> 2) null the ObjectContext on the copy
> 3) store it in the cache
> 
> That way only the cached object is affected. Based on some VERY simple
> tests this seems to work, but I haven't tried with any real data or real
> apps yet.

That's an interesting approach. "localObject" has to be called on some context, so I guess you have another hidden context for this operation? And I assume you need to call "localObject" again when retrieving that object from cache into another context. So since you need to go through all these steps, you might as well use built-in shared cache.


>> * Is there a difference in terminology of what a "leak" is? Do you view it
>> as just an overuse of memory but with a fixed upper boundary, or do you see
>> it as a constant expansion that eventually leads to the app running out of
>> memory no matter how much memory it had?
> 
> The constant expansion is the real problem. The overuse of memory is
> undesirable, but not a real problem.

So we are on the same page about the definition of a leak. 

> I would prefer to leave my cache groups unbounded for groups I'm using Local
> Caching for, just so that I can have consistent (unsurprising) behavior
> (always hits the cache or not for a given code path).

I am starting to understand your thinking here (though still not the underlying reasons for it). To me a cache is naturally probabilistic (entries expire, or fall off the end of an LRU map). You are talking about "consistent cache". So let's delve into that scenario.

If you have a lot of possible cache keys (not cache groups, but rather combinations of query permutations), an unbounded cache will always leak. With a fixed set of keys and many contexts the unbounded cache will leak as well, and to add insult to injury, it will not reuse entries between the contexts.

It will NOT leak with a fixed set of keys and a singleton context (but will require calling "localObject" on each "get"). And it will not leak with a fixed set of keys and a shared cache. Do you see any issue with either of these solutions? 

And even though the solutions above should solve it, an idea of an unbounded, never-expiring cache still seems dangerous. Sooner or later a programming error will lead to a memory use explosion. Could you maybe expand on why a probabilistic cache is not a good fit for your app?

Andrus


Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
Thanks for the feedback - I appreciate it.

On Sun, Dec 8, 2019 at 3:17 AM Andrus Adamchik <an...@objectstyle.org>
wrote:

> Hi John,
>
> Thanks for the analysis. You've put significant effort into this issue,
> but I still feel like we have differences in understanding of the causes
> (and hence the remedies). Let me try to clarify how I see it.
>
> > On Dec 7, 2019, at 12:09 AM, John Huss <jo...@gmail.com> wrote:
> >
> > 1) The lifetime of entries in the Local Query Cache exceeds their
> > availability, which is the life of their ObjectContext. Any cache that is
> > not expiring entries (or limiting them) will just leak this memory.
>
> A properly configured cache will "overcommit" memory, but not "leak" (i.e.
> it won't expand indefinitely). Misconfigured cache was the cause of the
> issue in the parent message. Once I added upper limits on the number of
> entries to all cache groups, the leak disappeared.
>
> > 2) Cached query results will retain the ObjectContext they were fetched
> > into, which in turn may retain a much larger number of objects than
> > intended. For example, if you use a single ObjectContext to fetch 1
> million
> > uncached objects along with 1 cached object, you will retain 1 million
> and
> > 1 objects in memory rather than just 1.
>
> This doesn't look right. Objects are stored via weak references by the OC,
> so if there are no other references to them, they will be GC'd even if the
> context is still around. I wrote a self-contained test project [1] that
> proves this assumption [2]. If you see unexpected extra objects cached,
> they are likely retained via prefetched / faulted relationships from the
> explicitly cached objects (see "testWeakReferences_ToOneRelationships" test
> in [2]).
>
> > This is potentially an issue with both the Shared and Local Query Caches.
>
> Shared cache stores snapshots that do not have references to the
> ObjectContext. So it can't be an issue there.
>
> > Also, because the cached objects still reference the ObjectContext, it
> > appears that the context will not be garbage collected.
>
> Correct.
>
> > Possible Solutions:
> >
> > One solution is to null out the ObjectContext on any objects that are
> > inserted into the Query Cache. This solves both problems above, and it
> > seems logical since when the objects are retrieved from the cache they
> will
> > be placed into a new context anyway. This should work, but the
> > implementation has been tricky.
>
> This will not work at the framework level, as Cayenne doesn't know who
> else beside the cache references cached objects. So you'd be breaking the
> state of the object while it is still exposed to the world.
>

My approach was to:
1) make a localObject copy of the object first
2) null the ObjectContext on the copy
3) store it in the cache

That way only the cached object is affected. Based on some VERY simple
tests this seems to work, but I haven't tried with any real data or real
apps yet.
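
A minimal sketch of those three steps (here "scratchContext" is an assumed
throwaway DataContext used only for the copy, and "cache"/"cacheKey" stand in
for the real query cache storage):

// 1) copy the object into a scratch context
Persistent copy = scratchContext.localObject(original);

// 2) detach the copy so the cache entry doesn't pin any ObjectContext
copy.setObjectContext(null);

// 3) store the detached copy
cache.put(cacheKey, copy);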


> > What Now?
> >
> > I've taken a first stab at implementing both of these solutions and have
> > had some concerns raised about each of them [1] [2]. I'd like to
> implement
> > something to fix this problem directly in Cayenne rather than fixing it
> > only for myself. I'd love to hear any feedback or suggestions on this
> > before I go further down what might be the wrong road.
>
> I happen to agree that the cache model needs improvement to be more
> transparent and easy to use. But I'd like to establish some common ground
> in understanding the problems before we can discuss solutions. So to
> reiterate what I said in reply to items 1 and 2:
>
> * Is there a difference in terminology of what a "leak" is? Do you view it
> as just an overuse of memory but with a fixed upper boundary, or do you see
> it as a constant expansion that eventually leads to the app running out of
> memory no matter how much memory it had?
>

The constant expansion is the real problem. The overuse of memory is
undesirable, but not a real problem. What makes it overuse instead of
constant expansion depends on how the cache is configured (as you said). I
would prefer to leave my cache groups unbounded for groups I'm using Local
Caching for, just so that I can have consistent (unsurprising) behavior
(always hits the cache or not for a given code path). Additionally, like
this original message to the user's list points out, new or unaware users
can accidentally cause constant expansion without knowing it - and I think
we can do better there.


> * Objects are stored in the context via weak references, so if they are
> not directly cached, they can be GC'd, even if their context is retained.
> Do you observe a behavior that contradicts this?
>

My projects are set up to use soft references using "
cayenne.server.object_retain_strategy=soft". I wouldn't expect this to
behave differently than weak under memory pressure, but my unit test with
soft is not releasing this memory. I'll try running the test with "weak"
and see if that changes it. And I'll look at your project (thanks for that).
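
For reference, here is a sketch of how that property can be set
programmatically (assuming Cayenne 4.x, where
Constants.SERVER_OBJECT_RETAIN_STRATEGY_PROPERTY names
"cayenne.server.object_retain_strategy"; passing it as a -D system property
should work as well):

ServerRuntime runtime = ServerRuntime.builder()
        .addConfig("cayenne-project.xml")
        .addModule(binder -> ServerModule.contributeProperties(binder)
                .put(Constants.SERVER_OBJECT_RETAIN_STRATEGY_PROPERTY, "soft"))
        .build();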

> * Shared cache can not be the cause of ObjectContext retention. Again, do
> you observe a behavior that contradicts this?
>

No, I haven't had issues with the shared cache and have really only been looking
at the Local Cache. So that was just conjecture, sorry. The Shared cache
clearly needs expiration or bounding (or both) to be configured by the user
so it isn't subject to these same considerations anyway.


> Andrus
>
> [1] https://github.com/andrus/cache-test/
> [2]
> https://github.com/andrus/cache-test/blob/master/src/test/java/com/foo/ApplicationTest.java
>
>
>

Re: Watch out for memory leaks with EhCache

Posted by Andrus Adamchik <an...@objectstyle.org>.
Hi John,

Thanks for the analysis. You've put significant effort into this issue, but I still feel like we have differences in understanding of the causes (and hence the remedies). Let me try to clarify how I see it.

> On Dec 7, 2019, at 12:09 AM, John Huss <jo...@gmail.com> wrote:
> 
> 1) The lifetime of entries in the Local Query Cache exceeds their
> availability, which is the life of their ObjectContext. Any cache that is
> not expiring entries (or limiting them) will just leak this memory.

A properly configured cache will "overcommit" memory, but not "leak" (i.e. it won't expand indefinitely). Misconfigured cache was the cause of the issue in the parent message. Once I added upper limits on the number of entries to all cache groups, the leak disappeared.

> 2) Cached query results will retain the ObjectContext they were fetched
> into, which in turn may retain a much larger number of objects than
> intended. For example, if you use a single ObjectContext to fetch 1 million
> uncached objects along with 1 cached object, you will retain 1 million and
> 1 objects in memory rather than just 1.

This doesn't look right. Objects are stored via weak references by the OC, so if there are no other references to them, they will be GC'd even if the context is still around. I wrote a self-contained test project [1] that proves this assumption [2]. If you see unexpected extra objects cached, they are likely retained via prefetched / faulted relationships from the explicitly cached objects (see "testWeakReferences_ToOneRelationships" test in [2]).

> This is potentially an issue with both the Shared and Local Query Caches.

Shared cache stores snapshots that do not have references to the ObjectContext. So it can't be an issue there.

> Also, because the cached objects still reference the ObjectContext, it
> appears that the context will not be garbage collected.

Correct.

> Possible Solutions:
> 
> One solution is to null out the ObjectContext on any objects that are
> inserted into the Query Cache. This solves both problems above, and it
> seems logical since when the objects are retrieved from the cache they will
> be placed into a new context anyway. This should work, but the
> implementation has been tricky.

This will not work at the framework level, as Cayenne doesn't know who else beside the cache references cached objects. So you'd be breaking the state of the object while it is still exposed to the world.

> What Now?
> 
> I've taken a first stab at implementing both of these solutions and have
> had some concerns raised about each of them [1] [2]. I'd like to implement
> something to fix this problem directly in Cayenne rather than fixing it
> only for myself. I'd love to hear any feedback or suggestions on this
> before I go further down what might be the wrong road.

I happen to agree that the cache model needs improvement to be more transparent and easy to use. But I'd like to establish some common ground in understanding the problems before we can discuss solutions. So to reiterate what I said in reply to items 1 and 2:

* Is there a difference in terminology of what a "leak" is? Do you view it as just an overuse of memory but with a fixed upper boundary, or do you see it as a constant expansion that eventually leads to the app running out of memory no matter how much memory it had? 

* Objects are stored in the context via weak references, so if they are not directly cached, they can be GC'd, even if their context is retained. Do you observe a behavior that contradicts this?

* Shared cache can not be the cause of ObjectContext retention. Again, do you observe a behavior that contradicts this?

Andrus

[1] https://github.com/andrus/cache-test/
[2] https://github.com/andrus/cache-test/blob/master/src/test/java/com/foo/ApplicationTest.java



Re: Watch out for memory leaks with EhCache

Posted by John Huss <jo...@gmail.com>.
Switching to the dev list:

I've had some time this week to revisit this issue with memory leaks,
especially when using the Local Query Cache. There are two separate issues
to address:

1) The lifetime of entries in the Local Query Cache exceeds their
availability, which is the life of their ObjectContext. Any cache that is
not expiring entries (or limiting them) will just leak this memory.

2) Cached query results will retain the ObjectContext they were fetched
into, which in turn may retain a much larger number of objects than
intended. For example, if you use a single ObjectContext to fetch 1 million
uncached objects along with 1 cached object, you will retain 1 million and
1 objects in memory rather than just 1. This is potentially an issue with
both the Shared and Local Query Caches.

Also, because the cached objects still reference the ObjectContext, it
appears that the context will not be garbage collected. So a simple attempt
to solve issue #1 by invalidating the Local Query Cache when an
ObjectContext is finalized doesn't work, because the context will never be
finalized.


Possible Solutions:

One solution is to null out the ObjectContext on any objects that are
inserted into the Query Cache. This solves both problems above, and it
seems logical since when the objects are retrieved from the cache they will
be placed into a new context anyway. This should work, but the
implementation has been tricky. Handling the deep object graphs due to
prefetching makes this a bit more complicated. This is what I've been working
on most recently.

A different solution that fixes the issues with the Local Cache (but not
the Shared Cache) would be to separate the storage for these two caches and
use a new instance of the Local Cache for each ObjectContext so that their
lifetimes correspond exactly. This could be done inside the Query Cache
itself so that users wouldn't need to know. I'm imagining a CombinedCache
or something that combines two caches together into one, where the Shared
Cache entries go to a long-lived cache and the Local Cache entries go to a
small short-lived cache. This solution is easier to implement (less
invasive) and seems more natural to me since it ties these lifetimes
together.
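
A rough sketch of that routing idea, written against Cayenne's QueryCache SPI
(a real version would implement org.apache.cayenne.cache.QueryCache and
delegate its full method set the same way; only the core operations are shown):

import java.util.List;

import org.apache.cayenne.cache.QueryCache;
import org.apache.cayenne.query.QueryCacheStrategy;
import org.apache.cayenne.query.QueryMetadata;

public class CombinedCache {

    private final QueryCache shared; // long-lived, e.g. EhCache-backed
    private final QueryCache local;  // small and short-lived

    public CombinedCache(QueryCache shared, QueryCache local) {
        this.shared = shared;
        this.local = local;
    }

    // route by the query's declared cache strategy
    private QueryCache delegate(QueryMetadata md) {
        QueryCacheStrategy strategy = md.getCacheStrategy();
        boolean isLocal = strategy == QueryCacheStrategy.LOCAL_CACHE
                || strategy == QueryCacheStrategy.LOCAL_CACHE_REFRESH;
        return isLocal ? local : shared;
    }

    public List get(QueryMetadata metadata) {
        return delegate(metadata).get(metadata);
    }

    public void put(QueryMetadata metadata, List results) {
        delegate(metadata).put(metadata, results);
    }

    public void removeGroup(String groupKey) {
        // a cache group may have entries in both partitions
        shared.removeGroup(groupKey);
        local.removeGroup(groupKey);
    }

    public void clear() {
        shared.clear();
        local.clear();
    }
}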

These two solutions are not mutually exclusive - they both could be done.


What Now?

I've taken a first stab at implementing both of these solutions and have
had some concerns raised about each of them [1] [2]. I'd like to implement
something to fix this problem directly in Cayenne rather than fixing it
only for myself. I'd love to hear any feedback or suggestions on this
before I go further down what might be the wrong road.

Thanks,
John

[1] https://github.com/apache/cayenne/pull/292 (My local copy of this code
now differs fairly significantly from what is currently in the pull request)
[2] https://github.com/apache/cayenne/pull/388 (I would change this code to
make the separate caches live inside a single CombinedCache now)



On Sat, Nov 30, 2019 at 2:28 AM Andrus Adamchik <an...@objectstyle.org>
wrote:

> Hi,
>
> Wanted to mention a scenario where cache misconfiguration can lead to a
> hard-to-detect memory leak. More details are available in this Jira [1],
> but the short story is that when you are using JCache/EhCache, make sure
> that *all* the cache groups used in queries are *explicitly* configured
> with desired size limits in ehcache.xml [2], including an entry for a
> no-cache group scenario that corresponds to "cayenne.default.cache" cache
> name.
>
> Andrus
>
> [1] https://issues.apache.org/jira/browse/CAY-2642
>
> [2] <config
>         xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
>         xmlns='http://www.ehcache.org/v3'
>         xsi:schemaLocation="
>         http://www.ehcache.org/v3
> http://www.ehcache.org/schema/ehcache-core-3.0.xsd">
>
>     <!-- Used by Cayenne for queries without explicit cache groups. -->
>     <cache alias="cayenne.default.cache">
>         <expiry>
>             <ttl unit="minutes">1</ttl>
>         </expiry>
>         <heap>100</heap>
>     </cache>
>
>     <cache alias="my-cache-group">
>         <expiry>
>             <ttl unit="minutes">1</ttl>
>         </expiry>
>         <heap>10</heap>
>     </cache>
> </config>
>
>