Posted to user@shiro.apache.org by matan_a <ma...@ematan.com> on 2011/10/02 05:43:06 UTC

Cache called too many times per request

Hi All,

I've got Shiro running in native mode in a web app.  I think I've ironed out
all the kinks, but while implementing my own Cache / CacheManager, I noticed
some things.

I was looking at the EhCacheManager/EhCache to see how often "get" and "put"
were called in a single request.  In my case, the number ranges from
30-200 times in a single request.  That seems extraordinarily high (I would
have thought once per request).  I was planning a cache backend in memcached
or MongoDB, but if this is standard behavior, the overhead would be too
much.

Can I get a sanity check on this to make sure I don't have something
misconfigured? :)



--
View this message in context: http://shiro-user.582556.n2.nabble.com/Cache-called-too-many-times-per-request-tp6851915p6851915.html
Sent from the Shiro User mailing list archive at Nabble.com.

Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
Just an update that there are some timing issues with this regarding multiple
threads creating different sessions that I haven't had the time to iron out.

In the meantime, a temporary solution that I'm starting to find more and
more appealing is to revert Shiro to non-native, webserver-based session
management and use Jetty session clustering (we're using embedded Jetty).
So far it's been rock solid and pretty efficient.

If someone else irons this out, let me know.  I'll try to spend some time on
this when I have some breathing room.


Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
Sure.  Let me get this organized and I'll open a Jira ticket for it.

On Thu, Oct 6, 2011 at 9:26 AM, Les Hazlewood-2 [via Shiro User] <
ml-node+s582556n6866484h41@n2.nabble.com> wrote:

> Awesome - thanks for sharing!
>
> Is there any chance you could contribute the relevant code as a patch
> to a Shiro Jira issue?  This would be the easiest way to 1) keep track
> of adding this as a feature and 2) accepting a code contribution.

Re: Cache called too many times per request

Posted by Les Hazlewood <lh...@apache.org>.
Awesome - thanks for sharing!

Is there any chance you could contribute the relevant code as a patch
to a Shiro Jira issue?  This would be the easiest way to 1) keep track
of adding this as a feature and 2) accepting a code contribution.

Regards,

-- 
Les Hazlewood
CTO, Katasoft | http://www.katasoft.com | 888.391.5282
twitter: @lhazlewood | http://twitter.com/lhazlewood
katasoft blog: http://www.katasoft.com/blogs/lhazlewood
personal blog: http://leshazlewood.com


On Thu, Oct 6, 2011 at 12:21 AM, matan_a <ma...@ematan.com> wrote:
> I got a version of this working and i thought i'd share what i have so far:

Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
I got a version of this working and I thought I'd share what I have so far:

http://www.box.net/shared/rgjvtrxvn66jv8a1478u

Basically, it's another DAO implementation (set your sessionManager to use
this version).  The RequestCacheSessionDAO caches the session in the
ThreadContext of the current request, which is emptied out with everything
else at the end of the request.  It greatly reduces the number of times the
session is retrieved from the cache/backend storage.

RequestCacheSessionDAO requires an implementation of
RequestCacheSessionStore which is the actual class that would do the
serialization and backend storage IO.

I've included a MongoDB implementation of the Store and two
CacheSerializers.  JavaCacheSerializer is a plain vanilla Java Serialization
implementation.  KryoCacheSerializer uses Kryo (see
http://code.google.com/p/kryo/), which is quite a bit faster, but requires
you to register any special classes you dump in the session with the
KryoCacheSerializer.
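For readers without the archive, here is a minimal sketch of what a plain
Java Serialization serializer could look like.  The SessionSerializer and
JdkSessionSerializer names are hypothetical stand-ins, not the classes in the
shared code, and only JDK classes are used.  A Kryo-based variant would have
the same shape but would also need the class registration mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

class SerializerDemo {
    public static void main(String[] args) {
        HashMap<String, String> attributes = new HashMap<>();
        attributes.put("principal", "matan_a");
        JdkSessionSerializer serializer = new JdkSessionSerializer();
        byte[] bytes = serializer.serialize(attributes);
        System.out.println(bytes.length + " bytes, restored: " + serializer.deserialize(bytes));
    }
}

// Hypothetical interface; the CacheSerializer in the shared archive may differ.
interface SessionSerializer {
    byte[] serialize(Serializable value);
    Object deserialize(byte[] bytes);
}

// Plain JDK serialization: no class registration needed, works for any Serializable.
class JdkSessionSerializer implements SessionSerializer {
    @Override
    public byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the ObjectOutputStream flushes it before the bytes are read.
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    @Override
    public Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Session class not available on this node", e);
        }
    }
}
```

The byte array is what a store implementation would write to the backend;
anything placed in the session has to implement Serializable for this to work.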

I apologize for the messy code and the Spring references.  I just pulled it
out and removed any proprietary info.  Feel free to use/abuse/keep/delete as
you see fit.

Cheers.



Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
Actually, I started writing this and the ThreadContext has put()/get()
methods which should do the job.  That means I can piggyback on the request
thread-local lifecycle like the Subject.

I'll send an update when I get this done.


Re: Cache called too many times per request

Posted by Luke Biddell <lu...@gmail.com>.
I raised a feature request about something similar a while ago.

https://issues.apache.org/jira/browse/SHIRO-317

I haven't had a chance to do anything about it yet, so I shall watch this
with interest.

Thanks guys.


On 4 October 2011 05:57, matan_a <ma...@ematan.com> wrote:

> I took some time to look at the code and SessionDAO is definitely the place
> for this.  I'm planning to create a RequestCacheSessionDAO class that does
> not use the CacheManager (I'll keep that just for Authentication caching)
> and instead uses an implementation-specific backend storage for the session
> data (e.g. MongoDB).

Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
I took some time to look at the code and SessionDAO is definitely the place
for this.  I'm planning to create a RequestCacheSessionDAO class that does
not use the CacheManager (I'll keep that just for Authentication caching)
and instead uses an implementation-specific backend storage for the session
data (e.g. MongoDB).

The process will be:

1. Check local cache
2. Check data store, then put in local cache if found.

Now the only part where I need some guidance is how I can safely keep a
local cache.

My own ThreadLocal is dangerous unless I know when to clear it before
the thread is returned to the pool (end of the request).  I didn't find a
spot for that.

I was looking at the ThreadContext, but that is pretty closed and doesn't
offer a way to store arbitrary data there.  It would be perfect though, since
subject.execute(...) would clear it.

Any pointers on this one?  It seems to be my only obstacle at the moment...




Re: Cache called too many times per request

Posted by Les Hazlewood <lh...@apache.org>.
I would expect using Shiro's ThreadState mechanism would be best for
this - it is guaranteed to be cleared out at the end of each request
by the ShiroFilter.

The sequence goes like this:

AbstractShiroFilter calls subject.execute

DelegatingSubject's execute method creates a SubjectCallable, which
uses an internal ThreadState object to retain thread state during the
request.  When the execute method completes (and consequently, the
request completes), the SubjectCallable ensures the ThreadState is
cleared to keep the thread 'clean' in a thread-pooled environment.
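
The clean-up guarantee described above can be illustrated with a minimal
sketch.  RequestScope is a hypothetical stand-in written with plain JDK
classes; it is not Shiro's ThreadContext, ThreadState or SubjectCallable, and
it only mirrors the pattern: bind state to the thread, run the work, and clear
the state in a finally block so a pooled thread never carries data into the
next request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class ThreadStateDemo {
    public static void main(String[] args) {
        String seen = RequestScope.execute(() -> {
            RequestScope.put("session", "abc123");
            return (String) RequestScope.get("session");
        });
        System.out.println("during request: " + seen);
        System.out.println("after request: " + RequestScope.get("session"));
    }
}

// Hypothetical stand-in for thread-bound request state (assumption, not Shiro API).
final class RequestScope {
    private static final ThreadLocal<Map<Object, Object>> STATE =
            ThreadLocal.withInitial(HashMap::new);

    private RequestScope() {
    }

    static void put(Object key, Object value) {
        STATE.get().put(key, value);
    }

    static Object get(Object key) {
        return STATE.get().get(key);
    }

    // Runs the request work and always clears the thread's state afterwards,
    // whether the work returns normally or throws.
    static <T> T execute(Supplier<T> work) {
        try {
            return work.get();
        } finally {
            STATE.remove();
        }
    }
}
```

Anything stored through put() during execute() is gone once execute()
returns, which is the property a request-scoped session cache relies on.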

If the ThreadState can be used to retain a session, that could work.

However, the same concept can be used to create a thread-local-based
cache in the SessionDAO.  So it could work like this:
1. check the thread-local cache
2. check the normal cache
3. check the underlying data store

The latter approach is probably a little easier to implement since it
does not rely on too much of Shiro's Subject internals.
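
The three-step lookup above can be sketched as follows.  This is only an
illustration under stated assumptions: TieredSessionReader is a hypothetical
class, sessions are plain Strings, the "normal cache" is a ConcurrentHashMap,
and the data store is any Function.  A real SessionDAO would work with
Shiro's Session type and would call endRequest() from whatever hook clears
thread state at the end of the request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class TieredLookupDemo {
    public static void main(String[] args) {
        int[] storeReads = {0};
        TieredSessionReader reader = new TieredSessionReader(id -> {
            storeReads[0]++;
            return "session-" + id;
        });
        for (int i = 0; i < 25; i++) {
            reader.read("abc");
        }
        reader.endRequest();
        System.out.println("data store reads: " + storeReads[0]);
    }
}

// Hypothetical class, not part of Shiro; sessions are plain Strings here.
class TieredSessionReader {
    private final ThreadLocal<Map<String, String>> requestCache =
            ThreadLocal.withInitial(HashMap::new);
    private final Map<String, String> sharedCache = new ConcurrentHashMap<>();
    private final Function<String, String> dataStore;

    TieredSessionReader(Function<String, String> dataStore) {
        this.dataStore = dataStore;
    }

    String read(String sessionId) {
        Map<String, String> local = requestCache.get();
        String session = local.get(sessionId);            // 1. thread-local cache
        if (session == null) {
            session = sharedCache.get(sessionId);         // 2. normal cache
            if (session == null) {
                session = dataStore.apply(sessionId);     // 3. underlying data store
                if (session != null) {
                    sharedCache.put(sessionId, session);
                }
            }
            if (session != null) {
                local.put(sessionId, session);
            }
        }
        return session;
    }

    // Must run at the end of every request so pooled threads start clean.
    void endRequest() {
        requestCache.remove();
    }
}
```

With this shape, repeated reads inside one request touch the shared cache
and the data store at most once per session id.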

HTH,

Les

On Mon, Oct 3, 2011 at 4:43 PM, matan_a <ma...@ematan.com> wrote:
> Thanks Les for your quick reply!

Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
Thanks Les.  I'll try to implement this in the next few days.  Once I get it
working, I can post my code in case anyone else is interested in this type
of cache behavior.

On Mon, Oct 3, 2011 at 10:25 AM, dan [via Shiro User] <
ml-node+s582556n6855818h95@n2.nabble.com> wrote:

> I have seen this scenario of many gets from the cache.  In our case, our
> cache manager was Hazelcast, which means that with every get there is, at
> the very least, a deserialization that occurs.  I concur with Jiggy that one
> get at the beginning of a web request should be sufficient.  The approach
> that this "caching" is in the realm of a caching manager is certainly
> reasonable, but this seems to be a different sort of caching - call it
> local, request-specific caching.

Re: Cache called too many times per request

Posted by dan <da...@bamlabs.com>.
I have seen this scenario of many gets from the cache.  In our case, our
cache manager was Hazelcast, which means that with every get there is, at
the very least, a deserialization that occurs.  I concur with Jiggy that one
get at the beginning of a web request should be sufficient.  The view
that this "caching" belongs in the realm of a cache manager is certainly
reasonable, but this seems to be a different sort of caching - call it
local, request-specific caching.

Best,
Dan


Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
Thanks Les for your quick reply!

I do agree with your definition of what a cache should be doing.

I think that I'm looking at the cache manager from a session clustering
perspective.  Caching elements and efficient session clustering might not
have perfectly overlapping requirements.

When I look at the CacheManager as pure caching storage, it does what it
does perfectly fine and you're absolutely right.

When I look at it as a session clustering solution, I'm still trying to
decide.  Session clustering is pretty tightly coupled with the request
lifecycle: querying the cluster if a session id doesn't exist locally,
caching it there, and possibly writing it back to the cluster when changes
are made.  Those actions should only be done when necessary, and a local
copy should be used when they're not.  It's almost like a database: you
don't run the same SQL statements over and over again in the same request
when you can just do it once and save the value.  I think the Jetty session
clustering solution is a good implementation of this.

I guess those are implementation details and I could definitely add something
like this.  I just have one disconnect that I'm sure you can help me with :)

What do you suggest is the best way to keep track of request start and end
when dealing with the CacheManager?  Ideally, I could wrap the CacheManager
with a DelayedRequestAware version that only reads from the remote storage on
the first get() in a request, and only writes to remote storage at the end of
the request (if there were any put() commands queued).  It would save the
local variables in the request context (thread?).
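
To make the idea concrete, here is a minimal sketch of such a delayed,
request-aware wrapper.  All names are hypothetical (RemoteStore stands in for
memcached, MongoDB or similar, and RequestBufferedCache for the
"DelayedRequestAware" wrapper), and it assumes something creates one instance
at request start and calls flush() at request end, which is exactly the hook
being asked about.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DelayedCacheDemo {
    public static void main(String[] args) {
        RemoteStore remote = new RemoteStore();
        remote.data.put("session-1", "v0");

        RequestBufferedCache cache = new RequestBufferedCache(remote);
        for (int i = 0; i < 30; i++) {
            cache.get("session-1");
        }
        cache.put("session-1", "v1");
        cache.put("session-1", "v2");
        cache.flush();
        System.out.println("reads=" + remote.reads + " writes=" + remote.writes);
    }
}

// Stand-in for a remote store that counts round trips (assumption, not a real client).
class RemoteStore {
    final Map<String, String> data = new HashMap<>();
    int reads = 0;
    int writes = 0;

    String read(String key) {
        reads++;
        return data.get(key);
    }

    void write(String key, String value) {
        writes++;
        data.put(key, value);
    }
}

// One instance per request: reads go remote once per key, writes are queued until flush().
class RequestBufferedCache {
    private final RemoteStore remote;
    private final Map<String, String> local = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    RequestBufferedCache(RemoteStore remote) {
        this.remote = remote;
    }

    String get(String key) {
        if (!local.containsKey(key)) {
            // First get() for this key in the request; a miss (null) is remembered too.
            local.put(key, remote.read(key));
        }
        return local.get(key);
    }

    void put(String key, String value) {
        local.put(key, value);
        dirty.add(key);
    }

    // Call once at the end of the request; writes each changed key a single time.
    void flush() {
        for (String key : dirty) {
            remote.write(key, local.get(key));
        }
        dirty.clear();
    }
}
```

The trade-off is that other nodes do not see a change until flush() runs, so
this only fits if a request can be treated as one unit of work.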

Any suggestions would be great regarding this.  I'd have to do quite a bit
of code hunting otherwise.

Thanks again for supporting a great product!




Re: Cache called too many times per request

Posted by Les Hazlewood <lh...@apache.org>.
This is by design - a cache should cache elements as necessary so as
to not saturate the network.  This is well within a cache's realm of
operation and responsibility.

> If you keep session work to a minimum (only principal info), this shouldn't
> really happen in reality but still the # of calls will flood any distributed
> cache/storage.

All distributed caches I've ever used will (or can) perform local
in-memory optimizations as necessary to prevent hits to the remote
nodes when possible.  If for some reason your distributed cache does
not do this, you can easily create a Shiro CacheManager that 'wraps'
your networked Cache mechanism that uses SoftHashMaps as a local
in-memory cache as an optimization.
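
A minimal sketch of that wrapping idea, under stated assumptions: SimpleCache
is a hypothetical three-method interface rather than Shiro's Cache, the
networked cache is simulated by CountingRemoteCache, and a size-bounded
LinkedHashMap in access order stands in for a SoftHashMap so the example
needs only the JDK.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class LocalFrontDemo {
    public static void main(String[] args) {
        CountingRemoteCache<String, String> remote = new CountingRemoteCache<>();
        remote.put("session-1", "serialized session");
        SimpleCache<String, String> cache = new LocalFrontCache<>(remote, 1000);
        for (int i = 0; i < 30; i++) {
            cache.get("session-1");
        }
        System.out.println("remote gets: " + remote.remoteGets);
    }
}

// Hypothetical minimal cache contract (assumption; Shiro's Cache has more methods).
interface SimpleCache<K, V> {
    V get(K key);
    V put(K key, V value);
    V remove(K key);
}

// Simulated networked cache that counts how often it is asked for a value.
class CountingRemoteCache<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> data = new HashMap<>();
    int remoteGets = 0;

    @Override
    public V get(K key) {
        remoteGets++;
        return data.get(key);
    }

    @Override
    public V put(K key, V value) {
        return data.put(key, value);
    }

    @Override
    public V remove(K key) {
        return data.remove(key);
    }
}

// Answers from local memory when it can and falls through to the remote cache otherwise.
class LocalFrontCache<K, V> implements SimpleCache<K, V> {
    private final SimpleCache<K, V> remote;
    private final Map<K, V> local;

    LocalFrontCache(SimpleCache<K, V> remote, final int maxLocalEntries) {
        this.remote = remote;
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.local = Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxLocalEntries;
            }
        });
    }

    @Override
    public V get(K key) {
        V value = local.get(key);
        if (value == null) {
            value = remote.get(key);
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    @Override
    public V put(K key, V value) {
        local.put(key, value);
        return remote.put(key, value);
    }

    @Override
    public V remove(K key) {
        local.remove(key);
        return remote.remove(key);
    }
}
```

Note that a local copy can go stale if another node changes the remote entry,
so a real deployment would need expiry or invalidation on top of this.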

An additional reason this was by design is write operations -
during a thread's execution, when should a Shiro write operation 'hit'
the underlying data store?  Typically DAOs and Cache mechanisms are
very intelligent in knowing when to do this (e.g. synchronizing w/ a
transaction mechanism) - far better than Shiro could (without adding
tremendous complexity).

However, we're always open to suggestions!  If you feel that
additional work should be done in this area, please open a Jira issue
and open a discussion on the dev list and we can take it from there.

Cheers,

-- 
Les Hazlewood

Re: Cache called too many times per request

Posted by matan_a <ma...@ematan.com>.
I'd like to refine this a bit more.  The upper end of the numbers I talked
about (200-300 cache hits) is for a page, not a request - but it's still very
high.

For reference, hitting just a static CSS file which is under the Shiro
filter hits the cache around 25 times.  When you add up all the page
elements, it's up there.

I'm still trying to figure out if this is "by design".

Ideally, you'd want to retrieve it from the cache at the start of a request
and keep a local copy until the request ends.  At that point, you can update
the cache item if relevant.  

I'd consider a request an atomic action for consistency's sake.  If it really
does get/put multiple times during a request, and if the session is
clustered, there is a possibility that it will modify the session state of
another request on a different machine while that request is still being
processed, which might render it inconsistent.

If you keep session work to a minimum (only principal info), this shouldn't
really happen in reality but still the # of calls will flood any distributed
cache/storage.


