You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@wicket.apache.org by Thomas Heigl <th...@umschalt.com> on 2020/03/21 10:44:16 UTC

Re: PerSessionPageStore thread-safety

Hi all,

In case anyone else is running into this:

I have successfully implemented a `PageAccessSynchronizer` for Spring
Session that keeps locks in application scope. We have been running it in
production for a couple of days now and have served millions of requests
without any issues.

Spring Session with Redis adds about 3-5ms to each request for loading and
persisting the session. Together with a `PerSessionPageStore` that keeps
the last used page on each machine and avoids (de)serialization of the page
on every request this is more than acceptable in our case.

Best,

Thomas

On Fri, Feb 28, 2020 at 1:27 PM Thomas Heigl <th...@umschalt.com> wrote:

> Hi Martin,
>
> I created https://github.com/apache/wicket/pull/411 with my suggested
> changes. Feedback would be greatly appreciated ;)
>
> Best regards,
>
> Thomas
>
> On Fri, Feb 28, 2020 at 10:14 AM Martin Grigorov <mg...@apache.org>
> wrote:
>
>> On Thu, Feb 27, 2020 at 3:58 PM Thomas Heigl <th...@umschalt.com> wrote:
>>
>> > Hi Martin,
>> >
>> > Could you please explain how exactly Spring Session works ? This will
>> help
>> > > us both understand where the problem comes from.
>> > >
>> >
>> > Sure.  From the docs
>> > <https://docs.spring.io/spring-session/docs/current/reference/html5/>:
>> >
>> > Spring Session provides an API and implementations for managing a user’s
>> > > session information while also making it trivial to support clustered
>> > > sessions without being tied to an application container-specific
>> > solution.
>> > > It allows replacing the HttpSession in an application
>> container-neutral
>> > > way, with support for providing session IDs in headers to work with
>> > RESTful
>> > > APIs.
>> >
>> >
>> > Spring Session basically is a servlet filter
>> > <
>> >
>> https://docs.spring.io/spring-session/docs/current/reference/html5/#httpsession-how
>> > >
>> > that looks up and persists sessions in an external store.
>> > It does so, by wrapping the servlet request and providing a custom
>> > implementation of HttpSession that keeps
>> > track of changes and persists them at the end of the request. It is
>> similar
>> > to what container-based replication
>> > does but is *independent* of the container. That's why I prefer to use
>> it
>> > instead of adding a custom session manager to Tomcat.
>> > It is much easier to customize Spring Session behavior (e.g. how cookies
>> > are stored, etc) because it is part
>> > of your application and not your container.
>> >
>> > Also how exactly do you integrate it with Wicket ? Which extension
>> > > points do you use ?
>> >
>> >
>> > There are *no* integration points with Wicket. Replacing the HttpSession
>> > with Spring Session is completely transparent.
>> > Wicket does not know anything about Spring session. The only difference
>> is
>> > that the session is not held in the containers memory,
>> > but serialized to and from an external source. I have done extensive
>> > testing and except for access synchronization everything
>> > else works as expected.
>> >
>> > Since it is backed by Redis it means that it is distributed.
>> > >
>> > But at the same time you try to use PerSessionPageStore which is
>> in-memory
>> > > store, i.e. it lives only in the memory of one of the web container
>> > nodes.
>> > > Another node has its own instance of this class.
>> >
>> >
>> > I want to use PerSessionPageStore as a local cache on each node together
>> > with sticky sessions to avoid
>> > having to go through the whole process of Redis + Deserialization for
>> every
>> > single ajax request.
>> > In a non-clustered environment or an environment where clustering is
>> done
>> > by the servlet container, Wicket
>> > can make use of the session cache that stores the last touched pages in
>> the
>> > session. In case of
>> > Spring Session, the session cache cannot be used. Sessions are
>> > (de)serialized on every request and the session
>> > cache in Wicket is transient.
>> >
>> > So strictly speaking, I don't need the PerSessionPageStore but since I
>> > cannot use the session cache, ajax requests
>> > become quite expensive. I'm using it as an application-scoped
>> alternative
>> > to the session cache.
>> >
>> > Here is how it works in normal Servler environment:
>> > > .....
>> >
>> >
>> > I completely agree on all your descriptions of session management in a
>> > (clustered) container environment. We have
>> > a shared understanding on container-based session management.
>> >
>> > I don't see how application-scope would help at all. As I explained
>> > > earlier PageAccessSynchronizer is per o.a.w.Session instance.
>> > > Maing it application-scoped will only add more concurrency issues and
>> you
>> > > will have to add locks on session level. At the moment it is
>> lock-free.
>> >
>> >
>> > There will be more concurrency, but that is not really an issue:
>> >
>> > The quick replacement for the PageAccessSynchronizer I wrote this
>> morning
>> > uses the following data structure:
>> >
>> > private static final ConcurrentMap<PageKey, PageLock> LOCKS = new
>> > > ConcurrentHashMap<>();
>> >
>> >
>> > I made it static instead of moving it to the application, because it was
>> > faster to implement. The only difference to
>> > the default synchronizer is the key used for the map. Instead of an
>> integer
>> > for the pageId, I'm using a composite
>> > key of sessionId and pageId.
>> >
>> > @Value
>> > > static final class PageKey {
>> > > String sessionId;
>> > > Integer pageId;
>> > > }
>> >
>> >
>> > All locking/synchronization is done on the PageLock object that is
>> scoped
>> > to a single session by the composite key.
>> > There will be more reads/writes to the concurrent map, but no additional
>> > locking is necessary and unless there are
>> > thousands of requests per second this should hardly be an issue.
>> >
>> > Let us first hear back how this will solve the issue and then we will
>> > > discuss such change.
>> >
>> >
>> > To summarize:
>> >
>> > I'm using Spring Session that is a simple, container-neutral,
>> transparent
>> > solution for clustering sessions. Wicket
>> > works fine with Spring Session. Except for one minor issue:
>> > PageAccessSynchronizer does not work because
>> > sessions are not held in memory but (de)serialized on every request.
>> Thus
>> > the internal lock map is not shared
>> > between requests. Moving the lock map into global or application scope
>> > solves the issue with the downside of
>> > increased concurrency to the, now shared, map but should not require any
>> > additional locking or synchronization.
>> > It is quite trivial to write such a global synchronizer, but it could be
>> > much easier if some additional methods
>> > in the default access manager were public and lock handling would be
>> > encapsulated in an interface.
>> >
>> > I hope I managed to explain myself. The topic is quite complex ;)
>> >
>>
>> Thank you for this comprehensive explanation, Thomas! :-)
>> Please create a Pull Request with your suggested changes!
>>
>> Martin
>>
>>
>> >
>> > Best,
>> >
>> > Thomas
>> >
>> >
>> >
>> > On Thu, Feb 27, 2020 at 1:47 PM Martin Grigorov <mg...@apache.org>
>> > wrote:
>> >
>> > > Hi Thomas,
>> > >
>> > > Could you please explain how exactly Spring Session works ? This will
>> > help
>> > > us both understand where the problem comes from.
>> > > Also how exactly do you integrate it with Wicket ? Which extension
>> points
>> > > do you use ?
>> > >
>> > > Since it is backed by Redis it means that it is distributed.
>> > > But at the same time you try to use PerSessionPageStore which is
>> > in-memory
>> > > store, i.e. it lives only in the memory of one of the web container
>> > nodes.
>> > > Another node has its own instance of this class.
>> > >
>> > > Here is how it works in normal Servler environment:
>> > >
>> > > The HttpSession instances are managed by the web container, e.g.
>> Tomcat.
>> > > The container keeps them in memory! The container may persist them for
>> > > failover/restart but usually the HttpSessions are kept in memory.
>> > >
>> > > 1) if there is just one web container node then Wicket asks the
>> container
>> > > for the javax.servler.HttpSession: httpServletRequest.getSession()
>> > > Then Wicket extracts the org.apache.wicket.Session from the
>> HttpSession
>> > > attributes.
>> > > The Wicket Session class has a member instance of
>> PageAccessSynchronizer
>> > > that is specific for the current instance of the Session.
>> > > PageAccessSynchronizer has a Map<Integer, PageLock>, i.e. pageId ->
>> > > PageLock.
>> > > Whenever a request needs to use a specific Wicket Page then the
>> current
>> > > session's PageAccessSynchronizer is used to acquire a lock on it. This
>> > > makes sure that a specific page instance is used by at most one
>> request
>> > in
>> > > the *current* server node.
>> > >
>> > > 2) if there is session replication in place, i.e. more than one server
>> > > nodes, then:
>> > > The web container fetches the HttpSessions for its backend. In case of
>> > > Tomcat - it keeps the HttpSessions in memory but there is a
>> > ClusterManager
>> > > that replicates the HttpSessions' which have been modified, i.e. their
>> > > #setAttibute() has been called.
>> > > In cluster mode there will be one HttpSession per server node,
>> > respectfully
>> > > one Wicket Session instance per node, and one PageAccessSynchronizer
>> per
>> > > session per node.
>> > > Using Wicket or not if you don't use sticky sessions you may face
>> timing
>> > > related issues in such environment. Every distributed software has
>> this
>> > > problem.
>> > >
>> > >
>> > > On Wed, Feb 26, 2020 at 9:17 PM Thomas Heigl <th...@umschalt.com>
>> > wrote:
>> > >
>> > > > Hi Sven,
>> > > >
>> > > > Thanks for the link but I'm not using asynchronous serialization.
>> > > >
>> > > > I thought some more about the issue and I think I figured it out. My
>> > > setup
>> > > > looks like this:
>> > > >
>> > > > 1. Spring Session Redis
>> > > > 2. [Session Cache] (Not used because it is transient and stored with
>> > > > writeObject/readObject and does not get serialized into Redis as we
>> do
>> > > not
>> > > > use Java serialization)
>> > > > 3. PerSessionPageStore (Application-level cache held in memory)
>> > > > 4. RedisDataStore (Synchronous)
>> > > >
>> > > > Observations:
>> > > >
>> > > > 1. If i disable second-level cache or use the serializing
>> second-level
>> > > > cache provided by the DefaultPageManager, there are no issues
>> > > > 2. As soon as I enable the PerSessionPageStore we run into
>> concurrency
>> > > > issues
>> > > >
>> > > > Analysis:
>> > > >
>> > > > I first thought that there were some thread-safety issues
>> > > > with PerSessionPageStore but that is not the case because even a
>> fully
>> > > > synchronized version shows these problems.
>> > > >
>> > > > The reason why disabling the 2nd-level cache or using a serializing
>> > > variant
>> > > > works, is because they do not operate on the same *instance* of the
>> > page.
>> > > > Each thread gets their
>> > > > own instance because the page is deserialized before being accessed.
>> > > >
>> > > > PerSessionPageStore stores the page in memory without serializing
>> it,
>> > > thus
>> > > > all threads share the same instance. This is also the case when
>> using
>> > the
>> > > > session cache or
>> > > > the session-based stores, but the PageAccessSynchronizer bound to
>> the
>> > > > session takes care of ensuring that only single request can
>> manipulate
>> > > the
>> > > > page at any given time.
>> > > >
>> > > > So how does the synchronizer work? It keeps a Map<Integer,
>> PageLock> in
>> > > the
>> > > > session and checks whether the page is locked on every request. In a
>> > > > non-replicated
>> > > > environment this works as expected as the session object lives in
>> the
>> > > > servlet container and is the same for each concurrent request. In my
>> > > case,
>> > > > the session
>> > > > is not provided by the servlet container, but fetched from Redis by
>> > > Spring
>> > > > Session on every request. So each concurrent thread has *their own
>> > > version*
>> > > > of the session and
>> > > > the locks are *not shared between threads* because the session is
>> only
>> > > > saved back to Redis after the request has finished.
>> > > >
>> > > > So the problematic flow looks like this
>> > > >
>> > > > 1. A request comes in, we fetch the session from Redis, the request
>> > > > acquires the session-scoped lock and starts processing
>> > > > 2. Before the request is finished, another request comes in, fetches
>> > the
>> > > > session from Redis, the lock map is empty because the state of
>> request
>> > #1
>> > > > has not been persisted to Redis
>> > > > 3. Now both requests can modify the page and we run into concurrency
>> > > issues
>> > > >
>> > > > Summary:
>> > > >
>> > > > PageAccessSynchronizer does not work with Spring Session or other
>> > > solutions
>> > > > that replace the servlet container session.
>> > > >
>> > > > Possible solutions:
>> > > >
>> > > > 1. We could ensure that session locks are updated in Redis
>> immediately
>> > > but
>> > > > that still leaves a couple of milliseconds for race conditions and
>> > adds a
>> > > > lot of overhead
>> > > >
>> > >
>> > > This is called a distributed lock.
>> > > It will reduce the problems you face but will make things slower too.
>> > >
>> > >
>> > > > 2. We could use an application-scoped PageAccessSynchronizer. This
>> > solves
>> > > > the problem as long as sessions are sticky and all concurrent
>> requests
>> > > are
>> > > > sent to the same server.
>> > > >
>> > >
>> > > I don't see how application-scope would help at all. As I explained
>> > > earlier PageAccessSynchronizer is per o.a.w.Session instance.
>> > > Maing it application-scoped will only add more concurrency issues and
>> you
>> > > will have to add locks on session level. At the moment it is
>> lock-free.
>> > >
>> > >
>> > > > 3. If we want to use non-sticky session we could use Redis locks for
>> > > > implementing a global PageAccessSynchronizer
>> > > >
>> > >
>> > > Sticky sessions is usually good enough for most of the cases.
>> > > Distributed locks will make your application much slower.
>> > > If you use single instance of Redis you will have single point of
>> > failure.
>> > > If you use Redis replication then the distributed locking will become
>> > even
>> > > more complex and slow.
>> > >
>> > >
>> > > >
>> > > > I would like to go with solution #2 for now. The problem is
>> > > > that PageAccessSynchronizer is not an interface.
>> > > >
>> > > > Would it be possible to extract an interface so I can easily
>> implement
>> > > > access synchronizers with different scopes?
>> > > >
>> > >
>> > > Let us first hear back how this will solve the issue and then we will
>> > > discuss such change.
>> > >
>> > >
>> > > >
>> > > > Best regards,
>> > > >
>> > > > Thomas
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Feb 26, 2020 at 12:24 PM Sven Meier <sv...@meiers.net>
>> wrote:
>> > > >
>> > > > > Hi Thomas,
>> > > > >
>> > > > > Im wondering whether you're running into
>> > > > > https://issues.apache.org/jira/browse/WICKET-6702
>> > > > >
>> > > > > I've been working on a solution to that problem, which is caused
>> by
>> > > pages
>> > > > > being asynchronously serialized while another request is already
>> > coming
>> > > > in.
>> > > > >
>> > > > > Or maybe it is something different.
>> > > > > Could you create a quickstart?
>> > > > >
>> > > > > Sven
>> > > > >
>> > > > > Am 25. Februar 2020 22:12:46 MEZ schrieb Thomas Heigl <
>> > > > thomas@umschalt.com
>> > > > > >:
>> > > > > >Hi again,
>> > > > > >
>> > > > > >I investigated a bit and it does not seem to have anything to do
>> > with
>> > > > > >the
>> > > > > >PerSessionPageStore. I implemented a completely synchronized
>> version
>> > > of
>> > > > > >it
>> > > > > >and the problems still exist.
>> > > > > >
>> > > > > >If I switch to the default second-level cache that stores
>> serialized
>> > > > > >pages
>> > > > > >in application scope, everything works as expected. Only the
>> > > > > >non-serialized
>> > > > > >pages in PerSessionPageStore seem to be affected by concurrent
>> ajax
>> > > > > >modifications.
>> > > > > >
>> > > > > >What is the difference between keeping pages in the session and
>> > > keeping
>> > > > > >the
>> > > > > >same pages in the PerSessionPageStore? Is there some additional
>> > > locking
>> > > > > >done for pages in the session?
>> > > > > >
>> > > > > >Best,
>> > > > > >
>> > > > > >Thomas
>> > > > > >
>> > > > > >On Tue, Feb 25, 2020 at 8:25 PM Thomas Heigl <
>> thomas@umschalt.com>
>> > > > > >wrote:
>> > > > > >
>> > > > > >> Hi all,
>> > > > > >>
>> > > > > >> I'm currently experimenting with PerSessionPageStore as a
>> > > > > >second-level
>> > > > > >> cache. We are moving our page store from memory (i.e. session)
>> to
>> > > > > >Redis and
>> > > > > >> keeping 1-2 pages per session in memory speeds up ajax requests
>> > > quite
>> > > > > >a bit
>> > > > > >> because network roundtrips and (de)serialization can be skipped
>> > for
>> > > > > >cached
>> > > > > >> pages.
>> > > > > >>
>> > > > > >> Our application is very ajax heavy (it is basically a single
>> page
>> > > > > >> application with lots of lazy-loading). While rapidly clicking
>> > > around
>> > > > > >and
>> > > > > >> firing as many parallel ajax requests as possible, I noticed
>> that
>> > it
>> > > > > >is
>> > > > > >> quite easy to trigger exceptions that I have never seen before.
>> > > > > >> ConcurrentModificationExceptions during serialization,
>> > > > > >> MarkupNotFoundExceptions, exceptions about components already
>> > > > > >dequeuing etc.
>> > > > > >>
>> > > > > >> So I had a look at the implementation of PerSessionPageStore
>> and
>> > > > > >noticed
>> > > > > >> that is does not do any kind of synchronization and does not
>> use
>> > > > > >atomic
>> > > > > >> operations when updating the cache. It seems to me that the
>> > > > > >second-level
>> > > > > >> cache is not really usable in a concurrent ajax environment.
>> > > > > >>
>> > > > > >> I think that writing pages to the second level cache store
>> should
>> > > > > >either
>> > > > > >> synchronize on sessionId+pageId or attempt to use atomic
>> > operations
>> > > > > >> provided by ConcurrentHashMap.
>> > > > > >>
>> > > > > >> Did anyone else ever run into these issues? The code
>> > > > > >> of PerSessionPageStore is quite complex because of soft
>> > references,
>> > > > > >> skip-list maps etc. so I'm not sure what the right approach to
>> > > > > >address
>> > > > > >> these problems would be.
>> > > > > >>
>> > > > > >> Best regards,
>> > > > > >>
>> > > > > >> Thomas
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>>
>