Posted to dev@geronimo.apache.org by James Strachan <ja...@gmail.com> on 2006/07/12 15:16:53 UTC

Re: [wadi-dev] Session/clustering API and the web tier

On 7/12/06, Jules Gosnell <ju...@coredevelopers.net> wrote:
> Greg Wilkins wrote:
> > All,
> >
> >
> > Here are my comments on the Session API that were promised after ApacheCon Dublin.
> > This is also CC'd to the wadi list and some of the points are relevant to them
> > as well.
> >
> > My own reason for focusing on the Session API when I think about clustering,
> > is that I like the idea of pluggable clustering implementations.   Clustering
> > is not one size fits all and solutions can go from non-replicated nodes configured
> > in a flat file to auto discovered, self healing, redundant hierarchies.

Agreed. We should be focussed purely on what is the contract between a
container and the session API and making that contract as simple and
abstract as is possible while minimising leaky abstractions.


> > I think the previous discussions we had on this were making good progress, but
> > I think we ran out of steam before the API was improved etc.  So I think it
> > worthwhile to re-read the original threads.
> >
> > But I will repeat my main unresolved concerns here:
> > While I appreciate the keep-it-simple-stupid approach adopted by the proposed
> > session API, I remain concerned that it may over simplify and may also mix concerns.
> >
> > However, I do think that the API is pitched at about the right level - namely
> > that it is below the specific concerns of things such as HTTP.  As the implementor
> > of the web container, I would prefer to not delegate HttpSession management
> > or request proxying to a pluggable session implementation (I doubt that
> > a cluster impl wants to deal with non-blocking proxying of requests etc.)
>
> I think that our discussions about this have suffered from an ambiguity
> around the word 'delegate'...
>
> In one sense of the word, given WADI's current implementation, Jetty
> does delegate Session management and HTTP handling to WADI, in that WADI
> passes the WebApp/Jetty an object on which it calls a method and the
> work in question is done.
>
> However, in another sense, Jetty need not delegate this task, since the
> object returned in these cases is managed by WADI, but created by a
> Factory that is injected at startup time. This factory might be
> generating instances of a class that has very Jetty-specific knowledge
> or is even a part of the Jetty distro...

That's certainly one approach. Another is for the container to just ask
the policy API what to do (i.e. is the request going to be serviced
locally or not) so that the container can take care of the rest.

I understand the cleanliness from the session API implementor's
perspective of using a factory and calling back the container when you
see fit - however I also understand the container developer's
requirement to understand at all times what each thread is doing, to
tune things aggressively with full knowledge of threading models and
to generally be master of its own domain, so I can understand why a
container developer might prefer a non-callback related solution
(which could introduce all kinds of nasty thread related bugs into the
container). I don't see why both options can't be offered.


> I would wholeheartedly agree that the code for Http request relocation
> should be written by someone with expertise in that area - namely the
> container writer. I would just rather see it injected into the clustered
> manager, so that it can be called when required, without having to
> burden Jetty with the added task of making this decision itself.

I don't see that as mutually exclusive. Just have a way for Jetty to
ask the clustering solution if a request can be satisfied locally, if
not Jetty does the proxy/redirect thing.


> > I see that the webcontainer needs to interact with the cluster implementation
> > in 4 areas:
> >
> >
> > 1) Policy
> > ---------
> >
> > When a container receives a request, it needs to make a policy decision along
> > the lines of:
> >
> >     1) The request can be handled locally.
> >     2) The request can be handled locally, but only after some other actions
> >        (eg session is moved to local)
> >     3) request cannot be handled locally, but can be redirected to another node
> >     4) request cannot be handled locally, but can be proxied to another node.
> >
> > This kind of corresponds to the Locator and SessionLocation APIs.  However
> > these APIs give the power to enact a policy decision, but give no support to make
> > a policy decision.
> >
> > To implement a policy, you might want to use:  the size of the cluster, the total
> > number of sessions, the number of sessions on the local node, the number of sessions
> > collocated with a remote session, how many requests for the session have recently
> > arrived on what nodes, etc. etc.
> >
> > The API does not give me this information and I think it would be
> > difficult to provide all that might be used.  Potentially
> > we could get by with a mechanism to store/access cluster wide meta-data
> > attributes?
> >
> > However, it is very unlikely that one policy will fit all, so each consumer
> > of this Location API will have to implement a pluggable policy framework of
> > some sorts.
> >
> > But as the session API is already a pluggable framework, why don't we
> > just delegate the policy decision to the API.  The web container
> > should make the policy decision, but should call the session API to
> > make the decision.  Something like:
> >
> >   SessionLocation executeAt =  locator.getSessionExecutionLocation(clientId);
> >   if (executeAt.isLocal())
> >     // handle request
> >   else
> >     // proxy or redirect to executeAt location.
> >
> > (Note the need for something like this has been discussed before and
> > generally agreed.  I have seen the proposed RemoteSessionStrategy, but I am not
> > sure how you obtain a handle to one - nor do I think the policy should
> > decide between redirect and proxy - which is HTTP business).

Agreed. Just some way to ask the Session API if a request can be
processed locally might do the trick, then if not Jetty can do its
proxy/redirect thing. The trickier thing is what to pass into the
strategy to help it decide...
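
As a rough sketch of that "just ask" shape (not the proposed session
API): every type and method name below (LocalityPolicy, RequestContext,
ExecutionLocation, whereToExecute) is made up for illustration, and the
RequestContext fields are only a guess at what a container might pass
in. The policy answers "where"; the container keeps the proxy/redirect
decision.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; none of these types come from the proposed session API.

/** Container-side facts handed to the policy (an assumption about what might be useful). */
final class RequestContext {
    final String sessionId;
    final int recentLocalRequests; // requests for this session recently seen on this node

    RequestContext(String sessionId, int recentLocalRequests) {
        this.sessionId = sessionId;
        this.recentLocalRequests = recentLocalRequests;
    }
}

/** The answer given back to the container: which node should execute the request. */
final class ExecutionLocation {
    final String nodeName;
    final boolean local;

    ExecutionLocation(String nodeName, boolean local) {
        this.nodeName = nodeName;
        this.local = local;
    }
}

/** The single question the container asks the clustering implementation. */
interface LocalityPolicy {
    ExecutionLocation whereToExecute(RequestContext ctx);
}

/** Toy policy: execute wherever the session currently lives; unknown sessions are local. */
final class OwnerNodePolicy implements LocalityPolicy {
    private final String localNode;
    private final Map<String, String> sessionOwners = new ConcurrentHashMap<>();

    OwnerNodePolicy(String localNode) {
        this.localNode = localNode;
    }

    void recordOwner(String sessionId, String nodeName) {
        sessionOwners.put(sessionId, nodeName);
    }

    @Override
    public ExecutionLocation whereToExecute(RequestContext ctx) {
        // This toy policy ignores ctx.recentLocalRequests; a real one might weigh it.
        String owner = sessionOwners.getOrDefault(ctx.sessionId, localNode);
        return new ExecutionLocation(owner, owner.equals(localNode));
    }
}

final class LocalityPolicyDemo {
    /** Container side: the container, not the policy, picks between proxy and redirect. */
    static String dispatch(LocalityPolicy policy, RequestContext ctx) {
        ExecutionLocation where = policy.whereToExecute(ctx);
        return where.local ? "handle locally" : "proxy or redirect to " + where.nodeName;
    }

    public static void main(String[] args) {
        OwnerNodePolicy policy = new OwnerNodePolicy("nodeA");
        policy.recordOwner("s2", "nodeB");
        System.out.println(dispatch(policy, new RequestContext("s1", 3)));
        System.out.println(dispatch(policy, new RequestContext("s2", 0)));
    }
}
```

The container calls the policy on its own thread and gets a plain
answer back, so no callback into the container is needed.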


> By exposing the 'policy' api to the container and putting it in charge
> of when it is used, you are exposing clustering details to it.

Also the container details may be required by this policy. e.g.
details about the previous http requests received at the current node,
their type and various metadata statistics and so forth which only the
container is aware of.


> WADI's approach is to completely shield the container from having to
> know anything about clustering, whilst maintaining contracts with the
> container encapsulating various pieces of tier/domain-specific
> functionality that may be injected into the clustered session manager.

The issue is though, how invisible can clustering ever be? Information
from the container and from the clustering implementation will
typically be required for the policy decision.


> > 3) Life cycle
> >
> > Unfortunately the life and death of a session is not simple - especially
> > when cross context dispatch is considered.  Session IDs may or may not
> > be reused, their uniqueness might need to be guaranteed and the decision
> > may depend on the existence of the same session ID in other contexts.
> >
> > I think this can be modeled with a structured name space - so perhaps
> > this is not an issue anymore?
> >
> >
> > 4) Configuration and Management
> > It would generally be good to be able to know how many nodes are in
> > the cluster (or to set what the nodes are). To be able to monitor node
> > status and give commands to gracefully or brutally shutdown a node, move
> > sessions etc.
> >
> > Clustering aware clients (JNDI stubs, EJB proxies or potentially fancy
> > Ajax web clients) might need to be passed a list of known nodes - but it
> > is not possible to obtain/set that from the API - thus every impl will need
> > to implement its own cluster config/discovery even if that information is
> > available in other implementations.
> >
> >
>
> This is the clustering API (in my mind) that was mooted in the meeting.
> A number of clustering substrates (JGroups, ActiveCluster, Tribes,
> etc...) have homesteaded this area (WADI maintains an abstraction layer
> that can map on to any of these three). All provide an API which
> provides membership notification/querying, 1->1 and 1->all messaging
> functionality. These are the basic building blocks of clustering and
> they will be required in every clustered service that is built for
> Geronimo. This is a natural candidate for encapsulation and sharing.
> Failing to do this will result in each different service having to build
> its own concepts about clustering from the ground up, which would be a
> disaster.

Agreed.

Things in the Java world have changed greatly since the introduction
of JGroups, JCluster, ActiveCluster, Tribes et al. Nowadays there is
no reason why we can't have a really simple POJO based model to
represent Nodes in a cluster with listeners to be notified when nodes
come and go. (It's really the main point of ActiveCluster - but we
could maybe refactor that API to be just a POJO model of a cluster
with no dependencies on external APIs or technologies and with the
ability maybe to cast a Node to some service interface to communicate
with the nodes).
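
To make the idea concrete, a minimal sketch of such a POJO model might
look like the following. This is not the ActiveCluster API: Node,
Cluster, ClusterListener and the as(...) method are hypothetical names,
and the "service" here is just a local object standing in for whatever
a remoting layer would supply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of a POJO cluster model; not the ActiveCluster API.

/** A cluster member that can be viewed through a service interface. */
final class Node {
    private final String name;
    private final Object service; // stand-in for a remoting proxy

    Node(String name, Object service) {
        this.name = name;
        this.service = service;
    }

    String getName() {
        return name;
    }

    /** Views the node as the given interface; throws ClassCastException if unsupported. */
    <T> T as(Class<T> serviceInterface) {
        return serviceInterface.cast(service);
    }
}

/** Notified when nodes come and go. */
interface ClusterListener {
    void nodeAdded(Node node);

    void nodeRemoved(Node node);
}

/** Plain in-memory membership model with no dependency on any transport. */
final class Cluster {
    private final List<Node> nodes = new CopyOnWriteArrayList<>();
    private final List<ClusterListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(ClusterListener listener) {
        listeners.add(listener);
    }

    void join(Node node) {
        nodes.add(node);
        for (ClusterListener l : listeners) {
            l.nodeAdded(node);
        }
    }

    void leave(Node node) {
        // Only notify if the node was actually a member.
        if (nodes.remove(node)) {
            for (ClusterListener l : listeners) {
                l.nodeRemoved(node);
            }
        }
    }

    /** Returns a snapshot copy of the current membership. */
    List<Node> getNodes() {
        return new ArrayList<>(nodes);
    }
}

final class ClusterModelDemo {
    public static void main(String[] args) {
        Cluster cluster = new Cluster();
        cluster.addListener(new ClusterListener() {
            public void nodeAdded(Node node) {
                System.out.println("joined: " + node.getName());
            }

            public void nodeRemoved(Node node) {
                System.out.println("left: " + node.getName());
            }
        });

        Node a = new Node("nodeA", (Runnable) () -> System.out.println("hello from nodeA"));
        cluster.join(a);
        a.as(Runnable.class).run(); // talk to the node via a service interface
        cluster.leave(a);
    }
}
```

Nothing in the model mentions a transport; how join/leave get driven
and what sits behind as(...) would be left to whichever substrate is
plugged in.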

Then using things like Spring Remoting we can add the remoting
technology as a deployment issue (rather than having lots of different
middleware specific APIs). e.g. see how Lingo allows you to invisibly
add JMS remoting to any POJO. (http://lingo.codehaus.org/)

-- 

James
-------
http://radio.weblogs.com/0112098/