Posted to dev@tomcat.apache.org by Bip Thelin <bi...@razorfish.com> on 2001/05/02 06:46:47 UTC

[PROPOSAL Tomcat 4.x] Cluster

I started looking at how to implement a DistributedManager, and as I see it the
best way to do this is to use MulticastSocket. That got me thinking about putting
the multicast code in a Cluster package, so that you configure a <Cluster> element
in server.xml. The Cluster joins a cluster group (a multicast group) and has a
defined set of methods/interfaces that its members use to communicate with each
other. These clusters could then be used to implement in-memory session
replication, a.k.a. distributed sessions.
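
As a rough illustration of the idea (this is not the code under discussion),
joining a multicast group with java.net.MulticastSocket and sending one session
update might look like the sketch below. The class name ClusterChannelSketch, the
"sessionId|payload" wire format, and the assumption that session ids never
contain '|' are all made up for this example.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Catalina: frames a session update and
// sends it to every member of a multicast group.
final class ClusterChannelSketch {

    // Encode "sessionId|payload" as UTF-8 bytes (assumed wire format).
    static byte[] encode(String sessionId, String payload) {
        return (sessionId + "|" + payload).getBytes(StandardCharsets.UTF_8);
    }

    // Decode a received datagram into { sessionId, payload }.
    // Splits on the first '|', so the payload itself may contain '|'.
    static String[] decode(byte[] data, int length) {
        String text = new String(data, 0, length, StandardCharsets.UTF_8);
        int sep = text.indexOf('|');
        if (sep < 0) {
            throw new IllegalArgumentException("missing separator");
        }
        return new String[] { text.substring(0, sep), text.substring(sep + 1) };
    }

    // Join the group and send one update. Needs a multicast-capable network.
    // Joining is only required for receiving; it is shown here because a
    // cluster member would normally do both.
    static void send(String address, int port, String sessionId, String payload)
            throws IOException {
        InetAddress group = InetAddress.getByName(address);
        InetSocketAddress groupAddress = new InetSocketAddress(group, port);
        try (MulticastSocket socket = new MulticastSocket(port)) {
            socket.joinGroup(groupAddress, null); // null = default interface
            byte[] data = encode(sessionId, payload);
            socket.send(new DatagramPacket(data, data.length, group, port));
            socket.leaveGroup(groupAddress, null);
        }
    }
}
```

A member would call something like ClusterChannelSketch.send("228.5.6.7", 6789,
id, state) whenever a session changes; the address and port here are only
placeholders.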

What we would gain from this is that, in the future, we can hook in other things
that use the Cluster, such as weighted load balancing.

I started writing some initial stubs and interfaces for how this could work, but I
don't want to just commit them without having aired the idea here first. Maybe
this is a shitty idea to begin with.

	..bip

Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by "Mark.Abbott" <Ma...@openwave.com>.
I would think that, if it already existed, the API described by
the new JSR 107 
http://java.sun.com/aboutJava/communityprocess/jsr/jsr_107_cache.html
would be strongly considered as the basis for a distributed session
manager.  Any design made now ought to have an eye on going over to that
API in the future.  Perhaps that merely means that a new Manager
implementation could be made in the future.

This does not reflect either way on your suggestion for using MulticastSocket
and a Cluster abstraction. 

   Mark


Bip Thelin wrote:
> 
> I started looking at how to implement a DistributedManager and as I see it the
> best way to do this is to use MulticastSocket. So I started to look at how to
> implement it using MulticastSocket and started thinking about including that in
> a Cluster package. So you configure a <Cluster> package from within Server.xml
> The Cluster package joins a Cluster group(Multicast group) and have a defined
> set of methods/interfaces to communicate with each other. These clusters could
> then be used to implement in memory session replication, aka. Distributed sessions.
> 
> What we would gain from this in the future is that we can hook in different things
> that uses the Cluster, for doing weighted load balancing and other stuff.
> 
> I started writing some initial stubs and interfaces on how this could work, I
> don't just want to commit it without having ventilated it here first. Maybe
> this is a shitty idea to begin with.
> 
>         ..bip



Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by Kief Morris <ki...@bitbull.com>.
To summarize, I see the following issues needing to be considered
and resolved for distributed sessions:

- Management of session ID/request redirection,
- Architecture of Manager/Store/Cluster interfaces.

I made a few points on the architectural front in my previous post.

For request/ID management, the issues are:
a) How are incoming requests routed, i.e. what controls which instance of
    Catalina gets each request?
b) How do we manage the session ID when a session is handled
    by different Catalina instances?

Ideally "something" in front will manage incoming requests, ensuring
that a client is always sent to the same Catalina instance if possible,
choosing an alternative instance if not. 

Assuming this front-end exists, it ought to ensure that the second
Catalina instance knows which session ID to use for the client.

So what can this front end be? 
It could be something we provide, as Craig has suggested. What
if a web server connector is used? Will we need our load balancer
to sit in front of the web servers, or can we take advantage of the
connectors somehow? Do the connectors use any load balancing
and/or dead server detection which may affect us?

It could also be a third-party load balancer device (e.g. a router,
etc.). I'm aware that many (most? all?) of these will try to keep
a client on the same server instance. But are there any that do
not? Do they have a significant enough market presence that Catalina
should be able to handle them, without trying to defeat them by imposing
a custom load balancer/request redirector?

Also, what happens to a client using URL rewriting for the session
ID? I suppose we will need to modify HttpServletResponse.encodeURL()
to ensure IDs are encoded correctly in distributed applications.
Our own request redirector can use the encoded session ID to
direct requests to the right Catalina instance. What about connectors?

I would like to have a better understanding of these issues, and
what people believe are the best ways to approach them. Comments,
suggestions, criticisms, questions?

Kief


Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Mon, 7 May 2001, Kief Morris wrote:

> Craig R. McClanahan typed the following on 11:18 AM 5/7/2001 -0700
> >An interesting question is, how do you detect when a session has been
> >"changed"?  Obviously, you can detect setAttribute/removeAttribute, but
> >what about changes to the *internal* state of the attributes themselves
> >that the session does not know about?
> 
> I think we have to consider the session to be "dirty" at the end of
> any request in which it was accessed.
> 

That's certainly feasible, but I'd bet we find it's too conservative a
view given the potential impact on performance
(i.e. "needless" replications).

> Kief
> 
> 

Craig



Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by Kief Morris <ki...@bitbull.com>.
Craig R. McClanahan typed the following on 11:18 AM 5/7/2001 -0700
>An interesting question is, how do you detect when a session has been
>"changed"?  Obviously, you can detect setAttribute/removeAttribute, but
>what about changes to the *internal* state of the attributes themselves
>that the session does not know about?

I think we have to consider the session to be "dirty" at the end of
any request in which it was accessed.

Kief


Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Mon, 7 May 2001, Bip Thelin wrote:

> [SNIP]
> 
> Do we really need to lock a session for each request and then
> replicate it? Sorry I might be confused, you mean a request for a
> session or a request as in generating a new request object(http
> request). If we assume that a session is only in use in one JVM at a
> time(which I think we can assume) then that session doesn't need to be
> locked it just needs replication when it's changed.
> 

The servlet spec *requires* that all requests for a given session, at any
point in time, be served by a single JVM.

Whether and when replication happens seems to me like a quality of service
issue for different implementations of the cluster -- I don't think a
single answer will suffice here.  I can conceive of everything from never
migrating an existing session (essentially what the current "load
balancing" support provides) to duplicating every single change live.

An interesting question is, how do you detect when a session has been
"changed"?  Obviously, you can detect setAttribute/removeAttribute, but
what about changes to the *internal* state of the attributes themselves
that the session does not know about?  (I understand, but haven't
verified, that some J2EE containers expect you to call setAttribute again,
on the same attribute, to tell the container that you've modified
something).
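
To make the detection problem concrete, here is a minimal sketch of a session
that flags itself as changed only when setAttribute/removeAttribute is called.
DirtyTrackingSession and its methods are hypothetical names for illustration,
not Catalina's session implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical session wrapper, not Catalina's StandardSession.
final class DirtyTrackingSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private boolean dirty;

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true; // the container can observe this call
    }

    void removeAttribute(String name) {
        if (attributes.remove(name) != null) {
            dirty = true; // only a real removal counts as a change
        }
    }

    Object getAttribute(String name) {
        return attributes.get(name); // reads never mark the session dirty
    }

    boolean isDirty() {
        return dirty;
    }

    // Called once the session has been replicated.
    void clearDirty() {
        dirty = false;
    }
}
```

With this approach, code that fetches a List from the session and adds an
element leaves isDirty() false; only a second setAttribute() call on the same
attribute tells the container that something changed, which is the convention
described above.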

Craig


Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by Bip Thelin <bi...@razorfish.com>.
Kief Morris wrote:
> 
> [...]
>
> My point is that the Manager/Cluster needs to know when the session is in
> use by another instance of Catalina. A locking mechanism must be
> implemented by the Cluster (or whatever) to prevent a session from being
> used by multiple instances at once. This mechanism will require the
> Manager/Cluster system to know when a request  begins using
> a session, and when it has finished.
> 
> >If we say that only one JVM at a time can manipulate a sessions since
> >a sessions only belongs to one machine at a time the only time a session
> >needs to be replicated is when it's created/changed/destroyed.
> 
> Yes. But putting the session into the Cluster at creation time is
> unnecessary. It should be put in at the end of the request when
> it is created (other Catalina instances can't use it before then
> anyway), and updated at the end of each subsequent request.
> So we need to have the end of a request call into the Manager
> to indicate that the session can be sent to the Cluster and
> unlocked for use.

Do we really need to lock a session for each request and then replicate it?
Sorry, I might be confused: do you mean a request for a session, or a request
as in a new request object (an HTTP request)? If we assume that a session
is only in use in one JVM at a time (which I think we can assume), then that
session doesn't need to be locked; it just needs replication when it's changed.

> >I'd rather see the replication be implemented in a Manager(i.e.
> >DistributedManager or
> >maybe change name to MulticastDistributedManager) thus making it possible to
> >run any Store with the DistributedManager(i.e. FileStore).
> 
> OK, I take your point that extending Store isn't the way to go. But
> I don't think we should have a different Manager implementation for
> each available distribution mechanism - MulticastDistributedManager,
> JMSDistributedManager, JavaSpacesDistributedManager,
> JCacheDistributedManager, etc. We should use the same pattern
> that Manager/Store uses: a single DistributedManager should be
> implemented which is independent of the actual session sharing
> mechanism. It should be able to use any implementation of the
> Cluster interface.

Yes, sorry, I was clear as mud in my last email. If you look at the
DistributedManager.java I checked in 14h ago (as of writing), it now uses the
new APIs I created, which are common to any distribution protocol you might implement.

> I'd like to keep the possibility open to implement different distribution
> strategies. The strategy we're looking at now is for each instance of the
> application to hold copies of every session. An alternative Cluster strategy
> would keep the sessions in a central location such as a database: when
> a request comes in for a session not found in the current instance, the
> Cluster checks the database to see if it's there. This isn't too different
> from simply using JDBCStore. A third way is to have just two instances
> of the application holding a given session: when instance A creates
> the session, the Cluster chooses instance B to hold a backup copy
> in case A goes down: if a request comes in to C, B still has it available.

One simple way to do this with the cluster is to group them. There is an
option now to specify the name/port/address of the cluster. What I was
thinking is that you could specify a cluster that this.jvm belongs to and then
specify a cluster it should replicate to.

> Not that we need to implement all of these, but the architecture we
> build now should allow these possibilities and others. Other people
> can try out different ideas, and users can choose the system best
> suited to their needs.

yes, agree.

> I'm also not sure about the issues with using persistence and distribution
> simultaneously. If we simply use PersistentManager with this distribution
> code, each instance will save its own copy of every session to persistent
> storage. This might be desirable in some cases - I can see using FileStore,
> for instance. But if you use JDBCStore and the Multicast distribution, it's
> wasteful - with a 4 server farm, we have 4 copies of each session in the
> database. So how should this be addressed? Cluster ought to have some
> mechanism which (optionally) ensures that each session is only
> persisted once. This may mean having Cluster override Store functionality,
> which is why I was thinking of combining the two.

Yes, that's a good point. At first I was thinking that each machine in a Cluster
has its own unique key, so when you generate a session id machine 1 would
get something like: A1KDSFNRKIFLKMFDSFDSA where:
--------------------|| <-- are the two letters that identify the machine. When
you know which machine "owns" the session, all machines that have the session
replicated know that it doesn't belong to them, so they shouldn't save it in a
Store. It's also useful for an eventual Tomcat dispatcher frontend to know which
machine the session originates from. However, some complications occur when you
replicate a session and the machine that "owned" the session dies, so another
machine takes it over. Should that machine then take the role of Machine A1?
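
A minimal sketch of that idea follows. It assumes the machine key is the
leading two characters of the id (as the "Machine A1" remark suggests; the
position is an assumption) and that only the owner persists the session.
OwnerTaggedSessionId and its methods are hypothetical names.

```java
import java.security.SecureRandom;

// Hypothetical helper: session ids that carry the key of the owning machine.
final class OwnerTaggedSessionId {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final int KEY_LENGTH = 2;     // e.g. "A1"
    private static final int RANDOM_LENGTH = 19; // 21 characters in total
    private static final SecureRandom RANDOM = new SecureRandom();

    // Build an id of the form <machine key><random characters>.
    static String generate(String machineKey) {
        if (machineKey.length() != KEY_LENGTH) {
            throw new IllegalArgumentException("machine key must be 2 characters");
        }
        StringBuilder id = new StringBuilder(machineKey);
        for (int i = 0; i < RANDOM_LENGTH; i++) {
            id.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return id.toString();
    }

    // Extract the key of the machine that created the session.
    static String ownerOf(String sessionId) {
        return sessionId.substring(0, KEY_LENGTH);
    }

    // Only the owner writes the session to its Store; replicas skip it.
    static boolean shouldPersist(String sessionId, String localMachineKey) {
        return ownerOf(sessionId).equals(localMachineKey);
    }
}
```

The open question above is untouched by this sketch: if the owner dies, the id
still names it, so the takeover machine would need either to adopt the old key
or to have ownership tracked outside the id.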

	Cheers, Bip

Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by Kief Morris <ki...@bitbull.com>.
Bip Thelin typed the following on 04:06 PM 5/6/2001 -0700
>> We also need to answer the question of the request life cycle: the
>> DistributedManager needs to know when a request begins and ends.
>> At the beginning, it must lock the session to prevent other Catalina
>> instances from using it in requests. This can probably just be done
>> in Manager.findSession(). At the end, it must tell the ClusterStore to
>> update the session to other members of the Cluster, and unlock it.
>
>I'm not really sure what you're saying here. 

OK, ignore my use of the term ClusterStore - I was melding the
Store and Cluster concepts, but from your comments I see that
we may want to be able to use both in the same setup.

My point is that the Manager/Cluster needs to know when the session is in 
use by another instance of Catalina. A locking mechanism must be 
implemented by the Cluster (or whatever) to prevent a session from being 
used by multiple instances at once. This mechanism will require the
Manager/Cluster system to know when a request  begins using
a session, and when it has finished.

>If we say that only one JVM at a time can manipulate a sessions since
>a sessions only belongs to one machine at a time the only time a session
>needs to be replicated is when it's created/changed/destroyed.

Yes. But putting the session into the Cluster at creation time is
unnecessary. It should be put in at the end of the request when
it is created (other Catalina instances can't use it before then
anyway), and updated at the end of each subsequent request.
So we need to have the end of a request call into the Manager
to indicate that the session can be sent to the Cluster and
unlocked for use.

>I'd rather see the replication be implemented in a Manager(i.e. 
>DistributedManager or
>maybe change name to MulticastDistributedManager) thus making it possible to
>run any Store with the DistributedManager(i.e. FileStore).

OK, I take your point that extending Store isn't the way to go. But
I don't think we should have a different Manager implementation for 
each available distribution mechanism - MulticastDistributedManager,
JMSDistributedManager, JavaSpacesDistributedManager, 
JCacheDistributedManager, etc. We should use the same pattern
that Manager/Store uses: a single DistributedManager should be
implemented which is independent of the actual session sharing
mechanism. It should be able to use any implementation of the
Cluster interface.
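
The shape of that pattern can be sketched as follows. SessionTransport,
InMemoryTransport and DistributedManagerSketch are hypothetical names, session
state is reduced to a String, and none of this is the real
org.apache.catalina.Cluster API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical transport abstraction standing in for the Cluster interface.
interface SessionTransport {
    void publish(String sessionId, String state);

    String fetch(String sessionId); // null if the cluster does not know the id
}

// One possible transport: a shared map standing in for multicast, JMS, etc.
final class InMemoryTransport implements SessionTransport {
    private final Map<String, String> shared = new HashMap<>();

    public void publish(String sessionId, String state) {
        shared.put(sessionId, state);
    }

    public String fetch(String sessionId) {
        return shared.get(sessionId);
    }
}

// A single manager implementation that works with any transport.
final class DistributedManagerSketch {
    private final Map<String, String> local = new HashMap<>();
    private final SessionTransport transport;

    DistributedManagerSketch(SessionTransport transport) {
        this.transport = transport;
    }

    // Store locally and hand the session to the cluster.
    void save(String sessionId, String state) {
        local.put(sessionId, state);
        transport.publish(sessionId, state);
    }

    // Look locally first, then ask the cluster.
    String find(String sessionId) {
        String state = local.get(sessionId);
        if (state == null) {
            state = transport.fetch(sessionId);
            if (state != null) {
                local.put(sessionId, state);
            }
        }
        return state;
    }
}
```

Swapping in a multicast, JMS or JavaSpaces transport would then mean writing a
new SessionTransport only; the manager stays the same.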

I'd like to keep the possibility open to implement different distribution 
strategies. The strategy we're looking at now is for each instance of the 
application to hold copies of every session. An alternative Cluster strategy 
would keep the sessions in a central location such as a database: when 
a request comes in for a session not found in the current instance, the 
Cluster checks the database to see if it's there. This isn't too different
from simply using JDBCStore. A third way is to have just two instances
of the application holding a given session: when instance A creates
the session, the Cluster chooses instance B to hold a backup copy
in case A goes down: if a request comes in to C, B still has it available.

Not that we need to implement all of these, but the architecture we
build now should allow these possibilities and others. Other people
can try out different ideas, and users can choose the system best
suited to their needs.

I'm also not sure about the issues with using persistence and distribution
simultaneously. If we simply use PersistentManager with this distribution 
code, each instance will save its own copy of every session to persistent 
storage. This might be desirable in some cases - I can see using FileStore,
for instance. But if you use JDBCStore and the Multicast distribution, it's
wasteful - with a 4 server farm, we have 4 copies of each session in the
database. So how should this be addressed? Cluster ought to have some
mechanism which (optionally) ensures that each session is only
persisted once. This may mean having Cluster override Store functionality,
which is why I was thinking of combining the two. 

Kief


Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by Bip Thelin <bi...@razorfish.com>.
Kief Morris wrote:
> 
> [...]
> 
> This is one possibility, but if this technique is used, I'm not sure there's
> a real need to distribute the sessions at all - the redirector can simply
> send the client to the Tomcat instance which holds the session locally.

Well, that's true if the machine that originally held the session is still
alive. The beauty of distributed sessions is that if the machine that spawned
the session unexpectedly died, any server in the cluster could continue that
session.

> The problem I have with this approach is the front-end Tomcat becomes
> a single point of failure. A big advantage distributing sessions is that any
> Tomcat instance is able to handle requests for any client session, even
> in the event of failure of other instances. We should also allow Tomcat's
> distribution to work if a router-type solution is used to distribute incoming
> requests in a non-sticky manner.

Well, I believe this is the way the IBM Network Dispatcher works, and
you can cluster it so it wouldn't become a single point of failure. I believe
that we need some sort of frontend to the session distribution; I don't think
that we should allow a request to go to any server at any time in a cluster.
I think that we should, as far as possible, try to have the session in use hit
the machine that spawned it. We can't take for granted that any machine in the
cluster has an EXACTLY up-to-date version of the same session; there's always
going to be some overhead in the replication procedure.
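
A minimal sketch of such a frontend policy: keep a session on the machine that
spawned it while that machine is alive, and fall back to another machine only
when it is not. StickyRouterSketch is a hypothetical name, and picking the
first live instance is a deliberately naive stand-in for real load balancing.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical frontend routing table, not an existing Tomcat component.
final class StickyRouterSketch {
    private final Set<String> liveInstances = new LinkedHashSet<>();
    private final Map<String, String> ownerBySession = new HashMap<>();

    void instanceUp(String name) {
        liveInstances.add(name);
    }

    void instanceDown(String name) {
        liveInstances.remove(name);
    }

    // Returns the instance that should serve the given session.
    String route(String sessionId) {
        String owner = ownerBySession.get(sessionId);
        if (owner == null || !liveInstances.contains(owner)) {
            if (liveInstances.isEmpty()) {
                throw new IllegalStateException("no live instances");
            }
            // Naive choice: first live instance in registration order.
            owner = liveInstances.iterator().next();
            ownerBySession.put(sessionId, owner);
        }
        return owner;
    }
}
```

Failover only works if the chosen replacement already holds a replicated copy
of the session, which is where the replication overhead mentioned above
matters.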

> I would make the (controversial, I know) suggestion that the domain
> for the cookie be configurable, so that organization foo.com can
> set cookies at .foo.com, which would be sent to Tomcat instances
> at www1.foo.com, www2.foo.com, etc.

This is what a dispatcher/frontend would do, but it would also keep a list in
memory of where the sessions "belong".

> There have been a variety of suggestions for how to implement, including
> JavaSpaces, JMS, and JCache. Also JNDI and JDBC and the ugly old file
> system (if a networked file system is used) could work. MulticastSocket
> is a good idea I hadn't thought of - from looking at your code it's obviously
> workable and pretty straightforward. But I think we ought to make the
> mechanism pluggable the same way Store is pluggable for PersistentManager.

If we change the name of the Cluster implementation to MulticastCluster, it could
be swapped for anything that implements the org.apache.catalina.Cluster interface.
We should also make an interface for sending and receiving data that the
MulticastReceiver/Sender would implement. I'll start work on that immediately.

> We also need to answer the question of the request life cycle: the
> DistributedManager needs to know when a request begins and ends.
> At the beginning, it must lock the session to prevent other Catalina
> instances from using it in requests. This can probably just be done
> in Manager.findSession(). At the end, it must tell the ClusterStore to
> update the session to other members of the Cluster, and unlock it.

I'm not really sure what you're saying here. As I envisioned it, a
DistributedManager is responsible for replicating sessions and for restoring
replicated sessions. A Store is a pluggable component that is the
same for any Manager implementation (that makes use of a Store).

If we say that only one JVM at a time can manipulate a session, since
a session only belongs to one machine at a time, the only time a session
needs to be replicated is when it's created/changed/destroyed.

> A MulticastStoreCluster (or whatever) should be pretty straightforward
> to do with Bip's code. Some comments (intended as a "to think about/do",
> rather than as a criticism of Bip's code, which is meant as a starting point
> for discussion)

I'd rather see the replication implemented in a Manager (i.e. DistributedManager,
or maybe renamed to MulticastDistributedManager), thus making it possible to
run any Store with the DistributedManager (e.g. FileStore).

Thanks for the comments Kief!

	Regards, Bip

Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by Bip Thelin <bi...@razorfish.com>.
Kief Morris wrote:
> 
> Bip, thanks for kick-starting this discussion, sorry I've taken a while to look
> at it.
> 
> [...]

If you (or anyone else) want to play around with the highly experimental Cluster,
add this right under your <Host> in server.xml:

<Cluster className="org.apache.catalina.cluster.StandardCluster"
    multicastSocket="6789"
    multicastAddress="228.5.6.7"
    checkInterval="5"
    clusterName="myCluster"
    debug="99"/>



	..bip

Re: [PROPOSAL Tomcat 4.x] Cluster

Posted by Kief Morris <ki...@bitbull.com>.
Bip, thanks for kick-starting this discussion, sorry I've taken a while to look 
at it.

>> One thing I haven't figured out is this. Say I replicate a
>> Session to another machine and successfully install it in the manager; if
>> I connect to that machine and try to resume that session, it doesn't
>> seem to relate that session to me; instead it creates a new session.

[Craig says:]
>That's probably because of the way sessions work -- the browser sends
>session ids back to the host they came from (for either cookies or URL
>rewriting).  I suspect we might have to create some sort of proxy
>front-end (sort of a Local Redirector) that modifies the session cookies
>returned by the back-end Tomcats to be the name of the proxy front-end.  
>It would then need to keep track of which host is really hosting this
>session.  (In an Apache-connected environment, this would probably be done
>in the connector.  For stand-alone, we could create a proxy webapp that
>would run in a front-end Tomcat.)

This is one possibility, but if this technique is used, I'm not sure there's
a real need to distribute the sessions at all - the redirector can simply
send the client to the Tomcat instance which holds the session locally.

The problem I have with this approach is the front-end Tomcat becomes
a single point of failure. A big advantage of distributing sessions is that any
Tomcat instance is able to handle requests for any client session, even 
in the event of failure of other instances. We should also allow Tomcat's
distribution to work if a router-type solution is used to distribute incoming
requests in a non-sticky manner.

I would make the (controversial, I know) suggestion that the domain
for the cookie be configurable, so that organization foo.com can
set cookies at .foo.com, which would be sent to Tomcat instances
at www1.foo.com, www2.foo.com, etc. 
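
As a quick check of what a shared cookie domain buys, the sketch below builds a
session cookie scoped to .foo.com and asks the JDK's own domain-matching rule
(java.net.HttpCookie.domainMatches) which hosts it applies to.
SharedDomainCookieSketch is a hypothetical name, the cookie name JSESSIONID is
assumed, and browsers may apply stricter rules than this check.

```java
import java.net.HttpCookie;

// Hypothetical helper: a session cookie scoped to a whole domain.
final class SharedDomainCookieSketch {

    // Build a session cookie that is valid for every host under the domain.
    static HttpCookie sessionCookie(String sessionId, String domain) {
        HttpCookie cookie = new HttpCookie("JSESSIONID", sessionId);
        cookie.setDomain(domain);
        cookie.setPath("/");
        return cookie;
    }

    // True if the JDK's domain-matching rule says the cookie applies to host.
    static boolean wouldBeSentTo(HttpCookie cookie, String host) {
        return HttpCookie.domainMatches(cookie.getDomain(), host);
    }
}
```

A cookie created with sessionCookie(id, ".foo.com") matches www1.foo.com and
www2.foo.com but not a host under another domain.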

>The front-end could also later be the basis for dynamic load balancing --
>although we need to remember to support the servlet spec restriction that,
>at any specific point in time, a session lives in one and only one
>JVM.  In other words, if I have an active request for session 123 being
>executed on host A, I have to direct any other simultaneous requests for
>session 123 to host A as well.

In order to cope with load balancing mechanisms which aren't aware
of these issues, we could implement locking - flag when a session is
in use by a particular instance - and do "something" if a request comes
in on another instance which doesn't hold the lock: send an error, redirect,
or just wait until the lock frees (ugh).
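
The flagging part could be sketched like this. SessionLockTableSketch is a
hypothetical name, and the table lives in a single JVM here purely for
illustration; a real cluster would need the same information shared between
instances.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical lock table: records which instance currently uses a session.
final class SessionLockTableSketch {
    private final ConcurrentMap<String, Integer> ownerBySession =
            new ConcurrentHashMap<>();

    // Called when a request starts. Succeeds if the session is free or is
    // already held by the same instance.
    boolean tryLock(String sessionId, int instanceId) {
        Integer current = ownerBySession.putIfAbsent(sessionId, instanceId);
        return current == null || current == instanceId;
    }

    // Called when a request ends. Only the holder can release the lock.
    // Simplification: concurrent requests on one instance are not counted,
    // so the first one to finish releases the lock for all of them.
    boolean unlock(String sessionId, int instanceId) {
        return ownerBySession.remove(sessionId, instanceId);
    }
}
```

When tryLock() returns false, the caller still has to pick one of the options
listed above (error, redirect, or wait).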

[Bip says:]
>>I started looking at how to implement a DistributedManager and as I see it the
>>best way to do this is to use MulticastSocket. 

There have been a variety of suggestions for how to implement, including 
JavaSpaces, JMS, and JCache. Also JNDI and JDBC and the ugly old file
system (if a networked file system is used) could work. MulticastSocket 
is a good idea I hadn't thought of - from looking at your code it's obviously 
workable and pretty straightforward. But I think we ought to make the
mechanism pluggable the same way Store is pluggable for PersistentManager.

The DistributedManager should manage the manipulation of sessions for
the local instance, for instance overriding getSession() to make a call
to the ClusterStore if a) the application is marked as distributable, and
b) the Store is an instance of ClusterStore.

We also need to answer the question of the request life cycle: the
DistributedManager needs to know when a request begins and ends.
At the beginning, it must lock the session to prevent other Catalina
instances from using it in requests. This can probably just be done
in Manager.findSession(). At the end, it must tell the ClusterStore to 
update the session to other members of the Cluster, and unlock it.

So what would the ClusterStore interface look like? The following
are some methods it may need:

import org.apache.catalina.Manager;
import org.apache.catalina.Session;

public interface ClusterStore {

    /** Returns an up-to-date copy of the Session for the given ID,
        and locks the session. Throws IllegalStateException if the
        session is in use by another instance. */
    public Session findSession(String id) throws IllegalStateException;

    /** Adds or updates the session in the cluster, releasing the lock if
        applicable. */
    public void sendSession(Session session);

    /** Receive a new or updated session from another instance of the
        cluster. May be called by another method of the same class, but
        is useful to have in the interface - a ClusterStoreBase will probably
        have the canonical implementation of this.
    */
    public void receiveSession(Session session);

    /** Gets a unique identifier for this Context/JVM instance, used
        by the locking mechanism to indicate the owner of the lock.
    */
    public int getInstanceID();

    /** Sets a unique identifier for this Context/JVM instance.
    */
    public void setInstanceID(int id);

    /** Gets the Manager this ClusterStore is associated with. */
    public Manager getManager();

    /** Sets the Manager this ClusterStore is associated with. */
    public void setManager(Manager manager);
}

No doubt there will need to be more to it than this. Any comments?
I'll code it up and commit it in a day or two if nobody has major
problems with the concept.

A MulticastStoreCluster (or whatever) should be pretty straightforward
to do with Bip's code. Some comments (intended as a "to think about/do",
rather than as a criticism of Bip's code, which is meant as a starting point 
for discussion)
- The findSession() would probably just return null, since receiveSession()
  would update the Manager's session list for it.
- Rather than publishing the session after creation, it should be
  published at the end of each request, so any changes will be shared.
  This will take a bit of digging into request handling.
- Rather than receiving updates when the normal background thread
  gets around to it periodically, this Store should probably have a
  separate thread which handles incoming session updates from
  other instances. Otherwise a request may come in before the
  session has been absorbed from another instance.
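
The last point, a dedicated thread for incoming updates, might be sketched as
below. SessionUpdateReceiverSketch is a hypothetical name, session state is
reduced to a String, and offer() stands in for whatever the multicast receiver
would call when a packet arrives.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical receiver: applies session updates as soon as they arrive,
// instead of waiting for a periodic background pass.
final class SessionUpdateReceiverSketch {
    private static final String[] STOP = new String[0]; // shutdown marker

    private final BlockingQueue<String[]> incoming = new LinkedBlockingQueue<>();
    private final Map<String, String> sessions = new ConcurrentHashMap<>();
    private final Thread worker = new Thread(this::drain, "session-receiver");

    void start() {
        worker.setDaemon(true);
        worker.start();
    }

    // Called by the transport when an update arrives from another instance.
    void offer(String sessionId, String state) {
        incoming.add(new String[] { sessionId, state });
    }

    // Processes everything queued so far, then stops the worker thread.
    void stopAndWait() {
        incoming.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    String find(String sessionId) {
        return sessions.get(sessionId);
    }

    // Worker loop: block until an update is available, then apply it.
    private void drain() {
        try {
            while (true) {
                String[] update = incoming.take();
                if (update == STOP) {
                    return;
                }
                sessions.put(update[0], update[1]);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

This narrows the window in which a request can arrive before its session has
been absorbed, but it does not close it; a request can still race an update
that is in flight on the network.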

Regards,
Kief