Posted to dev@geronimo.apache.org by Jules Gosnell <ju...@coredevelopers.net> on 2003/10/23 12:43:44 UTC

Re: [Re] Web Clustering : Sticky Sessions with Shared Store - current state of play.

Guys,

since this topic has come up again, I thought this would be a useful 
point at which to braindump my current ideas, for comment and as a 
common point of reference...

Here goes:



Each session has one 'primary' node and n 'replicant' nodes associated
with it.
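
In rough code terms, the per-session metadata might look something like
the sketch below - SessionLocation and everything in it are purely
illustrative names, not existing Geronimo classes:

import java.util.ArrayList;
import java.util.List;

class SessionLocation {
    final String sessionId;
    String primaryNode;                                        // the only node allowed to mutate the session
    final List<String> replicants = new ArrayList<String>();   // the n nodes holding backup copies

    SessionLocation(String sessionId, String primaryNode) {
        this.sessionId = sessionId;
        this.primaryNode = primaryNode;
    }
}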

Sticky load-balancing is a hard requirement.

Changes to the session may only occur on the primary node.

Such changes are then replicated (possibly asynchronously, depending
on data integrity requirements) to the replicant nodes.
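
A minimal sketch of the primary's side of that replication - Transport
is just a stand-in for whatever wire protocol we settle on, nothing
here is an existing API:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class Replicator {
    interface Transport {
        void send(String node, byte[] delta);   // however the bytes actually get to a replicant
    }

    private final Transport transport;
    private final ExecutorService async = Executors.newSingleThreadExecutor();

    Replicator(Transport transport) {
        this.transport = transport;
    }

    // synchronous = stronger data integrity, asynchronous = lower request latency
    void replicate(List<String> replicants, byte[] delta, boolean synchronous) {
        for (final String node : replicants) {
            final byte[] d = delta;
            if (synchronous) {
                transport.send(node, d);
            } else {
                async.submit(new Runnable() {
                    public void run() {
                        transport.send(node, d);
                    }
                });
            }
        }
    }
}

The synchronous/asynchronous switch is the knob that trades integrity
against latency per webapp.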

If, for any reason, a session is required to 'migrate' to another node
(fail-over or clusterwide state-balancing), this 'target' node makes a
request to the cluster for the session; the current 'source' node
handshakes, the migration ensues, and the target node is then promoted
to primary status.
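
Very roughly, and only as an illustration (Cluster and Node are assumed
interfaces, not anything that exists today), the target-side handshake
might be:

class Migration {
    interface Cluster {
        Node findPrimary(String sessionId);   // locate the node currently owning the session
    }
    interface Node {
        byte[] surrender(String sessionId);   // source stops serving the session and hands its state over
    }

    // Runs on the 'target' node that needs the session (fail-over or state-balancing).
    static byte[] migrate(Cluster cluster, String sessionId) {
        Node source = cluster.findPrimary(sessionId);   // 1. ask the cluster who currently owns it
        byte[] state = source.surrender(sessionId);     // 2. handshake: the source releases the session
        return state;                                   // 3. caller installs the state and becomes primary
    }
}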

Any inbound request landing on a node that is not primary for the
required session results in either a forward/redirect of the request
to its current primary, or a migration of the session to the receiving
node and that node's promotion to primary.
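
Sketched as a decision the non-primary node makes per request -
preferMigration, migrateAndPromote and proxyTo are invented names
standing in for the real mechanics:

import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

abstract class NonPrimaryHandler {
    abstract boolean preferMigration(String sessionId);               // policy: pull the session here, or not?
    abstract void migrateAndPromote(String sessionId);                // as in the migration sketch above
    abstract void proxyTo(String primaryNode, HttpServletRequest req,
                          HttpServletResponse res) throws IOException;

    void handle(String sessionId, String primaryNode,
                HttpServletRequest req, HttpServletResponse res) throws IOException {
        if (preferMigration(sessionId)) {
            migrateAndPromote(sessionId);     // session moves to this node, which becomes primary
            // ...then continue normal request processing locally...
        } else {
            proxyTo(primaryNode, req, res);   // forward/redirect to the current primary instead
        }
    }
}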

A shared store is used to passivate sessions that have been inactive
for a given period, or are surplus to constraints on a node's session
cache size.

Once in the shared store, a session is disassociated from its primary
and replicant nodes. Any node in the cluster receiving a relevant
request may load the session, become its primary and choose
replicant nodes for it.
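
In outline, assuming a SharedStore interface that does not exist yet,
passivation and reactivation might amount to:

import java.io.Serializable;

class PassivationSketch {
    interface SharedStore {
        void put(String sessionId, Serializable state);    // write the session out
        Serializable take(String sessionId);                // load it and remove it from the store
    }

    // Evict a session that has been idle too long or breaches the node's cache size cap.
    static void passivate(SharedStore store, String sessionId, Serializable state) {
        store.put(sessionId, state);
        // from here on the session has no primary and no replicant nodes
    }

    // Any node receiving a relevant request may do this and become the new primary.
    static Serializable activate(SharedStore store, String sessionId) {
        Serializable state = store.take(sessionId);
        // caller promotes itself to primary and chooses fresh replicant nodes
        return state;
    }
}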

Correct tuning of this passivation, in a situation where frequent
migration is taking place, might cut the amount of node-to-node
migration dramatically.


The reason for the hard node-level session affinity requirement is to
ensure maximum cache hits in e.g. the business tier. If a web session
is interacting with cached resources that are not explicitly tied to
it (and so cannot be associated with the same replicant nodes), the
only way to ensure that subsequent uses of this session hit resources
in these caches is to ensure that they occur on the same node as the
cache - i.e. the session's primary node.

By only having one node that can write to a session, we remove the
possibility of concurrent writes occurring on different nodes and the
subsequent complexity of deciding how to merge them.

The above strategy will work with an 'implicit-affinity' lb (e.g. BigIP),
which remembers the last node that a session was successfully accessed
on and rolls this value forward as and when it has to fail over to a
new node. We should be able to migrate sessions forward to the next
node picked by the lb, underneath it, keeping the two in sync.

With an 'explicit-affinity' lb (e.g. mod_jk), where routing info is
actually encoded into the jsessionid/JSESSIONID value (or maybe an
auxiliary path param or cookie), it should be possible, in the case of
fail-over, to choose a (probably replicant) node to promote to primary
and to stick requests falling elsewhere to this new primary by
resetting the routing info in their jsessionid/JSESSIONID and
redirecting/forwarding them to it.
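
For the mod_jk case the routing info is conventionally a suffix on the
session id ('<id>.<worker>'), so resetting it is little more than the
string manipulation below - the exact encoding is lb-specific, so treat
this as a sketch:

class RoutingInfo {
    // e.g. reroute("ABC123.node1", "node2") -> "ABC123.node2"
    static String reroute(String jsessionid, String newRoute) {
        int dot = jsessionid.lastIndexOf('.');
        String bareId = (dot < 0) ? jsessionid : jsessionid.substring(0, dot);
        return bareId + "." + newRoute;
    }
}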

If, in the future, we write/enhance an lb to be Geronimo-aware, we can
be even smarter in the case of fail-over: just ask the cluster to
choose a (probably replicant) node to promote to primary and then
route subsequent requests straight to that node.

The cluster should dynamically inform the lb about joining/leaving
nodes, and sessions should likewise maintain their primary/replicant
lists accordingly.

LBs also need to be kept up to date with the locations and access
points of the various webapps deployed around the cluster, relevant
node and webapp stats (on which to base balancing decisions), etc...

All of this information should be available to any member of the
cluster and a Geronimo-aware lb should be a full cluster member.
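
The sort of events such an lb-as-cluster-member would want to observe
might look like this - all of the names are invented for illustration:

interface ClusterTopologyListener {
    void nodeJoined(String nodeId);
    void nodeLeft(String nodeId);
    void webappDeployed(String nodeId, String contextPath);
    void webappUndeployed(String nodeId, String contextPath);
    void statsUpdated(String nodeId, int activeSessions, double load);   // input for balancing decisions
}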

On shutting down every node in the cluster, all session state should
end up in the shared store.


These are fairly broad brushstrokes, but they have been placed after
some thought and outline the sort of picture that I would like to see.

Your thoughts?


Jules






Bhagwat, Hrishikesh wrote:

>>I am also not convinced it reduces the amount of net traffic. After each
>>request the MS must write to the shared store, which is the same traffic as
>>a unicast write to another node or a multicast write to the partition
>>(discounting the processing power needed to receive the message).
>>    
>>
>I agree. However, this is based on the assumption that only one unicast 
>write is required. In other words, this is a primary/secondary topology. I 
>think that hb did not intend such a topology and hence his statement.
>
>[hb]  Yes, I was not assuming a Pri/Sec design but a layout where any active server
>	can be requested to pick up a client request which is destined for a server that has just failed
>
>-----Original Message-----
>From: gianny DAMOUR [mailto:gianny_damour@hotmail.com]
>Sent: Sunday, October 19, 2003 7:35 AM
>To: geronimo-dev@incubator.apache.org
>Subject: [Re] Web Clustering : Sticky Sessions with Shared Store
>
>
>Jeremy Boynes wrote:
>
>  
>
>>However, as Andy says, the cost of storing a serialized object in a BLOB is
>>significant. Other forms of shared store are available though which may
>>offer better performance (e.g. a hi-av NFS server).
>>    
>>
>Do we need a shared repository or a replicated repository?
>
>
>  
>
>>The issue I have with hb's approach is the reliance on an Admin Server, of
>>which there would need to be at least two and they would need to co-operate
>>between themselves and with any load-balancers. I think this can be handled
>>by the regular servers themselves just as efficiently.
>>    
>>
>I agree. It seems that in such a design an Admin Server is only used to 
>route incoming requests to the relevant node.
>
>However, I do not believe that regular servers can do this job. I assume 
>that they will implement a standard peer-to-peer cluster topology to provide 
>redundancy; however, I do not see how they can handle the dispatch of 
>incoming requests.
>
>This feature seems to be either a client-side or a proxy concern: I mean it 
>should be handled before the request reaches the nodes.
>
>In WebLogic, for instance, this feature is handled on the client side via a 
>stub aware of the available nodes. It seems that JBoss (correct me if I am 
>wrong) has also followed this design.
>
>  
>
>>I am also not convinced it reduces the amount of net traffic. After each
>>request the MS must write to the shared store, which is the same traffic as
>>a unicast write to another node or a multicast write to the partition
>>(discounting the processing power needed to receive the message).
>>    
>>
>I agree. However, this is based on the assumption that only one unicast 
>write is required. In other words, this is a primary/secondary topology. I 
>think that hb did not intend such a topology and hence his statement.
>
>Gianny
>


-- 
/*************************************
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 * http://www.coredevelopers.net
 *************************************/