Posted to dev@tomcat.apache.org by "Pier P. Fumagalli" <pi...@betaversion.org> on 2001/02/26 10:58:35 UTC

FW: [advanced-servlets] Session Load Balancing (was: To avoid Duplicate Login)

Really interesting, when thinking about load balancing...

    Pier

-- 
----------------------------------------------------------------------------
Pier Fumagalli  <http://www.betaversion.org/>  <ma...@betaversion.org>

------ Forwarded Message
> From: Charles Forsythe <fo...@alum.mit.edu>
> Organization: NetVoice Technologies
> Reply-To: advanced-servlets@yahoogroups.com
> Date: Mon, 26 Feb 2001 03:51:50 -0600
> To: advanced-servlets@yahoogroups.com
> Subject: Re: [advanced-servlets] Session Load Balancing   (was: To avoid
> Duplicate Login)
> 
> Nic Ferrier wrote:
>> I don't know how NAS or Websphere work but there are (AFAIK) only two
>> main approaches you can take to session load balancing. ...
>> either:
>> 
>> a) the server receiving the session request asks an authoritative
>> source for the current session data, which locks the session until
>> the receiver returns the updated data on completion of the request
>> 
>> b) the server receiving the session request locks the session for the
>> duration of the request via a broadcast to other servers and on
>> completion sends the updated session data to those other servers
> 
> I thought of a third way a while back.  To my knowledge, it's never been
> done. If anyone thinks it's a good idea, please use it before Microsoft
> patents it.
> 
> Basically, the Session ID is enhanced with a server-of-origin tag.  I'll
> call this a "composite ID" consisting of a "server tag" and a "Session
> ID" where the Session ID serves its traditional function.  Ignoring
> security issues (for now), you might see composite IDs that looked like:
> 
> serverX:sessionPPP
> serverX:sessionDDD
> serverY:sessionQQQ
>  ...and so on...
> 
> In this example, the first ID indicates that server X owns the Session
> with the ID sessionPPP.  The Session with ID sessionQQQ is owned by
> server Y.
> 
> If server X gets a request with a composite ID of "serverX:sessionPPP"
> it proceeds with the request the way a stand-alone container would;
> nothing gets serialized, broadcasted, saved, loaded or locked.
> 
> If the next request from that client is routed to server Y, then server
> Y will get a request with that same composite ID of
> "serverX:sessionPPP".  This tells server Y that the first thing it needs
> to do is get the canonical version of Session sessionPPP from server X.
> (The exact method for this may vary, but suffice to say it will not
> involve spawning Threads from Servlets. :-)  In the response from server
> Y, the composite Session ID will change to "serverY:sessionPPP".
> 
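A minimal Java sketch of the composite-ID handoff described above.  The
class names (CompositeId, SessionServer), the shared in-memory "farm" map,
and the direct method call standing in for the server-to-server fetch are
all assumptions made for illustration; the message deliberately leaves the
transfer method open.  The old owner drops its copy on handoff, which is
what the message recommends further down.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical value type for the "serverTag:sessionId" composite ID. */
final class CompositeId {
    final String serverTag;
    final String sessionId;

    CompositeId(String serverTag, String sessionId) {
        this.serverTag = serverTag;
        this.sessionId = sessionId;
    }

    /** Parses e.g. "serverX:sessionPPP"; the ':' separator follows the examples above. */
    static CompositeId parse(String raw) {
        int sep = raw.indexOf(':');
        if (sep <= 0 || sep == raw.length() - 1) {
            throw new IllegalArgumentException("not a composite ID: " + raw);
        }
        return new CompositeId(raw.substring(0, sep), raw.substring(sep + 1));
    }

    @Override
    public String toString() {
        return serverTag + ":" + sessionId;
    }
}

/** Hypothetical in-memory stand-in for one servlet container in the farm. */
final class SessionServer {
    private final String tag;
    private final Map<String, SessionServer> farm; // server tag -> server, shared by all
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    SessionServer(String tag, Map<String, SessionServer> farm) {
        this.tag = tag;
        this.farm = farm;
        farm.put(tag, this);
    }

    /** Creates a session owned by this server and returns its composite ID. */
    String create(String sessionId) {
        sessions.put(sessionId, new ConcurrentHashMap<>());
        return new CompositeId(tag, sessionId).toString();
    }

    /** Hands the state to another server and drops the local copy. */
    Map<String, Object> release(String sessionId) {
        return sessions.remove(sessionId);
    }

    boolean owns(String sessionId) {
        return sessions.containsKey(sessionId);
    }

    /** Handles a request and returns the composite ID to send back to the client. */
    String handle(String rawId) {
        CompositeId id = CompositeId.parse(rawId);
        if (!id.serverTag.equals(tag)) {
            // Tagged for another server: fetch the canonical state from it first.
            SessionServer owner = farm.get(id.serverTag);
            Map<String, Object> state = (owner == null) ? null : owner.release(id.sessionId);
            if (state == null) {
                throw new IllegalStateException("unknown session: " + rawId);
            }
            sessions.put(id.sessionId, state);
        } else if (!sessions.containsKey(id.sessionId)) {
            throw new IllegalStateException("unknown session: " + rawId);
        }
        // Same Session ID, but the tag now names this server.
        return new CompositeId(tag, id.sessionId).toString();
    }
}
```

With two servers registered as "serverX" and "serverY", a request carrying
"serverX:sessionPPP" that lands on serverY comes back tagged
"serverY:sessionPPP", and a request that stays on its owner touches no
other server.
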
> This approach is extremely efficient if there is any level of server
> affinity in your load balancing.  By server affinity, I mean that the
> load balancing attempts to keep a particular client and server paired.
> For example "DNS round-robin" has 100% server affinity because once the
> client has resolved the IP address of the server, it will continue to go
> to the same server.  DNS round-robin is a poor load-balancing method, but
> it eliminates application distribution problems entirely.
> 
> On the other end of the scale is the Cisco LocalDirector and other
> "Layer 4 Switches" (also called "sprayers" sometimes).  To put
> perspective on things, consider that a LocalDirector can load balance a
> million simultaneous TCP sessions with a combined throughput of 200
> Mbits/s.  It's aimed at sites that get hundreds of millions of hits per
> day.  The LocalDirector will try some tricks to ensure server affinity,
> but it can't guarantee server affinity (DNS round-robin is really the
> closest you get to guaranteed affinity).  Server affinity is just one of
> several desired goals that are weighted by a LocalDirector
> configuration.  For example, if a client request is paired with a server
> that is currently at its connection limit, the LocalDirector will send
> the request to another server (unless you configure it never to do
> that).
> 
> The net result of this is that Sessions may move from one server to
> another, but it will happen as infrequently as possible.
> 
> The really nasty issue is this: What if a client makes two concurrent
> requests, each of which is routed to a different server?  Well, there
> are a lot of different ways to handle that, but it's a problem faced by
> any distributed Servlet container (therefore someone's probably already
> thought of a good solution).
> 
> SECURITY ISSUES:  There is a simple way to make this system resistant to
> exploits and non-catastrophic failures.  Due to painful past experience
> with the blinkered analysis that is usually applied to web application
> security, I figured I'd explore the most obvious wrong answers just to
> get them out of the way.
> 
> Good Session IDs are virtually impossible to guess, thus preventing
> malicious hackers from hijacking other users' Sessions by forging their
> Session IDs.  One-way hash functions, such as MD5, are useful for
> generating values of mysterious origin.  The composite ID must contain a
> server tag that must not be opaque to the servers.  You might think that
> it would be valuable to encrypt the composite ID with a two-way cipher
> before sending it to the client to ensure that the server tag was opaque
> to the client.  Actually, such an encryption accomplishes nothing, so
> you'd be wrong.
> 
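A small Java sketch of the point above: the Session ID part is an
unguessable random value, while the server tag stays in clear text.  The
message mentions MD5 as one way to produce such values; this sketch swaps
in java.security.SecureRandom instead.  The class and method names
(SessionIds, newSessionId, newCompositeId) and the 128-bit length are
assumptions for illustration.

```java
import java.security.SecureRandom;

/** Hypothetical helper that builds "serverTag:randomHex" composite IDs. */
final class SessionIds {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private SessionIds() {
    }

    /** Returns 128 random bits as 32 lowercase hex characters. */
    static String newSessionId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    /** Prefixes the random Session ID with a readable server tag, e.g. "www1". */
    static String newCompositeId(String serverTag) {
        return serverTag + ":" + newSessionId();
    }
}
```

The tag is left readable on purpose: the security of the scheme rests on
the random part and on the server-side state, not on hiding the tag.
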
> To illustrate, let's look at some Session IDs with an obvious server tag
> format.  What would you gain by making the following changes to a
> Session ID?
> 
> Original:  www1:54b0c58c7ce9f2a8b551351102ee0938
> Case 1:    www1:54b0c58c7ce9f2a8b551351102ee0925
> Case 2:    www2:54b0c58c7ce9f2a8b551351102ee0938
> 
> In case #1, the portion of the composite ID corresponding to the Session
> ID has been changed; the hash value has been changed linearly to a
> smaller value.  This should have the same result it has on any other
> securely-generated Session ID.  That is, it should merely create a
> meaningless random number.
> 
> In case #2, the server-tag has been changed from "www1" to "www2".
> First of all, www2 will only have a Session matching that composite ID
> if it had been involved in an earlier request.  If www2 is unfamiliar with
> this Session, the attempt to switch server tags will simply fail.
> 
> Assuming that www2 has participated in that Session, the worst-case
> scenario is that the Session state it has is out of date.  We need
> to prevent such stale-data exploits.  It's easy to imagine an
> application, such as an online exam, where returning to a previous
> application state might provide an undesirable advantage to the client.
> 
> An approach I've seen used is to change the value of the Session ID with
> each request.  This is employed in non-distributed applications to
> ensure that a stolen Session ID is only valuable in the short time
> between its issue and the subsequent request.  Furthermore, a
> successfully hijacked Session will cause the legitimate user to lose the
> sequential integrity of Session IDs, locking that user out of the
> application and making them instantly aware that there is some kind of
> problem.
> 
> Realistically, if an attacker was able to steal a Session ID that was
> still valid, it could be used immediately, thereby negating the benefit
> of changing it on each request.  Additionally, two decades of experience
> with Microsoft products has acclimatized users to frequent, inexplicable
> software failures.  This casts serious doubt on the scenario wherein the
> legitimate user finds the sudden inability to use a web application
> mysterious, suspicious or even unusual.
> 
> In the distributed case, this questionable "security feature" might
> appear to prevent an attacker from returning to a previous server and
> its stale Session state.  Simply changing the server tag of the current
> composite ID should fail because the Session ID will have changed as
> well, rendering it meaningless to any servers with stale data.  Again,
> while this is true, its only actual effect is an increase in the
> complexity of the Session management code.
> 
> For example, we might have these IDs:
> 
> Last ID on www2:     www2:54b0c58c7ce9f2a8b551351102ee0938
> Current ID on www1:  www1:3699ae702046d4d24d5ced13f2034531
> 
> The attacker records the first ID from www2, and forces a server switch
> to www1 (or any other server).  The procedure to force a server switch
> depends on the method of server affinity used, but, again, no
> server-affinity method is bullet-proof.  After the switch, the attacker
> could request the stale Session state by replaying the old ID.  It
> doesn't matter that the Session ID has changed or even if the server tag
> is indecipherable to the client.
> 
> This is why encrypted IDs don't do any good.  If there is any benefit to
> switching back to a Server with stale data, the only information you
> need to do so is the value of the composite ID last delivered by that
> stale server.  It bears repeating: this is regardless of whether or not
> you can interpret or modify the composite ID's contents.  The world is
> full of web developers who think that Session Cookies containing
> encrypted data provide really 31337 application security.  Why?  FOR THE
> LOVE OF GOD, WHY?!  If you don't want the client to have access to some
> data, KEEP IT ON THE SERVER (indexed by a suitably random Session ID
> that is sent to the client).
> 
> Anyway, the easy (and obvious) answer to preventing stale-data exploits
> is to prevent stale data from being considered valid.  Each server
> simply invalidates its own copy of the Session state when it is sent to
> another server.  
> 
> In addition, the server releasing the state should maintain a
> "forwarding address" for the current Session state.  If, for example,
> www2 were to get a request for a Session it no longer maintains, it
> could request that Session from the server it last sent the state to
> (e.g. www1).  If, in turn, that Server no longer has the official copy,
> the request would be relayed to the next server in the chain (e.g.
> wwwN).  The worst case would be an exhaustive search of the server farm
> to find the data.  If the server affinity mechanism works at all, it is
> extremely unlikely that this would occur.
> 
> The active state search is a valuable feature.  Not only will it
> gracefully thwart stale data exploits, but it provides an elegant
> recovery mechanism in case there is a glitch that causes the client to
> get out of sync with the servers.
> 
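The invalidate-and-forward scheme above could be sketched in Java roughly
as follows.  Everything here is a hypothetical illustration:
ForwardingServer, its maps, and the direct object references between
servers stand in for whatever container internals and wire protocol a real
implementation would use.  The exhaustive farm-wide search mentioned as
the worst case is not implemented; the sketch simply gives up when the
chain ends.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical server that invalidates released state and remembers where it went. */
final class ForwardingServer {
    final String tag;
    private final Map<String, String> state = new HashMap<>();                 // sessionId -> data
    private final Map<String, ForwardingServer> forwardedTo = new HashMap<>(); // sessionId -> next holder

    ForwardingServer(String tag) {
        this.tag = tag;
    }

    void create(String sessionId, String data) {
        state.put(sessionId, data);
    }

    boolean holds(String sessionId) {
        return state.containsKey(sessionId);
    }

    String read(String sessionId) {
        return state.get(sessionId);
    }

    /**
     * Called when a request arrives whose composite ID names taggedOwner.
     * Follows forwarding addresses from that server until the live copy is
     * found, moves it here, and leaves no second valid copy behind.
     */
    boolean acquire(ForwardingServer taggedOwner, String sessionId) {
        if (holds(sessionId)) {
            return true; // this server already has the official copy
        }
        Set<ForwardingServer> visited = new LinkedHashSet<>();
        ForwardingServer hop = taggedOwner;
        while (hop != null && visited.add(hop)) { // visited set guards against loops
            String data = hop.state.remove(sessionId); // the holder invalidates its copy
            if (data != null) {
                state.put(sessionId, data);
                for (ForwardingServer s : visited) {
                    if (s != this) {
                        s.forwardedTo.put(sessionId, this); // new forwarding address
                    }
                }
                forwardedTo.remove(sessionId);
                return true;
            }
            hop = hop.forwardedTo.get(sessionId); // relay to the next server in the chain
        }
        return false; // chain ended without finding the session
    }
}
```

Replaying an old "www2:..." ID after the Session has moved on therefore
never yields stale data: www2 has no copy left, so the request is relayed
along the chain and the client gets the current state.
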
> In order for the search to work, the Session ID must remain static
> throughout the lifetime of the Session, so that it is the same on every
> server that hosts it.  Therefore, rotating the Session ID (Lame Idea #1)
> costs you more in functionality than it gains you in security.  Even
> without the lost-state search, invalidating stale Session state
> removes any value in obscuring the server-tag syntax or encrypting the
> composite IDs (Lame Idea #2).
> 
> -- Charles
> 

------ End of Forwarded Message


Re: FW: [advanced-servlets] Session Load Balancing (was: To avoid

Posted by Jason Brittain <ja...@olliance.com>.
Nick Bauman wrote:

> Jason,
> 
>> Yes, and the way I've seen people solve this issue is to make each
>> server constantly replicate its sessions to another server so that any
>> session's state is stored in two servers (not just one).  For example,
>> if you've got four servers, A, B, C, and D, you configure A to
>> replicate to B, B to replicate to C, C to replicate to D, and D to
>> replicate to A.  Then the "composite ID" would contain the primary
>> server tag, secondary server tag, and the session ID, like: 
>> A:B:sessionXXX.  So, if server A went down, the load balancer could
>> still get session info from server B, and at the same time let
>> server D know that A is down and to replicate to B until further
>> notice.
> 
> 
> Naaaah.
> 
> This is once again only making sure a majority of the sessions are saved in
> a rotation. A lot of work for very little real fault tolerance.


Sorry to say, but the folks at BEA disagree with you -- this is exactly
what WebLogic does to facilitate distributed HTTP sessions.  In-memory
replication to a selected "buddy" server is pretty fast and fault tolerant
enough for most failures.  It can also ensure no single point of failure.

> Also I think your English up there indicates a solution that causes
> tremendous hysteresis amongst the servers.


How so?

>> This also works when each server replicates sessions to more than one
>> backup server so that you've got even higher fault tolerance (but
>> you'll probably never need that level of fault tolerance).
> 
> 
> Now you may have something: a separate, parallel, session cluster.


Anyone could sure do it that way.  But, I'm not sure that separating it
out from the servers themselves could add much fault tolerance since
the cons to doing this seem to be about as large as the pros.  It seems to
me that making it part of the servers (the servlet containers, for instance)
would work just as well.


-- 
Jason Brittain
Software Engineer, Olliance Inc.        http://www.Olliance.com
Current Maintainer, Locomotive Project  http://www.Locomotive.org


Re: FW: [advanced-servlets] Session Load Balancing (was: To avoid

Posted by Nick Bauman <ni...@cortexity.com>.
Jason,

> 
> Yes, and the way I've seen people solve this issue is to make each
> server constantly replicate its sessions to another server so that any
> session's state is stored in two servers (not just one).  For example,
> if you've got four servers, A, B, C, and D, you configure A to
> replicate to B, B to replicate to C, C to replicate to D, and D to
> replicate to A.  Then the "composite ID" would contain the primary
> server tag, secondary server tag, and the session ID, like: 
> A:B:sessionXXX.  So, if server A went down, the load balancer could
> still get session info from server B, and at the same time let
> server D know that A is down and to replicate to B until further
> notice.

Naaaah.

This is once again only making sure a majority of the sessions are saved in
a rotation. A lot of work for very little real fault tolerance.

Also I think your English up there indicates a solution that causes
tremendous hysteresis amongst the servers.

> This also works when each server replicates sessions to more than one
> backup server so that you've got even higher fault tolerance (but
> you'll probably never need that level of fault tolerance).

Now you may have something: a separate, parallel, session cluster.

> -- 
> Jason Brittain

-- 
Nick Bauman


Re: FW: [advanced-servlets] Session Load Balancing (was: To avoid

Posted by Jason Brittain <ja...@olliance.com>.
Nick Bauman wrote:

> Pier,
> 
>>> If the next request from that client is routed to server Y, then
>>> server Y will get a request with that same composite ID of
>>> "serverX:sessionPPP".  This tells server Y that the first thing it
>>> needs to do is get the canonical version of Session sessionPPP from
>>> server X. (The exact method for this may vary, but suffice to say it
>>> will not involve spawning Threads from Servlets. :-)  In the response
>> 
> 
> The only problem with this is that if you have N servers in a rotation
> (sprayed or DNS round-robin) and one goes down, you lose 1/N of the
> sessions.
> 
> Some people think that if you are going to bother with session load
> balancing / distribution at all, why not try and ensure all the sessions are
> safe, not just a majority.


Yes, and the way I've seen people solve this issue is to make each server
constantly replicate its sessions to another server so that any session's
state is stored in two servers (not just one).  For example, if you've got
four servers, A, B, C, and D, you configure A to replicate to B, B to
replicate to C, C to replicate to D, and D to replicate to A.  Then the
"composite ID" would contain the primary server tag, secondary server tag,
and the session ID, like:  A:B:sessionXXX.  So, if server A went down,
the load balancer could still get session info from server B, and at the
same time let server D know that A is down and to replicate to B until
further notice.

This also works when each server replicates sessions to more than one
backup server so that you've got even higher fault tolerance (but you'll
probably never need that level of fault tolerance).
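
A rough Java sketch of the ring arrangement described above, assuming a
hypothetical ReplicationRing class that only tracks which servers are live
and derives each server's backup from ring order.  The actual copying of
session state between servers is left out, and the "primary:secondary:
sessionId" format follows the A:B:sessionXXX example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of "A replicates to B, B to C, C to D, D to A". */
final class ReplicationRing {
    private final List<String> live; // live server tags, in ring order

    ReplicationRing(String... tags) {
        this.live = new ArrayList<>(Arrays.asList(tags));
    }

    /** The server that the given server replicates to: the next live one in the ring. */
    String backupOf(String tag) {
        int i = live.indexOf(tag);
        if (i < 0) {
            throw new IllegalArgumentException("not a live server: " + tag);
        }
        if (live.size() < 2) {
            throw new IllegalStateException("no backup server available");
        }
        return live.get((i + 1) % live.size());
    }

    /** Builds "primary:secondary:sessionId". */
    String compositeId(String primary, String sessionId) {
        return primary + ":" + backupOf(primary) + ":" + sessionId;
    }

    /** Removes a failed server; its predecessor now replicates to its successor. */
    void markDown(String tag) {
        live.remove(tag);
    }

    /** Chooses the server for a request: the primary if it is live, else the secondary. */
    String route(String compositeId) {
        String[] parts = compositeId.split(":", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected primary:secondary:sessionId");
        }
        if (live.contains(parts[0])) {
            return parts[0];
        }
        if (live.contains(parts[1])) {
            return parts[1];
        }
        throw new IllegalStateException("both copies unavailable: " + compositeId);
    }
}
```

With servers A, B, C and D, a session created on A carries the ID
A:B:sessionXXX.  After markDown("A"), requests for that session route to
B, and backupOf("D") becomes B, matching the failover described above.
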

-- 
Jason Brittain
Software Engineer, Olliance Inc.        http://www.Olliance.com
Current Maintainer, Locomotive Project  http://www.Locomotive.org


Re: FW: [advanced-servlets] Session Load Balancing (was: To avoid

Posted by Nick Bauman <ni...@cortexity.com>.
Pier,

>> If the next request from that client is routed to server Y, then
>> server Y will get a request with that same composite ID of
>> "serverX:sessionPPP".  This tells server Y that the first thing it
>> needs to do is get the canonical version of Session sessionPPP from
>> server X. (The exact method for this may vary, but suffice to say it
>> will not involve spawning Threads from Servlets. :-)  In the response

The only problem with this is that if you have N servers in a rotation
(sprayed or DNS round-robin) and one goes down, you lose 1/N of the
sessions.

Some people think that if you are going to bother with session load
balancing / distribution at all, why not try and ensure all the sessions are
safe, not just a majority.


Re: [advanced-servlets] Session Load Balancing (was: To avoid Duplicate Login)

Posted by James Duncan Davidson <du...@x180.net>.
on 2/26/01 1:58 AM, Pier P. Fumagalli at pier@betaversion.org wrote:

> Really interesting, when thinking about load balancing...

By the way, most of the things talked about there *have* been done in one
place or another -- though almost always as part of a custom site solution
for big sites that do such things as "proprietary trade secrets".

Most cases I've seen bury the fact that it's being done by using the first x
bytes of the cookie to determine session affinity and not using something so
obvious as www1:asdf -- the reason they do this isn't to encrypt data, but
to keep their competitors from figuring out how they get better performance. :)

.duncan

-- 
James Duncan Davidson
http://x180.net/                                             !try; do();