Posted to java-dev@axis.apache.org by Rajith Attapattu <ra...@gmail.com> on 2006/09/22 05:19:01 UTC
[axis2] Clustering
Hi All,
Chathura and Chamikara have posted the following proposal on the wiki.
http://wiki.apache.org/ws/FrontPage/Axis2/clustering_proposal
Next step is for us to figure out and document the demarcation points where
we would want to cluster.
Then we can start on an implementation.
Regards,
Rajith
Re: [axis2] Clustering
Posted by Steve Loughran <st...@apache.org>.
Rajith Attapattu wrote:
> Hi All,
>
> Chathura and Chamikara have posted the following proposal on the wiki.
> http://wiki.apache.org/ws/FrontPage/Axis2/clustering_proposal
>
> Next step is for us to figure out and document the demarcation points where
> we would want to cluster.
> Then we can start on an implementation.
>
> Regards,
>
> Rajith
>
Some observations
1. Try to use other people's work if you can. Getting clustering to
work properly in the face of network failures is hard. We use a
partition-aware tuple space for such things; Anubis [1,2].
2. Are you planning on saving state to the servlet context? It's usually
the simplest way, as the app server vendor will have solved some of the
problems.
3. Servlet state can be brittle against hot redeploy on a single machine
if you update the implementations, and very brittle if you do a rolling
cold redeploy across a cluster. Unless you can go offline, you need to
do choreographed redeploys in which you partition the cluster and have
the load balancer serve the old nodes until the new nodes are up and
have it switched over.
4. Round robin hurts performance if requests in the same session
aren't biased towards the previous machine, purely for caching reasons
(including HDD and DB caches).
5. Round robin needs to use happyaxis.jsp or equivalent to decide where
to route stuff. You cannot just rely on presence of a machine as a
liveness cue, you need to monitor the health of the operations.
6. Are your clusters going to be on the same site/network? What are the
minimum network requirements, with WLAN at one end and InfiniBand at
the other?
7. How are you going to stop system management scaling at O(nodes) or
worse? It can be worse unless your diagnostics are good at tracking down
which machine has a problem, believe me.
8. Testing all of this gets hard indeed. Sometimes we have to resort to
mathematical proofs of correctness.
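Points 4 and 5 above can be sketched together. The following is a minimal illustration, not a proposal for the actual Axis2 design: it assumes a hypothetical AffinityBalancer class, plain string node names, and an external monitor that probes each node (for example by fetching a page such as happyaxis.jsp) and reports the result through reportHealth. Requests in a known session go back to the node that served them before, as long as that node is still reported healthy; everything else falls back to round robin over the healthy nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: round robin with session affinity and health awareness. */
class AffinityBalancer {
    private final List<String> nodes;  // back-end node names (illustrative only)
    private final Set<String> unhealthy = ConcurrentHashMap.newKeySet();
    private final Map<String, String> sessionToNode = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    AffinityBalancer(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    /** Called by an external health monitor after it has probed a node. */
    void reportHealth(String node, boolean healthy) {
        if (healthy) {
            unhealthy.remove(node);
        } else {
            unhealthy.add(node);
        }
    }

    /** Prefer the session's previous node; otherwise round robin over healthy nodes. */
    String route(String sessionId) {
        if (sessionId != null) {
            String previous = sessionToNode.get(sessionId);
            if (previous != null && !unhealthy.contains(previous)) {
                return previous;  // keeps that node's caches warm for the session
            }
        }
        // Try each node at most once, starting from the shared round-robin cursor.
        for (int i = 0; i < nodes.size(); i++) {
            String candidate = nodes.get(Math.floorMod(next.getAndIncrement(), nodes.size()));
            if (!unhealthy.contains(candidate)) {
                if (sessionId != null) {
                    sessionToNode.put(sessionId, candidate);
                }
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy nodes available");
    }

    public static void main(String[] args) {
        AffinityBalancer balancer = new AffinityBalancer(Arrays.asList("node-a", "node-b", "node-c"));
        System.out.println(balancer.route("session-1"));
        System.out.println(balancer.route("session-1"));  // same node as the line above
        balancer.reportHealth("node-a", false);
        System.out.println(balancer.route("session-1"));  // moved to a healthy node
    }
}
```

The sketch deliberately ignores the hard parts raised elsewhere in this list: the session map lives on a single balancer, and nothing here replicates session state, so a session that is moved to another node only works if that node can get at the state some other way.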
Overall, you need to decide on your goals. Is it scalability or
availability? Both can be done with clustering but you need good
awareness of the problems before you can get it right. A High
Availability system will be robust against transient network failures,
and may or may not support rolling redeployment. More to the point, an
underlying design that is not robust against network outages is very
hard to fix, and stops you doing fun things like downsizing or upsizing
the nodes based on demand, rerouting to different machines based on WS-A
internals (*) and session info (i.e. per-customer and geographic selection).
The other thing is that achieving consistency of behaviour in your
distributed system is hard. Whoever is implementing it needs to be able to
argue about Lamport's papers on byzantine generals, or Gray's
experiences, otherwise they haven't got the background needed to get it
right. My own skills in the area are limited, which is why I delegate.
But I do know why it's hard.
1. Anubis is OSS, on our sourceforge project, so you could use it, but
it is LGPL. While we are happy with you calling it from Apache code, I'm
not sure that Apache is. If we can come up with an
implementation-neutral API, we may be able to implement it and so you
could use it as your way of sharing state across a single-site cluster,
preferably one with a decent ethernet behind it.
2. I would think that a back-end neutral SOAP/HTTP load balancer with
awareness of back end availability and able to route on WS-A information
is broadly useful to other SOAP stacks, including Axis1.x and XFire.
Maybe it should be a separate project with a JMX management API for live
configuration. And before you do it, look at what exists in terms of
HTTP load balancing in the rest of Apache. There's mod_proxy in Apache
HTTPD, and there's Tomcat's own rule-based load-balancer [3].
-Steve
[1] http://www.hpl.hp.com/techreports/2005/HPL-2005-72.html
[2] http://www.smartfrog.org/releasedocs/smartfrogdoc/anubis/AnubisUserGuide.pdf
[3] http://tomcat.apache.org/tomcat-5.5-doc/balancer-howto.html#Using%20the%20balancer%20webapp
(*) Load balancing is one reason I don't like WS-A; you need to parse the
doc to find the URL, unless the URL is the only thing you redirect on.
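To make the parsing cost in (*) concrete, here is a rough sketch of what a balancer has to do before it can route on WS-A information. It uses only the JDK's DOM parser; the class name WsaToExtractor and the sample envelope are made up for illustration, and the namespace assumed is the W3C WS-Addressing 1.0 one (older submissions use different namespace URIs, which a real balancer would also have to recognise).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: pull the wsa:To address out of a SOAP message. */
class WsaToExtractor {
    // Assumption: WS-Addressing 1.0 namespace; other WS-A versions use other URIs.
    static final String WSA_NS = "http://www.w3.org/2005/08/addressing";

    /** Returns the text of the first wsa:To element, or null if there is none. */
    static String extractTo(String soapXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // The balancer parses untrusted input, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(soapXml.getBytes(StandardCharsets.UTF_8)));
            // Simplification: matches wsa:To anywhere in the document, not only in the header.
            NodeList to = doc.getElementsByTagNameNS(WSA_NS, "To");
            return to.getLength() == 0 ? null : to.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse SOAP message", e);
        }
    }

    public static void main(String[] args) {
        String envelope =
            "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\""
            + " xmlns:wsa=\"http://www.w3.org/2005/08/addressing\">"
            + "<soapenv:Header><wsa:To>http://example.org/services/Echo</wsa:To></soapenv:Header>"
            + "<soapenv:Body/></soapenv:Envelope>";
        System.out.println(extractTo(envelope));
    }
}
```

Building a full DOM for every message is the expensive way to do this; a streaming parser that stops after the SOAP header would be cheaper, but either way the balancer has to read into the body of the HTTP request rather than route on the request line alone.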