Posted to user@geronimo.apache.org by Hernan Cunico <hc...@gmail.com> on 2006/01/27 22:17:27 UTC
Re: Clustering Overview - 10.000 feet...
Hi Jules,
I took the liberty of doing some very basic formatting on your doc and have already put it in Confluence (I
hope you won't mind :) )
Here is the link to the *Geronimo Clustering* doc:
http://opensource2.atlassian.com/confluence/oss/display/GERONIMO/Clustering
If you are OK with it I could do some additional formatting and editing to match it to the rest of
the documentation.
Cheers!
Hernan
Jules Gosnell wrote:
> Rather than wait for resolution as to where exactly this document should
> live, I thought I would put this out here before it fossilises on my
> disc, so that people can get on with giving input.
>
> This document is intended to bring together the various threads on
> clustering into a skeleton which we can begin to flesh out. Thus it is
> very incomplete... Please don't take offence at omissions, mistakes,
> gaps in my understanding, or statement of the obvious - these are just the
> first broad brushstrokes, as evolved by the various clustering
> discussions that have happened recently, drawn into a single, slightly
> more orderly document - everything can change...
>
> Until it finds a permanent home, I am happy to look after it and
> endeavour to keep it up to date with any new threads. If you have a
> particular change that you would like made, please let me know and I
> will ensure that it gets in.
>
> I hope this will give birth to a few more threads to fill in sparse
> areas. Then we can hook refs to these back into the doc and continue
> forward....
>
>
> Jules
>
>
> ------------------------------------------------------------------------
>
>
> Geronimo Clustering Overview - 10,000 feet...
>
>
> Clustering is one of the key features that distinguishes an enterprise
> JEE implementation from the rest of the pack. As such it is an
> important requirement for Apache Geronimo.
>
> By using clustering technology to provide a scalable and highly
> available platform for JEE deployment, Apache Geronimo will be able to
> compete on equal terms with existing commercial offerings.
>
> This document will begin the task of:
>
> - enumerating clustering requirements in Geronimo's tiers
> - abstracting out commonality from these
> - cataloguing software available to us that may be used to implement them
> - suggesting a clustering architecture
>
> A 'Cluster' is an architecture that achieves scalability and high
> availability through the arrangement of multiple smaller, cheaper,
> less reliable resources, rather than a single large, expensive,
> extremely reliable one.
>
> Scalability is ensured by decoupling dependencies on shared resources
> so that related tasks may be run concurrently on many machines (nodes)
> in the cluster without interfering with each other. If your
> architecture achieves this, you can scale to service more users by
> just adding more nodes.
>
> High Availability is achieved through redundancy. If one less-reliable
> node fails, you fail over to the next. In this way, the unavailability
> of your system becomes not the sum, but the product of the
> unavailabilities of its constituent nodes (assuming they fail independently).
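
The redundancy argument can be put into numbers. The following is a minimal sketch, assuming that node failures are independent (real clusters only approximate this) and that the system counts as available while at least one node is up. The function name is made up for illustration.

```python
from math import prod

def cluster_availability(node_availabilities):
    """Availability of a cluster that is up while at least one node is up.

    Assumes node failures are independent of each other.
    Each entry is a probability between 0.0 and 1.0.
    """
    # The cluster is down only when every node is down at the same time,
    # so the per-node unavailabilities multiply.
    unavailability = prod(1.0 - a for a in node_availabilities)
    return 1.0 - unavailability
```

For example, two nodes that are each up 99% of the time are both down 0.01 * 0.01 = 0.0001 of the time, so the pair is available 99.99% of the time under these assumptions.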
>
> The presence of state in a cluster frustrates the achievement of both
> of these goals, by being a point of shared contention (many tasks
> may need to read/write the same piece of state at the same time) and
> a point of failure (if state is held in a fragile resource,
> e.g. memory, that is lost, then so is the state, i.e. it is no longer
> available).
>
> Partitioning state can help restore scalability, and making and
> maintaining multiple copies of state (replication) can be used to
> circumvent contention issues. Both solutions lead to further, smaller
> problems, such as ensuring that processes are run in the same partition
> as their state and ensuring consistency of view across multiple copies
> of state. Other solutions and further problems abound.
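
As an illustration of partitioning combined with replication, here is a minimal sketch that maps a piece of state (say, a session id) to a partition and then to a primary node plus backups. This is not how WADI or ActiveSpace actually place state; the function names, the SHA-1 hash and the "next node in the list" backup rule are all assumptions chosen for brevity.

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a state key (e.g. a session id) to a partition index, stably."""
    # A cryptographic digest is used only because it is stable across
    # processes, unlike Python's built-in hash() for strings.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def owners_for(key, nodes, num_partitions, copies=2):
    """Return the nodes holding the primary and backup copies of a key."""
    p = partition_for(key, num_partitions)
    copies = min(copies, len(nodes))
    # The primary is picked from the partition index; backups are simply
    # the following nodes in list order (a hypothetical placement rule).
    return [nodes[(p + i) % len(nodes)] for i in range(copies)]
```

Routing a request to the first node returned by `owners_for` keeps the process in the same partition as its state, and the remaining nodes are where a fail-over would find a copy.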
>
> The number of different uses of state within a cluster precludes the
> possibility of a single effective solution. So we need to devise
> solutions for each use case within Geronimo.
>
> We will enumerate and examine these use cases.
>
>
> Web
> ---
>
> URL: http://java.sun.com/products/servlet/download.html#specs
> Interest Group: Jules Gosnell, Jan Bartel, Jeff Genender, David Jencks,...
>
> Clustering in the web-tier has two points of implementation:
>
> The HTTP Load Balancer:
> - our solution should work with any load-balancer
> - we can expect the LB to support some form of affinity (sticky or persistent sessions).
>
>
> The HttpSession:
> - large numbers of potentially large objects (more data in the tier than can comfortably be held in one node) - needs partitioned cache
> - typically frequently written
> - typically frequently read
> - typically a single consumer/client
> - transactionless
> - only one copy of an HttpSession may be 'active' at any one time
> - multi-threaded
> - pessimistic
> - suggested impl - WADI
>
>
> N.B.: Some other solutions allow the clustering of more than just the
> HttpSession. The spec only requires that data stored in the
> HttpSession is distributable.
>
> Dev Threads
> WADI and Network Partitions: http://www.mail-archive.com/dev%40geronimo.apache.org/msg15855.html
> WADI/AS merger - http://www.mail-archive.com/dev@geronimo.apache.org/msg14749.html
>
>
>
> EJB
> ---
>
> URL: http://java.sun.com/products/ejb/docs.html#specs
> Interest group: David Blevins, Gianny Damour
>
> Clustering in the EJB tier (like the web) has two points of implementation:
>
> The Client/Proxy (equiv to Web Load-balancer):
> - cluster-aware proxies needed
> - can same proxies work with both OpenEJB and IIOP protocols?
>
> The Server:
>
> MDB
> - stateless
>
> SLSB
> - stateless
>
> SFSB
> - large numbers of potentially large objects (more data in the tier than can comfortably be held in one node) - needs partitioned cache
> - transactional
> - typically frequently written
> - typically a single consumer/client
> - typically frequently read
> - single threaded
> - pessimistic
> - suggested impl - WADI
>
> Entity
> - potentially multiple consumers
> - mapped to shared, persistent store
> - transactional
> - read/write frequency variable, pluggable impls required (e.g. RO Beans etc..)
> - distributed db backed cache needed
> - since mapped to db, does not need partitioned cache, data can just be unloaded
> - distributed caching of entity beans: http://www.mail-archive.com/dev%40geronimo.apache.org/msg15697.html
> - suggested impl - ActiveSpace
>
> Dev Threads:
> client stubs and load-balancing: http://www.mail-archive.com/dev@geronimo.apache.org/msg13533.html
> AC in client stub: http://www.mail-archive.com/dev@geronimo.apache.org/msg14756.html
> Entity invalidation...
>
>
> JNDI
> ----
> Interest Group: Rajith Attapattu
> http://java.sun.com/products/jndi/1.2/javadoc/
>
> Apache Directory - http://directory.apache.org/ (may use this?)
>
> - small amounts of small objects
> - typically seldom written
> - typically frequently read
> - typically multiple consumers/clients
> - multi-threaded
> - transactionless
> - prime candidate for straightforward 1->all replication
> - suggested impl - ActiveSpace
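
The access pattern in the list above (seldom written, frequently read, small data) is what makes 1->all replication attractive: every node holds a full copy, so reads are local and only the rare write touches every node. The following in-process toy shows the shape of that trade-off. It is not the ActiveSpace API; the class and method names are invented, and a real implementation would send the write over the cluster rather than loop over local objects.

```python
class Replica:
    """One node's full local copy of the bindings."""

    def __init__(self, name):
        self.name = name
        self.bindings = {}

    def lookup(self, key):
        # Reads are served from the local copy, with no network hop.
        return self.bindings[key]


class ReplicationGroup:
    """Toy 1->all replication across a fixed set of replicas."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def bind(self, key, value):
        # A write is applied on every member, so its cost grows with the
        # number of nodes; this is acceptable when writes are rare.
        for replica in self.replicas:
            replica.bindings[key] = value
```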
>
> Dev Threads:
> ActiveSpace in JNDI: http://www.mail-archive.com/dev@geronimo.apache.org/msg14743.html
> WADI/AS merger - http://www.mail-archive.com/dev@geronimo.apache.org/msg14749.html
>
>
> JMS
> ---
>
> http://java.sun.com/products/jms/docs.html
>
> - impl - ActiveMQ
>
> Would one of the AMQ guys like to look after this area ?
>
> Dev Threads:
> AMQ Clustering: http://www.mail-archive.com/dev%40geronimo.apache.org/msg15717.html
>
>
> Deployment
> ----------
>
> http://jcp.org/en/jsr/detail?id=088
>
> - potentially large objects (perhaps we could just send references)
> - union of tier content smaller than single node capacity
> - seldom written ([un]deployed)
> - frequently read (invoked)
> - transactionless
> - no uniqueness constraint ? (what should we do about creating heterogeneous deployments?)
> - prime candidate for straightforward 1->all replication
> - suggested impl - ActiveSpace
>
> suggestion - 'deployment sets' - groups of nodes that
> share/implement a homogeneous deployment. See "Homogeneous vs
> Heterogeneous Deployments" section.
>
> suggestion - app is deployed on one node, it forms a url to the app
> via its http server and replicates this link to other servers within
> its deployment-set. Each member of this set receives the link, pulls
> down the app and deploys it.
>
> suggestion - could replication be done synchronously and serially, so
> that as the app is deployed on each node it can be sanity checked
> before the distribution continues to the next node ?
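
The synchronous, serial distribution suggested above could look roughly like the sketch below. It is only an outline of the control flow: `deploy` and `sanity_check` are hypothetical callbacks supplied by the caller (nothing in Geronimo is being quoted here), and what to do with nodes that were already deployed when a later one fails is left open.

```python
def rollout(app_url, deployment_set, deploy, sanity_check):
    """Deploy app_url to each node in turn, stopping at the first failure.

    deploy(node, app_url) and sanity_check(node, app_url) are hypothetical
    callbacks; sanity_check returns True when the node looks healthy.
    Returns (deployed_nodes, failed_node); failed_node is None on success.
    """
    deployed = []
    for node in deployment_set:
        # The node pulls the app down from the url and deploys it.
        deploy(node, app_url)
        if not sanity_check(node, app_url):
            # Halt before touching the remaining nodes.
            return deployed, node
        deployed.append(node)
    return deployed, None
```

Because each node is checked before the next one is touched, a bad deployment stops at the first node that fails its check instead of spreading across the whole deployment set.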
>
> WADI,AC,AS & Deployment: http://www.mail-archive.com/dev@geronimo.apache.org/msg15214.html
>
>
> Management/Monitoring
> ---------------------
>
> http://jcp.org/en/jsr/detail?id=077
>
> - still little discussion here
>
> - may need to be centralised
>
> - history - perhaps selected statistics snapshots should be dropped
> into JMS every few secs by each server, then stashed in an rrdb ?
> (WHEN, WHERE, WHAT, VALUE)
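
A statistics snapshot with the four fields named above might be shaped as follows. This is a sketch only: the field types, the JSON encoding and the example statistic name are assumptions, and the actual JMS send and the round-robin database write are deliberately left out.

```python
import json
from typing import NamedTuple

class StatSample(NamedTuple):
    when: float   # timestamp, seconds since the epoch
    where: str    # node that produced the sample
    what: str     # name of the statistic
    value: float  # sampled value

def encode(sample):
    """Serialise a sample as a JSON message body (encoding is an assumption)."""
    return json.dumps(sample._asdict(), sort_keys=True)

def decode(body):
    """Rebuild a sample from a message body produced by encode()."""
    return StatSample(**json.loads(body))
```

Each server would build one `StatSample` per selected statistic every few seconds; a central consumer can then `decode` the messages and store them keyed by `where` and `what`.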
>
>
> POJO:
> -----
>
> JCache - http://jcp.org/en/jsr/detail?id=107
>
> - all of the above - whatever is available to Geronimo should be available to application-space pojos.
> - JCache, and related technologies...
> - suggested impls - ActiveSpace/WADI
>
>
> DB:
> ---
>
> any takers ?
>
>
> Trans-Tier issues:
> ------------------
> Network Partitions: check out 'Totem' thread on dev - not yet archived...
> Application Session: http://www.mail-archive.com/dev@geronimo.apache.org/msg12072.html
> clustering shopping list: http://www.mail-archive.com/dev@geronimo.apache.org/msg10561.html
>
>
> Web Services
> -------------
>
> Interest Group: Rajith Attapattu
>
> As WS moves towards more transport independence, session management
> needs to be decoupled from HTTP.
>
> Suggested impl - WADI/ActiveSpace - more discussion needed here
>
>
>
> Clustering Substrate:
> ---------------------
>
> - suggested impl - ActiveCluster on top of various protocols
>
>
> Auto-wiring
> ------------
>
> - clients need to autolocate nodes (ejb-client->jndi-port, http-load-balancer->http-port etc...)...
> - nodes on same box need to negotiate ownership of per-host resources (e.g. ports)
>
>
> Homogeneous vs Heterogeneous deployments
> -------------------------------------------
>
> - homogeneous cluster is one set of nodes - the universal set
> - heterogeneous cluster is one set (the universal set), with subsets defining internal scopes
>
> Perhaps ActiveCluster could be extended to allow subclusters via the
> same Cluster connection, or we could use multiple cluster
> connections, or AMQ MessageGroups...?
>
>
> WADI/ActiveSpace synergy
> -------------------------
>
> WADI and ActiveSpace are complementary technologies. There is scope
> for API convergence and code reuse here. If the two could coexist
> behind the same API, we might be able to completely abstract the Cache
> impl from the consumer, giving more flexibility and allowing us to
> tailor solutions more closely to particular problems.
>
>
> --------------------------------------------------------------------------------
> Suggested Building Blocks
> --------------------------------------------------------------------------------
>
>
> ActiveMQ
> --------
>
> URL: http://www.activemq.org/
> Interest Group: James Strachan, Hiram Chirino, ...
>
> Geronimo's JMS implementation with pluggable transports including a
> peer:// protocol which allows peers in a cluster to message each other
> directly without the need for a central broker.
>
> ActiveCluster
> -------------
>
> URL: http://activecluster.codehaus.org/
> Interest Group: James Strachan, Hiram Chirino, ...
>
> An API providing basic clustering functionality (specifically membership
> change notification) along with 1->all and 1->1 messaging. Also,
> various impls of this API, the most notable using ActiveMQ.
>
>
> ActiveSpace
> -----------
>
> URL: http://activespace.codehaus.org/
> Interest Group: James Strachan, Hiram Chirino, ...
>
> Provides two abstractions, a JCache style Cache (replicated and/or
> transactional) and a JavaSpaces style Space, for distributed
> computing.
>
>
> WADI
> ----
>
> URL: http://wadi.codehaus.org/
> Interest Group: Jules Gosnell,...
>
> Provides Jetty and Tomcat compatible SessionManagers, a portable
> HttpSession impl and a partitioned distributed cache, for support of
> distributable webapps. OpenEJB SFSB support is also underway.
>
>
> Tomcat Clustering
> -----------------
>
> URL: http://tomcat.apache.org/tomcat-5.0-doc/cluster-howto.html
> Interest Group: Jeff Genender, ...
>
> Clustered HttpSessions for Tomcat (not Jetty).
> 1->All replication, so clusters constrained to a few nodes
>
>
> EVS4J/TOTEM
> ------------
> URL: http://www.bway.net/~lichtner/evs4j.html
> Interest Group: Guglielmo Lichtner
>
> Extended Virtual Synchrony for Java.
> Potentially very useful in cases where we need fast, strictly ordered 1->all messaging.
> Might be integrated via ActiveCluster or ActiveMQ
>
> Introduction: http://www.mail-archive.com/dev%40geronimo.apache.org/msg15644.html
> Totem and ActiveCluster: http://www.mail-archive.com/dev%40geronimo.apache.org/msg15695.html
> Infiniband: http://www.mail-archive.com/dev%40geronimo.apache.org/msg15743.html
>
>
>
>
> Further Discussion
> -------------------
>
>
>
> Further Reading:
> ----------------
>
> http://portal.acm.org/citation.cfm?id=326136
> http://scholar.google.com/url?sa=U&q=http://historical.ncstrl.org/tr/ps/cornellcs/TR99-1726.ps
> http://www.jgroups.org/javagroupsnew/docs/index.html
> http://citeseer.csail.mit.edu/amir95totem.html
> http://docs.codehaus.org/display/WADI/Library
> http://citeseer.ist.psu.edu/32449.html
>
> "Fault Tolerance in Distributed Systems"
> Pankaj Jalote, 1994
> Chapter 7, Section 5, "Degree of Replication"
>
>
>