
Posted to dev@geronimo.apache.org by Va...@nokia.com on 2005/10/18 13:47:22 UTC

RE: Clustering - JGroups issues and others

Hello

Here is my 5 cents... I have some comments regarding clustering based on
J-Groups. We tried to use this technology and ran into certain issues
that render it unusable in our case.

Many of the cluster caches/replicators assume that all the information
is propagated to all the nodes in the cluster. Some of the solutions
propagate only keys, however. In any case this approach cannot be used
in sufficiently large clusters, as the rate of updates would eat all the
node capacity, making the cluster unusable.
 
Regarding J-Groups itself: this may be specific to the cluster
facilities in JBoss, but generally J-Groups organizes the nodes into a
list, and every node checks the state of the next one in the chain. The
problem is that in many cases servers fail/disconnect in groups, which
causes two problems: segmentation of the cluster and extremely long
failure detection times. In architectures based on blade technology,
servers shut down in large packs, and it really takes time to detect
several sequentially disconnected servers.
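
To illustrate the detection delay, here is a minimal sketch of a
simplified model, not actual J-Groups code. It assumes that each node
probes only its successor in a ring, and that a monitoring node needs
one full probe timeout to declare its neighbour dead before it moves on
to the next one. The class name, the method name and the 3000 ms timeout
are all hypothetical.

```java
import java.util.Arrays;

// Simplified model of neighbour-based failure detection (an assumption,
// not J-Groups code): nodes form a ring and each live node probes only
// the next node in the ring.
final class RingDetectionModel {

    /**
     * Returns how many probe timeouts pass before every failed node has
     * been reported. A live node detects the dead nodes directly after it
     * one at a time; separate runs of dead nodes are detected in parallel.
     */
    static int timeoutsToDetectAll(boolean[] alive) {
        int n = alive.length;
        int worst = 0;
        for (int i = 0; i < n; i++) {
            if (!alive[i]) {
                continue; // only live nodes can detect anything
            }
            int run = 0; // length of the dead run directly after node i
            int j = (i + 1) % n;
            while (j != i && !alive[j]) {
                run++;
                j = (j + 1) % n;
            }
            worst = Math.max(worst, run);
        }
        return worst;
    }

    public static void main(String[] args) {
        boolean[] cluster = new boolean[16];
        Arrays.fill(cluster, true);
        for (int i = 4; i < 12; i++) {
            cluster[i] = false; // a pack of 8 adjacent blades goes down
        }
        long timeoutMillis = 3000; // hypothetical probe timeout
        System.out.println(timeoutsToDetectAll(cluster) * timeoutMillis + " ms");
    }
}
```

Under these assumptions a pack of k adjacent nodes that fails together
is fully reported only after k timeouts, while k isolated failures are
reported after a single timeout.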

To overcome the problems we ended up with the "star" architecture, where
the central node is responsible for maintaining the list of other nodes.
The availability of the central node itself could be provided with
facilities like Red Hat Cluster Suite or similar (service failover,
floating IPs, etc).
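
A minimal sketch of the central node's bookkeeping could look like the
following. It is an illustration only, not our actual implementation:
it assumes that every node sends periodic heartbeats to the central
node, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical central registry for a "star" membership scheme: every
// node reports to one central node, which alone decides who is alive.
final class CentralMembershipRegistry {
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    CentralMembershipRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records a heartbeat received from the given node. */
    void heartbeat(String nodeId, long nowMillis) {
        lastHeartbeat.put(nodeId, nowMillis);
    }

    /** Returns the nodes whose last heartbeat is within the timeout. */
    Set<String> liveMembers(long nowMillis) {
        Set<String> live = new TreeSet<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (nowMillis - e.getValue() <= timeoutMillis) {
                live.add(e.getKey());
            }
        }
        return live;
    }
}
```

Because each node is tracked independently, a whole pack of servers
going down at once is noticed within one timeout, regardless of how
many servers are in the pack.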

-valeri

Re: Clustering - JGroups issues and others

Posted by Jeff Genender <jg...@savoirtech.com>.

Hi Valeri,

Thanks for your 5 cents.  J-Groups would be interesting to use, but I 
think its licensing is where we may have a problem (LGPL) - others 
please correct me on this if I am not right.

Jeff


Re: Clustering - JGroups issues and others

Posted by Jules Gosnell <ju...@coredevelopers.net>.

Valeri.Atamaniouk@nokia.com wrote:

>Hello
>
>Here is my 5 cents... I have some comments regarding clustering based on
>J-Groups. We were trying to use this technology and came to certain
>points, that render it unusable in our case.
>
>Many of the cluster caches/replicates assume that all the information
>propagated to all the nodes in the cluster. Some of the solutions
>propagate only keys, however. In any case this solution cannot be used
>in sufficiently large clusters as the rate of updates would eat all the
>node capacity making it unusable.
>  
>
This is the dreaded 1->all replication that is a popular implementation
at the moment. See my previous mail about how WADI avoids it, which
gives it a significant scalability advantage over such solutions.
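
As a rough back-of-the-envelope illustration (a sketch, not WADI or
JGroups code), the per-node load of the two approaches can be compared
as below. It assumes that every node produces updates at the same rate
and that the alternative to 1->all is replication to a fixed, small
number of partners; the class and method names are hypothetical.

```java
// Hypothetical helper comparing inbound replication load per node.
final class ReplicationLoad {

    /** 1->all: each node receives every update made by all other nodes. */
    static long inboundFullReplication(int nodes, long updatesPerNodePerSec) {
        return (long) (nodes - 1) * updatesPerNodePerSec;
    }

    /**
     * Fixed partner count: each node sends its updates to `partners` nodes,
     * so on average each node also receives from `partners` nodes.
     */
    static long inboundPartnerReplication(int partners, long updatesPerNodePerSec) {
        return (long) partners * updatesPerNodePerSec;
    }
}
```

With 1->all the inbound load on each node grows linearly with cluster
size, while with a fixed number of partners it stays constant as nodes
are added.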

> 
>Regarding J-Groups itself. Probably that is specific to cluster
>facilities in JBoss, but generally J-Groups organize a list of nodes,
>and every node checks the state of the next one in the chain.
>
I wasn't sure how it worked... interesting ...
We should look into how membership is tracked by ActiveCluster.

> The
>problem is that in many cases servers may fail/disconnect in groups,
>which causes two problems: the segmentation of the cluster 
>
Cluster segmentation is a really tricky issue :-( - do all the segments
then try to arrange themselves into smaller clusters, shifting loads of
state around, or is JGroups smart enough to put all the pieces back
together before passing control back to the application?

>and
>extremely high failure report time, as for architectures based on blade
>technology servers shut down in large packs
>
Do these 'packs' correspond to racks? I have plans (not yet implemented)
for pluggable algorithms that will allow WADI to choose e.g. nodes in
other racks, on other power sources, in other buildings etc. as
replication partners. Otherwise you will lose state in a situation like
this, if you happen to have yours backed up on the node next to you in
the same rack...
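
A pluggable selection policy of this kind might be sketched as follows.
This is purely illustrative and not WADI's API (the feature is not
implemented): the class, the Node type and its rack label are
hypothetical, and the policy shown simply prefers partners outside the
local rack and falls back to the local rack only if it has to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rack-aware policy for choosing replication partners.
final class RackAwarePartnerSelector {

    static final class Node {
        final String id;
        final String rack;

        Node(String id, String rack) {
            this.id = id;
            this.rack = rack;
        }
    }

    /** Picks up to `count` partners, preferring nodes in other racks. */
    static List<Node> choosePartners(Node self, List<Node> candidates, int count) {
        List<Node> chosen = new ArrayList<>();
        // First pass: nodes that do not share a rack with us.
        for (Node n : candidates) {
            if (chosen.size() == count) {
                break;
            }
            if (!n.id.equals(self.id) && !n.rack.equals(self.rack)) {
                chosen.add(n);
            }
        }
        // Second pass: fall back to nodes in our own rack if still short.
        for (Node n : candidates) {
            if (chosen.size() == count) {
                break;
            }
            if (!n.id.equals(self.id) && n.rack.equals(self.rack)) {
                chosen.add(n);
            }
        }
        return chosen;
    }
}
```

The same shape would work for other failure domains (power source,
building) by swapping the rack label for a different attribute.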

> and it really takes time to
>detect several sequentially disconnected servers.
>  
>
What sort of lag are we talking about - a few seconds, or a few tens of 
seconds ?

>To overcome the problems we ended up with the "star" architecture, where
>the central node is responsible for maintaining the list of other nodes.
>The availability of the central node itself could be provided with
>facilities like Red Hat Cluster Suite or similar (service failover,
>floating IPs, etc).
>  
>
Hmmm.. - I understand why you went for this architecture, but I would
prefer to find one that is homogeneous - i.e. we don't need a special,
non-standard configuration for the central node. Deployment is much
easier if every node has the same configuration. Still, this is good
input and has got me thinking in a direction which I had not really
considered before.

Thanks, Valeri,

Jules

>-valeri
>  
>


-- 
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**********************************
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 *
 *    www.coredevelopers.net
 *
 * Open Source Training & Support.
 **********************************/