Posted to dev@geronimo.apache.org by Gianny Damour <gi...@optusnet.com.au> on 2007/11/12 14:37:26 UTC

Distribution and start/stop of clustered deployments

Hi,

I have just checked in support for distributing configurations to
clusters and for managing, i.e. starting and stopping, such clustered
deployments.

I will try to explain how everything hangs together so that people  
can jump in, provide feedback, request enhancements etc.

There is now a secondary configuration store:
org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car?ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car,j2eeType=ConfigurationStore,name=MasterConfigurationStore
This store is aware of the cluster members statically configured by
users (more on this later). Its responsibilities are:
* (un)installation of configurations on cluster members; and
* creation of "master" configurations defining GBeans able to remote  
start and stop a given configuration on a specific cluster member.

Here is what happens when a configuration, e.g. groupId/artifactId/2.0/car,
is distributed to this store:
1. The usual configuration processing is executed. This results in a
backed configuration, i.e. one with its associated GBeans, ready to be
installed by the clustered store.
2. The clustered store uploads the backed configuration to the
registered cluster members, which then install it locally. If the
"remote" installation fails for one of the members, the clustered
store removes the configuration from all the members that have
successfully installed it so far.
3. The clustered store installs the configuration locally.
4. The clustered store creates a master configuration from scratch,
e.g. groupId/artifactId_G_MASTER/2.0/car. This master configuration is
made of GBeans, one per member, each able to remote start or stop the
configuration on its member: when the master configuration starts, its
GBeans start, which in turn remote start the configuration on their
members. So that the master configuration can be started even when not
all members are up, these GBeans "fail" silently when a remote start
fails. However, as these GBeans expose startConfiguration and
stopConfiguration managed operations, it is easy to remote start the
configuration on a given member later via JMX. As expected, when the
master configuration is stopped, its GBeans stop, which in turn remote
stop the configuration on their members.
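
To make the shape of these GBeans more concrete, here is a rough
sketch of what one per-member controller could look like. All names
below are illustrative only; they are not the classes actually
committed to the clustering module:

import org.apache.geronimo.gbean.GBeanLifecycle;
import org.apache.geronimo.kernel.repository.Artifact;

// Illustrative sketch of the per-member controller GBean a master
// configuration is made of; names and wiring are assumptions.
public class RemoteConfigurationController implements GBeanLifecycle {

    /** Illustrative abstraction of one statically configured cluster member. */
    public interface Node {
        void startConfiguration(Artifact configId) throws Exception;
        void stopConfiguration(Artifact configId) throws Exception;
    }

    private final Artifact configId; // e.g. groupId/artifactId/2.0/car
    private final Node node;         // carries the JMX connection info of one member

    public RemoteConfigurationController(Artifact configId, Node node) {
        this.configId = configId;
        this.node = node;
    }

    // Runs when the master configuration starts; a remote failure is
    // swallowed so the master configuration can start even if this
    // member is down.
    public void doStart() {
        try {
            startConfiguration();
        } catch (Exception e) {
            // fail silently; retry later via the managed operation
        }
    }

    // Managed operation exposed via JMX.
    public void startConfiguration() throws Exception {
        node.startConfiguration(configId);
    }

    // Managed operation exposed via JMX.
    public void stopConfiguration() throws Exception {
        node.stopConfiguration(configId);
    }

    // Runs when the master configuration stops.
    public void doStop() throws Exception {
        stopConfiguration();
    }

    public void doFail() {
    }
}

The point is simply that the start path swallows remote failures while
the managed operations rethrow them, so a member that was down can be
caught up later.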

The clustered store relies on a static configuration of the cluster
members. This static configuration MUST be done within
org.apache.geronimo.configs/clustering//car, as nodes must be
registered before any master configuration starts. Indeed, master
configurations are injected with this static cluster configuration so
that they can retrieve the JMX connection info needed to connect to
cluster members and remote start/stop configurations.
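
As a side note on the JMX point above: retrying a start that failed
silently boils down to invoking the startConfiguration managed
operation on the relevant GBean of the master server. A minimal client
sketch, where the service URL, credentials and object name are
placeholders rather than the actual naming scheme:

import java.util.Collections;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Retry a remote start that failed silently when the master
// configuration started. URL, credentials and object name are placeholders.
public class RetryRemoteStart {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://master-host:1099/JMXConnector");
        JMXConnector connector = JMXConnectorFactory.connect(url,
                Collections.singletonMap(JMXConnector.CREDENTIALS,
                        new String[] {"system", "manager"}));
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            // Hypothetical object name of the controller GBean for member "node1".
            ObjectName controller = new ObjectName(
                    "geronimo:j2eeType=GBean,name=node1-controller");
            mbsc.invoke(controller, "startConfiguration",
                    new Object[0], new String[0]);
        } finally {
            connector.close();
        }
    }
}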

At step 3 of the above deployment process, I wrote that the
configuration is installed locally, i.e. into the clustered
configuration store. At this stage, this is pretty much useless;
however, I believe that keeping a carbon copy of the configuration in
the master repository may become quite handy. For instance, within the
master configuration, we could add a GBean able to upload this
configuration on demand to a given member. This way, when you add a
new member to an existing clustered deployment, you simply need to add
a new GBean to remote start/stop the configuration on the new member
and upload the configuration to it via the utility GBean.

Hope the above is clear enough.

I will add comments to the org.apache.geronimo.configs/clustering//car
deployment plan, as it contains new GBean declarations that are not
obvious to understand without reading the code.


Following this, I will move on to the remote start/stop of Geronimo
instances from a single Geronimo server. This should provide a set of
administration GBeans that admin console people may want to leverage
to improve the remote management of Geronimo instances. These GBeans
will talk to GShell instances and send arbitrary Groovy scripts for
execution within them.

Meanwhile, if people are interested in working on the clustering of
Tomcat or OpenEJB via WADI, then please reply, as I am keen and happy
to provide help. One of those two features will be the next thing I
work on after completing the above management enhancement.

Thanks,
Gianny





Re: Distribution and start/stop of clustered deployments

Posted by Gianny Damour <gi...@optusnet.com.au>.
On 21/11/2007, at 7:55 AM, Joe Bohn wrote:

>> Regarding the list-modules command, it lists all the  
>> configurations per target, i.e. configuration store.
>
> As Kevan also pointed out, I think that we need to consider the  
> multiple configuration stores in the console when more than one is  
> present for display and deploy (including plugins) as well as the  
> CLI.  With your changes proposed above I would presume that they  
> would always work with the default configuration store (the first  
> one returned) until they can be updated to support multiple.

Indeed: some portlets will need to be improved so that users can  
select a target configuration store. Based on the committed changes,  
these portlets now distribute to the default store.

Thanks,
Gianny

>
>
>> Thanks,
>> Gianny
>>>
>>> --kevan


Re: Distribution and start/stop of clustered deployments

Posted by Joe Bohn <jo...@earthlink.net>.

Gianny Damour wrote:
> On 17/11/2007, at 12:30 AM, Kevan Miller wrote:
> 
>>
>> On Nov 13, 2007, at 9:36 PM, Gianny Damour wrote:
>>
>>> Hi Joe,
>>>
>>> After some investigations, here is my understanding of problem 1: 
>>> there are two deployments because by default, i.e. when no target is 
>>> specified, the distribute command executes against all the 
>>> configuration stores defined by a Geronimo instance. Note that this 
>>> default behavior is also applied by other deployment components, such 
>>> as the hot directory scanner or the installation portlet. To some 
>>> extent, I believe this default behavior should be changed to deploy 
>>> to only one configuration store. Indeed, I am not convinced that 
>>> users distributing applications would expect their applications to be 
>>> deployed as many times as the number of configuration stores defined 
>>> by the targeted Geronimo server. Also, having the same configuration 
>>> multiple times in a Geronimo instance does not make a lot of sense.
>>>
>>> A potentially better default behavior would be: only distribute to 
>>> the first target returned by DeploymentManager.getTargets(). 
>>> Internally, our implementation of getTargets returns as the first 
>>> target the "default" configuration store.
>>>
>>> Problem 3) is caused by problem 1).
>>>
>>> What do you think?
>>
>> Hi Gianny,
>> That seems like reasonable behavior.
>>
>> I haven't looked closely at this, yet. I'm curious about how 
>> list-modules would work. I'm also wondering about plugin installation, 
>> console support, etc. Have they been updated appropriately to reflect 
>> your multiple config store scheme?
> 
> Hi Joe,
> 
> I created a JIRA to track this change of behavior and will soon commit 
> the change.
> 

Hi Gianny,

Sorry for the delay ... somehow I missed this response.  Yes, I think 
the default configuration store approach makes sense.  Thanks for 
opening the JIRA.  I presume that a deploy would have to specify the 
configuration store if it were other than the default, correct?


> Regarding the list-modules command, it lists all the configurations per 
> target, i.e. configuration store.
> 

As Kevan also pointed out, I think that we need to consider the multiple 
configuration stores in the console when more than one is present for 
display and deploy (including plugins) as well as the CLI.  With your 
changes proposed above I would presume that they would always work with 
the default configuration store (the first one returned) until they can 
be updated to support multiple.


> Thanks,
> Gianny
> 
>>
>> --kevan
> 
> 

Re: Distribution and start/stop of clustered deployments

Posted by Gianny Damour <gi...@optusnet.com.au>.
On 17/11/2007, at 12:30 AM, Kevan Miller wrote:

>
> On Nov 13, 2007, at 9:36 PM, Gianny Damour wrote:
>
>> Hi Joe,
>>
>> After some investigations, here is my understanding of problem 1:  
>> there are two deployments because by default, i.e. when no target  
>> is specified, the distribute command executes against all the  
>> configuration stores defined by a Geronimo instance. Note that  
>> this default behavior is also applied by other deployment  
>> components, such as the hot directory scanner or the installation  
>> portlet. To some extent, I believe this default behavior should be  
>> changed to deploy to only one configuration store. Indeed, I am  
>> not convinced that users distributing applications would expect  
>> their applications to be deployed as many times as the number of  
>> configuration stores defined by the targeted Geronimo server.  
>> Also, having the same configuration multiple times in a Geronimo  
>> instance does not make a lot of sense.
>>
>> A potentially better default behavior would be: only distribute to  
>> the first target returned by DeploymentManager.getTargets().  
>> Internally, our implementation of getTargets returns as the first  
>> target the "default" configuration store.
>>
>> Problem 3) is caused by problem 1).
>>
>> What do you think?
>
> Hi Gianny,
> That seems like reasonable behavior.
>
> I haven't looked closely at this, yet. I'm curious about how list- 
> modules would work. I'm also wondering about plugin installation,  
> console support, etc. Have they been updated appropriately to  
> reflect your multiple config store scheme?

Hi Joe,

I created a JIRA to track this change of behavior and will soon  
commit the change.

Regarding the list-modules command, it lists all the configurations  
per target, i.e. configuration store.
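
To illustrate what "per target" means in JSR-88 terms, the behavior is
roughly equivalent to the following sketch (module type supplied by
the caller, error handling omitted):

import javax.enterprise.deploy.shared.ModuleType;
import javax.enterprise.deploy.spi.DeploymentManager;
import javax.enterprise.deploy.spi.Target;
import javax.enterprise.deploy.spi.TargetModuleID;

// The same configuration can be reported once per configuration store
// (JSR-88 Target) that holds it.
public class ListModulesPerTarget {
    static void list(DeploymentManager manager, ModuleType type) throws Exception {
        for (Target target : manager.getTargets()) {
            System.out.println(target.getName() + ":");
            TargetModuleID[] modules =
                    manager.getAvailableModules(type, new Target[] {target});
            if (modules == null) {
                continue;
            }
            for (TargetModuleID module : modules) {
                System.out.println("  " + module.getModuleID());
            }
        }
    }
}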

Thanks,
Gianny

>
> --kevan


Re: Distribution and start/stop of clustered deployments

Posted by Kevan Miller <ke...@gmail.com>.
On Nov 13, 2007, at 9:36 PM, Gianny Damour wrote:

> Hi Joe,
>
> After some investigations, here is my understanding of problem 1:  
> there are two deployments because by default, i.e. when no target is  
> specified, the distribute command executes against all the  
> configuration stores defined by a Geronimo instance. Note that this  
> default behavior is also applied by other deployment components,  
> such as the hot directory scanner or the installation portlet. To  
> some extent, I believe this default behavior should be changed to  
> deploy to only one configuration store. Indeed, I am not convinced  
> that users distributing applications would expect their applications  
> to be deployed as many times as the number of configuration stores  
> defined by the targeted Geronimo server. Also, having the same  
> configuration multiple times in a Geronimo instance does not make a  
> lot of sense.
>
> A potentially better default behavior would be: only distribute to  
> the first target returned by DeploymentManager.getTargets().  
> Internally, our implementation of getTargets returns as the first  
> target the "default" configuration store.
>
> Problem 3) is caused by problem 1).
>
> What do you think?

Hi Gianny,
That seems like reasonable behavior.

I haven't looked closely at this, yet. I'm curious about how
list-modules would work. I'm also wondering about plugin installation,
console support, etc. Have they been updated appropriately to reflect
your multiple config store scheme?

--kevan

Re: Distribution and start/stop of clustered deployments

Posted by Gianny Damour <gi...@optusnet.com.au>.
Hi Joe,

After some investigations, here is my understanding of problem 1:  
there are two deployments because by default, i.e. when no target is  
specified, the distribute command executes against all the  
configuration stores defined by a Geronimo instance. Note that this  
default behavior is also applied by other deployment components, such  
as the hot directory scanner or the installation portlet. To some  
extent, I believe this default behavior should be changed to deploy  
to only one configuration store. Indeed, I am not convinced that  
users distributing applications would expect their applications to be  
deployed as many times as the number of configuration stores defined  
by the targeted Geronimo server. Also, having the same configuration  
multiple times in a Geronimo instance does not make a lot of sense.

A potentially better default behavior would be: only distribute to  
the first target returned by DeploymentManager.getTargets().  
Internally, our implementation of getTargets returns as the first  
target the "default" configuration store.

Problem 3) is caused by problem 1).

What do you think?

Thanks,
Gianny


On 13/11/2007, at 7:14 AM, Joe Bohn wrote:

> Hi Gianny,
>
> Lots of newbie questions from me.  I'm not even going to pretend  
> that I understand your clustering changes just yet ... so please  
> bear with me.  I just want to point out a few things that I noticed  
> with a single server instance and get your take on them.
>
> 1)  Deploying a simple web app.  I deployed a simple snoop.war web  
> app without a plan to a Jetty server image using the command line.   
> It ended up deploying 2 configurations based upon the output  
> messages.  Based on your description I think this is correct but  
> from a user perspective it seems confusing and wrong.  I hadn't  
> configured anything for clustering and I was only deploying 1  
> thing.  I expected to see results of just 1 configID for the  
> deployed item.  Perhaps everything would have been fine if I had  
> used a plan but I don't think we can assume that users will always  
> use a plan.  Here are the messages that were output:
>     Completed with id default/snoop/1194895785124/war
>     Completed with id default/snoop/1194895785559/war
>     Deployed default/snoop/1194895785124/war to
> org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car? 
> ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/ 
> car,j2eeType=ConfigurationStore,name=MasterConfigurationStore
>     @ /snoop
>     Deployed default/snoop/1194895785559/war to
> org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car? 
> ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/ 
> car,j2eeType=ConfigurationStore,name=ClusterStore
>     @ /snoop
>
> 2) Undeploy?  What would I undeploy if I wanted to undo what I just  
> did?  Do I need to undeploy each configuration individually?  What  
> do you think about leaving the current deploy capability as is and  
> adding new commands/functions when deploying into a cluster so as  
> not to confuse users in the more simple case without clustering?
>
> 3)  Web Console. From the web console instead of 1 configuration I  
> initially expected, or the 2 configurations indicated in the  
> messages at deploy time ... I actually see 3 configurations (2 of  
> them started and 1 stopped ... now I'm even more confused ;- ) ):
>   - default/snoop/1194895785124/war  started
>   - default/snoop/1194895785559/war  started
>   - default/snoop/1194895785702/war  stopped
> Again, I'm not sure how the user is supposed to manage/interpret  
> this. It seems that if we implement these concepts there are a  
> number of comparable console and cli changes that will be necessary  
> to manage the multiple CARs in a clustered scenario.  Is there  
> anyway we can keep the single server use cases intact until we have  
> those capabilities?
>
> 4)  TCK for Jetty is toast.  I started to play with the individual  
> server because when I attempted to run Jetty TCK tests everything  
> was failing with lifeCycleExceptions.  I image that we need to  
> rework some of the tck for this change.  We might be able to avoid  
> that if we can keep the single server use cases unchanged.  If that  
> isn't possible will you be looking into the necessary TCK changes?
>
> Thanks,
> Joe
>
> Gianny Damour wrote:
>> Hi,
>> I have just checked in support for distribution of configurations  
>> to clusters and also management, i.e. start/stop, of such  
>> clustered deployments.
>> I will try to explain how everything hangs together so that people  
>> can jump in, provide feedback, request enhancements etc.
>> There is now a secondary configuration store:
>> org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car? 
>> ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/ 
>> car,j2eeType=ConfigurationStore,name=MasterConfigurationStore  
>> which is a configuration store, which is aware of the cluster  
>> members statically configured by users (more on this later). Its  
>> responsibilities are:
>> * (un)installation of configurations on cluster members; and
>> * creation of "master" configurations defining GBeans able to  
>> remote start and stop a given configuration on a specific cluster  
>> member.
>> Here is what happens when a configuration, e.g. groupId/artifactId/ 
>> 2.0/car, is distributed to this store:
>> 1. The usual configuration processing is executed. This results  
>> into a backed configuration, i.e. with its associated GBeans,  
>> ready to be installed by the clustered store.
>> 2. The clustered store uploads the backed configuration to the  
>> registered cluster members, which subsequently locally install  
>> them. If the "remote" installation fails for one of the members,  
>> then the clustered store removes the configuration from all the  
>> members having successfully installed it so far.
>> 3. The clustered store installs the configuration locally.
>> 4. The clustered store creates from scratch a master  
>> configuration, e.g. groupId/artifactId_G_MASTER/2.0/car. This  
>> master configuration is made of GBeans, one for each member, which  
>> can remote start or stop the configuration on a given member: when  
>> the master configuration starts, its GBeans start, which in turn  
>> remote start the configuration on a given member. In order to be  
>> able to start the master configuration without all the members up,  
>> these GBeans "fail" silently when a remote start fails. However,  
>> as these GBeans expose startConfiguration and stopConfiguration  
>> managed operations, it is pretty easy to remote start a  
>> configuration on a given member later via JMX. As expected, when  
>> the master configuration is stopped, its GBeans stop, which in  
>> turn remote stop the configurations.
>> The clustered store relies on the static configuration of cluster  
>> members. This static configuration MUST be done within  
>> org.apache.geronimo.configs/clustering//car as nodes must be  
>> registered before the start of any master configurations. Indeed,  
>> master configurations are injected with this static cluster  
>> configuration to retrieve the necessary JMX connection info to  
>> connect and cluster members and remote start/stop configurations.
>> At step 3. of the above deployment process, I wrote that the  
>> configuration is locally installed, i.e. into the clustered  
>> configuration store. At this stage, this is pretty much useless;  
>> however, I believe that keeping a carbon-copy of the configuration  
>> in the master repository may become quite handy. For instance,  
>> within the master configuration, we could add a GBean able to  
>> upload on demand this configuration to a given member. This way,  
>> when you add a new member to an existing clustered deployment, you  
>> simply need to add a new GBean to remote start/stop the  
>> configuration on this new member and upload the configuration to  
>> this new member via the utility GBean.
>> Hope the above is clear enough.
>> I will comment the org.apache.geronimo.configs/clustering//car  
>> deployment plan as there are new GBeans declarations not too  
>> obvious to understand without reading the code.
>> Following this, I will move to the remote start/stop of Geronimo  
>> instances from a single Geronimo server. This should provide a set  
>> of administration GBeans admin console people may want to leverage  
>> to improve the remote management of Geronimo instances. These  
>> GBeans will talk to GShell instances and send arbitrary groovy  
>> scripts for execution within GShells.
>> Meanwhile, if people are interested by working on the clustering  
>> of Tomcat or OpenEJB via WADI, then please reply as I am keen and  
>> happy to provide help. One of those two new features will be the  
>> next stuff I will work on after completion of the above management  
>> enhancement.
>> Thanks,
>> Gianny


Re: Distribution and start/stop of clustered deployments

Posted by Gianny Damour <gi...@optusnet.com.au>.
Hi Joe,

Thanks for your feedback.

On 13/11/2007, at 7:14 AM, Joe Bohn wrote:

> 1)  Deploying a simple web app.  I deployed a simple snoop.war web  
> app without a plan to a Jetty server image using the command line.   
> It ended up deploying 2 configurations based upon the output  
> messages.  Based on your description I think this is correct but  
> from a user perspective it seems confusing and wrong.  I hadn't  
> configured anything for clustering and I was only deploying 1  
> thing.  I expected to see results of just 1 configID for the  
> deployed item.  Perhaps everything would have been fine if I had  
> used a plan but I don't think we can assume that users will always  
> use a plan.  Here are the messages that were output:
>     Completed with id default/snoop/1194895785124/war
>     Completed with id default/snoop/1194895785559/war
>     Deployed default/snoop/1194895785124/war to
> org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car? 
> ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/ 
> car,j2eeType=ConfigurationStore,name=MasterConfigurationStore
>     @ /snoop
>     Deployed default/snoop/1194895785559/war to
> org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car? 
> ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/ 
> car,j2eeType=ConfigurationStore,name=ClusterStore
>     @ /snoop

This is indeed not working as expected: firstly, these configurations
have distinct version numbers and they should not; secondly, it seems
that by default deployments are going to the master configuration store.

I do not yet know how to fix the problem of distinct configuration
versions. Regarding the second one, I will improve the Deployer to
allow the explicit configuration of a "default" repository to be used
when a configuration is installed without explicitly specified targets.

>
> 2) Undeploy?  What would I undeploy if I wanted to undo what I just  
> did?  Do I need to undeploy each configuration individually?  What  
> do you think about leaving the current deploy capability as is and  
> adding new commands/functions when deploying into a cluster so as  
> not to confuse users in the more simple case without clustering?


Good question; I realized that my description was only going over the  
installation process and not the undeploy process.

When groupId/artifactId/2.0/car is installed to the master
configuration store, a master configuration is created. This master
configuration has the name groupId/artifactId_G_MASTER/2.0/car - note
the "_G_MASTER" suffix appended to the artifactId. In order to
undeploy the configuration from all the cluster members, you simply
need to undeploy the master configuration, i.e.
groupId/artifactId_G_MASTER/2.0/car. Under the covers, the master
repository will in turn undeploy the configuration
groupId/artifactId/2.0/car from the cluster members.
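
In JSR-88 terms, undeploying from the whole cluster therefore only
touches the master module. Roughly (the module id prefix is hard-coded
for illustration, and a real client would wait for each ProgressObject
to complete before moving on):

import javax.enterprise.deploy.shared.ModuleType;
import javax.enterprise.deploy.spi.DeploymentManager;
import javax.enterprise.deploy.spi.TargetModuleID;

// Undeploy a clustered deployment by undeploying its master configuration
// only; the master store then removes groupId/artifactId/2.0/car from every
// member under the covers.
public class UndeployFromCluster {
    static void undeployMaster(DeploymentManager manager, ModuleType type)
            throws Exception {
        TargetModuleID[] running =
                manager.getRunningModules(type, manager.getTargets());
        if (running == null) {
            return;
        }
        for (TargetModuleID module : running) {
            if (module.getModuleID().startsWith("groupId/artifactId_G_MASTER/")) {
                manager.stop(new TargetModuleID[] {module});
                manager.undeploy(new TargetModuleID[] {module});
            }
        }
    }
}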

Regarding the addition of new deployment commands, I think this is
not necessary, as the current commands already cover the implemented
functionality:
* distribute to a cluster by targeting a configuration store, namely
the master configuration store;
* undeploy from a cluster by undeploying the master configuration; and
* start/stop configurations across a cluster by starting or stopping
the master configuration.

I think that if users have an issue with a specific cluster member,
they can still operate against that member alone by using the usual
commands against the "actual" configuration.


>
> 3)  Web Console. From the web console instead of 1 configuration I  
> initially expected, or the 2 configurations indicated in the  
> messages at deploy time ... I actually see 3 configurations (2 of  
> them started and 1 stopped ... now I'm even more confused ;- ) ):
>   - default/snoop/1194895785124/war  started
>   - default/snoop/1194895785559/war  started
>   - default/snoop/1194895785702/war  stopped
> Again, I'm not sure how the user is supposed to manage/interpret  
> this. It seems that if we implement these concepts there are a  
> number of comparable console and cli changes that will be necessary  
> to manage the multiple CARs in a clustered scenario.  Is there  
> anyway we can keep the single server use cases intact until we have  
> those capabilities?

This is a bug. Everything should be transparent to end-users. For
your information, if you have a look at the MasterConfigurationStore
implementation, you will see that it has logic to filter out the
non-master configurations, thereby ensuring that a configuration is
only listed once across all the repositories defined by a Geronimo
instance.
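
For reference, that filtering boils down to something like the
following simplified illustration - it is not the actual
MasterConfigurationStore code:

import java.util.ArrayList;
import java.util.List;
import org.apache.geronimo.kernel.repository.Artifact;

// Only master configurations, i.e. artifactIds ending with "_G_MASTER",
// are reported when listing, so each clustered deployment shows up once
// even though several stores hold copies of it.
public class MasterConfigurationFilter {
    static List<Artifact> onlyMasters(List<Artifact> allConfigIds) {
        List<Artifact> masters = new ArrayList<Artifact>();
        for (Artifact configId : allConfigIds) {
            if (configId.getArtifactId().endsWith("_G_MASTER")) {
                masters.add(configId);
            }
        }
        return masters;
    }
}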

>
> 4)  TCK for Jetty is toast.  I started to play with the individual  
> server because when I attempted to run Jetty TCK tests everything  
> was failing with lifeCycleExceptions.  I image that we need to  
> rework some of the tck for this change.  We might be able to avoid  
> that if we can keep the single server use cases unchanged.  If that  
> isn't possible will you be looking into the necessary TCK changes?

I believe that improving the Deployer so that the "Local"
configuration store is used when no explicit target is specified will
fix the TCK. I can fix this problem tonight, about 10 hours from now.
However, if you can make the change before then, please go ahead.

Thanks,
Gianny

>
> Thanks,
> Joe
>
> Gianny Damour wrote:
>> Hi,
>> I have just checked in support for distribution of configurations  
>> to clusters and also management, i.e. start/stop, of such  
>> clustered deployments.
>> I will try to explain how everything hangs together so that people  
>> can jump in, provide feedback, request enhancements etc.
>> There is now a secondary configuration store:
>> org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car? 
>> ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/ 
>> car,j2eeType=ConfigurationStore,name=MasterConfigurationStore  
>> which is a configuration store, which is aware of the cluster  
>> members statically configured by users (more on this later). Its  
>> responsibilities are:
>> * (un)installation of configurations on cluster members; and
>> * creation of "master" configurations defining GBeans able to  
>> remote start and stop a given configuration on a specific cluster  
>> member.
>> Here is what happens when a configuration, e.g. groupId/artifactId/ 
>> 2.0/car, is distributed to this store:
>> 1. The usual configuration processing is executed. This results  
>> into a backed configuration, i.e. with its associated GBeans,  
>> ready to be installed by the clustered store.
>> 2. The clustered store uploads the backed configuration to the  
>> registered cluster members, which subsequently locally install  
>> them. If the "remote" installation fails for one of the members,  
>> then the clustered store removes the configuration from all the  
>> members having successfully installed it so far.
>> 3. The clustered store installs the configuration locally.
>> 4. The clustered store creates from scratch a master  
>> configuration, e.g. groupId/artifactId_G_MASTER/2.0/car. This  
>> master configuration is made of GBeans, one for each member, which  
>> can remote start or stop the configuration on a given member: when  
>> the master configuration starts, its GBeans start, which in turn  
>> remote start the configuration on a given member. In order to be  
>> able to start the master configuration without all the members up,  
>> these GBeans "fail" silently when a remote start fails. However,  
>> as these GBeans expose startConfiguration and stopConfiguration  
>> managed operations, it is pretty easy to remote start a  
>> configuration on a given member later via JMX. As expected, when  
>> the master configuration is stopped, its GBeans stop, which in  
>> turn remote stop the configurations.
>> The clustered store relies on the static configuration of cluster  
>> members. This static configuration MUST be done within  
>> org.apache.geronimo.configs/clustering//car as nodes must be  
>> registered before the start of any master configurations. Indeed,  
>> master configurations are injected with this static cluster  
>> configuration to retrieve the necessary JMX connection info to  
>> connect and cluster members and remote start/stop configurations.
>> At step 3. of the above deployment process, I wrote that the  
>> configuration is locally installed, i.e. into the clustered  
>> configuration store. At this stage, this is pretty much useless;  
>> however, I believe that keeping a carbon-copy of the configuration  
>> in the master repository may become quite handy. For instance,  
>> within the master configuration, we could add a GBean able to  
>> upload on demand this configuration to a given member. This way,  
>> when you add a new member to an existing clustered deployment, you  
>> simply need to add a new GBean to remote start/stop the  
>> configuration on this new member and upload the configuration to  
>> this new member via the utility GBean.
>> Hope the above is clear enough.
>> I will comment the org.apache.geronimo.configs/clustering//car  
>> deployment plan as there are new GBeans declarations not too  
>> obvious to understand without reading the code.
>> Following this, I will move to the remote start/stop of Geronimo  
>> instances from a single Geronimo server. This should provide a set  
>> of administration GBeans admin console people may want to leverage  
>> to improve the remote management of Geronimo instances. These  
>> GBeans will talk to GShell instances and send arbitrary groovy  
>> scripts for execution within GShells.
>> Meanwhile, if people are interested by working on the clustering  
>> of Tomcat or OpenEJB via WADI, then please reply as I am keen and  
>> happy to provide help. One of those two new features will be the  
>> next stuff I will work on after completion of the above management  
>> enhancement.
>> Thanks,
>> Gianny


Re: Distribution and start/stop of clustered deployments

Posted by Joe Bohn <jo...@earthlink.net>.
Hi Gianny,

Lots of newbie questions from me.  I'm not even going to pretend that I 
understand your clustering changes just yet ... so please bear with me. 
  I just want to point out a few things that I noticed with a single 
server instance and get your take on them.

1)  Deploying a simple web app.  I deployed a simple snoop.war web app 
without a plan to a Jetty server image using the command line.  It ended 
up deploying 2 configurations based upon the output messages.  Based on 
your description I think this is correct but from a user perspective it 
seems confusing and wrong.  I hadn't configured anything for clustering 
and I was only deploying 1 thing.  I expected to see results of just 1 
configID for the deployed item.  Perhaps everything would have been fine 
if I had used a plan but I don't think we can assume that users will 
always use a plan.  Here are the messages that were output:
     Completed with id default/snoop/1194895785124/war
     Completed with id default/snoop/1194895785559/war
     Deployed default/snoop/1194895785124/war to
 
org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car?ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car,j2eeType=ConfigurationStore,name=MasterConfigurationStore
     @ /snoop
     Deployed default/snoop/1194895785559/war to
 
org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car?ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car,j2eeType=ConfigurationStore,name=ClusterStore
     @ /snoop

2) Undeploy?  What would I undeploy if I wanted to undo what I just did? 
  Do I need to undeploy each configuration individually?  What do you 
think about leaving the current deploy capability as is and adding new 
commands/functions when deploying into a cluster so as not to confuse 
users in the more simple case without clustering?

3)  Web Console. From the web console instead of 1 configuration I 
initially expected, or the 2 configurations indicated in the messages at 
deploy time ... I actually see 3 configurations (2 of them started and 1 
stopped ... now I'm even more confused ;- ) ):
   - default/snoop/1194895785124/war  started
   - default/snoop/1194895785559/war  started
   - default/snoop/1194895785702/war  stopped
Again, I'm not sure how the user is supposed to manage/interpret this. 
It seems that if we implement these concepts there are a number of 
comparable console and cli changes that will be necessary to manage the 
multiple CARs in a clustered scenario.  Is there any way we can keep the 
single server use cases intact until we have those capabilities?

4)  TCK for Jetty is toast.  I started to play with the individual 
server because when I attempted to run Jetty TCK tests everything was 
failing with lifeCycleExceptions.  I imagine that we need to rework some 
of the TCK for this change.  We might be able to avoid that if we can 
keep the single server use cases unchanged.  If that isn't possible, will 
you be looking into the necessary TCK changes?

Thanks,
Joe

Gianny Damour wrote:
> Hi,
> 
> I have just checked in support for distribution of configurations to 
> clusters and also management, i.e. start/stop, of such clustered 
> deployments.
> 
> I will try to explain how everything hangs together so that people can 
> jump in, provide feedback, request enhancements etc.
> 
> There is now a secondary configuration store:
> org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car?ServiceModule=org.apache.geronimo.configs/clustering/2.1-SNAPSHOT/car,j2eeType=ConfigurationStore,name=MasterConfigurationStore 
> 
> which is a configuration store, which is aware of the cluster members 
> statically configured by users (more on this later). Its 
> responsibilities are:
> * (un)installation of configurations on cluster members; and
> * creation of "master" configurations defining GBeans able to remote 
> start and stop a given configuration on a specific cluster member.
> 
> Here is what happens when a configuration, e.g. 
> groupId/artifactId/2.0/car, is distributed to this store:
> 1. The usual configuration processing is executed. This results into a 
> backed configuration, i.e. with its associated GBeans, ready to be 
> installed by the clustered store.
> 2. The clustered store uploads the backed configuration to the 
> registered cluster members, which subsequently locally install them. If 
> the "remote" installation fails for one of the members, then the 
> clustered store removes the configuration from all the members having 
> successfully installed it so far.
> 3. The clustered store installs the configuration locally.
> 4. The clustered store creates from scratch a master configuration, e.g. 
> groupId/artifactId_G_MASTER/2.0/car. This master configuration is made 
> of GBeans, one for each member, which can remote start or stop the 
> configuration on a given member: when the master configuration starts, 
> its GBeans start, which in turn remote start the configuration on a 
> given member. In order to be able to start the master configuration 
> without all the members up, these GBeans "fail" silently when a remote 
> start fails. However, as these GBeans expose startConfiguration and 
> stopConfiguration managed operations, it is pretty easy to remote start 
> a configuration on a given member later via JMX. As expected, when the 
> master configuration is stopped, its GBeans stop, which in turn remote 
> stop the configurations.
> 
> The clustered store relies on the static configuration of cluster 
> members. This static configuration MUST be done within 
> org.apache.geronimo.configs/clustering//car as nodes must be registered 
> before the start of any master configurations. Indeed, master 
> configurations are injected with this static cluster configuration to 
> retrieve the necessary JMX connection info to connect and cluster 
> members and remote start/stop configurations.
> 
> At step 3. of the above deployment process, I wrote that the 
> configuration is locally installed, i.e. into the clustered 
> configuration store. At this stage, this is pretty much useless; 
> however, I believe that keeping a carbon-copy of the configuration in 
> the master repository may become quite handy. For instance, within the 
> master configuration, we could add a GBean able to upload on demand this 
> configuration to a given member. This way, when you add a new member to 
> an existing clustered deployment, you simply need to add a new GBean to 
> remote start/stop the configuration on this new member and upload the 
> configuration to this new member via the utility GBean.
> 
> Hope the above is clear enough.
> 
> I will comment the org.apache.geronimo.configs/clustering//car 
> deployment plan as there are new GBeans declarations not too obvious to 
> understand without reading the code.
> 
> 
> Following this, I will move to the remote start/stop of Geronimo 
> instances from a single Geronimo server. This should provide a set of 
> administration GBeans admin console people may want to leverage to 
> improve the remote management of Geronimo instances. These GBeans will 
> talk to GShell instances and send arbitrary groovy scripts for execution 
> within GShells.
> 
> Meanwhile, if people are interested by working on the clustering of 
> Tomcat or OpenEJB via WADI, then please reply as I am keen and happy to 
> provide help. One of those two new features will be the next stuff I 
> will work on after completion of the above management enhancement.
> 
> Thanks,
> Gianny
> 
> 
> 
> 
> 

Re: Distribution and start/stop of clustered deployments

Posted by Jeff Genender <jg...@savoirtech.com>.

Gianny Damour wrote:
> We are finally speaking the same language :).
> 

We have always understood each other ;-)

I'll bet you didn't know I am in Paris right now...did ya? ;-)

Jeff

> Agreed: this would be a nice enhancement.
> 
>> Jeff

Re: Distribution and start/stop of clustered deployments

Posted by Gianny Damour <gi...@optusnet.com.au>.
On 13/11/2007, at 8:01 PM, Jeff Genender wrote:

> Gianny Damour wrote:
>> I hope we are not talking about the same thing. I am talking about a
>> "deployment time" constraint and not a "runtime" constraint mandating
>> that all the servers are reachable when an application is *deployed*.
>> FWIW, such a constraint was also defined by WebLogic 7.0; from  
>> WebLogic
>> 8.x+, this constraint was subsequently relaxed in order to allow the
>> deployment to a sub-set of the cluster members - however, users can
>> still enforce it if they want and partial deployments is not
>> recommended. As previously said, it is trivial to implement a  
>> GBean to
>> distribute a backed configuration to a server, which was not  
>> reachable
>> upon deployment.
>
> Yes I mean deployment time...
>
> I don't see a difference.  Yes...I think a switch for "allor  
> nothing" or
> "deploy to those who can" would be a good feature to have.
>

We are finally speaking the same language :).

Agreed: this would be a nice enhancement.

> Jeff


Re: Distribution and start/stop of clustered deployments

Posted by Jeff Genender <jg...@apache.org>.

Gianny Damour wrote:
> I hope we are not talking about the same thing. I am talking about a
> "deployment time" constraint and not a "runtime" constraint mandating
> that all the servers are reachable when an application is *deployed*.
> FWIW, such a constraint was also defined by WebLogic 7.0; from WebLogic
> 8.x+, this constraint was subsequently relaxed in order to allow the
> deployment to a sub-set of the cluster members - however, users can
> still enforce it if they want and partial deployments is not
> recommended. As previously said, it is trivial to implement a GBean to
> distribute a backed configuration to a server, which was not reachable
> upon deployment.

Yes I mean deployment time...

I don't see a difference.  Yes...I think a switch for "all or nothing" or
"deploy to those who can" would be a good feature to have.

Jeff

Re: Distribution and start/stop of clustered deployments

Posted by Gianny Damour <gi...@optusnet.com.au>.
On 13/11/2007, at 4:42 PM, Jeff Genender wrote:

> Gianny Damour wrote:
>> You can successfully "distribute" when all the configured cluster
>> members are running. If one of them is down, then the installation
>> fails. This seems to be a typical scenario - at least based on the
>> clustered deployments I have been working with.
>
> Hmmm...I have found this to be different...
>
> I have found that if 1 node in the cluster is bad, it is removed from
> the general cluster array (i.e. look at heartbeats and the
> joining/leaving of a general cluster).
>
> This is a valid use case.  The 1 server could have a problem with
> it...such as network, hardware, etc.  Not clustering due to a bad  
> server
> would defeat the purpose of providing for fail over.

I think we are talking about different things. I am talking about a
"deployment time" constraint, not a "runtime" constraint, mandating
that all the servers are reachable when an application is *deployed*.
FWIW, such a constraint was also defined by WebLogic 7.0; from
WebLogic 8.x+ it was relaxed in order to allow deployment to a subset
of the cluster members - however, users can still enforce it if they
want, and partial deployments are not recommended. As previously said,
it is trivial to implement a GBean to distribute a backed
configuration to a server that was not reachable at deployment time.

Thanks,
Gianny


>
>> Hence this is the
>> simplest implementation possible for this initial stab of clustered
>> deployment. Furthermore, as I explained in my email, there is a  
>> carbon
>> copy of the already backed configuration within the master-repository
>> and it is trivial to: either improve the GBean in charge of the  
>> remote
>> control of configurations; or to add new GBeans in order to
>> automatically upload this carbon copy to cluster members which  
>> were not
>> running upon installation.
>>
>> Also, you still need to honor my request for heads-up :). If you are
>> working on clustering, then could you please provide some headlines?
>>
>
> I am working on the EJB clustering and hope that we can both  
> communicate
> on the list when we are committing in this area.
>
> Jeff
>


Re: Distribution and start/stop of clustered deployments

Posted by Jeff Genender <jg...@apache.org>.

Gianny Damour wrote:
> You can successfully "distribute" when all the configured cluster
> members are running. If one of them is down, then the installation
> fails. This seems to be a typical scenario - at least based on the
> clustered deployments I have been working with. 

Hmmm...I have found this to be different...

I have found that if 1 node in the cluster is bad, it is removed from
the general cluster array (i.e. look at heartbeats and the
joining/leaving of a general cluster).

This is a valid use case.  The 1 server could have a problem with
it...such as network, hardware, etc.  Not clustering due to a bad server
would defeat the purpose of providing for fail over.

> Hence this is the
> simplest implementation possible for this initial stab of clustered
> deployment. Furthermore, as I explained in my email, there is a carbon
> copy of the already backed configuration within the master-repository
> and it is trivial to: either improve the GBean in charge of the remote
> control of configurations; or to add new GBeans in order to
> automatically upload this carbon copy to cluster members which were not
> running upon installation.
> 
> Also, you still need to honor my request for heads-up :). If you are
> working on clustering, then could you please provide some headlines?
> 

I am working on the EJB clustering and hope that we can both communicate
on the list when we are committing in this area.

Jeff


Re: Distribution and start/stop of clustered deployments

Posted by Gianny Damour <gi...@optusnet.com.au>.
On 13/11/2007, at 4:35 AM, Jeff Genender wrote:

>
>
> Gianny Damour wrote:
>> 2. The clustered store uploads the backed configuration to the
>> registered cluster members, which subsequently locally install  
>> them. If
>> the "remote" installation fails for one of the members, then the
>> clustered store removes the configuration from all the members having
>> successfully installed it so far.
>
> So if one server fails, the clustering is fully disabled?  Can you
> please explain if I got this right?  If so, this seems a bit heavy
> handed.  I would more expect that particular server to be removed from
> the cluster as opposed to shut down everything.

You can successfully "distribute" when all the configured cluster
members are running. If one of them is down, then the installation
fails. This seems to be a typical scenario - at least based on the
clustered deployments I have been working with - hence this is the
simplest possible implementation for this initial stab at clustered
deployment. Furthermore, as I explained in my email, there is a carbon
copy of the already backed configuration within the master repository,
and it is trivial either to improve the GBean in charge of the remote
control of configurations, or to add new GBeans to automatically
upload this carbon copy to cluster members that were not running at
installation time.

Also, you still need to honor my request for heads-up :). If you are  
working on clustering, then could you please provide some headlines?

Thanks,
Gianny


>
> Jeff


Re: Distribution and start/stop of clustered deployments

Posted by Jeff Genender <jg...@apache.org>.

Gianny Damour wrote:
> 2. The clustered store uploads the backed configuration to the
> registered cluster members, which subsequently locally install them. If
> the "remote" installation fails for one of the members, then the
> clustered store removes the configuration from all the members having
> successfully installed it so far.

So if one server fails, the clustering is fully disabled?  Can you
please explain if I got this right?  If so, this seems a bit
heavy-handed.  I would rather expect that particular server to be
removed from the cluster than have everything shut down.

Jeff