Posted to scm@geronimo.apache.org by Apache Wiki <wi...@apache.org> on 2005/05/15 22:14:03 UTC

[Geronimo Wiki] Update of "Architecture/ConfigurationManagement" by JeremyBoynes


The following page has been changed by JeremyBoynes:
http://wiki.apache.org/geronimo/Architecture/ConfigurationManagement

New page:
The first thing to realize is that Geronimo configuration is different from the way many other servers work, not just in how it is implemented but in the principles behind how it works. The best analogy I have is the difference between traditional container systems and Inversion of Control architectures: Geronimo is inverted.

In a traditional server, the configuration of the server is well specified by an administrator but the applications themselves are relatively generic. Yes, there will be vendor- and environment-specific information in the application configuration (deployment descriptor), but it is relatively high level.

To get the environment running, the process typically looks something like this:
 * Install application server
 * Configure application server
 * Configure application resources
 * Deploy application
 * Repeat deployment with new versions of the application

This is simple (well, depending on the app server) and fairly intuitive for a single installation. However, it does not scale well to thousands of servers and hundreds of applications.

There are techniques to help with this. For example, most installers support a headless mode or some form of scripting that allows for automation of the installation process; the application server may support a scripting interface that allows for resource configuration; and, of course, full JSR-88 support for application deployment ;-) 

This is OK when the environment is fairly static, but these days enterprises are looking for more dynamic solutions. Rather than dedicate entire clusters to a single application, sized to cope with the estimated peak workload, they are looking to reduce costs by dynamically reallocating resources as workload varies. This ranges from relatively predictable "follow-the-sun" or "follow-the-market" type distribution, where spare capacity overnight is used to supplement online processing, to more complex QoS management, where lower priority applications are reallocated to handle load peaks in higher priority ones.

You can, of course, configure every server to handle every application but that increases the configuration costs. There may also be security issues.

These kinds of issues are relatively new to the computing industry but have been well worked out in others, such as networking and telecom. For example, if you buy a commercial network switch (not a retail unit like a home router), you don't need to install software on it or configure it individually. Instead, you connect it to the network and turn it on; a central configuration management system detects it, validates it, configures it, and starts making it do what it is meant to do. Or, when you add a new service to your cellphone, you don't need to do anything locally; the provider's systems enable it, configure it, and upgrade software as needed. The same goes for your cable box, satellite decoder, and other units deployed on a massive scale.

One design goal for Geronimo was to support just that kind of model for application infrastructure. Enterprise users would be able to deploy Geronimo under a zero-cost license, giving the potential for massive scale deployments; we wanted an architecture that would reduce the cost of configuring and managing that environment.

To do that, we turned the configuration model on its head.

'''Instead of configuring the server we configure the applications.'''

The Geronimo architecture is really very simple, comprising just two key artifacts: the kernel and a configuration bundling mechanism.

The kernel is dead simple, basically providing a mechanism for registering components and resolving references between them (including lifecycle). A raw kernel does nothing except provide a framework for hosting configuration bundles.
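
The kernel's two jobs described above can be sketched roughly as follows. This is only an illustration of the idea (a registry of named components plus reference resolution), not Geronimo's actual kernel API; the class and method names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch: a kernel as a registry of named components
// that resolves declared references between them.
public class MiniKernel {
    private final Map<String, Object> registry = new HashMap<>();

    // Register a component instance under a unique name.
    public void register(String name, Object component) {
        if (registry.containsKey(name)) {
            throw new IllegalStateException("already registered: " + name);
        }
        registry.put(name, component);
    }

    // Resolve a declared reference to another registered component.
    public Object resolve(String name) {
        Object component = registry.get(name);
        if (component == null) {
            throw new NoSuchElementException("unresolved reference: " + name);
        }
        return component;
    }
}
```

A raw kernel like this has no behaviour of its own; everything interesting comes from the components that configuration bundles register into it.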

A configuration bundle is a set of component instances packaged together with a set of declared dependencies on other services. The configuration manager's job is to resolve all those dependencies so the bundle can run. How it does that is its problem and not the bundle's - this is IoC at the packaging level rather than the component level.

Why does this matter? It matters because it changes the way in which applications run. In the traditional model above, the server configuration is static (set by the admin) and the application configuration is dynamic - the runtime model varies depending on the server configuration. This leads to problems in massive deployments as the behaviour of the application may be subtly (or not so subtly) different on different servers.

In Geronimo, the configuration bundle specifies the environment it expects to run in; in return it guarantees that if that environment is present it will run as expected. When an application is loaded into a server, it is the configuration manager's job to ensure that the environment it needs is available before running the application. If it isn't, one of several things can happen, depending on the sophistication of the configuration manager:

 * Nothing. If the environment is not already there then it simply declines to start the application.

 * Local startup. It looks inside the server (specifically a local configuration store and repository) and sees if the missing resources are present. If so, it starts them (recursively) and then goes on to start the application.

 * Global startup. If the resources are not present, then it obtains the missing dependencies from some trusted external source, creates the environment needed by the application and then starts it.
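
The three strategies above can be sketched as one recursive resolution routine. This is a hedged illustration only: the Bundle class, the map-backed "store" and "source", and the method names are all placeholders invented for this example, not Geronimo types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative bundle: an id plus the ids of configurations it depends on.
class Bundle {
    final String id;
    final List<String> dependencies;
    Bundle(String id, List<String> dependencies) {
        this.id = id;
        this.dependencies = dependencies;
    }
}

public class MiniConfigManager {
    private final Map<String, Bundle> localStore;   // local configuration store
    private final Map<String, Bundle> remoteSource; // trusted external source
    private final Set<String> running = new HashSet<>();

    public MiniConfigManager(Map<String, Bundle> localStore,
                             Map<String, Bundle> remoteSource) {
        this.localStore = localStore;
        this.remoteSource = remoteSource;
    }

    // Try to start a bundle, recursively satisfying its dependencies.
    public boolean start(String id) {
        if (running.contains(id)) return true;       // already running
        Bundle bundle = localStore.get(id);
        if (bundle == null) {
            // "Global startup": obtain the bundle from a trusted external source.
            bundle = remoteSource.get(id);
            if (bundle == null) return false;        // "Nothing": decline to start
            localStore.put(bundle.id, bundle);
        }
        // "Local startup": start missing dependencies recursively.
        for (String dep : bundle.dependencies) {
            if (!start(dep)) return false;
        }
        running.add(id);
        return true;
    }

    public boolean isRunning(String id) { return running.contains(id); }
}
```

For example, a webapp bundle that depends on a "jetty" configuration absent from the local store would trigger a fetch from the external source, start "jetty", and only then start the webapp itself.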

In other words, the application's configuration bundle defines the environment it needs to run; when you install that in a Geronimo server, the server's configuration manager will tweak the server's environment to make sure that the application can run.

At the moment the environment specification is fairly simplistic, comprising a parent classloader specification and a set of external code dependencies. Expect this specification model to grow to include other dependencies, such as system services and quality-of-service requirements.
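
In XML form, such an environment specification might look something like the fragment below. To be clear, the element names here are hypothetical, made up for illustration; they are not taken from Geronimo's actual deployment-plan schema.

```xml
<!-- Illustrative sketch only: element names are hypothetical,
     not Geronimo's real schema. -->
<environment>
  <!-- The parent classloader this configuration expects to run under. -->
  <parent-classloader>geronimo/j2ee-server</parent-classloader>
  <!-- External code dependencies to be resolved from a repository. -->
  <dependencies>
    <dependency>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
      <version>1.0.4</version>
    </dependency>
  </dependencies>
</environment>
```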

Just a quick reminder too that when we say "application configuration bundle" we mean a Geronimo Configuration archive. The kernel and configuration manager do not care whether this bundle was created from a raw XML definition (e.g. j2ee-server.xml) or built by some sophisticated configuration builder (e.g. the Jetty or OpenEJB builders).

What this means is that we now have an infrastructure that supports the massive deployment of applications. We can take a user-land application such as a J2EE webapp or Spring application, define the parameters we expect from it (e.g. it runs at this address with this average response time and uses this set of libraries), and bundle it up in a form that can be sent to any server in the network with the assurance that it will run '''as we defined it to.'''

With this assurance, we can then take application management to the next level. Instead of manually installing and configuring servers and applications, we simply specify how we want our applications to behave across the enterprise. We can then hand them to a global resource control system which can automatically determine where and when things run. In this way we meet the goal of deployment on a massive scale whilst controlling the administration cost.