Posted to dev@avalon.apache.org by Berin Loritsch <bl...@apache.org> on 2001/10/12 18:27:30 UTC

[RT] Component Tracking and Environment Exposure

I have been pondering for a while the process of exposing
information during system debugging and tuning, but with
minimal to no overhead during runtime.  This is something
that I have been thinking about for a few months now, and
thanks to an article in "Journal of Object Oriented
Programming" (JOOP) I think I have a workable plan.  The
article in question is from the October/November 2001
issue (Vol. 14, No. 4), "Tracking Software Components",
pp. 13-22, written by Jerry Gao, Ph.D., Eugene Y. Zhu,
and Simon Shim, Ph.D.

With the bibliography out of the way, let me describe the
problem:  System administrators need the ability to tune
Phoenix and Excalibur components, but information such as
pool sizes is well hidden.  Developers, in turn, need
authoritative debugging and tracing information so that
they can be assured that everything is behaving as expected.

The article goes into more detail due to the fact that it
is dealing with EJBs, JavaBeans, and other automagic
components where their lifecycle is not well publicized or
understood.  This is exacerbated by the fact that different
vendors don't always follow the spec as closely as they
should and developers try to outsmart the system.  For our
case, we have a well defined lifecycle for components and
strong contracts surrounding them.

Avalon components have strong contracts and well understood
lifecycles.  This aids development, and the developer rarely
feels (or should rarely feel) as if he needs to outsmart the
system.  In all honesty, the framework is not that difficult,
and is fairly easy to grasp.  We do need the ability to
track that the environment is abiding by the rules (sort of
a run-time verification), and we need the ability to expose
normally hidden information during product development and
initial deployment.

Run-time verification simply receives events for a Component
that signify the Component is going through its state/stage
transitions according to the contracts.  It should provide
some sort of notification if there is a conflict.  Unfortunately
this type of tracking has an associated overhead.  The overhead
can be as expensive as 40% or more, depending on how much
tracking is done.  For this type of tracking the only thing I
can suggest with any confidence is AspectJ.  This allows runtime
aspects to be placed on a system.  The type of checking is
standard across the board, and it does not require embedding
hand-written code in our components.  It has the added benefit
that the Aspects can be removed for production, which improves
response times for the system.
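
As a rough illustration of the run-time verification idea, here is a
minimal sketch of a checker that receives stage events and records a
conflict when they arrive out of order.  The class name and the stage
list are assumptions made for this example; the list is deliberately
simplified and is not the complete Avalon lifecycle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: checks that lifecycle events arrive in contract order. */
final class LifecycleVerifier {
    /** Simplified, assumed stage list; not the complete Avalon lifecycle. */
    enum Stage { CREATED, CONFIGURED, INITIALIZED, STARTED, STOPPED, DISPOSED }

    private Stage current = Stage.CREATED;
    private final List<String> conflicts = new ArrayList<>();

    /** Records a transition; a stage may be skipped but never revisited. */
    void onStage(Stage next) {
        if (next.ordinal() <= current.ordinal()) {
            // Conflict: keep the last valid stage and remember the violation.
            conflicts.add("illegal transition " + current + " -> " + next);
        } else {
            current = next;
        }
    }

    /** All conflicts seen so far, in the order they were reported. */
    List<String> conflicts() { return conflicts; }
}
```

Whatever feeds onStage() (an aspect, a proxy, or the container itself)
is a separate concern; the checker only cares about the order.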

Information Exposing is much more specialized, and needs its
own set of contracts.  There are three things we must ensure
in order to expose information:

1) It is only done with the administrator's explicit approval.
2) It is switched off during normal runtime, so that it causes only
   minimal performance degradation (on the order of microseconds or less).
3) It is only exposed to an authorized client.

This gives us three distinct challenges to overcome.  The first
is relatively easy, as it can be a switch set by the administrator
to globally influence Avalon.  That means that it can be a switch
in a configuration file, or something similar.  The exact
mechanism will become evident after we decide on a final approach.

Challenge number 2 can be overcome by a static final boolean flag
that is initialized by the administrator's switch.  That way, the
information publishing is only performed if it is turned on.  This
does have the side effect of only being switchable when the class
is loaded in a new classloader.  However, this may be desirable
as system tuning and running are two different concerns, and not
too many people I personally know make a test system live without
restarting it in a new environment anyway.
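
A minimal sketch of such a flag, assuming the administrator's switch
is delivered as a system property.  The property name
"avalon.trace.enabled" and the class are hypothetical and only serve
the example.

```java
/** Hypothetical sketch of a tracing switch fixed at class-load time. */
final class TraceSwitch {
    // Assumed property name; read once, when this class is initialized.
    static final boolean ENABLED = Boolean.getBoolean("avalon.trace.enabled");

    private static int published = 0;

    /** Publishes a pool size only when the switch was on at class load. */
    static void publishPoolSize(String pool, int size) {
        if (ENABLED) {
            published++;
            System.out.println("pool " + pool + " size=" + size);
        }
    }

    /** Number of values actually published so far. */
    static int publishedCount() { return published; }
}
```

Because the field is static and final, changing the property later has
no effect until the class is loaded again, which is the side effect
described above.  The cost when disabled is a single branch on a value
that never changes.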

Challenge number 3 is more difficult to overcome.  There has to
be some sort of handshake to authenticate and authorize a client.
Dynamically generated magic keys might be an attractive approach
here.  It is basically how X Window System servers authenticate
clients:  they have some sort of magic key value that is used as
a shared secret.  Again, this may
be overthinking it a bit.  We have to consider the type of
information we would be publishing, and then determine if it is
worth it.  Another solution is blind trust in the installed jars,
and implement the Jar services solution (requiring JDK 1.3+).
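
To make the magic key idea concrete, here is a minimal sketch under
the assumption that the server generates a fresh random key at startup
and the administrator hands it to the client out of band.  The MagicKey
class is hypothetical; the JDK calls it uses (SecureRandom, Base64,
MessageDigest.isEqual) are standard in current JDKs.

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical sketch of a per-run shared secret ("magic key"). */
final class MagicKey {
    private final byte[] secret = new byte[16];

    MagicKey() {
        new SecureRandom().nextBytes(secret); // fresh key for every server start
    }

    /** Text form the administrator would copy to the client out of band. */
    String encoded() {
        return Base64.getEncoder().encodeToString(secret);
    }

    /** Checks a presented key without an early-exit byte comparison. */
    boolean matches(String presented) {
        byte[] candidate;
        try {
            candidate = Base64.getDecoder().decode(presented);
        } catch (IllegalArgumentException e) {
            return false; // not valid Base64, so it cannot be our key
        }
        return MessageDigest.isEqual(secret, candidate);
    }
}
```

This only covers authentication by shared secret; it does nothing to
protect the key or the traffic on the wire.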

Let me provide an example of possible information that would be
shared with the administrator:

* Pool size information--used to determine the most efficient
  pooling sizes for the particular pooled components or datasources.

* Role to Component mapping--used to verify that the expected
  component is used for the role.  By extension, this would apply
  to Blocks and Services as well.

The first example is relatively harmless, but the second example
provides a possible security hole.  The problem here is that when
a security hole is discovered in a Block or Component, you don't
want to advertise that you are using it.  While this is obscurity,
and not security, it does minimize potential attacks until you
are able to plug the hole.  Again, getting back to the ability
of exposing the information, when a security hole becomes known
it would allow the administrator the ability to verify if they
have the affected component or not.  Hence we have a catch-22:
damned if we do, damned if we don't expose the info.

This means we need a secure method of communicating the information
to an external client.  We also have to consider different types
of environments.  For example, Cocoon lives in a Servlet environment,
but may need to expose information to the administrator.  Phoenix
is a known environment, and so this facility would be easy to embed.
We can go from something as simple as a shared secret (*MAGIC KEY*),
or as serious as full PKI (using JSSE for the TLS connections).

The demands for such a system are that there should only be one (1)
client connected at a time, and that the client be known to the server.
This does in effect rule out the Jar services approach.  One practical
requirement is that the client needs to be able to be detached from
the server environment.  JSSE does provide automatic handshaking so
that would minimize the developer effort (I have implemented the client
side of PKI sessions with Apache JMeter).  That requirement would
in turn require the messages to be sent in a structured and known
format.

Is this something that is worth the time pursuing?  I think so, but
it would have to be a community developed approach.

Keep in mind that Logging, while vital, does not necessarily provide
the functionality we are looking for.  It also does not allow us to
isolate a specific class of information for quick retrieval and ad
hoc querying.

---------------------------------------------------------------------
To unsubscribe, e-mail: avalon-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: avalon-dev-help@jakarta.apache.org

Re: [RT] Component Tracking and Environment Exposure

Posted by Peter Donald <do...@apache.org>.

Kool discussion. I actually had to stop and think a few times ;)

On Mon, 15 Oct 2001 23:37, Berin Loritsch wrote:
> * State Trace: Tracks the object states or data states in a component.
>
> * Event Trace: Records the events and sequences occurring in a component.

In a lot of ways I see events and states as two sides of the one coin. Events 
could essentially be classified as either notifications or transitions to 
another state. Would you agree with that? 

> Part of the reason I mentioned AspectJ is that it can do the Operational
> tracing for Components we have no control over.  By adding a tracking
> package to Framework, and an interface for a Tracer object, we can have
> another project work on the Tracer implementation.  That way the mechanism
> to proliferate the tracing methods is a completely separate concern.
> We do have to manage how it is plugged in, but that is a later concern.

Perhaps. AspectJ is a really kool toolkit but I fear the added complexity it 
will bring. 

> > It is definitely something that we should be doing. However I believe it
> > should be built on JMX. While JMX may not be the most elegant API, it is
> > a standard and for 99% of time it is good-enough. (And I think we can get
> > around the licensing issues by reusing parts of enhydra).
>
> Keep in mind that we want to expose the ability to Component writers.  The
> Tracer implemented in Phoenix should most definitely be implemented with
> JMX. Especially since we started the work with it already.  However, there
> are different clients with different needs.  For example, Cocoon would like
> a mechanism so that when someone goes to the Cocoon Status page, they would
> like to have cache sizes and pool sizes included.  With a generic Tracer
> interface, Cocoon can have a Tracer implementation that feeds the
> StatusGenerator with the information it wants to expose.  It would even
> be able to add in performance data to find bottlenecks.

I can't see how you would not want to use JMX inside Cocoon. StatusGenerator 
roughly corresponds to the notion of an agent, albeit it would probably be 
specialized for Cocoon-specific data. It would be simple to have the various 
Cocoon components exposed as MBeans.

-- 
Cheers,

Pete

---------------------------------------------------
For every complex problem there is a solution that 
is simple, neat and wrong
---------------------------------------------------

Re: [RT] Component Tracking and Environment Exposure

Posted by Berin Loritsch <bl...@apache.org>.

Peter Donald wrote:
> 
> On Sat, 13 Oct 2001 02:27, Berin Loritsch wrote:
> > I have been pondering for a while the process of exposing
> > information during system debugging and tuning, but with
> > minimal to no overhead during runtime.  This is something
> > that I have been thinking about for a few months now, and
> > thanks to an article in "Journal of Object Oriented
> > Programming" (JOOP) I think I have a workable plan.  The
> > article in question is from the October/November 2001
> > issue (Vol. 14, No. 4), "Tracking Software Components",
> > pp. 13-22 written by Jerry Gao, Ph.D., Eugene Y. Zhu,
> > and Simon Shim, Ph.D.
> 
> I don't suppose there's a Web ref for that?

http://www.joopmag.com/html/from_pages/article.asp?id=5121&mon=10&yr=2001

P.S.  JOOP only publishes SOME of its articles on the web.

> > With the bibliography out of the way, let me describe the
> > problem:  System administrators need the ability to tune
> > Phoenix and Excalibur components but information such as
> > pool sizes are well hidden, and Developers need
> > authoritative debugging and tracing information so that
> > they can be assured that everything is behaving as expected.
> 
> Operators also need tracing information usually for monitoring things etc.

This is true.  That is why the system has to be done in such
a manner that when the client is connected, it receives all
the messages.  When the client is not connected, the messages
must be destroyed--or even better never sent.

> > Run-time verification simply receives events for a Component
> > that signify the Component is going through its state/stage
> > transitions according to the contracts.  It should provide
> > some sort of notification if there is a conflict.  Unfortunately
> > this type of tracking has an associated overhead with it.
> > The overhead can be as expensive as 40% or more depending on how
> > much.  For this type of tracking the only thing I can suggest
> > with any type of confidence is AspectJ.
> 
> At one stage you also suggested a Design By Contract tool that allowed you to
> force pre, post and immutable conditions. That could be useful if the system
> was appropriately designed.

Yes, and the DBC tool I suggested would be decent--but it does incur a heavy
run-time price when it is engaged.  The DBC tool would only be used during
validation and verification of the component.  Again, DBC does require some
important constraints.  For instance, we would want it applied to the work
interface so that the Component implementation can be verified according to
the known contracts surrounding that interface.

The thing that DBC cannot offer us is testing degradation of the Component's
stability or performance over time.  This is a different type of tracking.

> > This allows runtime
> > aspects to be placed on a system.  The type of checking is
> > standard across the board and it does not require embedding
> > hand code in our components.  It has the added benefit that
> > the Aspects can be removed in the run time causing an increase
> > in response times for the system.
> 
> I guess I have a different opinion on this. I think that most of the
> monitoring and verification code adds such a minimal overhead but has such a
> positive value that we shouldn't mind putting it in all the time.

The tracking systems outlined in the JOOP article had a 40% overhead across
the board.  I consider that a very heavy cost.  There was a nice little
table to explain the problem domain.  The meat of the table follows:

Tracking        | Framework-Based | Automatic Code | Automatic Component
Aspects         | Code Insertion  | Insertion      | Wrapping
----------------+-----------------+----------------+--------------------
Source Code     | Needed          | Needed         | Not Needed
----------------+-----------------+----------------+--------------------
Code Separation | No              | No             | Yes
----------------+-----------------+----------------+--------------------
Overhead        | High            | Low            | Low
----------------+-----------------+----------------+--------------------
Complexity      | Low             | Very High      | High
----------------+-----------------+----------------+--------------------
Flexibility     | High            | Low            | Low
----------------+-----------------+----------------+--------------------
Applicability   | All types       | OP trace,      | OP trace,
                |                 | Perf. trace    | Perf. trace
----------------+-----------------+----------------+--------------------
Applicable      | In-house        | In-house       | In-house and COTS
Components      |                 |                |

OP trace - Operational Trace

Perf. trace - Performance trace

COTS - Commercial Off-The-Shelf

There is a list of different trace types:

* Operational Trace: Records the interactions of component operations such
  as function invocations.  It can be broken down into Internal and External
  function traces (invocations of methods internal to the component and
  external to the component).

* Performance Trace: Records the performance data and benchmarks for each
  function of a component in a given platform and environment.

* State Trace: Tracks the object states or data states in a component.

* Event Trace: Records the events and sequences occurring in a component.

* Error Trace: Records the error messages generated by a component.

* Meta Information Trace: Tracks meta information about the component.  This
  is used to expose size information (e.g. memory consumed, disk space used,
  pool sizes).  This is also not in the article.

Here is a quick table I threw together that shows where I think the trace
types are more important:

Trace Type    | Development | Verification | System | Operators | Trouble- | Production
              | and Testing | & Evaluation | Tuning |           | Shooting |
--------------+-------------+--------------+--------+-----------+----------+-----------
Operational   |     *       |      *       |        |           |    *     |
--------------+-------------+--------------+--------+-----------+----------+-----------
Performance   |     *       |      *       |   *    |     *     |    *     |
--------------+-------------+--------------+--------+-----------+----------+-----------
State Trace   |     *       |      *       |        |     *     |    *     |
--------------+-------------+--------------+--------+-----------+----------+-----------
Event Trace   |     *       |      *       |        |           |    *     |
--------------+-------------+--------------+--------+-----------+----------+-----------
Error Trace   |     *       |      *       |        |     *     |    *     |    *
--------------+-------------+--------------+--------+-----------+----------+-----------
Meta Info     |     *       |      *       |   *    |     *     |    *     |    -

(*) Useful    (-) Sometimes useful   ( ) Not necessary

As you can see, not all types of tracing need to be used all the time.
It would be good to remove the overhead of the unneeded tracing when you
don't need it.  A solution similar to the LogKit mechanism would be
excellent as it can help you optimize out your trace types when no client
is connected.  It can also automatically re-enable the trace types when
a client is accepting different kinds of traces.

Part of the reason I mentioned AspectJ is that it can do the Operational
tracing for Components we have no control over.  By adding a tracking
package to Framework, and an interface for a Tracer object, we can have
another project work on the Tracer implementation.  That way the mechanism
to proliferate the tracing methods is a completely separate concern.
We do have to manage how it is plugged in, but that is a later concern.
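
To illustrate the kind of Tracer contract I have in mind, here is a
minimal sketch.  Every name in it (SimpleTracer, TraceType, the method
names) is hypothetical; nothing like it exists in Framework today.  The
point is the guard: trace types are enabled only while a client is
connected and has asked for them, and everything else is dropped before
any work is done.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical Tracer sketch; none of these names exist in Avalon Framework. */
final class SimpleTracer {
    enum TraceType { OPERATIONAL, PERFORMANCE, STATE, EVENT, ERROR, META }

    // Empty while no client is connected, so every guard check fails fast.
    private volatile Set<TraceType> enabled = EnumSet.noneOf(TraceType.class);
    private int delivered = 0;

    /** Called when a client connects and says which traces it accepts. */
    void clientConnected(Set<TraceType> wanted) {
        enabled = wanted.isEmpty()
                ? EnumSet.noneOf(TraceType.class)
                : EnumSet.copyOf(wanted);
    }

    /** Called when the client goes away; all trace types are switched off. */
    void clientDisconnected() {
        enabled = EnumSet.noneOf(TraceType.class);
    }

    /** Guard that callers can test before building an expensive message. */
    boolean isEnabled(TraceType type) {
        return enabled.contains(type);
    }

    /** Delivers the message if the type is enabled; otherwise drops it. */
    void trace(TraceType type, String message) {
        if (!isEnabled(type)) {
            return; // dropped, never queued
        }
        delivered++;
        System.out.println(type + ": " + message);
    }

    /** Number of messages that actually reached the client. */
    int deliveredCount() { return delivered; }
}
```

A component would wrap expensive trace calls in
"if (tracer.isEnabled(...))" in the same way it guards debug logging.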

> > Let me provide an example of possible information that would be
> > shared with the administrator:
> >
> > * Pool size information--used to determine the most efficient
> >   pooling sizes for the particular pooled components or datasources.
> >
> > * Role to Component mapping--used to verify that the expected
> >   component is used for the role.  By extension, this would apply
> >   to Blocks and Services as well.
> >
> > The first example is relatively harmless, but the second example
> > provides a possible security hole.  The problem here is that when
> > a security hole is discovered in a Block or Component, you don't
> > want to advertise that you are using it.  While this is obscurity,
> > and not security, it does minimize potential attacks until you
> > are able to plug the hole.  Again, getting back to the ability
> > of exposing the information, when a security hole becomes known
> > it would allow the administrator the ability to verify if they
> > have the affected component or not.  Hence we have a catch-22:
> > damned if we do, damned if we don't expose the info.
> 
> Again I think it is up to the management agent. Some may allow no one, except
> those who pass retinal scans, dna fingerprinting and voice recognition while
> others may simply require a HTTP password. It is completely dependent on
> agent IMO.

See my refinement above.  It does take into account your comments.

> > The demands for such a system are that there should only be one (1)
> > client connected at a time,
> 
> I don't see this as a must.

This can be agent dependent.

> > and that the client be known to the server.
> > This does in effect rule out the Jar services approach.  One practical
> > requirement is that the client needs to be able to be detached from
> > the server environment.  JSSE does provide automatic handshaking so
> > that would minimize the developer effort (I have implemented the client
> > side of PKI sessions with Apache JMeter).  That requirement would
> > in turn require the messages to be sent in a structured and known
> > format.
> >
> > Is this something that is worth the time pursuing?  I think so, but
> > it would have to be a community developed approach.
> 
> It is definitely something that we should be doing. However I believe it
> should be built on JMX. While JMX may not be the most elegant API, it is a
> standard and for 99% of time it is good-enough. (And I think we can get
> around the licensing issues by reusing parts of enhydra).

Keep in mind that we want to expose the ability to Component writers.  The
Tracer implemented in Phoenix should most definitely be implemented with JMX.
Especially since we started the work with it already.  However, there are
different clients with different needs.  For example, Cocoon would like a
mechanism so that when someone goes to the Cocoon Status page, they would
like to have cache sizes and pool sizes included.  With a generic Tracer
interface, Cocoon can have a Tracer implementation that feeds the
StatusGenerator with the information it wants to expose.  It would even
be able to add in performance data to find bottlenecks.

> > Keep in mind that Logging, while vital, does not necessarily provide
> > the functionality we are looking for.  It also does not allow us to
> > isolate a specific class of information for quick retrieval and ad
> > hoc querying.
> 
> I classify logging as a subset of the notification information provided in
> JMX. I hadn't thought of using logging files as a searchable data store
> though ;)

Most people haven't.  It is because the message information is not structured
enough to do meaningful searches or categorizations of data.  The Tracer
implementation would be able to provide categorized events that are easily
grouped and searchable.  The client would even be able to provide meaningful
summary information based on the samples it got.
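
As a small illustration of what "categorized and searchable" could
mean on the client side, here is a sketch of structured trace events
and one ad hoc query over them.  The TraceLog class and the category
names are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of categorized trace events collected by a client. */
final class TraceLog {
    /** One structured sample, as opposed to a free-form log line. */
    static final class Event {
        final String category;   // e.g. "pool" or "lifecycle" (assumed names)
        final String component;
        final long value;

        Event(String category, String component, long value) {
            this.category = category;
            this.component = component;
            this.value = value;
        }
    }

    private final List<Event> events = new ArrayList<>();

    void record(String category, String component, long value) {
        events.add(new Event(category, component, value));
    }

    /** Ad hoc query: the largest value seen per component in one category. */
    Map<String, Long> maxByComponent(String category) {
        Map<String, Long> result = new TreeMap<>();
        for (Event e : events) {
            if (e.category.equals(category)) {
                result.merge(e.component, e.value, Long::max);
            }
        }
        return result;
    }
}
```

With plain log files the same question ("what was the peak pool size
per component?") means parsing message text first.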

Re: [RT] Component Tracking and Environment Exposure

Posted by Peter Donald <do...@apache.org>.

On Sat, 13 Oct 2001 02:27, Berin Loritsch wrote:
> I have been pondering for a while the process of exposing
> information during system debugging and tuning, but with
> minimal to no overhead during runtime.  This is something
> that I have been thinking about for a few months now, and
> thanks to an article in "Journal of Object Oriented
> Programming" (JOOP) I think I have a workable plan.  The
> article in question is from the October/November 2001
> issue (Vol. 14, No. 4), "Tracking Software Components",
> pp. 13-22 written by Jerry Gao, Ph.D., Eugene Y. Zhu,
> and Simon Shim, Ph.D.

I don't suppose there's a Web ref for that?

> With the bibliography out of the way, let me describe the
> problem:  System administrators need the ability to tune
> Phoenix and Excalibur components but information such as
> pool sizes are well hidden, and Developers need
> authoritative debugging and tracing information so that
> they can be assured that everything is behaving as expected.

Operators also need tracing information usually for monitoring things etc.

> Run-time verification simply receives events for a Component
> that signify the Component is going through its state/stage
> transitions according to the contracts.  It should provide
> some sort of notification if there is a conflict.  Unfortunately
> this type of tracking has an associated overhead with it.
> The overhead can be as expensive as 40% or more depending on how
> much.  For this type of tracking the only thing I can suggest
> with any type of confidence is AspectJ. 

At one stage you also suggested a Design By Contract tool that allowed you to 
force pre, post and immutable conditions. That could be useful if the system 
was appropriately designed.

> This allows runtime
> aspects to be placed on a system.  The type of checking is
> standard across the board and it does not require embedding
> hand code in our components.  It has the added benefit that
> the Aspects can be removed in the run time causing an increase
> in response times for the system.

I guess I have a different opinion on this. I think that most of the 
monitoring and verification code adds such a minimal overhead but has such a 
positive value that we shouldn't mind putting it in all the time.

In some cases we may want to limit the degree to which other aspects are 
implemented. In these cases I would actually prefer not to use AspectJ but 
to implement something like the following strategy. 

Almost all "join points" should occur via a Java interface (I think). Thus in 
JDK 1.3 we can use the dynamic proxy classes (java.lang.reflect.Proxy) to add 
in appropriate protections and control.

For instance, say we wanted to debug a method to make sure its pre/post 
conditions were valid. We could add in an InvocationHandler that looks 
something like

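// "target" is a field holding the real component that the proxy wraps
public Object invoke( Object proxy, Method m, Object[] params )
  throws Throwable
{
  checkPreconditions( target, params );
  try
  {
     return m.invoke( target, params );
  }
  finally
  {
    checkPostconditions( target, params );
  }
}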

Alternatively we could add in invocation handlers for almost any other use. 
If there were some that we don't want applied in a production system, then 
we can just hand out the real object behind its normal interface rather than 
generating a proxy.
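
A more complete sketch of this strategy, using java.lang.reflect.Proxy.
The Counter interface, its implementation and the "no negative
arguments" precondition are all made up for illustration; only the
Proxy and InvocationHandler calls are real JDK API.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

/** Hypothetical sketch: wraps a component in a checking proxy only when asked. */
final class CheckedProxyDemo {
    /** Assumed work interface, for illustration only. */
    public interface Counter { int add(int amount); }

    static final class SimpleCounter implements Counter {
        private int total;
        public int add(int amount) { total += amount; return total; }
    }

    /** Rejects negative integer arguments before delegating to the real object. */
    static final class PreconditionHandler implements InvocationHandler {
        private final Object target;

        PreconditionHandler(Object target) { this.target = target; }

        public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
            if (args != null) {
                for (Object arg : args) {
                    if (arg instanceof Integer && (Integer) arg < 0) {
                        throw new IllegalArgumentException(
                                "precondition failed: " + m.getName());
                    }
                }
            }
            try {
                return m.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the component itself threw
            }
        }
    }

    /** Returns the raw component in production, a checking proxy otherwise. */
    static Counter wrap(Counter real, boolean verify) {
        if (!verify) {
            return real;
        }
        return (Counter) Proxy.newProxyInstance(
                Counter.class.getClassLoader(),
                new Class<?>[] { Counter.class },
                new PreconditionHandler(real));
    }
}
```

With verify set to false the caller gets the real object and pays no
reflection cost at all, which is the production case described above.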

> 3) It is only exposed to an authorized client.
...snip...
> Challenge number 3 is more difficult to overcome.  There has to
> be some sort of handshaking that needs to happen to authenticate
> and authorize a client.  For this solution, dynamically generated
> magic keys might be an attractive solution.  It is basically how
> X Windows servers authenticate clients.  They have some sort of
> magic key value that is used as a shared secret.  Again, this may
> be overthinking it a bit.  We have to consider the type of
> information we would be publishing, and then determine if it is
> worth it.  Another solution is blind trust in the installed jars,
> and implement the Jar services solution (requiring JDK 1.3+).

I think 3 is largely up to the object that wants to expose the management 
information.

> Let me provide an example of possible information that would be
> shared with the administrator:
>
> * Pool size information--used to determine the most efficient
>   pooling sizes for the particular pooled components or datasources.
>
> * Role to Component mapping--used to verify that the expected
>   component is used for the role.  By extension, this would apply
>   to Blocks and Services as well.
>
> The first example is relatively harmless, but the second example
> provides a possible security hole.  The problem here is that when
> a security hole is discovered in a Block or Component, you don't
> want to advertise that you are using it.  While this is obscurity,
> and not security, it does minimize potential attacks until you
> are able to plug the hole.  Again, getting back to the ability
> of exposing the information, when a security hole becomes known
> it would allow the administrator the ability to verify if they
> have the affected component or not.  Hence we have a catch-22:
> damned if we do, damned if we don't expose the info.

Again I think it is up to the management agent. Some may allow no one, except 
those who pass retinal scans, dna fingerprinting and voice recognition while 
others may simply require a HTTP password. It is completely dependent on 
agent IMO.

> The demands for such a system are that there should only be one (1)
> client connected at a time, 

I don't see this as a must.

> and that the client be known to the server.
> This does in effect rule out the Jar services approach.  One practical
> requirement is that the client needs to be able to be detached from
> the server environment.  JSSE does provide automatic handshaking so
> that would minimize the developer effort (I have implemented the client
> side of PKI sessions with Apache JMeter).  That requirement would
> in turn require the messages to be sent in a structured and known
> format.
>
> Is this something that is worth the time pursuing?  I think so, but
> it would have to be a community developed approach.

It is definitely something that we should be doing. However I believe it 
should be built on JMX. While JMX may not be the most elegant API, it is a 
standard and for 99% of time it is good-enough. (And I think we can get 
around the licensing issues by reusing parts of enhydra).

I have started to write some base MBeans that make it easy to create 
manageable objects (based on Leo's work). 

See jakarta-avalon-phoenix/src/java/org/apache/jmx/introspector/ for base 
MBean objects. They can be used to automagically generate MBeans from 
an object or a set of interfaces, or they can be driven programmatically. See 
org.apache.avalon.phoenix.components.kernel.DefaultKernelMBean for an example 
of programmatic management. In the future I also want to allow creation of 
MBeans based on a descriptor that sits beside the component.
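
For readers who have not seen an MBean, here is a minimal sketch of a
standard MBean that exposes a pool size.  It uses the javax.management
API as it ships with current JDKs, not the Phoenix introspector classes
mentioned above; the class names and the ObjectName are made up for the
example.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical sketch using the standard javax.management API. */
final class PoolMBeanDemo {
    /** Management interface; the "MBean" suffix follows the standard naming rule. */
    public interface PoolStatsMBean {
        int getSize();
        void setSize(int size);
    }

    /** The managed object; "Size" becomes a readable and writable attribute. */
    public static final class PoolStats implements PoolStatsMBean {
        private volatile int size = 8;
        public int getSize() { return size; }
        public void setSize(int size) { this.size = size; }
    }

    /** Registers the bean under an assumed, illustrative ObjectName. */
    static ObjectName register(PoolStats stats) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.avalon:type=PoolStats,name=demo");
        if (server.isRegistered(name)) {
            server.unregisterMBean(name); // keep the sketch re-runnable
        }
        server.registerMBean(stats, name);
        return name;
    }
}
```

Once registered, any agent attached to the MBean server can read or
change "Size" without knowing anything about the PoolStats class.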

However the part we are missing is the agent. If you start up phoenix using 
--remote-manager you will be able to see the extremely simplified management 
interface via an unauthenticated webserver (See System.out for port number). 
In time I hope to replace the WebServer agent with something that is less 
encumbered by licensing restrictions and more secure and useful. 
However that sounds like exactly what you want to do - so I could quite 
easily let you play with it ? ;)

So I think that once we have a good agent and we have added decent MBeans for 
objects all should be good. I tend to prefer to do it programmatically rather 
than via AspectJ though ;)

> Keep in mind that Logging, while vital, does not necessarily provide
> the functionality we are looking for.  It also does not allow us to
> isolate a specific class of information for quick retrieval and ad
> hoc querying.

I classify logging as a subset of the notification information provided in 
JMX. I hadn't thought of using logging files as a searchable data store 
though ;)

-- 
Cheers,

Pete

"The ability to quote is a serviceable substitute for wit." -- Maugham
