Posted to dev@avalon.apache.org by Leo Sutic <le...@inspireinfrastructure.com> on 2003/10/09 10:09:38 UTC

Monitored Components (was: Thread Management)

All,

seems like the basic issues here are 

1. "How does a container detect that a component is broken?" 
    Whether that broken-ness results from a worker thread 
    crashing, or from an unhandled exception in a synchronous 
    call is irrelevant.

2. "How does a container recover from a broken component?"
    Dynamic reloading may not always work - consider components
    that maintain state between method calls (SAXTransformer).
    Sometimes, a component should just be marked "broken" and
    any calls to it blocked to keep it from hurting itself.

Perhaps a general 

    public interface Monitorable {
        public void setMonitor (ComponentMonitor monitor);
    }

    public interface ComponentMonitor {
        public void statusChange (Status status);
    }

    public class Status {
        public static final Status OK = new Status ("ok");
        public static final Status BROKEN = new Status ("broken");

        private final String name;
        private Status (String name) { this.name = name; }
    }

can solve this?

The ComponentMonitor can work in two ways:

    /** Simple. */
    public void statusChange (Status status) {
        statusOfComponent = status;
    }

    /** Once broken, always broken. */
    public void statusChange (Status status) {
        if (statusOfComponent == Status.OK) {
            statusOfComponent = status;
        }
    }
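
The two policies above can be sketched as two small classes. This is only
an illustration: Status and ComponentMonitor follow the interfaces proposed
in this message, while the class names SimpleMonitor and LatchingMonitor and
the getStatus() accessor are assumptions added so the behaviour can be
observed.

```java
// Sketch only: SimpleMonitor, LatchingMonitor and getStatus() are
// hypothetical names, not part of any Avalon API.
final class Status {
    static final Status OK = new Status("ok");
    static final Status BROKEN = new Status("broken");

    private final String name;
    private Status(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

interface ComponentMonitor {
    void statusChange(Status status);
}

/** Simple: always records the most recently reported status. */
class SimpleMonitor implements ComponentMonitor {
    private volatile Status statusOfComponent = Status.OK;

    public void statusChange(Status status) {
        statusOfComponent = status;
    }

    Status getStatus() { return statusOfComponent; }
}

/** Once broken, always broken: ignores reports after leaving OK. */
class LatchingMonitor implements ComponentMonitor {
    private Status statusOfComponent = Status.OK;

    public synchronized void statusChange(Status status) {
        if (statusOfComponent == Status.OK) {
            statusOfComponent = status;
        }
    }

    synchronized Status getStatus() { return statusOfComponent; }
}
```

With the latching variant, a component that reports BROKEN and later reports
OK stays BROKEN as far as the container is concerned.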

/LS



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@avalon.apache.org
For additional commands, e-mail: dev-help@avalon.apache.org


RE: Monitored Components

Posted by Leo Sutic <le...@inspireinfrastructure.com>.

> From: news [mailto:news@sea.gmane.org] On Behalf Of Leo Simons
>
> Leo Sutic wrote:
> > IoC doesn't mean that there's no flow of information from the 
> > component to the container, just that *the container has the final
> > word* regarding what is actually *done* with the information.
> 
> The way I was looking at it, the container controls when information 
> flows, and the component is not allowed to initiate any 
> communication on its own ("You shall speak only when spoken to"). 
> Your definition is much more useful.
> 
> thinking...it seems like the "Will only speak when spoken to" is a 
> natural property for passive components (ie that don't implement 
> startable nor use separate threads of execution), whereas you 
> won't be able to hold it for active components.

"You shall speak only when spoken to" makes sense, but you need to
allow for another order, or interpret "when spoken to" a bit wider: 
"You shall speak only when given permission to speak."

 + Speak when spoken to - because if I ask you something, you
   should interpret that as an implicit request that you shall
   speak to me (tell me the answer).

 + Notify me about things I have told you to notify me about - 
   because I have given you explicit permission to speak.

Think about a little child that is taught that "you shall speak 
only when spoken to". However, you should be able to instruct the
kid to "clean your room, and tell me when you are done". In this
case, you set up a callback ("tell me when you are done") and then
make an asynchronous call to the kid ("clean your room"). In the
same way "do your job and notify me about important events":

    component.setComponentMonitor (monitor); // Notify me
    component.start (); // do your job

(I've used the setXXX pattern just to make the analogy fit better.)
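
A minimal, self-contained sketch of that "set up a callback, then start"
sequence follows. It is an illustration under assumptions: the two-state
Status enum is a simplification, and ActiveComponent, RecordingContainer
and awaitTermination() are hypothetical names that do not come from Avalon.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified two-state status (an assumption for this sketch). */
enum Status { OK, BROKEN }

interface ComponentMonitor {
    void statusChange(Status status);
}

/** Hypothetical active component: does its job on its own thread. */
class ActiveComponent {
    private final Runnable job;
    private ComponentMonitor monitor;
    private Thread thread;

    ActiveComponent(Runnable job) { this.job = job; }

    /** "Notify me": the container hands over the callback. */
    void setComponentMonitor(ComponentMonitor monitor) {
        this.monitor = monitor;
    }

    /** "Do your job": returns at once, work continues in the background. */
    void start() {
        thread = new Thread(() -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                // The component only reports; it does not decide what happens.
                monitor.statusChange(Status.BROKEN);
            }
        });
        thread.start();
    }

    /** Helper for the demo: wait until the background job has ended. */
    void awaitTermination() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

/** The container keeps the final word; here it merely records reports. */
class RecordingContainer implements ComponentMonitor {
    final List<Status> reports = new ArrayList<>();

    public synchronized void statusChange(Status status) {
        reports.add(status);
    }
}
```

The component can only speak through the monitor it was given, and what is
done with the report is entirely up to the container.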

The important thing about IoC is the C - Control. The container
controls how the component speaks. It controls what is done
with the speech. I.e. the component can yell and scream all it
wants and it will make no difference to the container:

 You: "Time to go sleep now."
 Kid: "But I don't want to!"
 You: "You do."  { kid.suspend(); }

A non-IoC is when the component starts to run the container:

 You: "Time to go sleep now."
 Kid: "But I don't want to!" 
 You: "Oh well I guess you don't have to, then..."
 Kid: "I want ice cream!"
 You: "OK, here you are..."
 ...

> > I have no problem with your use of lookup() to obtain a monitor. 
> > Perhaps even more in line would be to use the Context for this, 
> > though. (Or are we getting rid of that one in favor of the
> > ServiceManager?)
> 
> here we go again :D

Let's not...

> Me, I'm a type 3 convert, and in that world there is no distinction 
> between container or component-provided services. Which works.

Sounds OK to me, too.

> > Regarding the multitude of status messages - I don't think 
> > that will be any help.
> 
> It ties in with the no-logging idea I think. You want the container to
> notify an admin that a component died. It'd be useful for the 
> admin to know why that happened. Hence the specific message. And 
> since, in java, things die because of exceptions, that's a nice way 
> of providing the message.

OK - yep that makes sense. I just don't see the value of propagating
this to the other components other than as a FYI. One shouldn't expect
the other components to actually *help* here.

Makes sense - a broken component is an exceptional state, so we should
use similar objects to describe it (although we won't be throwing
the Exceptions), but as you say, since the root of the broken-ness 
probably will be an Exception, just re-using it will be the clearest,
most user-friendly thing.

Even though you'll get a stack trace and an OwwwBrokenThingException
in the logs (not normally considered user-friendly), I think that this
is the most user-friendliness we can hope for and what we should aim
for. Exceptions are exceptional. If they break components one should
not try to hide it, but show as much data as possible, as it is
something that should never happen.

/LS




Re: Monitored Components

Posted by Leo Simons <le...@apache.org>.
Leo Sutic wrote:
> IoC doesn't mean that there's no flow of information from the
> component to the container, just that *the container has the final
> word* regarding what is actually *done* with the information.

The way I was looking at it, the container controls when information 
flows, and the component is not allowed to initiate any communication on 
its own ("You shall speak only when spoken to"). Your definition is much 
more useful.

thinking...it seems like the "Will only speak when spoken to" is a 
natural property for passive components (ie that don't implement 
startable nor use separate threads of execution), whereas you won't be 
able to hold it for active components.

> I have no problem with your use of lookup() to obtain a monitor.
> Perhaps even more in line would be to use the Context for this,
> though. (Or are we getting rid of that one in favor of the
> ServiceManager?)

here we go again :D

I believe the last time we talked about all this we resolved (after 
three weeks of debate or so) that the context is used for 
container-component communication and the servicemanager for 
inter-component communication. The tricky part is when you have a 
service which can either be provided by the container or by another 
component. Muddy, since your average container makes most things into 
components.

So no clear answer here.

Me, I'm a type 3 convert, and in that world there is no distinction 
between container or component-provided services. Which works.

> Regarding the multitude of status messages - I don't think that will
> be any help.

It ties in with the no-logging idea I think. You want the container to 
notify an admin that a component died. It'd be useful for the admin to 
know why that happened. Hence the specific message. And since, in java, 
things die because of exceptions, that's a nice way of providing the 
message.

> I don't really want to have one component notify every
> other component that they're broken, because there's usually nothing
> the other components can do about it.

agreed.

> Just throwing more into the mess - multithreading:

actually...no problem at all!

The socket server IMV is not actually concerned with multithreading, nor 
does it need to be (unless you're running an infiniband-style 
200-processor server where a single processor cannot accept() 
connections as fast as the networking hardware). Just let the executor 
and handlers worry about that.

The single function of the socket server is to hand off connections to a 
handler. The handler is allowed to do multithreading if it wants to. It 
might again depend on an Executor, like this:

class AlternatingConnectionHandler implements ConnectionHandler
                                                      // code sketch
{
   /** in this example: a specialized pool that will call setSocket()
       on get() */
   WorkerPool m_workers = /* ... */;
   Executor m_executor = /* ... */;

   public void handle( Socket socket )
   {
     Worker worker = m_workers.get( socket );
     m_executor.execute( worker );
   }

   public void stop()
   {
     m_workers.stopAll();
   }

   class Worker implements Runnable
   {
     /** volatile so that stop() is seen by the running thread */
     private volatile boolean running = true;

     Worker() { /* ... */ }
     void setSocket( Socket socket ) { /* ... */ }
     public void run()
     {
       try
       {
         /* ... */
         if(!running) return; Thread.yield();
         /* ... */
         if(!running) return; Thread.yield();
         /* ... */
       }
       finally
       {
         m_workers.release( this );
       }
     }
     void stop() { running = false; }
   }
}

and you just happen to pass in a PooledExecutor here when you want 
multi-threading. Or you might just let connections queue up in your 
handler. Or...

 > m_executor.interruptAndStopAll(); // Does this method even exist?

nope, it doesn't. In fact, the executor interface specifies that it 
might even be single-threaded. But if you're using a PooledExecutor, for 
example, it does:

http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/PooledExecutor.html#shutdownNow()

However, in general, it is not the responsibility of the server or 
handler components to shut down the threads in a pool, but of the pool 
itself (which, after all, is a component, too).
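
To illustrate that division of responsibility, here is a small sketch. It
assumes java.util.concurrent (the standardized descendant of the
util.concurrent package linked above) rather than PooledExecutor itself;
PoolComponent, HandlerComponent, stop() and isStopped() are hypothetical
names chosen for the example.

```java
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** The pool is a component too: it owns its threads and shuts them down. */
class PoolComponent implements Executor {
    private final ExecutorService pool;

    PoolComponent(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    public void execute(Runnable work) {
        pool.execute(work);
    }

    /**
     * Called by the container, not by client components. Interrupts
     * running workers and returns the tasks that never started.
     */
    List<Runnable> stop() {
        return pool.shutdownNow();
    }

    boolean isStopped() {
        return pool.isShutdown();
    }
}

/** A client only sees Executor, so it cannot stop the pool's threads. */
class HandlerComponent {
    private final Executor executor;

    HandlerComponent(Executor executor) {
        this.executor = executor;
    }

    void handle(Runnable work) {
        executor.execute(work);
    }
}
```

Because HandlerComponent depends only on the narrow Executor interface, the
question of how and when threads are stopped never reaches it.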

cheers!

- LSD





RE: Monitored Components

Posted by Leo Sutic <le...@inspireinfrastructure.com>.
Thanks for the comments - just a quick reply to some of the many
good points (will get a bigger reply out later):

> From: news [mailto:news@sea.gmane.org] On Behalf Of Leo Simons
> 
> >     public interface ComponentMonitor {
> >         public void statusChange (Status status);
> >     }
> 
> the objection someone raised to this one is that it breaks 
> IoC because 
> it is a way for a component to tell the container to do 
> something. The 
> obvious way around that is to make the monitor active and have it use 
> polling, ie in my example it was:
> 
>    /** m_monitor runs a separate thread which polls for status */
>    m_monitor.monitor( runnable, runner, m_component );

It is IoC, because the container tells the component - "Hey, when you
get a problem, I want to hear about it via this thing!"

(Consider how the military works: You don't expect the squad leader to
"poll" his soldiers - you expect the soldiers to report to him without
being prodded.)

Like Phoenix's requestShutdown() method - IoC doesn't mean that there's
no flow of information from the component to the container, just that
*the container has the final word* regarding what is actually *done*
with the information.

I have no problem with your use of lookup() to obtain a monitor. Perhaps
even more in line would be to use the Context for this, though. (Or are
we getting rid of that one in favor of the ServiceManager?)

Regarding the multitude of status messages - I don't think that will be
any help. I don't really want to have one component notify every other
component that they're broken, because there's usually nothing the
other components can do about it.

It just removes the image I have in my mind of a component as "an
interface that I can get from a service manager that *just works*".
And I really like that. What complex things happen behind the interface
I don't want to care about! So the SocketManager throws a 
ServerPortInUseException - right, what does the RequestHandler do? Open
its own server socket? It degenerates *quickly*...

> There's an open question: why is there a necessary and sufficient set
> of conditions for deviating from the usual approach of applying strict
> IoC? 

Strict IoC means that the container has the final word regarding what
actions are taken. We're *not* deviating in the slightest.

((((

Just throwing more into the mess - multithreading:

/** @avalon.component type="SocketServer" */
class ThreadedSocketServer implements SocketServer,
     Serviceable, Configurable, Initializable, Startable, Disposable {
   public final static int DEFAULT_PORT = 80;
   public final static int DEFAULT_BACKLOG = 100;
   public final static String DEFAULT_ADDRESS = "localhost";
   private int m_port;
   private int m_backlog;
   private String m_address;
   private int m_numThreads;

   private ServerSocket m_socket;
   private Monitor m_monitor;
   private Executor m_executor;

   private Worker m_worker;
   private ConnectionHandler m_handler;

   public void configure( Configuration conf )
     throws ConfigurationException
   {
     m_port = conf.getChild("port").getValueAsInteger( DEFAULT_PORT );
     m_backlog = conf.getChild("backlog")
         .getValueAsInteger( DEFAULT_BACKLOG );
     m_address = conf.getChild("address").getValue( DEFAULT_ADDRESS );
   }

   /**
    * @avalon.dependency type="Executor"
    * @avalon.dependency type="ConnectionHandler"
    * @avalon.dependency type="Monitor"
    */
   public void service( ServiceManager sm ) throws ServiceException
   {
     m_executor = (Executor) sm.lookup( Executor.ROLE );
     m_handler =
         (ConnectionHandler) sm.lookup( ConnectionHandler.ROLE );
     m_monitor = (Monitor) sm.lookup( Monitor.ROLE );
   }

   public void initialize() throws Exception
   {
     m_socket = getNewServerSocket();
   }

   public void start() throws Exception
   {
     for( int i = 0; i < m_numThreads; i++ ) {
         m_executor.execute( new Worker() );
     }
   }

   public void stop()
   {
     m_executor.interruptAndStopAll(); // Does this method even exist?
   }

   public void dispose()
   {
     try { m_socket.close(); }
     catch( IOException ioe ) {}
   }

   protected ServerSocket getNewServerSocket() throws IOException
   {
     InetAddress address = InetAddress.getByName( m_address );
     ServerSocket socket =
         new ServerSocket( m_port, m_backlog, address );

     return socket;
   }

   private class Worker implements Runnable
   {
       private volatile boolean running = false;

       public void stop()
       {
         running = false;
       }
       public void run()
       {
         running = true;

         while(running)
         {
           if(Thread.currentThread().isInterrupted())
           {
             running = false;
             break; // die
           }

           try
           {
             Socket socket = m_socket.accept(); // block
             m_handler.handle( socket ); // delegate
           }
           catch( Throwable t )
           {
             m_monitor.statusChange( new Status( t ) );
                 // notify others
           }
         }
       }
   }
}

))))

/LS




Re: Monitored Components

Posted by Leo Simons <le...@apache.org>.
Leo Sutic wrote:
> 1. "How does a container detect that a component is broken?" 
>     Whether that broken-ness results from a worker thread 
>     crashing, or from an unhandled exception in a synchronous 
>     call is irrelevant.

+1. But I'm not so sure we're only looking at container doing the 
detection.....there'd be various parties interested in this information. 
So we'll also be looking at a way for querying the container to find out 
what components are broken. Not yet, though.

> 2. "How does a container recover from a broken component?"
>     Dynamic reloading may not always work - consider components
>     that maintain state between method calls (SAXTransformer).
>     Sometimes, a component should just be marked "broken" and
>     any calls to it blocked to keep it from hurting itself.

I am guessing several 'recovery' policies need to exist. Encoding just 
one of them into core contracts seems a bad idea......
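
One way to keep recovery out of the core contracts is to make it a
pluggable policy. The following is a rough sketch of that idea, not an
existing Avalon API: RecoveryPolicy, BlockPolicy, ReloadPolicy and
ComponentSlot are all hypothetical names.

```java
import java.util.function.Supplier;

/** Hypothetical: what a container does once a component is broken. */
interface RecoveryPolicy<T> {
    /** Returns the instance to use from now on, or null to block calls. */
    T recover(T broken, Throwable problem);
}

/** Mark the component broken and block all further calls to it. */
class BlockPolicy<T> implements RecoveryPolicy<T> {
    public T recover(T broken, Throwable problem) {
        return null;
    }
}

/** Replace the component with a fresh instance (loses any state). */
class ReloadPolicy<T> implements RecoveryPolicy<T> {
    private final Supplier<T> factory;

    ReloadPolicy(Supplier<T> factory) {
        this.factory = factory;
    }

    public T recover(T broken, Throwable problem) {
        return factory.get();
    }
}

/** Container-side holder that applies the configured policy. */
class ComponentSlot<T> {
    private final RecoveryPolicy<T> policy;
    private T instance;

    ComponentSlot(T instance, RecoveryPolicy<T> policy) {
        this.instance = instance;
        this.policy = policy;
    }

    synchronized void reportProblem(Throwable problem) {
        instance = policy.recover(instance, problem);
    }

    synchronized T get() {
        if (instance == null) {
            throw new IllegalStateException("component is broken");
        }
        return instance;
    }
}
```

A stateful component such as the SAXTransformer mentioned above would
presumably be given the blocking policy, while a stateless one could be
reloaded.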

>     public interface Monitorable {
>         public void setMonitor (ComponentMonitor monitor);
>     }

If I replace "able" with "Aware" in your example I end up with an 
XWork-style ("IoC type 3") setup. The problem with that (as many have 
found out by now) is the proliferation of XXXable interfaces, that it is 
difficult to remove the 'XXX' functionality in subclasses (like so 
adequately described wrt the Cloneable interface in various books), etc etc.

What about

       public void service( ServiceManager sm ) throws ServiceException
       {
           monitor =
               (ComponentMonitor) sm.lookup( ComponentMonitor.ROLE );
       }

? Seems more consistent with avalon semantics.

>     public interface ComponentMonitor {
>         public void statusChange (Status status);
>     }

the objection someone raised to this one is that it breaks IoC because 
it is a way for a component to tell the container to do something. The 
obvious way around that is to make the monitor active and have it use 
polling, ie in my example it was:

   /** m_monitor runs a separate thread which polls for status */
   m_monitor.monitor( runnable, runner, m_component );

Which is probably less efficient though...nor does it scale...any 
thoughts on that? In general, it seems to me there's a class of problems 
which are very hard to solve using IoC but simpler using 
events/callbacks/ listeners/etc. Is this a good place to break IoC? Why? 
Can we identify a pattern here?
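
For comparison, the polling alternative can be sketched roughly as below.
This is not the m_monitor.monitor( runnable, runner, m_component ) call
from the earlier example; PollingMonitor, its monitor( name, check )
signature, pollOnce() and isBroken() are hypothetical and only meant to
show where the cost comes from.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical polling monitor: the container asks, components answer. */
class PollingMonitor {
    private final Map<String, BooleanSupplier> healthChecks =
        new LinkedHashMap<>();
    private final Map<String, Boolean> broken = new LinkedHashMap<>();

    /** Registers a component together with a check the monitor can call. */
    synchronized void monitor(String name, BooleanSupplier isHealthy) {
        healthChecks.put(name, isHealthy);
        broken.put(name, Boolean.FALSE);
    }

    /**
     * One polling pass over every registered component. A real monitor
     * would call this periodically from its own thread.
     */
    synchronized void pollOnce() {
        for (Map.Entry<String, BooleanSupplier> e : healthChecks.entrySet()) {
            broken.put(e.getKey(), !e.getValue().getAsBoolean());
        }
    }

    synchronized boolean isBroken(String name) {
        return Boolean.TRUE.equals(broken.get(name));
    }
}
```

Every pass touches every component whether or not anything changed, and a
failure goes unnoticed until the next pass, which is the efficiency and
scaling concern raised above.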

>     public class Status {
>         public static final Status OK = new Status ("ok");    
>         public static final Status BROKEN = new Status ("broken");
>     }

it seems the 'status' should be a customizable enumeration to allow for 
application-specific policies. Perhaps something similar to HTTP (class 
of 1xx, 2xx, 3xx, 4xx error messages allowing a few base policies, and 
more specific policies based on the 'xx' in advanced implementations 
and/or [5-9]xx) could be figured out. Maybe using subclassing...

  public class Status {}
  public class StatusOK extends Status {}
  public class StatusBroken extends Status {}
  public class StatusDeadWorkerThread extends StatusBroken {}

...wait a minute...seems that's just recreating a hierarchy to parallel 
the exception hierarchy...maybe

   public class Status
   {
     private Throwable m_problem = null;

     public Status( Throwable problem ) { m_problem = problem; }
     public Throwable getProblem() { return m_problem; }
     public boolean isBroken() { return m_problem != null; }
   }

is flexible but still simple?
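
As a quick illustration of how a Throwable-carrying Status could feed an
admin notification, here is a small sketch. Status mirrors the class
above; AdminNotifyingMonitor and its lastReport field are hypothetical,
and a real container would presumably log or send the message somewhere.

```java
/** Status as proposed above: broken if and only if it carries a problem. */
class Status {
    private final Throwable m_problem;

    Status(Throwable problem) { m_problem = problem; }

    Throwable getProblem() { return m_problem; }
    boolean isBroken() { return m_problem != null; }
}

interface ComponentMonitor {
    void statusChange(Status status);
}

/** Hypothetical monitor that turns a status into a message for an admin. */
class AdminNotifyingMonitor implements ComponentMonitor {
    /** Last message produced; kept in a field so the sketch is testable. */
    volatile String lastReport = "ok";

    public void statusChange(Status status) {
        if (status.isBroken()) {
            // Throwable.toString() yields "<class name>: <message>".
            lastReport = "broken: " + status.getProblem();
        } else {
            lastReport = "ok";
        }
    }
}
```

The exception type already says what kind of failure occurred, so no
parallel hierarchy of Status subclasses is needed.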

> [the above decomposition] can solve [the issues mentioned above]?

Yeah, seems like it could work, and it looks simple enough, and it might 
even be possible to implement using lifecycle extensions.

=======
Summary
=======
We're now looking at this separation of concerns:

1 - work execution
2 - status monitoring

(1) can be addressed naturally using an Executor, whether or not thread 
management comes into the picture is not a concern of the client 
component. In fact, we've made all thread management policy configurable 
in a client-transparent way :D
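
A small sketch of what "client-transparent" means here. The Executor
interface has the same single-method shape as the one in util.concurrent,
declared locally so the example is self-contained; DirectExecutor,
ThreadPerTaskExecutor and Client are illustrative names.

```java
/** Same single-method shape as the Executor discussed in this thread. */
interface Executor {
    void execute(Runnable runnable);
}

/** No threading at all: runs the work on the caller's thread. */
class DirectExecutor implements Executor {
    public void execute(Runnable runnable) {
        runnable.run();
    }
}

/** Simplest threaded policy: a new thread for every piece of work. */
class ThreadPerTaskExecutor implements Executor {
    public void execute(Runnable runnable) {
        new Thread(runnable).start();
    }
}

/** The client is written once and never mentions threads. */
class Client {
    private final Executor executor;

    Client(Executor executor) {
        this.executor = executor;
    }

    void submit(Runnable work) {
        executor.execute(work);
    }
}
```

Swapping DirectExecutor for ThreadPerTaskExecutor (or a pooled
implementation) changes the threading policy without touching Client.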

The case you make is that (2) might be addressed best using a passive, 
one-instance-per-component, status-based component monitor, with the 
main argument that such a setup is much simpler than any kind of 
polling. I'll add that it's also likely to be more efficient.

There's an open question: why is there a necessary and sufficient set 
of conditions for deviating from the usual approach of applying strict 
IoC? (what are those conditions and why do they apply here?)

========================
Complete picture in code
========================

Work execution
--------------
Re-use from util.concurrent:

   interface Executor
   {
     void execute( Runnable runnable );
   }

Monitoring
----------
Passive and event-based:

   public interface ComponentMonitor
   {
     public void statusChange( Status status );
   }
   public class Status
   {
     private Throwable m_problem = null;

     public Status( Throwable problem ) { m_problem = problem; }
     public Throwable getProblem() { return m_problem; }
     public boolean isBroken() { return m_problem != null; }
   }

Socket server example (code sketch)
-----------------------------------
   /** Listens for socket requests and handles them. */
   interface SocketServer {}
   /** Deals with a single socket request at a time. */
   interface ConnectionHandler
   {
     public void handle( Socket socket );
   }

/** @avalon.component type="SocketServer" */
class ThreadedSocketServer implements SocketServer,
     Serviceable, Configurable, Initializable, Startable, Disposable
{
   public final static int DEFAULT_PORT = 80;
   public final static int DEFAULT_BACKLOG = 100;
   public final static String DEFAULT_ADDRESS = "localhost";
   private int m_port;
   private int m_backlog;
   private String m_address;

   private ServerSocket m_socket;
   private Monitor m_monitor;
   private Executor m_executor;

   private Worker m_worker;
   private ConnectionHandler m_handler;

   public void configure( Configuration conf )
     throws ConfigurationException
   {
     m_port = conf.getChild("port").getValueAsInteger( DEFAULT_PORT );
     m_backlog = conf.getChild("backlog")
         .getValueAsInteger( DEFAULT_BACKLOG );
     m_address = conf.getChild("address").getValue( DEFAULT_ADDRESS );
   }

   /**
    * @avalon.dependency type="Executor"
    * @avalon.dependency type="ConnectionHandler"
    * @avalon.dependency type="Monitor"
    */
   public void service( ServiceManager sm ) throws ServiceException
   {
     m_executor = (Executor) sm.lookup( Executor.ROLE );
     m_handler =
         (ConnectionHandler) sm.lookup( ConnectionHandler.ROLE );
     m_monitor = (Monitor) sm.lookup( Monitor.ROLE );
   }

   public void initialize() throws Exception
   {
     m_socket = getNewServerSocket();
     m_worker = new Worker();
   }

   public void start() throws Exception
   {
     m_executor.execute( m_worker );
   }

   public void stop()
   {
     m_worker.stop();
   }

   public void dispose()
   {
     try { m_socket.close(); }
     catch( IOException ioe ) {}
   }

   protected ServerSocket getNewServerSocket() throws IOException
   {
     InetAddress address = InetAddress.getByName( m_address );
     ServerSocket socket =
         new ServerSocket( m_port, m_backlog, address );

     return socket;
   }

   private class Worker implements Runnable
   {
       private volatile boolean running = false;

       public void stop()
       {
         running = false;
       }
       public void run()
       {
         running = true;

         while(running)
         {
           if(Thread.currentThread().isInterrupted())
           {
             running = false;
             break; // die
           }

           try
           {
             Socket socket = m_socket.accept(); // block
             m_handler.handle( socket ); // delegate
           }
           catch( Throwable t )
           {
             m_monitor.statusChange( new Status( t ) );
                 // notify others
           }
         }
       }
   }
}

Observations
------------
- it is not transparent to the component that it's being monitored. It'd 
be nice if it were. Not sure whether that's even remotely feasible for 
'generic' monitoring unless we introduce some magic (in the form of AOP).

- the configure()/service()/start()/stop()/initialize()/dispose() do not 
fire status changes to the monitor....assumed is that the monitor is 
container-provided and that these status changes will be sent to the 
monitor, if necessary, by the container.

- I found several bugs and there's likely to be more; this code will 
likely not compile :D

- interesting problem, this is :D
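
On the first observation: for failures in synchronous calls, one possible
form of that "magic" is a JDK dynamic proxy that the container wraps around
the component. This is only a sketch under that assumption; it does not
cover worker threads that die on their own. MonitoringProxy, wrap() and
the Greeter example interface are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

class Status {
    private final Throwable m_problem;
    Status(Throwable problem) { m_problem = problem; }
    Throwable getProblem() { return m_problem; }
    boolean isBroken() { return m_problem != null; }
}

interface ComponentMonitor {
    void statusChange(Status status);
}

/** Example service interface, used only to demonstrate the proxy. */
interface Greeter {
    String greet(String name);
}

/** Hypothetical container-side wrapper; the component never sees it. */
final class MonitoringProxy {
    private MonitoringProxy() {}

    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> role, T target, ComponentMonitor monitor) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Report the component's own exception, then rethrow it
                // so the caller still sees the original failure.
                monitor.statusChange(new Status(e.getCause()));
                throw e.getCause();
            }
        };
        return (T) Proxy.newProxyInstance(
            role.getClassLoader(), new Class<?>[] { role }, handler);
    }
}
```

The component implements only its service interface; the container hands
out the wrapped instance, so monitoring of call failures requires no
cooperation from the component.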


cheers,


- LSD

PS: http://www.google.com/search?q=java+thread+monitoring reveals all 
this is an area of active research :D


