Posted to dev@sling.apache.org by "Jörg Hoh (JIRA)" <ji...@apache.org> on 2012/09/13 12:20:07 UTC

[jira] [Created] (SLING-2597) Provide interface for monitoring services

Jörg Hoh created SLING-2597:
-------------------------------

             Summary: Provide interface for monitoring services
                 Key: SLING-2597
                 URL: https://issues.apache.org/jira/browse/SLING-2597
             Project: Sling
          Issue Type: New Feature
            Reporter: Jörg Hoh



There should be an interface which one can query to get information about the status of a service implementing this interface.


For example:

public interface HealthCheckable {

  public int getStatus();

}

For the return value for this method we could use:

static int OK = 0;
static int WARNING = 1;
static int CRITICAL = 2;
static int UNKNOWN = 3;

(these are the values which Nagios uses as return values for its plugins, see http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN76).

Which value is returned is decided by the service itself, so services may need some configuration to define the thresholds at which "OK" becomes "WARNING".
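A minimal sketch of how a service could make that decision from configured thresholds. HealthCheckable and the four constants follow the proposal; QueueLengthCheck, its LongSupplier source and the warnAt/criticalAt parameters are hypothetical names chosen only for illustration.

```java
import java.util.function.LongSupplier;

// HealthCheckable and the four constants are taken from the proposal;
// everything else here is a hypothetical illustration.
interface HealthCheckable {
    int OK = 0;
    int WARNING = 1;
    int CRITICAL = 2;
    int UNKNOWN = 3;

    int getStatus();
}

// Hypothetical check: maps a queue length to a status using two configured limits.
class QueueLengthCheck implements HealthCheckable {
    private final LongSupplier queueLength; // where the current length comes from
    private final long warnAt;              // length at which OK becomes WARNING
    private final long criticalAt;          // length at which WARNING becomes CRITICAL

    QueueLengthCheck(LongSupplier queueLength, long warnAt, long criticalAt) {
        if (warnAt > criticalAt) {
            throw new IllegalArgumentException("warnAt must not exceed criticalAt");
        }
        this.queueLength = queueLength;
        this.warnAt = warnAt;
        this.criticalAt = criticalAt;
    }

    @Override
    public int getStatus() {
        long length;
        try {
            length = queueLength.getAsLong();
        } catch (RuntimeException e) {
            return UNKNOWN; // the metric could not be read
        }
        if (length >= criticalAt) {
            return CRITICAL;
        }
        if (length >= warnAt) {
            return WARNING;
        }
        return OK;
    }
}
```

In a real bundle the two limits would presumably come from the service's OSGi configuration rather than from constructor arguments.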


Via the OSGi whiteboard pattern we can then collect all services providing status information and calculate an overall status of the system.
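A rough sketch of the aggregation side, written as plain Java so it runs without an OSGi framework. HealthAggregator and its bind/unbind methods are hypothetical; in a real deployment they would be driven by a ServiceTracker or a dynamic Declarative Services reference. The ranking in severity() (CRITICAL over WARNING over UNKNOWN over OK) is an assumption, not part of the proposal.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Interface and constants as proposed in this issue.
interface HealthCheckable {
    int OK = 0;
    int WARNING = 1;
    int CRITICAL = 2;
    int UNKNOWN = 3;

    int getStatus();
}

// Hypothetical whiteboard consumer: services come and go via bind/unbind.
class HealthAggregator {
    private final List<HealthCheckable> checks = new CopyOnWriteArrayList<>();

    void bind(HealthCheckable check) {
        checks.add(check);
    }

    void unbind(HealthCheckable check) {
        checks.remove(check);
    }

    // Returns the most severe status reported; OK if nothing is registered.
    int getOverallStatus() {
        int overall = HealthCheckable.OK;
        for (HealthCheckable check : checks) {
            int status;
            try {
                status = check.getStatus();
            } catch (RuntimeException e) {
                status = HealthCheckable.UNKNOWN; // a failing check tells us nothing
            }
            if (status < HealthCheckable.OK || status > HealthCheckable.UNKNOWN) {
                status = HealthCheckable.UNKNOWN; // out-of-range values are not trusted
            }
            if (severity(status) > severity(overall)) {
                overall = status;
            }
        }
        return overall;
    }

    // Assumed ranking: CRITICAL > WARNING > UNKNOWN > OK.
    private static int severity(int status) {
        switch (status) {
            case HealthCheckable.CRITICAL: return 3;
            case HealthCheckable.WARNING:  return 2;
            case HealthCheckable.UNKNOWN:  return 1;
            default:                       return 0;
        }
    }
}
```

Other policies are equally defensible, for example treating UNKNOWN as the worst state; the point is only that the overall status is derived from whatever services are currently registered.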



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (SLING-2597) Provide interface for monitoring services

Posted by "Carsten Ziegeler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SLING-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457045#comment-13457045 ] 

Carsten Ziegeler commented on SLING-2597:
-----------------------------------------

The examples above seem to be a perfect fit for JMX; I think most of them, except Sling eventing, are already covered by JMX.
                

[jira] [Commented] (SLING-2597) Provide interface for monitoring services

Posted by "Felix Meschberger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SLING-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456883#comment-13456883 ] 

Felix Meschberger commented on SLING-2597:
------------------------------------------

I don't think it would make sense to implement such a thing.

Apart from the seemingly simple API it is probably close to impossible to come up with a good definition of what CRITICAL means for Sling Eventing ... And I am not sure whether Sling Eventing is even able to make a sound decision on this kind of state.

As such, I think this API is just an oversimplification potentially causing confusion.
                

[jira] [Commented] (SLING-2597) Provide interface for monitoring services

Posted by "Jörg Hoh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SLING-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457294#comment-13457294 ] 

Jörg Hoh commented on SLING-2597:
---------------------------------

I agree with both of you.

Currently things like Sling eventing are hard to check, because they don't expose (enough) internal state. Exposing it through the API would probably bloat it. So JMX might be a viable solution here.

A problem with JMX is the dynamic nature of OSGi. E.g. when you add a new Sling event queue, the statistics of that queue might appear under a certain JMX object name. But your external monitoring (a custom service on top of Sling) is not notified of that new event queue and its statistics, unless you check for Sling event queues via OSGi and get their statistics from JMX. This is a scenario I want to avoid.
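For comparison, a JMX-only sketch of following MBeans that appear and disappear at runtime, using the standard MBeanServerDelegate registration notifications. QueueMBeanWatcher is a hypothetical class, and the ObjectName pattern is supplied by the caller; I am not assuming any particular naming scheme for the Sling event queue MBeans.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanServer;
import javax.management.MBeanServerDelegate;
import javax.management.MBeanServerNotification;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;

// Hypothetical watcher: keeps track of all MBeans matching a name pattern.
class QueueMBeanWatcher implements NotificationListener {
    private final ObjectName pattern;
    private final Set<ObjectName> known = ConcurrentHashMap.newKeySet();

    private QueueMBeanWatcher(ObjectName pattern) {
        this.pattern = pattern;
    }

    static QueueMBeanWatcher watch(MBeanServer server, ObjectName pattern)
            throws InstanceNotFoundException {
        QueueMBeanWatcher watcher = new QueueMBeanWatcher(pattern);
        // Listen first, then query, so no registration slips through in between.
        server.addNotificationListener(MBeanServerDelegate.DELEGATE_NAME, watcher, null, null);
        watcher.known.addAll(server.queryNames(pattern, null));
        return watcher;
    }

    @Override
    public void handleNotification(Notification notification, Object handback) {
        if (!(notification instanceof MBeanServerNotification)) {
            return;
        }
        ObjectName name = ((MBeanServerNotification) notification).getMBeanName();
        if (!pattern.apply(name)) {
            return; // not one of the MBeans we care about
        }
        String type = notification.getType();
        if (MBeanServerNotification.REGISTRATION_NOTIFICATION.equals(type)) {
            known.add(name);
        } else if (MBeanServerNotification.UNREGISTRATION_NOTIFICATION.equals(type)) {
            known.remove(name);
        }
    }

    // Read-only live view of the currently registered matching MBeans.
    Set<ObjectName> knownQueues() {
        return Collections.unmodifiableSet(known);
    }
}
```

This only tells the monitoring side that an MBean exists; it still has to know which attributes to read and how to interpret them, which is the part the proposed interface would move into the service.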

As for the definition of CRITICAL: I know that this is hard to do. But maybe some basic configurable rules could be sufficient (e.g. queue length > 100k). Or we could define multiple aspects of Sling eventing (delay, throughput, average processing time, ...) and handle and configure each of them individually.
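To make the per-aspect idea a bit more concrete, here is a sketch in which every aspect gets its own warn/critical pair. AspectRule, EventingRules and the aspect names used with them are all hypothetical. One assumption to note: if the warn value is larger than the critical value, the rule is read as "lower is worse", which suits something like throughput.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical rule: one warn/critical pair for a single aspect.
final class AspectRule {
    static final int OK = 0, WARNING = 1, CRITICAL = 2, UNKNOWN = 3;

    private final double warnAt;
    private final double criticalAt;

    AspectRule(double warnAt, double criticalAt) {
        this.warnAt = warnAt;
        this.criticalAt = criticalAt;
    }

    int evaluate(Double value) {
        if (value == null || value.isNaN()) {
            return UNKNOWN; // no measurement available for this aspect
        }
        // Assumption: warnAt <= criticalAt means larger values are worse,
        // otherwise smaller values are worse (e.g. throughput).
        boolean higherIsWorse = warnAt <= criticalAt;
        if (higherIsWorse ? value >= criticalAt : value <= criticalAt) {
            return CRITICAL;
        }
        if (higherIsWorse ? value >= warnAt : value <= warnAt) {
            return WARNING;
        }
        return OK;
    }
}

// Hypothetical rule set: evaluates each configured aspect on its own.
final class EventingRules {
    private final Map<String, AspectRule> rules = new LinkedHashMap<>();

    EventingRules rule(String aspect, double warnAt, double criticalAt) {
        rules.put(aspect, new AspectRule(warnAt, criticalAt));
        return this;
    }

    // Returns one status per configured aspect, in configuration order.
    Map<String, Integer> evaluate(Map<String, Double> metrics) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, AspectRule> entry : rules.entrySet()) {
            result.put(entry.getKey(), entry.getValue().evaluate(metrics.get(entry.getKey())));
        }
        return result;
    }
}
```

A rule set could then be built with something like new EventingRules().rule("queueLength", 10000, 100000).rule("throughputPerSec", 50, 10), where the numbers are placeholders and not recommendations.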


                

[jira] [Commented] (SLING-2597) Provide interface for monitoring services

Posted by "Jörg Hoh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SLING-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456489#comment-13456489 ] 

Jörg Hoh commented on SLING-2597:
---------------------------------


In Sling we have many interesting services and subsystems:

* Sling eventing
* Request processing: number of active requests, average response times
* Observation of thread pools
* Checking the state of bundles

I think that we should at least have the interfaces in Sling.
                

[jira] [Comment Edited] (SLING-2597) Provide interface for monitoring services

Posted by "Carsten Ziegeler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SLING-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455599#comment-13455599 ] 

Carsten Ziegeler edited comment on SLING-2597 at 9/14/12 5:21 PM:
------------------------------------------------------------------

I'm not sure if this is something we should do at the Sling level or in this way :)

Which services do you have in mind implementing this interface?
                
      was (Author: cziegeler):
    I'm not sure if this is something we should do the Sling level or in this way :)

Which services do you have in mind implementing this interface?
                  

[jira] [Commented] (SLING-2597) Provide interface for monitoring services

Posted by "Carsten Ziegeler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SLING-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455599#comment-13455599 ] 

Carsten Ziegeler commented on SLING-2597:
-----------------------------------------

I'm not sure if this is something we should do the Sling level or in this way :)

Which services do you have in mind implementing this interface?
                