Posted to dev@sling.apache.org by "Georg Henzler (JIRA)" <ji...@apache.org> on 2013/12/12 15:28:08 UTC

[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

    [ https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846328#comment-13846328 ] 

Georg Henzler commented on SLING-3278:
--------------------------------------

The property for async execution can make sense when you want to make sure a check is run less often than the health check itself is called (e.g. only twice a day).

I'm pretty much done; only No. 2 of Bertrand's list and the unit tests are missing. If you like, you can have a look at the patches and give feedback before I submit a final one.

Impl Notes:
* The main entry method is org.apache.sling.hc.core.executor.HealthCheckExecutor.runAllForTags(String...)
* Results now have a HealthCheckDescriptor that contains meta info for the check (also used in the executor as cache key etc.)
* Async is supported by attribute hc.async.cronExpression, a service listener is in place for registering/unregistering of jobs (org.apache.sling.hc.core.executor.AsyncHealthCheckExecutor)
* I added a natural order to results (failed checks first, then alphabetically by name) - without it the order would be arbitrary (depending on execution time)
* The result has an additional finishDate and elapsedTime (I think finish date is more interesting for caching than the start date!)
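
To illustrate the natural order mentioned above (failed checks first, then alphabetically by name), here is a minimal sketch. It is not taken from the patch: the CheckResult class and its fields are hypothetical stand-ins, and the real result type may expose status and name differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ResultOrderSketch {

    /** Hypothetical stand-in for an executed check result; not the patch's actual type. */
    static final class CheckResult {
        final String name;
        final boolean ok;

        CheckResult(String name, boolean ok) {
            this.name = name;
            this.ok = ok;
        }
    }

    /** Failed results first (false sorts before true), then alphabetically by check name. */
    static final Comparator<CheckResult> NATURAL_ORDER =
            Comparator.comparing((CheckResult r) -> r.ok)
                      .thenComparing((CheckResult r) -> r.name);

    /** Returns the check names in natural order, leaving the input list untouched. */
    static List<String> sortedNames(List<CheckResult> results) {
        List<CheckResult> copy = new ArrayList<>(results);
        copy.sort(NATURAL_ORDER);
        List<String> names = new ArrayList<>();
        for (CheckResult r : copy) {
            names.add(r.name);
        }
        return names;
    }
}
```

Sorting on a stable key like this keeps the output independent of which check happened to finish first.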

Other thoughts (not in patch):
* I'm not sure if the CompositeHealthCheck makes sense - is this not a grouping competing with the tags? It is easy to configure it in a way that some checks are executed twice, especially if you run all checks without giving a tag (and the HealthCheckExecutor cannot prevent it as the CompositeHealthCheck looks like any other check to it)
* Exceptions: The result should be able to carry an exception - I would even go as far as adding "throws Exception" to the execute() signature (this would not break any existing implementation classes) and have the executor generically add a final critical log entry if the HC happens to throw an exception
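
A rough sketch of the exception idea, under assumptions: the check body is modelled as a java.util.concurrent.Callable (whose call() already declares "throws Exception"), and the Status and CheckResult types are hypothetical, not the actual hc.api classes.

```java
import java.util.concurrent.Callable;

final class ExceptionToResultSketch {

    /** Hypothetical status values, for illustration only. */
    enum Status { OK, CRITICAL }

    /** Hypothetical result type that can carry the exception thrown by a check. */
    static final class CheckResult {
        final Status status;
        final String message;
        final Exception exception; // null when the check completed normally

        CheckResult(Status status, String message, Exception exception) {
            this.status = status;
            this.message = message;
            this.exception = exception;
        }
    }

    /** Runs a check body that may throw; a thrown exception becomes a CRITICAL result. */
    static CheckResult runSafely(Callable<CheckResult> check) {
        try {
            return check.call();
        } catch (Exception e) {
            // Generic handling on the executor side: the check itself needs no try/catch.
            return new CheckResult(Status.CRITICAL, "Health check threw exception: " + e, e);
        }
    }
}
```

With this shape the individual checks stay free of boilerplate error handling, and the caller still gets the original exception for logging.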

> Provide a HealthCheckExecutor service
> -------------------------------------
>
>                 Key: SLING-3278
>                 URL: https://issues.apache.org/jira/browse/SLING-3278
>             Project: Sling
>          Issue Type: New Feature
>          Components: Health Check
>            Reporter: Georg Henzler
>            Assignee: Georg Henzler
>         Attachments: SLING-3278-hc.core-HealthCheckExecutorService-v0.5.patch, SLING-3278-hc.webconsole-v0.5.patch
>
>
> Goals:
> * Be able to get an overall (aggregated) result as quickly as possible (ideally <2sec)
> * Whenever possible, return most current results (e.g. for a memory check)
> * Provide a declarative way for async checks (async checks should be the exception though) 
> Approach:
> * Run checks in parallel
> * Make sure long running (or even stuck) checks are timed out
> * If a health check must run asynchronously (because its execution time cannot be optimized), it should be enough to just specify a service property (e.g. "hc.async").
> See also:
> http://apache-sling.73963.n3.nabble.com/Health-Check-Improvements-td4029330.html#a4029402
> http://apache-sling.73963.n3.nabble.com/Health-checks-execution-service-td4028477.html
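
For reference, the approach in the issue description quoted above (run checks in parallel, time out long-running or stuck ones) could look roughly like the following. This is only a sketch using plain java.util.concurrent, not the code from the patch; the class name, the String results and the "TIMED_OUT" marker are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ParallelTimeoutSketch {

    /** Runs all checks in parallel; a check not finished by the shared deadline is reported as TIMED_OUT. */
    static Map<String, String> runAll(Map<String, Callable<String>> checks, long timeoutMs) {
        Map<String, String> results = new LinkedHashMap<>();
        if (checks.isEmpty()) {
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(checks.size());
        try {
            // Submit everything first so all checks run concurrently.
            Map<String, Future<String>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<String>> e : checks.entrySet()) {
                futures.put(e.getKey(), pool.submit(e.getValue()));
            }
            // One overall deadline, so the total wait is bounded by timeoutMs.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            for (Map.Entry<String, Future<String>> e : futures.entrySet()) {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                try {
                    results.put(e.getKey(), e.getValue().get(remaining, TimeUnit.NANOSECONDS));
                } catch (TimeoutException ex) {
                    e.getValue().cancel(true); // interrupt the long-running or stuck check
                    results.put(e.getKey(), "TIMED_OUT");
                } catch (ExecutionException ex) {
                    results.put(e.getKey(), "CRITICAL: " + ex.getCause());
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    results.put(e.getKey(), "INTERRUPTED");
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Using a single deadline for all futures, rather than a per-check timeout, keeps the aggregated result within the time budget no matter how many checks are registered.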



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)