Posted to dev@felix.apache.org by Christian Schneider <ch...@die-schneider.net> on 2018/04/17 08:02:08 UTC

Proposal to donate the system readiness check framework to Apache Felix

Dear Felix community,

Over the last few weeks Andrei Dulvac and I have worked on a small framework to
check whether an OSGi-based system is fully up.

Our work originated in testing Sling modules and whole Sling instances. We
soon found, though, that the concept is more general than Sling and can be
applied to any OSGi-based system.

The system readiness framework has a SystemReadinessMonitor service that
reports the aggregated state of the system. It delegates to
SystemReadinessCheck services, each of which checks a certain aspect. We
implemented a first check based on a list of expected top-level services.
The system can be customised by adding checks specific to your
application. For example, we plan to add Sling-specific checks inside the
Sling project.
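
A minimal sketch of the monitor-and-checks idea described above, in plain Java
with no OSGi dependencies. The names State, Check, fixed and aggregate are
hypothetical and are not taken from the project; the three-colour state is
borrowed from the green/yellow/red wording used later in this thread, and the
"worst state wins" rule is an assumption about how aggregation might work.

```java
import java.util.Arrays;
import java.util.List;

public class ReadinessSketch {

    /** Hypothetical states, ordered from best to worst. */
    enum State { GREEN, YELLOW, RED }

    /** Hypothetical stand-in for a SystemReadinessCheck service. */
    interface Check {
        String name();
        State state();
    }

    /** Builds a check with a fixed result, for illustration only. */
    static Check fixed(final String name, final State state) {
        return new Check() {
            public String name() { return name; }
            public State state() { return state; }
        };
    }

    /** Stand-in for the monitor: the worst individual state wins. */
    static State aggregate(List<Check> checks) {
        State worst = State.GREEN;
        for (Check check : checks) {
            if (check.state().compareTo(worst) > 0) {
                worst = check.state();
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        List<Check> checks = Arrays.asList(
                fixed("framework", State.GREEN),
                fixed("services", State.YELLOW));
        System.out.println(aggregate(checks)); // prints YELLOW
    }
}
```

Note that with an empty list of checks this sketch reports GREEN, which is the
situation raised further down in the thread.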

In addition to simply detecting whether the system is ready, we also created a
DS-based root cause analysis that can be very helpful in finding out why a set
of components does not come up as expected.

We would like to donate this project to the Apache Felix project, as it
might get more attention there from people who are not involved with Sling. The
project has been Apache licensed from the start, and we already have basic
documentation as well as good test coverage.

We currently host it in this GitHub repository:
https://github.com/dulvac/system-readiness

The packages still mention Sling, but of course we would change this
to Felix if this community is interested in the donation.

Best regards

Christian and Andrei


-- 
-- 
Christian Schneider
http://www.liquid-reality.de

Computer Scientist
http://www.adobe.com

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Andrei Dulvac <du...@apache.org>.
Hi Jan,

> It is an interesting approach. Determining whether or not the framework
> is in a useable state is a rather difficult problem for dynamic
> systems. The problem obviously being that when a bundle appears to be
> active, it does not necessarily imply that its services are ready for
> duty.
>
That's absolutely right. On top of that, its services being up might not
mean your application is ready. You might have some Components that need to
do some costly system initialisation before your whole system is considered
"ready", including the top level of your application.

>
> In the past, I've worked with another, somewhat similar, utility (see
> [1]) that provides a callback whenever the framework reached a "stable"
> state (or becomes "unstable") so I could act upon that.


That's great! We were wondering whether we want a callback approach, but
that can be implemented as a consumer of our monitor.


> What I liked
> about this was that it did not require me to implement "readiness
> checks" for any of my services (though I see that it might be useful to
> provide such custom checks occasionally).
>

What we provide out of the box (ootb) is:
* A check for the framework being up, with details (more meat to it could
come later), i.e. all bundles, including the system bundle, are active.
* Two checks for a configurable list of services and components that one
considers necessary for the application to be ready.
* The root cause analysis for the checks above (why are they not up?)

Regarding implementing custom checks, our view is that (especially for an
enterprise product), there is a separation of roles between people owning a
packaged delivery (the application as it comes out of engineering, whatever
form that takes) and the people owning different deployments. One can have
multiple (types of) deployments, with different configuration. Having the
ability to develop and install custom readiness checks at different layers
with minimal effort, while still having a unified interface for getting the
readiness status and reuse deployment automation tooling, is quite valuable.

Sorry for the convoluted message :)
tl;dr: some applications might need things other than a service being up
in order to be ready, plus a unified interface for the aggregated status and
the ability to contribute "my own checks".
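
As an illustration of "my own checks", here is a small sketch of a custom check
for a condition that is not "a service being up": a costly initialisation that
has to finish first. It is plain Java with no OSGi dependencies; State, Check,
WarmupCheck and markDone are hypothetical names chosen for this example and do
not come from the project.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class WarmupCheckSketch {

    /** Hypothetical states, as in the green/yellow/red wording of this thread. */
    enum State { GREEN, YELLOW, RED }

    /** Hypothetical stand-in for a readiness check service. */
    interface Check {
        String name();
        State state();
    }

    /** Reports YELLOW until the application signals that initialisation is done. */
    static final class WarmupCheck implements Check {
        private final AtomicBoolean done = new AtomicBoolean(false);

        public String name() { return "cache-warmup"; }

        public State state() { return done.get() ? State.GREEN : State.YELLOW; }

        /** Called by the application once its costly initialisation has finished. */
        void markDone() { done.set(true); }
    }

    public static void main(String[] args) {
        WarmupCheck check = new WarmupCheck();
        System.out.println(check.name() + ": " + check.state()); // cache-warmup: YELLOW
        check.markDone();
        System.out.println(check.name() + ": " + check.state()); // cache-warmup: GREEN
    }
}
```

In a real deployment such a check would presumably be registered as a service
so that the monitor can pick it up; that wiring is left out here.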


> That said, I think it would be interesting to have an addon for Felix
> that has minimal dependencies and combines both approaches :)
>

Interesting. I'm not very familiar with the terminology. What do you mean
by addon exactly? Could you provide more details about that?

BTW, thanks everyone for the involvement and support. I'm really excited
about this!

- Andrei


> 1. https://github.com/beinformed/osgitest/tree/master/com.beinformed.framework.osgi.frameworkstate
>
> Regards,
>
>   Jan Willem
>
>

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Jan Willem Janssen <j....@lxtreme.nl>.
Hi,

On Tue, 2018-04-17 at 10:02 +0200, Christian Schneider wrote:
> Dear Felix community,
> 
> during the last weeks Andrei Dulvac and I worked on a small framework
> to
> check if an OSGi based system is fully up.
> 
> Our work originated in testing sling modules and whole sling
> instances. We
> soon found though that the concept is more general than sling and can
> be
> applied to any OSGi based system.
> 
> The system readiness framework has a SystemReadinessMonitor service
> that
> reports the aggregated state of the system. It delegates to
> SystemReadinessCheck services that each check for a certain aspect.
> We
> implemented a first check based on a list of expected top level
> services.
> The system can be customised by adding specific checks for your
> application. For example we plan to add sling specific checks inside
> the
> sling project.
> 
> In addition to simply detecting if the system is ready we also
> created a DS
> based root cause analysis that can be very helpful to detect why a
> set of
> components does not come up as expected.
> 
> We would like to donate this project to the Apache Felix project as
> it
> might get more attention there by people that are not related to
> sling. The
> project is Apache licensed from the start and we already got a basic
> documentation as well as good test coverage.
> 
> We currently host it in this github repository:
> https://github.com/dulvac/system-readiness
> 
> The packages are still mentioning sling but of course we would change
> this
> to felix if this community is interested in the donation.

It is an interesting approach. Determining whether or not the framework
is in a useable state is a rather difficult problem for dynamic
systems. The problem obviously being that when a bundle appears to be
active, it does not necessarily imply that its services are ready for
duty. 

In the past, I've worked with another, somewhat similar, utility (see
[1]) that provides a callback whenever the framework reaches a "stable"
state (or becomes "unstable") so I could act upon that. What I liked
about this was that it did not require me to implement "readiness
checks" for any of my services (though I see that it might be useful to
provide such custom checks occasionally).

That said, I think it would be interesting to have an addon for Felix
that has minimal dependencies and combines both approaches :)

1. https://github.com/beinformed/osgitest/tree/master/com.beinformed.framework.osgi.frameworkstate

Regards,

  Jan Willem


Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Christian Schneider <ch...@die-schneider.net>.
Hi JB,

I did not get this. What do you mean by external dependency?
The code only depends on the OSGi specs and the slf4j API. At runtime it needs
SCR and optionally an HTTP service. In the reference docs you will also find
how to set it up in Apache Karaf.

Best
Christian


2018-04-19 5:50 GMT+02:00 Jean-Baptiste Onofré <jb...@nanthrax.net>:

> Hi Christian
>
> That's interesting. Can you shed some light on the external dependency
> required by the codebase?
>
> Regards
> JB
>
> On 17 Apr 2018 at 12:09, Christian Schneider <
> chris@die-schneider.net> wrote:
> >Dear Felix community,
> >
> >during the last weeks Andrei Dulvac and I worked on a small framework
> >to
> >check if an OSGi based system is fully up.
> >
> >Our work originated in testing sling modules and whole sling instances.
> >We
> >soon found though that the concept is more general than sling and can
> >be
> >applied to any OSGi based system.
> >
> >The system readiness framework has a SystemReadinessMonitor service
> >that
> >reports the aggregated state of the system. It delegates to
> >SystemReadinessCheck services that each check for a certain aspect. We
> >implemented a first check based on a list of expected top level
> >services.
> >The system can be customised by adding specific checks for your
> >application. For example we plan to add sling specific checks inside
> >the
> >sling project.
> >
> >In addition to simply detecting if the system is ready we also created
> >a DS
> >based root cause analysis that can be very helpful to detect why a set
> >of
> >components does not come up as expected.
> >
> >We would like to donate this project to the Apache Felix project as it
> >might get more attention there by people that are not related to sling.
> >The
> >project is Apache licensed from the start and we already got a basic
> >documentation as well as good test coverage.
> >
> >We currently host it in this github repository:
> >https://github.com/dulvac/system-readiness
> >
> >The packages are still mentioning sling but of course we would change
> >this
> >to felix if this community is interested in the donation.
> >
> >Best regards
> >
> >Christian and Andrei
> >
> >
> >--
> >--
> >Christian Schneider
> >http://www.liquid-reality.de
> >
> >Computer Scientist
> >http://www.adobe.com
>



-- 
-- 
Christian Schneider
http://www.liquid-reality.de

Computer Scientist
http://www.adobe.com

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Christian

That's interesting. Can you shed some light on the external dependency required by the codebase?

Regards
JB

On 17 Apr 2018 at 12:09, Christian Schneider <ch...@die-schneider.net> wrote:
>Dear Felix community,
>
>during the last weeks Andrei Dulvac and I worked on a small framework
>to
>check if an OSGi based system is fully up.
>
>Our work originated in testing sling modules and whole sling instances.
>We
>soon found though that the concept is more general than sling and can
>be
>applied to any OSGi based system.
>
>The system readiness framework has a SystemReadinessMonitor service
>that
>reports the aggregated state of the system. It delegates to
>SystemReadinessCheck services that each check for a certain aspect. We
>implemented a first check based on a list of expected top level
>services.
>The system can be customised by adding specific checks for your
>application. For example we plan to add sling specific checks inside
>the
>sling project.
>
>In addition to simply detecting if the system is ready we also created
>a DS
>based root cause analysis that can be very helpful to detect why a set
>of
>components does not come up as expected.
>
>We would like to donate this project to the Apache Felix project as it
>might get more attention there by people that are not related to sling.
>The
>project is Apache licensed from the start and we already got a basic
>documentation as well as good test coverage.
>
>We currently host it in this github repository:
>https://github.com/dulvac/system-readiness
>
>The packages are still mentioning sling but of course we would change
>this
>to felix if this community is interested in the donation.
>
>Best regards
>
>Christian and Andrei
>
>
>-- 
>-- 
>Christian Schneider
>http://www.liquid-reality.de
>
>Computer Scientist
>http://www.adobe.com

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Andrei Dulvac <du...@apache.org>.
Hi Karl.

Thanks for the support!
I'd love to have a role in maintaining it further and get more involved
with the Felix community in general.

- Andrei

On Tue, Apr 17, 2018 at 1:36 PM Karl Pauls <ka...@gmail.com> wrote:

> Hi Christian and Andrei,
>
> as I said out of band already previously, I think this looks
> interesting and I agree that it seems generic enough to be at Felix. I
> assume you would be willing to maintain it going forward (assuming we
> choose to accept it)?
>
> Let's see what others think.
>
> regards,
>
> Karl
>
> On Tue, Apr 17, 2018 at 10:02 AM, Christian Schneider
> <ch...@die-schneider.net> wrote:
> > Dear Felix community,
> >
> > during the last weeks Andrei Dulvac and I worked on a small framework to
> > check if an OSGi based system is fully up.
> >
> > Our work originated in testing sling modules and whole sling instances.
> We
> > soon found though that the concept is more general than sling and can be
> > applied to any OSGi based system.
> >
> > The system readiness framework has a SystemReadinessMonitor service that
> > reports the aggregated state of the system. It delegates to
> > SystemReadinessCheck services that each check for a certain aspect. We
> > implemented a first check based on a list of expected top level services.
> > The system can be customised by adding specific checks for your
> > application. For example we plan to add sling specific checks inside the
> > sling project.
> >
> > In addition to simply detecting if the system is ready we also created a
> DS
> > based root cause analysis that can be very helpful to detect why a set of
> > components does not come up as expected.
> >
> > We would like to donate this project to the Apache Felix project as it
> > might get more attention there by people that are not related to sling.
> The
> > project is Apache licensed from the start and we already got a basic
> > documentation as well as good test coverage.
> >
> > We currently host it in this github repository:
> > https://github.com/dulvac/system-readiness
> >
> > The packages are still mentioning sling but of course we would change
> this
> > to felix if this community is interested in the donation.
> >
> > Best regards
> >
> > Christian and Andrei
> >
> >
> > --
> > --
> > Christian Schneider
> > http://www.liquid-reality.de
> >
> > Computer Scientist
> > http://www.adobe.com
>
>
>
> --
> Karl Pauls
> karlpauls@gmail.com
>

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Karl Pauls <ka...@gmail.com>.
Hi Christian and Andrei,

As I already said out of band, I think this looks
interesting and I agree that it seems generic enough to be at Felix. I
assume you would be willing to maintain it going forward (assuming we
choose to accept it)?

Let's see what others think.

regards,

Karl

On Tue, Apr 17, 2018 at 10:02 AM, Christian Schneider
<ch...@die-schneider.net> wrote:
> Dear Felix community,
>
> during the last weeks Andrei Dulvac and I worked on a small framework to
> check if an OSGi based system is fully up.
>
> Our work originated in testing sling modules and whole sling instances. We
> soon found though that the concept is more general than sling and can be
> applied to any OSGi based system.
>
> The system readiness framework has a SystemReadinessMonitor service that
> reports the aggregated state of the system. It delegates to
> SystemReadinessCheck services that each check for a certain aspect. We
> implemented a first check based on a list of expected top level services.
> The system can be customised by adding specific checks for your
> application. For example we plan to add sling specific checks inside the
> sling project.
>
> In addition to simply detecting if the system is ready we also created a DS
> based root cause analysis that can be very helpful to detect why a set of
> components does not come up as expected.
>
> We would like to donate this project to the Apache Felix project as it
> might get more attention there by people that are not related to sling. The
> project is Apache licensed from the start and we already got a basic
> documentation as well as good test coverage.
>
> We currently host it in this github repository:
> https://github.com/dulvac/system-readiness
>
> The packages are still mentioning sling but of course we would change this
> to felix if this community is interested in the donation.
>
> Best regards
>
> Christian and Andrei
>
>
> --
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Computer Scientist
> http://www.adobe.com



-- 
Karl Pauls
karlpauls@gmail.com

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Christian Schneider <ch...@die-schneider.net>.
Good point :-)

Christian

2018-04-17 16:51 GMT+02:00 Richard S. Hall <he...@ungoverned.org>:

> The question to answer is whether it would be worthwhile "as is" to be
> donated to Apache Felix as a starting point? If so, figuring out ways to
> improve it can happen after the fact. If not, then there isn't much else to
> discuss.
>
> -> richard
>
>
> On 4/17/18 08:53 , Neil Bartlett wrote:
>
>> I like the general idea but, like Guillaume, I feel maybe this should be
>> implemented at a lower level. The core `SystemReadinessMonitorImpl` and
>> the
>> rootcause command are implemented as DS components and configured via
>> Config Admin... but what if SCR and/or ConfigAdmin are unavailable or not
>> working?
>>
>> I'm also not sure about the way in which checks are defined and extended.
>> Only the application knows what "should" be started, but this can be
>> defined at the application level using a DS component that has
>> dependencies
>> on the necessary services, config etc. That DS component would provide a
>> SystemReady service when it has decided the system is ready.
>>
>> Thus I think your framework can be boiled down to something simpler:
>>
>> * An exported SystemReady service interface, which should appear within a
>> configurable timeout;
>> * The root cause analysis tool, which is something I have always wanted to
>> have and I hope your implementation works as well as described!
>>
>> Regards,
>> Neil
>>
>> On Tue, Apr 17, 2018 at 1:37 PM, Andrei Dulvac <du...@apache.org> wrote:
>>
>> Hi Guillaume.
>>>
>>> Thanks!
>>>
>>> There's the OOTB ServicesCheck check that can be configured with a list
>>> of
>>> services [0].
>>> We were thinking we could add the mandatory checks there through
>>> configuration.
>>>
>>> The fact that the system can initially be green because no checks are
>>> present is a VERY valid point.
>>> We try to mitigate that with the ServicesCheck and by making sure the
>>> monitor waits for the framework to be up before reporting anything other
>>> than YELLOW.
>>>
>>> Hope I got the question and suggestion right :D
>>>
>>> - Andrei
>>>
>>>
>>> ---
>>> [0]
>>> https://github.com/dulvac/system-readiness/blob/master/src/main/java/org/apache/sling/systemreadiness/core/impl/ServicesCheck.java#L59
>>>
>>> On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gn...@apache.org>
>>> wrote:
>>>
>>> I like it a lot, the API is simple and extensible enough.  Really nice
>>>>
>>> work
>>>
>>>> !
>>>> I'm just a bit nervous about having such a low-level component depend on
>>>>
>>> an
>>>
>>>> external extender...
>>>>
>>>> I think it's missing one bit though: some kind of expectations. I.e. it
>>>> checks existing stuff, but it does not cover missing stuff.  I suppose
>>>> it
>>>> could be implemented specifically using custom checks, but I think there
>>>>
>>> is
>>>
>>>> still a hole, which is the fact that those custom checks are not
>>>> available.  So I wonder if there should be an additional built-in check
>>>> that would grab a configuration entry, turn that into a list of
>>>> mandatory
>>>> checks and be green if all those check are available, yellow/red
>>>> otherwise.  This would ensure your container does not switch between
>>>> green/yellow while the container is booting/provisioning.
>>>>
>>>> 2018-04-17 10:02 GMT+02:00 Christian Schneider <chris@die-schneider.net
>>>> :
>>>>
>>>> Dear Felix community,
>>>>>
>>>>> during the last weeks Andrei Dulvac and I worked on a small framework
>>>>>
>>>> to
>>>
>>>> check if an OSGi based system is fully up.
>>>>>
>>>>> Our work originated in testing sling modules and whole sling instances.
>>>>>
>>>> We
>>>>
>>>>> soon found though that the concept is more general than sling and can
>>>>>
>>>> be
>>>
>>>> applied to any OSGi based system.
>>>>>
>>>>> The system readiness framework has a SystemReadinessMonitor service
>>>>>
>>>> that
>>>
>>>> reports the aggregated state of the system. It delegates to
>>>>> SystemReadinessCheck services that each check for a certain aspect. We
>>>>> implemented a first check based on a list of expected top level
>>>>>
>>>> services.
>>>
>>>> The system can be customised by adding specific checks for your
>>>>> application. For example we plan to add sling specific checks inside
>>>>>
>>>> the
>>>
>>>> sling project.
>>>>>
>>>>> In addition to simply detecting if the system is ready we also created
>>>>>
>>>> a
>>>
>>>> DS
>>>>
>>>>> based root cause analysis that can be very helpful to detect why a set
>>>>>
>>>> of
>>>
>>>> components does not come up as expected.
>>>>>
>>>>> We would like to donate this project to the Apache Felix project as it
>>>>> might get more attention there by people that are not related to sling.
>>>>>
>>>> The
>>>>
>>>>> project is Apache licensed from the start and we already got a basic
>>>>> documentation as well as good test coverage.
>>>>>
>>>>> We currently host it in this github repository:
>>>>> https://github.com/dulvac/system-readiness
>>>>>
>>>>> The packages are still mentioning sling but of course we would change
>>>>>
>>>> this
>>>>
>>>>> to felix if this community is interested in the donation.
>>>>>
>>>>> Best regards
>>>>>
>>>>> Christian and Andrei
>>>>>
>>>>>
>>>>> --
>>>>> --
>>>>> Christian Schneider
>>>>> http://www.liquid-reality.de
>>>>>
>>>>> Computer Scientist
>>>>> http://www.adobe.com
>>>>>
>>>>>
>>>>
>>>> --
>>>> ------------------------
>>>> Guillaume Nodet
>>>>
>>>>
>


-- 
-- 
Christian Schneider
http://www.liquid-reality.de

Computer Scientist
http://www.adobe.com

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by "Richard S. Hall" <he...@ungoverned.org>.
The question to answer is whether it would be worthwhile "as is" to be 
donated to Apache Felix as a starting point? If so, figuring out ways to 
improve it can happen after the fact. If not, then there isn't much else 
to discuss.

-> richard

On 4/17/18 08:53 , Neil Bartlett wrote:
> I like the general idea but, like Guillaume, I feel maybe this should be
> implemented at a lower level. The core `SystemReadinessMonitorImpl` and the
> rootcause command are implemented as DS components and configured via
> Config Admin... but what if SCR and/or ConfigAdmin are unavailable or not
> working?
>
> I'm also not sure about the way in which checks are defined and extended.
> Only the application knows what "should" be started, but this can be
> defined at the application level using a DS component that has dependencies
> on the necessary services, config etc. That DS component would provide a
> SystemReady service when it has decided the system is ready.
>
> Thus I think your framework can be boiled down to something simpler:
>
> * An exported SystemReady service interface, which should appear within a
> configurable timeout;
> * The root cause analysis tool, which is something I have always wanted to
> have and I hope your implementation works as well as described!
>
> Regards,
> Neil
>
> On Tue, Apr 17, 2018 at 1:37 PM, Andrei Dulvac <du...@apache.org> wrote:
>
>> Hi Guillaume.
>>
>> Thanks!
>>
>> There's the OOTB ServicesCheck check that can be configured with a list of
>> services [0].
>> We were thinking we could add the mandatory checks there through
>> configuration.
>>
>> The fact that the system can initially be green because no checks are present
>> is a VERY valid point.
>> We try to mitigate that with the ServicesCheck and by making sure the
>> monitor waits for the framework to be up before reporting anything other
>> than YELLOW.
>>
>> Hope I got the question and suggestion right :D
>>
>> - Andrei
>>
>>
>> ---
>> [0]
>> https://github.com/dulvac/system-readiness/blob/master/src/main/java/org/apache/sling/systemreadiness/core/impl/ServicesCheck.java#L59
>>
>> On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gn...@apache.org> wrote:
>>
>>> I like it a lot, the API is simple and extensible enough.  Really nice
>> work
>>> !
>>> I'm just a bit nervous about having such a low-level component depend on
>> an
>>> external extender...
>>>
>>> I think it's missing one bit though: some kind of expectations. I.e. it
>>> checks existing stuff, but it does not cover missing stuff.  I suppose it
>>> could be implemented specifically using custom checks, but I think there
>> is
>>> still a hole, which is the fact that those custom checks are not
>>> available.  So I wonder if there should be an additional built-in check
>>> that would grab a configuration entry, turn that into a list of mandatory
>>> checks and be green if all those check are available, yellow/red
>>> otherwise.  This would ensure your container does not switch between
>>> green/yellow while the container is booting/provisioning.
>>>
>>> 2018-04-17 10:02 GMT+02:00 Christian Schneider <chris@die-schneider.net
>>> :
>>>
>>>> Dear Felix community,
>>>>
>>>> during the last weeks Andrei Dulvac and I worked on a small framework
>> to
>>>> check if an OSGi based system is fully up.
>>>>
>>>> Our work originated in testing sling modules and whole sling instances.
>>> We
>>>> soon found though that the concept is more general than sling and can
>> be
>>>> applied to any OSGi based system.
>>>>
>>>> The system readiness framework has a SystemReadinessMonitor service
>> that
>>>> reports the aggregated state of the system. It delegates to
>>>> SystemReadinessCheck services that each check for a certain aspect. We
>>>> implemented a first check based on a list of expected top level
>> services.
>>>> The system can be customised by adding specific checks for your
>>>> application. For example we plan to add sling specific checks inside
>> the
>>>> sling project.
>>>>
>>>> In addition to simply detecting if the system is ready we also created
>> a
>>> DS
>>>> based root cause analysis that can be very helpful to detect why a set
>> of
>>>> components does not come up as expected.
>>>>
>>>> We would like to donate this project to the Apache Felix project as it
>>>> might get more attention there by people that are not related to sling.
>>> The
>>>> project is Apache licensed from the start and we already got a basic
>>>> documentation as well as good test coverage.
>>>>
>>>> We currently host it in this github repository:
>>>> https://github.com/dulvac/system-readiness
>>>>
>>>> The packages are still mentioning sling but of course we would change
>>> this
>>>> to felix if this community is interested in the donation.
>>>>
>>>> Best regards
>>>>
>>>> Christian and Andrei
>>>>
>>>>
>>>> --
>>>> --
>>>> Christian Schneider
>>>> http://www.liquid-reality.de
>>>>
>>>> Computer Scientist
>>>> http://www.adobe.com
>>>>
>>>
>>>
>>> --
>>> ------------------------
>>> Guillaume Nodet
>>>


Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Neil Bartlett <nj...@gmail.com>.
I agree with Richard's point about improving after contribution, so that
means +1 from me for the contribution of this project as it stands.

Regarding your points in relation to AEM, I think the overall concept can
be split into the following three concerns that can be decoupled:

1. A part that checks health of the system and reports true or false. Your
implementation would be a really flexible component driven by
configuration. In a simpler application the implementation would be just a
DS component with some mandatory service references.
2. A part that waits for the system to be reported as healthy, or shuts
down (with diagnostics) when not.
3. Root cause analysis, callable manually as a command, and by (2).
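
Part 2 of the list above could be sketched roughly as follows, in plain Java
with no OSGi dependencies. Everything here is an assumption made for
illustration: awaitReady, its parameters and the polling approach are
hypothetical, the health test is passed in as a BooleanSupplier standing in for
part 1, and the diagnostics callback stands in for the root cause analysis of
part 3. Shutting the system down after a timeout is left to the caller.

```java
import java.util.function.BooleanSupplier;

public class ReadyWaiterSketch {

    /**
     * Polls the health test until it reports true or the timeout expires.
     * On timeout or interruption the diagnostics callback runs and false is returned.
     */
    static boolean awaitReady(BooleanSupplier healthy, long timeoutMillis,
                              long pollMillis, Runnable diagnostics) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!healthy.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                diagnostics.run();
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                diagnostics.run();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        boolean ready = awaitReady(
                () -> System.currentTimeMillis() - start >= 50, // healthy after about 50 ms
                1000, 10,
                () -> System.out.println("not ready: run root cause analysis here"));
        System.out.println("ready = " + ready);
    }
}
```

A callback-driven variant, as mentioned earlier in the thread, would avoid the
polling loop; polling is used here only to keep the sketch short.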

Neil


On Tue, Apr 17, 2018 at 3:29 PM, Christian Schneider <
chris@die-schneider.net> wrote:

> The problem we face in our environment (AEM) is that the system is highly
> configurable.
> So the checks cannot be defined statically for all AEM-based systems. This
> is why we came up with the
> check services that can each solve a part of the problem and then be
> combined to show the aggregated state.
>
> For a single-purpose application I agree with you that you can implement an
> application-specific check that covers all aspects of the application's
> readiness.
>
> The root cause analysis is something we could split off at some point and
> implement in its own bundle. I think it does not yet cover all aspects, but
> I am pretty sure we can evolve it into a good tool that really helps
> developers. I already implemented it in its own package with no deps on the
> other packages, so it can be easily split off.
>
> Christian
>
>
> 2018-04-17 14:53 GMT+02:00 Neil Bartlett <nj...@gmail.com>:
>
> > I like the general idea but, like Guillaume, I feel maybe this should be
> > implemented at a lower level. The core `SystemReadinessMonitorImpl` and
> the
> > rootcause command are implemented as DS components and configured via
> > Config Admin... but what if SCR and/or ConfigAdmin are unavailable or not
> > working?
> >
> > I'm also not sure about the way in which checks are defined and extended.
> > Only the application knows what "should" be started, but this can be
> > defined at the application level using a DS component that has
> dependencies
> > on the necessary services, config etc. That DS component would provide a
> > SystemReady service when it has decided the system is ready.
> >
> > Thus I think your framework can be boiled down to something simpler:
> >
> > * An exported SystemReady service interface, which should appear within a
> > configurable timeout;
> > * The root cause analysis tool, which is something I have always wanted
> to
> > have and I hope your implementation works as well as described!
> >
> > Regards,
> > Neil
> >
> > On Tue, Apr 17, 2018 at 1:37 PM, Andrei Dulvac <du...@apache.org>
> wrote:
> >
> > > Hi Guillaume.
> > >
> > > Thanks!
> > >
> > > There's the OOTB ServicesCheck check that can be configured with a list
> > of
> > > services [0].
> > > We were thinking we could add the mandatory checks there through
> > > configuration.
> > >
> > > The fact that the system can initially be green because no checks are
> > > present is a VERY valid point.
> > > We try to mitigate that with the ServicesCheck and by making sure the
> > > monitor waits for the framework to be up before reporting anything
> > > other than YELLOW.
> > >
> > > Hope I got the question and suggestion right :D
> > >
> > > - Andrei
> > >
> > >
> > > ---
> > > [0]
> > > https://github.com/dulvac/system-readiness/blob/master/src/main/java/org/apache/sling/systemreadiness/core/impl/ServicesCheck.java#L59
> > >
> > > On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gn...@apache.org>
> > wrote:
> > >
> > > > I like it a lot, the API is simple and extensible enough.  Really
> nice
> > > work
> > > > !
> > > > I'm just a bit nervous about having such a low-level component depend
> > on
> > > an
> > > > external extender...
> > > >
> > > > I think it's missing one bit though: some kind of expectations. I.e.
> it
> > > > checks existing stuff, but it does not cover missing stuff.  I
> suppose
> > it
> > > > could be implemented specifically using custom checks, but I think
> > there
> > > is
> > > > still a hole, which is the fact that those custom checks are not
> > > > available.  So I wonder if there should be an additional built-in
> check
> > > > that would grab a configuration entry, turn that into a list of
> > mandatory
> > > > checks and be green if all those check are available, yellow/red
> > > > otherwise.  This would ensure your container does not switch between
> > > > green/yellow while the container is booting/provisioning.
> > > >
> > > > 2018-04-17 10:02 GMT+02:00 Christian Schneider <
> > chris@die-schneider.net
> > > >:
> > > >
> > > > > Dear Felix community,
> > > > >
> > > > > during the last weeks Andrei Dulvac and I worked on a small
> framework
> > > to
> > > > > check if an OSGi based system is fully up.
> > > > >
> > > > > Our work originated in testing sling modules and whole sling
> > instances.
> > > > We
> > > > > soon found though that the concept is more general than sling and
> can
> > > be
> > > > > applied to any OSGi based system.
> > > > >
> > > > > The system readiness framework has a SystemReadinessMonitor service
> > > that
> > > > > reports the aggregated state of the system. It delegates to
> > > > > SystemReadinessCheck services that each check for a certain aspect.
> > We
> > > > > implemented a first check based on a list of expected top level
> > > services.
> > > > > The system can be customised by adding specific checks for your
> > > > > application. For example we plan to add sling specific checks
> inside
> > > the
> > > > > sling project.
> > > > >
> > > > > In addition to simply detecting if the system is ready we also
> > created
> > > a
> > > > DS
> > > > > based root cause analysis that can be very helpful to detect why a
> > set
> > > of
> > > > > components does not come up as expected.
> > > > >
> > > > > We would like to donate this project to the Apache Felix project as
> > it
> > > > > might get more attention there by people that are not related to
> > sling.
> > > > The
> > > > > project is Apache licensed from the start and we already got a
> basic
> > > > > documentation as well as good test coverage.
> > > > >
> > > > > We currently host it in this github repository:
> > > > > https://github.com/dulvac/system-readiness
> > > > >
> > > > > The packages are still mentioning sling but of course we would
> change
> > > > this
> > > > > to felix if this community is interested in the donation.
> > > > >
> > > > > Best regards
> > > > >
> > > > > Christian and Andrei
> > > > >
> > > > >
> > > > > --
> > > > > --
> > > > > Christian Schneider
> > > > > http://www.liquid-reality.de
> > > > >
> > > > > Computer Scientist
> > > > > http://www.adobe.com
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > ------------------------
> > > > Guillaume Nodet
> > > >
> > >
> >
>
>
>
> --
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Computer Scientist
> http://www.adobe.com
>

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Christian Schneider <ch...@die-schneider.net>.
The problem we face in our environment (AEM) is that the system is highly
configurable, so the checks cannot be defined statically for all AEM based
systems. This is why we came up with the check services: each one solves a
part of the problem, and they are then combined to show the aggregated state.
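
As a rough illustration of that aggregation, here is a minimal plain-Java
sketch with no OSGi dependencies. The names AggregateStateSketch, Check and
State are hypothetical and are not taken from the project; the rule "the
aggregate is the worst state of any check" is also an assumption.

```java
import java.util.List;

public class AggregateStateSketch {
    // Hypothetical states, ordered from best to worst.
    public enum State { GREEN, YELLOW, RED }

    // Hypothetical stand-in for a check service; not the project's interface.
    public interface Check {
        String name();
        State state();
    }

    // Assumed rule: the aggregated state is the worst state of any check.
    public static State aggregate(List<Check> checks) {
        State worst = State.GREEN;
        for (Check check : checks) {
            State current = check.state();
            if (current.compareTo(worst) > 0) {
                worst = current;
            }
        }
        return worst;
    }

    // Small helper to build a check with a fixed state.
    public static Check check(String name, State state) {
        return new Check() {
            @Override public String name() { return name; }
            @Override public State state() { return state; }
        };
    }

    public static void main(String[] args) {
        List<Check> checks = List.of(
            check("services", State.GREEN),
            check("components", State.YELLOW));
        System.out.println(aggregate(checks)); // prints YELLOW
    }
}
```

Note that an empty list aggregates to GREEN here, which is the hole
Guillaume describes: with no checks registered yet, the system looks ready.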

For a single-purpose application I agree with you that you can implement an
application-specific check that covers all aspects of the application's
readiness.

The root cause analysis is something we could split off at some point and
implement in its own bundle. I think it does not yet cover all aspects, but
I am pretty sure we can evolve it into a good tool that really helps
developers. I already implemented it in its own package with no dependencies
on the other packages, so it can easily be split off.

Christian


2018-04-17 14:53 GMT+02:00 Neil Bartlett <nj...@gmail.com>:

> I like the general idea but, like Guillaume, I feel maybe this should be
> implemented at a lower level. The core `SystemReadinessMonitorImpl` and the
> rootcause command are implemented as DS components and configured via
> Config Admin... but what if SCR and/or ConfigAdmin are unavailable or not
> working?
>
> I'm also not sure about the way in which checks are defined and extended.
> Only the application knows what "should" be started, but this can be
> defined at the application level using a DS component that has dependencies
> on the necessary services, config etc. That DS component would provide a
> SystemReady service when it has decided the system is ready.
>
> Thus I think your framework can be boiled down to something simpler:
>
> * An exported SystemReady service interface, which should appear within a
> configurable timeout;
> * The root cause analysis tool, which is something I have always wanted to
> have and I hope your implementation works as well as described!
>
> Regards,
> Neil
>
> On Tue, Apr 17, 2018 at 1:37 PM, Andrei Dulvac <du...@apache.org> wrote:
>
> > Hi Guillaume.
> >
> > Thanks!
> >
> > There's the OOTB ServicesCheck check that can be configured with a list
> of
> > services [0].
> > We were thinking we could add the mandatory checks there through
> > configuration.
> >
> > The fact that the system can initially green, because no checks are
> present
> > is VERY valid.
> > We try to mediate that with the ServicesCheck and by making sure the
> > monitor waits for the framework to be up before reporting anything other
> > than YELLOW.
> >
> > Hope I got the question and suggestion right :D
> >
> > - Andrei
> >
> >
> > ---
> > [0]
> > https://github.com/dulvac/system-readiness/blob/master/
> > src/main/java/org/apache/sling/systemreadiness/core/
> > impl/ServicesCheck.java#L59
> >
> > On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gn...@apache.org>
> wrote:
> >
> > > I like it a lot, the API is simple and extensible enough.  Really nice
> > work
> > > !
> > > I'm just a bit nervous about having such a low-level component depend
> on
> > an
> > > external extender...
> > >
> > > I think it's missing one bit though: some kind of expectations. I.e. it
> > > checks existing stuff, but it does not cover missing stuff.  I suppose
> it
> > > could be implemented specifically using custom checks, but I think
> there
> > is
> > > still a hole, which is the fact that those custom checks are not
> > > available.  So I wonder if there should be an additional built-in check
> > > that would grab a configuration entry, turn that into a list of
> mandatory
> > > checks and be green if all those check are available, yellow/red
> > > otherwise.  This would ensure your container does not switch between
> > > green/yellow while the container is booting/provisioning.
> > >
> > > 2018-04-17 10:02 GMT+02:00 Christian Schneider <
> chris@die-schneider.net
> > >:
> > >
> > > > Dear Felix community,
> > > >
> > > > during the last weeks Andrei Dulvac and I worked on a small framework
> > to
> > > > check if an OSGi based system is fully up.
> > > >
> > > > Our work originated in testing sling modules and whole sling
> instances.
> > > We
> > > > soon found though that the concept is more general than sling and can
> > be
> > > > applied to any OSGi based system.
> > > >
> > > > The system readiness framework has a SystemReadinessMonitor service
> > that
> > > > reports the aggregated state of the system. It delegates to
> > > > SystemReadinessCheck services that each check for a certain aspect.
> We
> > > > implemented a first check based on a list of expected top level
> > services.
> > > > The system can be customised by adding specific checks for your
> > > > application. For example we plan to add sling specific checks inside
> > the
> > > > sling project.
> > > >
> > > > In addition to simply detecting if the system is ready we also
> created
> > a
> > > DS
> > > > based root cause analysis that can be very helpful to detect why a
> set
> > of
> > > > components does not come up as expected.
> > > >
> > > > We would like to donate this project to the Apache Felix project as
> it
> > > > might get more attention there by people that are not related to
> sling.
> > > The
> > > > project is Apache licensed from the start and we already got a basic
> > > > documentation as well as good test coverage.
> > > >
> > > > We currently host it in this github repository:
> > > > https://github.com/dulvac/system-readiness
> > > >
> > > > The packages are still mentioning sling but of course we would change
> > > this
> > > > to felix if this community is interested in the donation.
> > > >
> > > > Best regards
> > > >
> > > > Christian and Andrei
> > > >
> > > >
> > > > --
> > > > --
> > > > Christian Schneider
> > > > http://www.liquid-reality.de
> > > >
> > > > Computer Scientist
> > > > http://www.adobe.com
> > > >
> > >
> > >
> > >
> > > --
> > > ------------------------
> > > Guillaume Nodet
> > >
> >
>



-- 
-- 
Christian Schneider
http://www.liquid-reality.de

Computer Scientist
http://www.adobe.com

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Neil Bartlett <nj...@gmail.com>.
I like the general idea but, like Guillaume, I feel maybe this should be
implemented at a lower level. The core `SystemReadinessMonitorImpl` and the
rootcause command are implemented as DS components and configured via
Config Admin... but what if SCR and/or ConfigAdmin are unavailable or not
working?

I'm also not sure about the way in which checks are defined and extended.
Only the application knows what "should" be started, but this can be
defined at the application level using a DS component that has dependencies
on the necessary services, config etc. That DS component would provide a
SystemReady service when it has decided the system is ready.

Thus I think your framework can be boiled down to something simpler:

* An exported SystemReady service interface, which should appear within a
configurable timeout;
* The root cause analysis tool, which is something I have always wanted to
have and I hope your implementation works as well as described!
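
The "ready signal within a configurable timeout" idea can be sketched in
plain Java as follows. This is only an illustration under assumptions: in an
OSGi runtime the signal would be the appearance of a service, whereas here it
is a simple latch, and the class and method names (SystemReadySketch,
markReady, awaitReady) are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class SystemReadySketch {
    // Stands in for the appearance of a "system ready" service.
    private final CountDownLatch ready = new CountDownLatch(1);

    // Called by the application once it considers itself ready.
    public void markReady() {
        ready.countDown();
    }

    // Returns true if the ready signal arrived within the timeout.
    public boolean awaitReady(long timeout, TimeUnit unit) {
        try {
            return ready.await(timeout, unit);
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and report "not ready".
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        SystemReadySketch system = new SystemReadySketch();
        // Simulate the application signalling readiness from another thread.
        new Thread(system::markReady).start();
        System.out.println(system.awaitReady(5, TimeUnit.SECONDS));
    }
}
```

A caller that gets false back after the timeout would be the natural place
to trigger the root cause analysis.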

Regards,
Neil

On Tue, Apr 17, 2018 at 1:37 PM, Andrei Dulvac <du...@apache.org> wrote:

> Hi Guillaume.
>
> Thanks!
>
> There's the OOTB ServicesCheck check that can be configured with a list of
> services [0].
> We were thinking we could add the mandatory checks there through
> configuration.
>
> The fact that the system can initially green, because no checks are present
> is VERY valid.
> We try to mediate that with the ServicesCheck and by making sure the
> monitor waits for the framework to be up before reporting anything other
> than YELLOW.
>
> Hope I got the question and suggestion right :D
>
> - Andrei
>
>
> ---
> [0]
> https://github.com/dulvac/system-readiness/blob/master/
> src/main/java/org/apache/sling/systemreadiness/core/
> impl/ServicesCheck.java#L59
>
> On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gn...@apache.org> wrote:
>
> > I like it a lot, the API is simple and extensible enough.  Really nice
> work
> > !
> > I'm just a bit nervous about having such a low-level component depend on
> an
> > external extender...
> >
> > I think it's missing one bit though: some kind of expectations. I.e. it
> > checks existing stuff, but it does not cover missing stuff.  I suppose it
> > could be implemented specifically using custom checks, but I think there
> is
> > still a hole, which is the fact that those custom checks are not
> > available.  So I wonder if there should be an additional built-in check
> > that would grab a configuration entry, turn that into a list of mandatory
> > checks and be green if all those check are available, yellow/red
> > otherwise.  This would ensure your container does not switch between
> > green/yellow while the container is booting/provisioning.
> >
> > 2018-04-17 10:02 GMT+02:00 Christian Schneider <chris@die-schneider.net
> >:
> >
> > > Dear Felix community,
> > >
> > > during the last weeks Andrei Dulvac and I worked on a small framework
> to
> > > check if an OSGi based system is fully up.
> > >
> > > Our work originated in testing sling modules and whole sling instances.
> > We
> > > soon found though that the concept is more general than sling and can
> be
> > > applied to any OSGi based system.
> > >
> > > The system readiness framework has a SystemReadinessMonitor service
> that
> > > reports the aggregated state of the system. It delegates to
> > > SystemReadinessCheck services that each check for a certain aspect. We
> > > implemented a first check based on a list of expected top level
> services.
> > > The system can be customised by adding specific checks for your
> > > application. For example we plan to add sling specific checks inside
> the
> > > sling project.
> > >
> > > In addition to simply detecting if the system is ready we also created
> a
> > DS
> > > based root cause analysis that can be very helpful to detect why a set
> of
> > > components does not come up as expected.
> > >
> > > We would like to donate this project to the Apache Felix project as it
> > > might get more attention there by people that are not related to sling.
> > The
> > > project is Apache licensed from the start and we already got a basic
> > > documentation as well as good test coverage.
> > >
> > > We currently host it in this github repository:
> > > https://github.com/dulvac/system-readiness
> > >
> > > The packages are still mentioning sling but of course we would change
> > this
> > > to felix if this community is interested in the donation.
> > >
> > > Best regards
> > >
> > > Christian and Andrei
> > >
> > >
> > > --
> > > --
> > > Christian Schneider
> > > http://www.liquid-reality.de
> > >
> > > Computer Scientist
> > > http://www.adobe.com
> > >
> >
> >
> >
> > --
> > ------------------------
> > Guillaume Nodet
> >
>

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Andrei Dulvac <du...@apache.org>.
Hi Guillaume.

Thanks!

There's the OOTB ServicesCheck check that can be configured with a list of
services [0].
We were thinking we could add the mandatory checks there through
configuration.

The point that the system can initially be green, because no checks are
present, is VERY valid.
We try to mitigate that with the ServicesCheck and by making sure the
monitor waits for the framework to be up before reporting anything other
than YELLOW.
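
The gating behaviour described above could look roughly like this in plain
Java. It is a sketch under assumptions, not the project's code: the names
StartupGateSketch, onFrameworkStarted and report are hypothetical, and in an
OSGi runtime the started signal would presumably come from a framework
event rather than a direct method call.

```java
public class StartupGateSketch {
    public enum State { GREEN, YELLOW, RED }

    // Flipped once the framework has finished starting.
    private volatile boolean started;

    // Hypothetical hook, to be called when the framework is up.
    public void onFrameworkStarted() {
        started = true;
    }

    // Before startup completes, always report YELLOW, whatever the checks say.
    public State report(State aggregated) {
        return started ? aggregated : State.YELLOW;
    }

    public static void main(String[] args) {
        StartupGateSketch gate = new StartupGateSketch();
        System.out.println(gate.report(State.GREEN)); // prints YELLOW
        gate.onFrameworkStarted();
        System.out.println(gate.report(State.GREEN)); // prints GREEN
    }
}
```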

Hope I got the question and suggestion right :D

- Andrei


---
[0]
https://github.com/dulvac/system-readiness/blob/master/src/main/java/org/apache/sling/systemreadiness/core/impl/ServicesCheck.java#L59

On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gn...@apache.org> wrote:

> I like it a lot, the API is simple and extensible enough.  Really nice work
> !
> I'm just a bit nervous about having such a low-level component depend on an
> external extender...
>
> I think it's missing one bit though: some kind of expectations. I.e. it
> checks existing stuff, but it does not cover missing stuff.  I suppose it
> could be implemented specifically using custom checks, but I think there is
> still a hole, which is the fact that those custom checks are not
> available.  So I wonder if there should be an additional built-in check
> that would grab a configuration entry, turn that into a list of mandatory
> checks and be green if all those check are available, yellow/red
> otherwise.  This would ensure your container does not switch between
> green/yellow while the container is booting/provisioning.
>
> 2018-04-17 10:02 GMT+02:00 Christian Schneider <ch...@die-schneider.net>:
>
> > Dear Felix community,
> >
> > during the last weeks Andrei Dulvac and I worked on a small framework to
> > check if an OSGi based system is fully up.
> >
> > Our work originated in testing sling modules and whole sling instances.
> We
> > soon found though that the concept is more general than sling and can be
> > applied to any OSGi based system.
> >
> > The system readiness framework has a SystemReadinessMonitor service that
> > reports the aggregated state of the system. It delegates to
> > SystemReadinessCheck services that each check for a certain aspect. We
> > implemented a first check based on a list of expected top level services.
> > The system can be customised by adding specific checks for your
> > application. For example we plan to add sling specific checks inside the
> > sling project.
> >
> > In addition to simply detecting if the system is ready we also created a
> DS
> > based root cause analysis that can be very helpful to detect why a set of
> > components does not come up as expected.
> >
> > We would like to donate this project to the Apache Felix project as it
> > might get more attention there by people that are not related to sling.
> The
> > project is Apache licensed from the start and we already got a basic
> > documentation as well as good test coverage.
> >
> > We currently host it in this github repository:
> > https://github.com/dulvac/system-readiness
> >
> > The packages are still mentioning sling but of course we would change
> this
> > to felix if this community is interested in the donation.
> >
> > Best regards
> >
> > Christian and Andrei
> >
> >
> > --
> > --
> > Christian Schneider
> > http://www.liquid-reality.de
> >
> > Computer Scientist
> > http://www.adobe.com
> >
>
>
>
> --
> ------------------------
> Guillaume Nodet
>

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Christian Schneider <ch...@die-schneider.net>.
Hi Guillaume,

it can indeed be a problem when checks arrive late as the ready state of
the system can then be green too early.
So one approach is to configure a list of expected checks. This is easy to
implement but requires manual config.
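
A minimal plain-Java sketch of the configured-list approach is shown here.
The names (ExpectedChecksSketch, evaluate) are hypothetical, and treating a
missing mandatory check as YELLOW rather than RED is an assumption made for
the example.

```java
import java.util.Map;
import java.util.Set;

public class ExpectedChecksSketch {
    // Ordered from best to worst.
    public enum State { GREEN, YELLOW, RED }

    // expected: names of mandatory checks, e.g. read from configuration.
    // reported: states of the checks that are currently registered.
    public static State evaluate(Set<String> expected, Map<String, State> reported) {
        State worst = State.GREEN;
        // A mandatory check that has not arrived yet counts as YELLOW.
        for (String name : expected) {
            if (!reported.containsKey(name) && State.YELLOW.compareTo(worst) > 0) {
                worst = State.YELLOW;
            }
        }
        // Every registered check contributes its own state.
        for (State state : reported.values()) {
            if (state.compareTo(worst) > 0) {
                worst = state;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        Set<String> expected = Set.of("services", "components");
        Map<String, State> reported = Map.of("services", State.GREEN);
        System.out.println(evaluate(expected, reported)); // prints YELLOW
    }
}
```

With this rule the system cannot turn green while a configured check is
still missing, at the cost of keeping the list in sync by hand.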

Another approach I thought about is to use the ServiceComponentRuntime to
find out which check components are present and wait until they come up.
This only works for checks implemented in DS but might be quite convenient.

Christian



2018-04-17 14:16 GMT+02:00 Guillaume Nodet <gn...@apache.org>:

> I like it a lot, the API is simple and extensible enough.  Really nice work
> !
> I'm just a bit nervous about having such a low-level component depend on an
> external extender...
>
> I think it's missing one bit though: some kind of expectations. I.e. it
> checks existing stuff, but it does not cover missing stuff.  I suppose it
> could be implemented specifically using custom checks, but I think there is
> still a hole, which is the fact that those custom checks are not
> available.  So I wonder if there should be an additional built-in check
> that would grab a configuration entry, turn that into a list of mandatory
> checks and be green if all those check are available, yellow/red
> otherwise.  This would ensure your container does not switch between
> green/yellow while the container is booting/provisioning.
>
> 2018-04-17 10:02 GMT+02:00 Christian Schneider <ch...@die-schneider.net>:
>
> > Dear Felix community,
> >
> > during the last weeks Andrei Dulvac and I worked on a small framework to
> > check if an OSGi based system is fully up.
> >
> > Our work originated in testing sling modules and whole sling instances.
> We
> > soon found though that the concept is more general than sling and can be
> > applied to any OSGi based system.
> >
> > The system readiness framework has a SystemReadinessMonitor service that
> > reports the aggregated state of the system. It delegates to
> > SystemReadinessCheck services that each check for a certain aspect. We
> > implemented a first check based on a list of expected top level services.
> > The system can be customised by adding specific checks for your
> > application. For example we plan to add sling specific checks inside the
> > sling project.
> >
> > In addition to simply detecting if the system is ready we also created a
> DS
> > based root cause analysis that can be very helpful to detect why a set of
> > components does not come up as expected.
> >
> > We would like to donate this project to the Apache Felix project as it
> > might get more attention there by people that are not related to sling.
> The
> > project is Apache licensed from the start and we already got a basic
> > documentation as well as good test coverage.
> >
> > We currently host it in this github repository:
> > https://github.com/dulvac/system-readiness
> >
> > The packages are still mentioning sling but of course we would change
> this
> > to felix if this community is interested in the donation.
> >
> > Best regards
> >
> > Christian and Andrei
> >
> >
> > --
> > --
> > Christian Schneider
> > http://www.liquid-reality.de
> >
> > Computer Scientist
> > http://www.adobe.com
> >
>
>
>
> --
> ------------------------
> Guillaume Nodet
>



-- 
-- 
Christian Schneider
http://www.liquid-reality.de

Computer Scientist
http://www.adobe.com

Re: Proposal to donate the system readiness check framework to Apache Felix

Posted by Guillaume Nodet <gn...@apache.org>.
I like it a lot, the API is simple and extensible enough.  Really nice work!
I'm just a bit nervous about having such a low-level component depend on an
external extender...

I think it's missing one bit though: some kind of expectations. I.e. it
checks existing stuff, but it does not cover missing stuff.  I suppose it
could be implemented specifically using custom checks, but I think there is
still a hole, which is the fact that those custom checks are not
available.  So I wonder if there should be an additional built-in check
that would grab a configuration entry, turn that into a list of mandatory
checks and be green if all those checks are available, yellow/red
otherwise.  This would ensure your container does not switch between
green/yellow while the container is booting/provisioning.

2018-04-17 10:02 GMT+02:00 Christian Schneider <ch...@die-schneider.net>:

> Dear Felix community,
>
> during the last weeks Andrei Dulvac and I worked on a small framework to
> check if an OSGi based system is fully up.
>
> Our work originated in testing sling modules and whole sling instances. We
> soon found though that the concept is more general than sling and can be
> applied to any OSGi based system.
>
> The system readiness framework has a SystemReadinessMonitor service that
> reports the aggregated state of the system. It delegates to
> SystemReadinessCheck services that each check for a certain aspect. We
> implemented a first check based on a list of expected top level services.
> The system can be customised by adding specific checks for your
> application. For example we plan to add sling specific checks inside the
> sling project.
>
> In addition to simply detecting if the system is ready we also created a DS
> based root cause analysis that can be very helpful to detect why a set of
> components does not come up as expected.
>
> We would like to donate this project to the Apache Felix project as it
> might get more attention there by people that are not related to sling. The
> project is Apache licensed from the start and we already got a basic
> documentation as well as good test coverage.
>
> We currently host it in this github repository:
> https://github.com/dulvac/system-readiness
>
> The packages are still mentioning sling but of course we would change this
> to felix if this community is interested in the donation.
>
> Best regards
>
> Christian and Andrei
>
>
> --
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Computer Scientist
> http://www.adobe.com
>



-- 
------------------------
Guillaume Nodet