Posted to dev@brooklyn.apache.org by Alex Heneveld <al...@cloudsoft.io> on 2020/11/23 11:12:30 UTC

Bundle resolvers loading too late

Hi Brooklyn devs,

Regarding the recent addition to allow custom Bundle Resolver OSGi 
services [1], we've discovered a bothersome issue with load order at 
runtime.  It is non-deterministic whether a custom resolver bundle loads 
before or after the initial catalog.bom and persisted state.  If the 
custom resolver loads _after_, then it won't be available to handle the 
catalog.bom and persisted state, which means the bundle might be loaded 
by the wrong resolver or it might fail to load altogether.

We want to introduce a mechanism to prevent those errors.  There are 
several options:

(a) Specify a dependency on the resolver bundle/service inside the 
bundle that needs it

(b) Specify any resolver OSGi bundle or service names that are required 
in brooklyn.cfg, and then wait until they are available before 
initializing Brooklyn catalog (eg using BundleListener / ServiceTracker)

(c) Require the bundle to be explicitly included in the Brooklyn/OSGi 
startup sequence (boot bundlers or startup.properties) before the 
catalog/rebind initializes

(d) Wait for "all startup and deploy bundles" to be in their final state 
(usually active) or a start level before the catalog/rebind initializes

(e) Re-install bundles if we've added a new bundle-parser service (so 
while it might fail initially, it eventually succeeds)


Option (d) would be the nicest I think, and the simplest for the user, 
leaning on OSGi "start levels"; but Karaf does not seem to respect start 
levels.  I'll send an email to the Karaf list to ask.

Option (c) is quite tricky AFAIK: it needs obscure edits to the etc/* 
directory and some tricky listeners.  (OSGi doesn't encourage the notion 
of "wait for everything else to be ready", for the obvious reason that 
if two bundles use that philosophy they will deadlock!)  So I don't 
like it.

Option (a) makes writing a bundle that uses a custom resolver more 
difficult (e.g. requiring an OSGi MANIFEST.MF) so I don't like it either.

Option (e) is quite hard to code up, and inefficient, and will cause a 
lot of warnings in the log as part of the normal case, and potentially 
disrupt operations if we re-install bundles whenever a resolver is 
added.  That said, it is a common pattern in OSGi ... but I don't much 
like it.
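
To make the retry pattern in option (e) concrete, here is a minimal 
plain-Java sketch.  `RetryingInstaller`, its methods and the format 
strings are hypothetical names invented for illustration; none of them 
are Brooklyn or OSGi APIs.  An install that finds no matching resolver 
is queued with a warning and retried when a resolver for that format 
is added.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of option (e); none of these names are Brooklyn APIs. */
class RetryingInstaller {
    private final Set<String> supportedFormats = new HashSet<>();
    private final List<String[]> pending = new ArrayList<>();   // each entry is {bundle, format}
    final List<String> installed = new ArrayList<>();
    final List<String> warnings = new ArrayList<>();

    /** Installs the bundle if a resolver for its format exists, otherwise queues it. */
    synchronized void install(String bundle, String format) {
        if (supportedFormats.contains(format)) {
            installed.add(bundle);
        } else {
            warnings.add("No resolver yet for " + bundle + " (" + format + "); will retry");
            pending.add(new String[] {bundle, format});
        }
    }

    /** Registers a resolver for a format and retries every queued bundle of that format. */
    synchronized void addResolver(String format) {
        supportedFormats.add(format);
        for (Iterator<String[]> it = pending.iterator(); it.hasNext();) {
            String[] entry = it.next();
            if (entry[1].equals(format)) {
                installed.add(entry[0]);
                it.remove();
            }
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

Note that this only covers the "fails to load" case.  A bundle that was 
picked up by the wrong resolver would never be queued, which is part of 
why the real thing is harder to code up than this sketch suggests.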


That leaves option (b) which is what I'm leaning towards (unless we get 
an answer re (d)).  Specifically we'd say something like this in 
`brooklyn.cfg`:

     brooklyn.resolvers.require.services = custom.Resolver1,custom.Resolver2

and then in catalog/rebind we block (with logging) if those are not yet 
available.  Possibly we would also have

     brooklyn.resolvers.require.timeout = 5m

And if not available within that timeframe it logs a warning and proceeds.
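
A rough sketch of the blocking behaviour proposed for option (b), in 
plain Java rather than against the OSGi APIs.  `ResolverGate` and its 
methods are hypothetical, and the "5m"-style duration format is an 
assumption about how the timeout value would be parsed.  In a real 
deployment the `markAvailable` call would be driven by an OSGi service 
listener or tracker rather than called by hand.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical stand-in for the catalog/rebind startup gate; not Brooklyn code. */
class ResolverGate {
    private final Set<String> available = new HashSet<>();

    /** Parses a comma-separated value such as "custom.Resolver1,custom.Resolver2". */
    static Set<String> parseRequired(String value) {
        if (value == null) return new LinkedHashSet<>();
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    /** Parses durations like "30s", "5m" or "1h" (an assumed format). */
    static Duration parseTimeout(String value) {
        String v = value.trim();
        long n = Long.parseLong(v.substring(0, v.length() - 1));
        switch (v.charAt(v.length() - 1)) {
            case 's': return Duration.ofSeconds(n);
            case 'm': return Duration.ofMinutes(n);
            case 'h': return Duration.ofHours(n);
            default: throw new IllegalArgumentException("Unsupported duration: " + value);
        }
    }

    /** Records that a resolver is now available and wakes any waiting thread. */
    synchronized void markAvailable(String name) {
        available.add(name);
        notifyAll();
    }

    /**
     * Blocks until every required resolver is available or the timeout expires.
     * Returns the names still missing (empty on success); on timeout it warns and proceeds.
     */
    synchronized Set<String> awaitAll(Set<String> required, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            Set<String> missing = new LinkedHashSet<>(required);
            missing.removeAll(available);
            if (missing.isEmpty()) return missing;
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000;
            if (remainingMs <= 0) {
                System.err.println("WARN: timed out, proceeding without resolvers " + missing);
                return missing;
            }
            System.out.println("Waiting for resolvers " + missing);
            try {
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt and stop waiting
                return missing;
            }
        }
    }
}
```

Returning the missing names rather than throwing matches the "log a 
warning and proceed" behaviour described above; the caller can still 
decide to fail hard if the returned set is non-empty.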


Note there are potentially similar issues with PlanTransformers but I 
think that is simple to solve once we've solved ^.

Best
Alex

[1] https://github.com/apache/brooklyn-server/pull/1115



Re: Bundle resolvers loading too late

Posted by Alex Heneveld <al...@cloudsoft.io>.
Cheers Geoff -- helpful input.

Absolutely I think (a) should be _supported_ -- but I don't think it 
should be preferred.

Blueprint authors who just write a simple yaml/BOM file shouldn't be 
required to make/maintain a META-INF/MANIFEST.MF.  Nor do we want to 
stipulate that bundles that get installed _have_ to be OSGi, or have to 
be OSGi _if_ they depend on certain resolvers -- that would rule out, 
for instance, installing a TOSCA CSAR ZIP.  So while (a) should be an 
option, I think it's important we offer at least one other.  Also, a 
failure to declare the dependency would cause non-deterministic 
failures: it would probably work on first install but then might fail 
on rebind.  And while some of the other options have the same issue, 
they can be fixed at a platform level, whereas this would require a 
declaration (and potentially be an error) in every bundle.

I spent some time on (d) but it is quite invasive and start levels don't 
work the way I expected. I'm talking to JB later today and he will 
hopefully clear it up.

Option (b) is winning out and actually works out quite nicely.  Again 
it's optional, but the current spike seems to be working; it adds 
additional optional properties in brooklyn.cfg:

brooklyn.osgi.dependencies.services.filters=(&(osgi.service.blueprint.compname=toscaCsarBundleResolver))

The RHS is an OSGi filter for a service, or a list of such filters.  The 
effect is that it won't init the catalog/rebind until the 
brooklyn.osgi.dependencies.services.filters are all satisfied.  That 
blocks startup, but it feels quite OSGi-friendly: we specify in config 
the services that the catalog depends on, let OSGi tell us when they're 
available, and just wait on that.
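
For readers less familiar with OSGi filters, the sketch below shows 
what the filter above is asking for: a service whose 
`osgi.service.blueprint.compname` property equals 
`toscaCsarBundleResolver`.  `MiniFilter` is a hypothetical toy that 
understands only `(key=value)` and `(&(...)(...))`; it is not the OSGi 
implementation and ignores wildcards, `|`, `!`, comparisons and 
escaping.

```java
import java.util.Map;

/** Toy evaluator for a tiny subset of OSGi/LDAP filter syntax; illustration only. */
class MiniFilter {
    private final String s;
    private int pos = 0;

    private MiniFilter(String s) {
        this.s = s;
    }

    /** Returns true if the given service properties satisfy the filter. */
    static boolean matches(String filter, Map<String, String> props) {
        MiniFilter f = new MiniFilter(filter.trim());
        boolean result = f.parse(props);
        if (f.pos != f.s.length()) {
            throw new IllegalArgumentException("Trailing text in filter: " + filter);
        }
        return result;
    }

    private boolean parse(Map<String, String> props) {
        expect('(');
        boolean result;
        if (peek() == '&') {
            pos++;
            result = true;
            // AND over all operands; '&=' does not short-circuit, so every operand is still parsed
            while (peek() == '(') {
                result &= parse(props);
            }
        } else {
            int eq = s.indexOf('=', pos);
            int close = s.indexOf(')', pos);
            if (eq < 0 || close < 0 || eq > close) {
                throw new IllegalArgumentException("Bad filter: " + s);
            }
            String key = s.substring(pos, eq);
            if (key.isEmpty() || key.indexOf('(') >= 0) {
                throw new IllegalArgumentException("Unsupported filter: " + s);
            }
            String value = s.substring(eq + 1, close);
            result = value.equals(props.get(key));
            pos = close;
        }
        expect(')');
        return result;
    }

    private char peek() {
        if (pos >= s.length()) {
            throw new IllegalArgumentException("Unexpected end of filter: " + s);
        }
        return s.charAt(pos);
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at " + pos + " in " + s);
        }
        pos++;
    }
}
```

Real code would not parse filters by hand; the OSGi framework supplies 
its own filter type (`org.osgi.framework.Filter`) and 
`org.osgi.util.tracker.ServiceTracker` can wait on a service that 
matches one.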

I'm gonna tidy this up and open a PR for review.

Best
Alex



On 24/11/2020 11:16, Geoff Macartney wrote:
> Hi Alex,
>
> I'd actually say (a) is the way to do it, using the OSGi service
> dependency mechanism in some way (not *quite* sure how that is done
> these days!). That would be the more "OSGi native" style of doing it,
> would it not? Start levels wouldn't be what we want [1], and making all
> catalog init block until all bundle loaders are started (b) sounds
> workable but coarse-grained. If each bundle specified a dependency on
> the service that loads it, then the normal OSGi service mechanism would
> control the startup order for us very naturally. I share your dislike
> of options (c) and (e).
>
> What do you think?
>
> Geoff
>
>
> [1] https://www.aqute.biz/2017/04/24/aggregate-state.html
>


Re: Bundle resolvers loading too late

Posted by Geoff Macartney <ge...@gmail.com>.
Hi Alex,

I'd actually say (a) is the way to do it, using the OSGi service
dependency mechanism in some way (not *quite* sure how that is done
these days!). That would be the more "OSGi native" style of doing it,
would it not? Start levels wouldn't be what we want [1], and making all
catalog init block until all bundle loaders are started (b) sounds
workable but coarse-grained. If each bundle specified a dependency on
the service that loads it, then the normal OSGi service mechanism would
control the startup order for us very naturally. I share your dislike
of options (c) and (e).

What do you think?

Geoff


[1] https://www.aqute.biz/2017/04/24/aggregate-state.html
