
Posted to dev@sling.apache.org by Carsten Ziegeler <cz...@apache.org> on 2015/11/04 18:45:00 UTC

[RT] Handling resource (provider) changes

The currently active model for resource observation events sent through
OSGi events is modelled after JCR observation, which means that for
changes to a tree you might only get an event for the root of that tree.
In particular, when a whole tree is deleted, you get a single event.
I assume the same could in theory be true for adding.

The question is whether we want to keep this model for the observation
API we're about to implement, especially as there are some additional
things to consider:

1. With Oak we could register an observation listener which provides
information about every node removed/added/changed. So we could send out
detailed events. Other resource provider implementations could simply
follow that concept.

2. We have three events for resources: added/removed/changed but also
two events for resource providers. Obviously, if a resource provider is
mounted at some path and is unregistered, the whole tree is removed.
Again, we just send out a single event.

For 2. we definitely don't want to send out an event for every resource
of that provider as that would be way too expensive.

For 1. we might start sending out too many events as well and I assume
code is already prepared for that case.

I think we should keep the optimization (and make this clear in the new
observation API), and we should probably collapse the two special
resource provider events into resource events: instead of sending out a
resource provider added/removed event, we send out a resource
added/removed event. Listeners right now usually handle all five events,
and we could reduce this to three events, making everything more compact,
nicer and easier to understand.

WDYT?

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziegeler@apache.org

Re: [RT] Handling resource (provider) changes

Posted by Dominik Süß <do...@gmail.com>.

Hi everyone,

While I get the point of having a broadcast event for each and every change
in the resource tree, I fear that this comes at a high price. I think it
would be good to have some mechanisms to
a) toggle off eventing for specific subtrees completely
b) create just one event for a specific subtree whenever something changed
in a given interval
c) register interest in specific paths / properties, so that an event is
only sent if there is a listener that claims to have interest
For the existing behavior a listener could just register for "everything",
but it would be possible to tweak all event listeners in a way that
significantly reduces the number of events that are fired (which would
probably allow us to further narrow down what kind of observation we need
in Oak).
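
Option c) above could be sketched roughly as follows. This is only a minimal
illustration in plain Java, not the Sling or Oak API: the class
PathFilteredDispatcher and its methods register, covers and dispatch are
hypothetical names, and matching is done on simple path prefixes only
(property filters are left out).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not part of the Sling API: a dispatcher that only
// delivers a change to listeners that registered interest in its path.
final class PathFilteredDispatcher {

    // One registration: a path prefix plus the callback to invoke.
    private static final class Registration {
        final String prefix;
        final Consumer<String> listener;

        Registration(String prefix, Consumer<String> listener) {
            this.prefix = prefix;
            this.listener = listener;
        }
    }

    private final List<Registration> registrations = new ArrayList<>();

    // Register interest in a subtree; "/" stands for "everything".
    void register(String prefix, Consumer<String> listener) {
        registrations.add(new Registration(prefix, listener));
    }

    // True if the path is the prefix itself or lies below it.
    static boolean covers(String prefix, String path) {
        if (prefix.equals("/")) {
            return true;
        }
        return path.equals(prefix) || path.startsWith(prefix + "/");
    }

    // Deliver a changed path to interested listeners only and
    // return how many listeners were notified.
    int dispatch(String changedPath) {
        int notified = 0;
        for (Registration r : registrations) {
            if (covers(r.prefix, changedPath)) {
                r.listener.accept(changedPath);
                notified++;
            }
        }
        return notified;
    }

    public static void main(String[] args) {
        PathFilteredDispatcher dispatcher = new PathFilteredDispatcher();
        dispatcher.register("/content", p -> System.out.println("changed: " + p));
        dispatcher.dispatch("/content/site/page"); // listener is called
        dispatcher.dispatch("/var/cache/entry");   // nobody is interested
    }
}
```

If dispatch finds no matching registration, nothing is delivered at all, which
is the saving that c) is after. A listener that wants the existing behavior
registers for "/".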

The Oak team has tweaked quite a lot in this area to provide better scaling
of eventing - it would be a shame if our Sling eventing didn't make use
of this or even eliminated its benefits.

Cheers
Dominik


On Thu, Nov 5, 2015 at 2:23 PM, Carsten Ziegeler <cz...@apache.org>
wrote:

> Am 05.11.15 um 14:01 schrieb Oliver Lietz:
> > On Wednesday 04 November 2015 18:45:00 Carsten Ziegeler wrote:
> >> The current active model for observation event for resources through
> >> OSGi events is modelled after JCR observation which means that for
> >> changes to a tree you might only get an event for the root of that tree.
> >> Especially when a whole tree is deleted, you get a single event.
> >> I assume, the same could in theory be true for adding.
> >>
> >> The question is, whether we want to keep this model for the observation
> >> API we're about to implement. Especially as there are some additional
> >> things to consider:
> >>
> >> 1. With Oak we could register observation listener which provide the
> >> information about every node removed/added/changed. So we can send out
> >> detailed events. Other resource provider implementations could simply
> >> follow that concept.
> >
> > One event per resource makes implementing depending systems more
> > straightforward, e.g. adding/removing data for a resource from a search
> > index.
>
> While this sounds nice, I guess in practice you can't solely rely on
> observation for this. You might miss events as your listener is
> registered "too late", is updated, uninstalled etc.
>
> > So +1.
> >
> >> 2. We have three events for resources: added/removed/changed but also
> >> two events for resource providers. Obviously, if a resource provider is
> >> mounted at some path and is unregistered, the whole tree is removed.
> >> Again, we just send out a single event.
> >>
> >> For 2. we definitely don't want to send out an event for every resource
> >> of that provider as that would be way too expensive.
> >>
> >> For 1. we might start sending out too many events as well and I assume
> >> code is already prepared for that case.
> >
> > What does that mean? _too_ many for whom? Can we process _too_ many
> > events reliable?
> That's a good question - I guess with good filtering by the listener
> registrations we should be able to do so.
>
> >
> >> I think we should keep the optimization (and make this clear in the new
> >> observation API) and we probably should collapse the two special
> >> resource provider events into resource events: instead of sending out a
> >> resource provider added/removed, we send out a resource added/removed.
> >> Listeners right now usually handle all five events, and we could reduce
> >> this to three events, making everything compacter, nicer and easier to
> >> understand.
> >>
> >> WDYT?
> >
> > hm, added/removed resources and added/removed resource providers are
> > from some aspects totally different use cases which I think should be
> > seen as such.
>
> It depends on your point of view I think - for someone interested in
> whether a resource has been added or removed, there is no difference
> whether a resource has been removed or the resource provider who
> provided this. It's the same. And observation listeners are interested
> to find out about resource changes, therefore whether something happened
> because of a change in the database or a provider change is not relevant.
>
> > I would like to differ between added/removed resources and added/removed
> > resource providers. But having events for all resources added/removed
> > when a resource provider is added/removed is also helpful. -1 for
> > collapsing.
> Can you give some use cases where you want to know about a provider
> being added or removed - in contrast to the resource it provides?
>
> >
> > How rich are these events? Can a listener determine the provider for a
> > resource? Can a listener determine if a resource was added/removed
> > because its provider was added/removed? - forgive my ignorance, hadn't
> > time to look into the new APIs so far.
>
> The current API is quiet simple - if a provider is added or removed, we
> send out a provider add/remove event with the mount path of the provider.
> For resource change events its the path.
>
> I guess we're actually discussing two totally different things here and
> I brought them up in a single discussion :)
> 1. Whether we send out events for all resources of a delete (add/update)
> operation or can optimize as JCR does?
> 2. How to deal with resource provider changes?
>
> Regards
> Carsten
> --
> Carsten Ziegeler
> Adobe Research Switzerland
> cziegeler@apache.org
>

Re: [RT] Handling resource (provider) changes

Posted by Carsten Ziegeler <cz...@apache.org>.

Am 05.11.15 um 14:01 schrieb Oliver Lietz:
> On Wednesday 04 November 2015 18:45:00 Carsten Ziegeler wrote:
>> The current active model for observation event for resources through
>> OSGi events is modelled after JCR observation which means that for
>> changes to a tree you might only get an event for the root of that tree.
>> Especially when a whole tree is deleted, you get a single event.
>> I assume, the same could in theory be true for adding.
>>
>> The question is, whether we want to keep this model for the observation
>> API we're about to implement. Especially as there are some additional
>> things to consider:
>>
>> 1. With Oak we could register observation listener which provide the
>> information about every node removed/added/changed. So we can send out
>> detailed events. Other resource provider implementations could simply
>> follow that concept.
> 
> One event per resource makes implementing depending systems more 
> straightforward, e.g. adding/removing data for a resource from a search index.

While this sounds nice, I guess in practice you can't solely rely on
observation for this. You might miss events as your listener is
registered "too late", is updated, uninstalled etc.

> So +1.
> 
>> 2. We have three events for resources: added/removed/changed but also
>> two events for resource providers. Obviously, if a resource provider is
>> mounted at some path and is unregistered, the whole tree is removed.
>> Again, we just send out a single event.
>>
>> For 2. we definitely don't want to send out an event for every resource
>> of that provider as that would be way too expensive.
>>
>> For 1. we might start sending out too many events as well and I assume
>> code is already prepared for that case.
> 
> What does that mean? _too_ many for whom? Can we process _too_ many events 
> reliable?
That's a good question - I guess with good filtering by the listener
registrations we should be able to do so.

> 
>> I think we should keep the optimization (and make this clear in the new
>> observation API) and we probably should collapse the two special
>> resource provider events into resource events: instead of sending out a
>> resource provider added/removed, we send out a resource added/removed.
>> Listeners right now usually handle all five events, and we could reduce
>> this to three events, making everything compacter, nicer and easier to
>> understand.
>>
>> WDYT?
> 
> hm, added/removed resources and added/removed resource providers are from some 
> aspects totally different use cases which I think should be seen as such.

It depends on your point of view, I think - for someone interested in
whether a resource has been added or removed, it makes no difference
whether the resource itself has been removed or the resource provider
that provided it. It's the same. And observation listeners are interested
in finding out about resource changes; whether something happened
because of a change in the database or a provider change is not relevant.

> I would like to differ between added/removed resources and added/removed 
> resource providers. But having events for all resources added/removed when a 
> resource provider is added/removed is also helpful. -1 for collapsing.
Can you give some use cases where you want to know about a provider
being added or removed - in contrast to the resource it provides?

> 
> How rich are these events? Can a listener determine the provider for a 
> resource? Can a listener determine if a resource was added/removed because its 
> provider was added/removed? - forgive my ignorance, hadn't time to look into 
> the new APIs so far.

The current API is quite simple - if a provider is added or removed, we
send out a provider add/remove event with the mount path of the provider.
For resource change events it's the resource path.

I guess we're actually discussing two totally different things here and
I brought them up in a single discussion :)
1. Whether we send out events for all resources of a delete (add/update)
operation or can optimize as JCR does?
2. How to deal with resource provider changes?

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziegeler@apache.org

Re: [RT] Handling resource (provider) changes

Posted by Oliver Lietz <ap...@oliverlietz.de>.

On Wednesday 04 November 2015 18:45:00 Carsten Ziegeler wrote:
> The current active model for observation event for resources through
> OSGi events is modelled after JCR observation which means that for
> changes to a tree you might only get an event for the root of that tree.
> Especially when a whole tree is deleted, you get a single event.
> I assume, the same could in theory be true for adding.
> 
> The question is, whether we want to keep this model for the observation
> API we're about to implement. Especially as there are some additional
> things to consider:
> 
> 1. With Oak we could register observation listener which provide the
> information about every node removed/added/changed. So we can send out
> detailed events. Other resource provider implementations could simply
> follow that concept.

One event per resource makes implementing dependent systems more
straightforward, e.g. adding/removing data for a resource from a search index.
So +1.

> 2. We have three events for resources: added/removed/changed but also
> two events for resource providers. Obviously, if a resource provider is
> mounted at some path and is unregistered, the whole tree is removed.
> Again, we just send out a single event.
> 
> For 2. we definitely don't want to send out an event for every resource
> of that provider as that would be way too expensive.
> 
> For 1. we might start sending out too many events as well and I assume
> code is already prepared for that case.

What does that mean? _too_ many for whom? Can we process _too_ many events
reliably?

> I think we should keep the optimization (and make this clear in the new
> observation API) and we probably should collapse the two special
> resource provider events into resource events: instead of sending out a
> resource provider added/removed, we send out a resource added/removed.
> Listeners right now usually handle all five events, and we could reduce
> this to three events, making everything compacter, nicer and easier to
> understand.
> 
> WDYT?

hm, added/removed resources and added/removed resource providers are in some
respects totally different use cases, which I think should be treated as such.
I would like to differentiate between added/removed resources and added/removed
resource providers. But having events for all resources added/removed when a
resource provider is added/removed is also helpful. -1 for collapsing.

How rich are these events? Can a listener determine the provider for a
resource? Can a listener determine if a resource was added/removed because its
provider was added/removed? - forgive my ignorance, I haven't had time to look
into the new APIs so far.

Regards,
O.

> Regards
> Carsten

Re: [RT] Handling resource (provider) changes

Posted by Carsten Ziegeler <cz...@apache.org>.

Am 04.11.15 um 19:27 schrieb Robert Munteanu:
> On Wed, 2015-11-04 at 18:45 +0100, Carsten Ziegeler wrote:
>> The current active model for observation event for resources through
>> OSGi events is modelled after JCR observation which means that for
>> changes to a tree you might only get an event for the root of that
>> tree.
>> Especially when a whole tree is deleted, you get a single event.
>> I assume, the same could in theory be true for adding.
> 
> Just to make sure I understand correctly:
> 
> - when a tree at /foo is deleted we get a single event which says that
> the node at /foo is deleted, but we do not know which children were
> removed - we only get a single path

Correct.
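
In practice this means a listener has to treat a single removal event for
/foo as covering every path below /foo. A minimal sketch of that handling in
plain Java, with no Sling types involved; SubtreeInvalidation and
removeSubtree are hypothetical names used only for illustration:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a listener-side bookkeeping helper that expands a
// single "removed" event for a tree root to all paths it knows below it.
final class SubtreeInvalidation {

    // Removes removedRoot and all of its descendants from knownPaths and
    // returns how many entries were dropped.
    static int removeSubtree(Set<String> knownPaths, String removedRoot) {
        int before = knownPaths.size();
        // Append "/" so that /foo does not accidentally match /foobar.
        String childPrefix = removedRoot.endsWith("/") ? removedRoot : removedRoot + "/";
        knownPaths.removeIf(p -> p.equals(removedRoot) || p.startsWith(childPrefix));
        return before - knownPaths.size();
    }

    public static void main(String[] args) {
        Set<String> known = new TreeSet<>();
        known.add("/foo");
        known.add("/foo/one");
        known.add("/foo/one/two");
        known.add("/foobar");
        int dropped = removeSubtree(known, "/foo");
        System.out.println(dropped + " entries dropped, remaining: " + known);
    }
}
```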

> - when a tree at /bar is added, we get a single event but we get all
> the date about the resources under /bar, e.g. /bar/one/two/three

I think this depends :) Afaik, right now for adds we get an event for
every resource, but this is not necessarily guaranteed by JCR
observation (at least that's what people told me).
> 
> Is that correct?
> 
>> The question is, whether we want to keep this model for the
>> observation
>> API we're about to implement. Especially as there are some additional
>> things to consider:
>>
>> 1. With Oak we could register observation listener which provide the
>> information about every node removed/added/changed. So we can send
>> out
>> detailed events. Other resource provider implementations could simply
>> follow that concept.
>>  
>> 2. We have three events for resources: added/removed/changed but also
>> two events for resource providers. Obviously, if a resource provider
>> is
>> mounted at some path and is unregistered, the whole tree is removed.
>> Again, we just send out a single event.
>>
>> For 2. we definitely don't want to send out an event for every
>> resource
>> of that provider as that would be way too expensive.
>>
>> For 1. we might start sending out too many events as well and I
>> assume
>> code is already prepared for that case.
>>
>> I think we should keep the optimization (and make this clear in the
>> new
>> observation API) 
> 
> +1
> 
>> and we probably should collapse the two special
>> resource provider events into resource events: instead of sending out
>> a
>> resource provider added/removed, we send out a resource
>> added/removed.
>> Listeners right now usually handle all five events, and we could
>> reduce
>> this to three events, making everything compacter, nicer and easier
>> to
>> understand.
> 
> I think simplifying the listeners is good, but I wonder whether anyone
> actually listens to the resource provider added/removed events. Perhaps
> we can have a separate listener for those?

Every implementation doing caching (e.g. JSP scripting) listens to those
events. That's a must, otherwise you end up caching something from a
provider which disappeared after you cached it. My initial idea was to
have a separate interface, but as the usual pattern is to listen to both
anyway, I think we can just fold the events in and make everything easier.
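
To illustrate the folding idea, here is a minimal sketch in plain Java. The
types EventFolding, LegacyTopic and ChangeType are hypothetical and only
model the five existing event kinds and the three proposed ones; they are not
the actual Sling API.

```java
// Hypothetical sketch of folding the five existing event kinds into three.
final class EventFolding {

    // The five kinds of events listeners handle today (modelled, not the real topics).
    enum LegacyTopic {
        RESOURCE_ADDED, RESOURCE_REMOVED, RESOURCE_CHANGED,
        PROVIDER_ADDED, PROVIDER_REMOVED
    }

    // The three kinds that would remain after collapsing.
    enum ChangeType { ADDED, REMOVED, CHANGED }

    // A provider appearing or disappearing at its mount path is reported
    // like a resource being added or removed at that path.
    static ChangeType fold(LegacyTopic topic) {
        switch (topic) {
            case RESOURCE_ADDED:
            case PROVIDER_ADDED:
                return ChangeType.ADDED;
            case RESOURCE_REMOVED:
            case PROVIDER_REMOVED:
                return ChangeType.REMOVED;
            default:
                return ChangeType.CHANGED;
        }
    }

    public static void main(String[] args) {
        for (LegacyTopic topic : LegacyTopic.values()) {
            System.out.println(topic + " -> " + fold(topic));
        }
    }
}
```

A caching listener would then only need the REMOVED and CHANGED branches to
drop stale entries, regardless of whether a resource or its provider went
away.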

Regards
Carsten

> 
> Robert
> 


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziegeler@apache.org

Re: [RT] Handling resource (provider) changes

Posted by Robert Munteanu <ro...@apache.org>.

On Wed, 2015-11-04 at 18:45 +0100, Carsten Ziegeler wrote:
> The current active model for observation event for resources through
> OSGi events is modelled after JCR observation which means that for
> changes to a tree you might only get an event for the root of that
> tree.
> Especially when a whole tree is deleted, you get a single event.
> I assume, the same could in theory be true for adding.

Just to make sure I understand correctly:

- when a tree at /foo is deleted we get a single event which says that
the node at /foo is deleted, but we do not know which children were
removed - we only get a single path
- when a tree at /bar is added, we get a single event but we get all
the data about the resources under /bar, e.g. /bar/one/two/three

Is that correct?

> The question is, whether we want to keep this model for the
> observation
> API we're about to implement. Especially as there are some additional
> things to consider:
> 
> 1. With Oak we could register observation listener which provide the
> information about every node removed/added/changed. So we can send
> out
> detailed events. Other resource provider implementations could simply
> follow that concept.
> 
> 2. We have three events for resources: added/removed/changed but also
> two events for resource providers. Obviously, if a resource provider
> is
> mounted at some path and is unregistered, the whole tree is removed.
> Again, we just send out a single event.
> 
> For 2. we definitely don't want to send out an event for every
> resource
> of that provider as that would be way too expensive.
> 
> For 1. we might start sending out too many events as well and I
> assume
> code is already prepared for that case.
> 
> I think we should keep the optimization (and make this clear in the
> new
> observation API) 

+1

> and we probably should collapse the two special
> resource provider events into resource events: instead of sending out
> a
> resource provider added/removed, we send out a resource
> added/removed.
> Listeners right now usually handle all five events, and we could
> reduce
> this to three events, making everything compacter, nicer and easier
> to
> understand.

I think simplifying the listeners is good, but I wonder whether anyone
actually listens to the resource provider added/removed events. Perhaps
we can have a separate listener for those?

Robert