Posted to oak-dev@jackrabbit.apache.org by Jukka Zitting <ju...@gmail.com> on 2013/03/01 08:19:57 UTC

Re: Kernel stats question.

Hi,

On Thu, Feb 28, 2013 at 11:55 PM, Ian Boston <ie...@tfd.co.uk> wrote:
> Are there plans to expose kernel stats from Oak ?

Yes, see OAK-364 [1], though so far we haven't implemented
anything along these lines.

One related idea I was already thinking about implementing is exposing
cache statistics [2] from the Guava caches we're using at the
MicroKernel level.
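
As a rough illustration of the kind of numbers such cache statistics
would expose, here is a minimal sketch that uses only the JDK. It is
not Oak or Guava code: CountingCache and its method names are
hypothetical stand-ins. Guava's own caches report comparable hit and
miss counts through Cache.stats() once CacheBuilder.recordStats() is
enabled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

// Hypothetical stand-in for a cache that records statistics; not Oak or Guava code.
final class CountingCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    // Returns the cached value, loading it on a miss and counting the outcome.
    V get(K key, Function<K, V> loader) {
        V value = entries.get(key);
        if (value != null) {
            hits.increment();
            return value;
        }
        misses.increment();
        V loaded = loader.apply(key);
        // If another thread loaded the same key first, keep its value.
        V raced = entries.putIfAbsent(key, loaded);
        return raced != null ? raced : loaded;
    }

    long hitCount() { return hits.sum(); }

    long missCount() { return misses.sum(); }

    // Design choice of this sketch: an unused cache reports a hit rate of 1.0.
    double hitRate() {
        long total = hitCount() + missCount();
        return total == 0 ? 1.0 : (double) hitCount() / total;
    }
}
```

The counters are LongAdder instances so that concurrent readers do not
contend on a single atomic variable.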

> The Jackrabbit ObservationDispatcher has a queue that contains an
> AtomicInteger indicating the size of the queue. If Observers are slow
> in responding to asynchronous notifications, the queue can grow
> rapidly.

This actually is an area that won't be an issue with Oak, as instead
of using a centralized queue from where events are pushed to observers
we're allowing each observer to generate events by comparing
successive repository revisions at their own pace. Thus one slow
observer will only block itself.
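
The pull model described above can be sketched roughly as follows.
This is only an illustration under simplifying assumptions, not Oak
code: RevisionStore and PollingObserver are hypothetical names, and a
revision is modelled as a plain immutable map from path to value.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model: a revision is an immutable snapshot of path -> value.
final class RevisionStore {
    private final AtomicReference<Map<String, String>> head =
            new AtomicReference<>(Map.of());

    void commit(Map<String, String> snapshot) {
        head.set(Map.copyOf(snapshot));
    }

    Map<String, String> latest() {
        return head.get();
    }
}

// Each observer remembers only the last revision it has processed.
final class PollingObserver {
    private Map<String, String> lastSeen = Map.of();

    // Returns the paths added, changed or removed since the previous poll.
    Set<String> poll(RevisionStore store) {
        Map<String, String> current = store.latest();
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            if (!e.getValue().equals(lastSeen.get(e.getKey()))) {
                changed.add(e.getKey());
            }
        }
        for (String path : lastSeen.keySet()) {
            if (!current.containsKey(path)) {
                changed.add(path);
            }
        }
        lastSeen = current;
        return changed;
    }
}
```

An observer that polls rarely simply sees a larger diff on its next
poll; nothing queues up on the repository side on its behalf.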

Of course there are other areas where such metrics will still be highly useful.

[1] https://issues.apache.org/jira/browse/OAK-364
[2] http://code.google.com/p/guava-libraries/wiki/CachesExplained#Statistics

BR,

Jukka Zitting

Re: Kernel stats question.

Posted by Ian Boston <ie...@tfd.co.uk>.
On 4 March 2013 18:43, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On Fri, Mar 1, 2013 at 9:42 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>> I am more interested in counters than average or last operation time
>> measurements, as that gives an idea of hotspots, multi-threaded
>> throughput at full load and anomalies rather than general slowness
>> (not saying time measurements are not useful, just hard to interpret
>> in a highly multi-threaded server under load).
>
> Good point. It would be good to use OAK-364 to gather ideas for
> numbers that would be useful or at least interesting to track.

I'll add ideas for methodology and things that would be
useful/interesting to track.

>
>> Does that mean observers get a stream of revision tokens, and it's up
>> to them to queue the revision for later processing, even if that means
>> just keeping the last one worked on and the most recent (i.e. no queue
>> at all, just a range of revisions)?
>
> That depends on the level at which you're observing the repository.
>
> At the lowest level you can just poll the repository for new
> revisions of the content tree and do a content diff to find out what
> changed. The repository itself does nothing special for you, just
> gives you access to the latest revision and the ability to compare two
> revisions of the repository.
>
> Since explicitly managing such a polling mechanism can be a bit
> cumbersome, we also provide a way to register listeners (see the
> o.a.j.oak.spi.commit.Observer interface) that get notified when there
> are new revisions in the repository, like you describe above. The
> thread that makes these contentChanged() callbacks is controlled by
> the repository, so the observers should avoid overly expensive
> calculations.

That sounds and looks nice, keeping the observation simple and making
it easy for observers to be very low cost. I expect a NodeState can be
converted into a simple ID (e.g. a hash) for network transmission.


>
> Finally at the JCR level we have the JCR observation listeners that
> receive a stream of event objects instead of repository revisions.
> Again the repository is in charge of making the callbacks and thus the
> observer should be mindful of the amount of time it takes.

Ok, that makes sense. It would be nice if whatever is passed to higher
levels is lazy in nature, avoiding building big trees of pointers
(sorry, references) and triggering GC.

>
>> Will the Kernel blacklist slow observers, as the OSGi event manager does?
>
> Currently it doesn't, but for the latter two cases we'll probably need
> something like that. The first case is essentially decoupled from the
> underlying repository, so there's no need for the repository to worry
> about such observers. (Of course they could still consume a lot of CPU
> and IO, but that would then be a higher level deployment concern and
> would only indirectly affect the repository.)

Having listeners blacklisted in OSGi is a pain for the novice, but
once in production the novice realises the container is just trying to
avoid all sorts of other problems (or at least that's what I
discovered). A configurable threshold keeps most people happy and
feeling they are in control.
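
To make the configurable threshold idea concrete, here is a minimal
sketch. It is an assumption-laden illustration, not how Oak or the
OSGi event manager actually works: GuardedDispatcher is a hypothetical
name, listeners are identified by a plain string, and the clock is
injected so the behaviour can be checked without real delays.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical dispatcher; neither Oak nor OSGi is claimed to work exactly like this.
final class GuardedDispatcher {
    private final long thresholdNanos;
    private final LongSupplier clock;
    private final Set<String> blacklisted = ConcurrentHashMap.newKeySet();

    GuardedDispatcher(long thresholdNanos, LongSupplier clock) {
        this.thresholdNanos = thresholdNanos;
        this.clock = clock;
    }

    // Runs the callback unless the listener is blacklisted; returns whether it ran.
    boolean dispatch(String listenerName, Runnable callback) {
        if (blacklisted.contains(listenerName)) {
            return false;
        }
        long start = clock.getAsLong();
        callback.run();
        long elapsed = clock.getAsLong() - start;
        if (elapsed > thresholdNanos) {
            // The listener exceeded the configured budget: skip it from now on.
            blacklisted.add(listenerName);
        }
        return true;
    }

    boolean isBlacklisted(String listenerName) {
        return blacklisted.contains(listenerName);
    }
}
```

In a real deployment the clock would be System::nanoTime and the
threshold would come from configuration.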

All questions answered, thank you for the detailed answers.

Best Regards
Ian


>
> BR,
>
> Jukka Zitting

Re: Kernel stats question.

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Mar 1, 2013 at 9:42 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> I am more interested in counters than average or last operation time
> measurements, as that gives an idea of hotspots, multi-threaded
> throughput at full load and anomalies rather than general slowness
> (not saying time measurements are not useful, just hard to interpret
> in a highly multi-threaded server under load).

Good point. It would be good to use OAK-364 to gather ideas for
numbers that would be useful or at least interesting to track.

> Does that mean observers get a stream of revision tokens, and it's up
> to them to queue the revision for later processing, even if that means
> just keeping the last one worked on and the most recent (i.e. no queue
> at all, just a range of revisions)?

That depends on the level at which you're observing the repository.

At the lowest level you can just poll the repository for new
revisions of the content tree and do a content diff to find out what
changed. The repository itself does nothing special for you, just
gives you access to the latest revision and the ability to compare two
revisions of the repository.

Since explicitly managing such a polling mechanism can be a bit
cumbersome, we also provide a way to register listeners (see the
o.a.j.oak.spi.commit.Observer interface) that get notified when there
are new revisions in the repository, like you describe above. The
thread that makes these contentChanged() callbacks is controlled by
the repository, so the observers should avoid overly expensive
calculations.
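
One way a listener can keep its callback cheap is to record only the
newest revision and do the real work later on a thread it owns. The
sketch below shows that pattern with hypothetical names: the
RevisionListener interface and its String revision identifier are
assumptions made for illustration, and the actual signature of the
Oak Observer interface may differ.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical callback interface; the real Oak Observer signature may differ.
interface RevisionListener {
    void contentChanged(String revisionId);
}

// Keeps the callback cheap: it only records the newest revision.
// Expensive processing happens later on a thread the observer owns.
final class LatestOnlyListener implements RevisionListener {
    private final AtomicReference<String> pending = new AtomicReference<>();

    @Override
    public void contentChanged(String revisionId) {
        // Constant-time and non-blocking, so the calling thread is not held up.
        pending.set(revisionId);
    }

    // Called from the observer's own thread; returns null if nothing is new.
    String takePending() {
        return pending.getAndSet(null);
    }
}
```

Intermediate revisions are deliberately overwritten; the observer
compares the revision it last processed with the one it takes here.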

Finally at the JCR level we have the JCR observation listeners that
receive a stream of event objects instead of repository revisions.
Again the repository is in charge of making the callbacks and thus the
observer should be mindful of the amount of time it takes.

> Will the Kernel blacklist slow observers, as the OSGi event manager does?

Currently it doesn't, but for the latter two cases we'll probably need
something like that. The first case is essentially decoupled from the
underlying repository, so there's no need for the repository to worry
about such observers. (Of course they could still consume a lot of CPU
and IO, but that would then be a higher level deployment concern and
would only indirectly affect the repository.)

BR,

Jukka Zitting

Re: Kernel stats question.

Posted by Ian Boston <ie...@tfd.co.uk>.
On 1 March 2013 18:19, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On Thu, Feb 28, 2013 at 11:55 PM, Ian Boston <ie...@tfd.co.uk> wrote:
>> Are there plans to expose kernel stats from Oak ?
>
> Yes, see OAK-364 [1], though so far we haven't implemented
> anything along these lines.

Nice, thanks.
I am more interested in counters than average or last operation time
measurements, as that gives an idea of hotspots, multi-threaded
throughput at full load and anomalies rather than general slowness
(not saying time measurements are not useful, just hard to interpret
in a highly multi-threaded server under load).
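
A minimal sketch of the kind of counters meant here, using only the
JDK. OperationCounters and the operation names are hypothetical and
not part of Oak, Jackrabbit or Sling.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical registry of named event counters, safe to update from many threads.
final class OperationCounters {
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Counts one occurrence of the named operation.
    void increment(String operation) {
        counters.computeIfAbsent(operation, name -> new LongAdder()).increment();
    }

    // Returns a point-in-time copy of all counters, sorted by name.
    Map<String, Long> snapshot() {
        Map<String, Long> copy = new TreeMap<>();
        counters.forEach((name, adder) -> copy.put(name, adder.sum()));
        return copy;
    }
}
```

Sampling snapshot() at intervals and subtracting successive values
gives per-interval rates without timing individual operations.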

>
> One related idea I was already thinking about implementing is exposing
> cache statistics [2] from the Guava caches we're using at the
> MicroKernel level.
>
>> The Jackrabbit ObservationDispatcher has a queue that contains an
>> AtomicInteger indicating the size of the queue. If Observers are slow
>> in responding to asynchronous notifications, the queue can grow
>> rapidly.
>
> This actually is an area that won't be an issue with Oak, as instead
> of using a centralized queue from where events are pushed to observers
> we're allowing each observer to generate events by comparing
> successive repository revisions at their own pace. Thus one slow
> observer will only block itself.

Does that mean observers get a stream of revision tokens, and it's up
to them to queue the revision for later processing, even if that means
just keeping the last one worked on and the most recent (i.e. no queue
at all, just a range of revisions)?

Will the Kernel blacklist slow observers, as the OSGi event manager does?

Best Regards
Ian

BTW: part of the reason for asking is I am looking into exposing more
stats over in Sling.

>
> Of course there are other areas where such metrics will still be highly useful.
>
> [1] https://issues.apache.org/jira/browse/OAK-364
> [2] http://code.google.com/p/guava-libraries/wiki/CachesExplained#Statistics
>
> BR,
>
> Jukka Zitting