Posted to dev@sling.apache.org by Ian Boston <ie...@tfd.co.uk> on 2013/02/22 23:59:09 UTC

Monitoring and Statistics

Hi,
Whilst writing the MBeans in the event bundle I started thinking about
monitoring inside Sling. IMHO there is not enough of it to really know what
an instance under load is doing. Much though I like JMX, it comes with
implementation and runtime overhead which I don't much like.

Runtime:
* Running with JMX enabled doesn't add any overhead, but once a client
is connected there is some (some reports put it at up to 3% of resources).
* You have to remember to enable it, and most of the time JVMs are run
without it enabled. By the time you really need it, it's often too late.
* JMX is not RESTful.

Implementation
* MBeans are not that hard to implement with the OSGi Whiteboard, but
they have to be implemented.

Alternatives.
In Jackrabbit there is/was a statistics class [1], which IIUC uses
counters and time series stored in a map. The service can then be
queried to extract the values by wrapping it in an MBean or Servlet.

I think the approach could be generalised and extended so that
anything in the container could use the service to record metrics. The
api might look something like

public interface Statistics {

      /**
       * Increment a counter by 1
       */
      void increment(String counterName);

      /**
       * Record a double value in a timeseries.
       */
      void record(String timeSeriesName, double value);

      /**
       * Record a long value in a timeseries.
       */
      void record(String timeSeriesName, long value);

}

and (so that any reference can be optional on a service
implementation, the final is a hint to hotspot to inline)

public final class StatisticsUtils {

  private StatisticsUtils() {
  }

  public static void increment(Statistics statistics, String counterName) {
     if ( statistics != null ) {
         statistics.increment(counterName);
     }
  }

  // ... etc. for the other methods ...
}




The service would need to deal with all the implementation details
(including concurrency and speed). The service implementation would
also come with a servlet endpoint (under /system/*) and/or single JMX
MBean.
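
As a rough illustration of what such a service implementation might look like, here is a minimal sketch. It is not the Jackrabbit code or any existing Sling class: the name InMemoryStatistics and the snapshot() method are assumptions made for this example, and for brevity it keeps only the latest sample per time series rather than a real history.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Same shape as the interface proposed above.
interface Statistics {
    void increment(String counterName);
    void record(String timeSeriesName, double value);
    void record(String timeSeriesName, long value);
}

// Hypothetical implementation; the class name and snapshot() are assumptions.
final class InMemoryStatistics implements Statistics {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();
    // Simplification: only the most recent sample per series is retained.
    private final ConcurrentHashMap<String, Number> latest = new ConcurrentHashMap<>();

    @Override
    public void increment(String counterName) {
        // computeIfAbsent creates the counter once; later calls only touch the AtomicLong.
        counters.computeIfAbsent(counterName, k -> new AtomicLong()).incrementAndGet();
    }

    @Override
    public void record(String timeSeriesName, double value) {
        latest.put(timeSeriesName, value);
    }

    @Override
    public void record(String timeSeriesName, long value) {
        latest.put(timeSeriesName, value);
    }

    // Point-in-time copy that a servlet or a single MBean could serialise.
    Map<String, Number> snapshot() {
        Map<String, Number> copy = new TreeMap<>(latest);
        counters.forEach((name, value) -> copy.put(name, value.get()));
        return copy;
    }
}
```

A caller would bind to the service, call increment("read_nodes") on the hot path, and leave all aggregation to whatever reads snapshot().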

Anything that wanted to record stats would then bind to the service
and use it. I think this would avoid the issues mentioned above with
wide scale MBean usage.

WDYT?

(apologies for the noise if this already exists, and if so, please
treat it as a question: where and how do we record stats?)

Ian




1 http://wiki.apache.org/jackrabbit/Statistics

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 26 February 2013 09:54, Ian Boston <ie...@tfd.co.uk> wrote:
> On 25 February 2013 21:56, Bertrand Delacretaz <bd...@apache.org> wrote:
>> On Mon, Feb 25, 2013 at 11:14 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>>> ...I think it's quite common when running more than a few instances to want to
>>> be able to quickly gather read only stats from all the instances. Ie a json
>>> feed per server. The easiest way of doing that is to have a single end
>>> point that delivers a bundle of counters with a time stamp...
>>
>> I tend to agree, but IMO that's a different use case than JMX which is
>> a more general management framework. You're looking here at a subset
>> which is just monitoring counters and time series.
>>
>> Keeping track of counters can also be seen as a logging activity, we
>> might also use slf4j markers for this?
>
> yes, true.
>
>>
>> Using a "counter" Marker and logging at the TRACE level, for example,
>> to tell the logger to behave as a counter as well, and interpret
>> messages like +1 or -1 (which is cheap with String comparison).
>>
>> Just a rough idea, but this would avoid inventing new services, and
>> the resulting values can be made available both via JMX and a
>> "counters" HTTP endpoint, which I agree makes sense to feed tools like
>> Splunk. And using loggers we inherit the existing enable/disable
>> functionality.
>>
>> I experimented a bit with slf4j markers recently, just committed that
>> experiment in my whiteboard as revision 1449656 but the use case is
>> different - use markers to tell some loggers to ignore all messages
>> that don't have a specific marker.
>
>
> Thanks, I'll talke a look.
>

Hi,

IIUC
Using Markers for logging statistics could be done with

Marker counter = MarkerFactory.getMarker("counter");
Logger logger = LoggerFactory.getLogger(MyClass.class);


...


logger.info(counter,"+1 read_nodes");

And then in SlingLogger at some point (probably the log method)

if ( marker != null && "counter".equals(marker.getName()) ) {
    // process the message and adjust the appropriate counter,
    // "read_nodes" in the above example.
    ...
} else {
    // log normally
    ...
}


Was that the idea ?


It does remove any additional APIs and gives additional control, since
the level can be used to switch off certain counters completely.
However, it requires that the code is manually instrumented. That could
be a good thing or a bad thing; I am in two minds.
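
The "process the message" branch could be sketched roughly as follows. To keep it self-contained the sketch does not use any slf4j types; the class name CounterMessages and the exact message grammar ("+1 name" or "-1 name") are assumptions taken from the example message above, not existing SlingLogger behaviour.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper the logger could delegate to when it sees the "counter" marker.
final class CounterMessages {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Applies "+1 <name>" or "-1 <name>"; returns false if the message is not understood.
    boolean apply(String message) {
        int space = message.indexOf(' ');
        if (space <= 0) {
            return false;
        }
        String op = message.substring(0, space);
        long delta;
        if ("+1".equals(op)) {
            delta = 1L;
        } else if ("-1".equals(op)) {
            delta = -1L;
        } else {
            return false;
        }
        String name = message.substring(space + 1).trim();
        if (name.isEmpty()) {
            return false;
        }
        counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
        return true;
    }

    // Current value of a counter, or 0 if it has never been touched.
    long value(String name) {
        AtomicLong counter = counters.get(name);
        return counter == null ? 0L : counter.get();
    }
}
```

Messages that do not parse return false, so the caller can fall back to logging them normally.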


Ian

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 25 February 2013 21:56, Bertrand Delacretaz <bd...@apache.org> wrote:
> On Mon, Feb 25, 2013 at 11:14 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>> ...I think it's quite common when running more than a few instances to want to
>> be able to quickly gather read only stats from all the instances. Ie a json
>> feed per server. The easiest way of doing that is to have a single end
>> point that delivers a bundle of counters with a time stamp...
>
> I tend to agree, but IMO that's a different use case than JMX which is
> a more general management framework. You're looking here at a subset
> which is just monitoring counters and time series.
>
> Keeping track of counters can also be seen as a logging activity, we
> might also use slf4j markers for this?

yes, true.

>
> Using a "counter" Marker and logging at the TRACE level, for example,
> to tell the logger to behave as a counter as well, and interpret
> messages like +1 or -1 (which is cheap with String comparison).
>
> Just a rough idea, but this would avoid inventing new services, and
> the resulting values can be made available both via JMX and a
> "counters" HTTP endpoint, which I agree makes sense to feed tools like
> Splunk. And using loggers we inherit the existing enable/disable
> functionality.
>
> I experimented a bit with slf4j markers recently, just committed that
> experiment in my whiteboard as revision 1449656 but the use case is
> different - use markers to tell some loggers to ignore all messages
> that don't have a specific marker.


Thanks, I'll take a look.

Presumably we could do something in the LogManager to make it possible
to configure the availability of counters using the existing log
configuration interfaces with a special format.

The approach has the advantage of using existing APIs (ie no new
imports) but has the same disadvantage that the code would have to be
instrumented with code modifications.


>
> -Bertrand

Re: Monitoring and Statistics

Posted by Felix Meschberger <fm...@adobe.com>.

I am pretty sure I don't like mixing monitoring & statistics with logging. This clearly violates the separation of concerns principle ...

Of two evils, I prefer creating new API over misusing functionality meant for other things ...

Regards
Felix

Am 25.02.2013 um 11:56 schrieb Bertrand Delacretaz:

> On Mon, Feb 25, 2013 at 11:14 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>> ...I think it's quite common when running more than a few instances to want to
>> be able to quickly gather read only stats from all the instances. Ie a json
>> feed per server. The easiest way of doing that is to have a single end
>> point that delivers a bundle of counters with a time stamp...
> 
> I tend to agree, but IMO that's a different use case than JMX which is
> a more general management framework. You're looking here at a subset
> which is just monitoring counters and time series.
> 
> Keeping track of counters can also be seen as a logging activity, we
> might also use slf4j markers for this?
> 
> Using a "counter" Marker and logging at the TRACE level, for example,
> to tell the logger to behave as a counter as well, and interpret
> messages like +1 or -1 (which is cheap with String comparison).
> 
> Just a rough idea, but this would avoid inventing new services, and
> the resulting values can be made available both via JMX and a
> "counters" HTTP endpoint, which I agree makes sense to feed tools like
> Splunk. And using loggers we inherit the existing enable/disable
> functionality.
> 
> I experimented a bit with slf4j markers recently, just committed that
> experiment in my whiteboard as revision 1449656 but the use case is
> different - use markers to tell some loggers to ignore all messages
> that don't have a specific marker.
> 
> -Bertrand


--
Felix Meschberger | Principal Scientist | Adobe

Re: Monitoring and Statistics

Posted by Bertrand Delacretaz <bd...@apache.org>.

On Mon, Feb 25, 2013 at 11:14 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> ...I think it's quite common when running more than a few instances to want to
> be able to quickly gather read only stats from all the instances. Ie a json
> feed per server. The easiest way of doing that is to have a single end
> point that delivers a bundle of counters with a time stamp...

I tend to agree, but IMO that's a different use case than JMX which is
a more general management framework. You're looking here at a subset
which is just monitoring counters and time series.

Keeping track of counters can also be seen as a logging activity, we
might also use slf4j markers for this?

Using a "counter" Marker and logging at the TRACE level, for example,
to tell the logger to behave as a counter as well, and interpret
messages like +1 or -1 (which is cheap with String comparison).

Just a rough idea, but this would avoid inventing new services, and
the resulting values can be made available both via JMX and a
"counters" HTTP endpoint, which I agree makes sense to feed tools like
Splunk. And using loggers we inherit the existing enable/disable
functionality.

I experimented a bit with slf4j markers recently, just committed that
experiment in my whiteboard as revision 1449656 but the use case is
different - use markers to tell some loggers to ignore all messages
that don't have a specific marker.

-Bertrand

Re: Monitoring and Statistics

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Back to square one inventing our own soup ? ;-)

Is there really nothing available, that we can leverage ? I am feeling uncomfortable with this (not to speak of the singleton...)

Regards
Felix

Am 01.03.2013 um 08:30 schrieb Ian Boston:

> On 28 February 2013 07:03, Ian Boston <ie...@tfd.co.uk> wrote:
>> 
>> How do you all feel about a service or a singleton covering counters, set
>> values and possibly running mean?.  I am less keen on anything that's
>> expensive to record as impact on the runtime must be close to zero.
> 
> ie: something like [1]. It was a lot more sophisticated initially, but
> it was slower, created more GC traffic and in the end I thought, whats
> the point, so I cut it back to really simple again.
> 
> To use.
> import org.apache.sling.commons.monitor.Statistics;
> 
> Statistics statistics = StatisticsFactory.instance();
> statistics.get("counter").incrementAndGet();
> statistics.get("setvalue").set(System.currentTimeMillis());
> 
> or to avoid the get completely.
> 
> private static final AtomicLong counter =
> StatisticsFactory.instance().get("counter");
> 
> 
> counter.incrementAndGet();
> 
> The output of the servlet is of the form:
> {
>    "_timenanos": 1362121529334796000,
>    "counter": 1,
>    "ObservationDispatcher.backlog": 101,
>    "ObservationDispatcher.added": 12191812191,
>    "ObservationDispatcher.removed": 12191812090
> }
> 
> I thought about Statistics as a OSGI service, but decided not to to
> avoid additional bindings. As a singleton only an import is needed.
> 
> WDYT, usable, or is JMX beans in each bundle going to be the way ?
> 
> Ian
> 
> 
> 
> 1 http://svn.apache.org/repos/asf/sling/whiteboard/ieb/monitor/
> 
> 
>> 
>> Ian


--
Felix Meschberger | Principal Scientist | Adobe

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 28 February 2013 07:03, Ian Boston <ie...@tfd.co.uk> wrote:
>
> How do you all feel about a service or a singleton covering counters, set
> values and possibly running mean?.  I am less keen on anything that's
> expensive to record as impact on the runtime must be close to zero.

i.e. something like [1]. It was a lot more sophisticated initially, but
it was slower and created more GC traffic, and in the end I thought, what's
the point, so I cut it back to really simple again.

To use.
import org.apache.sling.commons.monitor.Statistics;

 Statistics statistics = StatisticsFactory.instance();
 statistics.get("counter").incrementAndGet();
 statistics.get("setvalue").set(System.currentTimeMillis());

or to avoid the get completely.

private static final AtomicLong counter =
StatisticsFactory.instance().get("counter");

counter.incrementAndGet();

The output of the servlet is of the form:
{
    "_timenanos": 1362121529334796000,
    "counter": 1,
    "ObservationDispatcher.backlog": 101,
    "ObservationDispatcher.added": 12191812191,
    "ObservationDispatcher.removed": 12191812090
}

I thought about Statistics as an OSGi service, but decided against it to
avoid additional bindings. As a singleton, only an import is needed.
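
For readers without access to the whiteboard code, the singleton pattern described here might look roughly like the sketch below. This is a guess at the shape, not the code at [1]: the names SimpleStatistics, SimpleStatisticsFactory and NodeReader are hypothetical and were chosen so they cannot be mistaken for the real classes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry of named AtomicLongs; not the whiteboard implementation.
final class SimpleStatistics {
    private final ConcurrentHashMap<String, AtomicLong> values = new ConcurrentHashMap<>();

    // Returns the same AtomicLong for the same name, creating it on first use.
    AtomicLong get(String name) {
        return values.computeIfAbsent(name, k -> new AtomicLong());
    }
}

// Process-wide singleton, so callers need an import but no service binding.
final class SimpleStatisticsFactory {
    private static final SimpleStatistics INSTANCE = new SimpleStatistics();

    private SimpleStatisticsFactory() {
    }

    static SimpleStatistics instance() {
        return INSTANCE;
    }
}

// Example caller: the map lookup happens once, at class initialisation.
final class NodeReader {
    private static final AtomicLong READS =
            SimpleStatisticsFactory.instance().get("NodeReader.reads");

    void read() {
        READS.incrementAndGet();
    }
}
```

Caching the AtomicLong in a static field keeps the hot path down to a single atomic increment, at the cost of the counter living for as long as the singleton does.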

WDYT, usable, or are JMX beans in each bundle going to be the way?

Ian

1 http://svn.apache.org/repos/asf/sling/whiteboard/ieb/monitor/

>
> Ian

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

 Yes I agree, mixing logging with monitoring is a bit evil. It's also ugly
to do anything beyond a simple incrementing counter.

How do you all feel about a service or a singleton covering counters, set
values and possibly a running mean? I am less keen on anything that's
expensive to record, as the impact on the runtime must be close to zero.

@carsten. The intention is that the consuming service collects the counters
and performs first and second order differentials on the time series to
generate rate and rate of change information. Doing this outside simplifies
the internal monitoring code and reduces its impact on the runtime. Most
monitoring frameworks (Munin, Reconnoiter) are capable of working like this
and some prefer it. The service also needs to support set values, so that
things like queue backlog can be monitored.
I started with pure counters since that was easiest.
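
To make the "first and second order differentials" concrete, here is a small sketch of what a consumer could compute from periodically sampled counter values. The class and method names are hypothetical, and the sketch assumes strictly increasing timestamps in milliseconds.

```java
// Hypothetical consumer-side helper; nothing like this needs to run inside the server.
final class CounterDifferentials {
    private CounterDifferentials() {
    }

    // First-order difference: counter change per second between consecutive samples.
    static double[] rates(long[] timesMillis, long[] counts) {
        int n = Math.min(timesMillis.length, counts.length);
        if (n < 2) {
            return new double[0];
        }
        double[] rates = new double[n - 1];
        for (int i = 1; i < n; i++) {
            double seconds = (timesMillis[i] - timesMillis[i - 1]) / 1000.0;
            rates[i - 1] = (counts[i] - counts[i - 1]) / seconds;
        }
        return rates;
    }

    // Second-order difference: change of the rate per second, measured between
    // the midpoints of adjacent sampling intervals.
    static double[] rateOfChange(long[] timesMillis, long[] counts) {
        double[] rates = rates(timesMillis, counts);
        if (rates.length < 2) {
            return new double[0];
        }
        double[] result = new double[rates.length - 1];
        for (int i = 0; i < result.length; i++) {
            // rates[i] covers [t[i], t[i+1]]; adjacent midpoints are (t[i+2] - t[i]) / 2 apart.
            double seconds = (timesMillis[i + 2] - timesMillis[i]) / 2000.0;
            result[i] = (rates[i + 1] - rates[i]) / seconds;
        }
        return result;
    }
}
```

With samples taken every second and counts of 0, 10, 30 and 60, this gives rates of 10, 20 and 30 per second and a rate of change of 10 per second per second.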

Ian

On Thursday, February 28, 2013, Bertrand Delacretaz wrote:

> On Wed, Feb 27, 2013 at 12:56 AM, Ian Boston <ieb@tfd.co.uk <javascript:;>>
> wrote:
> ...
> > public static final Marker COUNT = MarkerFactory.getMarker("count");
> > log.info(Counter.COUNT,"read-property");...
>
> I agree with others that this looks like abusing the logging API...my
> initial idea was more like
>
> log.info(Markers.INCREMENT, "{} incremented for {}", "read-property",
> property);
>
> which would combine logging and keeping counters in a single
> call...not sure if it's practical.
>
> -Bertrand (at ApacheCon, sorry about unpredictable response times)
>

Re: Monitoring and Statistics

Posted by Bertrand Delacretaz <bd...@apache.org>.

On Wed, Feb 27, 2013 at 12:56 AM, Ian Boston <ie...@tfd.co.uk> wrote:
...
> public static final Marker COUNT = MarkerFactory.getMarker("count");
> log.info(Counter.COUNT,"read-property");...

I agree with others that this looks like abusing the logging API...my
initial idea was more like

log.info(Markers.INCREMENT, "{} incremented for {}", "read-property", property);

which would combine logging and keeping counters in a single
call...not sure if it's practical.

-Bertrand (at ApacheCon, sorry about unpredictable response times)

Re: Monitoring and Statistics

Posted by Carsten Ziegeler <cz...@apache.org>.

I don't want you to stop experimenting but logging a counter seems to
have a lot of practical problems to me, especially when it comes to
evaluating them.

Carsten

2013/2/27 Ian Boston <ie...@tfd.co.uk>:
> Hi,
> I have done a witeboard impl [1] using Bertrand's Marker idea, based
> on the current commons-log bundle (@Bertrand I wasn't sure if I should
> take your whiteboard code or not as a starting point).
>
> Its a simple concurrent map of AtomicLongs, which I suspect wont be to
> everyones taste.
> There is a servlet which dumps the map out to json using a
> StringBuilder to avoid adding any dependencies to logging.
>
> I am still testing, but so far I dont think there is any non
> concurrent behaviour with upto 32 threads, 100 counters invoked a few
> M times. Overhead to increment a counter is about 200-240ns on my box
> (MBP 2.53 Ghz Intel Core 2 Duo, JDK1.6). I need to get some stats if
> there are any blocking waits by threads but from the tests so far
> there dont appear to be any.
>
>
> To use it you would do
>
> public static final Marker COUNT = MarkerFactory.getMarker("count");
> ...
>
> log.info(Counter.COUNT,"read-property");
>
> which adds 1 to the counter "read-property".
>
>
> Ian
>
> 1 http://svn.apache.org/viewvc?view=revision&revision=r1450676
>
>
> On 27 February 2013 15:23, Ian Boston <ie...@tfd.co.uk> wrote:
>> I was thinking of covering this first with a multi threaded unit test
>> to verify no concurrency issues and to give an idea of maximum
>> throughput, since a centralised counter working off logging API
>> markers could be used anywhere.
>>
>> I will do before after testing as well.
>> Ian
>>
>> On 26 February 2013 17:55, Carsten Ziegeler <cz...@apache.org> wrote:
>>> Just a general comment especially on statistics, whatever we do, we
>>> carefully need to check the performance impact. Over the past years,
>>> I've seen too many approaches where a simple statistics like "number
>>> of incoming requests" was a real bottleneck under load (I think one of
>>> our first implementations in Sling engine had a similar problem). So
>>> while such an information is interesting and important it shouldn't
>>> bring down the server or reduce the performance significantly. Doing
>>> some benchmarking before and after should do the trick.
>>>
>>> Carsten
>>> --
>>> Carsten Ziegeler
>>> cziegeler@apache.org



-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

Hi,
I have done a whiteboard impl [1] using Bertrand's Marker idea, based
on the current commons-log bundle (@Bertrand I wasn't sure if I should
take your whiteboard code or not as a starting point).

It's a simple concurrent map of AtomicLongs, which I suspect won't be to
everyone's taste.
There is a servlet which dumps the map out to JSON using a
StringBuilder to avoid adding any dependencies to logging.
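
Dumping such a map to JSON with nothing but a StringBuilder might look something like the sketch below. It is not the servlet from the whiteboard commit: the class name CounterJson is hypothetical, keys are sorted only to make the output stable, and the "_timenanos" field name is borrowed from the sample output posted elsewhere in this thread.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical dependency-free serialiser for a name -> value counter map.
final class CounterJson {
    private CounterJson() {
    }

    static String toJson(long timeNanos, Map<String, Long> counters) {
        StringBuilder sb = new StringBuilder("{\"_timenanos\":").append(timeNanos);
        // Sorted copy so that repeated dumps list the counters in the same order.
        for (Map.Entry<String, Long> e : new TreeMap<String, Long>(counters).entrySet()) {
            sb.append(",\"");
            appendEscaped(sb, e.getKey());
            sb.append("\":").append(e.getValue());
        }
        return sb.append('}').toString();
    }

    // Minimal JSON string escaping: quotes, backslashes and control characters.
    private static void appendEscaped(StringBuilder sb, String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
    }
}
```

A servlet would only need to snapshot the AtomicLong values into a Map<String, Long> and write the returned string to the response.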

I am still testing, but so far I don't think there is any thread-unsafe
behaviour with up to 32 threads and 100 counters invoked a few
million times. Overhead to increment a counter is about 200-240ns on my box
(MBP 2.53 GHz Intel Core 2 Duo, JDK 1.6). I still need to get some stats on
whether there are any blocking waits by threads, but from the tests so far
there don't appear to be any.
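
A multi-threaded check of this kind can be sketched as below. This is not the test in the whiteboard commit; the class name, the counter naming and the one-minute timeout are assumptions. The idea is that the sum of all counters must equal threads x increments if no update is lost.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stress harness for a concurrent map of AtomicLongs.
final class CounterStressSketch {
    private CounterStressSketch() {
    }

    // Returns the sum of all counters after every thread has finished.
    static long run(int threads, int counters, int incrementsPerThread) {
        ConcurrentHashMap<String, AtomicLong> map = new ConcurrentHashMap<>();
        String[] names = new String[counters];
        for (int i = 0; i < counters; i++) {
            names[i] = "counter-" + i;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers together to maximise contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < incrementsPerThread; i++) {
                    map.computeIfAbsent(names[i % counters], k -> new AtomicLong())
                       .incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long total = 0L;
        for (AtomicLong value : map.values()) {
            total += value.get();
        }
        return total;
    }
}
```

Wrapping the call in System.nanoTime() measurements would give a rough per-increment cost, though a proper benchmark harness would be needed for numbers worth quoting.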

To use it you would do

public static final Marker COUNT = MarkerFactory.getMarker("count");
...

log.info(Counter.COUNT,"read-property");

which adds 1 to the counter "read-property".

Ian

1 http://svn.apache.org/viewvc?view=revision&revision=r1450676

On 27 February 2013 15:23, Ian Boston <ie...@tfd.co.uk> wrote:
> I was thinking of covering this first with a multi threaded unit test
> to verify no concurrency issues and to give an idea of maximum
> throughput, since a centralised counter working off logging API
> markers could be used anywhere.
>
> I will do before after testing as well.
> Ian
>
> On 26 February 2013 17:55, Carsten Ziegeler <cz...@apache.org> wrote:
>> Just a general comment especially on statistics, whatever we do, we
>> carefully need to check the performance impact. Over the past years,
>> I've seen too many approaches where a simple statistics like "number
>> of incoming requests" was a real bottleneck under load (I think one of
>> our first implementations in Sling engine had a similar problem). So
>> while such an information is interesting and important it shouldn't
>> bring down the server or reduce the performance significantly. Doing
>> some benchmarking before and after should do the trick.
>>
>> Carsten
>> --
>> Carsten Ziegeler
>> cziegeler@apache.org

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

I was thinking of covering this first with a multi-threaded unit test
to verify there are no concurrency issues and to give an idea of maximum
throughput, since a centralised counter working off logging API
markers could be used anywhere.

I will do before/after testing as well.
Ian

On 26 February 2013 17:55, Carsten Ziegeler <cz...@apache.org> wrote:
> Just a general comment especially on statistics, whatever we do, we
> carefully need to check the performance impact. Over the past years,
> I've seen too many approaches where a simple statistics like "number
> of incoming requests" was a real bottleneck under load (I think one of
> our first implementations in Sling engine had a similar problem). So
> while such an information is interesting and important it shouldn't
> bring down the server or reduce the performance significantly. Doing
> some benchmarking before and after should do the trick.
>
> Carsten
> --
> Carsten Ziegeler
> cziegeler@apache.org

Re: Monitoring and Statistics

Posted by Carsten Ziegeler <cz...@apache.org>.

Just a general comment, especially on statistics: whatever we do, we
need to carefully check the performance impact. Over the past years,
I've seen too many approaches where a simple statistic like "number
of incoming requests" was a real bottleneck under load (I think one of
our first implementations in Sling engine had a similar problem). So
while such information is interesting and important, it shouldn't
bring down the server or reduce the performance significantly. Doing
some benchmarking before and after should do the trick.

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 26 February 2013 09:48, Ian Boston <ie...@tfd.co.uk> wrote:
> On 26 February 2013 00:27, Felix Meschberger <fm...@adobe.com> wrote:
>> Hi,
>>
>> Ok, I understand. Yet your solution is quite intrusive (requires code changes and Code Dependencies).
>>
>> OTOH: I just was made aware of JAMon [1]. It looks like this may do a lot of things but is not intrusive since it seems to transparently work with wrappers and proxies.
>>
>> WDYT ?
>>
>
> It looks interesting however the license [1] will be a problem. Its
> wording looks like BSD but only allows binary code redistribution.
> Thats not really open source and makes me worry about sustainability ?
>
> But I like the approach, almost like Mockito with
>
> MonitorFactory.proxy(Repository.class).login();
>
> What happens when you use aspects or CGLib with OSGi ?
> Ideally we would want to be able to monitor classes not exported.
> Has anyone tried ?
>

Outcome of some research, reading the JAMon code base and other reading:

JAMon creates proxies which you have to use instead of the object. I
was wrong about its Mockito-like behaviour:

so

MyClass proxyMyclass = MonitorFactory.proxy(myClass);
proxyMyclass.doSomething();

Will record 1 call for proxyMyclass and not for all calls to
doSomething() on all instances of MyClass (i.e. it's not like an aspect
pointcut).

As such, it might be of use if wrapping service registration.

Second issue:
It uses JDK proxies which can only proxy interfaces and not classes.
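
Both points can be seen in a small sketch built directly on java.lang.reflect.Proxy. This is not JAMon's implementation; the CountingProxy class and its CALLS map are hypothetical. Only calls made through the returned proxy are counted, and the first argument has to be an interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical call-counting wrapper based on JDK dynamic proxies.
final class CountingProxy {
    static final ConcurrentHashMap<String, AtomicLong> CALLS = new ConcurrentHashMap<>();

    private CountingProxy() {
    }

    // iface must be an interface; Proxy.newProxyInstance rejects classes.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            String key = iface.getSimpleName() + "." + method.getName();
            CALLS.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the target itself threw
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }
}
```

Calling the wrapped object directly, rather than through the proxy, leaves CALLS untouched, which is why wrapping at service registration is about the only place where this approach sees every call.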

I also looked at AspectJ in an OSGi context, which is probably a
non-starter since load-time weaving has to be configured in the base
container and will modify the behaviour of the classloaders, blocking
reloading of bundles. Equinox has weaving with [1]. I didn't find
anything for Felix other than [2], which uses a modified service
registration mechanism. All of this (AOP/AspectJ/JDK proxies) will only
allow monitoring of visible interfaces and not recording of stats within
an implementation, below the service interface level.

Proxying with CGLib will allow proxying of classes as well as
interfaces, but using this to monitor classes with no code changes in
the bundles would require modifications to something within Felix
(classloaders?), all of which is probably too extreme to address a
monitoring requirement.

For completeness I've asked the authors of JAMon about their license wrt BSD.

Ian

1 http://eclipse.org/equinox/weaving/
2 http://www.mail-archive.com/users@felix.apache.org/msg07065.html

>
>
> 1 http://jamonapi.sourceforge.net/JAMonLicense.html
>
>
> <snip>

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 26 February 2013 00:27, Felix Meschberger <fm...@adobe.com> wrote:
> Hi,
>
> Ok, I understand. Yet your solution is quite intrusive (requires code changes and Code Dependencies).
>
> OTOH: I just was made aware of JAMon [1]. It looks like this may do a lot of things but is not intrusive since it seems to transparently work with wrappers and proxies.
>
> WDYT ?
>

It looks interesting, however the license [1] will be a problem. Its
wording looks like BSD but only allows binary code redistribution.
That's not really open source and makes me worry about sustainability.

But I like the approach, almost like Mockito with

MonitorFactory.proxy(Repository.class).login();

What happens when you use aspects or CGLib with OSGi ?
Ideally we would want to be able to monitor classes not exported.
Has anyone tried ?

1 http://jamonapi.sourceforge.net/JAMonLicense.html

<snip>

Re: Monitoring and Statistics

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Ok, I understand. Yet your solution is quite intrusive (requires code changes and Code Dependencies).

OTOH: I just was made aware of JAMon [1]. It looks like this may do a lot of things but is not intrusive since it seems to transparently work with wrappers and proxies.

WDYT ?

Regards
Felix

[1] http://jamonapi.sourceforge.net/

Am 25.02.2013 um 11:14 schrieb Ian Boston:

> On Monday, February 25, 2013, Felix Meschberger wrote:
> 
>> Hi,
>> 
>> Am 22.02.2013 um 23:59 schrieb Ian Boston:
>> 
>>> Hi,
>>> Whilst writing the MBeans in the event bundle I started thinking about
>>> monitoring inside Sling. IMHO there are not enough to really know what
>>> a instance under load is doing. Much though I like JMX it comes with
>>> implementation and runtime overhead which I don't much like.
>>> 
>>> Runtime:
>>> * Running with JMX enabled doesn't add any overhead, but once a client
>>> is connected there is some (some reports upto 3% of resources).
>> 
>> I agree. But a client connecting (actually it is not the connection per
>> se, it is the requests executed by the client, which bring the load) may
>> load the system, which is ok, because the client is investigating.
>> 
>> 
>>> * You have to remember to enable it, and most of the time JVMs are not
>>> enabled. By the time you really need it, its often too late.
>>> * JMX is not restful.
>> 
>> I don't think that this is enough of a reason to reinvent the wheel given
>> the spread of JMX support in tools.
>> 
>>> 
>>> Implementation
>>> * MBeans are not that hard to implement with the OSGi Whiteboard, but
>>> they have to be implemented.
>> 
>> Right. Everything that has to be queried has to be implemented. The actual
>> problem I see, is that a separate API is required, which in turn is good
>> programming style anyway when doing services. Yet this API does not need to
>> be exported because the JMX Whiteboard gets to it through the MBean service
>> class (IIRC).
>> 
>>> 
>>> Alternatives.
>>> In Jackarabbit there is/was a statistics class [1], which IIUC uses
>>> counters and time series stored in a map. The service can then be
>>> queries to extract the values by wrapping in an MBean or Servlet.
>> 
>> The only advantage over direct JMX I see, is the Servlet wrapping
>> (probably the Web Console in our case ?). How common is that use case ?
> 
> 
> 
> I think it's quite common when running more than a few instances to want to
> be able to quickly gather read only stats from all the instances. Ie a json
> feed per server. The easiest way of doing that is to have a single end
> point that delivers a bundle of counters with a time stamp. If that's done
> then all of the time series and trending can be performed by the monitoring
> tool that can be much more efficient at the task. ( ie Munin, Reconoiter,
> Splunk etc)
> 
> The main problem with jmx in a multiple instance monitoring environment is
> two fold.
> It opens up all sorts of admin level operations.
> The jmx protocol is rpc like in nature requiring many network rounds trips.
> 
> I am not saying it cant be done with jmx, its just not very efficient and
> with more than a handfull of jvms has to be done on the instance in
> question, side stepping the scaling issue.
> 
> As for the suggested service. The reason for the centralised map of
> counters is so that the get operation can just serialise that map, rather
> than having to invoke get attributes on many Mbeans. The other reason for
> the centralised map of counters is so that areas that need to record usage,
> can do so with single lines of code. It's quite close to the pattern of
> RepositoryStatislticsImpl in Jackrabbit except there an enum is used to
> control the map and the objects are all time series.
> 
> 
> I agree using jmx for monitoring of one or two jvms does make sense and it
> would not be a good idea to reinvent that wheel. It would certainly be a
> mistake to reinvent operations or notifications.
> 
> Ian
> 
> 
> 
>> 
>> And even then I would prefer a good DTO design over just a "counter map".
>> 
>> The StatisticsUtil class is IMHO overkill.
>> 
>> Regards
>> Felix
>> 
>>> 
>>> I think the approach could be generalised and extended so that
>>> anything in the container could use the service to record metrics. The
>>> api might look something like
>>> 
>>> public interface Statistics {
>>> 
>>>     /**
>>>      * Increment a counter by 1
>>>      */
>>>     void increment(String counterName);
>>> 
>>>     /**
>>>      * Record a double value in a timeseries.
>>>      */
>>>     void record(String timeSeriesName, double value);
>>> 
>>>     /**
>>>      * Record a long value in a timeseries.
>>>      */
>>>     void record(String timeSeriesName, long value);
>>> 
>>> }
>>> 
>>> and (so that any reference can be optional on a service
>>> implementation, the final is a hint to hotspot to inline)
>>> 
>>> public final class StatisticsUtils {
>>> 
>>> private StatisticsUtils() {
>>> }
>>> 
>>> public static void increment(Statistics statistics, String counterName)
>> {
>>>    if ( statistics != null ) {
>>>        statistics.increment(counterName);
>>>    }
>>> }
>>> 
>>> ... etc for the other methods ..
>>> }
>>> 
>>> 
>>> 
>>> 
>>> The service would need to deal with all the implementation details
>>> (including concurrency and speed). The service implementation would
>>> also come with a servlet endpoint (under /system/*) and/or single JMX
>>> MBean.
>>> 
>>> Anything that wanted to record stats would then bind to the service
>>> and use it. I think this would avoid the issues mentioned above with
>>> wide scale MBean usage.
>>> 
>>> WDYT?
>>> 
>>> (apologies for the noise if this already exists, and if so, please
>>> treat it as a question: where and how do we record stats?)
>>> 
>>> Ian
>>> 
>>> 
>>> 
>>> 
>>> 1 http://wiki.apache.org/jackrabbit/Statistics
>> 
>> 
>> --
>> Felix Meschberger | Principal Scientist | Adobe
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 


--
Felix Meschberger | Principal Scientist | Adobe

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On Monday, February 25, 2013, Felix Meschberger wrote:

> Hi,
>
> Am 22.02.2013 um 23:59 schrieb Ian Boston:
>
> > Hi,
> > Whilst writing the MBeans in the event bundle I started thinking about
> > monitoring inside Sling. IMHO there are not enough to really know what
> > a instance under load is doing. Much though I like JMX it comes with
> > implementation and runtime overhead which I don't much like.
> >
> > Runtime:
> > * Running with JMX enabled doesn't add any overhead, but once a client
> > is connected there is some (some reports upto 3% of resources).
>
> I agree. But a client connecting (actually it is not the connection per
> se, it is the requests executed by the client, which bring the load) may
> load the system, which is ok, because the client is investigating.
>
>
> > * You have to remember to enable it, and most of the time JVMs are not
> > enabled. By the time you really need it, its often too late.
> > * JMX is not restful.
>
> I don't think that this is enough of a reason to reinvent the wheel given
> the spread of JMX support in tools.
>
> >
> > Implementation
> > * MBeans are not that hard to implement with the OSGi Whiteboard, but
> > they have to be implemented.
>
> Right. Everyhting that has to be queried has to be implemented. The actual
> problem I see, is that a separate API is required, which in turn is good
> programming style anyway when doing services. Yet this API does not need to
> be exported because the JMX Whiteboard gets to it through the MBean service
> class (IIRC).
>
> >
> > Alternatives.
> > In Jackarabbit there is/was a statistics class [1], which IIUC uses
> > counters and time series stored in a map. The service can then be
> > queries to extract the values by wrapping in an MBean or Servlet.
>
> The only advantage over direct JMX I see, is the Servlet wrapping
> (probably the Web Console in our case ?). How common is that use case ?



I think it's quite common when running more than a few instances to want to
be able to quickly gather read-only stats from all the instances, i.e. a JSON
feed per server. The easiest way of doing that is to have a single endpoint
that delivers a bundle of counters with a timestamp. If that's done,
then all of the time series and trending can be performed by the monitoring
tool, which can be much more efficient at the task (e.g. Munin, Reconnoiter,
Splunk etc.).

The main problem with JMX in a multiple-instance monitoring environment is
twofold:
* It opens up all sorts of admin-level operations.
* The JMX protocol is RPC-like in nature, requiring many network round trips.

I am not saying it can't be done with JMX, it's just not very efficient, and
with more than a handful of JVMs it has to be done on the instance in
question, sidestepping the scaling issue.

As for the suggested service: the reason for the centralised map of
counters is so that the get operation can just serialise that map, rather
than having to invoke get attributes on many MBeans. The other reason for
the centralised map of counters is so that areas that need to record usage
can do so with single lines of code. It's quite close to the pattern of
RepositoryStatisticsImpl in Jackrabbit, except that there an enum is used to
control the map and the objects are all time series.
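
A minimal sketch of the centralised counter map idea, assuming Java 8 or
later. The class name CounterRegistry, the JSON layout and the method names
are hypothetical and only illustrate "one map, one serialised snapshot";
they are not an existing Sling or Jackrabbit API.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical central registry: one map of named counters, one snapshot call. */
class CounterRegistry {

    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Increment a counter by 1, creating it on first use. */
    void increment(String counterName) {
        counters.computeIfAbsent(counterName, k -> new LongAdder()).increment();
    }

    /** Copy the current values into a sorted map so the output order is stable. */
    Map<String, Long> snapshot() {
        Map<String, Long> copy = new TreeMap<>();
        counters.forEach((name, adder) -> copy.put(name, adder.sum()));
        return copy;
    }

    /** Serialise all counters plus a timestamp as a single JSON object. */
    String toJson(long timestampMillis) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"timestamp\":").append(timestampMillis).append(",\"counters\":{");
        boolean first = true;
        for (Map.Entry<String, Long> e : snapshot().entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":").append(e.getValue());
        }
        return sb.append("}}").toString();
    }

    /** Escape backslashes and quotes so counter names stay valid JSON keys. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

A caller records usage with a single line such as
registry.increment("requests"), and a servlet or MBean wrapper would only
need to return registry.toJson(System.currentTimeMillis()).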


I agree that using JMX for monitoring of one or two JVMs does make sense and
it would not be a good idea to reinvent that wheel. It would certainly be a
mistake to reinvent operations or notifications.

Ian



>
> And even then I would prefer a good DTO design over just a "counter map".
>
> The StatisticsUtil class is IMHO overkill.
>
> Regards
> Felix
>
> >
> > I think the approach could be generalised and extended so that
> > anything in the container could use the service to record metrics. The
> > api might look something like
> >
> > public interface Statistics {
> >
> >      /**
> >       * Increment a counter by 1
> >       */
> >      void increment(String counterName);
> >
> >      /**
> >       * Record a double value in a timeseries.
> >       */
> >      void record(String timeSeriesName, double value);
> >
> >      /**
> >       * Record a long value in a timeseries.
> >       */
> >      void record(String timeSeriesName, long value);
> >
> > }
> >
> > and (so that any reference can be optional on a service
> > implementation, the final is a hint to hotspot to inline)
> >
> > public final class StatisticsUtils {
> >
> >  private StatisticsUtils() {
> >  }
> >
> >  public static void increment(Statistics statistics, String counterName)
> {
> >     if ( statistics != null ) {
> >         statistics.increment(counterName);
> >     }
> >  }
> >
> > ... etc for the other methods ..
> > }
> >
> >
> >
> >
> > The service would need to deal with all the implementation details
> > (including concurrency and speed). The service implementation would
> > also come with a servlet endpoint (under /system/*) and/or single JMX
> > MBean.
> >
> > Anything that wanted to record stats would then bind to the service
> > and use it. I think this would avoid the issues mentioned above with
> > wide scale MBean usage.
> >
> > WDYT?
> >
> > (apologies for the noise if this already exists, and if so, please
> > treat it as a question: where and how do we record stats?)
> >
> > Ian
> >
> >
> >
> >
> > 1 http://wiki.apache.org/jackrabbit/Statistics
>
>
> --
> Felix Meschberger | Principal Scientist | Adobe
>
>
>
>
>
>
>
>

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 26 February 2013 07:02, Justin Edelson <ju...@justinedelson.com> wrote:
> Hi,
>
>
> On Mon, Feb 25, 2013 at 4:24 AM, Felix Meschberger <fm...@adobe.com>wrote:
>
>> Hi,
>>
>> Am 22.02.2013 um 23:59 schrieb Ian Boston:
>>
>> >
>> > Implementation
>> > * MBeans are not that hard to implement with the OSGi Whiteboard, but
>> > they have to be implemented.
>>
>> Right. Everyhting that has to be queried has to be implemented. The actual
>> problem I see, is that a separate API is required, which in turn is good
>> programming style anyway when doing services. Yet this API does not need to
>> be exported because the JMX Whiteboard gets to it through the MBean service
>> class (IIRC).
>>
>
> The MBean API does not *need* to be exported but it *should* be exported so
> that client applications can use it if they want to.

Yup, the whiteboard impl (Aries) looks at the interface the service
implements to see if it matches *MBean. If it does, the service gets
registered using the property jmx.objectname.

If extending StandardMBean, the MBean interface being implemented is
passed in the super constructor call and all the methods on the MBean
interface are exposed to the MBean server. For JMX the interface
doesn't need to be exported, as it's only the class internals that
introspect it.
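
A minimal sketch of the StandardMBean pattern described above, under some
assumptions: the names RequestStats, RequestStatsMBean and Bean are
hypothetical, and the bean is registered directly on the platform MBeanServer
rather than through the OSGi whiteboard so that the example stays
self-contained.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.NotCompliantMBeanException;
import javax.management.ObjectName;
import javax.management.StandardMBean;

class RequestStats {

    /** Management interface; nested here only to keep the sketch in one file. */
    public interface RequestStatsMBean {
        long getRequestCount();
    }

    /** Hypothetical bean: the interface is handed to the StandardMBean constructor. */
    static final class Bean extends StandardMBean implements RequestStatsMBean {
        private final AtomicLong requests = new AtomicLong();

        Bean() throws NotCompliantMBeanException {
            super(RequestStatsMBean.class);
        }

        void recordRequest() {
            requests.incrementAndGet();
        }

        @Override
        public long getRequestCount() {
            return requests.get();
        }
    }

    /** Creates a bean and registers it under the given object name. */
    static Bean registerNew(String name) {
        try {
            Bean bean = new Bean();
            ManagementFactory.getPlatformMBeanServer().registerMBean(bean, new ObjectName(name));
            return bean;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the RequestCount attribute back through the MBean server. */
    static long readCount(String name) {
        try {
            Object value = ManagementFactory.getPlatformMBeanServer()
                    .getAttribute(new ObjectName(name), "RequestCount");
            return (Long) value;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the whiteboard, the registerMBean call would presumably be replaced by
registering the bean as an OSGi service carrying a jmx.objectname property.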

Ian

>
> Justin
>
>

<snip>

Re: Monitoring and Statistics

Posted by Justin Edelson <ju...@justinedelson.com>.

Hi,


On Mon, Feb 25, 2013 at 4:24 AM, Felix Meschberger <fm...@adobe.com>wrote:

> Hi,
>
> Am 22.02.2013 um 23:59 schrieb Ian Boston:
>
> >
> > Implementation
> > * MBeans are not that hard to implement with the OSGi Whiteboard, but
> > they have to be implemented.
>
> Right. Everyhting that has to be queried has to be implemented. The actual
> problem I see, is that a separate API is required, which in turn is good
> programming style anyway when doing services. Yet this API does not need to
> be exported because the JMX Whiteboard gets to it through the MBean service
> class (IIRC).
>

The MBean API does not *need* to be exported but it *should* be exported so
that client applications can use it if they want to.

Justin


>
> >
> > Alternatives.
> > In Jackarabbit there is/was a statistics class [1], which IIUC uses
> > counters and time series stored in a map. The service can then be
> > queries to extract the values by wrapping in an MBean or Servlet.
>
> The only advantage over direct JMX I see, is the Servlet wrapping
> (probably the Web Console in our case ?). How common is that use case ?
>
> And even then I would prefer a good DTO design over just a "counter map".
>
> The StatisticsUtil class is IMHO overkill.
>
> Regards
> Felix
>
> >
> > I think the approach could be generalised and extended so that
> > anything in the container could use the service to record metrics. The
> > api might look something like
> >
> > public interface Statistics {
> >
> >      /**
> >       * Increment a counter by 1
> >       */
> >      void increment(String counterName);
> >
> >      /**
> >       * Record a double value in a timeseries.
> >       */
> >      void record(String timeSeriesName, double value);
> >
> >      /**
> >       * Record a long value in a timeseries.
> >       */
> >      void record(String timeSeriesName, long value);
> >
> > }
> >
> > and (so that any reference can be optional on a service
> > implementation, the final is a hint to hotspot to inline)
> >
> > public final class StatisticsUtils {
> >
> >  private StatisticsUtils() {
> >  }
> >
> >  public static void increment(Statistics statistics, String counterName)
> {
> >     if ( statistics != null ) {
> >         statistics.increment(counterName);
> >     }
> >  }
> >
> > ... etc for the other methods ..
> > }
> >
> >
> >
> >
> > The service would need to deal with all the implementation details
> > (including concurrency and speed). The service implementation would
> > also come with a servlet endpoint (under /system/*) and/or single JMX
> > MBean.
> >
> > Anything that wanted to record stats would then bind to the service
> > and use it. I think this would avoid the issues mentioned above with
> > wide scale MBean usage.
> >
> > WDYT?
> >
> > (apologies for the noise if this already exists, and if so, please
> > treat it as a question: where and how do we record stats?)
> >
> > Ian
> >
> >
> >
> >
> > 1 http://wiki.apache.org/jackrabbit/Statistics
>
>
> --
> Felix Meschberger | Principal Scientist | Adobe
>
>
>
>
>
>
>
>

Re: Monitoring and Statistics

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Am 22.02.2013 um 23:59 schrieb Ian Boston:

> Hi,
> Whilst writing the MBeans in the event bundle I started thinking about
> monitoring inside Sling. IMHO there are not enough to really know what
> a instance under load is doing. Much though I like JMX it comes with
> implementation and runtime overhead which I don't much like.
> 
> Runtime:
> * Running with JMX enabled doesn't add any overhead, but once a client
> is connected there is some (some reports upto 3% of resources).

I agree. But a client connecting (actually it is not the connection per se, it is the requests executed by the client, which bring the load) may load the system, which is ok, because the client is investigating.


> * You have to remember to enable it, and most of the time JVMs are not
> enabled. By the time you really need it, its often too late.
> * JMX is not restful.

I don't think that this is enough of a reason to reinvent the wheel given the spread of JMX support in tools.

> 
> Implementation
> * MBeans are not that hard to implement with the OSGi Whiteboard, but
> they have to be implemented.

Right. Everything that has to be queried has to be implemented. The actual problem I see is that a separate API is required, which in turn is good programming style anyway when doing services. Yet this API does not need to be exported because the JMX Whiteboard gets to it through the MBean service class (IIRC).

> 
> Alternatives.
> In Jackarabbit there is/was a statistics class [1], which IIUC uses
> counters and time series stored in a map. The service can then be
> queries to extract the values by wrapping in an MBean or Servlet.

The only advantage over direct JMX I see is the Servlet wrapping (probably the Web Console in our case?). How common is that use case?

And even then I would prefer a good DTO design over just a "counter map".

The StatisticsUtil class is IMHO overkill.

Regards
Felix

> 
> I think the approach could be generalised and extended so that
> anything in the container could use the service to record metrics. The
> api might look something like
> 
> public interface Statistics {
> 
>      /**
>       * Increment a counter by 1
>       */
>      void increment(String counterName);
> 
>      /**
>       * Record a double value in a timeseries.
>       */
>      void record(String timeSeriesName, double value);
> 
>      /**
>       * Record a long value in a timeseries.
>       */
>      void record(String timeSeriesName, long value);
> 
> }
> 
> and (so that any reference can be optional on a service
> implementation, the final is a hint to hotspot to inline)
> 
> public final class StatisticsUtils {
> 
>  private StatisticsUtils() {
>  }
> 
>  public static void increment(Statistics statistics, String counterName) {
>     if ( statistics != null ) {
>         statistics.increment(counterName);
>     }
>  }
> 
> ... etc for the other methods ..
> }
> 
> 
> 
> 
> The service would need to deal with all the implementation details
> (including concurrency and speed). The service implementation would
> also come with a servlet endpoint (under /system/*) and/or single JMX
> MBean.
> 
> Anything that wanted to record stats would then bind to the service
> and use it. I think this would avoid the issues mentioned above with
> wide scale MBean usage.
> 
> WDYT?
> 
> (apologies for the noise if this already exists, and if so, please
> treat it as a question: where and how do we record stats?)
> 
> Ian
> 
> 
> 
> 
> 1 http://wiki.apache.org/jackrabbit/Statistics


--
Felix Meschberger | Principal Scientist | Adobe

Re: Monitoring and Statistics

Posted by Bertrand Delacretaz <bd...@apache.org>.

On Fri, Mar 1, 2013 at 1:56 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> On 1 March 2013 20:17, Carsten Ziegeler <cz...@apache.org> wrote:
>> ....As outlined in the post that
>> started this thread, using JMX from an implementation pov is pretty
>> simple.
>
> To be honest extending StandardMBean is not the issue. Doing it in a
> way thats safe and low impact is, as you pointed out when talking
> about earlier attempts to generate stats. If each bundle has its own
> implementation we will re-invent the wheel...

So maybe what's needed is a base/utility class that can be used in a
very simple way to create MBeans that manage counters?

Would this + a simple HTTP interface to read the counters cover your use cases?

-Bertrand

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 4 March 2013 20:44, Bertrand Delacretaz <bd...@apache.org> wrote:
> Hi,
>
> On Fri, Mar 1, 2013 at 9:12 PM, Ian Boston <ie...@tfd.co.uk> wrote:
>> ...I am not really certain of the value of this work any more, so I am
>> not going to do anything further, unless there is a strong demand....
>
> Note that FELIX-3152 provides jmx extensions for the Felix web console
> - for some reason that hasn't been committed yet, but I just tested it
> and it works with the Sling trunk and does provide some JMX-to-json
> conversion, dunno if that would work for your needs.

Looks like it will, and it supports a large number of OpenTypes. I
assume it will make it into a Felix release at some point.

Ian

>
> -Bertrand

Re: Monitoring and Statistics

Posted by Bertrand Delacretaz <bd...@apache.org>.

Hi,

On Fri, Mar 1, 2013 at 9:12 PM, Ian Boston <ie...@tfd.co.uk> wrote:
> ...I am not really certain of the value of this work any more, so I am
> not going to do anything further, unless there is a strong demand....

Note that FELIX-3152 provides jmx extensions for the Felix web console
- for some reason that hasn't been committed yet, but I just tested it
and it works with the Sling trunk and does provide some JMX-to-json
conversion, dunno if that would work for your needs.

-Bertrand

Re: Monitoring and Statistics

Posted by Justin Edelson <ju...@justinedelson.com>.

Hi Ian,

On Sat, Mar 2, 2013 at 12:12 AM, Ian Boston <ie...@tfd.co.uk> wrote:

> On 2 March 2013 10:42, Ian Boston <ie...@tfd.co.uk> wrote:
>
> > 2.
> > GETs can only query 1 bean at a time.
> > If you need a snapshot of the state of the server and have 50 MBeans,
> > you have to make 50 requests. You can make POST requests to perform
> > batch up GET requests. Fortunately you can specify a * for attributes.
> >
>
> Correction:
> A SlingServlet extending the AgentServlet form Jolokia registered at
> /system/stats triggers authentication and can be restricted to GET
> only with the SlingServlet annotation.
>

I went ahead and committed what I had - it isn't a SlingServlet, but rather
uses the standard HttpService http context stuff. In the configuration of
the servlet component, you specify a list of userIds who have access.

>
> Provided the query is performed in a parameter [1], its possible to
> retrieve all MBeans, however, Jolokia ignores the Attributes that it
> doesn't understand how to convert into json. For instance, the
> TimeSeries in Jackrabbit is ignored. I think any container that isn't
> a primitive or open type is ignored even if it contains data that
> could be output.
>

This is an interesting issue. I've never had this particular use case. I
find it is more common to want to query the same mbean (or really the same
attribute) across 50 instances than 50 mbeans on the same instance.

>
> I am not really certain of the value of this work any more, so I am
> not going to do anything further, unless there is a strong demand.
>

I guess my point in bringing this up is that I feel like we don't need to
provide a JMX/JSON bridge in the Sling project. If there are defects in
Jolokia, we can work with the Jolokia community to fix them.

Justin

> Best Regards
> Ian
>
> 1 http://localhost:4502/system/stats?p=read/*:type=*
>

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 2 March 2013 10:42, Ian Boston <ie...@tfd.co.uk> wrote:

> 2.
> GETs can only query 1 bean at a time.
> If you need a snapshot of the state of the server and have 50 MBeans,
> you have to make 50 requests. You can make POST requests to perform
> batch up GET requests. Fortunately you can specify a * for attributes.
>

Correction:
A SlingServlet extending the AgentServlet from Jolokia registered at
/system/stats triggers authentication and can be restricted to GET
only with the SlingServlet annotation.

Provided the query is performed in a parameter [1], it's possible to
retrieve all MBeans. However, Jolokia ignores the attributes that it
doesn't understand how to convert into JSON. For instance, the
TimeSeries in Jackrabbit is ignored. I think any container that isn't
a primitive or open type is ignored even if it contains data that
could be output.
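
One possible way around ignored attributes would be to expose such data as a
JMX open type. The sketch below is only an illustration under stated
assumptions: the class name TimeSeriesOpenType, the type name and the item
names are hypothetical, and whether a given bridge then serialises the value
would still need to be checked.

```java
import javax.management.openmbean.ArrayType;
import javax.management.openmbean.CompositeData;
import javax.management.openmbean.CompositeDataSupport;
import javax.management.openmbean.CompositeType;
import javax.management.openmbean.OpenDataException;
import javax.management.openmbean.OpenType;
import javax.management.openmbean.SimpleType;

class TimeSeriesOpenType {

    /** Wraps a series name and its per-second values in a CompositeData (an open type). */
    static CompositeData toCompositeData(String name, long[] perSecond) {
        try {
            String[] items = {"name", "valuePerSecond"};
            String[] descriptions = {"series name", "most recent per-second values"};
            OpenType<?>[] types = {
                SimpleType.STRING,
                ArrayType.getPrimitiveArrayType(long[].class)
            };
            CompositeType type = new CompositeType(
                    "TimeSeriesSketch", "hypothetical time series view",
                    items, descriptions, types);
            // Clone the array so later changes by the caller do not leak into the snapshot.
            return new CompositeDataSupport(type, items, new Object[] {name, perSecond.clone()});
        } catch (OpenDataException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An MBean attribute getter could then return this CompositeData instead of the
original container object.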

I am not really certain of the value of this work any more, so I am
not going to do anything further, unless there is a strong demand.

Best Regards
Ian

1 http://localhost:4502/system/stats?p=read/*:type=*

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 2 March 2013 01:45, Justin Edelson <ju...@justinedelson.com> wrote:

>>> No REST interface is not really an issue and I guess it would be easy
>>> to write a JMX over REST bridge :)
>>
>> It would and it could live in a separate optional bundle for anyone
>> that wanted it and I could write a JMX Bean ontop of the map as I did
>> for the Jackrabbit RepositoryStatistics
>
> I really like Jolokia (http://www.jolokia.org/) as a JMX/HTTP bridge. It comes out of jmx4perl and is ASLv2 licensed. They already provide an OSGi bundle which can just be dropped in and works. I started to write a Sling-specific bundlization which would use repository credentials rather than user/password hard coded in a configuration file (similar to the WebConsole).
>
> Unless there's some specific defect, let's just use this.
>

Hi,

I am +1 with the following observations:

The project documentation gives a good introduction and documents the
reasons why JMX remote is problematic. The chapter on architecture [1]
gives a detailed description and the section on the HTTP-JMX proxy
explains all the problems associated with JSR-160 and RMI in the wild.

There are one or two problems with the protocol, that I think are
significant and we might want to think about.

1.
GETs modify data.
The protocol allows a GET operation to invoke a JMX operation, modifying state.

2.
GETs can only query 1 bean at a time.
If you need a snapshot of the state of the server and have 50 MBeans,
you have to make 50 requests. You can make POST requests to
batch up GET requests. Fortunately you can specify a * for attributes.

3.
POSTs allow invoking operations.
The bridge gives full access to all JMX operations.

It might be possible to disable 1 and 3 using access control policies
that are available within Jolokia. I think this might have to
be done on a bean-by-bean basis, but section 4 of the
manual [2] appears to indicate that wildcards on bean names can be
used.

I don't see a way around 2 other than to use POST operations as GETs,
although it might be possible to re-implement the GET operation to
expose an OSGi-configured set of statistics in a single GET operation.
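
A rough sketch of what a batched POST body could look like, assuming
Jolokia's JSON request format as described in its protocol documentation: an
array of request objects with "type" and "mbean" keys, where leaving out
"attribute" is assumed to read all attributes. The class name JolokiaBulkRead
is hypothetical; the body would be POSTed to wherever the agent servlet is
registered.

```java
import java.util.List;

class JolokiaBulkRead {

    /** Builds a JSON array with one read request per MBean name. */
    static String body(List<String> mbeanNames) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < mbeanNames.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"type\":\"read\",\"mbean\":\"")
              .append(escape(mbeanNames.get(i)))
              .append("\"}");
        }
        return sb.append(']').toString();
    }

    /** Escape backslashes and quotes, which may appear in quoted ObjectName values. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

This keeps the snapshot to one round trip, at the cost of using POST for what
is logically a read.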

Best Regards
Ian

1 http://www.jolokia.org/reference/html/architecture.html
2 http://www.jolokia.org/reference/html/security.html

> Justin
>
>>
>> Best Regards
>> Ian
>>

Re: Monitoring and Statistics

Posted by Justin Edelson <ju...@justinedelson.com>.

Hi,

On Mar 1, 2013, at 4:56 AM, Ian Boston <ie...@tfd.co.uk> wrote:

> On 1 March 2013 20:17, Carsten Ziegeler <cz...@apache.org> wrote:
>> Hi,
>> 
>> I agree with Felix here - we should have a coherent story here and JMX
>> looks like a good solution to me as well. As outlined in the post that
>> started this thread, using JMX from an implementation pov is pretty
>> simple.
> 
> To be honest extending StandardMBean is not the issue. Doing it in a
> way thats safe and low impact is, as you pointed out when talking
> about earlier attempts to generate stats. If each bundle has its own
> implementation we will re-invent the wheel, and getting even simple
> counters right with no synchronisation is not as simple as volatile
> int x; i++; Its not hard by has to be done in the right way.
> 
>> So are there any concerns or problems for using JMX from a client pov?
> 
> Depends on who you talk to.
> Most sysops people dont like JMX due to the cost of querying it and
> most java devs do like it because it has GUI tools. Most sysops tools
> have bridges that spin up a JVM, query JMX and convert the stats into
> some other format, (Resmon XML, RRDTool). That uses a minimum of 64MB
> of RAM and sysops people tend to get a bit agitated by that when they
> compare it to what a python/bash/c command uses.
> 
>> No REST interface is not really an issue and I guess it would be easy
>> to write a JMX over REST bridge :)
> 
> It would and it could live in a separate optional bundle for anyone
> that wanted it and I could write a JMX Bean ontop of the map as I did
> for the Jackrabbit RepositoryStatistics

I really like Jolokia (http://www.jolokia.org/) as a JMX/HTTP bridge. It comes out of jmx4perl and is ASLv2 licensed. They already provide an OSGi bundle which can just be dropped in and works. I started to write a Sling-specific bundlization which would use repository credentials rather than user/password hard coded in a configuration file (similar to the WebConsole).

Unless there's some specific defect, let's just use this.

Justin

> 
> Best Regards
> Ian
> 
>> 
>> Regards
>> Carsten
>> 
>> 2013/3/1 Felix Meschberger <fm...@adobe.com>:
>>> Hi
>>> 
>>> I really appreciate this discussion. But I would like to get to a point where we create a proper future-proof (as much as possible) architecture which properly integrates with the current situation:
>>> 
>>>  * JMX is the system of choice for systems management
>>>  * The Web Console is the respective system of choice
>>>    for web based interactive tooling
>>>  * Don't reinvent wheels
>>> 
>>> I would really like to highlight the last point: I would prefer to reuse existing functionality and libraries as much as possible instead of reinventing our own stuff using yet another channel.
>>> 
>>> Regards
>>> Felix
>>> 
>>> Am 22.02.2013 um 23:59 schrieb Ian Boston:
>>> 
>>>> Hi,
>>>> Whilst writing the MBeans in the event bundle I started thinking about
>>>> monitoring inside Sling. IMHO there are not enough to really know what
>>>> a instance under load is doing. Much though I like JMX it comes with
>>>> implementation and runtime overhead which I don't much like.
>>>> 
>>>> Runtime:
>>>> * Running with JMX enabled doesn't add any overhead, but once a client
>>>> is connected there is some (some reports upto 3% of resources).
>>>> * You have to remember to enable it, and most of the time JVMs are not
>>>> enabled. By the time you really need it, its often too late.
>>>> * JMX is not restful.
>>>> 
>>>> Implementation
>>>> * MBeans are not that hard to implement with the OSGi Whiteboard, but
>>>> they have to be implemented.
>>>> 
>>>> Alternatives.
>>>> In Jackarabbit there is/was a statistics class [1], which IIUC uses
>>>> counters and time series stored in a map. The service can then be
>>>> queries to extract the values by wrapping in an MBean or Servlet.
>>>> 
>>>> I think the approach could be generalised and extended so that
>>>> anything in the container could use the service to record metrics. The
>>>> api might look something like
>>>> 
>>>> public interface Statistics {
>>>> 
>>>>     /**
>>>>      * Increment a counter by 1
>>>>      */
>>>>     void increment(String counterName);
>>>> 
>>>>     /**
>>>>      * Record a double value in a timeseries.
>>>>      */
>>>>     void record(String timeSeriesName, double value);
>>>> 
>>>>     /**
>>>>      * Record a long value in a timeseries.
>>>>      */
>>>>     void record(String timeSeriesName, long value);
>>>> 
>>>> }
>>>> 
>>>> and (so that any reference can be optional on a service
>>>> implementation, the final is a hint to hotspot to inline)
>>>> 
>>>> public final class StatisticsUtils {
>>>> 
>>>> private StatisticsUtils() {
>>>> }
>>>> 
>>>> public static void increment(Statistics statistics, String counterName) {
>>>>    if ( statistics != null ) {
>>>>        statistics.increment(counterName);
>>>>    }
>>>> }
>>>> 
>>>> ... etc for the other methods ..
>>>> }
>>>> 
>>>> 
>>>> 
>>>> 
>>>> The service would need to deal with all the implementation details
>>>> (including concurrency and speed). The service implementation would
>>>> also come with a servlet endpoint (under /system/*) and/or single JMX
>>>> MBean.
>>>> 
>>>> Anything that wanted to record stats would then bind to the service
>>>> and use it. I think this would avoid the issues mentioned above with
>>>> wide scale MBean usage.
>>>> 
>>>> WDYT?
>>>> 
>>>> (apologies for the noise if this already exists, and if so, please
>>>> treat it as a question: where and how do we record stats?)
>>>> 
>>>> Ian
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 1 http://wiki.apache.org/jackrabbit/Statistics
>>> 
>>> 
>>> --
>>> Felix Meschberger | Principal Scientist | Adobe
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> --
>> Carsten Ziegeler
>> cziegeler@apache.org

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

On 1 March 2013 20:17, Carsten Ziegeler <cz...@apache.org> wrote:
> Hi,
>
> I agree with Felix here - we should have a coherent story here and JMX
> looks like a good solution to me as well. As outlined in the post that
> started this thread, using JMX from an implementation pov is pretty
> simple.

To be honest, extending StandardMBean is not the issue. Doing it in a
way that's safe and low impact is, as you pointed out when talking
about earlier attempts to generate stats. If each bundle has its own
implementation we will re-invent the wheel, and getting even simple
counters right with no synchronisation is not as simple as volatile
int x; x++; It's not hard but has to be done in the right way.
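
To illustrate the point about counters: volatile only guarantees visibility,
while x++ is still a separate read, add and write, so concurrent increments
can be lost. A minimal sketch of one lock-free alternative, using AtomicLong
from java.util.concurrent (the class name SafeCounter and the hammer helper
are hypothetical):

```java
import java.util.concurrent.atomic.AtomicLong;

class SafeCounter {

    private final AtomicLong value = new AtomicLong();

    /** Atomic read-modify-write, so no increment is lost under contention. */
    void increment() {
        value.incrementAndGet();
    }

    long get() {
        return value.get();
    }

    /** Starts the given number of threads, each incrementing perThread times, and returns the total. */
    static long hammer(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

For counters that are written far more often than they are read, LongAdder
may be a better fit, but the principle is the same.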

> So are there any concerns or problems for using JMX from a client pov?

Depends on who you talk to.
Most sysops people don't like JMX due to the cost of querying it and
most Java devs do like it because it has GUI tools. Most sysops tools
have bridges that spin up a JVM, query JMX and convert the stats into
some other format (Resmon XML, RRDTool). That uses a minimum of 64MB
of RAM and sysops people tend to get a bit agitated by that when they
compare it to what a Python/bash/C command uses.

> No REST interface is not really an issue and I guess it would be easy
> to write a JMX over REST bridge :)

It would, and it could live in a separate optional bundle for anyone
that wanted it, and I could write a JMX Bean on top of the map as I did
for the Jackrabbit RepositoryStatistics.

Best Regards
Ian

>
> Regards
> Carsten
>
> 2013/3/1 Felix Meschberger <fm...@adobe.com>:
>> Hi
>>
>> I really appreciate this discussion. But I would like to get to a point where we create a proper future-proof (as much as possible) architecture which properly integrates with the current situation:
>>
>>   * JMX is the system of choice for systems management
>>   * The Web Console is the respective system of choice
>>     for web based interactive tooling
>>   * Don't reinvent wheels
>>
>> I would really like to highlight the last point: I would prefer to reuse existing functionality and libraries as much as possible instead of reinventing our own stuff using yet another channel.
>>
>> Regards
>> Felix
>>
>> Am 22.02.2013 um 23:59 schrieb Ian Boston:
>>
>>> Hi,
>>> Whilst writing the MBeans in the event bundle I started thinking about
>>> monitoring inside Sling. IMHO there are not enough to really know what
>>> a instance under load is doing. Much though I like JMX it comes with
>>> implementation and runtime overhead which I don't much like.
>>>
>>> Runtime:
>>> * Running with JMX enabled doesn't add any overhead, but once a client
>>> is connected there is some (some reports upto 3% of resources).
>>> * You have to remember to enable it, and most of the time JVMs are not
>>> enabled. By the time you really need it, its often too late.
>>> * JMX is not restful.
>>>
>>> Implementation
>>> * MBeans are not that hard to implement with the OSGi Whiteboard, but
>>> they have to be implemented.
>>>
>>> Alternatives.
>>> In Jackarabbit there is/was a statistics class [1], which IIUC uses
>>> counters and time series stored in a map. The service can then be
>>> queries to extract the values by wrapping in an MBean or Servlet.
>>>
>>> I think the approach could be generalised and extended so that
>>> anything in the container could use the service to record metrics. The
>>> api might look something like
>>>
>>> public interface Statistics {
>>>
>>>      /**
>>>       * Increment a counter by 1
>>>       */
>>>      void increment(String counterName);
>>>
>>>      /**
>>>       * Record a double value in a timeseries.
>>>       */
>>>      void record(String timeSeriesName, double value);
>>>
>>>      /**
>>>       * Record a long value in a timeseries.
>>>       */
>>>      void record(String timeSeriesName, long value);
>>>
>>> }
>>>
>>> and (so that any reference can be optional on a service
>>> implementation, the final is a hint to hotspot to inline)
>>>
>>> public final class StatisticsUtils {
>>>
>>>  private StatisticsUtils() {
>>>  }
>>>
>>>  public static void increment(Statistics statistics, String counterName) {
>>>     if ( statistics != null ) {
>>>         statistics.increment(counterName);
>>>     }
>>>  }
>>>
>>> ... etc for the other methods ..
>>> }
>>>
>>>
>>>
>>>
>>> The service would need to deal with all the implementation details
>>> (including concurrency and speed). The service implementation would
>>> also come with a servlet endpoint (under /system/*) and/or single JMX
>>> MBean.
>>>
>>> Anything that wanted to record stats would then bind to the service
>>> and use it. I think this would avoid the issues mentioned above with
>>> wide scale MBean usage.
>>>
>>> WDYT?
>>>
>>> (apologies for the noise if this already exists, and if so, please
>>> treat it as a question: where and how do we record stats?)
>>>
>>> Ian
>>>
>>>
>>>
>>>
>>> 1 http://wiki.apache.org/jackrabbit/Statistics
>>
>>
>> --
>> Felix Meschberger | Principal Scientist | Adobe
>>
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Carsten Ziegeler
> cziegeler@apache.org

Re: Monitoring and Statistics

Posted by Carsten Ziegeler <cz...@apache.org>.

Hi,

I agree with Felix: we should have a coherent story here, and JMX
looks like a good solution to me as well. As outlined in the post that
started this thread, using JMX is pretty simple from an implementation
point of view.
So are there any concerns or problems with using JMX from a client point
of view? The lack of a REST interface is not really an issue, and I guess
it would be easy to write a JMX-over-REST bridge :)
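
To make the bridge idea concrete, here is a minimal sketch of what such a read-only bridge might look like. It is only an illustration, not something that exists in Sling: the class name JmxHttpBridge, the /jmx path and the name/attr query parameters are all assumptions made for this example. It uses the JDK's built-in com.sun.net.httpserver server and the platform MBean server.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import javax.management.JMException;
import javax.management.ObjectName;

/** Hypothetical read-only bridge: GET /jmx?name=<ObjectName>&attr=<attribute>. */
final class JmxHttpBridge {

    private JmxHttpBridge() {
    }

    /** Looks up one attribute on the platform MBean server and returns it as text. */
    static String readAttribute(String objectName, String attribute) {
        try {
            Object value = ManagementFactory.getPlatformMBeanServer()
                    .getAttribute(new ObjectName(objectName), attribute);
            return String.valueOf(value);
        } catch (JMException e) {
            throw new IllegalArgumentException("Cannot read " + objectName + "#" + attribute, e);
        }
    }

    /** Extracts one query parameter (split at the first '='), or null if absent. */
    static String param(String query, String key) {
        if (query == null) {
            return null;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(key)) {
                return pair.substring(eq + 1);
            }
        }
        return null;
    }

    /** Starts the bridge on the loopback interface only (an assumption for safety). */
    static HttpServer start(int port) {
        try {
            HttpServer http = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
            http.createContext("/jmx", exchange -> {
                String query = exchange.getRequestURI().getQuery();
                int status = 200;
                String body;
                try {
                    body = readAttribute(param(query, "name"), param(query, "attr")) + "\n";
                } catch (RuntimeException e) {
                    // Missing parameters or unknown MBeans end up here.
                    status = 400;
                    body = "error: " + e.getMessage() + "\n";
                }
                byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(status, bytes.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(bytes);
                }
            });
            http.start();
            return http;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After JmxHttpBridge.start(8081), a request such as
http://127.0.0.1:8081/jmx?name=java.lang:type=Runtime&attr=Uptime
should return the JVM uptime as plain text. In a Sling instance the same lookup would more naturally live in a servlet or a Web Console plugin than in a separate HTTP server.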

Regards
Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Monitoring and Statistics

Posted by Ian Boston <ie...@tfd.co.uk>.

Hi,

OK, I can see you don't like the concept.
I'll abandon it and use JMX beans in each bundle where appropriate.
Anyone deploying in a cluster that needs to be monitored will then
require local monitoring scripts such as OmniTI's Jezebel, Munin's JMX
plugin or Splunk's JMX plugin.
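
For readers unfamiliar with the per-bundle approach, a standard MBean along these lines is roughly what each bundle would carry. This is a sketch under assumptions: the names JobStatsExample, JobStats, JobStatsMBean and the ObjectName org.example.sling:type=JobStats are invented for illustration and are not taken from any Sling bundle.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JobStatsExample {

    private JobStatsExample() {
    }

    /** Management interface; JMX expects it to be public and named <Class>MBean. */
    public interface JobStatsMBean {
        long getProcessedJobs();
    }

    /** Hypothetical per-bundle statistics holder exposed as a standard MBean. */
    public static final class JobStats implements JobStatsMBean {
        private final AtomicLong processed = new AtomicLong();

        /** Called by the bundle's own code whenever a job completes. */
        void jobProcessed() {
            processed.incrementAndGet();
        }

        @Override
        public long getProcessedJobs() {
            return processed.get();
        }
    }

    /** Registers the MBean, records two jobs, reads the value back through JMX. */
    static long registerAndRead() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("org.example.sling:type=JobStats");
            JobStats stats = new JobStats();
            server.registerMBean(stats, name);
            try {
                stats.jobProcessed();
                stats.jobProcessed();
                // The getter getProcessedJobs() surfaces as attribute "ProcessedJobs".
                return (Long) server.getAttribute(name, "ProcessedJobs");
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Inside an OSGi container the MBean would more likely be registered as a service and picked up by a JMX whiteboard, as mentioned in the original post; the direct registration above only keeps the sketch self-contained.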


BTW the idea for the approach (central monitoring of simple counters and
values delivered over HTTP rather than JMX) comes from the Resmon
protocol used by OmniTI as the native protocol for Reconnoiter. If you
are interested you can read the goals of that system [1] or a rant [2],
which is a bit at the extreme end of the spectrum and, it's probably
fair to say, not relevant for Sling. BTW2: I don't want to reinvent
Reconnoiter or bind exclusively to it, but I would have liked to feed
it and its competitors efficiently, with minimal impact or OS-level
configuration on each deployed instance.
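
As an illustration of the abandoned idea, the following sketch implements the Statistics interface from the original post in memory and renders the values as plain text that a servlet could return. The class name InMemoryStatistics, the render() method and the line format are assumptions made for this example; in particular the output is not the Resmon format, and each time series is simplified to its most recent value.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** The interface as proposed in the original post. */
interface Statistics {
    void increment(String counterName);

    void record(String timeSeriesName, double value);

    void record(String timeSeriesName, long value);
}

/** Hypothetical in-memory implementation; safe for concurrent callers. */
final class InMemoryStatistics implements Statistics {

    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();
    // Simplification: only the latest value of each series is kept.
    private final ConcurrentHashMap<String, Double> latest = new ConcurrentHashMap<>();

    @Override
    public void increment(String counterName) {
        // LongAdder keeps contention low when many threads hit the same counter.
        counters.computeIfAbsent(counterName, k -> new LongAdder()).increment();
    }

    @Override
    public void record(String timeSeriesName, double value) {
        latest.put(timeSeriesName, value);
    }

    @Override
    public void record(String timeSeriesName, long value) {
        record(timeSeriesName, (double) value);
    }

    /** Renders one "name value" line per metric, sorted by name. */
    String render() {
        Map<String, String> lines = new TreeMap<>();
        counters.forEach((name, adder) -> lines.put("counter." + name, String.valueOf(adder.sum())));
        latest.forEach((name, value) -> lines.put("series." + name, String.valueOf(value)));
        StringBuilder out = new StringBuilder();
        lines.forEach((key, value) -> out.append(key).append(' ').append(value).append('\n'));
        return out.toString();
    }
}
```

A monitoring system could then poll a single HTTP endpoint that returns render(), instead of opening a JMX connection to every instance.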

Best Regards
Ian


1 http://labs.omniti.com/labs/reconnoiter/wiki/Goals
2 http://lethargy.org/~jesus/writes/reconnoiter-and-another-platform

Re: Monitoring and Statistics

Posted by Felix Meschberger <fm...@adobe.com>.

Hi

I really appreciate this discussion. But I would like to get to a point where we create a proper future-proof (as much as possible) architecture which properly integrates with the current situation:

  * JMX is the system of choice for systems management
  * The Web Console is the corresponding system of choice
    for web-based interactive tooling
  * Don't reinvent wheels

I would really like to highlight the last point: I would prefer to reuse existing functionality and libraries as much as possible instead of reinventing our own stuff using yet another channel.

Regards
Felix

--
Felix Meschberger | Principal Scientist | Adobe