Posted to dev@geode.apache.org by Kirk Lund <kl...@apache.org> on 2018/11/27 00:08:38 UTC

SampleHandler interface and handling of statistics samples

SampleHandler is in the org.apache.geode.internal.statistics package.

The SampleHandler interface was originally introduced to allow multiple
handlers to receive notification of a statistics sample. Before this
interface, the StatSampler used a StatArchiveWriter directly to write to
the .gfs file (the Geode statistics archive).

The StatSampler takes a sample of all statistics (every second by default)
and then invokes every registered SampleHandler. The interface defines 4
methods:

*1) sampled* -- invoked whenever there's a sample taken -- provides a List
of ResourceInstances with latest values
*2) allocatedResourceType* -- invoked when a new ResourceType has been
defined -- this may be caused by enabling a feature, by adding a new
instance of a feature, or by a user defining a new Statistics ResourceType
through the Statistics API
*3) allocatedResourceInstance* -- invoked when a new ResourceInstance of a
ResourceType is created
*4) destroyedResourceInstance* -- invoked when a ResourceInstance is
destroyed
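
As a rough illustration of those four callbacks, here is a minimal,
self-contained sketch. ResourceType and ResourceInstance below are simplified
stand-ins rather than the real Geode classes, and the method parameters
(including the timestamp on sampled) are assumptions; check the actual
interface in org.apache.geode.internal.statistics before relying on them.
RecordingHandler is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the Geode types; the real classes carry more state.
final class ResourceType {
  final String name;

  ResourceType(String name) {
    this.name = name;
  }
}

final class ResourceInstance {
  final ResourceType type;
  final long[] latestValues;

  ResourceInstance(ResourceType type, long[] latestValues) {
    this.type = type;
    this.latestValues = latestValues;
  }
}

// Assumed shape of the interface: one callback per event described above.
interface SampleHandler {
  void sampled(long nanosTimeStamp, List<ResourceInstance> resourceInstances);

  void allocatedResourceType(ResourceType resourceType);

  void allocatedResourceInstance(ResourceInstance resourceInstance);

  void destroyedResourceInstance(ResourceInstance resourceInstance);
}

// Hypothetical handler that only records which callbacks fired, in order.
final class RecordingHandler implements SampleHandler {
  final List<String> events = new ArrayList<>();

  @Override
  public void sampled(long nanosTimeStamp, List<ResourceInstance> resourceInstances) {
    events.add("sampled:" + resourceInstances.size());
  }

  @Override
  public void allocatedResourceType(ResourceType resourceType) {
    events.add("type:" + resourceType.name);
  }

  @Override
  public void allocatedResourceInstance(ResourceInstance resourceInstance) {
    events.add("instance:" + resourceInstance.type.name);
  }

  @Override
  public void destroyedResourceInstance(ResourceInstance resourceInstance) {
    events.add("destroyed:" + resourceInstance.type.name);
  }
}
```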

A ResourceType is basically a definition of a bucket of key/value stats.
Within Geode, each ResourceType is typically a class such as
DistributionStats. A ResourceInstance is one instance of a ResourceType,
holding the actual values. One ResourceType may have many ResourceInstances.

The format of a .gfs file closely follows those 4 methods. A ResourceType
definition is written to the .gfs file before any sample can contain a
ResourceInstance of that type. A ResourceInstance allocated record is also
written to the .gfs file before any sample may contain any values for that
instance. A ResourceInstance destroyed record is written to mark the end
of the lifecycle of a particular instance.
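
The ordering rules above can be modelled in a few lines. This is only a sketch
of the ordering invariant, not the real .gfs writer: the class name
ArchiveOrderModel, the use of plain strings for types and instances, and the
textual record labels (TYPE, INSTANCE, SAMPLE, DESTROYED) are all made up for
illustration, and the actual archive is binary.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of record ordering in a statistics archive.
final class ArchiveOrderModel {
  private final Set<String> knownTypes = new HashSet<>();
  private final Set<String> liveInstances = new HashSet<>();
  final List<String> records = new ArrayList<>();

  // A type definition must be recorded before any instance of it.
  void allocatedResourceType(String typeName) {
    knownTypes.add(typeName);
    records.add("TYPE " + typeName);
  }

  // An instance allocation must be recorded before any sample mentions it.
  void allocatedResourceInstance(String typeName, String instanceName) {
    if (!knownTypes.contains(typeName)) {
      throw new IllegalStateException("type not yet defined: " + typeName);
    }
    liveInstances.add(instanceName);
    records.add("INSTANCE " + instanceName + " OF " + typeName);
  }

  // A sample may only carry values for instances that are currently live.
  void sampled(List<String> instanceNames) {
    for (String name : instanceNames) {
      if (!liveInstances.contains(name)) {
        throw new IllegalStateException("instance not allocated: " + name);
      }
    }
    records.add("SAMPLE " + instanceNames);
  }

  // A destroyed record ends the lifecycle of an instance.
  void destroyedResourceInstance(String instanceName) {
    if (!liveInstances.remove(instanceName)) {
      throw new IllegalStateException("instance not allocated: " + instanceName);
    }
    records.add("DESTROYED " + instanceName);
  }
}
```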

There are two main implementations of SampleHandler:

*A) StatArchiveHandler* -- mechanism that writes binary data to the .gfs
file (the Geode statistics archive)
*B) StatMonitorHandler* -- mechanism that notifies objects that are clients
of the statistics monitoring API (which is what updates the metrics
attributes on Geode mbeans)

The purpose of this design is to allow a developer to build a new
implementation of SampleHandler that writes out to another format or
defines some sort of custom consumer of Statistics samples. For example,
you could define a MicrometerHandler which adapts Geode stats into
Micrometer, or you could define an entirely new custom file format of stats.
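
To make the "custom format" idea concrete, here is a minimal sketch of a
consumer that renders each sample as text lines. It deliberately avoids
Micrometer and the real Geode types so that it stays self-contained:
InstanceSnapshot is a stand-in for a ResourceInstance, CsvSampleWriter is a
hypothetical name, only the sampled callback is shown, and the
"timestamp,instance,stat,value" layout is invented for this example.

```java
import java.util.List;
import java.util.Map;

// Stand-in for a ResourceInstance: a name plus its latest stat values.
final class InstanceSnapshot {
  final String name;
  final Map<String, Long> values;

  InstanceSnapshot(String name, Map<String, Long> values) {
    this.name = name;
    this.values = values;
  }
}

// Hypothetical consumer that writes one line per stat value in each sample.
final class CsvSampleWriter {
  private final StringBuilder out = new StringBuilder();

  void sampled(long nanosTimeStamp, List<InstanceSnapshot> instances) {
    for (InstanceSnapshot instance : instances) {
      for (Map.Entry<String, Long> stat : instance.values.entrySet()) {
        out.append(nanosTimeStamp).append(',')
            .append(instance.name).append(',')
            .append(stat.getKey()).append(',')
            .append(stat.getValue()).append('\n');
      }
    }
  }

  String contents() {
    return out.toString();
  }
}
```

A real handler would implement all four SampleHandler methods and would
probably write to a file or a metrics registry instead of an in-memory buffer.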

By default, the StatSampler samples the statistics every second and there
is only one StatSampler thread, so stat sampling is sensitive to the
performance of any implementation of SampleHandler. For example, you might
not want to define a handler that writes to a relational database over the
network.

Performance problems in a SampleHandler could cause gaps in stats values
and delayed updating of Geode mbeans for monitoring tools. They could also
lead someone who is analyzing Geode artifacts to think that a stop-the-world
GC pause has occurred -- the time between stat samples is generally assumed
to be fairly consistent unless a serious GC pause occurs.
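
One possible way to keep a slow consumer from stalling the single sampler
thread is to have the handler only enqueue each sample and let a separate
thread do the expensive work. The sketch below is an assumption about how one
might do that, not something Geode provides: NonBlockingSampleForwarder is a
hypothetical name, and a sample is reduced to a plain long[] for brevity.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper: the sampler thread enqueues, another thread consumes.
final class NonBlockingSampleForwarder {
  private final BlockingQueue<long[]> queue;
  private final AtomicLong dropped = new AtomicLong();

  NonBlockingSampleForwarder(int capacity) {
    this.queue = new ArrayBlockingQueue<>(capacity);
  }

  // Called on the sampler thread: never blocks; drops the sample if full.
  void sampled(long[] values) {
    if (!queue.offer(values)) {
      dropped.incrementAndGet();
    }
  }

  // Called on the consumer thread: returns null when nothing is queued.
  long[] poll() {
    return queue.poll();
  }

  long droppedCount() {
    return dropped.get();
  }
}
```

The trade-off is that a full queue drops samples, so this particular consumer
would still see gaps, but the sampler thread and the other handlers would not
be delayed by it.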