Posted to dev@kafka.apache.org by Jay Kreps <ja...@gmail.com> on 2014/02/06 21:51:11 UTC

Metrics in new producer

Hey guys,

I wanted to kick off a quick discussion of metrics with respect to the new
producer and consumer (and potentially the server).

At a high level I think there are three approaches we could take:
1. Plain vanilla JMX
2. Use Coda Hale (AKA Yammer) Metrics
3. Do our own metrics (with JMX as one output)

1. Has the advantage that JMX is the most commonly used java thing and
plugs in reasonably to most metrics systems. JMX is included in the JDK so
it doesn't impose any additional dependencies on clients. It has the
disadvantage that plain vanilla JMX is a pain to use. We would need a bunch
of helper code for maintaining counters to make this reasonable.

2. Coda Hale metrics is pretty good and broadly used. It supports JMX
output as well as direct output to many other types of systems. The primary
downside we have had with Coda Hale has to do with the clients and library
incompatibilities. We are currently on an older more popular version. The
newer version is a rewrite of the APIs and is incompatible. Originally
these were totally incompatible and people had to choose one or the other.
I think that has been improved so now the new version is a totally
different package. But even in this case you end up with both versions if
you use Kafka and we are on a different version than you which is going to
be pretty inconvenient.

3. Doing our own has the downside of potentially reinventing the wheel, and
potentially needing to work out any bugs in our code. The upsides would
depend on how good the reinvention was. As it happens I did a quick
(~900 loc) version of a metrics library that is under kafka.common.metrics.
I think it has some advantages over the Yammer metrics package for our
usage beyond just not causing incompatibilities. I will describe this code
so we can discuss the pros and cons. Although I favor this approach I have
no emotional attachment and wouldn't be too sad if I ended up deleting it.
Here are javadocs for this code, though I haven't written much
documentation yet since I might end up deleting it:
http://empathybox.com/kafka-metrics-javadoc/index.html?kafka/common/metrics/package-summary.html

Here is a quick overview of this library.

There are three main public interfaces:
  Metrics - This is a repository of metrics being tracked.
  Metric - A single, named numerical value being measured (e.g. a counter).
  Sensor - This is a thing that records values and updates zero or more
metrics.

So let's say we want to track four values about message sizes;
specifically say we want to record the average, the maximum, the total rate
of bytes being sent, and a count of messages. Then we would do something
like this:

   // setup code
   Metrics metrics = new Metrics(); // this is a global "singleton"
   Sensor sensor = metrics.sensor("kafka.producer.message.sizes");
   sensor.add("kafka.producer.message-size.avg", new Avg());
   sensor.add("kafka.producer.message-size.max", new Max());
   sensor.add("kafka.producer.bytes-sent-per-sec", new Rate());
   sensor.add("kafka.producer.message-count", new Count());

   // now when we get a message we do this
   sensor.record(messageSize);

The above code creates the global metrics repository, creates a single
Sensor, and defines four named metrics that are updated by that Sensor.

Like Yammer Metrics (YM) I allow you to plug in "reporters", including a
JMX reporter. Unlike the Coda Hale JMX reporter the reporter I have keys
off the metric names, not the Sensor names, which I think is an
improvement--I just use the convention that the last portion of the name is
the attribute name, the second to last is the mbean name, and the rest is
the package. So in the above example there is a message-size mbean that has
avg and max attributes and a producer mbean that has bytes-sent-per-sec
and message-count attributes. This is nice because you can logically group
the values reported irrespective of where in the program they are
computed--that is an mbean can logically group attributes computed off
different sensors. This means you can report values by logical subsystem.
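
To make the naming convention concrete, here is a rough sketch of the
split (this snippet is purely illustrative and not part of the library):

   // "kafka.producer.message-size.avg" splits into package
   // "kafka.producer", mbean "message-size", and attribute "avg"
   String name = "kafka.producer.message-size.avg";
   int last = name.lastIndexOf('.');
   int secondLast = name.lastIndexOf('.', last - 1);
   String attribute = name.substring(last + 1);          // "avg"
   String mbean = name.substring(secondLast + 1, last);  // "message-size"
   String pkg = name.substring(0, secondLast);           // "kafka.producer"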

I also allow the concept of hierarchical Sensors which I think is a good
convenience. I have noticed a common pattern in systems where you need to
roll up the same values along different dimensions. A simple example is
metrics about qps, data rate, etc on the broker. These we want to capture
in aggregate, but also broken down by topic-id. You can do this purely by
defining the sensor hierarchy:
   Sensor allSizes = metrics.sensor("kafka.producer.sizes");
   Sensor topicSizes = metrics.sensor("kafka.producer." + topic + ".sizes",
                                      allSizes);
Now each actual update will go to the appropriate topicSizes sensor (based
on the topic name), but allSizes metrics will get updated too. I also
support multiple parents for each sensor as well as multiple layers of
hierarchy, so you can define a more elaborate DAG of sensors. An example of
how this would be useful is if you wanted to record your metrics broken
down by topic AND client id as well as the global aggregate.
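
As a sketch of that topic-and-client-id case (I am assuming sensor() takes
a varargs list of parents, generalizing the two-argument call above; the
exact signature may differ):

   Sensor all = metrics.sensor("kafka.producer.sizes");
   Sensor byTopic = metrics.sensor("kafka.producer.topic." + topic + ".sizes", all);
   Sensor byClient = metrics.sensor("kafka.producer.client." + clientId + ".sizes", all);
   // a leaf sensor with two parents: recording here also updates byTopic,
   // byClient, and the global aggregate
   Sensor leaf = metrics.sensor("kafka.producer." + clientId + "." + topic + ".sizes",
                                byTopic, byClient);
   leaf.record(messageSize);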

Each metric can take a configurable Quota value which allows us to limit
the maximum value of that metric. This is intended for use on the server as
part of our Quota implementation. The way this works is that you record
metrics as usual:
   mySensor.record(42.0);
However if this event occurrence causes one of the metrics to exceed its
maximum allowable value (the quota) this call will throw a
QuotaViolationException. The cool thing about this is that it means we can
define quotas on anything we capture metrics for.
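
A sketch of what this might look like at a call site (exactly how a Quota
attaches to a metric is my guess from the description above):

   // hypothetical: cap the byte rate and push back when it is exceeded
   sensor.add("kafka.producer.bytes-sent-per-sec", new Rate(), new Quota(1024 * 1024));
   try {
       sensor.record(batchSizeInBytes);
   } catch (QuotaViolationException e) {
       // unwind to the API layer and return an error or delay the client
   }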

Another question is how to handle windowing of the values? Metrics want to
record the "current" value, but the definition of current is inherently
nebulous. A few of the obvious gotchas are that if you define "current" to
be a number of events you can end up measuring an arbitrarily long window
of time if the event rate is low (e.g. you think you are getting 50
messages/sec because that was the rate yesterday when all events stopped).

Here is how I approach this. All the metrics use the same windowing
approach. We define a single window by a length of time or number of values
(you can use either or both--if both, the window ends when *either* the time
bound or event bound is hit). The typical problem with hard window
boundaries is that at the beginning of the window you have no data and the
first few samples are too small to be a valid sample. (Consider if you were
keeping an avg and the first value in the window happens to be very high:
if you check the avg at that exact time you will conclude the avg is very
high, but on a sample size of one.) One simple fix would be to always
report the last complete window, however this is not appropriate here
because (1) we want to drive quotas off it so it needs to be current, and
(2) since this is for monitoring you kind of care more about the current
state. The ideal solution here would be to define a backwards looking
sliding window from the present, but many statistics are actually very hard
to compute in this model without retaining all the values which would be
hopelessly inefficient. My solution to this is to keep a configurable
number of windows (default is two) and combine them for the estimate. So in
a two sample case depending on when you ask you have between one and two
complete samples worth of data to base the answer off of. Provided the
sample window is large enough to get a valid result this satisfies both of
my criteria of incorporating the most recent data and having reasonable
variance at all times.
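
For the avg case, here is a minimal sketch of that combination step,
assuming each retained window keeps a running sum and count (the Window
type is made up for illustration):

   static class Window { double sum; long count; }

   // combine e.g. one complete window plus the current partial window
   static double combinedAvg(java.util.List<Window> retained) {
       double sum = 0.0;
       long count = 0;
       for (Window w : retained) {
           sum += w.sum;
           count += w.count;
       }
       return count == 0 ? Double.NaN : sum / count;
   }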

Another approach is to use an exponential weighting scheme to combine all
history but emphasize the recent past. I have not done this as it has a lot
of issues for practical operational metrics. I'd be happy to elaborate on
this if anyone cares...

The window size for metrics has a global default which can be overridden at
either the sensor or individual metric level.

In addition to these time series values the user can directly expose some
method of their choosing JMX-style by implementing the Measurable interface
and registering that value. E.g.
  metrics.addMetric("my.metric", new Measurable() {
    public double measure(MetricConfig config, long now) {
       return calculateValueToExpose();
    }
  });
This is useful for exposing things like the accumulator free memory.

The set of metrics is extensible: new metrics can be added by implementing
the appropriate interfaces and registering them with a sensor (see the
sketch after the list below). I
implement the following metrics:
  total - the sum of all values from the given sensor
  count - a windowed count of values from the sensor
  avg - the sample average within the windows
  max - the max over the windows
  min - the min over the windows
  rate - the rate in the windows (e.g. the total or count divided by the
elapsed time)
  percentiles - a collection of percentiles computed over the window
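
As a rough sketch of such a custom stat (I am guessing at the interface
shape from the Measurable example above; assume a stat registered with a
sensor gets a record() callback and is then measured like any other
metric):

   // hypothetical stat that reports the most recently recorded value
   public class LastValue implements Measurable {
       private volatile double last = Double.NaN;

       public void record(double value) { // assumed sensor callback
           this.last = value;
       }

       public double measure(MetricConfig config, long now) {
           return this.last;
       }
   }

   sensor.add("kafka.producer.message-size.last", new LastValue());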

My approach to percentiles is a little different from the yammer metrics
package. My complaint about the yammer metrics approach is that it uses
rather expensive sampling and uses kind of a lot of memory to get a
reasonable sample. This is problematic for per-topic measurements.

Instead I use a fixed range for the histogram (e.g. 0.0 to 30000.0) which
directly allows you to specify the desired memory use. Any value below the
minimum is recorded as -Infinity and any value above the maximum as
+Infinity. I think this is okay as all metrics have an expected range
except for latency which can be arbitrarily large, but for very high
latency there is no need to model it exactly (e.g. 30 seconds + really is
effectively infinite). Within the range values are recorded in buckets
which can be either fixed width or increasing width. The increasing width
is analogous to the idea of significant figures, that is if your value is
in the range 0-10 you might want to be accurate to within 1ms, but if it is
20000 there is no need to be so accurate. I implemented a linear bucket
size where the Nth bucket has width proportional to N. An exponential
bucket size would also be sensible and could likely be derived directly
from the floating point representation of the value.
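
To make the linear scheme concrete, here is a sketch of the bucket math (my
own illustration, not the library code). If bucket n has width proportional
to n+1 then the bucket boundaries grow quadratically, so the bucket for a
value can be found by inverting that quadratic:

   // bucket n spans [w*n*(n+1)/2, w*(n+1)*(n+2)/2) for base width w
   static int linearBucket(double value, double baseWidth, int numBuckets) {
       if (value < 0.0)
           return -1; // below the range: record as -Infinity
       double n = (Math.sqrt(1.0 + 8.0 * value / baseWidth) - 1.0) / 2.0;
       int bucket = (int) Math.floor(n);
       // above the range: record as +Infinity
       return bucket >= numBuckets ? numBuckets : bucket;
   }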

I'd like to get some feedback on this metrics code and make a decision on
whether we want to use it before I actually go ahead and add all the
instrumentation in the code (otherwise I'll have to redo it if we switch
approaches). So the next topic of discussion will be which actual metrics
to add.

-Jay

Re: Metrics in new producer

Posted by Clark Breyman <cl...@breyman.com>.
Jay,

Not a kafka dev (yet), but I really like having Coda metrics in Kafka since
it simplifies the integration with dropwizard services, which I use a ton
of. Using coda metrics in kafka means that they should automatically be
surfaced on the dropwizard metrics REST endpoint as well as JMX and any
other reporter I have configured.

A separate thread might be good to discuss the shortcomings of coda metrics
- if that means a push to either upgrade Kafka to a newer release of
metrics (pref aligned with either DW 6.2 or 7.0) or an enhancement to the
metrics library itself.

Thx,
Clark


On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <ja...@gmail.com> wrote:

> [...]

Re: Metrics in new producer

Posted by Neha Narkhede <ne...@gmail.com>.
Although I'm not entirely sure about the approach taken for windowing and
histograms, since it is difficult to say much before observing how well it
works in practice, I really like the hierarchical sensors, the inbuilt
ability for quotas, and the flexible, extensible memory management. I also
like that we expose the ability to plug in reporters.

I'm in favor of not depending on Coda Hale metrics due to the version
compatibility issues, which are primarily a concern on the clients.

Overall, there is some overhead with reinventing the wheel but I see
significant benefits with the new features it exposes. So I'm +1 on
proceeding with approach #3.

Thanks,
Neha


On Thu, Feb 6, 2014 at 1:50 PM, Jay Kreps <ja...@gmail.com> wrote:

> [...]

Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
Also, here is the javadoc for this package:
http://empathybox.com/kafka-metrics-javadoc/index.html?kafka/common/metrics/package-summary.html


On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <ja...@gmail.com> wrote:

> [...]

Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
Hey guys,

I think this is the first item where we haven't had fairly obvious
consensus. I will call a vote if there isn't more discussion worth doing.

I think these are the alternatives as I have understood them:
1. JMX
2. Coda Hale
3. Custom package in there now with improvements (histograms, additional
reporters, etc)
4. New shim layer that abstracts both Coda Hale and new code

Is there anything else worth discussing or other alternatives we haven't
discussed?

-Jay


On Mon, Feb 24, 2014 at 9:56 AM, Martin Kleppmann
<mk...@linkedin.com> wrote:

> Hi Jay,
>
> 1. Agree with your assessment. Let me know when you start writing the
> metrics library to rule them all, I'm interested :-)
>
> 2. If I understood you correctly, you need to give an indication of what
> range of values you expect for a metric (e.g. a latency might be in
> log-scale of buckets between 0.1ms and 30,000ms) -- that's what I meant
> with distribution. Or did I get it wrong?
>
> There are a bunch of interesting algorithms for estimating percentiles
> with a small memory footprint. This is probably getting too far into
> yak-shaving territory, but just in case, this short literature survey may
> be useful:
>
> Chiranjeeb Buragohain and Subhash Suri: "Quantiles on Streams" in
> Encyclopedia of Database Systems, Springer, pp 2235-2240, 2009. ISBN:
> 978-0-387-35544-3 http://www.cs.ucsb.edu/~suri/psdir/ency.pdf
>
> 3. I don't know all the purposes for which quotas will be used, but
> instinctively your approach sounds good to me. I would be inclined to
> increase N a bit (perhaps to 4 or 5), to reduce the uncertainty introduced
> by the incomplete window, if memory usage allows.
>
> 5. Looking into open-sourcing it. I will also take a look at your code.
>
> Best,
> Martin
>
> On 22 Feb 2014, at 18:53, Jay Kreps <ja...@gmail.com> wrote:
> > Hey Martin,
> >
> > Thanks for the great feedback.
> >
> > 1. I agree with the problems of mixing moving window statistics with
> fixed
> > window statistics. That was one of my rationales. The other is that
> > weighted statistics are very unintuitive for people compared to simple
> > things like averages and percentiles so they fail a bit as an intuitive
> > monitoring mechanism. I actually think the moving windows are technically
> > superior since they don't have hard boundaries, but a naive
> implementation
> > based solely on events is actually totally wrong for the reasons you
> > describe, the weighting needs to take into account the point in time of
> the
> > estimate in its contribution to the average. This is an interesting
> problem
> > and I started to think about it but then decided that if I kept thinking
> > about it I would never get anything finished. When I retire I plan to
> write
> > a metrics library based solely on continuously weighted averages. :-)
> >
> > 2. Fair. To be clear you needn't encode a distribution, just your
> > preference about accuracy in the measurement. You are saying "I care
> > equally about accuracy in the whole range" or "I don't care about fine
> > grained accuracy when the numbers themselves are large".
> >
> > 3. The reason the exception is good is that the actual quota may be low
> > down in some part of the system, but a quota violation always needs to
> > unwind all the way back up to the API layer to return the error to the
> > client. So an exception is actually just what you need because the catch
> > will actually potentially be in a different place than the record() call.
> > This let's you introduce quotaing without each subsystem really needing
> to
> > know about it.
> >
> > Your point about whether or not you should count the current event when a
> > quota violation occurs is a good one. I actually think the right answer
> > depends on the details of how you handle windowing. For example one
> > approach to windowing I have seen is to use the most recent COMPLETE
> window
> > as the estimate while you fill up the current window. In this model then
> > with a 30 second window the estimate you give out is always 0-30 seconds
> > old. In this case you have a real problem with quotas because once the
> > previous window is filled and you are in violation of your quota you will
> > keep throwing exceptions regardless of the client behavior for the
> duration
> > of the next window. But worse if you aren't counting the requests that
> got
> > rejected then even though the client behavior is still bad your next
> window
> > will record no values (because you rejected them all as quota
> violations).
> > This is clearly a mess.
> >
> > But that isn't quite how I'm doing windowing. The way I do it is I always
> > keep N windows (with the last window being partial) and the estimate is
> > over all windows. So with N=2 (the default) when you complete the
> > current window the previous window is cleared and used to record the
> > new values. The downside of this is that with a 30 second window and N=2
> > your estimate is based on anything from 30 seconds to 60 seconds. The
> > upside is that the most recent data is always included. I feel this is
> > inherently important for monitoring. But it is particularly important for
> > Quotas. In this case I feel that it is always the right thing to NOT count
> > rejected measurements. Note that in this model, if the user goes over
> > their quota and stays that way for a sustained period of time, the impact
> > will not be the seesaw behavior I described, where we reject all then
> > none of their requests; instead we will reject enough requests to keep
> > them under their quota.
> >
> > 5. I would definitely be interested to see the code if it is open source,
> > since I am interested in metrics. Overall since you went down this path I
> > would be interested to get your opinion on my code. If you think what you
> > did is better I would be open to discussing it as a third alternative
> too.
> > If we decide we do want to use this code for metrics then we may want to
> > implement a sampling histogram either in addition to or as a replacement
> > for the existing histograms and if you were up to contribute your
> > implementation that would be great.
> >
> > -Jay
> >
> >
> > On Sat, Feb 22, 2014 at 9:25 AM, Martin Kleppmann
> > <mk...@linkedin.com> wrote:
> >
> >> Not sure if you want yet another opinion added to the pile -- but since
> I
> >> had a similar problem on another project recently, I thought I'd weigh
> in.
> >> (On that project we were originally using Coda's library, but then
> switched
> >> to rolling our own metrics implementation because we needed to do a few
> >> things differently.)
> >>
> >> 1. Problems we encountered with Coda's library: it uses an
> >> exponentially-weighted moving average (EMWA) for rates (eg.
> messages/sec),
> >> and exponentially biased reservoir sampling for histograms (percentiles,
> >> averages). Those methods of calculation work well for events with a
> >> consistently high volume, but they give strange and misleading results
> for
> >> events that are bursty or rare (e.g. error rates). We found that a
> fixed-size
> >> window gives more predictable, easier-to-interpret results.
> >>
> >> 2. In defence of Coda's library, I think its histogram implementation
> is a
> >> good trade-off of memory for accuracy; I'm not totally convinced that
> your
> >> proposal (counts of events in a fixed set of buckets) would be much
> better.
> >> Would have to do some math to work out the expected accuracy in each
> case.
> >> The reservoir sampling can be configured to use a smaller sample if the
> >> default of 1028 samples is too expensive. Reservoir sampling also has
> the
> >> advantage that you don't need to hard-code a bucket distribution.
> >>
> >> 3. Quotas are an interesting use case. However, I'm not wild about
> using a
> >> QuotaViolationException for control flow -- I think an explicit
> conditional
> >> would be nicer than having to catch an exception. One question in that
> >> context: if a quota is exceeded, do you still want to count the event
> >> towards the metric, or do you want to stop counting it until the quota
> is
> >> replenished? The answer may depend on the particular metric.
> >>
> >> 4. If you decide to go with Coda's library, I would advocate isolating
> the
> >> dependency into a separate module and using it via a facade -- somewhat
> >> like using SLF4J instead of Log4j directly. It's ok for Coda's library
> to
> >> be the default metrics implementation, but it should be easy to swap it
> out
> >> for something different in case someone has a version conflict or
> differing
> >> requirements. The facade should be at a low level (individual events),
> not
> >> at the reporter level (which deals with pre-aggregated values, and is
> >> already pluggable).
> >>
> >> 5. If it's useful, I can probably contribute my simple (but imho
> >> effective) metrics library, for embedding into Kafka. It uses reservoir
> >> sampling for percentiles, like Coda's library, but uses a fixed-size
> window
> >> instead of an exponential bias, which avoids weird behaviour on bursty
> >> metrics.
> >>
> >> In summary, I would advocate one of the following approaches:
> >> - Coda Hale library via facade (allowing it to be swapped for something
> >> else), or
> >> - Own metrics implementation, provided that we have confidence in its
> >> implementation of percentiles.
> >>
> >> Martin
> >>
> >>
> >> On 22 Feb 2014, at 01:06, Jay Kreps <ja...@gmail.com> wrote:
> >>> Hey guys,
> >>>
> >>> Just picking up this thread again. I do want to drive a conclusion as I
> >>> will run out of work to do on the producer soon and will need to add
> >>> metrics of some sort. We can vote on it, but I'm not sure if we
> actually
> >>> got everything discussed.
> >>>
> >>> Joel, I wasn't fully sure how to interpret your comment. I think you
> are
> >>> saying you are cool with the new metrics package as long as it really
> is
> >>> better. Do you have any comment on whether you think the benefits I
> >>> outlined are worth it? I agree with you that we could hold off on a
> >> second
> >>> repo until someone else would actually want to use our code.
> >>>
> >>> Jun, I'm not averse to doing a sampling-based histogram and doing some
> >>> comparison between the two approaches if you think this approach is
> >>> otherwise better.
> >>>
> >>> Sriram, originally I thought you preferred just sticking to Coda Hale,
> >> but
> >>> after your follow-up email I wasn't really sure...
> >>>
> >>> Joe/Clark, yes this code allows pluggable reporting so you could have a
> >>> metrics reporter that just wraps each metric in a Coda Hale Gauge if
> that
> >>> is useful. Though obviously if enough people were doing that I would
> >> think
> >>> it would be worth just using the Coda Hale package directly...
> >>>
> >>> -Jay
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Feb 13, 2014 at 3:34 PM, Clark Breyman <cl...@breyman.com>
> >> wrote:
> >>>
> >>>> Not requiring the client to link Coda/Yammer metrics sounds like a
> >>>> compelling reason to pivot to new interfaces. If that's the agreed
> >>>> direction, I'm hoping that we'd get the choice of backend to provide
> >> (e.g.
> >>>> facade on Yammer metrics for those with an investment in that) rather
> >> than
> >>>> force the new backend.  Having a metrics factory seems better for this
> >> than
> >>>> directly instantiating the singleton registry.
> >>>>
> >>>>
> >>>> On Thu, Feb 13, 2014 at 2:39 PM, Joe Stein <jo...@stealth.ly>
> >> wrote:
> >>>>
> >>>>> Can we leave metrics and have multiple supported KafkaMetricsGroup
> >>>>> implementing a yammer based implementation?
> >>>>>
> >>>>> ProducerRequestStats with your configured analytics group?
> >>>>>
> >>>>> On Thu, Feb 13, 2014 at 11:37 AM, Jay Kreps <ja...@gmail.com>
> >> wrote:
> >>>>>
> >>>>>> I think we discussed the scala/java stuff more fully previously.
> >>>>>> Essentially the client is embedded everywhere. Scala is very
> >>>> incompatible
> >>>>>> with itself so this makes it very hard to use for people using
> >> anything
> >>>>>> else in scala. Also Scala stack traces are very confusing. Basically
> >> we
> >>>>>> thought plain java code would be a lot easier for people to use.
> Even
> >>>> if
> >>>>>> Scala is more fun to write, that isn't really what we are optimizing
> >>>> for.
> >>>>>>
> >>>>>> -Jay
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Feb 13, 2014 at 8:09 AM, S Ahmed <sa...@gmail.com>
> >> wrote:
> >>>>>>
> >>>>>>> Jay, pretty impressive how you just write a 'quick version' like
> that
> >>>>> :)
> >>>>>>> Not to get off-topic but why didn't you write this in scala?
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Feb 12, 2014 at 6:54 PM, Joel Koshy <jj...@gmail.com>
> >>>>> wrote:
> >>>>>>>
> >>>>>>>> I have not had a chance to review the new metrics code and its
> >>>>>>>> features carefully (apart from your write-up), but here are my
> >>>>> general
> >>>>>>>> thoughts:
> >>>>>>>>
> >>>>>>>> Implementing a metrics package correctly is difficult; more so for
> >>>>>>>> people like me, because I'm not a statistician.  However, if this
> >>>> new
> >>>>>>>> package: {(i) functions correctly (and we need to define and prove
> >>>>>>>> correctness), (ii) is easy to use, (iii) serves all our current
> and
> >>>>>>>> anticipated monitoring needs, (iv) is not so complex that it
> >>>>>>>> becomes a burden to maintain and we are better off with an
> available
> >>>>>>>> library;} then I think it makes sense to embed it and use it
> within
> >>>>>>>> the Kafka code. The main wins are: (i) predictability (no changing
> >>>>>>>> APIs and intimate knowledge of the code) and (ii) control with
> >>>>> respect
> >>>>>>>> to both functionality (e.g., there are hard-coded decay constants
> >>>> in
> >>>>>>>> metrics-core 2.x) and correctness (i.e., if we find a bug in the
> >>>>>>>> metrics package we have to submit a pull request and wait for it
> to
> >>>>>>>> become mainstream).  I'm not sure it would help very much to pull
> >>>> it
> >>>>>>>> into a separate repo because that could potentially annul these
> >>>>>>>> benefits.
> >>>>>>>>
> >>>>>>>> Joel
> >>>>>>>>
> >>>>>>>> On Wed, Feb 12, 2014 at 02:50:43PM -0800, Jay Kreps wrote:
> >>>>>>>>> Sriram,
> >>>>>>>>>
> >>>>>>>>> Makes sense. I am cool moving this stuff into its own repo if
> >>>>> people
> >>>>>>>> think
> >>>>>>>>> that is better. I'm not sure it would get much contribution but
> >>>>> when
> >>>>>> I
> >>>>>>>>> started messing with this I did have a lot of grand ideas of
> >>>> making
> >>>>>>>> adding
> >>>>>>>>> metrics to a sensor dynamic so you could add more stuff in
> >>>>>>> real-time(via
> >>>>>>>>> jmx, say) and/or externalize all your metrics and config to a
> >>>>>> separate
> >>>>>>>> file
> >>>>>>>>> like log4j with only the points of instrumentation hard-coded.
> >>>>>>>>>
> >>>>>>>>> -Jay
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, Feb 12, 2014 at 2:07 PM, Sriram Subramanian <
> >>>>>>>>> srsubramanian@linkedin.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> I am actually neutral to this change. I found the replies were
> >>>>> more
> >>>>>>>>>> towards the implementation and features so far. I would like
> >>>> the
> >>>>>>>> community
> >>>>>>>>>> to think about the questions below before making a decision. My
> >>>>>>>> opinion on
> >>>>>>>>>> this is that it has potential to be its own project and it
> >>>> would
> >>>>>>>> attract
> >>>>>>>>>> developers who are specifically interested in contributing to
> >>>>>>> metrics.
> >>>>>>>> I
> >>>>>>>>>> am skeptical that the Kafka contributors would focus on
> >>>> improving
> >>>>>>> this
> >>>>>>>>>> library (apart from bug fixes) instead of
> >>>> developing/contributing
> >>>>>> to
> >>>>>>>> other
> >>>>>>>>>> core pieces. It would be useful to continue and keep it
> >>>> decoupled
> >>>>>>> from
> >>>>>>>>>> the rest of Kafka (if it resides in the Kafka code base) so that
> >>>> we
> >>>>>> can
> >>>>>>>> move
> >>>>>>>>>> it out anytime to its own project.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 2/12/14 1:21 PM, "Jay Kreps" <ja...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hey Sriram,
> >>>>>>>>>>>
> >>>>>>>>>>> Not sure if these are actually meant as questions or more
> >>>> veiled
> >>>>>>>> comments.
> >>>>>>>>>>> In any case I tried to give my 2 cents inline.
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian <
> >>>>>>>>>>> srsubramanian@linkedin.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I think answering the questions below would help to make a
> >>>>>> better
> >>>>>>>>>>>> decision. I am all for writing better code and having
> >>>> superior
> >>>>>>>>>>>> functionalities but it is worth thinking about stuff outside
> >>>>>> just
> >>>>>>>> code
> >>>>>>>>>>>> in
> >>>>>>>>>>>> this case -
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. Does metric form a core piece of kafka? Does it help
> >>>> kafka
> >>>>>>>> greatly in
> >>>>>>>>>>>> providing better core functionalities? I would always like a
> >>>>>>>> project to
> >>>>>>>>>>>> do
> >>>>>>>>>>>> one thing really well. Metrics is a non-trivial amount of
> >>>>> code.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Metrics are obviously important, and obviously improving our
> >>>>>> metrics
> >>>>>>>>>>> system
> >>>>>>>>>>> would be good. That said this may or may not be better, and
> >>>> even
> >>>>>> if
> >>>>>>>> it is
> >>>>>>>>>>> better that betterness might not outweigh other
> >>>> considerations.
> >>>>>> That
> >>>>>>>> is
> >>>>>>>>>>> what we are discussing.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> 2. Does it make sense to be part of Kafka or its own
> >>>> project?
> >>>>> If
> >>>>>>>> this
> >>>>>>>>>>>> metrics library has the potential to be better than
> >>>>>> metrics-core,
> >>>>>>> I
> >>>>>>>>>>>> would
> >>>>>>>>>>>> be interested in other projects taking advantage of it.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> It could be either.
> >>>>>>>>>>>
> >>>>>>>>>>> 3. Can Kafka maintain this library as new members join and old
> >>>>>>> members
> >>>>>>>>>>>> leave? Would this be a piece of code that no one (in Kafka)
> >>>> in
> >>>>>> the
> >>>>>>>>>>>> future
> >>>>>>>>>>>> spends time improving if the original author left?
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I am not going anywhere in the near term, but if I did, yes,
> >>>>> this
> >>>>>>>> would be
> >>>>>>>>>>> like any other code we have. As with yammer metrics or any
> >>>> other
> >>>>>>> code
> >>>>>>>> at
> >>>>>>>>>>> that point we would either use it as is or someone would
> >>>> improve
> >>>>>> it.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> 4. Does it affect the schedule of producer rewrite? This
> >>>> needs
> >>>>>> its
> >>>>>>>> own
> >>>>>>>>>>>> stabilization and modification to existing metric dashboards
> >>>>> if
> >>>>>>> the
> >>>>>>>>>>>> format
> >>>>>>>>>>>> is changed. Many times such costs are not factored in and a
> >>>>>> project
> >>>>>>>> loses
> >>>>>>>>>>>> time before realizing the extra time required to make a
> >>>>> library
> >>>>>> like
> >>>>>>>> this
> >>>>>>>>>>>> operational.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Probably not. The metrics are going to change regardless of
> >>>>>> whether
> >>>>>>>> we use
> >>>>>>>>>>> the same library or not. If we think this is better I don't
> >>>> mind
> >>>>>>>> putting
> >>>>>>>>>>> in
> >>>>>>>>>>> a little extra effort to get there.
> >>>>>>>>>>>
> >>>>>>>>>>> Irrespective I think this is probably not the right thing to
> >>>>>>> optimize
> >>>>>>>> for.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> I am sure we can do better when we write code to a specific
> >>>>> use
> >>>>>>>> case (in
> >>>>>>>>>>>> this case, kafka) rather than building a generic library
> >>>> that
> >>>>>>> suits
> >>>>>>>> all
> >>>>>>>>>>>> (metrics-core) but I would like us to have answers to the
> >>>>>>> questions
> >>>>>>>>>>>> above
> >>>>>>>>>>>> and be prepared before we proceed to support this with the
> >>>>>>> producer
> >>>>>>>>>>>> rewrite.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Naturally we are all considering exactly these things, that is
> >>>>>>>> exactly the
> >>>>>>>>>>> reason I started the thread.
> >>>>>>>>>>>
> >>>>>>>>>>> -Jay
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> On 2/11/14 6:28 PM, "Jun Rao" <ju...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the detailed write-up. It's well thought
> >>>> through.
> >>>>> A
> >>>>>>> few
> >>>>>>>>>>>>> comments:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1. I have a couple of concerns on the percentiles. The
> >>>> first
> >>>>>>> issue
> >>>>>>>> is
> >>>>>>>>>>>> that
> >>>>>>>>>>>>> it requires the user to know the value range. Since the
> >>>> range
> >>>>>> for
> >>>>>>>>>>>> things
> >>>>>>>>>>>>> like message size (in millions) is quite different from
> >>>> those
> >>>>>>> like
> >>>>>>>>>>>> request
> >>>>>>>>>>>>> time (less than 100), it's going to be hard to pick a good
> >>>>>> global
> >>>>>>>>>>>> default
> >>>>>>>>>>>>> range. Different apps could be dealing with different
> >>>> message
> >>>>>>>> sizes. So
> >>>>>>>>>>>>> they
> >>>>>>>>>>>>> probably will have to customize the range. Another issue is
> >>>>>> that
> >>>>>>>> it can
> >>>>>>>>>>>>> only report values at the bucket boundaries. So, if you
> >>>> have
> >>>>>> 1000
> >>>>>>>>>>>> buckets
> >>>>>>>>>>>>> and a value range of 1 million, you will only see 1000
> >>>>> possible
> >>>>>>>> values
> >>>>>>>>>>>> as
> >>>>>>>>>>>>> the quantile, which is probably too sparse. The
> >>>>> implementation
> >>>>>> of
> >>>>>>>>>>>>> histogram
> >>>>>>>>>>>>> in metrics-core keeps a fixed number of samples, which avoids
> >>>>> both
> >>>>>>>> issues.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 2. We need to document the 3-part metrics names better
> >>>> since
> >>>>>> it's
> >>>>>>>> not
> >>>>>>>>>>>>> obvious what the convention is. Also, currently the name of
> >>>>> the
> >>>>>>>> sensor
> >>>>>>>>>>>> and
> >>>>>>>>>>>>> the metrics defined in it are independent. Would it make
> >>>>> sense
> >>>>>> to
> >>>>>>>> have
> >>>>>>>>>>>> the
> >>>>>>>>>>>>> sensor name be a prefix of the metric name?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Overall, this approach seems to be cleaner than
> >>>> metrics-core
> >>>>> by
> >>>>>>>>>>>> decoupling
> >>>>>>>>>>>>> measuring and reporting. The main benefit of metrics-core
> >>>>> seems
> >>>>>>> to
> >>>>>>>> be
> >>>>>>>>>>>> the
> >>>>>>>>>>>>> existing reporters. Since not that many people voted for
> >>>>>>>> metrics-core,
> >>>>>>>>>>>> I
> >>>>>>>>>>>>> am
> >>>>>>>>>>>>> ok with going with the new implementation. My only
> >>>>>> recommendation
> >>>>>>>> is to
> >>>>>>>>>>>>> address the concern on percentiles.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Jun
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <
> >>>>>> jay.kreps@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hey guys,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I wanted to kick off a quick discussion of metrics with
> >>>>>> respect
> >>>>>>>> to
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>> new
> >>>>>>>>>>>>>> producer and consumer (and potentially the server).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> At a high level I think there are three approaches we
> >>>> could
> >>>>>>> take:
> >>>>>>>>>>>>>> 1. Plain vanilla JMX
> >>>>>>>>>>>>>> 2. Use Coda Hale (AKA Yammer) Metrics
> >>>>>>>>>>>>>> 3. Do our own metrics (with JMX as one output)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. Has the advantage that JMX is the most commonly used
> >>>>> java
> >>>>>>>> thing
> >>>>>>>>>>>> and
> >>>>>>>>>>>>>> plugs in reasonably to most metrics systems. JMX is
> >>>>> included
> >>>>>> in
> >>>>>>>> the
> >>>>>>>>>>>> JDK
> >>>>>>>>>>>>>> so
> >>>>>>>>>>>>>> it doesn't impose any additional dependencies on clients.
> >>>>> It
> >>>>>>> has
> >>>>>>>> the
> >>>>>>>>>>>>>> disadvantage that plain vanilla JMX is a pain to use. We
> >>>>>> would
> >>>>>>>> need a
> >>>>>>>>>>>>>> bunch
> >>>>>>>>>>>>>> of helper code for maintaining counters to make this
> >>>>>>> reasonable.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2. Coda Hale metrics is pretty good and broadly used. It
> >>>>>>>> supports JMX
> >>>>>>>>>>>>>> output as well as direct output to many other types of
> >>>>>> systems.
> >>>>>>>> The
> >>>>>>>>>>>>>> primary
> >>>>>>>>>>>>>> downside we have had with Coda Hale has to do with the
> >>>>>> clients
> >>>>>>>> and
> >>>>>>>>>>>>>> library
> >>>>>>>>>>>>>> incompatibilities. We are currently on an older more
> >>>>> popular
> >>>>>>>> version.
> >>>>>>>>>>>>>> The
> >>>>>>>>>>>>>> newer version is a rewrite of the APIs and is
> >>>> incompatible.
> >>>>>>>>>>>> Originally
> >>>>>>>>>>>>>> these were totally incompatible and people had to choose
> >>>>> one
> >>>>>> or
> >>>>>>>> the
> >>>>>>>>>>>>>> other.
> >>>>>>>>>>>>>> I think that has been improved so now the new version is
> >>>> a
> >>>>>>>> totally
> >>>>>>>>>>>>>> different package. But even in this case you end up with
> >>>>> both
> >>>>>>>>>>>> versions
> >>>>>>>>>>>>>> if
> >>>>>>>>>>>>>> you use Kafka and we are on a different version than you
> >>>>>> which
> >>>>>>> is
> >>>>>>>>>>>> going
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>> be pretty inconvenient.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 3. Doing our own has the downside of potentially
> >>>>> reinventing
> >>>>>>> the
> >>>>>>>>>>>> wheel,
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>> potentially needing to work out any bugs in our code. The
> >>>>>>> upsides
> >>>>>>>>>>>> would
> >>>>>>>>>>>>>> depend on how good the reinvention was. As it
> >>>> happens I
> >>>>>>> did a
> >>>>>>>>>>>> quick
> >>>>>>>>>>>>>> (~900 loc) version of a metrics library that is under
> >>>>>>>>>>>>>> kafka.common.metrics.
> >>>>>>>>>>>>>> I think it has some advantages over the Yammer metrics
> >>>>>> package
> >>>>>>>> for
> >>>>>>>>>>>> our
> >>>>>>>>>>>>>> usage beyond just not causing incompatibilities. I will
> >>>>>>> describe
> >>>>>>>> this
> >>>>>>>>>>>>>> code
> >>>>>>>>>>>>>> so we can discuss the pros and cons. Although I favor
> >>>> this
> >>>>>>>> approach I
> >>>>>>>>>>>>>> have
> >>>>>>>>>>>>>> no emotional attachment and wouldn't be too sad if I
> >>>> ended
> >>>>> up
> >>>>>>>>>>>> deleting
> >>>>>>>>>>>>>> it.
> >>>>>>>>>>>>>> Here are javadocs for this code, though I haven't written
> >>>>>> much
> >>>>>>>>>>>>>> documentation yet since I might end up deleting it:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Here is a quick overview of this library.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> There are three main public interfaces:
> >>>>>>>>>>>>>> Metrics - This is a repository of metrics being
> >>>> tracked.
> >>>>>>>>>>>>>> Metric - A single, named numerical value being measured
> >>>>>>> (i.e. a
> >>>>>>>>>>>>>> counter).
> >>>>>>>>>>>>>> Sensor - This is a thing that records values and
> >>>> updates
> >>>>>> zero
> >>>>>>>> or
> >>>>>>>>>>>> more
> >>>>>>>>>>>>>> metrics
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> So let's say we want to track three values about message
> >>>>>> sizes;
> >>>>>>>>>>>>>> specifically say we want to record the average, the
> >>>>> maximum,
> >>>>>>> the
> >>>>>>>>>>>> total
> >>>>>>>>>>>>>> rate
> >>>>>>>>>>>>>> of bytes being sent, and a count of messages. Then we
> >>>> would
> >>>>>> do
> >>>>>>>>>>>> something
> >>>>>>>>>>>>>> like this:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>  // setup code
> >>>>>>>>>>>>>>  Metrics metrics = new Metrics(); // this is a global
> >>>>>>>> "singleton"
> >>>>>>>>>>>>>>  Sensor sensor =
> >>>>>>>> metrics.sensor("kafka.producer.message.sizes");
> >>>>>>>>>>>>>>  sensor.add("kafka.producer.message-size.avg", new
> >>>>> Avg());
> >>>>>>>>>>>>>>  sensor.add("kafka.producer.message-size.max", new
> >>>>> Max());
> >>>>>>>>>>>>>>  sensor.add("kafka.producer.bytes-sent-per-sec", new
> >>>>>> Rate());
> >>>>>>>>>>>>>>  sensor.add("kafka.producer.message-count", new
> >>>> Count());
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>  // now when we get a message we do this
> >>>>>>>>>>>>>>  sensor.record(messageSize);
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The above code creates the global metrics repository,
> >>>>>> creates a
> >>>>>>>>>>>> single
> >>>>>>>>>>>>>> Sensor, and defines 5 named metrics that are updated by
> >>>>> that
> >>>>>>>> Sensor.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Like Yammer Metrics (YM) I allow you to plug in
> >>>>> "reporters",
> >>>>>>>>>>>> including a
> >>>>>>>>>>>>>> JMX reporter. Unlike the Coda Hale JMX reporter the
> >>>>> reporter
> >>>>>> I
> >>>>>>>> have
> >>>>>>>>>>>> keys
> >>>>>>>>>>>>>> off the metric names not the Sensor names, which I think
> >>>> is
> >>>>>> an
> >>>>>>>>>>>>>> improvement--I just use the convention that the last
> >>>>> portion
> >>>>>> of
> >>>>>>>> the
> >>>>>>>>>>>>>> name is
> >>>>>>>>>>>>>> the attribute name, the second to last is the mbean name,
> >>>>> and
> >>>>>>> the
> >>>>>>>>>>>> rest
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>> the package. So in the above example there is a producer
> >>>>>> mbean
> >>>>>>>> that
> >>>>>>>>>>>> has
> >>>>>>>>>>>>>> an
> >>>>>>>>>>>>>> avg and max attribute and a producer mbean that has a
> >>>>>>>>>>>> bytes-sent-per-sec
> >>>>>>>>>>>>>> and message-count attribute. This is nice because you can
> >>>>>>>> logically
> >>>>>>>>>>>>>> group
> >>>>>>>>>>>>>> the values reported irrespective of where in the program
> >>>>> they
> >>>>>>> are
> >>>>>>>>>>>>>> computed--that is an mbean can logically group attributes
> >>>>>>>> computed
> >>>>>>>>>>>> off
> >>>>>>>>>>>>>> different sensors. This means you can report values by
> >>>>>> logical
> >>>>>>>>>>>>>> subsystem.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I also allow the concept of hierarchical Sensors which I
> >>>>>> think
> >>>>>>>> is a
> >>>>>>>>>>>> good
> >>>>>>>>>>>>>> convenience. I have noticed a common pattern in systems
> >>>>> where
> >>>>>>> you
> >>>>>>>>>>>> need
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>> roll up the same values along different dimensions. A
> >>>>> simple
> >>>>>>>>>>>> example is
> >>>>>>>>>>>>>> metrics about qps, data rate, etc on the broker. These we
> >>>>>> want
> >>>>>>> to
> >>>>>>>>>>>>>> capture
> >>>>>>>>>>>>>> in aggregate, but also broken down by topic-id. You can
> >>>> do
> >>>>>> this
> >>>>>>>>>>>> purely
> >>>>>>>>>>>>>> by
> >>>>>>>>>>>>>> defining the sensor hierarchy:
> >>>>>>>>>>>>>> Sensor allSizes = metrics.sensor("kafka.producer.sizes");
> >>>>>>>>>>>>>> Sensor topicSizes = metrics.sensor("kafka.producer." +
> >>>>> topic
> >>>>>> +
> >>>>>>>>>>>>>> ".sizes",
> >>>>>>>>>>>>>> allSizes);
> >>>>>>>>>>>>>> Now each actual update will go to the appropriate
> >>>>> topicSizes
> >>>>>>>> sensor
> >>>>>>>>>>>>>> (based
> >>>>>>>>>>>>>> on the topic name), but allSizes metrics will get updated
> >>>>>> too.
> >>>>>>> I
> >>>>>>>> also
> >>>>>>>>>>>>>> support multiple parents for each sensor as well as
> >>>>> multiple
> >>>>>>>> layers
> >>>>>>>>>>>> of
> >>>>>>>>>>>>>> hierarchy, so you can define a more elaborate DAG of
> >>>>> sensors.
> >>>>>> An
> >>>>>>>>>>>> example
> >>>>>>>>>>>>>> of
> >>>>>>>>>>>>>> how this would be useful is if you wanted to record your
> >>>>>>> metrics
> >>>>>>>>>>>> broken
> >>>>>>>>>>>>>> down by topic AND client id as well as the global
> >>>>> aggregate.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Each metric can take a configurable Quota value which
> >>>>> allows
> >>>>>> us
> >>>>>>>> to
> >>>>>>>>>>>> limit
> >>>>>>>>>>>>>> the maximum value of that sensor. This is intended for
> >>>> use
> >>>>> on
> >>>>>>> the
> >>>>>>>>>>>>>> server as
> >>>>>>>>>>>>>> part of our Quota implementation. The way this works is
> >>>>> that
> >>>>>>> you
> >>>>>>>>>>>> record
> >>>>>>>>>>>>>> metrics as usual:
> >>>>>>>>>>>>>>  mySensor.record(42.0)
> >>>>>>>>>>>>>> However, if this event occurrence causes one of the metrics
> >>>>> to
> >>>>>>>> exceed
> >>>>>>>>>>>> its
> >>>>>>>>>>>>>> maximum allowable value (the quota) this call will throw
> >>>> a
> >>>>>>>>>>>>>> QuotaViolationException. The cool thing about this is
> >>>> that
> >>>>> it
> >>>>>>>> means
> >>>>>>>>>>>> we
> >>>>>>>>>>>>>> can
> >>>>>>>>>>>>>> define quotas on anything we capture metrics for, which I
> >>>>>> think
> >>>>>>>> is
> >>>>>>>>>>>>>> pretty
> >>>>>>>>>>>>>> cool.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Another question is how to handle windowing of the
> >>>> values?
> >>>>>>>> Metrics
> >>>>>>>>>>>> want
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>> record the "current" value, but the definition of current
> >>>>> is
> >>>>>>>>>>>> inherently
> >>>>>>>>>>>>>> nebulous. A few of the obvious gotchas are that if you
> >>>>> define
> >>>>>>>>>>>> "current"
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>> be a number of events you can end up measuring an
> >>>>> arbitrarily
> >>>>>>>> long
> >>>>>>>>>>>>>> window
> >>>>>>>>>>>>>> of time if the event rate is low (e.g. you think you are
> >>>>>>> getting
> >>>>>>>> 50
> >>>>>>>>>>>>>> messages/sec because that was the rate yesterday when all
> >>>>>>> events
> >>>>>>>>>>>>>> stopped).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Here is how I approach this. All the metrics use the same
> >>>>>>>> windowing
> >>>>>>>>>>>>>> approach. We define a single window by a length of time
> >>>> or
> >>>>>>>> number of
> >>>>>>>>>>>>>> values
> >>>>>>>>>>>>>> (you can use either or both--if both the window ends when
> >>>>>>>> *either*
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>> time
> >>>>>>>>>>>>>> bound or event bound is hit). The typical problem with
> >>>> hard
> >>>>>>>> window
> >>>>>>>>>>>>>> boundaries is that at the beginning of the window you
> >>>> have
> >>>>> no
> >>>>>>>> data
> >>>>>>>>>>>> and
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>> first few samples are too small to be a valid sample.
> >>>>>> (Consider
> >>>>>>>> if
> >>>>>>>>>>>> you
> >>>>>>>>>>>>>> were
> >>>>>>>>>>>>>> keeping an avg and the first value in the window happens
> >>>> to
> >>>>>> be
> >>>>>>>> very
> >>>>>>>>>>>> very
> >>>>>>>>>>>>>> high, if you check the avg at this exact time you will
> >>>>>> conclude
> >>>>>>>> the
> >>>>>>>>>>>> avg
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>> very high but on a sample size of one). One simple fix
> >>>>> would
> >>>>>> be
> >>>>>>>> to
> >>>>>>>>>>>>>> always
> >>>>>>>>>>>>>> report the last complete window, however this is not
> >>>>>>> appropriate
> >>>>>>>> here
> >>>>>>>>>>>>>> because (1) we want to drive quotas off it so it needs to
> >>>>> be
> >>>>>>>> current,
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>> (2) since this is for monitoring you kind of care more
> >>>>> about
> >>>>>>> the
> >>>>>>>>>>>> current
> >>>>>>>>>>>>>> state. The ideal solution here would be to define a
> >>>>> backwards
> >>>>>>>> looking
> >>>>>>>>>>>>>> sliding window from the present, but many statistics are
> >>>>>>> actually
> >>>>>>>>>>>> very
> >>>>>>>>>>>>>> hard
> >>>>>>>>>>>>>> to compute in this model without retaining all the values
> >>>>>> which
> >>>>>>>>>>>> would be
> >>>>>>>>>>>>>> hopelessly inefficient. My solution to this is to keep a
> >>>>>>>> configurable
> >>>>>>>>>>>>>> number of windows (default is two) and combine them for
> >>>> the
> >>>>>>>> estimate.
> >>>>>>>>>>>>>> So in
> >>>>>>>>>>>>>> a two sample case depending on when you ask you have
> >>>>> between
> >>>>>>> one
> >>>>>>>> and
> >>>>>>>>>>>> two
> >>>>>>>>>>>>>> complete samples worth of data to base the answer off of.
> >>>>>>>> Provided
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>> sample window is large enough to get a valid result this
> >>>>>>>> satisfies
> >>>>>>>>>>>> both
> >>>>>>>>>>>>>> of
> >>>>>>>>>>>>>> my criteria of incorporating the most recent data and
> >>>>> having
> >>>>>>>>>>>> reasonable
> >>>>>>>>>>>>>> variance at all times.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Another approach is to use an exponential weighting
> >>>> scheme
> >>>>> to
> >>>>>>>> combine
> >>>>>>>>>>>>>> all
> >>>>>>>>>>>>>> history but emphasize the recent past. I have not done
> >>>> this
> >>>>>> as
> >>>>>>> it
> >>>>>>>>>>>> has a
> >>>>>>>>>>>>>> lot
> >>>>>>>>>>>>>> of issues for practical operational metrics. I'd be happy
> >>>>> to
> >>>>>>>>>>>> elaborate
> >>>>>>>>>>>>>> on
> >>>>>>>>>>>>>> this if anyone cares...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The window size for metrics has a global default which
> >>>> can
> >>>>> be
> >>>>>>>>>>>>>> overridden at
> >>>>>>>>>>>>>> either the sensor or individual metric level.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In addition to these time series values the user can
> >>>>> directly
> >>>>>>>> expose
> >>>>>>>>>>>>>> some
> >>>>>>>>>>>>>> method of their choosing JMX-style by implementing the
> >>>>>>> Measurable
> >>>>>>>>>>>>>> interface
> >>>>>>>>>>>>>> and registering that value. E.g.
> >>>>>>>>>>>>>> metrics.addMetric("my.metric", new Measurable() {
> >>>>>>>>>>>>>>   public double measure(MetricConfig config, long now) {
> >>>>>>>>>>>>>>      return this.calculateValueToExpose();
> >>>>>>>>>>>>>>   }
> >>>>>>>>>>>>>> });
> >>>>>>>>>>>>>> This is useful for exposing things like the accumulator
> >>>>> free
> >>>>>>>> memory.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The set of metrics is extensible, new metrics can be
> >>>> added
> >>>>> by
> >>>>>>>> just
> >>>>>>>>>>>>>> implementing the appropriate interfaces and registering
> >>>>> with
> >>>>>> a
> >>>>>>>>>>>> sensor. I
> >>>>>>>>>>>>>> implement the following metrics:
> >>>>>>>>>>>>>> total - the sum of all values from the given sensor
> >>>>>>>>>>>>>> count - a windowed count of values from the sensor
> >>>>>>>>>>>>>> avg - the sample average within the windows
> >>>>>>>>>>>>>> max - the max over the windows
> >>>>>>>>>>>>>> min - the min over the windows
> >>>>>>>>>>>>>> rate - the rate in the windows (e.g. the total or count
> >>>>>>>> divided by
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>> elapsed time)
> >>>>>>>>>>>>>> percentiles - a collection of percentiles computed over
> >>>>> the
> >>>>>>>> window
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> My approach to percentiles is a little different from the
> >>>>>>> yammer
> >>>>>>>>>>>> metrics
> >>>>>>>>>>>>>> package. My complaint about the yammer metrics approach
> >>>> is
> >>>>>> that
> >>>>>>>> it
> >>>>>>>>>>>> uses
> >>>>>>>>>>>>>> rather expensive sampling and uses kind of a lot of
> >>>> memory
> >>>>> to
> >>>>>>>> get a
> >>>>>>>>>>>>>> reasonable sample. This is problematic for per-topic
> >>>>>>>> measurements.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Instead I use a fixed range for the histogram (e.g. 0.0
> >>>> to
> >>>>>>>> 30000.0)
> >>>>>>>>>>>>>> which
> >>>>>>>>>>>>>> directly allows you to specify the desired memory use.
> >>>> Any
> >>>>>>> value
> >>>>>>>>>>>> below
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>> minimum is recorded as -Infinity and any value above the
> >>>>>>> maximum
> >>>>>>>> as
> >>>>>>>>>>>>>> +Infinity. I think this is okay as all metrics have an
> >>>>>> expected
> >>>>>>>> range
> >>>>>>>>>>>>>> except for latency which can be arbitrarily large, but
> >>>> for
> >>>>>> very
> >>>>>>>> high
> >>>>>>>>>>>>>> latency there is no need to model it exactly (e.g. 30
> >>>>>> seconds +
> >>>>>>>>>>>> really
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>> effectively infinite). Within the range values are
> >>>> recorded
> >>>>>> in
> >>>>>>>>>>>> buckets
> >>>>>>>>>>>>>> which can be either fixed width or increasing width. The
> >>>>>>>> increasing
> >>>>>>>>>>>>>> width
> >>>>>>>>>>>>>> is analogous to the idea of significant figures, that is
> >>>> if
> >>>>>>> your
> >>>>>>>>>>>> value
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>> in the range 0-10 you might want to be accurate to within
> >>>>>> 1ms,
> >>>>>>>> but if
> >>>>>>>>>>>>>> it is
> >>>>>>>>>>>>>> 20000 there is no need to be so accurate. I implemented a
> >>>>>>> linear
> >>>>>>>>>>>> bucket
> >>>>>>>>>>>>>> size where the Nth bucket has width proportional to N. An
> >>>>>>>> exponential
> >>>>>>>>>>>>>> bucket size would also be sensible and could likely be
> >>>>>> derived
> >>>>>>>>>>>> directly
> >>>>>>>>>>>>>> from the floating point representation of the value.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'd like to get some feedback on this metrics code and
> >>>>> make a
> >>>>>>>>>>>> decision
> >>>>>>>>>>>>>> on
> >>>>>>>>>>>>>> whether we want to use it before I actually go ahead and
> >>>>> add
> >>>>>>> all
> >>>>>>>> the
> >>>>>>>>>>>>>> instrumentation in the code (otherwise I'll have to redo
> >>>> it
> >>>>>> if
> >>>>>>> we
> >>>>>>>>>>>> switch
> >>>>>>>>>>>>>> approaches). So the next topic of discussion will be
> >>>> which
> >>>>>>> actual
> >>>>>>>>>>>>>> metrics
> >>>>>>>>>>>>>> to add.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -Jay
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> >>
>
>

Re: Metrics in new producer

Posted by Martin Kleppmann <mk...@linkedin.com>.
Hi Jay,

1. Agree with your assessment. Let me know when you start writing the metrics library to rule them all -- I'm interested :-)

2. If I understood you correctly, you need to give an indication of what range of values you expect for a metric (e.g. a latency might use log-scale buckets between 0.1ms and 30,000ms) -- that's what I meant by distribution. Or did I get it wrong?

There are a bunch of interesting algorithms for estimating percentiles with a small memory footprint. This is probably getting too far into yak-shaving territory, but in case it helps, here is a short literature survey:

Chiranjeeb Buragohain and Subhash Suri: "Quantiles on Streams" in Encyclopedia of Database Systems, Springer, pp 2235–2240, 2009. ISBN: 978-0-387-35544-3 http://www.cs.ucsb.edu/~suri/psdir/ency.pdf
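
Coming back to the bucket idea for a moment, here is a minimal sketch of a log-scale bucket index over a range like 0.1ms-30,000ms (the method below is invented for illustration and is not taken from either library):

    // Map a value into one of numBuckets buckets whose widths grow
    // geometrically between min and max; out-of-range values are clamped
    // to the first or last bucket.
    static int logBucketFor(double value, double min, double max, int numBuckets) {
        if (value <= min)
            return 0;
        if (value >= max)
            return numBuckets - 1;
        double fraction = Math.log(value / min) / Math.log(max / min);
        return (int) (fraction * (numBuckets - 1));
    }

    // e.g. logBucketFor(1.0, 0.1, 30000.0, 100) lands a 1ms latency in a
    // fine-grained bucket, while multi-second latencies share coarse ones.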

3. I don't know all the purposes for which quotas will be used, but instinctively your approach sounds good to me. I would be inclined to increase N a bit (perhaps to 4 or 5), to reduce the uncertainty introduced by the incomplete window, if memory usage allows.

5. Looking into open-sourcing it. I will also take a look at your code.

Best,
Martin

On 22 Feb 2014, at 18:53, Jay Kreps <ja...@gmail.com> wrote:
> Hey Martin,
> 
> Thanks for the great feedback.
> 
> 1. I agree with the problems of mixing moving window statistics with fixed
> window statistics. That was one of my rationales. The other is that
> weighted statistics are very unintuitive for people compared to simple
> things like averages and percentiles, so they fail a bit as an intuitive
> monitoring mechanism. I actually think the moving windows are technically
> superior since they don't have hard boundaries, but a naive implementation
> based solely on events is actually totally wrong for the reasons you
> describe: the weighting needs to take into account the point in time of the
> estimate in its contribution to the average. This is an interesting problem
> and I started to think about it but then decided that if I kept thinking
> about it I would never get anything finished. When I retire I plan to write
> a metrics library based solely on continuously weighted averages. :-)
> 
> 2. Fair. To be clear, you needn't encode a distribution, just your
> preference about accuracy in the measurement. You are saying "I care
> equally about accuracy in the whole range" or "I don't care about
> fine-grained accuracy when the numbers themselves are large".
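> 
> To make the second case concrete, here is a sketch of the linear scheme
> from my original mail (bucket N has width proportional to N; the method
> name here is invented):
> 
>     // Buckets over [0, max] where bucket n has width proportional to n+1,
>     // so the upper boundary of bucket n-1 sits at s*n*(n+1)/2, with
>     // s = 2*max / (B*(B+1)) for B buckets.
>     static int linearBucketFor(double value, double max, int numBuckets) {
>         if (value <= 0)
>             return 0;
>         if (value >= max)
>             return numBuckets - 1;
>         double s = 2.0 * max / (numBuckets * (numBuckets + 1.0));
>         // smallest n with s*n*(n+1)/2 >= value, shifted to a 0-based index
>         int n = (int) Math.ceil((-1.0 + Math.sqrt(1.0 + 8.0 * value / s)) / 2.0);
>         return n - 1;
>     }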
> 
> 3. The reason the exception is good is because the actual quota may be low
> down in some part of the system, but a quota violation always needs to
> unwind all the way back up to the API layer to return the error to the
> client. So an exception is actually just what you need, because the catch
> will potentially be in a different place than the record() call.
> This lets you introduce quotas without each subsystem really needing to
> know about it.
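> 
> A minimal sketch of that flow (the handler and response types here are
> invented for illustration):
> 
>     // deep inside some subsystem -- no quota logic here at all
>     void append(Record record) {
>         bytesInSensor.record(record.size()); // may throw QuotaViolationException
>         log.append(record);
>     }
> 
>     // at the API layer the exception unwinds to a single catch
>     Response handle(Request request) {
>         try {
>             append(request.record());
>             return Response.ok();
>         } catch (QuotaViolationException e) {
>             return Response.quotaExceeded();
>         }
>     }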
> 
> Your point about whether or not you should count the current event when a
> quota violation occurs is a good one. I actually think the right answer
> depends on the details of how you handle windowing. For example one
> approach to windowing I have seen is to use the most recent COMPLETE window
> as the estimate while you fill up the current window. In this model then
> with a 30 second window the estimate you give out is always 0-30 seconds
> old. In this case you have a real problem with quotas because once the
> previous window is filled and you are in violation of your quota you will
> keep throwing exceptions regardless of the client behavior for the duration
> of the next window. But worse if you aren't counting the requests that got
> rejected then even though the client behavior is still bad your next window
> will record no values (because you rejected them all as quota violations).
> This is clearly a mess.
> 
> But that isn't quite how I'm doing windowing. The way I do it is I always
> keep N windows (with the last window being partial), and the estimate is
> over all windows. So with N=2 (the default), when you complete the
> current window the previous window is cleared and used to record the
> new values. The downside of this is that with a 30 second window and N=2
> your estimate is based on anything from 30 seconds to 60 seconds. The
> upside is that the most recent data is always included. I feel this is
> inherently important for monitoring. But it is particularly important for
> Quotas. In this case I feel that it is always the right thing to NOT count
> rejected measurements. Note what happens in this model if the user goes
> over their quota and stays that way for a sustained period of time. The
> impact will not be the seesaw behavior I described where we reject all then
> none of their requests; instead we will reject just enough requests to keep them
> under their quota.
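> 
> To sketch the bookkeeping (simplified to the time-bound-only case; the
> field and method names are invented):
> 
>     long windowMs = 30 * 1000L;
>     int numWindows = 2;                       // N
>     double[] totals = new double[numWindows]; // one slot per window
>     long windowStart = System.currentTimeMillis();
>     int current = 0;
> 
>     void record(double value, long now) {
>         if (now - windowStart >= windowMs) {  // roll: clear and reuse the oldest slot
>             current = (current + 1) % numWindows;
>             totals[current] = 0.0;
>             windowStart = now;
>         }
>         totals[current] += value;
>     }
> 
>     // the estimate always combines the complete window(s) with the partial
>     // current one, so the most recent data is included
>     double ratePerSec(long now) {
>         double sum = 0.0;
>         for (double total : totals)
>             sum += total;
>         long elapsedMs = (numWindows - 1) * windowMs + (now - windowStart);
>         return sum / (elapsedMs / 1000.0);
>     }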
> 
> 5. I would definitely be interested to see the code if it is open source,
> since I am interested in metrics. Overall since you went down this path I
> would be interested to get your opinion on my code. If you think what you
> did is better I would be open to discussing it as a third alternative too.
> If we decide we do want to use this code for metrics then we may want to
> implement a sampling histogram either in addition to or as a replacement
> for the existing histograms, and if you were up for contributing your
> implementation that would be great.
> 
> -Jay
> 
> 
> On Sat, Feb 22, 2014 at 9:25 AM, Martin Kleppmann
> <mk...@linkedin.com> wrote:
> 
>> Not sure if you want yet another opinion added to the pile -- but since I
>> had a similar problem on another project recently, I thought I'd weigh in.
>> (On that project we were originally using Coda's library, but then switched
>> to rolling our own metrics implementation because we needed to do a few
>> things differently.)
>> 
>> 1. Problems we encountered with Coda's library: it uses an
>> exponentially-weighted moving average (EWMA) for rates (e.g. messages/sec),
>> and exponentially biased reservoir sampling for histograms (percentiles,
>> averages). Those methods of calculation work well for events with a
>> consistently high volume, but they give strange and misleading results for
>> events that are bursty or rare (e.g. error rates). We found that a fixed-size
>> window gives more predictable, easier-to-interpret results.
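>> 
>> For concreteness, the EWMA update at the heart of such a rate is roughly
>> (a sketch, not the library's actual code; the decay period is one of the
>> hard-coded constants mentioned later in this thread):
>> 
>>     // each tick, fold the latest interval's rate into the running average
>>     double alpha = 1.0 - Math.exp(-tickIntervalSecs / decayPeriodSecs);
>>     ewma += alpha * (instantRate - ewma);
>> 
>> A single burst therefore keeps bleeding into the reported rate long after
>> the events have stopped, which is part of what makes bursty or rare
>> metrics hard to read.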
>> 
>> 2. In defence of Coda's library, I think its histogram implementation is a
>> good trade-off of memory for accuracy; I'm not totally convinced that your
>> proposal (counts of events in a fixed set of buckets) would be much better.
>> Would have to do some math to work out the expected accuracy in each case.
>> The reservoir sampling can be configured to use a smaller sample if the
>> default of 1028 samples is too expensive. Reservoir sampling also has the
>> advantage that you don't need to hard-code a bucket distribution.
>> 
>> 3. Quotas are an interesting use case. However, I'm not wild about using a
>> QuotaViolationException for control flow -- I think an explicit conditional
>> would be nicer than having to catch an exception. One question in that
>> context: if a quota is exceeded, do you still want to count the event
>> towards the metric, or do you want to stop counting it until the quota is
>> replenished? The answer may depend on the particular metric.
>> 
>> 4. If you decide to go with Coda's library, I would advocate isolating the
>> dependency into a separate module and using it via a facade -- somewhat
>> like using SLF4J instead of Log4j directly. It's ok for Coda's library to
>> be the default metrics implementation, but it should be easy to swap it out
>> for something different in case someone has a version conflict or differing
>> requirements. The facade should be at a low level (individual events), not
>> at the reporter level (which deals with pre-aggregated values, and is
>> already pluggable).
>> 
>> 5. If it's useful, I can probably contribute my simple (but imho
>> effective) metrics library, for embedding into Kafka. It uses reservoir
>> sampling for percentiles, like Coda's library, but uses a fixed-size window
>> instead of an exponential bias, which avoids weird behaviour on bursty
>> metrics.
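>> 
>> For reference, the fixed-size reservoir update itself -- Vitter's
>> Algorithm R -- is only a few lines (a sketch, not either library's
>> actual code):
>> 
>>     double[] reservoir = new double[1028]; // the default sample size
>>     long seen = 0;
>>     java.util.Random rng = new java.util.Random();
>> 
>>     void update(double value) {
>>         seen++;
>>         if (seen <= reservoir.length) {
>>             reservoir[(int) (seen - 1)] = value;       // fill phase
>>         } else {
>>             long i = (long) (rng.nextDouble() * seen); // uniform in [0, seen)
>>             if (i < reservoir.length)
>>                 reservoir[(int) i] = value;            // replace with prob. k/seen
>>         }
>>     }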
>> 
>> In summary, I would advocate one of the following approaches:
>> - Coda Hale library via facade (allowing it to be swapped for something
>> else), or
>> - Own metrics implementation, provided that we have confidence in its
>> implementation of percentiles.
>> 
>> Martin
>> 
>> 
>> On 22 Feb 2014, at 01:06, Jay Kreps <ja...@gmail.com> wrote:
>>> Hey guys,
>>> 
>>> Just picking up this thread again. I do want to drive a conclusion as I
>>> will run out of work to do on the producer soon and will need to add
>>> metrics of some sort. We can vote on it, but I'm not sure if we actually
>>> got everything discussed.
>>> 
>>> Joel, I wasn't fully sure how to interpret your comment. I think you are
>>> saying you are cool with the new metrics package as long as it really is
>>> better. Do you have any comment on whether you think the benefits I
>>> outlined are worth it? I agree with you that we could hold off on a
>> second
>>> repo until someone else would actually want to use our code.
>>> 
>>> Jun, I'm not averse to doing a sampling-based histogram and doing some
>>> comparison between the two approaches if you think this approach is
>>> otherwise better.
>>> 
>>> Sriram, originally I thought you preferred just sticking to Coda Hale,
>> but
>>> after your follow-up email I wasn't really sure...
>>> 
>>> Joe/Clark, yes this code allows pluggable reporting so you could have a
>>> metrics reporter that just wraps each metric in a Coda Hale Gauge if that
>>> is useful. Though obviously if enough people were doing that I would
>> think
>>> it would be worth just using the Coda Hale package directly...
>>> 
>>> -Jay
>>> 
>>> 
>>> 
>>> 
>>> On Thu, Feb 13, 2014 at 3:34 PM, Clark Breyman <cl...@breyman.com>
>> wrote:
>>> 
>>>> Not requiring the client to link Coda/Yammer metrics sounds like a
>>>> compelling reason to pivot to new interfaces. If that's the agreed
>>>> direction, I'm hoping that we'd get the choice of backend to provide
>> (e.g.
>>>> facade on Yammer metrics for those with an investment in that) rather
>> than
>>>> force the new backend.  Having a metrics factory seems better for this
>> than
>>>> directly instantiating the singleton registry.
>>>> 
>>>> 
>>>> On Thu, Feb 13, 2014 at 2:39 PM, Joe Stein <jo...@stealth.ly>
>> wrote:
>>>> 
>>>>> Can we leave metrics and have multiple supported KafkaMetricsGroup
>>>>> implementing a yammer based implementation?
>>>>> 
>>>>> ProducerRequestStats with your configured analytics group?
>>>>> 
>>>>> On Thu, Feb 13, 2014 at 11:37 AM, Jay Kreps <ja...@gmail.com>
>> wrote:
>>>>> 
>>>>>> I think we discussed the scala/java stuff more fully previously.
>>>>>> Essentially the client is embedded everywhere. Scala is very
>>>> incompatible
>>>>>> with itself so this makes it very hard to use for people using
>> anything
>>>>>> else in scala. Also Scala stack traces are very confusing. Basically
>> we
>>>>>> thought plain java code would be a lot easier for people to use. Even
>>>> if
>>>>>> Scala is more fun to write, that isn't really what we are optimizing
>>>> for.
>>>>>> 
>>>>>> -Jay
>>>>>> 
>>>>>> 
>>>>>> On Thu, Feb 13, 2014 at 8:09 AM, S Ahmed <sa...@gmail.com>
>> wrote:
>>>>>> 
>>>>>>> Jay, pretty impressive how you just write a 'quick version' like that
>>>>> :)
>>>>>>> Not to get off-topic but why didn't you write this in scala?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Feb 12, 2014 at 6:54 PM, Joel Koshy <jj...@gmail.com>
>>>>> wrote:
>>>>>>> 
>>>>>>>> I have not had a chance to review the new metrics code and its
>>>>>>>> features carefully (apart from your write-up), but here are my
>>>>> general
>>>>>>>> thoughts:
>>>>>>>> 
>>>>>>>> Implementing a metrics package correctly is difficult; more so for
>>>>>>>> people like me, because I'm not a statistician.  However, if this
>>>> new
>>>>>>>> package: {(i) functions correctly (and we need to define and prove
>>>>>>>> correctness), (ii) is easy to use, (iii) serves all our current and
>>>>>>>> anticipated monitoring needs, (iv) is not so complex that it
>>>>>>>> becomes a burden to maintain and we are better off with an available
>>>>>>>> library;} then I think it makes sense to embed it and use it within
>>>>>>>> the Kafka code. The main wins are: (i) predictability (no changing
>>>>>>>> APIs and intimate knowledge of the code) and (ii) control with
>>>>> respect
>>>>>>>> to both functionality (e.g., there are hard-coded decay constants
>>>> in
>>>>>>>> metrics-core 2.x) and correctness (i.e., if we find a bug in the
>>>>>>>> metrics package we have to submit a pull request and wait for it to
>>>>>>>> become mainstream).  I'm not sure it would help very much to pull
>>>> it
>>>>>>>> into a separate repo because that could potentially annul these
>>>>>>>> benefits.
>>>>>>>> 
>>>>>>>> Joel
>>>>>>>> 
>>>>>>>> On Wed, Feb 12, 2014 at 02:50:43PM -0800, Jay Kreps wrote:
>>>>>>>>> Sriram,
>>>>>>>>> 
>>>>>>>>> Makes sense. I am cool moving this stuff into its own repo if
>>>>> people
>>>>>>>> think
>>>>>>>>> that is better. I'm not sure it would get much contribution but
>>>>> when
>>>>>> I
>>>>>>>>> started messing with this I did have a lot of grand ideas of
>>>> making
>>>>>>>> adding
>>>>>>>>> metrics to a sensor dynamic so you could add more stuff in
>>>>>>> real-time(via
>>>>>>>>> jmx, say) and/or externalize all your metrics and config to a
>>>>>> separate
>>>>>>>> file
>>>>>>>>> like log4j with only the points of instrumentation hard-coded.
>>>>>>>>> 
>>>>>>>>> -Jay
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Feb 12, 2014 at 2:07 PM, Sriram Subramanian <
>>>>>>>>> srsubramanian@linkedin.com> wrote:
>>>>>>>>> 
>>>>>>>>>> I am actually neutral to this change. I found the replies were
>>>>> more
>>>>>>>>>> towards the implementation and features so far. I would like
>>>> the
>>>>>>>> community
>>>>>>>>>> to think about the questions below before making a decision. My
>>>>>>>> opinion on
>>>>>>>>>> this is that it has potential to be its own project and it
>>>> would
>>>>>>>> attract
>>>>>>>>>> developers who are specifically interested in contributing to
>>>>>>> metrics.
>>>>>>>> I
>>>>>>>>>> am skeptical that the Kafka contributors would focus on
>>>> improving
>>>>>>> this
>>>>>>>>>> library (apart from bug fixes) instead of
>>>> developing/contributing
>>>>>> to
>>>>>>>> other
>>>>>>>>>> core pieces. It would be useful to continue and keep it
>>>> decoupled
>>>>>>> from
>>>>>>>>>> the rest of Kafka (if it resides in the Kafka code base) so that
>>>> we
>>>>>> can
>>>>>>>> move
>>>>>>>>>> it out anytime to its own project.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 2/12/14 1:21 PM, "Jay Kreps" <ja...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hey Sriram,
>>>>>>>>>>> 
>>>>>>>>>>> Not sure if these are actually meant as questions or more
>>>> veiled
>>>>>>>> comments.
>>>>>>>>>>> In any case I tried to give my 2 cents inline.
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian <
>>>>>>>>>>> srsubramanian@linkedin.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> I think answering the questions below would help to make a
>>>>>> better
>>>>>>>>>>>> decision. I am all for writing better code and having
>>>> superior
>>>>>>>>>>>> functionalities but it is worth thinking about stuff outside
>>>>>> just
>>>>>>>> code
>>>>>>>>>>>> in
>>>>>>>>>>>> this case -
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. Does metric form a core piece of kafka? Does it help
>>>> kafka
>>>>>>>> greatly in
>>>>>>>>>>>> providing better core functionalities? I would always like a
>>>>>>>> project to
>>>>>>>>>>>> do
>>>>>>>>>>>> one thing really well. Metrics is a non-trivial amount of
>>>>> code.
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Metrics are obviously important, and obviously improving our
>>>>>> metrics
>>>>>>>>>>> system
>>>>>>>>>>> would be good. That said this may or may not be better, and
>>>> even
>>>>>> if
>>>>>>>> it is
>>>>>>>>>>> better that betterness might not outweigh other
>>>> considerations.
>>>>>> That
>>>>>>>> is
>>>>>>>>>>> what we are discussing.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> 2. Does it make sense to be part of Kafka or its own
>>>> project?
>>>>> If
>>>>>>>> this
>>>>>>>>>>>> metrics library has the potential to be better than
>>>>>> metrics-core,
>>>>>>> I
>>>>>>>>>>>> would
>>>>>>>>>>>> be interested in other projects taking advantage of it.
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> It could be either.
>>>>>>>>>>> 
>>>>>>>>>>> 3. Can Kafka maintain this library as new members join and old
>>>>>>> members
>>>>>>>>>>>> leave? Would this be a piece of code that no one (in Kafka)
>>>> in
>>>>>> the
>>>>>>>>>>>> future
>>>>>>>>>>>> spends time improving if the original author left?
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> I am not going anywhere in the near term, but if I did, yes,
>>>>> this
>>>>>>>> would be
>>>>>>>>>>> like any other code we have. As with yammer metrics or any
>>>> other
>>>>>>> code
>>>>>>>> at
>>>>>>>>>>> that point we would either use it as is or someone would
>>>> improve
>>>>>> it.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> 4. Does it affect the schedule of producer rewrite? This
>>>> needs
>>>>>> its
>>>>>>>> own
>>>>>>>>>>>> stabilization and modification to existing metric dashboards
>>>>> if
>>>>>>> the
>>>>>>>>>>>> format
>>>>>>>>>>>> is changed. Many times such costs are not factored in and a
>>>>>> project
>>>>>>>> loses
>>>>>>>>>>>> time before realizing the extra time required to make a
>>>>> library
>>>>>> like
>>>>>>>> this
>>>>>>>>>>>> operational.
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Probably not. The metrics are going to change regardless of
>>>>>> whether
>>>>>>>> we use
>>>>>>>>>>> the same library or not. If we think this is better I don't
>>>> mind
>>>>>>>> putting
>>>>>>>>>>> in
>>>>>>>>>>> a little extra effort to get there.
>>>>>>>>>>> 
>>>>>>>>>>> Irrespective I think this is probably not the right thing to
>>>>>>> optimize
>>>>>>>> for.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> I am sure we can do better when we write code to a specific
>>>>> use
>>>>>>>> case (in
>>>>>>>>>>>> this case, kafka) rather than building a generic library
>>>> that
>>>>>>> suits
>>>>>>>> all
>>>>>>>>>>>> (metrics-core) but I would like us to have answers to the
>>>>>>> questions
>>>>>>>>>>>> above
>>>>>>>>>>>> and be prepared before we proceed to support this with the
>>>>>>> producer
>>>>>>>>>>>> rewrite.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Naturally we are all considering exactly these things, that is
>>>>>>>> exactly the
>>>>>>>>>>> reason I started the thread.
>>>>>>>>>>> 
>>>>>>>>>>> -Jay
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On 2/11/14 6:28 PM, "Jun Rao" <ju...@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for the detailed write-up. It's well thought
>>>> through.
>>>>> A
>>>>>>> few
>>>>>>>>>>>>> comments:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 1. I have a couple of concerns on the percentiles. The
>>>> first
>>>>>>> issue
>>>>>>>> is
>>>>>>>>>>>> that
>>>>>>>>>>>>> it requires the user to know the value range. Since the
>>>> range
>>>>>> for
>>>>>>>>>>>> things
>>>>>>>>>>>>> like message size (in millions) is quite different from
>>>> those
>>>>>>> like
>>>>>>>>>>>> request
>>>>>>>>>>>>> time (less than 100), it's going to be hard to pick a good
>>>>>> global
>>>>>>>>>>>> default
>>>>>>>>>>>>> range. Different apps could be dealing with different
>>>> message
>>>>>>>> sizes. So
>>>>>>>>>>>>> they
>>>>>>>>>>>>> probably will have to customize the range. Another issue is
>>>>>> that
>>>>>>>> it can
>>>>>>>>>>>>> only report values at the bucket boundaries. So, if you
>>>> have
>>>>>> 1000
>>>>>>>>>>>> buckets
>>>>>>>>>>>>> and a value range of 1 million, you will only see 1000
>>>>> possible
>>>>>>>> values
>>>>>>>>>>>> as
>>>>>>>>>>>>> the quantile, which is probably too sparse. The
>>>>> implementation
>>>>>> of
>>>>>>>>>>>>> histogram
>>>>>>>>>>>>> in metrics-core keeps a fixed number of samples, which avoids
>>>>> both
>>>>>>>> issues.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 2. We need to document the 3-part metrics names better
>>>> since
>>>>>> it's
>>>>>>>> not
>>>>>>>>>>>>> obvious what the convention is. Also, currently the name of
>>>>> the
>>>>>>>> sensor
>>>>>>>>>>>> and
>>>>>>>>>>>>> the metrics defined in it are independent. Would it make
>>>>> sense
>>>>>> to
>>>>>>>> have
>>>>>>>>>>>> the
>>>>>>>>>>>>> sensor name be a prefix of the metric name?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Overall, this approach seems to be cleaner than
>>>> metrics-core
>>>>> by
>>>>>>>>>>>> decoupling
>>>>>>>>>>>>> measuring and reporting. The main benefit of metrics-core
>>>>> seems
>>>>>>> to
>>>>>>>> be
>>>>>>>>>>>> the
>>>>>>>>>>>>> existing reporters. Since not that many people voted for
>>>>>>>> metrics-core,
>>>>>>>>>>>> I
>>>>>>>>>>>>> am
>>>>>>>>>>>>> ok with going with the new implementation. My only
>>>>>> recommendation
>>>>>>>> is to
>>>>>>>>>>>>> address the concern on percentiles.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Jun
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <
>>>>>> jay.kreps@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey guys,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I wanted to kick off a quick discussion of metrics with
>>>>>> respect
>>>>>>>> to
>>>>>>>>>>>> the
>>>>>>>>>>>>>> new
>>>>>>>>>>>>>> producer and consumer (and potentially the server).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> At a high level I think there are three approaches we
>>>> could
>>>>>>> take:
>>>>>>>>>>>>>> 1. Plain vanilla JMX
>>>>>>>>>>>>>> 2. Use Coda Hale (AKA Yammer) Metrics
>>>>>>>>>>>>>> 3. Do our own metrics (with JMX as one output)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. Has the advantage that JMX is the most commonly used
>>>>> java
>>>>>>>> thing
>>>>>>>>>>>> and
>>>>>>>>>>>>>> plugs in reasonably to most metrics systems. JMX is
>>>>> included
>>>>>> in
>>>>>>>> the
>>>>>>>>>>>> JDK
>>>>>>>>>>>>>> so
>>>>>>>>>>>>>> it doesn't impose any additional dependencies on clients.
>>>>> It
>>>>>>> has
>>>>>>>> the
>>>>>>>>>>>>>> disadvantage that plain vanilla JMX is a pain to use. We
>>>>>> would
>>>>>>>> need a
>>>>>>>>>>>>>> bunch
>>>>>>>>>>>>>> of helper code for maintaining counters to make this
>>>>>>> reasonable.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2. Coda Hale metrics is pretty good and broadly used. It
>>>>>>>> supports JMX
>>>>>>>>>>>>>> output as well as direct output to many other types of
>>>>>> systems.
>>>>>>>> The
>>>>>>>>>>>>>> primary
>>>>>>>>>>>>>> downside we have had with Coda Hale has to do with the
>>>>>> clients
>>>>>>>> and
>>>>>>>>>>>>>> library
>>>>>>>>>>>>>> incompatibilities. We are currently on an older more
>>>>> popular
>>>>>>>> version.
>>>>>>>>>>>>>> The
>>>>>>>>>>>>>> newer version is a rewrite of the APIs and is
>>>> incompatible.
>>>>>>>>>>>> Originally
>>>>>>>>>>>>>> these were totally incompatible and people had to choose
>>>>> one
>>>>>> or
>>>>>>>> the
>>>>>>>>>>>>>> other.
>>>>>>>>>>>>>> I think that has been improved so now the new version is
>>>> a
>>>>>>>> totally
>>>>>>>>>>>>>> different package. But even in this case you end up with
>>>>> both
>>>>>>>>>>>> versions
>>>>>>>>>>>>>> if
>>>>>>>>>>>>>> you use Kafka and we are on a different version than you
>>>>>> which
>>>>>>> is
>>>>>>>>>>>> going
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> be pretty inconvenient.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 3. Doing our own has the downside of potentially
>>>>> reinventing
>>>>>>> the
>>>>>>>>>>>> wheel,
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> potentially needing to work out any bugs in our code. The
>>>>>>> upsides
>>>>>>>>>>>> would
>>>>>>>>>>>>>> depend on how good the reinvention was. As it
>>>> happens I
>>>>>>> did a
>>>>>>>>>>>> quick
>>>>>>>>>>>>>> (~900 loc) version of a metrics library that is under
>>>>>>>>>>>>>> kafka.common.metrics.
>>>>>>>>>>>>>> I think it has some advantages over the Yammer metrics
>>>>>> package
>>>>>>>> for
>>>>>>>>>>>> our
>>>>>>>>>>>>>> usage beyond just not causing incompatibilities. I will
>>>>>>> describe
>>>>>>>> this
>>>>>>>>>>>>>> code
>>>>>>>>>>>>>> so we can discuss the pros and cons. Although I favor
>>>> this
>>>>>>>> approach I
>>>>>>>>>>>>>> have
>>>>>>>>>>>>>> no emotional attachment and wouldn't be too sad if I
>>>> ended
>>>>> up
>>>>>>>>>>>> deleting
>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>> Here are javadocs for this code, though I haven't written
>>>>>> much
>>>>>>>>>>>>>> documentation yet since I might end up deleting it:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Here is a quick overview of this library.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> There are three main public interfaces:
>>>>>>>>>>>>>> Metrics - This is a repository of metrics being
>>>> tracked.
>>>>>>>>>>>>>> Metric - A single, named numerical value being measured
>>>>>>> (i.e. a
>>>>>>>>>>>>>> counter).
>>>>>>>>>>>>>> Sensor - This is a thing that records values and
>>>> updates
>>>>>> zero
>>>>>>>> or
>>>>>>>>>>>> more
>>>>>>>>>>>>>> metrics
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So let's say we want to track three values about message
>>>>>> sizes;
>>>>>>>>>>>>>> specifically say we want to record the average, the
>>>>> maximum,
>>>>>>> the
>>>>>>>>>>>> total
>>>>>>>>>>>>>> rate
>>>>>>>>>>>>>> of bytes being sent, and a count of messages. Then we
>>>> would
>>>>>> do
>>>>>>>>>>>> something
>>>>>>>>>>>>>> like this:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>  // setup code
>>>>>>>>>>>>>>  Metrics metrics = new Metrics(); // this is a global
>>>>>>>> "singleton"
>>>>>>>>>>>>>>  Sensor sensor =
>>>>>>>> metrics.sensor("kafka.producer.message.sizes");
>>>>>>>>>>>>>>  sensor.add("kafka.producer.message-size.avg", new
>>>>> Avg());
>>>>>>>>>>>>>>  sensor.add("kafka.producer.message-size.max", new
>>>>> Max());
>>>>>>>>>>>>>>  sensor.add("kafka.producer.bytes-sent-per-sec", new
>>>>>> Rate());
>>>>>>>>>>>>>>  sensor.add("kafka.producer.message-count", new
>>>> Count());
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>  // now when we get a message we do this
>>>>>>>>>>>>>>  sensor.record(messageSize);
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The above code creates the global metrics repository,
>>>>>> creates a
>>>>>>>>>>>> single
>>>>>>>>>>>>>> Sensor, and defines 5 named metrics that are updated by
>>>>> that
>>>>>>>> Sensor.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Like Yammer Metrics (YM) I allow you to plug in
>>>>> "reporters",
>>>>>>>>>>>> including a
>>>>>>>>>>>>>> JMX reporter. Unlike the Coda Hale JMX reporter the
>>>>> reporter
>>>>>> I
>>>>>>>> have
>>>>>>>>>>>> keys
>>>>>>>>>>>>>> off the metric names not the Sensor names, which I think
>>>> is
>>>>>> an
>>>>>>>>>>>>>> improvement--I just use the convention that the last
>>>>> portion
>>>>>> of
>>>>>>>> the
>>>>>>>>>>>>>> name is
>>>>>>>>>>>>>> the attribute name, the second to last is the mbean name,
>>>>> and
>>>>>>> the
>>>>>>>>>>>> rest
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> the package. So in the above example there is a producer
>>>>>> mbean
>>>>>>>> that
>>>>>>>>>>>> has
>>>>>>>>>>>>>> an
>>>>>>>>>>>>>> avg and max attribute and a producer mbean that has a
>>>>>>>>>>>> bytes-sent-per-sec
>>>>>>>>>>>>>> and message-count attribute. This is nice because you can
>>>>>>>> logically
>>>>>>>>>>>>>> group
>>>>>>>>>>>>>> the values reported irrespective of where in the program
>>>>> they
>>>>>>> are
>>>>>>>>>>>>>> computed--that is an mbean can logically group attributes
>>>>>>>> computed
>>>>>>>>>>>> off
>>>>>>>>>>>>>> different sensors. This means you can report values by
>>>>>> logical
>>>>>>>>>>>>>> subsystem.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I also allow the concept of hierarchical Sensors which I
>>>>>> think
>>>>>>>> is a
>>>>>>>>>>>> good
>>>>>>>>>>>>>> convenience. I have noticed a common pattern in systems
>>>>> where
>>>>>>> you
>>>>>>>>>>>> need
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> roll up the same values along different dimensions. An
>>>>> simple
>>>>>>>>>>>> example is
>>>>>>>>>>>>>> metrics about qps, data rate, etc on the broker. These we
>>>>>> want
>>>>>>> to
>>>>>>>>>>>>>> capture
>>>>>>>>>>>>>> in aggregate, but also broken down by topic-id. You can
>>>> do
>>>>>> this
>>>>>>>>>>>> purely
>>>>>>>>>>>>>> by
>>>>>>>>>>>>>> defining the sensor hierarchy:
>>>>>>>>>>>>>> Sensor allSizes = metrics.sensor("kafka.producer.sizes");
>>>>>>>>>>>>>> Sensor topicSizes = metrics.sensor("kafka.producer." +
>>>>> topic
>>>>>> +
>>>>>>>>>>>>>> ".sizes",
>>>>>>>>>>>>>> allSizes);
>>>>>>>>>>>>>> Now each actual update will go to the appropriate
>>>>> topicSizes
>>>>>>>> sensor
>>>>>>>>>>>>>> (based
>>>>>>>>>>>>>> on the topic name), but allSizes metrics will get updated
>>>>>> too.
>>>>>>> I
>>>>>>>> also
>>>>>>>>>>>>>> support multiple parents for each sensor as well as
>>>>> multiple
>>>>>>>> layers
>>>>>>>>>>>> of
>>>>>>>>>>>>>> hiearchy, so you can define a more elaborate DAG of
>>>>> sensors.
>>>>>> An
>>>>>>>>>>>> example
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>> how this would be useful is if you wanted to record your
>>>>>>> metrics
>>>>>>>>>>>> broken
>>>>>>>>>>>>>> down by topic AND client id as well as the global
>>>>> aggregate.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Each metric can take a configurable Quota value which
>>>>> allows
>>>>>> us
>>>>>>>> to
>>>>>>>>>>>> limit
>>>>>>>>>>>>>> the maximum value of that sensor. This is intended for
>>>> use
>>>>> on
>>>>>>> the
>>>>>>>>>>>>>> server as
>>>>>>>>>>>>>> part of our Quota implementation. The way this works is
>>>>> that
>>>>>>> you
>>>>>>>>>>>> record
>>>>>>>>>>>>>> metrics as usual:
>>>>>>>>>>>>>>  mySensor.record(42.0)
>>>>>>>>>>>>>> However if this event occurance causes one of the metrics
>>>>> to
>>>>>>>> exceed
>>>>>>>>>>>> its
>>>>>>>>>>>>>> maximum allowable value (the quota) this call will throw
>>>> a
>>>>>>>>>>>>>> QuotaViolationException. The cool thing about this is
>>>> that
>>>>> it
>>>>>>>> means
>>>>>>>>>>>> we
>>>>>>>>>>>>>> can
>>>>>>>>>>>>>> define quotas on anything we capture metrics for, which I
>>>>>> think
>>>>>>>> is
>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>> cool.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Another question is how to handle windowing of the
>>>> values?
>>>>>>>> Metrics
>>>>>>>>>>>> want
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> record the "current" value, but the definition of current
>>>>> is
>>>>>>>>>>>> inherently
>>>>>>>>>>>>>> nebulous. A few of the obvious gotchas are that if you
>>>>> define
>>>>>>>>>>>> "current"
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> be a number of events you can end up measuring an
>>>>> arbitrarily
>>>>>>>> long
>>>>>>>>>>>>>> window
>>>>>>>>>>>>>> of time if the event rate is low (e.g. you think you are
>>>>>>> getting
>>>>>>>> 50
>>>>>>>>>>>>>> messages/sec because that was the rate yesterday when all
>>>>>>> events
>>>>>>>>>>>>>> topped).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Here is how I approach this. All the metrics use the same
>>>>>>>> windowing
>>>>>>>>>>>>>> approach. We define a single window by a length of time
>>>> or
>>>>>>>> number of
>>>>>>>>>>>>>> values
>>>>>>>>>>>>>> (you can use either or both--if both the window ends when
>>>>>>>> *either*
>>>>>>>>>>>> the
>>>>>>>>>>>>>> time
>>>>>>>>>>>>>> bound or event bound is hit). The typical problem with
>>>> hard
>>>>>>>> window
>>>>>>>>>>>>>> boundaries is that at the beginning of the window you
>>>> have
>>>>> no
>>>>>>>> data
>>>>>>>>>>>> and
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> first few samples are too small to be a valid sample.
>>>>>> (Consider
>>>>>>>> if
>>>>>>>>>>>> you
>>>>>>>>>>>>>> were
>>>>>>>>>>>>>> keeping an avg and the first value in the window happens
>>>> to
>>>>>> be
>>>>>>>> very
>>>>>>>>>>>> very
>>>>>>>>>>>>>> high, if you check the avg at this exact time you will
>>>>>> conclude
>>>>>>>> the
>>>>>>>>>>>> avg
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> very high but on a sample size of one). One simple fix
>>>>> would
>>>>>> be
>>>>>>>> to
>>>>>>>>>>>>>> always
>>>>>>>>>>>>>> report the last complete window, however this is not
>>>>>>> appropriate
>>>>>>>> here
>>>>>>>>>>>>>> because (1) we want to drive quotas off it so it needs to
>>>>> be
>>>>>>>> current,
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> (2) since this is for monitoring you kind of care more
>>>>> about
>>>>>>> the
>>>>>>>>>>>> current
>>>>>>>>>>>>>> state. The ideal solution here would be to define a
>>>>> backwards
>>>>>>>> looking
>>>>>>>>>>>>>> sliding window from the present, but many statistics are
>>>>>>> actually
>>>>>>>>>>>> very
>>>>>>>>>>>>>> hard
>>>>>>>>>>>>>> to compute in this model without retaining all the values
>>>>>> which
>>>>>>>>>>>> would be
>>>>>>>>>>>>>> hopelessly inefficient. My solution to this is to keep a
>>>>>>>> configurable
>>>>>>>>>>>>>> number of windows (default is two) and combine them for
>>>> the
>>>>>>>> estimate.
>>>>>>>>>>>>>> So in
>>>>>>>>>>>>>> a two sample case depending on when you ask you have
>>>>> between
>>>>>>> one
>>>>>>>> and
>>>>>>>>>>>> two
>>>>>>>>>>>>>> complete samples worth of data to base the answer off of.
>>>>>>>> Provided
>>>>>>>>>>>> the
>>>>>>>>>>>>>> sample window is large enough to get a valid result this
>>>>>>>> satisfies
>>>>>>>>>>>> both
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>> my criteria of incorporating the most recent data and
>>>>> having
>>>>>>>>>>>> reasonable
>>>>>>>>>>>>>> variance at all times.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Another approach is to use an exponential weighting
>>>> scheme
>>>>> to
>>>>>>>> combine
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>> history but emphasize the recent past. I have not done
>>>> this
>>>>>> as
>>>>>>> it
>>>>>>>>>>>> has a
>>>>>>>>>>>>>> lot
>>>>>>>>>>>>>> of issues for practical operational metrics. I'd be happy
>>>>> to
>>>>>>>>>>>> elaborate
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>> this if anyone cares...
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The window size for metrics has a global default which
>>>> can
>>>>> be
>>>>>>>>>>>>>> overridden at
>>>>>>>>>>>>>> either the sensor or individual metric level.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In addition to these time series values the user can
>>>>> directly
>>>>>>>> expose
>>>>>>>>>>>>>> some
>>>>>>>>>>>>>> method of their choosing JMX-style by implementing the
>>>>>>> Measurable
>>>>>>>>>>>>>> interface
>>>>>>>>>>>>>> and registering that value. E.g.
>>>>>>>>>>>>>> metrics.addMetric("my.metric", new Measurable() {
>>>>>>>>>>>>>>   public double measure(MetricConfg config, long now) {
>>>>>>>>>>>>>>      return this.calculateValueToExpose();
>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>> });
>>>>>>>>>>>>>> This is useful for exposing things like the accumulator
>>>>> free
>>>>>>>> memory.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The set of metrics is extensible, new metrics can be
>>>> added
>>>>> by
>>>>>>>> just
>>>>>>>>>>>>>> implementing the appropriate interfaces and registering
>>>>> with
>>>>>> a
>>>>>>>>>>>> sensor. I
>>>>>>>>>>>>>> implement the following metrics:
>>>>>>>>>>>>>> total - the sum of all values from the given sensor
>>>>>>>>>>>>>> count - a windowed count of values from the sensor
>>>>>>>>>>>>>> avg - the sample average within the windows
>>>>>>>>>>>>>> max - the max over the windows
>>>>>>>>>>>>>> min - the min over the windows
>>>>>>>>>>>>>> rate - the rate in the windows (e.g. the total or count
>>>>>>>> divided by
>>>>>>>>>>>> the
>>>>>>>>>>>>>> ellapsed time)
>>>>>>>>>>>>>> percentiles - a collection of percentiles computed over
>>>>> the
>>>>>>>> window
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> My approach to percentiles is a little different from the
>>>>>>> yammer
>>>>>>>>>>>> metrics
>>>>>>>>>>>>>> package. My complaint about the yammer metrics approach
>>>> is
>>>>>> that
>>>>>>>> it
>>>>>>>>>>>> uses
>>>>>>>>>>>>>> rather expensive sampling and uses kind of a lot of
>>>> memory
>>>>> to
>>>>>>>> get a
>>>>>>>>>>>>>> reasonable sample. This is problematic for per-topic
>>>>>>>> measurements.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Instead I use a fixed range for the histogram (e.g. 0.0
>>>> to
>>>>>>>> 30000.0)
>>>>>>>>>>>>>> which
>>>>>>>>>>>>>> directly allows you to specify the desired memory use.
>>>> Any
>>>>>>> value
>>>>>>>>>>>> below
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> minimum is recorded as -Infinity and any value above the
>>>>>>> maximum
>>>>>>>> as
>>>>>>>>>>>>>> +Infinity. I think this is okay as all metrics have an
>>>>>> expected
>>>>>>>> range
>>>>>>>>>>>>>> except for latency which can be arbitrarily large, but
>>>> for
>>>>>> very
>>>>>>>> high
>>>>>>>>>>>>>> latency there is no need to model it exactly (e.g. 30
>>>>>> seconds +
>>>>>>>>>>>> really
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> effectively infinite). Within the range values are
>>>> recorded
>>>>>> in
>>>>>>>>>>>> buckets
>>>>>>>>>>>>>> which can be either fixed width or increasing width. The
>>>>>>>> increasing
>>>>>>>>>>>>>> width
>>>>>>>>>>>>>> is analogous to the idea of significant figures, that is
>>>> if
>>>>>>> your
>>>>>>>>>>>> value
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> in the range 0-10 you might want to be accurate to within
>>>>>> 1ms,
>>>>>>>> but if
>>>>>>>>>>>>>> it is
>>>>>>>>>>>>>> 20000 there is no need to be so accurate. I implemented a
>>>>>>> linear
>>>>>>>>>>>> bucket
>>>>>>>>>>>>>> size where the Nth bucket has width proportional to N. An
>>>>>>>> exponential
>>>>>>>>>>>>>> bucket size would also be sensible and could likely be
>>>>>> derived
>>>>>>>>>>>> directly
>>>>>>>>>>>>>> from the floating point representation of a the value.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'd like to get some feedback on this metrics code and
>>>>> make a
>>>>>>>>>>>> decision
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>> whether we want to use it before I actually go ahead and
>>>>> add
>>>>>>> all
>>>>>>>> the
>>>>>>>>>>>>>> instrumentation in the code (otherwise I'll have to redo
>>>> it
>>>>>> if
>>>>>>> we
>>>>>>>>>>>> switch
>>>>>>>>>>>>>> approaches). So the next topic of discussion will be
>>>> which
>>>>>>> actual
>>>>>>>>>>>>>> metrics
>>>>>>>>>>>>>> to add.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Jay
>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 
>> 


Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
Hey Martin,

Thanks for the great feedback.

1. I agree with the problems of mixing moving window statistics with fixed
window statistics. That was one of my rationales. The other is that
weighted statistics are very unintuitive for people compared to simple
things like averages and percentiles, so they fail a bit as an intuitive
monitoring mechanism. I actually think the moving windows are technically
superior since they don't have hard boundaries, but a naive implementation
based solely on events is actually totally wrong for the reasons you
describe: each value's weight in the average needs to take into account how
far in the past that value occurred relative to the time of the estimate.
This is an interesting problem and I started to think about it but then
decided that if I kept thinking about it I would never get anything
finished. When I retire I plan to write a metrics library based solely on
continuously weighted averages. :-)
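
To illustrate, here is a rough sketch of what a continuously weighted
average might look like (an illustration of the idea only, not code from
the patch; the class name and the half-life parameter are made up):

   // A sketch of an average in which every prior sample's weight decays
   // continuously with elapsed time, so the estimate accounts for *when*
   // values arrived, not just how many arrived.
   public class TimeDecayedAvg {
       private final double halfLifeMs; // caller-chosen decay half-life
       private double weightedSum = 0.0;
       private double totalWeight = 0.0;
       private long lastUpdateMs = -1;

       public TimeDecayedAvg(double halfLifeMs) {
           this.halfLifeMs = halfLifeMs;
       }

       public synchronized void record(double value, long nowMs) {
           if (lastUpdateMs >= 0) {
               // decay all previously accumulated weight by elapsed time
               double decay = Math.pow(0.5, (nowMs - lastUpdateMs) / halfLifeMs);
               weightedSum *= decay;
               totalWeight *= decay;
           }
           weightedSum += value;
           totalWeight += 1.0;
           lastUpdateMs = nowMs;
       }

       public synchronized double measure() {
           return totalWeight == 0.0 ? Double.NaN : weightedSum / totalWeight;
       }
   }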

2. Fair. To be clear, you needn't encode a distribution, just your
preference about accuracy in the measurement. You are saying either "I care
equally about accuracy in the whole range" or "I don't care about fine
grained accuracy when the numbers themselves are large".
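
For concreteness, the arithmetic behind linear bucket sizing (where the Nth
bucket's width is proportional to N) can be sketched as follows; this is an
illustration of the scheme, not the actual implementation, and the names
are invented (out-of-range values mapping to +/-Infinity is handled
elsewhere):

   // Buckets over [0, max): bucket n has width proportional to n + 1, so
   // the histogram is fine-grained near zero and coarse near the top.
   public class LinearBuckets {
       private final double scale;   // width of the first (smallest) bucket
       private final int numBuckets;

       public LinearBuckets(double max, int numBuckets) {
           this.numBuckets = numBuckets;
           // total width is scale * B(B+1)/2, which must equal max
           this.scale = max / (numBuckets * (numBuckets + 1) / 2.0);
       }

       // upper boundary of bucket n (0-indexed): scale * (n+1)(n+2)/2
       public double upperBound(int n) {
           return scale * (n + 1) * (n + 2) / 2.0;
       }

       // smallest bucket whose upper boundary is >= value
       public int bucketFor(double value) {
           int n = (int) Math.ceil((-3.0 + Math.sqrt(1.0 + 8.0 * value / scale)) / 2.0);
           return Math.min(Math.max(n, 0), numBuckets - 1);
       }
   }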

3. The reason the exception is good is because the actual quota may be
enforced low down in some part of the system, but a quota violation always
needs to unwind all the way back up to the API layer to return the error to
the client. So an exception is actually just what you need because the
catch will potentially be in a different place than the record() call. This
lets you introduce quota enforcement without each subsystem really needing
to know about it.
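
Concretely, the server-side shape is something like the following sketch
(the handler class and its methods are invented for illustration; only
Sensor and QuotaViolationException come from the proposal):

   // The API layer is the only place that needs to know about quotas; the
   // record() call that throws may be buried in quota-agnostic code.
   public class ProduceHandler {
       private final Sensor bytesInSensor; // a sensor with a Quota attached

       public ProduceHandler(Sensor bytesInSensor) {
           this.bytesInSensor = bytesInSensor;
       }

       public void handle(byte[] request) {
           try {
               append(request);                 // quota-agnostic subsystem
               // ... send the normal response ...
           } catch (QuotaViolationException e) {
               // unwound all the way up from the record() call below
               // ... return a "quota violated" error to the client ...
           }
       }

       private void append(byte[] request) {
           bytesInSensor.record(request.length); // throws if over quota
           // ... actually write to the log ...
       }
   }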

Your point about whether or not you should count the current event when a
quota violation occurs is a good one. I actually think the right answer
depends on the details of how you handle windowing. For example, one
approach to windowing I have seen is to use the most recent COMPLETE window
as the estimate while you fill up the current window. In this model, with a
30 second window, the estimate you give out is always 0-30 seconds old. In
this case you have a real problem with quotas because once the previous
window is filled and you are in violation of your quota, you will keep
throwing exceptions regardless of the client behavior for the duration of
the next window. But worse, if you aren't counting the requests that got
rejected, then even though the client behavior is still bad your next
window will record no values (because you rejected them all as quota
violations). This is clearly a mess.

But that isn't quite how I'm doing windowing. The way I do it is I always
keep N windows (with the last window being partial) and the estimate is
over all windows. So with N=2 (the default), when you complete the current
window the previous window is cleared and reused to record the new values.
The downside of this is that with a 30 second window and N=2 your estimate
is based on anything from 30 seconds to 60 seconds of data. The upside is
that the most recent data is always included. I feel this is inherently
important for monitoring. But it is particularly important for quotas. In
this case I feel that it is always the right thing to NOT count rejected
measurements. Note that in this model, let's say that the user goes over
their quota and stays that way for a sustained period of time. The impact
will not be the seesaw behavior I described where we reject all then none
of their requests; instead we will reject just enough requests to keep them
under their quota.
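
Sketched in code, the bookkeeping is roughly the following (a toy count
fixed at two windows; the real code generalizes to N windows and to other
statistics):

   // Keep two windows; record into the current one, and always measure
   // over both, so the estimate spans one to two complete windows.
   public class WindowedCount {
       private final long windowMs;
       private final double[] counts = new double[2]; // N = 2
       private long currentWindowStart;
       private int current = 0;

       public WindowedCount(long windowMs, long nowMs) {
           this.windowMs = windowMs;
           this.currentWindowStart = nowMs;
       }

       public synchronized void record(long nowMs) {
           maybeRoll(nowMs);
           counts[current] += 1.0;
       }

       public synchronized double measure(long nowMs) {
           maybeRoll(nowMs);
           return counts[0] + counts[1]; // complete + partial window
       }

       private void maybeRoll(long nowMs) {
           // when the current window completes, clear and reuse the oldest
           while (nowMs - currentWindowStart >= windowMs) {
               current = (current + 1) % counts.length;
               counts[current] = 0.0;
               currentWindowStart += windowMs;
           }
       }
   }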

5. I would definitely be interested to see the code if it is open source,
since I am interested in metrics. Overall, since you went down this path, I
would be interested to get your opinion on my code. If you think what you
did is better I would be open to discussing it as a third alternative too.
If we decide we do want to use this code for metrics then we may want to
implement a sampling histogram, either in addition to or as a replacement
for the existing histograms, and if you were up for contributing your
implementation that would be great.
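
For reference, the core of such a sampling histogram is quite small;
classic reservoir sampling looks roughly like this (a sketch of the
textbook algorithm, not Martin's or Coda Hale's actual code; a fixed-window
variant would additionally reset or age out old samples):

   import java.util.Arrays;
   import java.util.Random;

   // Keep a uniform random sample of everything recorded so far: once the
   // reservoir of k slots is full, the i-th value replaces a randomly
   // chosen slot with probability k/i.
   public class ReservoirPercentiles {
       private final double[] reservoir;
       private final Random random = new Random();
       private long seen = 0;

       public ReservoirPercentiles(int sampleSize) {
           this.reservoir = new double[sampleSize];
       }

       public synchronized void record(double value) {
           seen++;
           if (seen <= reservoir.length) {
               reservoir[(int) (seen - 1)] = value;
           } else {
               long i = (long) (random.nextDouble() * seen);
               if (i < reservoir.length)
                   reservoir[(int) i] = value;
           }
       }

       public synchronized double quantile(double q) {
           int n = (int) Math.min(seen, reservoir.length);
           if (n == 0) return Double.NaN;
           double[] sorted = Arrays.copyOf(reservoir, n);
           Arrays.sort(sorted);
           return sorted[Math.min((int) (q * n), n - 1)];
       }
   }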

-Jay


On Sat, Feb 22, 2014 at 9:25 AM, Martin Kleppmann
<mk...@linkedin.com> wrote:

> Not sure if you want yet another opinion added to the pile -- but since I
> had a similar problem on another project recently, I thought I'd weigh in.
> (On that project we were originally using Coda's library, but then switched
> to rolling our own metrics implementation because we needed to do a few
> things differently.)
>
> 1. Problems we encountered with Coda's library: it uses an
> exponentially-weighted moving average (EWMA) for rates (e.g. messages/sec),
> and exponentially biased reservoir sampling for histograms (percentiles,
> averages). Those methods of calculation work well for events with a
> consistently high volume, but they give strange and misleading results for
> events that are bursty or rare (e.g. error rates). We found that a fixed-size
> window gives more predictable, easier-to-interpret results.
>
> 2. In defence of Coda's library, I think its histogram implementation is a
> good trade-off of memory for accuracy; I'm not totally convinced that your
> proposal (counts of events in a fixed set of buckets) would be much better.
> Would have to do some math to work out the expected accuracy in each case.
> The reservoir sampling can be configured to use a smaller sample if the
> default of 1028 samples is too expensive. Reservoir sampling also has the
> advantage that you don't need to hard-code a bucket distribution.
>
> 3. Quotas are an interesting use case. However, I'm not wild about using a
> QuotaViolationException for control flow -- I think an explicit conditional
> would be nicer than having to catch an exception. One question in that
> context: if a quota is exceeded, do you still want to count the event
> towards the metric, or do you want to stop counting it until the quota is
> replenished? The answer may depend on the particular metric.
>
> 4. If you decide to go with Coda's library, I would advocate isolating the
> dependency into a separate module and using it via a facade -- somewhat
> like using SLF4J instead of Log4j directly. It's ok for Coda's library to
> be the default metrics implementation, but it should be easy to swap it out
> for something different in case someone has a version conflict or differing
> requirements. The facade should be at a low level (individual events), not
> at the reporter level (which deals with pre-aggregated values, and is
> already pluggable).
>
> 5. If it's useful, I can probably contribute my simple (but imho
> effective) metrics library, for embedding into Kafka. It uses reservoir
> sampling for percentiles, like Coda's library, but uses a fixed-size window
> instead of an exponential bias, which avoids weird behaviour on bursty
> metrics.
>
> In summary, I would advocate one of the following approaches:
> - Coda Hale library via facade (allowing it to be swapped for something
> else), or
> - Own metrics implementation, provided that we have confidence in its
> implementation of percentiles.
>
> Martin
>
>
> On 22 Feb 2014, at 01:06, Jay Kreps <ja...@gmail.com> wrote:
> > Hey guys,
> >
> > Just picking up this thread again. I do want to drive a conclusion as I
> > will run out of work to do on the producer soon and will need to add
> > metrics of some sort. We can vote on it, but I'm not sure if we actually
> > got everything discussed.
> >
> > Joel, I wasn't fully sure how to interpret your comment. I think you are
> > saying you are cool with the new metrics package as long as it really is
> > better. Do you have any comment on whether you think the benefits I
> > outlined are worth it? I agree with you that we could hold off on a
> second
> > repo until someone else would actually want to use our code.
> >
> > Jun, I'm not averse to doing a sampling-based histogram and doing some
> > comparison between the two approaches if you think this approach is
> > otherwise better.
> >
> > Sriram, originally I thought you preferred just sticking to Coda Hale,
> but
> > after your follow-up email I wasn't really sure...
> >
> > Joe/Clark, yes this code allows pluggable reporting so you could have a
> > metrics reporter that just wraps each metric in a Coda Hale Gauge if that
> > is useful. Though obviously if enough people were doing that I would
> think
> > it would be worth just using the Coda Hale package directly...
> >
> > -Jay
> >
> >
> >
> >
> > On Thu, Feb 13, 2014 at 3:34 PM, Clark Breyman <cl...@breyman.com>
> wrote:
> >
> >> Not requiring the client to link Coda/Yammer metrics sounds like a
> >> compelling reason to pivot to new interfaces. If that's the agreed
> >> direction, I'm hoping that we'd get the choice of backend to provide
> (e.g.
> >> facade on Yammer metrics for those with an investment in that) rather
> than
> >> force the new backend.  Having a metrics factory seems better for this
> than
> >> directly instantiating the singleton registry.
> >>
> >>
> >> On Thu, Feb 13, 2014 at 2:39 PM, Joe Stein <jo...@stealth.ly>
> wrote:
> >>
> >>> Can we leave metrics and have multiple supported KafkaMetricsGroup
> >>> implementing a yammer based implementation?
> >>>
> >>> ProducerRequestStats with your configured analytics group?
> >>>
> >>> On Thu, Feb 13, 2014 at 11:37 AM, Jay Kreps <ja...@gmail.com>
> wrote:
> >>>
> >>>> I think we discussed the scala/java stuff more fully previously.
> >>>> Essentially the client is embedded everywhere. Scala is very
> >> incompatible
> >>>> with itself so this makes it very hard to use for people using
> anything
> >>>> else in scala. Also Scala stack traces are very confusing. Basically
> we
> >>>> thought plain java code would be a lot easier for people to use. Even
> >> if
> >>>> Scala is more fun to write, that isn't really what we are optimizing
> >> for.
> >>>>
> >>>> -Jay
> >>>>
> >>>>
> >>>> On Thu, Feb 13, 2014 at 8:09 AM, S Ahmed <sa...@gmail.com>
> wrote:
> >>>>
> >>>>> Jay, pretty impressive how you just write a 'quick version' like that
> >>> :)
> >>>>> Not to get off-topic but why didn't you write this in scala?
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Feb 12, 2014 at 6:54 PM, Joel Koshy <jj...@gmail.com>
> >>> wrote:
> >>>>>
> >>>>>> I have not had a chance to review the new metrics code and its
> >>>>>> features carefully (apart from your write-up), but here are my
> >>> general
> >>>>>> thoughts:
> >>>>>>
> >>>>>> Implementing a metrics package correctly is difficult; more so for
> >>>>>> people like me, because I'm not a statistician.  However, if this
> >> new
> >>>>>> package: {(i) functions correctly (and we need to define and prove
> >>>>>> correctness), (ii) is easy to use, (iii) serves all our current and
> >>>>>> anticipated monitoring needs, (iv) is not overly complex that it
> >>>>>> becomes a burden to maintain and we are better off with an available
> >>>>>> library;} then I think it makes sense to embed it and use it within
> >>>>>> the Kafka code. The main wins are: (i) predictability (no changing
> >>>>>> APIs and intimate knowledge of the code) and (ii) control with
> >>> respect
> >>>>>> to both functionality (e.g., there are hard-coded decay constants
> >> in
> >>>>>> metrics-core 2.x) and correctness (i.e., if we find a bug in the
> >>>>>> metrics package we have to submit a pull request and wait for it to
> >>>>>> become mainstream).  I'm not sure it would help very much to pull
> >> it
> >>>>>> into a separate repo because that could potentially annul these
> >>>>>> benefits.
> >>>>>>
> >>>>>> Joel
> >>>>>>
> >>>>>> On Wed, Feb 12, 2014 at 02:50:43PM -0800, Jay Kreps wrote:
> >>>>>>> Sriram,
> >>>>>>>
> >>>>>>> Makes sense. I am cool moving this stuff into its own repo if
> >>> people
> >>>>>> think
> >>>>>>> that is better. I'm not sure it would get much contribution but
> >>> when
> >>>> I
> >>>>>>> started messing with this I did have a lot of grand ideas of
> >> making
> >>>>>> adding
> >>>>>>> metrics to a sensor dynamic so you could add more stuff in
> >>>>> real-time (via
> >>>>>>> jmx, say) and/or externalize all your metrics and config to a
> >>>> separate
> >>>>>> file
> >>>>>>> like log4j with only the points of instrumentation hard-coded.
> >>>>>>>
> >>>>>>> -Jay
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Feb 12, 2014 at 2:07 PM, Sriram Subramanian <
> >>>>>>> srsubramanian@linkedin.com> wrote:
> >>>>>>>
> >>>>>>>> I am actually neutral to this change. I found the replies were
> >>> more
> >>>>>>>> towards the implementation and features so far. I would like
> >> the
> >>>>>> community
> >>>>>>>> to think about the questions below before making a decision. My
> >>>>>> opinion on
> >>>>>>>> this is that it has potential to be its own project and it
> >> would
> >>>>>> attract
> >>>>>>>> developers who are specifically interested in contributing to
> >>>>> metrics.
> >>>>>> I
> >>>>>>>> am skeptical that the Kafka contributors would focus on
> >> improving
> >>>>> this
> >>>>>>>> library (apart from bug fixes) instead of
> >> developing/contributing
> >>>> to
> >>>>>> other
> >>>>>>>> core pieces. It would be useful to continue and keep it
> >> decoupled
> >>>>> from
> >>>>>>>> the rest of Kafka (if it resides in the Kafka code base) so that
> >> we
> >>>> can
> >>>>>> move
> >>>>>>>> it out anytime to its own project.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2/12/14 1:21 PM, "Jay Kreps" <ja...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hey Sriram,
> >>>>>>>>>
> >>>>>>>>> Not sure if these are actually meant as questions or more
> >> veiled
> >>>>>> comments.
> >>>>>>>>> In any case I tried to give my 2 cents inline.
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian <
> >>>>>>>>> srsubramanian@linkedin.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> I think answering the questions below would help to make a
> >>>> better
> >>>>>>>>>> decision. I am all for writing better code and having
> >> superior
> >>>>>>>>>> functionalities but it is worth thinking about stuff outside
> >>>> just
> >>>>>> code
> >>>>>>>>>> in
> >>>>>>>>>> this case -
> >>>>>>>>>>
> >>>>>>>>>> 1. Does metric form a core piece of kafka? Does it help
> >> kafka
> >>>>>> greatly in
> >>>>>>>>>> providing better core functionalities? I would always like a
> >>>>>> project to
> >>>>>>>>>> do
> >>>>>>>>>> one thing really well. Metrics is a non trivial amount of
> >>> code.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Metrics are obviously important, and obviously improving our
> >>>> metrics
> >>>>>>>>> system
> >>>>>>>>> would be good. That said this may or may not be better, and
> >> even
> >>>> if
> >>>>>> it is
> >>>>>>>>> better that betterness might not outweigh other
> >> considerations.
> >>>> That
> >>>>>> is
> >>>>>>>>> what we are discussing.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> 2. Does it make sense to be part of Kafka or its own
> >> project?
> >>> If
> >>>>>> this
> >>>>>>>>>> metrics library has the potential to be better than
> >>>> metrics-core,
> >>>>> I
> >>>>>>>>>> would
> >>>>>>>>>> be interested in other projects take advantage of it.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> It could be either.
> >>>>>>>>>
> >>>>>>>>> 3. Can Kafka maintain this library as new members join and old
> >>>>> members
> >>>>>>>>>> leave? Would this be a piece of code that no one (in Kafka)
> >> in
> >>>> the
> >>>>>>>>>> future
> >>>>>>>>>> spends time improving if the original author left?
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I am not going anywhere in the near term, but if I did, yes,
> >>> this
> >>>>>> would be
> >>>>>>>>> like any other code we have. As with yammer metrics or any
> >> other
> >>>>> code
> >>>>>> at
> >>>>>>>>> that point we would either use it as is or someone would
> >> improve
> >>>> it.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> 4. Does it affect the schedule of producer rewrite? This
> >> needs
> >>>> its
> >>>>>> own
> >>>>>>>>>> stabilization and modification to existing metric dashboards
> >>> if
> >>>>> the
> >>>>>>>>>> format
> >>>>>>>>>> is changed. Many times such costs are not factored in and a
> >>>> project
> >>>>>> loses
> >>>>>>>>>> time before realizing the extra time required to make a
> >>> library
> >>>> as
> >>>>>> this
> >>>>>>>>>> operational.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Probably not. The metrics are going to change regardless of
> >>>> whether
> >>>>>> we use
> >>>>>>>>> the same library or not. If we think this is better I don't
> >> mind
> >>>>>> putting
> >>>>>>>>> in
> >>>>>>>>> a little extra effort to get there.
> >>>>>>>>>
> >>>>>>>>> Irrespective I think this is probably not the right thing to
> >>>>> optimize
> >>>>>> for.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> I am sure we can do better when we write code to a specific
> >>> use
> >>>>>> case (in
> >>>>>>>>>> this case, kafka) rather than building a generic library
> >> that
> >>>>> suits
> >>>>>> all
> >>>>>>>>>> (metrics-core) but I would like us to have answers to the
> >>>>> questions
> >>>>>>>>>> above
> >>>>>>>>>> and be prepared before we proceed to support this with the
> >>>>> producer
> >>>>>>>>>> rewrite.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Naturally we are all considering exactly these things, that is
> >>>>>> exactly the
> >>>>>>>>> reason I started the thread.
> >>>>>>>>>
> >>>>>>>>> -Jay
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> On 2/11/14 6:28 PM, "Jun Rao" <ju...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks for the detailed write-up. It's well thought
> >> through.
> >>> A
> >>>>> few
> >>>>>>>>>>> comments:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. I have a couple of concerns on the percentiles. The
> >> first
> >>>>> issue
> >>>>>> is
> >>>>>>>>>> that
> >>>>>>>>>>> it requires the user to know the value range. Since the
> >> range
> >>>> for
> >>>>>>>>>> things
> >>>>>>>>>>> like message size (in millions) is quite different from
> >> those
> >>>>> like
> >>>>>>>>>> request
> >>>>>>>>>>> time (less than 100), it's going to be hard to pick a good
> >>>> global
> >>>>>>>>>> default
> >>>>>>>>>>> range. Different apps could be dealing with different
> >> message
> >>>>>> size. So
> >>>>>>>>>>> they
> >>>>>>>>>>> probably will have to customize the range. Another issue is
> >>>> that
> >>>>>> it can
> >>>>>>>>>>> only report values at the bucket boundaries. So, if you
> >> have
> >>>> 1000
> >>>>>>>>>> buckets
> >>>>>>>>>>> and a value range of 1 million, you will only see 1000
> >>> possible
> >>>>>> values
> >>>>>>>>>> as
> >>>>>>>>>>> the quantile, which is probably too sparse. The
> >>> implementation
> >>>> of
> >>>>>>>>>>> histogram
> >>>>>>>>>>> in metrics-core keeps a fixed-size set of samples, which avoids
> >>> both
> >>>>>> issues.
> >>>>>>>>>>>
> >>>>>>>>>>> 2. We need to document the 3-part metrics names better
> >> since
> >>>> it's
> >>>>>> not
> >>>>>>>>>>> obvious what the convention is. Also, currently the name of
> >>> the
> >>>>>> sensor
> >>>>>>>>>> and
> >>>>>>>>>>> the metrics defined in it are independent. Would it make
> >>> sense
> >>>> to
> >>>>>> have
> >>>>>>>>>> the
> >>>>>>>>>>> sensor name be a prefix of the metric name?
> >>>>>>>>>>>
> >>>>>>>>>>> Overall, this approach seems to be cleaner than
> >> metrics-core
> >>> by
> >>>>>>>>>> decoupling
> >>>>>>>>>>> measuring and reporting. The main benefit of metrics-core
> >>> seems
> >>>>> to
> >>>>>> be
> >>>>>>>>>> the
> >>>>>>>>>>> existing reporters. Since not that many people voted for
> >>>>>> metrics-core,
> >>>>>>>>>> I
> >>>>>>>>>>> am
> >>>>>>>>>>> ok with going with the new implementation. My only
> >>>> recommendation
> >>>>>> is to
> >>>>>>>>>>> address the concern on percentiles.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>>
> >>>>>>>>>>> Jun
> >>>>>>>>>>>

Re: Metrics in new producer

Posted by Jun Rao <ju...@gmail.com>.
Clark,

As Martin pointed out, if a stat is stable, the numbers that you get from
the new metrics package are going to be close to what you get from the Coda
metrics. If a stat is not stable, what the new metrics package gives you is
probably more intuitive. Given that, would you still want the Coda metrics
through a pure stub?

Thanks,

Jun


On Sat, Feb 22, 2014 at 9:53 AM, Clark Breyman <cl...@breyman.com> wrote:

> Jay - I was thinking of a pure stub rather than just wrapping Kafka metrics
> in a Coda gauge.  I'd like the Timers, Meters etc to still be Coda meters -
> that way the windows, exponential decays, etc are comparable to the rest of
> the Coda metrics in our applications. At the same time, I don't want to
> force Coda timers (or any other timers) on an app that won't make good use
> of them.
>
> Thanks again, C
>
>
> On Sat, Feb 22, 2014 at 9:25 AM, Martin Kleppmann
> <mk...@linkedin.com>wrote:
>
> > Not sure if you want yet another opinion added to the pile -- but since I
> > had a similar problem on another project recently, I thought I'd weigh
> in.
> > (On that project we were originally using Coda's library, but then
> switched
> > to rolling our own metrics implementation because we needed to do a few
> > things differently.)
> >
> > 1. Problems we encountered with Coda's library: it uses an
> > exponentially-weighted moving average (EMWA) for rates (eg.
> messages/sec),
> > and exponentially biased reservoir sampling for histograms (percentiles,
> > averages). Those methods of calculation work well for events with a
> > consistently high volume, but they give strange and misleading results
> for
> > events that are bursty or rare (eg error rates). We found that a
> fixed-size
> > window gives more predictable, easier-to-interpret results.
> >
> > 2. In defence of Coda's library, I think its histogram implementation is
> a
> > good trade-off of memory for accuracy; I'm not totally convinced that
> your
> > proposal (counts of events in a fixed set of buckets) would be much
> better.
> > Would have to do some math to work out the expected accuracy in each
> case.
> > The reservoir sampling can be configured to use a smaller sample if the
> > default of 1028 samples is too expensive. Reservoir sampling also has the
> > advantage that you don't need to hard-code a bucket distribution.
> >
> > 3. Quotas are an interesting use case. However, I'm not wild about using
> a
> > QuotaViolationException for control flow -- I think an explicit
> conditional
> > would be nicer than having to catch an exception. One question in that
> > context: if a quota is exceeded, do you still want to count the event
> > towards the metric, or do you want to stop counting it until the quota is
> > replenished? The answer may depend on the particular metric.
> >
> > 4. If you decide to go with Coda's library, I would advocate isolating
> the
> > dependency into a separate module and using it via a facade -- somewhat
> > like using SLF4J instead of Log4j directly. It's ok for Coda's library to
> > be the default metrics implementation, but it should be easy to swap it
> out
> > for something different in case someone has a version conflict or
> differing
> > requirements. The facade should be at a low level (individual events),
> not
> > at the reporter level (which deals with pre-aggregated values, and is
> > already pluggable).
> >
> > 5. If it's useful, I can probably contribute my simple (but imho
> > effective) metrics library, for embedding into Kafka. It uses reservoir
> > sampling for percentiles, like Coda's library, but uses a fixed-size
> window
> > instead of an exponential bias, which avoids weird behaviour on bursty
> > metrics.
> >
> > In summary, I would advocate one of the following approaches:
> > - Coda Hale library via facade (allowing it to be swapped for something
> > else), or
> > - Own metrics implementation, provided that we have confidence in its
> > implementation of percentiles.
> >
> > Martin
> >
> >
> > On 22 Feb 2014, at 01:06, Jay Kreps <ja...@gmail.com> wrote:
> > > Hey guys,
> > >
> > > Just picking up this thread again. I do want to drive a conclusion as I
> > > will run out of work to do on the producer soon and will need to add
> > > metrics of some sort. We can vote on it, but I'm not sure if we
> actually
> > > got everything discussed.
> > >
> > > Joel, I wasn't fully sure how to interpret your comment. I think you
> are
> > > saying you are cool with the new metrics package as long as it really
> is
> > > better. Do you have any comment on whether you think the benefits I
> > > outlined are worth it? I agree with you that we could hold off on a
> > second
> > > repo until someone else would actually want to use our code.
> > >
> > > Jun, I'm not averse to doing a sampling-based histogram and doing some
> > > comparison between the two approaches if you think this approach is
> > > otherwise better.
> > >
> > > Sriram, originally I thought you preferred just sticking to Coda Hale,
> > but
> > > after your follow-up email I wasn't really sure...
> > >
> > > Joe/Clark, yes this code allows pluggable reporting so you could have a
> > > metrics reporter that just wraps each metric in a Coda Hale Gauge if
> that
> > > is useful. Though obviously if enough people were doing that I would
> > think
> > > it would be worth just using the Coda Hale package directly...
> > >
> > > -Jay
> > >
> > >
> > >
> > >
> > > On Thu, Feb 13, 2014 at 3:34 PM, Clark Breyman <cl...@breyman.com>
> > wrote:
> > >
> > >> Not requiring the client to link Coda/Yammer metrics sounds like a
> > >> compelling reason to pivot to new interfaces. If that's the agreed
> > >> direction, I'm hoping that we'd get the choice of backend to provide
> > (e.g.
> > >> facade on Yammer metrics for those with an investment in that) rather
> > than
> > >> force the new backend.  Having a metrics factory seems better for this
> > than
> > >> directly instantiating the singleton registry.
> > >>
> > >>
> > >> On Thu, Feb 13, 2014 at 2:39 PM, Joe Stein <jo...@stealth.ly>
> > wrote:
> > >>
> > >>> Can we leave metrics and have multiple supported KafkaMetricsGroup
> > >>> implementing a yammer based implementation?
> > >>>
> > >>> ProducerRequestStats with your configured analytics group?
> > >>>
> > >>> On Thu, Feb 13, 2014 at 11:37 AM, Jay Kreps <ja...@gmail.com>
> > wrote:
> > >>>
> > >>>> I think we discussed the scala/java stuff more fully previously.
> > >>>> Essentially the client is embedded everywhere. Scala is very
> > >> incompatible
> > >>>> with itself so this makes it very hard to use for people using
> > anything
> > >>>> else in scala. Also Scala stack traces are very confusing. Basically
> > we
> > >>>> thought plain java code would be a lot easier for people to use.
> Even
> > >> if
> > >>>> Scala is more fun to write, that isn't really what we are optimizing
> > >> for.
> > >>>>
> > >>>> -Jay
> > >>>>
> > >>>>

Re: Metrics in new producer

Posted by Clark Breyman <cl...@breyman.com>.
Jay - I was thinking of a pure stub rather than just wrapping Kafka metrics
in a Coda gauge. I'd like the Timers, Meters, etc. to still be Coda meters -
that way the windows, exponential decays, and so on are comparable to the
rest of the Coda metrics in our applications. At the same time, I don't want
to force Coda timers (or any other timers) on an app that won't make good
use of them.
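
For context, the gauge wrapping would look something like the sketch below
(illustration only - the Yammer 2.x Gauge/newGauge API is real, but the
KafkaMetric handle and its name()/value() methods are my assumptions about
the new package):

    import com.yammer.metrics.Metrics;
    import com.yammer.metrics.core.Gauge;

    public class GaugeBridge {
        // Stand-in for whatever per-metric handle the new package exposes
        // (an assumption, not the actual API).
        public interface KafkaMetric {
            String name();
            double value();
        }

        // Expose one Kafka metric as a read-only Coda Hale gauge.
        public static void expose(final KafkaMetric metric) {
            Metrics.newGauge(GaugeBridge.class, metric.name(),
                new Gauge<Double>() {
                    @Override
                    public Double value() {
                        return metric.value(); // latest windowed value
                    }
                });
        }
    }

That gives point-in-time values, but it loses the Timer/Meter semantics,
which is why I'd prefer a stub at the recording level.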

Thanks again, C


On Sat, Feb 22, 2014 at 9:25 AM, Martin Kleppmann
<mk...@linkedin.com> wrote:

> Not sure if you want yet another opinion added to the pile -- but since I
> had a similar problem on another project recently, I thought I'd weigh in.
> (On that project we were originally using Coda's library, but then switched
> to rolling our own metrics implementation because we needed to do a few
> things differently.)
>
> 1. Problems we encountered with Coda's library: it uses an
> exponentially-weighted moving average (EMWA) for rates (eg. messages/sec),
> and exponentially biased reservoir sampling for histograms (percentiles,
> averages). Those methods of calculation work well for events with a
> consistently high volume, but they give strange and misleading results for
> events that are bursty or rare (e.g. error rates). We found that a fixed-size
> window gives more predictable, easier-to-interpret results.
>
> 2. In defence of Coda's library, I think its histogram implementation is a
> good trade-off of memory for accuracy; I'm not totally convinced that your
> proposal (counts of events in a fixed set of buckets) would be much better.
> Would have to do some math to work out the expected accuracy in each case.
> The reservoir sampling can be configured to use a smaller sample if the
> default of 1028 samples is too expensive. Reservoir sampling also has the
> advantage that you don't need to hard-code a bucket distribution.
>
> 3. Quotas are an interesting use case. However, I'm not wild about using a
> QuotaViolationException for control flow -- I think an explicit conditional
> would be nicer than having to catch an exception. One question in that
> context: if a quota is exceeded, do you still want to count the event
> towards the metric, or do you want to stop counting it until the quota is
> replenished? The answer may depend on the particular metric.
>
> 4. If you decide to go with Coda's library, I would advocate isolating the
> dependency into a separate module and using it via a facade -- somewhat
> like using SLF4J instead of Log4j directly. It's ok for Coda's library to
> be the default metrics implementation, but it should be easy to swap it out
> for something different in case someone has a version conflict or differing
> requirements. The facade should be at a low level (individual events), not
> at the reporter level (which deals with pre-aggregated values, and is
> already pluggable).
>
> 5. If it's useful, I can probably contribute my simple (but imho
> effective) metrics library, for embedding into Kafka. It uses reservoir
> sampling for percentiles, like Coda's library, but uses a fixed-size window
> instead of an exponential bias, which avoids weird behaviour on bursty
> metrics.
>
> In summary, I would advocate one of the following approaches:
> - Coda Hale library via facade (allowing it to be swapped for something
> else), or
> - Own metrics implementation, provided that we have confidence in its
> implementation of percentiles.
>
> Martin
>
>
> On 22 Feb 2014, at 01:06, Jay Kreps <ja...@gmail.com> wrote:
> > Hey guys,
> >
> > Just picking up this thread again. I do want to drive this to a conclusion,
> > as I will run out of work to do on the producer soon and will need to add
> > metrics of some sort. We can vote on it, but I'm not sure we have actually
> > discussed everything yet.
> >
> > Joel, I wasn't fully sure how to interpret your comment. I think you are
> > saying you are cool with the new metrics package as long as it really is
> > better. Do you have any comment on whether you think the benefits I
> > outlined are worth it? I agree with you that we could hold off on a second
> > repo until someone else would actually want to use our code.
> >
> > Jun, I'm not averse to doing a sampling-based histogram and doing some
> > comparison between the two approaches if you think this approach is
> > otherwise better.
> >
> > Sriram, originally I thought you preferred just sticking to Coda Hale, but
> > after your follow-up email I wasn't really sure...
> >
> > Joe/Clark, yes this code allows pluggable reporting so you could have a
> > metrics reporter that just wraps each metric in a Coda Hale Gauge if that
> > is useful. Though obviously if enough people were doing that I would think
> > it would be worth just using the Coda Hale package directly...
> >
> > -Jay

Re: Metrics in new producer

Posted by Martin Kleppmann <mk...@linkedin.com>.
Not sure if you want yet another opinion added to the pile -- but since I had a similar problem on another project recently, I thought I'd weigh in. (On that project we were originally using Coda's library, but then switched to rolling our own metrics implementation because we needed to do a few things differently.)

1. Problems we encountered with Coda's library: it uses an exponentially weighted moving average (EWMA) for rates (e.g. messages/sec), and exponentially biased reservoir sampling for histograms (percentiles, averages). Those methods of calculation work well for events with a consistently high volume, but they give strange and misleading results for events that are bursty or rare (e.g. error rates). We found that a fixed-size window gives more predictable, easier-to-interpret results.
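
To make the contrast concrete, here's a minimal sketch of the fixed-window idea (all names invented -- this is neither Coda's API nor Kafka's). The reported rate is always based on the last completed window, so a burst can't be smeared across an exponentially decaying tail:

    // Sketch only: count events per fixed window; report the rate of the
    // last completed window. If whole windows pass with no events, the
    // reported rate drops to zero rather than decaying asymptotically.
    public class FixedWindowRate {
        private final long windowMs;
        private long windowStart;
        private long currentCount = 0;    // events in the in-progress window
        private long lastWindowCount = 0; // events in the last complete window

        public FixedWindowRate(long windowMs, long nowMs) {
            this.windowMs = windowMs;
            this.windowStart = nowMs;
        }

        public synchronized void record(long nowMs) {
            maybeRoll(nowMs);
            currentCount++;
        }

        // Events per second over the last completed window.
        public synchronized double ratePerSec(long nowMs) {
            maybeRoll(nowMs);
            return lastWindowCount / (windowMs / 1000.0);
        }

        private void maybeRoll(long nowMs) {
            long elapsed = nowMs - windowStart;
            if (elapsed >= windowMs) {
                // if more than one full window passed idle, the last one was empty
                lastWindowCount = (elapsed >= 2 * windowMs) ? 0 : currentCount;
                currentCount = 0;
                windowStart = nowMs;
            }
        }
    }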

2. In defence of Coda's library, I think its histogram implementation is a good trade-off of memory for accuracy; I'm not totally convinced that your proposal (counts of events in a fixed set of buckets) would be much better. Would have to do some math to work out the expected accuracy in each case. The reservoir sampling can be configured to use a smaller sample if the default of 1028 samples is too expensive. Reservoir sampling also has the advantage that you don't need to hard-code a bucket distribution.
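
For reference, plain uniform reservoir sampling (Vitter's Algorithm R) is only a few lines; Coda's variant differs mainly in biasing the sample towards recent values. A sketch, with invented names:

    import java.util.Arrays;
    import java.util.concurrent.ThreadLocalRandom;

    // Sketch only: a fixed-size uniform sample of everything recorded, so
    // memory is bounded no matter how many values arrive.
    public class Reservoir {
        private final double[] samples;
        private long count = 0; // total values seen

        public Reservoir(int size) {
            this.samples = new double[size];
        }

        public synchronized void record(double value) {
            count++;
            if (count <= samples.length) {
                samples[(int) (count - 1)] = value; // fill phase
            } else {
                // keep the new value with probability size/count
                long slot = ThreadLocalRandom.current().nextLong(count);
                if (slot < samples.length)
                    samples[(int) slot] = value;
            }
        }

        // Approximate quantile (q in [0,1]) from the current sample.
        public synchronized double quantile(double q) {
            int n = (int) Math.min(count, samples.length);
            if (n == 0) return Double.NaN;
            double[] copy = Arrays.copyOf(samples, n);
            Arrays.sort(copy);
            return copy[Math.min(n - 1, (int) Math.floor(q * n))];
        }
    }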

3. Quotas are an interesting use case. However, I'm not wild about using a QuotaViolationException for control flow -- I think an explicit conditional would be nicer than having to catch an exception. One question in that context: if a quota is exceeded, do you still want to count the event towards the metric, or do you want to stop counting it until the quota is replenished? The answer may depend on the particular metric.
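
Roughly, I'd rather the caller ask first and then decide, something like this sketch (all names invented, not the API Jay described):

    // Sketch only: an explicit quota check instead of catching a
    // QuotaViolationException thrown from record().
    interface Quota {
        boolean check(Sensor sensor, double value); // would recording stay in bounds?
    }

    interface Sensor {
        void record(double value);
    }

    class QuotaAwareHandler {
        private final Quota quota;
        private final Sensor sensor;

        QuotaAwareHandler(Quota quota, Sensor sensor) {
            this.quota = quota;
            this.sensor = sensor;
        }

        boolean tryRecord(double value) {
            if (quota.check(sensor, value)) {
                sensor.record(value); // within quota: count the event
                return true;          // caller proceeds normally
            }
            // over quota: caller rejects or delays; whether the rejected
            // event should still count is the open question above
            return false;
        }
    }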

4. If you decide to go with Coda's library, I would advocate isolating the dependency into a separate module and using it via a facade -- somewhat like using SLF4J instead of Log4j directly. It's ok for Coda's library to be the default metrics implementation, but it should be easy to swap it out for something different in case someone has a version conflict or differing requirements. The facade should be at a low level (individual events), not at the reporter level (which deals with pre-aggregated values, and is already pluggable).
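
The facade itself could be tiny -- something like this sketch (names invented), with each binding living in its own module:

    // Sketch only: application code depends on these two interfaces; a
    // binding module adapts them to Coda Hale, a home-grown package, etc.
    interface MetricsFacade {
        EventSink sink(String name); // obtain a named sink for raw events
    }

    interface EventSink {
        void record(double value, long timestampMs); // one raw measurement
    }

    // e.g. a metrics-core 2.x binding might implement EventSink by
    // delegating to com.yammer.metrics.core.Histogram.update(long).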

5. If it's useful, I can probably contribute my simple (but imho effective) metrics library, for embedding into Kafka. It uses reservoir sampling for percentiles, like Coda's library, but uses a fixed-size window instead of an exponential bias, which avoids weird behaviour on bursty metrics.
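
The core of it is just the uniform reservoir from point 2 above, swapped out at window boundaries. A sketch, with invented names:

    // Sketch only: a fresh reservoir per window, so percentiles describe a
    // bounded recent interval instead of an exponentially biased mix of all
    // history. Reuses the Reservoir sketch from point 2.
    public class WindowedReservoir {
        private final int size;
        private final long windowMs;
        private long windowStart;
        private Reservoir current;

        public WindowedReservoir(int size, long windowMs, long nowMs) {
            this.size = size;
            this.windowMs = windowMs;
            this.windowStart = nowMs;
            this.current = new Reservoir(size);
        }

        public synchronized void record(double value, long nowMs) {
            if (nowMs - windowStart >= windowMs) {
                current = new Reservoir(size); // drop the old window's sample
                windowStart = nowMs;
            }
            current.record(value);
        }

        public synchronized double quantile(double q) {
            return current.quantile(q);
        }
    }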

In summary, I would advocate one of the following approaches:
- Coda Hale library via facade (allowing it to be swapped for something else), or
- Own metrics implementation, provided that we have confidence in its implementation of percentiles.

Martin


On 22 Feb 2014, at 01:06, Jay Kreps <ja...@gmail.com> wrote:
> Hey guys,
> 
> Just picking up this thread again. I do want to drive a conclusion as I
> will run out of work to do on the producer soon and will need to add
> metrics of some sort. We can vote on it, but I'm not sure if we actually
> got everything discussed.
> 
> Joel, I wasn't fully sure how to interpret your comment. I think you are
> saying you are cool with the new metrics package as long as it really is
> better. Do you have any comment on whether you think the benefits I
> outlined are worth it? I agree with you that we could hold off on a second
> repo until someone else would actually want to use our code.
> 
> Jun, I'm not averse to doing a sampling-based histogram and doing some
> comparison between the two approaches if you think this approach is
> otherwise better.
> 
> Sriram, originally I thought you preferred just sticking to Coda Hale, but
> after your follow-up email I wasn't really sure...
> 
> Joe/Clark, yes this code allows pluggable reporting so you could have a
> metrics reporter that just wraps each metric in a Coda Hale Gauge if that
> is useful. Though obviously if enough people were doing that I would think
> it would be worth just using the Coda Hale package directly...
> 
> -Jay
> 
> [earlier messages in the thread trimmed]


Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
Hey guys,

Just picking up this thread again. I do want to drive a conclusion as I
will run out of work to do on the producer soon and will need to add
metrics of some sort. We can vote on it, but I'm not sure if we actually
got everything discussed.

Joel, I wasn't fully sure how to interpret your comment. I think you are
saying you are cool with the new metrics package as long as it really is
better. Do you have any comment on whether you think the benefits I
outlined are worth it? I agree with you that we could hold off on a second
repo until someone else would actually want to use our code.

Jun, I'm not averse to doing a sampling-based histogram and doing some
comparison between the two approaches if you think this approach is
otherwise better.

Sriram, originally I thought you preferred just sticking to Coda Hale, but
after your follow-up email I wasn't really sure...

Joe/Clark, yes this code allows pluggable reporting so you could have a
metrics reporter that just wraps each metric in a Coda Hale Gauge if that
is useful. Though obviously if enough people were doing that I would think
it would be worth just using the Coda Hale package directly...
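
As a sketch of what such a bridge reporter could look like (nothing here is final API -- MetricsReporter and KafkaMetric are just placeholders for whatever plug-in interface the new package ends up with):

    // Sketch only: placeholder interfaces for the new package's plug-in point.
    interface KafkaMetric {
        String name();
        double value();
    }

    interface MetricsReporter {
        void metricAdded(KafkaMetric metric);
    }

    // Expose each new-package metric as a metrics-core 2.x Gauge so
    // existing Yammer-based pipelines keep working.
    class CodaHaleBridgeReporter implements MetricsReporter {
        public void metricAdded(final KafkaMetric metric) {
            com.yammer.metrics.Metrics.newGauge(
                new com.yammer.metrics.core.MetricName("kafka", "producer", metric.name()),
                new com.yammer.metrics.core.Gauge<Double>() {
                    public Double value() {
                        return metric.value(); // delegate to the new package
                    }
                });
        }
    }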

-Jay




On Thu, Feb 13, 2014 at 3:34 PM, Clark Breyman <cl...@breyman.com> wrote:

> Not requiring the client to link Coda/Yammer metrics sounds like a
> compelling reason to pivot to new interfaces. If that's the agreed
> direction, I'm hoping that we'd get the choice of backend to provide (e.g.
> facade on Yammer metrics for those with an investment in that) rather than
> force the new backend.  Having a metrics factory seems better for this than
> directly instantiating the singleton registry.
>
>
> On Thu, Feb 13, 2014 at 2:39 PM, Joe Stein <jo...@stealth.ly> wrote:
>
> > Can we leave metrics and have multiple supported KafkaMetricsGroup
> > implementing a yammer based implementation?
> >
> > ProducerRequestStats with your configured analytics group?
> >
> > On Thu, Feb 13, 2014 at 11:37 AM, Jay Kreps <ja...@gmail.com> wrote:
> >
> > > I think we discussed the scala/java stuff more fully previously.
> > > Essentially the client is embedded everywhere. Scala is very
> incompatible
> > > with itself so this makes it very hard to use for people using anything
> > > else in scala. Also Scala stack traces are very confusing. Basically we
> > > thought plain java code would be a lot easier for people to use. Even
> if
> > > Scala is more fun to write, that isn't really what we are optimizing
> for.
> > >
> > > -Jay
> > >
> > >
> > > On Thu, Feb 13, 2014 at 8:09 AM, S Ahmed <sa...@gmail.com> wrote:
> > >
> > > > Jay, pretty impressive how you just write a 'quick version' like that
> > :)
> > > > Not to get off-topic but why didn't you write this in scala?
> > > >
> > > >
> > > >
> > > > On Wed, Feb 12, 2014 at 6:54 PM, Joel Koshy <jj...@gmail.com>
> > wrote:
> > > >
> > > > > I have not had a chance to review the new metrics code and its
> > > > > features carefully (apart from your write-up), but here are my
> > general
> > > > > thoughts:
> > > > >
> > > > > Implementing a metrics package correctly is difficult; more so for
> > > > > people like me, because I'm not a statistician.  However, if this
> new
> > > > > package: {(i) functions correctly (and we need to define and prove
> > > > > correctness), (ii) is easy to use, (iii) serves all our current and
> > > > > anticipated monitoring needs, (iv) is not overly complex that it
> > > > > becomes a burden to maintain and we are better of with an available
> > > > > library;} then I think it makes sense to embed it and use it within
> > > > > the Kafka code. The main wins are: (i) predictability (no changing
> > > > > APIs and intimate knowledge of the code) and (ii) control with
> > respect
> > > > > to both functionality (e.g., there are hard-coded decay constants
> in
> > > > > metrics-core 2.x) and correctness (i.e., if we find a bug in the
> > > > > metrics package we have to submit a pull request and wait for it to
> > > > > become mainstream).  I'm not sure it would help very much to pull
> it
> > > > > into a separate repo because that could potentially annul these
> > > > > benefits.
> > > > >
> > > > > Joel
> > > > >
> > > > > On Wed, Feb 12, 2014 at 02:50:43PM -0800, Jay Kreps wrote:
> > > > > > Sriram,
> > > > > >
> > > > > > Makes sense. I am cool moving this stuff into its own repo if
> > people
> > > > > think
> > > > > > that is better. I'm not sure it would get much contribution but
> > when
> > > I
> > > > > > started messing with this I did have a lot of grand ideas of
> making
> > > > > adding
> > > > > > metrics to a sensor dynamic so you could add more stuff in
> > > > real-time(via
> > > > > > jmx, say) and/or externalize all your metrics and config to a
> > > separate
> > > > > file
> > > > > > like log4j with only the points of instrumentation hard-coded.
> > > > > >
> > > > > > -Jay
> > > > > >
> > > > > >
> > > > > > On Wed, Feb 12, 2014 at 2:07 PM, Sriram Subramanian <
> > > > > > srsubramanian@linkedin.com> wrote:
> > > > > >
> > > > > > > I am actually neutral to this change. I found the replies were
> > more
> > > > > > > towards the implementation and features so far. I would like
> the
> > > > > community
> > > > > > > to think about the questions below before making a decision. My
> > > > > opinion on
> > > > > > > this is that it has potential to be its own project and it
> would
> > > > > attract
> > > > > > > developers who are specifically interested in contributing to
> > > > metrics.
> > > > > I
> > > > > > > am skeptical that the Kafka contributors would focus on
> improving
> > > > this
> > > > > > > library (apart from bug fixes) instead of
> developing/contributing
> > > to
> > > > > other
> > > > > > > core pieces. It would be useful to continue and keep it
> decoupled
> > > > from
> > > > > > > rest of Kafka (if it resides in the Kafka code base.) so that
> we
> > > can
> > > > > move
> > > > > > > it out anytime to its own project.
> > > > > > >
> > > > > > >
> > > > > > > On 2/12/14 1:21 PM, "Jay Kreps" <ja...@gmail.com> wrote:
> > > > > > >
> > > > > > > >Hey Sriram,
> > > > > > > >
> > > > > > > >Not sure if these are actually meant as questions or more
> veiled
> > > > > comments.
> > > > > > > >In an case I tried to give my 2 cents inline.
> > > > > > > >
> > > > > > > >On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian <
> > > > > > > >srsubramanian@linkedin.com> wrote:
> > > > > > > >
> > > > > > > >> I think answering the questions below would help to make a
> > > better
> > > > > > > >> decision. I am all for writing better code and having
> superior
> > > > > > > >> functionalities but it is worth thinking about stuff outside
> > > just
> > > > > code
> > > > > > > >>in
> > > > > > > >> this case -
> > > > > > > >>
> > > > > > > >> 1. Does metric form a core piece of kafka? Does it help
> kafka
> > > > > greatly in
> > > > > > > >> providing better core functionalities? I would always like a
> > > > > project to
> > > > > > > >>do
> > > > > > > >> one thing really well. Metrics is a non trivial amount of
> > code.
> > > > > > > >>
> > > > > > > >
> > > > > > > >Metrics are obviously important, and obviously improving our
> > > metrics
> > > > > > > >system
> > > > > > > >would be good. That said this may or may not be better, and
> even
> > > if
> > > > > it is
> > > > > > > >better that betterness might not outweigh other
> considerations.
> > > That
> > > > > is
> > > > > > > >what we are discussing.
> > > > > > > >
> > > > > > > >
> > > > > > > >> 2. Does it make sense to be part of Kafka or its own
> project?
> > If
> > > > > this
> > > > > > > >> metrics library has the potential to be better than
> > > metrics-core,
> > > > I
> > > > > > > >>would
> > > > > > > >> be interested in other projects take advantage of it.
> > > > > > > >>
> > > > > > > >
> > > > > > > >It could be either.
> > > > > > > >
> > > > > > > >3. Can Kafka maintain this library as new members join and old
> > > > members
> > > > > > > >> leave? Would this be a piece of code that no one (in Kafka)
> in
> > > the
> > > > > > > >>future
> > > > > > > >> spends time improving if the original author left?
> > > > > > > >>
> > > > > > > >
> > > > > > > >I am not going anywhere in the near term, but if I did, yes,
> > this
> > > > > would be
> > > > > > > >like any other code we have. As with yammer metrics or any
> other
> > > > code
> > > > > at
> > > > > > > >that point we would either use it as is or someone would
> improve
> > > it.
> > > > > > > >
> > > > > > > >
> > > > > > > >> 4. Does it affect the schedule of producer rewrite? This
> needs
> > > its
> > > > > own
> > > > > > > >> stabilization and modification to existing metric dashboards
> > if
> > > > the
> > > > > > > >>format
> > > > > > > >> is changed. Many times such cost are not factored in and a
> > > project
> > > > > loses
> > > > > > > >> time before realizing the extra time required to make a
> > library
> > > as
> > > > > this
> > > > > > > >> operational.
> > > > > > > >>
> > > > > > > >
> > > > > > > >Probably not. The metrics are going to change regardless of
> > > whether
> > > > > we use
> > > > > > > >the same library or not. If we think this is better I don't
> mind
> > > > > putting
> > > > > > > >in
> > > > > > > >a little extra effort to get there.
> > > > > > > >
> > > > > > > >Irrespective I think this is probably not the right thing to
> > > > optimize
> > > > > for.
> > > > > > > >
> > > > > > > >
> > > > > > > >> I am sure we can do better when we write code to a specific
> > use
> > > > > case (in
> > > > > > > >> this case, kafka) rather than building a generic library
> that
> > > > suits
> > > > > all
> > > > > > > >> (metrics-core) but I would like us to have answers to the
> > > > questions
> > > > > > > >>above
> > > > > > > >> and be prepared before we proceed to support this with the
> > > > producer
> > > > > > > >> rewrite.
> > > > > > > >
> > > > > > > >
> > > > > > > >Naturally we are all considering exactly these things, that is
> > > > > exactly the
> > > > > > > >reason I started the thread.
> > > > > > > >
> > > > > > > >-Jay
> > > > > > > >
> > > > > > > >
> > > > > > > >> On 2/11/14 6:28 PM, "Jun Rao" <ju...@gmail.com> wrote:
> > > > > > > >>
> > > > > > > >> >Thanks for the detailed write-up. It's well thought
> through.
> > A
> > > > few
> > > > > > > >> >comments:
> > > > > > > >> >
> > > > > > > >> >1. I have a couple of concerns on the percentiles. The
> first
> > > > issue
> > > > > is
> > > > > > > >>that
> > > > > > > >> >It requires the user to know the value range. Since the
> range
> > > for
> > > > > > > >>things
> > > > > > > >> >like message size (in millions) is quite different from
> those
> > > > like
> > > > > > > >>request
> > > > > > > >> >time (less than 100), it's going to be hard to pick a good
> > > global
> > > > > > > >>default
> > > > > > > >> >range. Different apps could be dealing with different
> message
> > > > > size. So
> > > > > > > >> >they
> > > > > > > >> >probably will have to customize the range. Another issue is
> > > that
> > > > > it can
> > > > > > > >> >only report values at the bucket boundaries. So, if you
> have
> > > 1000
> > > > > > > >>buckets
> > > > > > > >> >and a value range of 1 million, you will only see 1000
> > possible
> > > > > values
> > > > > > > >>as
> > > > > > > >> >the quantile, which is probably too sparse. The
> > implementation
> > > of
> > > > > > > >> >histogram
> > > > > > > >> >in metrics-core keeps a fix size of samples, which avoids
> > both
> > > > > issues.
> > > > > > > >> >
> > > > > > > >> >2. We need to document the 3-part metrics names better
> since
> > > it's
> > > > > not
> > > > > > > >> >obvious what the convention is. Also, currently the name of
> > the
> > > > > sensor
> > > > > > > >>and
> > > > > > > >> >the metrics defined in it are independent. Would it make
> > sense
> > > to
> > > > > have
> > > > > > > >>the
> > > > > > > >> >sensor name be a prefix of the metric name?
> > > > > > > >> >
> > > > > > > >> >Overall, this approach seems to be cleaner than
> metrics-core
> > by
> > > > > > > >>decoupling
> > > > > > > >> >measuring and reporting. The main benefit of metrics-core
> > seems
> > > > to
> > > > > be
> > > > > > > >>the
> > > > > > > >> >existing reporters. Since not that many people voted for
> > > > > metrics-core,
> > > > > > > >>I
> > > > > > > >> >am
> > > > > > > >> >ok with going with the new implementation. My only
> > > recommendation
> > > > > is to
> > > > > > > >> >address the concern on percentiles.
> > > > > > > >> >
> > > > > > > >> >Thanks,
> > > > > > > >> >
> > > > > > > >> >Jun
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> >On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <
> > > jay.kreps@gmail.com>
> > > > > > > wrote:
> > > > > > > >> >
> > > > > > > >> >> Hey guys,
> > > > > > > >> >>
> > > > > > > >> >> I wanted to kick off a quick discussion of metrics with
> > > respect
> > > > > to
> > > > > > > >>the
> > > > > > > >> >>new
> > > > > > > >> >> producer and consumer (and potentially the server).
> > > > > > > >> >>
> > > > > > > >> >> At a high level I think there are three approaches we
> could
> > > > take:
> > > > > > > >> >> 1. Plain vanilla JMX
> > > > > > > >> >> 2. Use Coda Hale (AKA Yammer) Metrics
> > > > > > > >> >> 3. Do our own metrics (with JMX as one output)
> > > > > > > >> >>
> > > > > > > >> >> 1. Has the advantage that JMX is the most commonly used
> > java
> > > > > thing
> > > > > > > >>and
> > > > > > > >> >> plugs in reasonably to most metrics systems. JMX is
> > included
> > > in
> > > > > the
> > > > > > > >>JDK
> > > > > > > >> >>so
> > > > > > > >> >> it doesn't impose any additional dependencies on clients.
> > It
> > > > has
> > > > > the
> > > > > > > >> >> disadvantage that plain vanilla JMX is a pain to use. We
> > > would
> > > > > need a
> > > > > > > >> >>bunch
> > > > > > > >> >> of helper code for maintaining counters to make this
> > > > reasonable.
> > > > > > > >> >>
> > > > > > > >> >> 2. Coda Hale metrics is pretty good and broadly used. It
> > > > > supports JMX
> > > > > > > >> >> output as well as direct output to many other types of
> > > systems.
> > > > > The
> > > > > > > >> >>primary
> > > > > > > >> >> downside we have had with Coda Hale has to do with the
> > > clients
> > > > > and
> > > > > > > >> >>library
> > > > > > > >> >> incompatibilities. We are currently on an older more
> > popular
> > > > > version.
> > > > > > > >> >>The
> > > > > > > >> >> newer version is a rewrite of the APIs and is
> incompatible.
> > > > > > > >>Originally
> > > > > > > >> >> these were totally incompatible and people had to choose
> > one
> > > or
> > > > > the
> > > > > > > >> >>other.
> > > > > > > >> >> I think that has been improved so now the new version is
> a
> > > > > totally
> > > > > > > >> >> different package. But even in this case you end up with
> > both
> > > > > > > >>versions
> > > > > > > >> >>if
> > > > > > > >> >> you use Kafka and we are on a different version than you
> > > which
> > > > is
> > > > > > > >>going
> > > > > > > >> >>to
> > > > > > > >> >> be pretty inconvenient.
> > > > > > > >> >>
> > > > > > > >> >> 3. Doing our own has the downside of potentially
> > reinventing
> > > > the
> > > > > > > >>wheel,
> > > > > > > >> >>and
> > > > > > > >> >> potentially needing to work out any bugs in our code. The
> > > > upsides
> > > > > > > >>would
> > > > > > > >> >> depend on the how good the reinvention was. As it
> happens I
> > > > did a
> > > > > > > >>quick
> > > > > > > >> >> (~900 loc) version of a metrics library that is under
> > > > > > > >> >>kafka.common.metrics.
> > > > > > > >> >> I think it has some advantages over the Yammer metrics
> > > package
> > > > > for
> > > > > > > >>our
> > > > > > > >> >> usage beyond just not causing incompatibilities. I will
> > > > describe
> > > > > this
> > > > > > > >> >>code
> > > > > > > >> >> so we can discuss the pros and cons. Although I favor
> this
> > > > > approach I
> > > > > > > >> >>have
> > > > > > > >> >> no emotional attachment and wouldn't be too sad if I
> ended
> > up
> > > > > > > >>deleting
> > > > > > > >> >>it.
> > > > > > > >> >> Here are javadocs for this code, though I haven't written
> > > much
> > > > > > > >> >> documentation yet since I might end up deleting it:
> > > > > > > >> >>
> > > > > > > >> >> Here is a quick overview of this library.
> > > > > > > >> >>
> > > > > > > >> >> There are three main public interfaces:
> > > > > > > >> >>   Metrics - This is a repository of metrics being
> tracked.
> > > > > > > >> >>   Metric - A single, named numerical value being measured
> > > > (i.e. a
> > > > > > > >> >>counter).
> > > > > > > >> >>   Sensor - This is a thing that records values and
> updates
> > > zero
> > > > > or
> > > > > > > >>more
> > > > > > > >> >> metrics
> > > > > > > >> >>
> > > > > > > >> >> So let's say we want to track three values about message
> > > sizes;
> > > > > > > >> >> specifically say we want to record the average, the
> > maximum,
> > > > the
> > > > > > > >>total
> > > > > > > >> >>rate
> > > > > > > >> >> of bytes being sent, and a count of messages. Then we
> would
> > > do
> > > > > > > >>something
> > > > > > > >> >> like this:
> > > > > > > >> >>
> > > > > > > >> >>    // setup code
> > > > > > > >> >>    Metrics metrics = new Metrics(); // this is a global
> > > > > "singleton"
> > > > > > > >> >>    Sensor sensor =
> > > > > metrics.sensor("kafka.producer.message.sizes");
> > > > > > > >> >>    sensor.add("kafka.producer.message-size.avg", new
> > Avg());
> > > > > > > >> >>    sensor.add("kafka.producer.message-size.max", new
> > Max());
> > > > > > > >> >>    sensor.add("kafka.producer.bytes-sent-per-sec", new
> > > Rate());
> > > > > > > >> >>    sensor.add("kafka.producer.message-count", new
> Count());
> > > > > > > >> >>
> > > > > > > >> >>    // now when we get a message we do this
> > > > > > > >> >>    sensor.record(messageSize);
> > > > > > > >> >>
> > > > > > > >> >> The above code creates the global metrics repository,
> > > creates a
> > > > > > > >>single
> > > > > > > >> >> Sensor, and defines 5 named metrics that are updated by
> > that
> > > > > Sensor.
> > > > > > > >> >>
> > > > > > > >> >> Like Yammer Metrics (YM) I allow you to plug in
> > "reporters",
> > > > > > > >>including a
> > > > > > > >> >> JMX reporter. Unlike the Coda Hale JMX reporter the
> > reporter
> > > I
> > > > > have
> > > > > > > >>keys
> > > > > > > >> >> off the metric names not the Sensor names, which I think
> is
> > > an
> > > > > > > >> >> improvement--I just use the convention that the last
> > portion
> > > of
> > > > > the
> > > > > > > >> >>name is
> > > > > > > >> >> the attribute name, the second to last is the mbean name,
> > and
> > > > the
> > > > > > > >>rest
> > > > > > > >> >>is
> > > > > > > >> >> the package. So in the above example there is a producer
> > > mbean
> > > > > that
> > > > > > > >>has
> > > > > > > >> >>a
> > > > > > > >> >> avg and max attribute and a producer mbean that has a
> > > > > > > >>bytes-sent-per-sec
> > > > > > > >> >> and message-count attribute. This is nice because you can
> > > > > logically
> > > > > > > >> >>group
> > > > > > > >> >> the values reported irrespective of where in the program
> > they
> > > > are
> > > > > > > >> >> computed--that is an mbean can logically group attributes
> > > > > computed
> > > > > > > >>off
> > > > > > > >> >> different sensors. This means you can report values by
> > > logical
> > > > > > > >> >>subsystem.
> > > > > > > >> >>
> > > > > > > >> >> I also allow the concept of hierarchical Sensors which I
> > > think
> > > > > is a
> > > > > > > >>good
> > > > > > > >> >> convenience. I have noticed a common pattern in systems
> > where
> > > > you
> > > > > > > >>need
> > > > > > > >> >>to
> > > > > > > >> >> roll up the same values along different dimensions. An
> > simple
> > > > > > > >>example is
> > > > > > > >> >> metrics about qps, data rate, etc on the broker. These we
> > > want
> > > > to
> > > > > > > >> >>capture
> > > > > > > >> >> in aggregate, but also broken down by topic-id. You can
> do
> > > this
> > > > > > > >>purely
> > > > > > > >> >>by
> > > > > > > >> >> defining the sensor hierarchy:
> > > > > > > >> >> Sensor allSizes = metrics.sensor("kafka.producer.sizes");
> > > > > > > >> >> Sensor topicSizes = metrics.sensor("kafka.producer." +
> > topic
> > >  +
> > > > > > > >> >>".sizes",
> > > > > > > >> >> allSizes);
> > > > > > > >> >> Now each actual update will go to the appropriate
> > topicSizes
> > > > > sensor
> > > > > > > >> >>(based
> > > > > > > >> >> on the topic name), but allSizes metrics will get updated
> > > too.
> > > > I
> > > > > also
> > > > > > > >> >> support multiple parents for each sensor as well as
> > multiple
> > > > > layers
> > > > > > > >>of
> > > > > > > >> >> hiearchy, so you can define a more elaborate DAG of
> > sensors.
> > > An
> > > > > > > >>example
> > > > > > > >> >>of
> > > > > > > >> >> how this would be useful is if you wanted to record your
> > > > metrics
> > > > > > > >>broken
> > > > > > > >> >> down by topic AND client id as well as the global
> > aggregate.
> > > > > > > >> >>
> > > > > > > >> >> Each metric can take a configurable Quota value which
> > allows
> > > us
> > > > > to
> > > > > > > >>limit
> > > > > > > >> >> the maximum value of that sensor. This is intended for
> use
> > on
> > > > the
> > > > > > > >> >>server as
> > > > > > > >> >> part of our Quota implementation. The way this works is
> > that
> > > > you
> > > > > > > >>record
> > > > > > > >> >> metrics as usual:
> > > > > > > >> >>    mySensor.record(42.0)
> > > > > > > >> >> However if this event occurance causes one of the metrics
> > to
> > > > > exceed
> > > > > > > >>its
> > > > > > > >> >> maximum allowable value (the quota) this call will throw
> a
> > > > > > > >> >> QuotaViolationException. The cool thing about this is
> that
> > it
> > > > > means
> > > > > > > >>we
> > > > > > > >> >>can
> > > > > > > >> >> define quotas on anything we capture metrics for, which I
> > > think
> > > > > is
> > > > > > > >> >>pretty
> > > > > > > >> >> cool.
> > > > > > > >> >>
> > > > > > > >> >> Another question is how to handle windowing of the
> values?
> > > > > Metrics
> > > > > > > >>want
> > > > > > > >> >>to
> > > > > > > >> >> record the "current" value, but the definition of current
> > is
> > > > > > > >>inherently
> > > > > > > >> >> nebulous. A few of the obvious gotchas are that if you
> > define
> > > > > > > >>"current"
> > > > > > > >> >>to
> > > > > > > >> >> be a number of events you can end up measuring an
> > arbitrarily
> > > > > long
> > > > > > > >> >>window
> > > > > > > >> >> of time if the event rate is low (e.g. you think you are
> > > > getting
> > > > > 50
> > > > > > > >> >> messages/sec because that was the rate yesterday when all
> > > > events
> > > > > > > >> >>topped).
> > > > > > > >> >>
> > > > > > > >> >> Here is how I approach this. All the metrics use the
> > > > > > > >> >> same windowing approach. We define a single window by a
> > > > > > > >> >> length of time or number of values (you can use either or
> > > > > > > >> >> both--if both, the window ends when *either* the time
> > > > > > > >> >> bound or event bound is hit). The typical problem with
> > > > > > > >> >> hard window boundaries is that at the beginning of the
> > > > > > > >> >> window you have no data and the first few samples are too
> > > > > > > >> >> small to be a valid sample. (Consider if you were keeping
> > > > > > > >> >> an avg and the first value in the window happens to be
> > > > > > > >> >> very very high; if you check the avg at this exact time
> > > > > > > >> >> you will conclude the avg is very high, but on a sample
> > > > > > > >> >> size of one.) One simple fix would be to always report
> > > > > > > >> >> the last complete window, however this is not appropriate
> > > > > > > >> >> here because (1) we want to drive quotas off it so it
> > > > > > > >> >> needs to be current, and (2) since this is for monitoring
> > > > > > > >> >> you kind of care more about the current state. The ideal
> > > > > > > >> >> solution here would be to define a backwards-looking
> > > > > > > >> >> sliding window from the present, but many statistics are
> > > > > > > >> >> actually very hard to compute in this model without
> > > > > > > >> >> retaining all the values, which would be hopelessly
> > > > > > > >> >> inefficient. My solution to this is to keep a
> > > > > > > >> >> configurable number of windows (default is two) and
> > > > > > > >> >> combine them for the estimate. So in a two-sample case,
> > > > > > > >> >> depending on when you ask, you have between one and two
> > > > > > > >> >> complete samples worth of data to base the answer off of.
> > > > > > > >> >> Provided the sample window is large enough to get a valid
> > > > > > > >> >> result, this satisfies both of my criteria of
> > > > > > > >> >> incorporating the most recent data and having reasonable
> > > > > > > >> >> variance at all times.
> > > > > > > >> >>
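The combining scheme can be illustrated with a small self-contained sketch
(illustrative names and bookkeeping, not the actual implementation): keep
the last N windows, roll to a fresh one when either bound is hit, and
compute over everything retained:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Illustrative: a rate stat kept over the last N sample windows, so a
    // reading is always based on between one and N windows of data.
    class SampledRate {
        private final long windowMs;    // time bound per window
        private final long eventBound;  // event-count bound per window
        private final int maxSamples;   // windows retained (default two)
        private final Deque<Sample> samples = new ArrayDeque<>(); // newest first

        private static class Sample { double total; long count; long startMs; }

        SampledRate(long windowMs, long eventBound, int maxSamples) {
            this.windowMs = windowMs;
            this.eventBound = eventBound;
            this.maxSamples = maxSamples;
        }

        void record(double value, long nowMs) {
            Sample current = samples.peekFirst();
            // Roll to a new window when *either* the time or event bound is hit.
            if (current == null || nowMs - current.startMs >= windowMs
                    || current.count >= eventBound) {
                current = new Sample();
                current.startMs = nowMs;
                samples.addFirst(current);
                while (samples.size() > maxSamples)
                    samples.removeLast();
            }
            current.total += value;
            current.count++;
        }

        double measure(long nowMs) {
            double total = 0.0;
            long oldestStart = nowMs;
            for (Sample s : samples) {
                total += s.total;
                oldestStart = Math.min(oldestStart, s.startMs);
            }
            // Per-second rate over the full span of the retained windows.
            return total * 1000.0 / Math.max(1, nowMs - oldestStart);
        }
    }
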
> > > > > > > >> >> Another approach is to use an exponential weighting
> > > > > > > >> >> scheme to combine all history but emphasize the recent
> > > > > > > >> >> past. I have not done this as it has a lot of issues for
> > > > > > > >> >> practical operational metrics. I'd be happy to elaborate
> > > > > > > >> >> on this if anyone cares...
> > > > > > > >> >>
> > > > > > > >> >> The window size for metrics has a global default which
> > > > > > > >> >> can be overridden at either the sensor or individual
> > > > > > > >> >> metric level.
> > > > > > > >> >>
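Assuming a MetricConfig holder along these lines (only the MetricConfig
name appears in the message; the builder methods and constructor wiring are
illustrative), the three override levels might be set like this:

    MetricConfig defaults = new MetricConfig()
        .timeWindow(30, TimeUnit.SECONDS)   // global default window
        .samples(2);                        // and default window count
    Metrics metrics = new Metrics(defaults);

    // Sensor-level override: a tighter window for this sensor's metrics.
    Sensor sizes = metrics.sensor("kafka.producer.message.sizes",
        new MetricConfig().timeWindow(10, TimeUnit.SECONDS));

    // Metric-level override: keep more windows for just this one stat.
    sizes.add("kafka.producer.message-size.avg", new Avg(),
        new MetricConfig().samples(4));
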
> > > > > > > >> >> In addition to these time series values the user can
> > > > > > > >> >> directly expose some method of their choosing JMX-style
> > > > > > > >> >> by implementing the Measurable interface and registering
> > > > > > > >> >> that value. E.g.
> > > > > > > >> >>   metrics.addMetric("my.metric", new Measurable() {
> > > > > > > >> >>     public double measure(MetricConfig config, long now) {
> > > > > > > >> >>        return this.calculateValueToExpose();
> > > > > > > >> >>     }
> > > > > > > >> >>   });
> > > > > > > >> >> This is useful for exposing things like the accumulator
> > > > > > > >> >> free memory.
> > > > > > > >> >>
> > > > > > > >> >> The set of metrics is extensible; new metrics can be
> > > > > > > >> >> added by just implementing the appropriate interfaces and
> > > > > > > >> >> registering with a sensor. I implement the following
> > > > > > > >> >> metrics:
> > > > > > > >> >>   total - the sum of all values from the given sensor
> > > > > > > >> >>   count - a windowed count of values from the sensor
> > > > > > > >> >>   avg - the sample average within the windows
> > > > > > > >> >>   max - the max over the windows
> > > > > > > >> >>   min - the min over the windows
> > > > > > > >> >>   rate - the rate in the windows (e.g. the total or count
> > > > > > > >> >>   divided by the elapsed time)
> > > > > > > >> >>   percentiles - a collection of percentiles computed over
> > > > > > > >> >>   the window
> > > > > > > >> >>
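Following the sensor.add() style shown earlier in the thread, wiring up
these stats might look like the sketch below. Avg, Max, Rate and Count
appear earlier in the message; the Total, Min, Percentiles and Percentile
constructors are assumed shapes, for illustration only:

    Sensor latency = metrics.sensor("kafka.producer.request.latency");
    latency.add("kafka.producer.request-latency.total", new Total());
    latency.add("kafka.producer.request-latency.count", new Count());
    latency.add("kafka.producer.request-latency.avg",   new Avg());
    latency.add("kafka.producer.request-latency.max",   new Max());
    latency.add("kafka.producer.request-latency.min",   new Min());
    latency.add("kafka.producer.request-latency.rate",  new Rate());
    // Fixed-range histogram: 1000 buckets over [0.0, 30000.0],
    // exposing the 99th and 99.9th percentiles as named metrics.
    latency.add(new Percentiles(1000, 0.0, 30000.0,
        new Percentile("kafka.producer.request-latency.p99",  99.0),
        new Percentile("kafka.producer.request-latency.p999", 99.9)));
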
> > > > > > > >> >> My approach to percentiles is a little different from
> > > > > > > >> >> the yammer metrics package. My complaint about the yammer
> > > > > > > >> >> metrics approach is that it uses rather expensive
> > > > > > > >> >> sampling and uses kind of a lot of memory to get a
> > > > > > > >> >> reasonable sample. This is problematic for per-topic
> > > > > > > >> >> measurements.
> > > > > > > >> >>
> > > > > > > >> >> Instead I use a fixed range for the histogram (e.g. 0.0
> > > > > > > >> >> to 30000.0) which directly allows you to specify the
> > > > > > > >> >> desired memory use. Any value below the minimum is
> > > > > > > >> >> recorded as -Infinity and any value above the maximum as
> > > > > > > >> >> +Infinity. I think this is okay as all metrics have an
> > > > > > > >> >> expected range except for latency, which can be
> > > > > > > >> >> arbitrarily large, but for very high latency there is no
> > > > > > > >> >> need to model it exactly (e.g. 30 seconds+ really is
> > > > > > > >> >> effectively infinite). Within the range values are
> > > > > > > >> >> recorded in buckets which can be either fixed width or
> > > > > > > >> >> increasing width. The increasing width is analogous to
> > > > > > > >> >> the idea of significant figures; that is, if your value
> > > > > > > >> >> is in the range 0-10 you might want to be accurate to
> > > > > > > >> >> within 1ms, but if it is 20000 there is no need to be so
> > > > > > > >> >> accurate. I implemented a linear bucket size where the
> > > > > > > >> >> Nth bucket has width proportional to N. An exponential
> > > > > > > >> >> bucket size would also be sensible and could likely be
> > > > > > > >> >> derived directly from the floating point representation
> > > > > > > >> >> of the value.
> > > > > > > >> >>
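The linear-width scheme has a closed form, so the bucket for a value can be
found in O(1) rather than by scanning. A sketch (illustrative, not the
actual code): if bucket n has width c*n and B buckets must cover [0, max],
then c = 2*max / (B*(B+1)) and the upper edge of bucket n is c*n*(n+1)/2,
which inverts with the quadratic formula:

    // Map a value to its 0-based bucket index under linear bucket widths.
    // Out-of-range values map to the -Infinity/+Infinity buckets above.
    int toBucket(double value, int numBuckets, double max) {
        if (value < 0.0)
            return -1;                    // stands in for the -Infinity bucket
        if (value > max)
            return numBuckets;            // stands in for the +Infinity bucket
        double c = 2.0 * max / (numBuckets * (numBuckets + 1.0));
        // Smallest integer n with c * n * (n + 1) / 2 >= value.
        int n = (int) Math.ceil((-1.0 + Math.sqrt(1.0 + 8.0 * value / c)) / 2.0);
        return Math.min(Math.max(n, 1), numBuckets) - 1;
    }
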
> > > > > > > >> >> I'd like to get some feedback on this metrics code and
> > > > > > > >> >> make a decision on whether we want to use it before I
> > > > > > > >> >> actually go ahead and add all the instrumentation in the
> > > > > > > >> >> code (otherwise I'll have to redo it if we switch
> > > > > > > >> >> approaches). So the next topic of discussion will be
> > > > > > > >> >> which actual metrics to add.
> > > > > > > >> >> -Jay

Re: Metrics in new producer

Posted by Clark Breyman <cl...@breyman.com>.
Not requiring the client to link against Coda/Yammer metrics sounds like a
compelling reason to pivot to new interfaces. If that's the agreed
direction, I'm hoping we'd get a choice of which backend to provide (e.g. a
facade on Yammer metrics for those with an investment in it) rather than be
forced onto the new backend. Having a metrics factory seems better for this
than directly instantiating the singleton registry.
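
As a sketch of that idea (all names on the new-metrics side --
MetricsReporter, metricAdded, Metric.value() -- are assumed shapes, not the
actual API; the Yammer side uses the 2.x com.yammer.metrics classes), a
Yammer-backed facade could read each new-style metric through a gauge:

    // Hypothetical bridge: publish each new-style metric as a Yammer 2.x
    // gauge so existing Yammer-based reporting keeps working.
    public class YammerBridgeReporter implements MetricsReporter {
        private final com.yammer.metrics.core.MetricsRegistry registry =
            com.yammer.metrics.Metrics.defaultRegistry();

        @Override
        public void metricAdded(final Metric metric) {
            registry.newGauge(
                new com.yammer.metrics.core.MetricName("kafka", "client",
                                                       metric.name()),
                new com.yammer.metrics.core.Gauge<Double>() {
                    @Override
                    public Double value() {
                        return metric.value(); // read-through on each poll
                    }
                });
        }
    }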


On Thu, Feb 13, 2014 at 2:39 PM, Joe Stein <jo...@stealth.ly> wrote:

> Can we leave metrics and have multiple supported KafkaMetricsGroup
> implementing a yammer based implementation?
>
> ProducerRequestStats with your configured analytics group?

Re: Metrics in new producer

Posted by Joe Stein <jo...@stealth.ly>.
Can we leave the existing metrics in place and support multiple
KafkaMetricsGroup implementations, including a Yammer-based one?

E.g. ProducerRequestStats wired to your configured analytics group?

On Thu, Feb 13, 2014 at 11:37 AM, Jay Kreps <ja...@gmail.com> wrote:

> I think we discussed the scala/java stuff more fully previously.
> Essentially the client is embedded everywhere. Scala is very incompatible
> with itself so this makes it very hard to use for people using anything
> else in scala. Also Scala stack traces are very confusing. Basically we
> thought plain java code would be a lot easier for people to use. Even if
> Scala is more fun to write, that isn't really what we are optimizing for.
>
> -Jay
>
>
> On Thu, Feb 13, 2014 at 8:09 AM, S Ahmed <sa...@gmail.com> wrote:
>
> > Jay, pretty impressive how you just write a 'quick version' like that :)
> > Not to get off-topic but why didn't you write this in scala?
> >
> >
> >
> > On Wed, Feb 12, 2014 at 6:54 PM, Joel Koshy <jj...@gmail.com> wrote:
> >
> > > I have not had a chance to review the new metrics code and its
> > > features carefully (apart from your write-up), but here are my general
> > > thoughts:
> > >
> > > Implementing a metrics package correctly is difficult; more so for
> > > people like me, because I'm not a statistician.  However, if this new
> > > package: {(i) functions correctly (and we need to define and prove
> > > correctness), (ii) is easy to use, (iii) serves all our current and
> > > anticipated monitoring needs, (iv) is not overly complex that it
> > > becomes a burden to maintain and we are better of with an available
> > > library;} then I think it makes sense to embed it and use it within
> > > the Kafka code. The main wins are: (i) predictability (no changing
> > > APIs and intimate knowledge of the code) and (ii) control with respect
> > > to both functionality (e.g., there are hard-coded decay constants in
> > > metrics-core 2.x) and correctness (i.e., if we find a bug in the
> > > metrics package we have to submit a pull request and wait for it to
> > > become mainstream).  I'm not sure it would help very much to pull it
> > > into a separate repo because that could potentially annul these
> > > benefits.
> > >
> > > Joel
> > >
> > > On Wed, Feb 12, 2014 at 02:50:43PM -0800, Jay Kreps wrote:
> > > > Sriram,
> > > >
> > > > Makes sense. I am cool moving this stuff into its own repo if people
> > > think
> > > > that is better. I'm not sure it would get much contribution but when
> I
> > > > started messing with this I did have a lot of grand ideas of making
> > > adding
> > > > metrics to a sensor dynamic so you could add more stuff in
> > real-time(via
> > > > jmx, say) and/or externalize all your metrics and config to a
> separate
> > > file
> > > > like log4j with only the points of instrumentation hard-coded.
> > > >
> > > > -Jay
> > > >
> > > >
> > > > On Wed, Feb 12, 2014 at 2:07 PM, Sriram Subramanian <
> > > > srsubramanian@linkedin.com> wrote:
> > > >
> > > > > I am actually neutral to this change. I found the replies were more
> > > > > towards the implementation and features so far. I would like the
> > > community
> > > > > to think about the questions below before making a decision. My
> > > opinion on
> > > > > this is that it has potential to be its own project and it would
> > > attract
> > > > > developers who are specifically interested in contributing to
> > metrics.
> > > I
> > > > > am skeptical that the Kafka contributors would focus on improving
> > this
> > > > > library (apart from bug fixes) instead of developing/contributing
> to
> > > other
> > > > > core pieces. It would be useful to continue and keep it decoupled
> > from
> > > > > rest of Kafka (if it resides in the Kafka code base.) so that we
> can
> > > move
> > > > > it out anytime to its own project.
> > > > >
> > > > >
> > > > > On 2/12/14 1:21 PM, "Jay Kreps" <ja...@gmail.com> wrote:
> > > > >
> > > > > >Hey Sriram,
> > > > > >
> > > > > >Not sure if these are actually meant as questions or more veiled
> > > comments.
> > > > > >In an case I tried to give my 2 cents inline.
> > > > > >
> > > > > >On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian <
> > > > > >srsubramanian@linkedin.com> wrote:
> > > > > >
> > > > > >> I think answering the questions below would help to make a
> better
> > > > > >> decision. I am all for writing better code and having superior
> > > > > >> functionalities but it is worth thinking about stuff outside
> just
> > > code
> > > > > >>in
> > > > > >> this case -
> > > > > >>
> > > > > >> 1. Does metric form a core piece of kafka? Does it help kafka
> > > greatly in
> > > > > >> providing better core functionalities? I would always like a
> > > project to
> > > > > >>do
> > > > > >> one thing really well. Metrics is a non trivial amount of code.
> > > > > >>
> > > > > >
> > > > > >Metrics are obviously important, and obviously improving our
> metrics
> > > > > >system
> > > > > >would be good. That said this may or may not be better, and even
> if
> > > it is
> > > > > >better that betterness might not outweigh other considerations.
> That
> > > is
> > > > > >what we are discussing.
> > > > > >
> > > > > >
> > > > > >> 2. Does it make sense to be part of Kafka or its own project? If
> > > this
> > > > > >> metrics library has the potential to be better than
> metrics-core,
> > I
> > > > > >>would
> > > > > >> be interested in other projects take advantage of it.
> > > > > >>
> > > > > >
> > > > > >It could be either.
> > > > > >
> > > > > >3. Can Kafka maintain this library as new members join and old
> > members
> > > > > >> leave? Would this be a piece of code that no one (in Kafka) in
> the
> > > > > >>future
> > > > > >> spends time improving if the original author left?
> > > > > >>
> > > > > >
> > > > > >I am not going anywhere in the near term, but if I did, yes, this
> > > would be
> > > > > >like any other code we have. As with yammer metrics or any other
> > code
> > > at
> > > > > >that point we would either use it as is or someone would improve
> it.
> > > > > >
> > > > > >
> > > > > >> 4. Does it affect the schedule of producer rewrite? This needs
> its
> > > own
> > > > > >> stabilization and modification to existing metric dashboards if
> > the
> > > > > >>format
> > > > > >> is changed. Many times such cost are not factored in and a
> project
> > > loses
> > > > > >> time before realizing the extra time required to make a library
> as
> > > this
> > > > > >> operational.
> > > > > >>
> > > > > >
> > > > > >Probably not. The metrics are going to change regardless of
> whether
> > > we use
> > > > > >the same library or not. If we think this is better I don't mind
> > > putting
> > > > > >in
> > > > > >a little extra effort to get there.
> > > > > >
> > > > > >Irrespective I think this is probably not the right thing to
> > optimize
> > > for.
> > > > > >
> > > > > >
> > > > > >> I am sure we can do better when we write code to a specific use
> > > case (in
> > > > > >> this case, kafka) rather than building a generic library that
> > suits
> > > all
> > > > > >> (metrics-core) but I would like us to have answers to the
> > questions
> > > > > >>above
> > > > > >> and be prepared before we proceed to support this with the
> > producer
> > > > > >> rewrite.
> > > > > >
> > > > > >
> > > > > >Naturally we are all considering exactly these things, that is
> > > exactly the
> > > > > >reason I started the thread.
> > > > > >
> > > > > >-Jay
> > > > > >
> > > > > >
> > > > > >> On 2/11/14 6:28 PM, "Jun Rao" <ju...@gmail.com> wrote:
> > > > > >>
> > > > > >> >Thanks for the detailed write-up. It's well thought through. A
> > few
> > > > > >> >comments:
> > > > > >> >
> > > > > >> >1. I have a couple of concerns on the percentiles. The first
> > issue
> > > is
> > > > > >>that
> > > > > >> >It requires the user to know the value range. Since the range
> for
> > > > > >>things
> > > > > >> >like message size (in millions) is quite different from those
> > like
> > > > > >>request
> > > > > >> >time (less than 100), it's going to be hard to pick a good
> global
> > > > > >>default
> > > > > >> >range. Different apps could be dealing with different message
> > > size. So
> > > > > >> >they
> > > > > >> >probably will have to customize the range. Another issue is
> that
> > > it can
> > > > > >> >only report values at the bucket boundaries. So, if you have
> 1000
> > > > > >>buckets
> > > > > >> >and a value range of 1 million, you will only see 1000 possible
> > > values
> > > > > >>as
> > > > > >> >the quantile, which is probably too sparse. The implementation
> of
> > > > > >> >histogram
> > > > > >> >in metrics-core keeps a fix size of samples, which avoids both
> > > issues.
> > > > > >> >
> > > > > >> >2. We need to document the 3-part metrics names better since
> it's
> > > not
> > > > > >> >obvious what the convention is. Also, currently the name of the
> > > sensor
> > > > > >>and
> > > > > >> >the metrics defined in it are independent. Would it make sense
> to
> > > have
> > > > > >>the
> > > > > >> >sensor name be a prefix of the metric name?
> > > > > >> >
> > > > > >> >Overall, this approach seems to be cleaner than metrics-core by
> > > > > >>decoupling
> > > > > >> >measuring and reporting. The main benefit of metrics-core seems
> > to
> > > be
> > > > > >>the
> > > > > >> >existing reporters. Since not that many people voted for
> > > metrics-core,
> > > > > >>I
> > > > > >> >am
> > > > > >> >ok with going with the new implementation. My only
> recommendation
> > > is to
> > > > > >> >address the concern on percentiles.
> > > > > >> >
> > > > > >> >Thanks,
> > > > > >> >
> > > > > >> >Jun
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <
> jay.kreps@gmail.com>
> > > > > wrote:
> > > > > >> >
> > > > > >> >> Hey guys,
> > > > > >> >>
> > > > > >> >> I wanted to kick off a quick discussion of metrics with
> respect
> > > to
> > > > > >>the
> > > > > >> >>new
> > > > > >> >> producer and consumer (and potentially the server).
> > > > > >> >>
> > > > > >> >> At a high level I think there are three approaches we could
> > take:
> > > > > >> >> 1. Plain vanilla JMX
> > > > > >> >> 2. Use Coda Hale (AKA Yammer) Metrics
> > > > > >> >> 3. Do our own metrics (with JMX as one output)
> > > > > >> >>
> > > > > >> >> 1. Has the advantage that JMX is the most commonly used java
> > > thing
> > > > > >>and
> > > > > >> >> plugs in reasonably to most metrics systems. JMX is included
> in
> > > the
> > > > > >>JDK
> > > > > >> >>so
> > > > > >> >> it doesn't impose any additional dependencies on clients. It
> > has
> > > the
> > > > > >> >> disadvantage that plain vanilla JMX is a pain to use. We
> would
> > > need a
> > > > > >> >>bunch
> > > > > >> >> of helper code for maintaining counters to make this
> > reasonable.
> > > > > >> >>
> > > > > >> >> 2. Coda Hale metrics is pretty good and broadly used. It
> > > supports JMX
> > > > > >> >> output as well as direct output to many other types of
> systems.
> > > The
> > > > > >> >>primary
> > > > > >> >> downside we have had with Coda Hale has to do with the
> clients
> > > and
> > > > > >> >>library
> > > > > >> >> incompatibilities. We are currently on an older more popular
> > > version.
> > > > > >> >>The
> > > > > >> >> newer version is a rewrite of the APIs and is incompatible.
> > > > > >>Originally
> > > > > >> >> these were totally incompatible and people had to choose one
> or
> > > the
> > > > > >> >>other.
> > > > > >> >> I think that has been improved so now the new version is a
> > > totally
> > > > > >> >> different package. But even in this case you end up with both
> > > > > >>versions
> > > > > >> >>if
> > > > > >> >> you use Kafka and we are on a different version than you
> which
> > is
> > > > > >>going
> > > > > >> >>to
> > > > > >> >> be pretty inconvenient.
> > > > > >> >>
> > > > > >> >> 3. Doing our own has the downside of potentially reinventing
> > the
> > > > > >>wheel,
> > > > > >> >>and
> > > > > >> >> potentially needing to work out any bugs in our code. The
> > upsides
> > > > > >>would
> > > > > >> >> depend on the how good the reinvention was. As it happens I
> > did a
> > > > > >>quick
> > > > > >> >> (~900 loc) version of a metrics library that is under
> > > > > >> >>kafka.common.metrics.
> > > > > >> >> I think it has some advantages over the Yammer metrics
> package
> > > for
> > > > > >>our
> > > > > >> >> usage beyond just not causing incompatibilities. I will
> > describe
> > > this
> > > > > >> >>code
> > > > > >> >> so we can discuss the pros and cons. Although I favor this
> > > approach I
> > > > > >> >>have
> > > > > >> >> no emotional attachment and wouldn't be too sad if I ended up
> > > > > >>deleting
> > > > > >> >>it.
> > > > > >> >> Here are javadocs for this code, though I haven't written
> much
> > > > > >> >> documentation yet since I might end up deleting it:
> > > > > >> >>
> > > > > >> >> Here is a quick overview of this library.
> > > > > >> >>
> > > > > >> >> There are three main public interfaces:
> > > > > >> >>   Metrics - This is a repository of metrics being tracked.
> > > > > >> >>   Metric - A single, named numerical value being measured
> > (i.e. a
> > > > > >> >>counter).
> > > > > >> >>   Sensor - This is a thing that records values and updates
> zero
> > > or
> > > > > >>more
> > > > > >> >> metrics
> > > > > >> >>
> > > > > >> >> So let's say we want to track three values about message
> sizes;
> > > > > >> >> specifically say we want to record the average, the maximum,
> > the
> > > > > >>total
> > > > > >> >>rate
> > > > > >> >> of bytes being sent, and a count of messages. Then we would
> do
> > > > > >>something
> > > > > >> >> like this:
> > > > > >> >>
> > > > > >> >>    // setup code
> > > > > >> >>    Metrics metrics = new Metrics(); // this is a global
> > > "singleton"
> > > > > >> >>    Sensor sensor =
> > > metrics.sensor("kafka.producer.message.sizes");
> > > > > >> >>    sensor.add("kafka.producer.message-size.avg", new Avg());
> > > > > >> >>    sensor.add("kafka.producer.message-size.max", new Max());
> > > > > >> >>    sensor.add("kafka.producer.bytes-sent-per-sec", new
> Rate());
> > > > > >> >>    sensor.add("kafka.producer.message-count", new Count());
> > > > > >> >>
> > > > > >> >>    // now when we get a message we do this
> > > > > >> >>    sensor.record(messageSize);
> > > > > >> >>
> > > > > >> >> The above code creates the global metrics repository,
> creates a
> > > > > >>single
> > > > > >> >> Sensor, and defines 5 named metrics that are updated by that
> > > Sensor.
> > > > > >> >>
> > > > > >> >> Like Yammer Metrics (YM) I allow you to plug in "reporters",
> > > > > >>including a
> > > > > >> >> JMX reporter. Unlike the Coda Hale JMX reporter the reporter
> I
> > > have
> > > > > >>keys
> > > > > >> >> off the metric names not the Sensor names, which I think is
> an
> > > > > >> >> improvement--I just use the convention that the last portion
> of
> > > the
> > > > > >> >>name is
> > > > > >> >> the attribute name, the second to last is the mbean name, and
> > the
> > > > > >>rest
> > > > > >> >>is
> > > > > >> >> the package. So in the above example there is a producer
> mbean
> > > that
> > > > > >>has
> > > > > >> >>a
> > > > > >> >> avg and max attribute and a producer mbean that has a
> > > > > >>bytes-sent-per-sec
> > > > > >> >> and message-count attribute. This is nice because you can
> > > logically
> > > > > >> >>group
> > > > > >> >> the values reported irrespective of where in the program they
> > are
> > > > > >> >> computed--that is an mbean can logically group attributes
> > > computed
> > > > > >>off
> > > > > >> >> different sensors. This means you can report values by
> logical
> > > > > >> >>subsystem.
> > > > > >> >>
> > > > > >> >> I also allow the concept of hierarchical Sensors which I
> think
> > > is a
> > > > > >>good
> > > > > >> >> convenience. I have noticed a common pattern in systems where
> > you
> > > > > >>need
> > > > > >> >>to
> > > > > >> >> roll up the same values along different dimensions. An simple
> > > > > >>example is
> > > > > >> >> metrics about qps, data rate, etc on the broker. These we
> want
> > to
> > > > > >> >>capture
> > > > > >> >> in aggregate, but also broken down by topic-id. You can do
> this
> > > > > >>purely
> > > > > >> >>by
> > > > > >> >> defining the sensor hierarchy:
> > > > > >> >> Sensor allSizes = metrics.sensor("kafka.producer.sizes");
> > > > > >> >> Sensor topicSizes = metrics.sensor("kafka.producer." + topic
>  +
> > > > > >> >>".sizes",
> > > > > >> >> allSizes);
> > > > > >> >> Now each actual update will go to the appropriate topicSizes
> > > sensor
> > > > > >> >>(based
> > > > > >> >> on the topic name), but allSizes metrics will get updated
> too.
> > I
> > > also
> > > > > >> >> support multiple parents for each sensor as well as multiple
> > > layers
> > > > > >>of
> > > > > >> >> hiearchy, so you can define a more elaborate DAG of sensors.
> An
> > > > > >>example
> > > > > >> >>of
> > > > > >> >> how this would be useful is if you wanted to record your
> > metrics
> > > > > >>broken
> > > > > >> >> down by topic AND client id as well as the global aggregate.
> > > > > >> >>
> > > > > >> >> Each metric can take a configurable Quota value which allows
> us
> > > to
> > > > > >>limit
> > > > > >> >> the maximum value of that sensor. This is intended for use on
> > the
> > > > > >> >>server as
> > > > > >> >> part of our Quota implementation. The way this works is that
> > you
> > > > > >>record
> > > > > >> >> metrics as usual:
> > > > > >> >>    mySensor.record(42.0)
> > > > > >> >> However if this event occurance causes one of the metrics to
> > > exceed
> > > > > >>its
> > > > > >> >> maximum allowable value (the quota) this call will throw a
> > > > > >> >> QuotaViolationException. The cool thing about this is that it
> > > means
> > > > > >>we
> > > > > >> >>can
> > > > > >> >> define quotas on anything we capture metrics for, which I
> think
> > > is
> > > > > >> >>pretty
> > > > > >> >> cool.
> > > > > >> >>
> > > > > >> >> Another question is how to handle windowing of the values?
> > > Metrics
> > > > > >>want
> > > > > >> >>to
> > > > > >> >> record the "current" value, but the definition of current is
> > > > > >>inherently
> > > > > >> >> nebulous. A few of the obvious gotchas are that if you define
> > > > > >>"current"
> > > > > >> >>to
> > > > > >> >> be a number of events you can end up measuring an arbitrarily
> > > long
> > > > > >> >>window
> > > > > >> >> of time if the event rate is low (e.g. you think you are
> > getting
> > > 50
> > > > > >> >> messages/sec because that was the rate yesterday when all
> > events
> > > > > >> >>topped).
> > > > > >> >>
> > > > > >> >> Here is how I approach this. All the metrics use the same
> > > windowing
> > > > > >> >> approach. We define a single window by a length of time or
> > > number of
> > > > > >> >>values
> > > > > >> >> (you can use either or both--if both the window ends when
> > > *either*
> > > > > >>the
> > > > > >> >>time
> > > > > >> >> bound or event bound is hit). The typical problem with hard
> > > window
> > > > > >> >> boundaries is that at the beginning of the window you have no
> > > data
> > > > > >>and
> > > > > >> >>the
> > > > > >> >> first few samples are too small to be a valid sample.
> (Consider
> > > if
> > > > > >>you
> > > > > >> >>were
> > > > > >> >> keeping an avg and the first value in the window happens to
> be
> > > very
> > > > > >>very
> > > > > >> >> high, if you check the avg at this exact time you will
> conclude
> > > the
> > > > > >>avg
> > > > > >> >>is
> > > > > >> >> very high but on a sample size of one). One simple fix would
> be
> > > to
> > > > > >> >>always
> > > > > >> >> report the last complete window, however this is not
> > appropriate
> > > here
> > > > > >> >> because (1) we want to drive quotas off it so it needs to be
> > > current,
> > > > > >> >>and
> > > > > >> >> (2) since this is for monitoring you kind of care more about
> > the
> > > > > >>current
> > > > > >> >> state. The ideal solution here would be to define a backwards
> > > looking
> > > > > >> >> sliding window from the present, but many statistics are
> > actually
> > > > > >>very
> > > > > >> >>hard
> > > > > >> >> to compute in this model without retaining all the values
> which
> > > > > >>would be
> > > > > >> >> hopelessly inefficient. My solution to this is to keep a
> > > configurable
> > > > > >> >> number of windows (default is two) and combine them for the
> > > estimate.
> > > > > >> >>So in
> > > > > >> >> a two sample case depending on when you ask you have between
> > one
> > > and
> > > > > >>two
> > > > > >> >> complete samples worth of data to base the answer off of.
> > > Provided
> > > > > >>the
> > > > > >> >> sample window is large enough to get a valid result this
> > > satisfies
> > > > > >>both
> > > > > >> >>of
> > > > > >> >> my criteria of incorporating the most recent data and having
> > > > > >>reasonable
> > > > > >> >> variance at all times.
> > > > > >> >>
> > > > > >> >> Another approach is to use an exponential weighting scheme to
> > > combine
> > > > > >> >>all
> > > > > >> >> history but emphasize the recent past. I have not done this
> as
> > it
> > > > > >>has a
> > > > > >> >>lot
> > > > > >> >> of issues for practical operational metrics. I'd be happy to
> > > > > >>elaborate
> > > > > >> >>on
> > > > > >> >> this if anyone cares...
> > > > > >> >>
> > > > > >> >> The window size for metrics has a global default which can be
> > > > > >> >>overridden at
> > > > > >> >> either the sensor or individual metric level.
> > > > > >> >>
> > > > > >> >> In addition to these time series values the user can directly
> > > expose
> > > > > >> >>some
> > > > > >> >> method of their choosing JMX-style by implementing the
> > Measurable
> > > > > >> >>interface
> > > > > >> >> and registering that value. E.g.
> > > > > >> >>   metrics.addMetric("my.metric", new Measurable() {
> > > > > >> >>     public double measure(MetricConfg config, long now) {
> > > > > >> >>        return this.calculateValueToExpose();
> > > > > >> >>     }
> > > > > >> >>   });
> > > > > >> >> This is useful for exposing things like the accumulator free
> > > memory.
> > > > > >> >>
> > > > > >> >> The set of metrics is extensible, new metrics can be added by
> > > just
> > > > > >> >> implementing the appropriate interfaces and registering with
> a
> > > > > >>sensor. I
> > > > > >> >> implement the following metrics:
> > > > > >> >>   total - the sum of all values from the given sensor
> > > > > >> >>   count - a windowed count of values from the sensor
> > > > > >> >>   avg - the sample average within the windows
> > > > > >> >>   max - the max over the windows
> > > > > >> >>   min - the min over the windows
> > > > > >> >>   rate - the rate in the windows (e.g. the total or count
> > > divided by
> > > > > >>the
> > > > > >> >> ellapsed time)
> > > > > >> >>   percentiles - a collection of percentiles computed over the
> > > window
> > > > > >> >>
> > > > > >> >> My approach to percentiles is a little different from the yammer
> > > > > >> >> metrics package. My complaint about the yammer metrics approach is
> > > > > >> >> that it uses rather expensive sampling and a fair amount of memory
> > > > > >> >> to get a reasonable sample. This is problematic for per-topic
> > > > > >> >> measurements.
> > > > > >> >>
> > > > > >> >> Instead I use a fixed range for the histogram (e.g. 0.0 to 30000.0)
> > > > > >> >> which directly allows you to specify the desired memory use. Any
> > > > > >> >> value below the minimum is recorded as -Infinity and any value above
> > > > > >> >> the maximum as +Infinity. I think this is okay as all metrics have
> > > > > >> >> an expected range except for latency, which can be arbitrarily
> > > > > >> >> large; but for very high latency there is no need to model it
> > > > > >> >> exactly (e.g. 30 seconds or more really is effectively infinite).
> > > > > >> >> Within the range, values are recorded in buckets which can be either
> > > > > >> >> fixed width or increasing width. The increasing width is analogous
> > > > > >> >> to the idea of significant figures: if your value is in the range
> > > > > >> >> 0-10 you might want to be accurate to within 1ms, but if it is 20000
> > > > > >> >> there is no need to be so accurate. I implemented a linear bucket
> > > > > >> >> size where the Nth bucket has width proportional to N. An
> > > > > >> >> exponential bucket size would also be sensible and could likely be
> > > > > >> >> derived directly from the floating point representation of the
> > > > > >> >> value.
> > > > > >> >>
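> > > > > >> >> A sketch of how the linear bucket sizing can map a value to a
> > > > > >> >> bucket index (hypothetical code; bins and max are the configured
> > > > > >> >> bucket count and histogram range):
> > > > > >> >>   // bucket n has width proportional to n, so the upper boundary of
> > > > > >> >>   // bucket n is s * n * (n + 1) / 2, with s scaling the sum of all
> > > > > >> >>   // widths to the range [0, max]
> > > > > >> >>   public int toBucket(double x) {
> > > > > >> >>     if (x < 0.0)
> > > > > >> >>       return 0;            // below range: the -Infinity bucket
> > > > > >> >>     if (x >= max)
> > > > > >> >>       return bins - 1;     // above range: the +Infinity bucket
> > > > > >> >>     double s = max / (bins * (bins + 1) / 2.0);
> > > > > >> >>     // smallest integer n with s * n * (n + 1) / 2 >= x
> > > > > >> >>     int n = (int) Math.ceil((-1.0 + Math.sqrt(1.0 + 8.0 * x / s)) / 2.0);
> > > > > >> >>     return Math.min(n, bins - 1);
> > > > > >> >>   }
> > > > > >> >>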
> > > > > >> >> I'd like to get some feedback on this metrics code and make a
> > > > > >> >> decision on whether we want to use it before I actually go ahead and
> > > > > >> >> add all the instrumentation in the code (otherwise I'll have to redo
> > > > > >> >> it if we switch approaches). So the next topic of discussion will be
> > > > > >> >> which actual metrics to add.
> > > > > >> >>
> > > > > >> >> -Jay
> > > > > >> >>
> > > > > >>
> > > > > >>
> > > > >
> > > > >
> > >
> > >
> >
>

Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
I think we discussed the scala/java question more fully previously.
Essentially the client is embedded everywhere, and Scala is
binary-incompatible across its own versions, which makes a Scala client
very hard to use for people running anything else in Scala. Scala stack
traces are also very confusing. Basically we thought plain java code
would be a lot easier for people to use. Even if Scala is more fun to
write, that isn't really what we are optimizing for.

-Jay


On Thu, Feb 13, 2014 at 8:09 AM, S Ahmed <sa...@gmail.com> wrote:

> Jay, pretty impressive how you just write a 'quick version' like that :)
> Not to get off-topic but why didn't you write this in scala?

Re: Metrics in new producer

Posted by S Ahmed <sa...@gmail.com>.
Jay, pretty impressive how you just write a 'quick version' like that :)
Not to get off-topic but why didn't you write this in scala?

Re: Metrics in new producer

Posted by Joel Koshy <jj...@gmail.com>.
I have not had a chance to review the new metrics code and its
features carefully (apart from your write-up), but here are my general
thoughts:

Implementing a metrics package correctly is difficult; more so for
people like me, because I'm not a statistician.  However, if this new
package (i) functions correctly (and we need to define and prove
correctness), (ii) is easy to use, (iii) serves all our current and
anticipated monitoring needs, and (iv) is not so complex that it
becomes a burden to maintain and we would be better off with an
available library, then I think it makes sense to embed it and use it
within the Kafka code. The main wins are: (i) predictability (no
changing APIs and intimate knowledge of the code) and (ii) control with
respect to both functionality (e.g., there are hard-coded decay
constants in metrics-core 2.x) and correctness (i.e., with an external
library, if we find a bug we have to submit a pull request and wait for
the fix to become mainstream).  I'm not sure it would help very much to
pull it into a separate repo because that could potentially annul these
benefits.

Joel

On Wed, Feb 12, 2014 at 02:50:43PM -0800, Jay Kreps wrote:
> Sriram,
> 
> Makes sense. I am cool moving this stuff into its own repo if people think
> that is better. I'm not sure it would get much contribution but when I
> started messing with this I did have a lot of grand ideas of making adding
> metrics to a sensor dynamic so you could add more stuff in real-time(via
> jmx, say) and/or externalize all your metrics and config to a separate file
> like log4j with only the points of instrumentation hard-coded.
> 
> -Jay
> 
> 
> On Wed, Feb 12, 2014 at 2:07 PM, Sriram Subramanian <
> srsubramanian@linkedin.com> wrote:
> 
> > I am actually neutral to this change. I found the replies were more
> > towards the implementation and features so far. I would like the community
> > to think about the questions below before making a decision. My opinion on
> > this is that it has potential to be its own project and it would attract
> > developers who are specifically interested in contributing to metrics. I
> > am skeptical that the Kafka contributors would focus on improving this
> > library (apart from bug fixes) instead of developing/contributing to other
> > core pieces. It would be useful to continue and keep it decoupled from
> > rest of Kafka (if it resides in the Kafka code base.) so that we can move
> > it out anytime to its own project.
> >
> >
> > On 2/12/14 1:21 PM, "Jay Kreps" <ja...@gmail.com> wrote:
> >
> > >Hey Sriram,
> > >
> > >Not sure if these are actually meant as questions or more veiled comments.
> > >In an case I tried to give my 2 cents inline.
> > >
> > >On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian <
> > >srsubramanian@linkedin.com> wrote:
> > >
> > >> I think answering the questions below would help to make a better
> > >> decision. I am all for writing better code and having superior
> > >> functionalities but it is worth thinking about stuff outside just code
> > >>in
> > >> this case -
> > >>
> > >> 1. Does metric form a core piece of kafka? Does it help kafka greatly in
> > >> providing better core functionalities? I would always like a project to
> > >>do
> > >> one thing really well. Metrics is a non trivial amount of code.
> > >>
> > >
> > >Metrics are obviously important, and obviously improving our metrics
> > >system
> > >would be good. That said this may or may not be better, and even if it is
> > >better that betterness might not outweigh other considerations. That is
> > >what we are discussing.
> > >
> > >
> > >> 2. Does it make sense to be part of Kafka or its own project? If this
> > >> metrics library has the potential to be better than metrics-core, I
> > >>would
> > >> be interested in other projects take advantage of it.
> > >>
> > >
> > >It could be either.
> > >
> > >3. Can Kafka maintain this library as new members join and old members
> > >> leave? Would this be a piece of code that no one (in Kafka) in the
> > >>future
> > >> spends time improving if the original author left?
> > >>
> > >
> > >I am not going anywhere in the near term, but if I did, yes, this would be
> > >like any other code we have. As with yammer metrics or any other code at
> > >that point we would either use it as is or someone would improve it.
> > >
> > >
> > >> 4. Does it affect the schedule of producer rewrite? This needs its own
> > >> stabilization and modification to existing metric dashboards if the
> > >>format
> > >> is changed. Many times such cost are not factored in and a project loses
> > >> time before realizing the extra time required to make a library as this
> > >> operational.
> > >>
> > >
> > >Probably not. The metrics are going to change regardless of whether we use
> > >the same library or not. If we think this is better I don't mind putting
> > >in
> > >a little extra effort to get there.
> > >
> > >Irrespective I think this is probably not the right thing to optimize for.
> > >
> > >
> > >> I am sure we can do better when we write code to a specific use case (in
> > >> this case, kafka) rather than building a generic library that suits all
> > >> (metrics-core) but I would like us to have answers to the questions
> > >>above
> > >> and be prepared before we proceed to support this with the producer
> > >> rewrite.
> > >
> > >
> > >Naturally we are all considering exactly these things, that is exactly the
> > >reason I started the thread.
> > >
> > >-Jay


Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
Sriram,

Makes sense. I am cool moving this stuff into its own repo if people think
that is better. I'm not sure it would get much contribution, but when I
started messing with this I did have a lot of grand ideas, like making the
addition of metrics to a sensor dynamic so you could add more stuff in real
time (via JMX, say) and/or externalizing all your metrics and config to a
separate file, log4j-style, with only the points of instrumentation
hard-coded.
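
A minimal sketch of the "dynamic" half of that idea, using only the
sensor.add() call already described in this thread (the JMX trigger is
hypothetical):

   // an existing sensor from the normal setup code
   Sensor sizes = metrics.sensor("kafka.producer.message.sizes");
   // a management hook, say a JMX operation handler, could attach a new
   // stat to the live sensor at runtime (Min assumed to parallel Avg/Max)
   sizes.add("kafka.producer.message-size.min", new Min());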

-Jay



Re: Metrics in new producer

Posted by Sriram Subramanian <sr...@linkedin.com>.
I am actually neutral on this change. So far the replies have focused more
on the implementation and features. I would like the community to think
about the questions below before making a decision. My opinion is that it
has the potential to be its own project and would attract developers who
are specifically interested in contributing to metrics. I am skeptical
that the Kafka contributors would focus on improving this library (apart
from bug fixes) instead of developing/contributing to other core pieces.
It would be useful to keep it decoupled from the rest of Kafka (if it
resides in the Kafka code base) so that we can move it out to its own
project at any time.




Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
Hey Sriram,

Not sure if these are actually meant as questions or more veiled comments.
In an case I tried to give my 2 cents inline.

On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian <
srsubramanian@linkedin.com> wrote:

> I think answering the questions below would help to make a better
> decision. I am all for writing better code and having superior
> functionalities but it is worth thinking about stuff outside just code in
> this case -
>
> 1. Do metrics form a core piece of Kafka? Do they help Kafka greatly in
> providing better core functionality? I would always like a project to do
> one thing really well. Metrics is a non-trivial amount of code.
>

Metrics are obviously important, and obviously improving our metrics system
would be good. That said, this may or may not be better, and even if it is
better, that betterness might not outweigh other considerations. That is
what we are discussing.


> 2. Does it make sense to be part of Kafka or its own project? If this
> metrics library has the potential to be better than metrics-core, I would
> be interested in other projects taking advantage of it.
>

It could be either.

> 3. Can Kafka maintain this library as new members join and old members
> leave? Would this be a piece of code that no one (in Kafka) in the future
> spends time improving if the original author left?
>

I am not going anywhere in the near term, but if I did, yes, this would be
like any other code we have. As with yammer metrics or any other code at
that point we would either use it as is or someone would improve it.


> 4. Does it affect the schedule of the producer rewrite? This needs its own
> stabilization, and existing metric dashboards will need modification if
> the format is changed. Many times such costs are not factored in, and a
> project loses time before realizing the extra effort required to make a
> library like this operational.
>

Probably not. The metrics are going to change regardless of whether we use
the same library or not. If we think this is better I don't mind putting in
a little extra effort to get there.

Irrespective, I think this is probably not the right thing to optimize for.


> I am sure we can do better when we write code to a specific use case (in
> this case, kafka) rather than building a generic library that suits all
> (metrics-core) but I would like us to have answers to the questions above
> and be prepared before we proceed to support this with the producer
> rewrite.


Naturally we are all considering exactly these things; that is exactly the
reason I started the thread.

-Jay


> On 2/11/14 6:28 PM, "Jun Rao" <ju...@gmail.com> wrote:
>
> >Thanks for the detailed write-up. It's well thought through. A few
> >comments:
> >
> >1. I have a couple of concerns on the percentiles. The first issue is that
> >it requires the user to know the value range. Since the range for things
> >like message size (in the millions) is quite different from those like
> >request time (less than 100), it's going to be hard to pick a good global
> >default range. Different apps could be dealing with different message
> >sizes, so they will probably have to customize the range. Another issue is
> >that it can only report values at the bucket boundaries. So, if you have
> >1000 buckets and a value range of 1 million, you will only see 1000
> >possible values as the quantile, which is probably too sparse. The
> >histogram implementation in metrics-core keeps a fixed-size set of
> >samples, which avoids both issues.
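
To make the sparseness point concrete, a small arithmetic sketch (the
numbers are illustrative, not from the actual implementation):

   // a fixed range [0, 1000000) split into 1000 equal buckets gives a
   // bucket width of 1000, so any reported quantile snaps to a multiple
   // of that width
   double range = 1000000.0;
   int numBuckets = 1000;
   double bucketWidth = range / numBuckets;   // 1000.0
   double value = 123456.0;
   int bucket = (int) (value / bucketWidth);  // bucket 123
   double reported = bucket * bucketWidth;    // 123000.0, not 123456.0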
> >
> >2. We need to document the 3-part metrics names better since it's not
> >obvious what the convention is. Also, currently the name of the sensor and
> >the metrics defined in it are independent. Would it make sense to have the
> >sensor name be a prefix of the metric name?
> >
> >Overall, this approach seems to be cleaner than metrics-core by decoupling
> >measuring and reporting. The main benefit of metrics-core seems to be the
> >existing reporters. Since not that many people voted for metrics-core, I
> >am
> >ok with going with the new implementation. My only recommendation is to
> >address the concern on percentiles.
> >
> >Thanks,
> >
> >Jun
> >
> >
> >
> >On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <ja...@gmail.com> wrote:
> >
> >> Hey guys,
> >>
> >> I wanted to kick off a quick discussion of metrics with respect to the
> >>new
> >> producer and consumer (and potentially the server).
> >>
> >> At a high level I think there are three approaches we could take:
> >> 1. Plain vanilla JMX
> >> 2. Use Coda Hale (AKA Yammer) Metrics
> >> 3. Do our own metrics (with JMX as one output)
> >>
> >> 1. Has the advantage that JMX is the most commonly used java thing and
> >> plugs in reasonably to most metrics systems. JMX is included in the JDK
> >>so
> >> it doesn't impose any additional dependencies on clients. It has the
> >> disadvantage that plain vanilla JMX is a pain to use. We would need a
> >>bunch
> >> of helper code for maintaining counters to make this reasonable.
> >>
> >> 2. Coda Hale metrics is pretty good and broadly used. It supports JMX
> >> output as well as direct output to many other types of systems. The
> >>primary
> >> downside we have had with Coda Hale has to do with the clients and
> >>library
> >> incompatibilities. We are currently on an older more popular version.
> >>The
> >> newer version is a rewrite of the APIs and is incompatible. Originally
> >> these were totally incompatible and people had to choose one or the
> >>other.
> >> I think that has been improved so now the new version is a totally
> >> different package. But even in this case you end up with both versions
> >>if
> >> you use Kafka and we are on a different version than you which is going
> >>to
> >> be pretty inconvenient.
> >>
> >> 3. Doing our own has the downside of potentially reinventing the wheel,
> >>and
> >> potentially needing to work out any bugs in our code. The upsides would
> >> depend on how good the reinvention was. As it happens I did a quick
> >> (~900 loc) version of a metrics library that is under
> >>kafka.common.metrics.
> >> I think it has some advantages over the Yammer metrics package for our
> >> usage beyond just not causing incompatibilities. I will describe this
> >>code
> >> so we can discuss the pros and cons. Although I favor this approach I
> >>have
> >> no emotional attachment and wouldn't be too sad if I ended up deleting
> >>it.
> >> Here are javadocs for this code, though I haven't written much
> >> documentation yet since I might end up deleting it:
> >>
> >> Here is a quick overview of this library.
> >>
> >> There are three main public interfaces:
> >>   Metrics - This is a repository of metrics being tracked.
> >>   Metric - A single, named numerical value being measured (i.e. a
> >>counter).
> >>   Sensor - This is a thing that records values and updates zero or more
> >> metrics
> >>
> >> So let's say we want to track four values about message sizes;
> >> specifically say we want to record the average, the maximum, the total
> >>rate
> >> of bytes being sent, and a count of messages. Then we would do something
> >> like this:
> >>
> >>    // setup code
> >>    Metrics metrics = new Metrics(); // this is a global "singleton"
> >>    Sensor sensor = metrics.sensor("kafka.producer.message.sizes");
> >>    sensor.add("kafka.producer.message-size.avg", new Avg());
> >>    sensor.add("kafka.producer.message-size.max", new Max());
> >>    sensor.add("kafka.producer.bytes-sent-per-sec", new Rate());
> >>    sensor.add("kafka.producer.message-count", new Count());
> >>
> >>    // now when we get a message we do this
> >>    sensor.record(messageSize);
> >>
> >> The above code creates the global metrics repository, creates a single
> >> Sensor, and defines 4 named metrics that are updated by that Sensor.
> >>
> >> Like Yammer Metrics (YM) I allow you to plug in "reporters", including a
> >> JMX reporter. Unlike the Coda Hale JMX reporter the reporter I have keys
> >> off the metric names not the Sensor names, which I think is an
> >> improvement--I just use the convention that the last portion of the
> >>name is
> >> the attribute name, the second to last is the mbean name, and the rest
> >>is
> >> the package. So in the above example there is a producer mbean that has
> >> an avg and max attribute and a producer mbean that has a
> >> bytes-sent-per-sec
> >> and message-count attribute. This is nice because you can logically
> >>group
> >> the values reported irrespective of where in the program they are
> >> computed--that is an mbean can logically group attributes computed off
> >> different sensors. This means you can report values by logical
> >>subsystem.
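
A rough sketch of the name-splitting convention described above
(hypothetical code, not the actual reporter):

   String name = "kafka.producer.message-size.avg";
   int lastDot = name.lastIndexOf('.');
   String attribute = name.substring(lastDot + 1);         // "avg"
   int secondDot = name.lastIndexOf('.', lastDot - 1);
   String mbean = name.substring(secondDot + 1, lastDot);  // "message-size"
   String pkg = name.substring(0, secondDot);              // "kafka.producer"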
> >>
> >> I also allow the concept of hierarchical Sensors which I think is a good
> >> convenience. I have noticed a common pattern in systems where you need
> >>to
> >> roll up the same values along different dimensions. A simple example is
> >> metrics about qps, data rate, etc on the broker. These we want to
> >>capture
> >> in aggregate, but also broken down by topic-id. You can do this purely
> >>by
> >> defining the sensor hierarchy:
> >> Sensor allSizes = metrics.sensor("kafka.producer.sizes");
> >> Sensor topicSizes = metrics.sensor("kafka.producer." + topic  +
> >>".sizes",
> >> allSizes);
> >> Now each actual update will go to the appropriate topicSizes sensor
> >>(based
> >> on the topic name), but allSizes metrics will get updated too. I also
> >> support multiple parents for each sensor as well as multiple layers of
> >> hiearchy, so you can define a more elaborate DAG of sensors. An example
> >>of
> >> how this would be useful is if you wanted to record your metrics broken
> >> down by topic AND client id as well as the global aggregate.
> >>
> >> Each metric can take a configurable Quota value which allows us to limit
> >> the maximum value of that sensor. This is intended for use on the
> >>server as
> >> part of our Quota implementation. The way this works is that you record
> >> metrics as usual:
> >>    mySensor.record(42.0)
> >> However if this event occurance causes one of the metrics to exceed its
> >> maximum allowable value (the quota) this call will throw a
> >> QuotaViolationException. The cool thing about this is that it means we
> >>can
> >> define quotas on anything we capture metrics for, which I think is
> >>pretty
> >> cool.
> >>
> >> Another question is how to handle windowing of the values? Metrics want
> >>to
> >> record the "current" value, but the definition of current is inherently
> >> nebulous. A few of the obvious gotchas are that if you define "current"
> >>to
> >> be a number of events you can end up measuring an arbitrarily long
> >>window
> >> of time if the event rate is low (e.g. you think you are getting 50
> >> messages/sec because that was the rate yesterday when all events
> >>topped).
> >>
> >> Here is how I approach this. All the metrics use the same windowing
> >> approach. We define a single window by a length of time or number of
> >>values
> >> (you can use either or both--if both the window ends when *either* the
> >>time
> >> bound or event bound is hit). The typical problem with hard window
> >> boundaries is that at the beginning of the window you have no data and
> >>the
> >> first few samples are too small to be a valid sample. (Consider if you
> >>were
> >> keeping an avg and the first value in the window happens to be very very
> >> high, if you check the avg at this exact time you will conclude the avg
> >>is
> >> very high but on a sample size of one). One simple fix would be to
> >>always
> >> report the last complete window, however this is not appropriate here
> >> because (1) we want to drive quotas off it so it needs to be current,
> >>and
> >> (2) since this is for monitoring you kind of care more about the current
> >> state. The ideal solution here would be to define a backwards looking
> >> sliding window from the present, but many statistics are actually very
> >>hard
> >> to compute in this model without retaining all the values which would be
> >> hopelessly inefficient. My solution to this is to keep a configurable
> >> number of windows (default is two) and combine them for the estimate.
> >>So in
> >> a two sample case depending on when you ask you have between one and two
> >> complete samples worth of data to base the answer off of. Provided the
> >> sample window is large enough to get a valid result this satisfies both
> >>of
> >> my criteria of incorporating the most recent data and having reasonable
> >> variance at all times.
> >>
> >> Another approach is to use an exponential weighting scheme to combine
> >>all
> >> history but emphasize the recent past. I have not done this as it has a
> >>lot
> >> of issues for practical operational metrics. I'd be happy to elaborate
> >>on
> >> this if anyone cares...
> >>
> >> The window size for metrics has a global default which can be
> >>overridden at
> >> either the sensor or individual metric level.
> >>
> >> In addition to these time series values the user can directly expose
> >>some
> >> method of their choosing JMX-style by implementing the Measurable
> >>interface
> >> and registering that value. E.g.
> >>   metrics.addMetric("my.metric", new Measurable() {
> >>     public double measure(MetricConfg config, long now) {
> >>        return this.calculateValueToExpose();
> >>     }
> >>   });
> >> This is useful for exposing things like the accumulator free memory.
> >>
> >> The set of metrics is extensible, new metrics can be added by just
> >> implementing the appropriate interfaces and registering with a sensor. I
> >> implement the following metrics:
> >>   total - the sum of all values from the given sensor
> >>   count - a windowed count of values from the sensor
> >>   avg - the sample average within the windows
> >>   max - the max over the windows
> >>   min - the min over the windows
> >>   rate - the rate in the windows (e.g. the total or count divided by the
> >> ellapsed time)
> >>   percentiles - a collection of percentiles computed over the window
> >>
> >> My approach to percentiles is a little different from the yammer metrics
> >> package. My complaint about the yammer metrics approach is that it uses
> >> rather expensive sampling and uses kind of a lot of memory to get a
> >> reasonable sample. This is problematic for per-topic measurements.
> >>
> >> Instead I use a fixed range for the histogram (e.g. 0.0 to 30000.0)
> >>which
> >> directly allows you to specify the desired memory use. Any value below
> >>the
> >> minimum is recorded as -Infinity and any value above the maximum as
> >> +Infinity. I think this is okay as all metrics have an expected range
> >> except for latency which can be arbitrarily large, but for very high
> >> latency there is no need to model it exactly (e.g. 30 seconds + really
> >>is
> >> effectively infinite). Within the range values are recorded in buckets
> >> which can be either fixed width or increasing width. The increasing
> >>width
> >> is analogous to the idea of significant figures, that is if your value
> >>is
> >> in the range 0-10 you might want to be accurate to within 1ms, but if
> >>it is
> >> 20000 there is no need to be so accurate. I implemented a linear bucket
> >> size where the Nth bucket has width proportional to N. An exponential
> >> bucket size would also be sensible and could likely be derived directly
> >> from the floating point representation of a the value.
> >>
> >> I'd like to get some feedback on this metrics code and make a decision
> >>on
> >> whether we want to use it before I actually go ahead and add all the
> >> instrumentation in the code (otherwise I'll have to redo it if we switch
> >> approaches). So the next topic of discussion will be which actual
> >>metrics
> >> to add.
> >>
> >> -Jay
> >>
>
>

Re: Metrics in new producer

Posted by Jun Rao <ju...@gmail.com>.
The issue with metrics like message size is that max_message_size is a
server-side config, not a client-side one. Even if it were on the client
side, different clients can have different distributions (e.g., uniform over
the whole range, or dense in the low, mid, or high range). So, I am not sure
there is a good out-of-the-box range/bucketization (even at the per-sensor
level) that works for every client.
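
For concreteness, any fixed range would have to be supplied (or guessed)
on the client side, per sensor. A sketch, where the Percentiles(min, max,
buckets) constructor and the 10MB cap are assumptions of mine, not the
proposed API:

   // the client cannot read the broker's max_message_size, so the
   // histogram range has to be a local guess
   Sensor sizes = metrics.sensor("kafka.producer.message.sizes");
   sizes.add("kafka.producer.message-size.p99",
             new Percentiles(0.0, 10 * 1024 * 1024, 1000)); // guessed 10MB cap, 1000 buckets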

Thanks,

Jun


On Wed, Feb 12, 2014 at 1:05 PM, Jay Kreps <ja...@gmail.com> wrote:

> [quoted messages trimmed]

Re: Metrics in new producer

Posted by Jay Kreps <ja...@gmail.com>.
Hey Jun,

1. With respect to percentiles, my assumption is that you would not use a
global default. Since metrics are attached per-sensor, you would give a
range that makes sense for that value. So in your example message size
would presumably have the range (0, max_message_size) and latency would have
the range (0, 30k). I think these ranges can be fixed per sensor; they are
not application-specific. The sampling approach to percentiles does avoid a
fixed range but has other issues. In particular, for high percentiles it
will require very large samples to avoid missing them altogether. This was
the concern that took me down this path. I would be up for implementing a
sampling-based histogram and comparing, though.
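
Concretely, the per-sensor wiring could look something like this (a
sketch only -- the Percentiles(min, max, buckets) constructor and the
p99 metric names are placeholders, not a final API):

   // each sensor gets a range appropriate to what it measures
   Sensor sizes = metrics.sensor("kafka.producer.message.sizes");
   sizes.add("kafka.producer.message-size.p99",
             new Percentiles(0.0, maxMessageSize, 1000));  // (0, max_message_size)

   Sensor latency = metrics.sensor("kafka.producer.request.latency");
   latency.add("kafka.producer.request-latency.p99",
               new Percentiles(0.0, 30000.0, 1000));       // (0, 30k)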

2. Agreed, the documentation is definitely lacking. I'm happy to add that
if we end up using this. I started as you described, including the sensor
name in the metric name, but what I realized is that these two things are
actually completely separate. The way you report metrics shouldn't be
coupled to the particular things you measured to produce them. It will often
be the case that there are two or three related sensors. Say for requests
you want to capture size, latency, throughput, etc. But you may want to
report these all together as part of the same mbean (say
kafka.producer.avg-latency, kafka.producer.avg-message-size, etc.). The fact
that you measured different things shouldn't trickle out into what is
reported.
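
A small sketch of that decoupling, using the API from my original
overview (the sensor and metric names here are just illustrative):

   // two independent sensors...
   Sensor latency = metrics.sensor("kafka.producer.request.latency");
   latency.add("kafka.producer.avg-latency", new Avg());

   Sensor sizes = metrics.sensor("kafka.producer.request.sizes");
   sizes.add("kafka.producer.avg-message-size", new Avg());

   // ...whose metrics both report under the same producer mbean, as the
   // avg-latency and avg-message-size attributes, per the naming convention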

I agree with your breakdown of the tradeoffs.

-Jay

On Tue, Feb 11, 2014 at 6:28 PM, Jun Rao <ju...@gmail.com> wrote:

> [quoted messages trimmed]

Re: Metrics in new producer

Posted by Sriram Subramanian <sr...@linkedin.com>.
I think answering the questions below would help us make a better
decision. I am all for writing better code and having superior
functionality, but it is worth thinking about things beyond just the code in
this case -

1. Do metrics form a core piece of Kafka? Do they help Kafka greatly in
providing better core functionality? I would always like a project to do
one thing really well. Metrics is a non-trivial amount of code.

2. Does it make sense for this to be part of Kafka, or its own project? If
this metrics library has the potential to be better than metrics-core, I
would be interested in other projects taking advantage of it.

3. Can Kafka maintain this library as new members join and old members
leave? Would this become a piece of code that no one in Kafka spends time
improving if the original author left?

4. Does it affect the schedule of the producer rewrite? This needs its own
stabilization, and existing metric dashboards will need modification if the
format changes. Often such costs are not factored in, and a project loses
time before realizing the extra effort required to make a library like this
operational.

I am sure we can do better when we write code for a specific use case (in
this case, Kafka) rather than building a generic library that suits everyone
(metrics-core), but I would like us to have answers to the questions above
and be prepared before we proceed to support this with the producer
rewrite.

On 2/11/14 6:28 PM, "Jun Rao" <ju...@gmail.com> wrote:

> [quoted messages trimmed]


Re: Metrics in new producer

Posted by Jun Rao <ju...@gmail.com>.
Thanks for the detailed write-up. It's well thought through. A few comments:

1. I have a couple of concerns about the percentiles. The first issue is
that it requires the user to know the value range. Since the range for
things like message size (in the millions) is quite different from things
like request time (less than 100), it's going to be hard to pick a good
global default range. Different apps could be dealing with different message
sizes, so they will probably have to customize the range. Another issue is
that it can only report values at the bucket boundaries. So, if you have
1000 buckets and a value range of 1 million, you will only see 1000 possible
values as the quantile, which is probably too sparse. The histogram
implementation in metrics-core keeps a fixed-size sample, which avoids both
issues.
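
To make the second issue concrete (a toy calculation, not code from
either library): with 1000 equal-width buckets over a range of 1 million,
every bucket is 1000 wide, so any reported quantile snaps to a bucket
boundary:

   double width = 1000000.0 / 1000;                      // each bucket covers 1000 values
   // a true p99 of 987,654 can only be reported as a boundary:
   double reported = Math.ceil(987654 / width) * width;  // 988,000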

2. We need to document the 3-part metric names better since it's not
obvious what the convention is. Also, currently the name of the sensor and
the metrics defined in it are independent. Would it make sense to have the
sensor name be a prefix of the metric name?
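
For reference, my reading of the convention from the overview is: last
segment = JMX attribute, second-to-last segment = mbean, remainder =
package. For example:

   // "kafka.producer.bytes-sent-per-sec"
   //    -> package "kafka", mbean "producer", attribute "bytes-sent-per-sec"
   // "kafka.producer.message-size.avg"
   //    -> package "kafka.producer", mbean "message-size", attribute "avg"
   String[] parts = "kafka.producer.message-size.avg".split("\\.");
   String attribute = parts[parts.length - 1];  // "avg"
   String mbean = parts[parts.length - 2];      // "message-size"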

Overall, this approach seems to be cleaner than metrics-core by decoupling
measuring and reporting. The main benefit of metrics-core seems to be the
existing reporters. Since not that many people voted for metrics-core, I am
ok with going with the new implementation. My only recommendation is to
address the concern on percentiles.

Thanks,

Jun



On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <ja...@gmail.com> wrote:

> Hey guys,
>
> I wanted to kick off a quick discussion of metrics with respect to the new
> producer and consumer (and potentially the server).
>
> At a high level I think there are three approaches we could take:
> 1. Plain vanilla JMX
> 2. Use Coda Hale (AKA Yammer) Metrics
> 3. Do our own metrics (with JMX as one output)
>
> 1. Has the advantage that JMX is the most commonly used java thing and
> plugs in reasonably to most metrics systems. JMX is included in the JDK so
> it doesn't impose any additional dependencies on clients. It has the
> disadvantage that plain vanilla JMX is a pain to use. We would need a bunch
> of helper code for maintaining counters to make this reasonable.
>
> 2. Coda Hale metrics is pretty good and broadly used. It supports JMX
> output as well as direct output to many other types of systems. The primary
> downside we have had with Coda Hale has to do with the clients and library
> incompatibilities. We are currently on an older more popular version. The
> newer version is a rewrite of the APIs and is incompatible. Originally
> these were totally incompatible and people had to choose one or the other.
> I think that has been improved so now the new version is a totally
> different package. But even in this case you end up with both versions if
> you use Kafka and we are on a different version than you which is going to
> be pretty inconvenient.
>
> 3. Doing our own has the downside of potentially reinventing the wheel, and
> potentially needing to work out any bugs in our code. The upsides would
> depend on the how good the reinvention was. As it happens I did a quick
> (~900 loc) version of a metrics library that is under kafka.common.metrics.
> I think it has some advantages over the Yammer metrics package for our
> usage beyond just not causing incompatibilities. I will describe this code
> so we can discuss the pros and cons. Although I favor this approach I have
> no emotional attachment and wouldn't be too sad if I ended up deleting it.
> Here are javadocs for this code, though I haven't written much
> documentation yet since I might end up deleting it:
>
> Here is a quick overview of this library.
>
> There are three main public interfaces:
>   Metrics - This is a repository of metrics being tracked.
>   Metric - A single, named numerical value being measured (i.e. a counter).
>   Sensor - This is a thing that records values and updates zero or more
> metrics
>
> So let's say we want to track three values about message sizes;
> specifically say we want to record the average, the maximum, the total rate
> of bytes being sent, and a count of messages. Then we would do something
> like this:
>
>    // setup code
>    Metrics metrics = new Metrics(); // this is a global "singleton"
>    Sensor sensor = metrics.sensor("kafka.producer.message.sizes");
>    sensor.add("kafka.producer.message-size.avg", new Avg());
>    sensor.add("kafka.producer.message-size.max", new Max());
>    sensor.add("kafka.producer.bytes-sent-per-sec", new Rate());
>    sensor.add("kafka.producer.message-count", new Count());
>
>    // now when we get a message we do this
>    sensor.record(messageSize);
>
> The above code creates the global metrics repository, creates a single
> Sensor, and defines 5 named metrics that are updated by that Sensor.
>
> Like Yammer Metrics (YM) I allow you to plug in "reporters", including a
> JMX reporter. Unlike the Coda Hale JMX reporter the reporter I have keys
> off the metric names not the Sensor names, which I think is an
> improvement--I just use the convention that the last portion of the name is
> the attribute name, the second to last is the mbean name, and the rest is
> the package. So in the above example there is a producer mbean that has a
> avg and max attribute and a producer mbean that has a bytes-sent-per-sec
> and message-count attribute. This is nice because you can logically group
> the values reported irrespective of where in the program they are
> computed--that is an mbean can logically group attributes computed off
> different sensors. This means you can report values by logical subsystem.
>
> I also allow the concept of hierarchical Sensors which I think is a good
> convenience. I have noticed a common pattern in systems where you need to
> roll up the same values along different dimensions. A simple example is
> metrics about qps, data rate, etc. on the broker. These we want to capture
> in aggregate, but also broken down by topic-id. You can do this purely by
> defining the sensor hierarchy:
>    Sensor allSizes = metrics.sensor("kafka.producer.sizes");
>    Sensor topicSizes = metrics.sensor("kafka.producer." + topic + ".sizes",
>        allSizes);
> Now each actual update will go to the appropriate topicSizes sensor (based
> on the topic name), but the allSizes metrics will get updated too. I also
> support multiple parents for each sensor as well as multiple layers of
> hierarchy, so you can define a more elaborate DAG of sensors. An example of
> how this would be useful is if you wanted to record your metrics broken
> down by topic AND client id as well as the global aggregate, as sketched
> below.
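>
> Assuming the sensor() call takes any number of parents as trailing
> arguments (as in the example above), the topic-and-client-id case might
> look like this:
>
>    Sensor all = metrics.sensor("kafka.producer.sizes");
>    Sensor bytopic = metrics.sensor("kafka.producer." + topic + ".sizes", all);
>    Sensor byclient = metrics.sensor("kafka.producer." + clientId + ".sizes", all);
>    // a leaf sensor with two parents: each record() rolls up both dimensions
>    Sensor leaf = metrics.sensor("kafka.producer." + topic + "." + clientId
>        + ".sizes", bytopic, byclient);
>    leaf.record(messageSize); // updates leaf, bytopic, byclient, and all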
>
> Each metric can take a configurable Quota value which allows us to limit
> the maximum value of that metric. This is intended for use on the server as
> part of our Quota implementation. The way this works is that you record
> metrics as usual:
>    mySensor.record(42.0);
> However, if this event occurrence causes one of the metrics to exceed its
> maximum allowable value (the quota), this call will throw a
> QuotaViolationException. The nice thing about this is that we can define
> quotas on anything we capture metrics for, which I think is pretty cool.
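>
> As a sketch, attaching a quota might look something like this (the config
> API here is my guess at the shape, not the actual code):
>
>    // hypothetical: cap the byte rate and let record() enforce it
>    Sensor sensor = metrics.sensor("kafka.server.bytes-in");
>    sensor.add("kafka.server.bytes-in-per-sec", new Rate(),
>        new MetricConfig().quota(new Quota(1024 * 1024, true)));
>    try {
>        sensor.record(messageSize);
>    } catch (QuotaViolationException e) {
>        // rate quota exceeded; throttle or reject the request
>    }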
>
> Another question is how to handle windowing of the values. Metrics want to
> record the "current" value, but the definition of current is inherently
> nebulous. A few of the obvious gotchas are that if you define "current" to
> be a number of events you can end up measuring an arbitrarily long window
> of time if the event rate is low (e.g. you think you are getting 50
> messages/sec because that was the rate yesterday when all events stopped).
>
> Here is how I approach this. All the metrics use the same windowing
> approach. We define a single window by a length of time or number of values
> (you can use either or both--if both the window ends when *either* the time
> bound or event bound is hit). The typical problem with hard window
> boundaries is that at the beginning of the window you have no data and the
> first few samples are too small to be valid. (Consider keeping an avg: if
> the first value in the window happens to be very high and you check the avg
> at that exact moment, you will conclude the avg is very high, based on a
> sample size of one.) One simple fix would be to always
> report the last complete window, however this is not appropriate here
> because (1) we want to drive quotas off it so it needs to be current, and
> (2) since this is for monitoring you kind of care more about the current
> state. The ideal solution here would be to define a backwards looking
> sliding window from the present, but many statistics are actually very hard
> to compute in this model without retaining all the values which would be
> hopelessly inefficient. My solution to this is to keep a configurable
> number of windows (default is two) and combine them for the estimate. So in
> a two sample case depending on when you ask you have between one and two
> complete samples worth of data to base the answer off of. Provided the
> sample window is large enough to get a valid result this satisfies both of
> my criteria of incorporating the most recent data and having reasonable
> variance at all times.
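>
> To make the combination concrete, here is a minimal sketch of a
> two-window average (illustrative only; it ignores details like expiring
> multiple stale windows at once):
>
>    // index 0 = current window, index 1 = previous window
>    class TwoWindowAvg {
>        private final long windowMs;
>        private long windowStart;
>        private double[] sum = new double[2];
>        private long[] count = new long[2];
>
>        TwoWindowAvg(long windowMs, long now) {
>            this.windowMs = windowMs;
>            this.windowStart = now;
>        }
>
>        void record(double value, long now) {
>            maybeRoll(now);
>            sum[0] += value;
>            count[0]++;
>        }
>
>        // combine both windows so the estimate always covers one to two
>        // complete windows of data
>        double measure(long now) {
>            maybeRoll(now);
>            long n = count[0] + count[1];
>            return n == 0 ? Double.NaN : (sum[0] + sum[1]) / n;
>        }
>
>        // when the current window expires it becomes the previous window
>        private void maybeRoll(long now) {
>            if (now - windowStart >= windowMs) {
>                sum[1] = sum[0]; count[1] = count[0];
>                sum[0] = 0.0; count[0] = 0;
>                windowStart = now;
>            }
>        }
>    }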
>
> Another approach is to use an exponential weighting scheme to combine all
> history but emphasize the recent past. I have not done this as it has a lot
> of issues for practical operational metrics. I'd be happy to elaborate on
> this if anyone cares...
>
> The window size for metrics has a global default which can be overridden at
> either the sensor or individual metric level.
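>
> For example, overriding at sensor creation might look like this (the
> MetricConfig method name here is hypothetical):
>
>    // hypothetical: a 5 second window for this sensor, overriding the default
>    Sensor s = metrics.sensor("kafka.producer.sizes",
>        new MetricConfig().timeWindow(5000));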
>
> In addition to these time series values the user can directly expose some
> value of their choosing, JMX-style, by implementing the Measurable interface
> and registering it. E.g.
>   metrics.addMetric("my.metric", new Measurable() {
>     public double measure(MetricConfig config, long now) {
>        return calculateValueToExpose();
>     }
>   });
> This is useful for exposing things like the accumulator free memory.
>
> The set of metrics is extensible: new metrics can be added by just
> implementing the appropriate interfaces and registering with a sensor (a
> sketch follows the list below). I implement the following metrics:
>   total - the sum of all values from the given sensor
>   count - a windowed count of values from the sensor
>   avg - the sample average within the windows
>   max - the max over the windows
>   min - the min over the windows
>   rate - the rate in the windows (e.g. the total or count divided by the
> elapsed time)
>   percentiles - a collection of percentiles computed over the window
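>
> As an illustration of the extension point, a custom stat might look
> roughly like this (the interface name and record() signature are my
> guesses at the shape of the API, not the actual code):
>
>    // hypothetical interface shape, inferred from Measurable above
>    interface MeasurableStat {
>        void record(MetricConfig config, double value, long now);
>        double measure(MetricConfig config, long now);
>    }
>
>    // hypothetical: a running sum of squared values, e.g. to derive variance
>    public class SumOfSquares implements MeasurableStat {
>        private double total = 0.0;
>
>        public void record(MetricConfig config, double value, long now) {
>            total += value * value;
>        }
>
>        public double measure(MetricConfig config, long now) {
>            return total;
>        }
>    }
>
>    // registered like the built-in stats:
>    sensor.add("kafka.producer.message-size.sum-of-squares", new SumOfSquares());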
>
> My approach to percentiles is a little different from the yammer metrics
> package. My complaint about the yammer approach is that it uses rather
> expensive sampling and a fair amount of memory to get a reasonable sample,
> which is problematic for per-topic measurements.
>
> Instead I use a fixed range for the histogram (e.g. 0.0 to 30000.0) which
> directly allows you to specify the desired memory use. Any value below the
> minimum is recorded as -Infinity and any value above the maximum as
> +Infinity. I think this is okay as all metrics have an expected range
> except for latency which can be arbitrarily large, but for very high
> latency there is no need to model it exactly (e.g. 30 seconds + really is
> effectively infinite). Within the range values are recorded in buckets
> which can be either fixed width or increasing width. The increasing width
> is analogous to the idea of significant figures, that is if your value is
> in the range 0-10 you might want to be accurate to within 1ms, but if it is
> 20000 there is no need to be so accurate. I implemented a linear bucket
> sizing where the Nth bucket has width proportional to N. An exponential
> bucket sizing would also be sensible and could likely be derived directly
> from the floating point representation of the value.
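>
> Here is a minimal sketch of the linear bucketing idea (illustrative only,
> not the actual histogram code):
>
>    // bucket i (1-based) has width proportional to i, so bucket boundaries
>    // grow quadratically toward the configured max
>    class LinearBins {
>        private final int buckets;
>        private final double max;
>        private final double unit; // width of the first bucket
>
>        LinearBins(int buckets, double max) {
>            this.buckets = buckets;
>            this.max = max;
>            // cumulative width unit * B(B+1)/2 must equal max
>            this.unit = 2.0 * max / (buckets * (buckets + 1.0));
>        }
>
>        // map a value to a zero-based bucket index; out-of-range values
>        // clamp to the end buckets (standing in for -/+Infinity)
>        int toBucket(double value) {
>            if (value <= 0.0)
>                return 0;
>            if (value >= max)
>                return buckets - 1;
>            // smallest n with unit * n(n+1)/2 >= value (quadratic formula)
>            int n = (int) Math.ceil((-1.0 + Math.sqrt(1.0 + 8.0 * value / unit)) / 2.0);
>            return Math.min(n, buckets) - 1;
>        }
>    }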
>
> I'd like to get some feedback on this metrics code and make a decision on
> whether we want to use it before I actually go ahead and add all the
> instrumentation in the code (otherwise I'll have to redo it if we switch
> approaches). So the next topic of discussion will be which actual metrics
> to add.
>
> -Jay
>