You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2018/04/23 21:18:49 UTC
[GitHub] nicmichael opened a new pull request #1364: Issue #1363: Fast and
Garbage-Free Codahale Statistics Timers
nicmichael opened a new pull request #1364: Issue #1363: Fast and Garbage-Free Codahale Statistics Timers
URL: https://github.com/apache/bookkeeper/pull/1364
This change introduces a new Codahale-based metrics provider "FastCodahaleMetricsProvider" in the codahale-metrics-provider artifact, which can be activated through the "statsProviderClass" config parameter (no changes made to the existing "CodahaleMetricsProvider"). This new "FastCodahaleMetricsProvider" brings a new timer implementation "FastTimer", which extends Codahale's Timer class, provides an identical API, is registered in Codahale's timer registry, and can be queried through the existing JettyServices implementation. FastTimer provides similar functionality as Codahale's default timer, but is significantly faster and (nearly) garbage free.
The improvements of FastTimer over Codahale default Timer with ExponentiallyDecayingReservoirs are:
* entirely garbage-free Timer updates (only snapshot creations when querying a timer allocate a small number of Java objects)
* 3x faster Timer updates (i.e. 3x shorter code-path to update a timer)
* half the memory footprint (in default configuration)
FastTimer reduces GC pause times in our experiments as described in Issue 1363 by half (same as entirely deactivating all timers).
The primary functional differences between FastTimer and Codahale default Timer with ExponentiallyDecayingReservoirs are:
* FastTimer keeps values for a fixed, configurable time period (default: 60 seconds) rather than exponentially decaying them
* FastTimer bucketizes event times rather than storing discrete values. Percentile values depend on the bucket resolution.
Design Considerations
---------------------
The design goals of this timer implementation are for it to be
- fast (i.e. few instructions to update a timer)
- scalable (i.e. little synchronization cost for concurrent timer updates)
- garbage-free for timer updates (i.e. no object allocation for timer updates)
- space-efficient (i.e. as little memory footprint as possible while achieving first three goals)
- provide similar functionality as Codahale's default timers with ExponentiallyDecayingReservoirs
This implementation provides rate and response times over a configurable sliding time window. Data
is stored in upfront allocated circular arrays, in which each array element holds data
for one second. Data is overwritten in a circular fashion without the allocation of new data
structures and is therefore garbage-free for all timer updates.
This implementation does not store individual response times, but instead allocates bucketized counters
upfront, which are incremented for any event falling into a particular response time bucket. A
fine-grained bucket definition (intended for capturing successsful events) and a coarse-grained
bucket definition (intended to capture failure or timed-out events) are provided.
To improve scalability of concurrent timer updates, most data structures are replicated HASH_SIZE
times, and calling threads updating a timer are hashed to individual instances. Performance tests
(see below) have shown that this implementation is light-weight enough to achieve slightly better
scalability than Codahale's default timers even without hashing, and can further improve scalability
if hashing is used.
Trading off performance and scalability vs. memory footprint, we need to be conservative in the hash
size we chose. Different flavors of this timer implementation have been evaluated using JMH
micro-benchmarks (see microbenchmarks/src/main/java/org/apache/bookkeeper/stats/TimerBenchmark.java),
comparing implementations of FastTimer with a time window of 60 seconds and
- (DEV1) a HASH_SIZE of 3 for all data structures (meters, counters, min/max, and response time buckets)
- (DEV2) a HASH_SIZE of 1 for all data structures (meters, counters, min/max, and response time buckets)
- (FINAL) a HASH_SIZE of 3 for meters, counters, min/max, and no hashing for response time buckets
to the default timer implementation
- (BASE-E) Codahale Timer with ExponentiallyDecayingReservoir (default as used by bookkeeper code)
- (BASE-T) Codahale Timer with SlidingTimeWindowReservoir configured to 60 seconds
- (BASE-S) Codahale Timer with SlidingWindowReservoir configured to hold 100,000 events.
Based on results below, implementation (FINAL) was chosen as the final FastTimer implementation, as it
achieves nearly the same throughput as (DEV1) at nearly the same memory footprint as (DEV2), and
ultimately achieves roughly 3x higher throughput and scalability that Codahale's default implementation
at around half the memory footprint.
The following results have been collected on an eight core x86 server running at 3.2 GHz (updated
timers are shared across 4 threads):
```
Config Timer Impl Timers Threads ops/ms Alloc B/op Kb/TimerPair
----------------------------------------------------------------------------------------
DEV1 FastTimer (Hash 3) 1 4 11487.904 0 253
DEV1 FastTimer (Hash 3) 10 4 22621.702 0 253
DEV1 FastTimer (Hash 3) 100 4 21781.319 0 253
DEV2 FastTimer (Hash 1) 1 4 5138.143 0 88
DEV2 FastTimer (Hash 1) 10 4 22902.195 0 88
DEV2 FastTimer (Hash 1) 100 4 19173.085 0 88
FINAL FastTimer (Hash 3/1) 1 4 9291.002 0 99
FINAL FastTimer (Hash 3/1) 10 4 16379.940 0 99
FINAL FastTimer (Hash 3/1) 100 4 16751.020 0 99
BASE-E CodahaleTimer 1 4 3845.187 82.609 189
BASE-E CodahaleTimer 10 4 7262.445 35.035 189
BASE-E CodahaleTimer 100 4 7051.77 32.843 189
BASE-T CodahaleTimer/TimeWindow 1 4 102.479 90.851 174
BASE-T CodahaleTimer/TimeWindow 10 4 68.852 84.812 174
BASE-T CodahaleTimer/TimeWindow 100 4 153.444 136.436 174
BASE-S CodahaleTimer/SlidingWdw 1 4 4670.543 0 2103
BASE-S CodahaleTimer/SlidingWdw 10 4 13696.168 0 2103
BASE-S CodahaleTimer/SlidingWdw 100 4 12541.936 0 2103
```
- ops/ms is the number of timer updates per millisecond.
- Alloc B/op is the number of bytes allocated per timer update
- Kb/TimerPair is the heap footprint per pair of timers (one with fine-grained, one with coarse-grained buckets)
The following test results include snapshot creation every 109 timer updates (typically, we would assume
snapshot creation to be much less frequent), and show that also with snapshots in the mix, FastTimer outperforms
Codahale default Timers both with respect to throughput and scalability as well as object allocation:
```
Config Timer Impl Timers Threads ops/ms Alloc B/op
-------------------------------------------------------------------------
FINAL FastTimer (Hash 3/1) 1 4 1569.953 23.707
FINAL FastTimer (Hash 3/1) 10 4 7316.794 24.073
FINAL FastTimer (Hash 3/1) 100 4 6498.215 24.073
BASE-E CodahaleTimer 1 4 246.953 481.771
BASE-E CodahaleTimer 10 4 1989.134 476.807
BASE-E CodahaleTimer 100 4 1514.729 468.624
BASE-T CodahaleTimer/TimeWindow 1 4 6.063 43795.810
BASE-T CodahaleTimer/TimeWindow 10 4 44.651 33916.315
BASE-T CodahaleTimer/TimeWindow 100 4 180.431 12330.939
BASE-S CodahaleTimer/SlidingWdw 1 4 17.439 14683.756
BASE-S CodahaleTimer/SlidingWdw 10 4 107.257 14683.745
BASE-S CodahaleTimer/SlidingWdw 100 4 236.538 9767.106
```
Unfortunately Codahale does not have a Timer interface we can implement, and some Codahale
base classes are assuming instances of Timer (for example, our JettyServices instantiate a
Codahale MetricsServlet, which instantiates a Codahale MetricsModule, which only serializes
timers that are instances of Timer class into the json output stream). Unless we wanted to
reimplement or override all these base classes, we can't just implement Codahale's Metered and Sampling
interfaces. Instead we have to extend its Timer class, even though we're not using any of its
inherited functionality or data structures. The inherited (unused) member variables of Codahale Timer
consume slightly less than 512 byte per FastTimer (measured around 425 byte in Codahale 3.1).
Above memory footprint results include ~ 1 kb of inherited (unused) data structures, which comprise
around 1% of FastTimer's overall memory footprint.
In terms of functionality, FastTimer provides the same functionality as Codahale's timers
(in default configuration with ExponentiallyDecayingReservoirs), with the following exceptions:
- Statistics are kept for a fixed amount of time (rather than exponentially decayed), by
default 60 seconds. As a consequence, getMeanRate(), getOneMinuteRate(), getFiveMinuteRate()
and getFifteenMinuteRate() all return the same value if FastTimer is configured to use a
60 second time window.
- FastTimer and FastSnapshot only record bucketized instead of discrete response times. As a
consequence, the accuracy of percentiles depends on bucket granularity. FastSnapshot also
can't return discrete values: getValues() returns an empty array, and size returns 0.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services