An HdrHistogram histogram supports the recording and analysis of sampled data value counts across a configurable integer value range, with configurable value precision within the range. Value precision is expressed as the number of significant digits in the value recording, and provides control over value quantization behavior across the value range and the subsequent value resolution at any given level.
In contrast to traditional histograms that use linear, logarithmic, or arbitrarily sized bins or buckets, HdrHistograms use an internal data representation of fixed size that simultaneously supports an arbitrarily high dynamic range and arbitrary precision throughout that dynamic range. This capability makes HdrHistograms extremely useful for tracking and reporting on the distribution of percentile values with high resolution across a wide dynamic range -- a common need in latency behavior characterization.
The HdrHistogram package was specifically designed with latency and performance sensitive applications in mind. Experimental micro-benchmark measurements show value recording times as low as 3-6 nanoseconds on modern (circa 2012) Intel CPUs. All Histogram variants maintain a fixed cost in both space and time. A Histogram's memory footprint is constant, with no allocation operations involved in recording data values or in iterating through them. The memory footprint is fixed regardless of the number of data value samples recorded, and depends solely on the dynamic range and precision chosen. The amount of work involved in recording a sample is constant, and directly computes storage index locations such that no iteration or searching is ever involved in recording data values.
The combination of high dynamic range and precision is useful for collection and accurate post-recording analysis of sampled value data distribution in various forms. Whether it's calculating or plotting arbitrary percentiles, iterating through and summarizing values in various ways, or deriving mean and standard deviation values, the fact that the recorded value count information is kept in high resolution allows for accurate post-recording analysis with low [and ultimately configurable] loss in accuracy when compared to performing the same analysis directly on the potentially infinite series of source data value samples.
An HdrHistogram histogram is usually configured to maintain value count data with a resolution good enough to support a desired precision in post-recording analysis and reporting on the collected data. Analysis can include the computation and reporting of distribution by percentiles, linear or logarithmic arbitrary value buckets, mean and standard deviation, as well as any other computations that can be supported using the various iteration techniques available on the collected value count data. In practice, precision levels of 2 or 3 decimal points are most commonly used, as they maintain a value accuracy of +/- ~1% or +/- ~0.1% respectively for derived distribution statistics.
A good example of HdrHistogram use would be tracking latencies across a wide dynamic range, e.g. from a microsecond to an hour. A Histogram can be configured to track and later report on the counts of observed integer usec-unit latency values between 0 and 3,600,000,000 while maintaining a value precision of 3 significant digits across that range. Such a Histogram would simply be created with a highestTrackableValue of 3,600,000,000 and a numberOfSignificantValueDigits of 3, and would occupy a fixed, unchanging memory footprint of around 185KB (see "Footprint estimation" below).
Specifying 3 decimal points of precision in this example guarantees that value quantization within the value range will be no larger than 1/1,000th (or 0.1%) of any recorded value. This example Histogram can therefore be used to track, analyze and report the counts of observed latencies ranging between 1 microsecond and 1 hour in magnitude, while maintaining a value resolution of 1 microsecond (or better) up to 1 millisecond, a resolution of 1 millisecond (or better) up to one second, and a resolution of 1 second (or better) up to 1,000 seconds. At its maximum tracked value (1 hour), it would still maintain a resolution of 3.6 seconds (or better).

Code for this use example would include these basic elements:

    Histogram histogram = new Histogram(3600000000L, 3);
    ...
    // Repeatedly record measured latencies:
    histogram.recordValue(latency);
    ...
    // Report histogram percentiles, expressed in msec units:
    histogram.getHistogramData().outputPercentileDistribution(histogramLog, 1000.0);
All Histogram variants subclass the AbstractHistogram class: Histogram, which is the commonly used Histogram form, tracks value counts in long fields. IntHistogram and ShortHistogram, which track value counts in int and short fields respectively, are provided for use cases where smaller count ranges are practical and smaller overall storage is beneficial (e.g. systems where tens of thousands of in-memory histograms are being tracked). AtomicHistogram and SynchronizedHistogram are internally synchronized variants that support safe concurrent, multi-threaded recording (see the synchronization notes below).
Internally, data in HdrHistogram variants is maintained using a concept somewhat similar to that of floating point number representation: using an exponent and a (non-normalized) mantissa to support a wide dynamic range at a high but varying (by exponent value) resolution. AbstractHistogram uses exponentially increasing bucket value ranges (the parallel of the exponent portion of a floating point number), with each bucket containing a fixed number (per bucket) of linear sub-buckets (the parallel of a non-normalized mantissa portion of a floating point number).
Both dynamic range and resolution are configurable, with highestTrackableValue controlling dynamic range, and numberOfSignificantValueDigits controlling resolution.
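To make the exponent/mantissa parallel concrete, here is a minimal, illustrative sketch of how a recorded value could map to a bucket and sub-bucket under such a scheme, assuming 3 significant decimal digits. The class, method, and constant names are assumptions for illustration, not the library's actual internals:

```java
// Illustrative sketch of HdrHistogram-style value indexing; names and constants
// here are assumptions for illustration, not the library's internal code.
public class BucketIndexSketch {
    // 3 significant decimal digits => values up to 2 * 10^3 need single-unit resolution.
    static final long largestSingleUnitValue = 2 * 1000L;
    // Sub-bucket count is the smallest power of 2 covering that range (2048 here).
    static final int subBucketCount =
            Integer.highestOneBit((int) (largestSingleUnitValue - 1)) << 1;
    static final int subBucketHalfCountMagnitude =
            Integer.numberOfTrailingZeros(subBucketCount) - 1; // 10
    static final long subBucketMask = subBucketCount - 1;

    // The "exponent": which power-of-2 range (bucket) the value falls into.
    static int bucketIndex(long value) {
        int pow2Ceiling = 64 - Long.numberOfLeadingZeros(value | subBucketMask);
        return pow2Ceiling - (subBucketHalfCountMagnitude + 1);
    }

    // The "mantissa": linear position within the bucket's sub-buckets.
    static int subBucketIndex(long value, int bucketIndex) {
        return (int) (value >>> bucketIndex);
    }

    public static void main(String[] args) {
        // Values below subBucketCount land in bucket 0 with single-unit resolution:
        System.out.println(bucketIndex(1000));        // 0
        // Larger values land in higher buckets with coarser (2^bucketIndex) resolution:
        System.out.println(bucketIndex(3000));        // 1
        System.out.println(subBucketIndex(3000, 1));  // 1500
    }
}
```

Note how recording stays constant-time: both indexes are derived with a few bit operations, with no search or iteration involved.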
The Histogram class and its IntHistogram and ShortHistogram variants are NOT internally synchronized, and do NOT use atomic variables. Callers wishing to make potentially concurrent, multi-threaded updates or queries against Histogram objects should either take care to externally synchronize and/or order their access, or use the SynchronizedHistogram or AtomicHistogram variants.
It's worth mentioning that since Histogram objects are additive, it is common practice to use per-thread, non-synchronized histograms for the recording fast path, and "flipping" the actively recorded-to histogram (usually with some non-locking variants on the fast path) and having a summary/reporting thread perform histogram aggregation math across time and/or threads.
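Because histogram contents are just per-value-range counts, aggregation amounts to element-wise addition of count arrays. The following is a minimal, library-independent sketch of that per-thread pattern; the plain arrays stand in for real per-thread Histogram objects that a reporting thread would sum:

```java
import java.util.Arrays;

// Minimal sketch: each recording thread owns its own (unsynchronized) counts
// array; a reporting step aggregates them. The arrays are stand-ins for
// per-thread Histogram objects later combined for reporting.
public class PerThreadAggregationSketch {
    static long[] aggregate(long[][] perThreadCounts) {
        long[] total = new long[perThreadCounts[0].length];
        for (long[] counts : perThreadCounts) {
            for (int i = 0; i < counts.length; i++) {
                total[i] += counts[i]; // histogram addition is element-wise
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long[] thread1 = {5, 3, 0, 1}; // counts recorded by thread 1
        long[] thread2 = {2, 0, 4, 1}; // counts recorded by thread 2
        System.out.println(Arrays.toString(aggregate(new long[][]{thread1, thread2})));
        // prints [7, 3, 4, 2]
    }
}
```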
Histogram data can be iterated in multiple ways through the HistogramData available via AbstractHistogram.getHistogramData(). Iteration mechanisms all provide HistogramIterationValue data points along the histogram's iterated data set, and are available for the default (corrected) histogram data set via the following HistogramData methods:
- percentiles: a java.lang.Iterable<HistogramIterationValue> through the histogram using a PercentileIterator
- linearBucketValues: a java.lang.Iterable<HistogramIterationValue> through the histogram using a LinearIterator
- logarithmicBucketValues: a java.lang.Iterable<HistogramIterationValue> through the histogram using a LogarithmicIterator
- recordedValues: a java.lang.Iterable<HistogramIterationValue> through the histogram using a RecordedValuesIterator
- allValues: a java.lang.Iterable<HistogramIterationValue> through the histogram using an AllValuesIterator
Iteration is typically done with a for-each loop statement. E.g.:

    for (HistogramIterationValue v :
         histogram.getHistogramData().percentiles(percentileTicksPerHalfDistance)) {
        ...
    }

or:

    for (HistogramIterationValue v :
         histogram.getHistogramData().linearBucketValues(valueUnitsPerBucket)) {
        ...
    }

The iterators associated with each iteration method are resettable, such that a caller that would like to avoid allocating a new iterator object for each iteration loop can re-use an iterator to repeatedly iterate through the histogram. This iterator re-use usually takes the form of a traditional loop using the iterator's hasNext() and next() methods:

    PercentileIterator iter =
        histogram.getHistogramData().percentiles(percentileTicksPerHalfDistance).iterator();
    ...
    iter.reset(percentileTicksPerHalfDistance);
    while (iter.hasNext()) {
        HistogramIterationValue v = iter.next();
        ...
    }
Due to the finite (and configurable) resolution of the histogram, multiple adjacent integer data values can be "equivalent". Two values are considered "equivalent" if samples recorded for both are always counted in a common total count due to the histogram's resolution level. Histogram provides methods for determining the lowest and highest equivalent values for any given value, as well as for determining whether two values are equivalent, and for finding the next non-equivalent value for a given value (useful when looping through values, in order to avoid double counting).
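To illustrate what "equivalent" means in terms of the bucket scheme described earlier, here is a sketch of the arithmetic (not the library's implementation), assuming 3 significant digits so that values below 2048 have single-unit resolution:

```java
// Illustrative sketch (not the library's implementation) of equivalent-value
// arithmetic for a histogram with 2048 single-unit sub-buckets (3 significant digits).
public class EquivalentValuesSketch {
    static final int subBucketCount = 2048;
    static final int subBucketHalfCountMagnitude = 10;

    static int bucketIndex(long value) {
        int pow2Ceiling = 64 - Long.numberOfLeadingZeros(value | (subBucketCount - 1));
        return pow2Ceiling - (subBucketHalfCountMagnitude + 1);
    }

    // All values in [lowestEquivalentValue, lowestEquivalentValue + range) share one count.
    static long sizeOfEquivalentValueRange(long value) {
        return 1L << bucketIndex(value);
    }

    static long lowestEquivalentValue(long value) {
        int shift = bucketIndex(value);
        return (value >>> shift) << shift;
    }

    static long nextNonEquivalentValue(long value) {
        return lowestEquivalentValue(value) + sizeOfEquivalentValueRange(value);
    }

    public static void main(String[] args) {
        // 3000 falls in bucket 1, where resolution is 2 units, so 3000 and 3001
        // are equivalent:
        System.out.println(sizeOfEquivalentValueRange(3000)); // 2
        System.out.println(lowestEquivalentValue(3001));      // 3000
        System.out.println(nextNonEquivalentValue(3000));     // 3002
    }
}
```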
Regular, raw value data recording into an HdrHistogram is achieved with the recordValue() method.
Histogram variants also provide an auto-correcting
recordValueWithExpectedInterval()
form in support of a common use case found when histogram values are used to track response time
distribution in the presence of Coordinated Omission - an extremely common phenomenon found in latency recording
systems.
This correcting form is useful in [e.g. load generator] scenarios where measured response times may exceed the
expected interval between issuing requests, leading to the "omission" of response time measurements that would
typically correlate with "bad" results. This coordinated (non random) omission of source data, if left uncorrected,
will then dramatically skew any overall latency stats computed on the recorded information, as the recorded data set
itself will be significantly skewed towards good results.
When a value recorded in the histogram exceeds the expectedIntervalBetweenValueSamples parameter, recorded histogram data will reflect an appropriate number of additional values, linearly decreasing in steps of expectedIntervalBetweenValueSamples, down to the last value that would still be higher than expectedIntervalBetweenValueSamples.
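That fill-in rule can be sketched directly; correctionValuesFor() below is a hypothetical helper for illustration, not a library method:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the coordinated-omission fill-in rule: for a recorded value that
// exceeds expectedIntervalBetweenValueSamples, additional values are generated,
// linearly decreasing in steps of the interval, down to the last value still
// higher than the interval. correctionValuesFor() is a hypothetical helper.
public class CoordinatedOmissionFillSketch {
    static List<Long> correctionValuesFor(long recordedValue, long expectedInterval) {
        List<Long> fillIns = new ArrayList<>();
        for (long v = recordedValue - expectedInterval;
             v > expectedInterval;
             v -= expectedInterval) {
            fillIns.add(v);
        }
        return fillIns;
    }

    public static void main(String[] args) {
        // A 100 msec measurement with a 10 msec expected interval implies the
        // measurements that were "omitted" during the stall:
        System.out.println(correctionValuesFor(100, 10));
        // prints [90, 80, 70, 60, 50, 40, 30, 20]
    }
}
```

Values at or below the expected interval produce no fill-ins, which is why data sets recorded with and without correction are identical when no recorded value exceeds its expected interval.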
To illustrate why this corrective behavior is critically needed in order to accurately represent value
distribution when large value measurements may lead to missed samples, imagine a system for which response
time samples are taken once every 10 msec to characterize response time distribution.
The hypothetical system behaves "perfectly" for 100 seconds (10,000 recorded samples), with each sample
showing a 1 msec response time value. The hypothetical system then encounters a 100 second pause, during
which only a single sample is recorded (with a 100 second value).
A normally recorded (uncorrected) data histogram collected for such a hypothetical system (over the 200 second
scenario above) would show ~99.99% of results at 1msec or below, which is obviously "not right". In contrast, a
histogram that records the same data using the auto-correcting
recordValueWithExpectedInterval()
method with the knowledge of an expectedIntervalBetweenValueSamples of 10msec will correctly represent the
real world response time distribution of this hypothetical system. Only ~50% of results will be at 1msec or below,
with the remaining 50% coming from the auto-generated value records covering the missing increments spread between
10msec and 100 sec.
Data sets recorded with recordValue() and with recordValueWithExpectedInterval() will differ only if at least one value recorded was greater than its associated expectedIntervalBetweenValueSamples parameter.
Data sets recorded with recordValueWithExpectedInterval() will be identical to ones recorded with recordValue() if all values recorded via the recordValue() calls were smaller than their associated expectedIntervalBetweenValueSamples parameters.
In addition to the at-recording-time correction option, Histogram variants also provide the post-recording correction methods copyCorrectedForCoordinatedOmission() and addWhileCorrectingForCoordinatedOmission(). These methods can be used for post-recording correction, and are useful when the expectedIntervalBetweenValueSamples parameter is estimated to be the same for all recorded values. However, for obvious reasons, it is important to note that only one correction method (during or post recording) should be used on a given histogram data set.
When used for response time characterization, the recording with the optional expectedIntervalBetweenValueSamples parameter will tend to produce data sets that would much more accurately reflect the response time distribution that a random, uncoordinated request would have experienced.
A Histogram's memory footprint is determined by its highestTrackableValue and numberOfSignificantValueDigits combination. Beyond a relatively small fixed-size footprint used for internal fields and stats (which can be estimated as "fixed at well less than 1KB"), the bulk of a Histogram's storage is taken up by its data value recording counts array. The total footprint can be conservatively estimated by:
largestValueWithSingleUnitResolution = 2 * (10 ^ numberOfSignificantValueDigits);
subBucketSize = roundedUpToNearestPowerOf2(largestValueWithSingleUnitResolution);
expectedHistogramFootprintInBytes = 512 +
({primitive type size} / 2) *
(log2RoundedUp((highestTrackableValue) / subBucketSize) + 2) *
subBucketSize
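The estimate above can be sketched directly in code. For the earlier example Histogram (highestTrackableValue of 3,600,000,000, 3 significant digits, 8-byte long counts) it yields 188,928 bytes, i.e. around 185KB; the class and method names here are for illustration only:

```java
// Sketch implementing the footprint estimate above. wordSizeInBytes is the
// counts-field size: 8 for Histogram (long counts), 4 for IntHistogram,
// 2 for ShortHistogram.
public class FootprintEstimateSketch {
    static long roundedUpToNearestPowerOf2(long v) {
        return Long.highestOneBit(v - 1) << 1; // assumes v >= 2
    }

    static int log2RoundedUp(long v) {
        return 64 - Long.numberOfLeadingZeros(v - 1); // assumes v >= 1
    }

    static long estimatedFootprintInBytes(long highestTrackableValue,
                                          int numberOfSignificantValueDigits,
                                          int wordSizeInBytes) {
        long largestValueWithSingleUnitResolution =
                2 * (long) Math.pow(10, numberOfSignificantValueDigits);
        long subBucketSize =
                roundedUpToNearestPowerOf2(largestValueWithSingleUnitResolution);
        return 512 + (wordSizeInBytes / 2)
                * (log2RoundedUp(highestTrackableValue / subBucketSize) + 2)
                * subBucketSize;
    }

    public static void main(String[] args) {
        // The example Histogram from above: ~185KB
        System.out.println(estimatedFootprintInBytes(3600000000L, 3, 8)); // 188928
    }
}
```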
A conservative (high) estimate of a Histogram's footprint in bytes is available via the
getEstimatedFootprintInBytes()
method.