Package com.tdunning.math.stats
Class AVLTreeDigest
- java.lang.Object
  - com.tdunning.math.stats.TDigest
    - com.tdunning.math.stats.AbstractTDigest
      - com.tdunning.math.stats.AVLTreeDigest
- All Implemented Interfaces: java.io.Serializable
public class AVLTreeDigest extends AbstractTDigest
- See Also: Serialized Form
Field Summary
Modifier and Type       Field
private double          compression
private long            count
private static int      SMALL_ENCODING
private AVLGroupTree    summary
private static int      VERBOSE_ENCODING
-
Fields inherited from class com.tdunning.math.stats.AbstractTDigest
gen, recordAllData
-
-
Constructor Summary
Constructor                          Description
AVLTreeDigest(double compression)    A histogram structure that will record a sketch of a distribution.
-
Method Summary
Modifier and Type, Method, Description
- void add(double x, int w)
  Adds a sample to a histogram.
- (package private) void add(double x, int w, Centroid base)
- void add(double x, int w, java.util.List<java.lang.Double> data)
- void add(java.util.List<? extends TDigest> others)
- void asBytes(java.nio.ByteBuffer buf)
  Outputs a histogram as bytes using a particularly cheesy encoding.
- void asSmallBytes(java.nio.ByteBuffer buf)
  Serialize this TDigest into a byte buffer.
- int byteSize()
  Returns an upper bound on the number of bytes that will be required to represent this histogram.
- double cdf(double x)
  Returns the fraction of all points added which are <= x.
- int centroidCount()
- java.util.Collection<Centroid> centroids()
  A Collection that lets you go through the centroids in ascending order by mean.
- void compress()
  Re-examines a t-digest to determine whether some centroids are redundant.
- double compression()
  Returns the current compression factor.
- static AVLTreeDigest fromBytes(java.nio.ByteBuffer buf)
  Reads a histogram from a byte buffer.
- double quantile(double q)
  Returns an estimate of the cutoff such that a specified fraction of the data added to this TDigest would be less than or equal to the cutoff.
- TDigest recordAllData()
  Sets up so that all centroids will record all data assigned to them.
- long size()
  Returns the number of samples represented in this histogram.
- int smallByteSize()
  Returns an upper bound on the number of bytes that will be required to represent this histogram in the tighter representation.
-
Methods inherited from class com.tdunning.math.stats.AbstractTDigest
add, add, createCentroid, decode, encode, interpolate, isRecording, quantile, weightedAverage
-
Methods inherited from class com.tdunning.math.stats.TDigest
checkValue, createAvlTreeDigest, createDigest, createMergingDigest, getMax, getMin, setMinMax
-
-
-
-
Field Detail
-
compression
private final double compression
-
summary
private AVLGroupTree summary
-
count
private long count
-
VERBOSE_ENCODING
private static final int VERBOSE_ENCODING
- See Also: Constant Field Values
-
SMALL_ENCODING
private static final int SMALL_ENCODING
- See Also: Constant Field Values
-
-
Constructor Detail
-
AVLTreeDigest
public AVLTreeDigest(double compression)
A histogram structure that will record a sketch of a distribution.
- Parameters: compression - How should accuracy be traded for size? A value of N here will give quantile errors almost always less than 3/N, with considerably smaller errors expected for extreme quantiles. Conversely, you should expect to track about 5 N centroids for this accuracy.
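As a rough illustration of the trade-off described for the compression parameter, the sketch below turns a value N into the two rules of thumb quoted in that description: a quantile error that is almost always below 3/N, and a budget of about 5 N centroids. The class and method names are hypothetical and are not part of com.tdunning.math.stats; the numbers are the documented rules of thumb, not guarantees.

```java
// Hypothetical helper for illustration only; not part of the t-digest library.
final class CompressionBudget {
    // Rule of thumb from the parameter description: errors almost always below 3/N.
    static double approxMaxQuantileError(double compression) {
        return 3.0 / compression;
    }

    // Rule of thumb from the parameter description: about 5 N centroids.
    static double approxCentroidCount(double compression) {
        return 5.0 * compression;
    }

    public static void main(String[] args) {
        double n = 100;
        System.out.println("approx. max quantile error: " + approxMaxQuantileError(n));
        System.out.println("approx. centroid count:     " + approxCentroidCount(n));
    }
}
```

Doubling N in this sketch halves the error estimate and doubles the centroid budget, which is the size-for-accuracy trade the parameter controls.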
-
-
Method Detail
-
recordAllData
public TDigest recordAllData()
Description copied from class: AbstractTDigest
Sets up so that all centroids will record all data assigned to them. For testing only, really.
- Overrides: recordAllData in class AbstractTDigest
- Returns: This TDigest so that configurations can be done in fluent style.
-
centroidCount
public int centroidCount()
- Specified by: centroidCount in class TDigest
-
add
void add(double x, int w, Centroid base)
- Specified by: add in class AbstractTDigest
-
add
public void add(double x, int w)
Description copied from class: TDigest
Adds a sample to a histogram.
-
add
public void add(double x, int w, java.util.List<java.lang.Double> data)
-
compress
public void compress()
Description copied from class: TDigest
Re-examines a t-digest to determine whether some centroids are redundant. If your data are perversely ordered, this may be a good idea. Even if not, this may save 20% or so in space. The cost is roughly the same as adding as many data points as there are centroids. The number of centroids is typically < 10 * compression, but could be as high as 100 * compression. This is a destructive operation that is not thread-safe.
-
size
public long size()
Returns the number of samples represented in this histogram. If you want to know how many centroids are being used, try centroids().size().
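To make the distinction between samples and centroids concrete, here is a minimal sketch that models a digest as a plain array of centroid weights. It assumes that the sample count equals the sum of the weights passed to add(x, w); the class and method names are hypothetical and do not come from the library.

```java
// Hypothetical stand-in for a digest's centroids; not the library's data structure.
final class SizeVersusCentroids {
    // Number of samples represented: assumed here to be the sum of all centroid weights.
    static long sampleCount(int[] weights) {
        long total = 0;
        for (int w : weights) {
            total += w;
        }
        return total;
    }

    // Number of centroids in use: one per entry.
    static int centroidCount(int[] weights) {
        return weights.length;
    }

    public static void main(String[] args) {
        int[] weights = {1, 4, 10, 4, 1};
        System.out.println(sampleCount(weights) + " samples held in "
                + centroidCount(weights) + " centroids");
    }
}
```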
-
cdf
public double cdf(double x)
Description copied from class: TDigest
Returns the fraction of all points added which are <= x.
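For reference, the quantity being estimated can be computed exactly when all of the data is available. The sketch below is that exact computation over a plain array, not the digest's algorithm; the digest returns an approximation built from its centroids. The class name is hypothetical.

```java
// Exact empirical CDF over raw data, shown only as the quantity a digest approximates.
final class ExactCdf {
    // Fraction of the values that are <= x; NaN if there is no data.
    static double cdf(double[] data, double x) {
        if (data.length == 0) {
            return Double.NaN;
        }
        int atOrBelow = 0;
        for (double v : data) {
            if (v <= x) {
                atOrBelow++;
            }
        }
        return (double) atOrBelow / data.length;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4};
        System.out.println(cdf(data, 2.0));
    }
}
```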
-
quantile
public double quantile(double q)
Description copied from class: TDigest
Returns an estimate of the cutoff such that a specified fraction of the data added to this TDigest would be less than or equal to the cutoff.
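The cutoff described here has an exact counterpart when the raw data is kept: the smallest data value such that at least a fraction q of the data is less than or equal to it. The sketch below computes that exact value by sorting; it is a reference definition under that assumption, not the interpolation the digest performs, and the class name is hypothetical.

```java
import java.util.Arrays;

// Exact quantile over raw data, shown only as the quantity a digest estimates.
final class ExactQuantile {
    // Smallest value v in data such that at least a fraction q of the data is <= v.
    static double quantile(double[] data, double q) {
        if (data.length == 0 || !(q >= 0 && q <= 1)) {
            throw new IllegalArgumentException("need non-empty data and q in [0, 1]");
        }
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        // ceil(q * n) values must be at or below the cutoff; convert to a 0-based index.
        int index = (int) Math.ceil(q * sorted.length) - 1;
        index = Math.max(0, Math.min(sorted.length - 1, index));
        return sorted[index];
    }

    public static void main(String[] args) {
        double[] data = {4, 1, 3, 2};
        System.out.println(quantile(data, 0.5));
    }
}
```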
-
centroids
public java.util.Collection<Centroid> centroids()
Description copied from class: TDigest
A Collection that lets you go through the centroids in ascending order by mean. Centroids returned will not be re-used, but may or may not share storage with this TDigest.
-
compression
public double compression()
Description copied from class: TDigest
Returns the current compression factor.
- Specified by: compression in class TDigest
- Returns: The compression factor originally used to set up the TDigest.
-
byteSize
public int byteSize()
Returns an upper bound on the number of bytes that will be required to represent this histogram.
-
smallByteSize
public int smallByteSize()
Returns an upper bound on the number of bytes that will be required to represent this histogram in the tighter representation.
- Specified by: smallByteSize in class TDigest
- Returns: The number of bytes required.
-
asBytes
public void asBytes(java.nio.ByteBuffer buf)
Outputs a histogram as bytes using a particularly cheesy encoding.
-
asSmallBytes
public void asSmallBytes(java.nio.ByteBuffer buf)
Description copied from class: TDigest
Serialize this TDigest into a byte buffer. Some simple compression is used, such as a variable-byte representation for the centroid weights and delta-encoding of the centroid means so that floats can reasonably be used to store them.
- Specified by: asSmallBytes in class TDigest
- Parameters: buf - The byte buffer into which the TDigest should be serialized.
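The two techniques named in this description can be sketched with java.nio.ByteBuffer alone. The example below writes centroid means as float deltas from the previous mean and weights as variable-byte integers (7 payload bits per byte). The class name, method names, and byte layout are assumptions made for illustration; this is not the library's actual wire format and cannot be read back with fromBytes.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of delta and variable-byte encoding; NOT the library's format.
final class SmallEncodingSketch {
    // Variable-byte encoding: 7 payload bits per byte, high bit set on all but the last byte.
    static void putVarint(ByteBuffer buf, int value) {
        while ((value & ~0x7F) != 0) {
            buf.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        buf.put((byte) value);
    }

    static int getVarint(ByteBuffer buf) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // Layout assumed here: count, then one float delta per mean, then one varint per weight.
    static void encode(ByteBuffer buf, double[] means, int[] weights) {
        buf.putInt(means.length);
        double previous = 0;
        for (double mean : means) {
            // Sorted means give small deltas, which lose less precision when stored as floats.
            buf.putFloat((float) (mean - previous));
            previous = mean;
        }
        for (int w : weights) {
            putVarint(buf, w);
        }
    }

    // Reads the count and rebuilds the means by accumulating the stored deltas.
    static double[] readMeans(ByteBuffer buf) {
        int n = buf.getInt();
        double[] means = new double[n];
        double previous = 0;
        for (int i = 0; i < n; i++) {
            previous += buf.getFloat();
            means[i] = previous;
        }
        return means;
    }

    static int[] readWeights(ByteBuffer buf, int n) {
        int[] weights = new int[n];
        for (int i = 0; i < n; i++) {
            weights[i] = getVarint(buf);
        }
        return weights;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        encode(buf, new double[] {1.0, 1.5, 4.0}, new int[] {1, 300, 2});
        System.out.println("bytes written: " + buf.position());
    }
}
```

In this sketch a weight below 128 costs one byte instead of four, and each mean costs four bytes instead of eight, at the price of float rounding on the deltas.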
-
fromBytes
public static AVLTreeDigest fromBytes(java.nio.ByteBuffer buf)
Reads a histogram from a byte buffer.
- Returns: The new histogram structure
-
-