org.anchoranalysis.math.histogram

## Class Histogram

• ```public final class Histogram
extends Object```
A histogram of integer values.

The bin-size is always 1, so each bin corresponds to a discrete integer.

This can be used to record a discrete probability distribution, and is typically used in the Anchor software to record the distribution of image voxel intensity values.

Note that this is a dense implementation: memory is allocated to store all values from `minValue` to `maxValue` (inclusive). This can require a lot of memory, e.g. for unsigned-short value types. However, it allows maximally efficient incrementing while iterating through the voxels in an image, without intermediate structures.
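
To illustrate what a dense layout implies, here is a minimal sketch. It is not the Anchor implementation: the class name `DenseHistogramSketch` and its fields are hypothetical, and it assumes counts are held in an `int` array indexed by `value - minValue`.

```java
/** Hypothetical minimal dense histogram; NOT the Anchor implementation. */
final class DenseHistogramSketch {
    private final int minValue;
    // Assumption: one int slot per integer in [minValue, maxValue].
    private final int[] counts;
    private long totalCount = 0;

    DenseHistogramSketch(int minValue, int maxValue) {
        this.minValue = minValue;
        this.counts = new int[maxValue - minValue + 1];
    }

    /** Incrementing is a single array write, with no lookup structure in between. */
    void incrementValue(int value) {
        counts[value - minValue]++;
        totalCount++;
    }

    int getCount(int value) {
        return counts[value - minValue];
    }

    /** Number of bins, i.e. (maxValue - minValue + 1). */
    int size() {
        return counts.length;
    }

    long getTotalCount() {
        return totalCount;
    }
}
```

Under this assumption, covering the full unsigned-short range 0 to 65535 needs 65,536 `int` slots, roughly 256 KiB, regardless of how many distinct values actually occur.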

Author:
Owen Feehan
• ### Nested Class Summary

Nested Classes
Modifier and Type Class and Description
`static interface ` `Histogram.BinConsumer`
Consumes a bin and corresponding count.
• ### Constructor Summary

Constructors
Constructor and Description
`Histogram(int maxValue)`
Constructs with a maximum value, and assuming a minimum value of 0.
```Histogram(int minValue, int maxValue)```
Constructs with a minimum and maximum value.
• ### Method Summary

All Methods
Modifier and Type Method and Description
`void` `addHistogram(Histogram other)`
Adds the counts from another histogram to the current object.
`int` `calculateMaximum()`
Calculates the maximum value with a non-zero count among the histogram values.
`int` `calculateMinimum()`
Calculates the minimum value with a non-zero count among the histogram values.
`int` `calculateMode()`
Calculates the mode of the histogram values.
`long` `calculateSum()`
Calculates the sum of all values in the distribution considering their counts.
`long` `calculateSumCubes()`
Calculates the sum of the cubes of all values in the distribution considering their counts.
`long` `calculateSumSquares()`
Calculates the sum of the squares of all values in the distribution considering their counts.
`long` `countMatching(java.util.function.IntPredicate predicate)`
Gets the total count of all values that match a predicate.
`Histogram` `cropRemoveLargerValues(long maxCount)`
Like `cropRemoveSmallerValues(long)` but larger values are removed rather than smaller values if the total count is too high.
`Histogram` `cropRemoveSmallerValues(long maxCount)`
Creates a `Histogram` reusing the bins in the current histogram, but with an upper limit on the total count.
`Histogram` `duplicate()`
Creates a deep-copy of the current object.
`int` `getCount(int value)`
The count corresponding to a particular value.
`int` `getMaxValue()`
Maximum possible value in the histogram (inclusive).
`long` `getTotalCount()`
The total count across values in the histogram.
`boolean` `hasNonZeroCount(int threshold)`
Whether at least one value greater than or equal to `threshold` has a non-zero count.
`void` `incrementValue(int value)`
Increments the count for a particular value by one.
`void` ```incrementValueBy(int value, int increase)```
Increments the count for a particular value.
`void` ```incrementValueBy(int value, long increase)```
Like `incrementValueBy(int, int)` but accepts a `long` as the `increase` argument.
`boolean` `isEmpty()`
Whether no value exists in the histogram with a count greater than zero.
`void` `iterateValues(Histogram.BinConsumer consumer)`
Calls `consumer` for every value, increasing from min to max.
`void` ```iterateValuesUntil(int limit, Histogram.BinConsumer consumer)```
Calls `consumer` for every value until a limit, increasing from min to `limit`.
`double` `mean()`
Calculates the mean of the histogram values, considering their frequency.
`double` `mean(double power)`
Calculates the mean of the values in the distribution, if each value is raised to a power.
`double` ```mean(double power, double subtractValue)```
Like `mean(double)` but a value may be subtracted before raising to a power.
`int` `quantile(double quantile)`
Calculates the corresponding value for a particular quantile in the distribution of values in the histogram.
`void` `removeBelowThreshold(int threshold)`
All values less than `threshold` are removed.
`void` `reset()`
Sets the count for all values to 0.
`int` `size()`
The size of the range of values in the histogram.
`double` `standardDeviation()`
Calculates the standard-deviation of the distribution represented by the histogram.
`Histogram` `threshold(java.util.function.DoublePredicate predicate)`
Generates a new histogram containing only values that match a predicate.
`String` `toString()`
A string representation of what's in the histogram.
`void` ```transferCount(int valueFrom, int valueTo)```
Moves all count for a particular value and adds it to the count for another.
`double` `variance()`
Calculates the variance of the distribution represented by the histogram.
`void` `zeroValue(int value)`
Sets the count for a particular value to 0.
• ### Methods inherited from class Object

`clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait`
• ### Constructor Detail

• #### Histogram

`public Histogram(int maxValue)`
Constructs with a maximum value, and assuming a minimum value of 0.
Parameters:
`maxValue` - maximum possible value in the histogram (inclusive).
• #### Histogram

```public Histogram(int minValue,
int maxValue)```
Constructs with a minimum and maximum value.
Parameters:
`minValue` - minimum possible value in the histogram (inclusive).
`maxValue` - maximum possible value in the histogram (inclusive).
• ### Method Detail

• #### duplicate

`public Histogram duplicate()`
Creates a deep-copy of the current object.
Returns:
a deep-copy.
• #### reset

`public void reset()`
Sets the count for all values to 0.
• #### zeroValue

`public void zeroValue(int value)`
Sets the count for a particular value to 0.
Parameters:
`value` - the value whose count is zeroed.
• #### transferCount

```public void transferCount(int valueFrom,
int valueTo)```
Moves all count for a particular value and adds it to the count for another.
Parameters:
`valueFrom` - the value whose count is moved, after which its count is set to zero.
`valueTo` - the value to which the count for `valueFrom` is added.
• #### incrementValue

`public void incrementValue(int value)`
Increments the count for a particular value by one.
Parameters:
`value` - the value whose count will be incremented by one.
• #### incrementValueBy

```public void incrementValueBy(int value,
int increase)```
Increments the count for a particular value.
Parameters:
`value` - the value whose count will be incremented.
`increase` - how much to increase the count by.
• #### incrementValueBy

```public void incrementValueBy(int value,
long increase)```
Like `incrementValueBy(int, int)` but accepts a `long` as the `increase` argument.
Parameters:
`value` - the value whose count will be incremented.
`increase` - how much to increase the count by.
Throws:
`ArithmeticException` - if increase cannot be converted to an `int` safely.
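The documented `ArithmeticException` matches the behavior of the standard `Math.toIntExact(long)`, which throws instead of silently wrapping. The sketch below shows that pattern; whether the library actually uses `Math.toIntExact` is an assumption, and the class `LongIncrementSketch` with its fixed 256-bin array is hypothetical.

```java
/** Hypothetical sketch of a long overload that delegates to the int overload. */
final class LongIncrementSketch {
    // Assumption: a small fixed range [0, 255] for illustration only.
    private final int[] counts = new int[256];

    void incrementValueBy(int value, int increase) {
        counts[value] += increase;
    }

    void incrementValueBy(int value, long increase) {
        // Math.toIntExact throws ArithmeticException if increase does not fit in an int.
        incrementValueBy(value, Math.toIntExact(increase));
    }

    int getCount(int value) {
        return counts[value];
    }
}
```

A caller passing an increase such as `1L << 40` would get an exception and the count would be left unchanged, rather than being corrupted by a truncated value.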
• #### removeBelowThreshold

`public void removeBelowThreshold(int threshold)`
All values less than `threshold` are removed.
Parameters:
`threshold` - values greater than or equal to this are kept in the histogram; lesser values are removed.
• #### isEmpty

`public boolean isEmpty()`
Whether no value exists in the histogram with a count greater than zero.
Returns:
true iff the histogram has zero-count for all values.
• #### getCount

`public int getCount(int value)`
The count corresponding to a particular value.
Parameters:
`value` - the value (the bin) to find a count for.
Returns:
the corresponding count.
• #### size

`public int size()`
The size of the range of values in the histogram.

This is equivalent to `(maxValue - minValue + 1)`.

Returns:
the number of values represented in the histogram.
• #### addHistogram

```public void addHistogram(Histogram other)
throws OperationFailedException```
Adds the counts from another histogram to the current object.

Both histograms must have identical minimum and maximum values, and therefore represent the same range of values.

Parameters:
`other` - the histogram to add.
Throws:
`OperationFailedException` - if the histograms do not have identical minimum and maximum values.
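The range check followed by a bin-wise addition can be sketched as below. This works on plain `int[]` count arrays rather than on `Histogram` itself; the class `AddHistogramSketch` is hypothetical, and `IllegalArgumentException` stands in for the library's `OperationFailedException`.

```java
/** Hypothetical sketch: add the counts of one dense histogram into another. */
final class AddHistogramSketch {
    /**
     * Adds other into target, where index i holds the count for value (min + i).
     * Both arrays must cover exactly the same range of values.
     */
    static void addCounts(int[] target, int targetMin, int[] other, int otherMin) {
        // Same minimum and same length implies the same maximum.
        if (targetMin != otherMin || target.length != other.length) {
            throw new IllegalArgumentException("histograms must share minimum and maximum values");
        }
        for (int i = 0; i < target.length; i++) {
            target[i] += other[i];
        }
    }
}
```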
• #### mean

```public double mean()
throws OperationFailedException```
Calculates the mean of the histogram values, considering their frequency.

Specifically, this is the sum of `value * countFor(value)` across all values, divided by the total count.

Returns:
the mean.
Throws:
`OperationFailedException` - if the histogram has no values.
• #### quantile

```public int quantile(double quantile)
throws OperationFailedException```
Calculates the corresponding value for a particular quantile in the distribution of values in the histogram.

For example, a quantile of 0.3 returns the minimal value that is greater than or equal to at least 30% of the count.

Parameters:
`quantile` - the quantile, in the interval `[0, 1]`.
Returns:
the value corresponding to the quantile.
Throws:
`OperationFailedException` - if the histogram has no values, or the quantile is outside acceptable bounds.
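One plausible reading of the quantile description is a walk over cumulative counts, sketched below on a plain `int[]`. The class `QuantileSketch` is hypothetical, the exact tie and rounding behavior of the library is not stated in this page, and standard unchecked exceptions stand in for `OperationFailedException`.

```java
/** Hypothetical sketch of a cumulative-count quantile; the library's rounding may differ. */
final class QuantileSketch {
    /** counts[i] holds the count for value (minValue + i). */
    static int quantile(int[] counts, int minValue, double quantile) {
        if (quantile < 0.0 || quantile > 1.0) {
            throw new IllegalArgumentException("quantile must lie in [0, 1]");
        }
        long total = 0;
        for (int count : counts) {
            total += count;
        }
        if (total == 0) {
            throw new IllegalStateException("histogram has no values");
        }

        double target = quantile * total;
        long cumulative = 0;
        for (int i = 0; i < counts.length; i++) {
            cumulative += counts[i];
            // Only values that actually occur can be returned.
            if (counts[i] > 0 && cumulative >= target) {
                return minValue + i;
            }
        }
        // Not expected to be reached when total > 0; kept so the method always returns.
        return minValue + counts.length - 1;
    }
}
```

With counts `{1, 2, 3, 4}` for the values 0 to 3, the cumulative counts are 1, 3, 6 and 10, so a quantile of 0.5 (target 5) selects the value 2 in this sketch.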
• #### hasNonZeroCount

`public boolean hasNonZeroCount(int threshold)`
Whether at least one value greater than or equal to `threshold` has a non-zero count.
Parameters:
`threshold` - only values greater than or equal to `threshold` are considered. Use 0 for all values.
Returns:
true iff at least one value in this range has a non-zero count, false if all values in the range are zero.
• #### calculateMode

```public int calculateMode()
throws OperationFailedException```
Calculates the mode of the histogram values.

The mode is the most frequently occurring item.

Returns:
the mode.
Throws:
`OperationFailedException` - if the histogram has no values.
• #### calculateMaximum

```public int calculateMaximum()
throws OperationFailedException```
Calculates the maximum value with a non-zero count among the histogram values.
Returns:
the maximal value with non-zero count.
Throws:
`OperationFailedException` - if the histogram has no values.
• #### calculateMinimum

```public int calculateMinimum()
throws OperationFailedException```
Calculates the minimum value with a non-zero count among the histogram values.
Returns:
the minimal value with non-zero count.
Throws:
`OperationFailedException` - if the histogram has no values.
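The mode, maximum and minimum described above can each be found with a single scan of the bins, as in the sketch below. The class `ExtremaSketch` is hypothetical; it assumes that ties for the mode resolve to the smallest value, which this page does not specify, and it uses `IllegalStateException` in place of `OperationFailedException`.

```java
/** Hypothetical sketch of mode, maximum and minimum over a dense count array. */
final class ExtremaSketch {
    /** Smallest value with a non-zero count; counts[i] is the count for (minValue + i). */
    static int minimum(int[] counts, int minValue) {
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                return minValue + i;
            }
        }
        throw new IllegalStateException("histogram has no values");
    }

    /** Largest value with a non-zero count. */
    static int maximum(int[] counts, int minValue) {
        for (int i = counts.length - 1; i >= 0; i--) {
            if (counts[i] > 0) {
                return minValue + i;
            }
        }
        throw new IllegalStateException("histogram has no values");
    }

    /** Most frequent value. Assumption: on a tie, the smallest such value wins. */
    static int mode(int[] counts, int minValue) {
        int bestIndex = -1;
        int bestCount = 0;
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > bestCount) {
                bestCount = counts[i];
                bestIndex = i;
            }
        }
        if (bestIndex < 0) {
            throw new IllegalStateException("histogram has no values");
        }
        return minValue + bestIndex;
    }
}
```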
• #### calculateSum

`public long calculateSum()`
Calculates the sum of all values in the distribution considering their counts.

Specifically, the sum is `value * countFor(value)` across all values.

Returns:
the sum.
• #### calculateSumSquares

`public long calculateSumSquares()`
Calculates the sum of the squares of all values in the distribution considering their counts.

Specifically, the sum is `value^2 * countFor(value)` across all values.

Returns:
the sum of squares.
• #### calculateSumCubes

`public long calculateSumCubes()`
Calculates the sum of the cubes of all values in the distribution considering their counts.

Specifically, the sum is `value^3 * countFor(value)` across all values.

Returns:
the sum of cubes.
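The three sums above share one pattern: each value is raised to a small integer power and weighted by its count. A minimal sketch of that pattern follows, using `long` accumulators to match the `long` return types. The class `PowerSumsSketch` and its single generic method are hypothetical and operate on a plain `int[]`, not on `Histogram`.

```java
/** Hypothetical sketch of count-weighted power sums over a dense count array. */
final class PowerSumsSketch {
    /**
     * Sum of value^power * count across all bins, where counts[i] is the
     * count for value (minValue + i). Power 1, 2 and 3 correspond to the
     * sum, sum of squares and sum of cubes.
     */
    static long sumOfPowers(int[] counts, int minValue, int power) {
        long sum = 0;
        for (int i = 0; i < counts.length; i++) {
            long value = (long) minValue + i;
            long term = counts[i];
            // Integer multiplication avoids the rounding of Math.pow on doubles.
            for (int p = 0; p < power; p++) {
                term *= value;
            }
            sum += term;
        }
        return sum;
    }
}
```

For example, with two occurrences of the value 1 and three of the value 3, the sum is 2·1 + 3·3 = 11, the sum of squares is 2·1 + 3·9 = 29, and the sum of cubes is 2·1 + 3·27 = 83.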
• #### standardDeviation

```public double standardDeviation()
throws OperationFailedException```
Calculates the standard-deviation of the distribution represented by the histogram.
Returns:
the standard-deviation.
Throws:
`OperationFailedException` - if the histogram has no values.
• #### variance

```public double variance()
throws OperationFailedException```
Calculates the variance of the distribution represented by the histogram.
Returns:
the variance.
Throws:
`OperationFailedException` - if the histogram has no values.
• #### countMatching

`public long countMatching(java.util.function.IntPredicate predicate)`
Gets the total count of all values that match a predicate.
Parameters:
`predicate` - the predicate a value must match to be included in the count.
Returns:
the sum of the counts corresponding to all values that match the predicate.
• #### threshold

`public Histogram threshold(java.util.function.DoublePredicate predicate)`
Generates a new histogram containing only values that match a predicate.

This is an immutable operation. The existing histogram's values are unchanged.

Parameters:
`predicate` - a condition that must hold on the value for it to be included in the created histogram.
Returns:
a newly created `Histogram` containing values and corresponding counts from this object, but only if they fulfill the predicate.
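Both `countMatching` and `threshold` filter bins with a predicate on the value: the former sums the matching counts, the latter copies them into a new histogram. The sketch below shows both on plain `int[]` arrays. The class `PredicateFilterSketch` is hypothetical; only the predicate types (`IntPredicate` and `DoublePredicate`) are taken from the signatures above.

```java
import java.util.function.DoublePredicate;
import java.util.function.IntPredicate;

/** Hypothetical sketch of predicate-based counting and filtering. */
final class PredicateFilterSketch {
    /** Sums the counts of all values accepted by the predicate. */
    static long countMatching(int[] counts, int minValue, IntPredicate predicate) {
        long sum = 0;
        for (int i = 0; i < counts.length; i++) {
            if (predicate.test(minValue + i)) {
                sum += counts[i];
            }
        }
        return sum;
    }

    /** Returns a new array keeping only accepted bins; the input array is not modified. */
    static int[] threshold(int[] counts, int minValue, DoublePredicate predicate) {
        int[] out = new int[counts.length];
        for (int i = 0; i < counts.length; i++) {
            // The int value is widened to double for the DoublePredicate.
            if (predicate.test(minValue + i)) {
                out[i] = counts[i];
            }
        }
        return out;
    }
}
```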
• #### toString

`public String toString()`
A string representation of what's in the histogram.
Overrides:
`toString` in class `Object`
• #### getTotalCount

`public long getTotalCount()`
The total count across values in the histogram.

This is pre-calculated, so calling this operation incurs no computational expense.

Returns:
the total count.
• #### cropRemoveSmallerValues

`public Histogram cropRemoveSmallerValues(long maxCount)`
Creates a `Histogram` reusing the bins in the current histogram, but with an upper limit on the total count.

If the total count exceeds `maxCount`, values are removed in ascending order until the limit is satisfied.

Parameters:
`maxCount` - the maximum allowable total-count for the extracted histogram.
Returns:
a newly created `Histogram`, either a copy of the existing histogram (if the total count is less than `maxCount`) or cropped as per the above rules.
• #### cropRemoveLargerValues

`public Histogram cropRemoveLargerValues(long maxCount)`
Like `cropRemoveSmallerValues(long)` but larger values are removed rather than smaller values if the total count is too high.
Parameters:
`maxCount` - the maximum allowable total-count for the extracted histogram.
Returns:
a newly created `Histogram`, either a copy of the existing histogram (if the total count is less than `maxCount`) or cropped as per the above rules.
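The two cropping methods can be sketched as a walk from the end of the range that is kept, copying counts until the budget runs out. The class `CropSketch` is hypothetical and works on plain `int[]` arrays. It assumes that the last bin reached may be partially kept so that the total equals `maxCount` exactly; this page does not say whether the library trims a bin partially or drops it whole.

```java
/** Hypothetical sketch of cropping a dense count array to a maximum total count. */
final class CropSketch {
    /** Keeps counts from the largest values first, so the smallest values are removed. */
    static int[] cropRemoveSmallerValues(int[] counts, long maxCount) {
        int[] out = new int[counts.length];
        long remaining = maxCount;
        for (int i = counts.length - 1; i >= 0 && remaining > 0; i--) {
            // Assumption: a bin may be kept partially to use up the budget exactly.
            int kept = (int) Math.min(counts[i], remaining);
            out[i] = kept;
            remaining -= kept;
        }
        return out;
    }

    /** Keeps counts from the smallest values first, so the largest values are removed. */
    static int[] cropRemoveLargerValues(int[] counts, long maxCount) {
        int[] out = new int[counts.length];
        long remaining = maxCount;
        for (int i = 0; i < counts.length && remaining > 0; i++) {
            int kept = (int) Math.min(counts[i], remaining);
            out[i] = kept;
            remaining -= kept;
        }
        return out;
    }
}
```

With counts `{5, 5, 5}` and a `maxCount` of 7, this sketch yields `{0, 2, 5}` when smaller values are removed and `{5, 2, 0}` when larger values are removed. If the total is already within the limit, the result is simply a copy.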
• #### mean

```public double mean(double power)
throws OperationFailedException```
Calculates the mean of the values in the distribution, if each value is raised to a power.

Specifically, it calculates the mean of `countFor(value) * value^power` across all values.

Parameters:
`power` - the power to raise each value to.
Returns:
the calculated mean.
Throws:
`OperationFailedException` - if the histogram has no values.
• #### mean

```public double mean(double power,
double subtractValue)
throws OperationFailedException```
Like `mean(double)` but a value may be subtracted before raising to a power.

Specifically, it calculates the mean of ```countFor(value) * (value - subtractValue)^power``` across all values.

Parameters:
`power` - the power to raise each value to (after subtraction).
`subtractValue` - a value subtracted before raising to a power.
Returns:
the calculated mean.
Throws:
`OperationFailedException` - if the histogram has no values.
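The formula above can be read as a count-weighted sum divided by the total count. The sketch below implements that reading on a plain `int[]`. The class `PowerMeanSketch` is hypothetical, the division by the total count is an interpretation of "mean" rather than something this page spells out, and `IllegalStateException` stands in for `OperationFailedException`.

```java
/** Hypothetical sketch of a count-weighted mean of (value - subtractValue)^power. */
final class PowerMeanSketch {
    /** counts[i] holds the count for value (minValue + i). */
    static double mean(int[] counts, int minValue, double power, double subtractValue) {
        long total = 0;
        double sum = 0.0;
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == 0) {
                continue; // skip empty bins so an undefined power cannot poison the sum
            }
            total += counts[i];
            sum += counts[i] * Math.pow((minValue + i) - subtractValue, power);
        }
        if (total == 0) {
            throw new IllegalStateException("histogram has no values");
        }
        return sum / total;
    }
}
```

With a power of 2 and `subtractValue` set to the ordinary mean, this expression is the population variance. Whether `variance()` is computed this way internally is not stated here.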
• #### iterateValues

`public void iterateValues(Histogram.BinConsumer consumer)`
Calls `consumer` for every value, increasing from min to max.
Parameters:
`consumer` - called for every bin.
• #### iterateValuesUntil

```public void iterateValuesUntil(int limit,
Histogram.BinConsumer consumer)```
Calls `consumer` for every value until a limit, increasing from min to `limit`.
Parameters:
`limit` - the maximum-value to consume (inclusive).
`consumer` - called for every bin.
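The iteration methods pass each bin and its count to a callback. The sketch below shows the idea with a stand-in functional interface. The nested `BinConsumer` here is hypothetical: the method name `accept` and its `(int bin, int count)` parameters are assumptions, since this page does not show the members of `Histogram.BinConsumer`.

```java
/** Hypothetical sketch of iterating bins in increasing order up to an inclusive limit. */
final class IterateSketch {
    /** Assumed shape of a bin callback; the real Histogram.BinConsumer may differ. */
    @FunctionalInterface
    interface BinConsumer {
        void accept(int bin, int count);
    }

    /** Calls consumer for each value from minValue up to limit (inclusive). */
    static void iterateValuesUntil(int[] counts, int minValue, int limit, BinConsumer consumer) {
        // Never run past the end of the array, even if limit exceeds the maximum value.
        int lastIndex = Math.min(limit - minValue, counts.length - 1);
        for (int i = 0; i <= lastIndex; i++) {
            consumer.accept(minValue + i, counts[i]);
        }
    }

    /** Calls consumer for every value in the range. */
    static void iterateValues(int[] counts, int minValue, BinConsumer consumer) {
        iterateValuesUntil(counts, minValue, minValue + counts.length - 1, consumer);
    }
}
```

A caller could, for instance, accumulate a running total with a lambda such as `(bin, count) -> total[0] += count`, where `total` is a one-element `long` array.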
• #### getMaxValue

`public int getMaxValue()`
Maximum possible value in the histogram (inclusive).