netcal.metrics.regression.UCE

class netcal.metrics.regression.UCE(bins: int | Iterable[int] = 10, sample_threshold: int = 1)

Uncertainty Calibration Error (UCE) for evaluating the variance calibration of a regression model. A probabilistic regression model takes \(X\) as input and outputs a mean \(\mu_Y(X)\) and a variance \(\sigma_Y^2(X)\) targeting the ground truth \(y\). Similar to netcal.metrics.confidence.ECE, the UCE applies a binning scheme with \(B\) bins over the predicted variance \(\sigma_Y^2(X)\) and measures, within each bin, the absolute difference between the mean squared error (MSE) and the mean variance (MV) [1]. The UCE [1] is defined by

\[\text{UCE} := \sum^B_{b=1} \frac{N_b}{N} |MSE(b) - MV(b)| ,\]

where \(MSE(b)\) and \(MV(b)\) are the mean squared error and the mean variance within bin \(b\), respectively, \(N_b\) is the number of samples within bin \(b\), and \(N\) is the total number of samples.
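The definition above can be sketched in plain NumPy for a single output dimension. This is an illustrative sketch, not netcal's implementation: the helper name `uce` is hypothetical, it takes the predicted variance directly, and it assumes equal-width bins spanning the observed variance range, which may differ from the binning netcal uses internally.

```python
import numpy as np


def uce(mean, var, y, bins=10):
    """Hypothetical helper: UCE for one dimension (not part of netcal)."""
    mean, var, y = (np.asarray(a, dtype=float) for a in (mean, var, y))
    sq_err = (y - mean) ** 2

    # Assumption: equal-width bin edges over the observed variance range.
    edges = np.linspace(var.min(), var.max(), bins + 1)
    # Map each sample to a bin index in [0, bins - 1]; the maximum
    # variance is clipped into the last bin.
    idx = np.clip(np.searchsorted(edges, var, side="right") - 1, 0, bins - 1)

    total = 0.0
    for b in range(bins):
        mask = idx == b
        n_b = mask.sum()
        if n_b == 0:
            continue  # empty bins contribute nothing to the sum
        # N_b / N * |MSE(b) - MV(b)|
        total += n_b / y.size * abs(sq_err[mask].mean() - var[mask].mean())
    return total


# Two samples with variance 1 (well calibrated) and two with variance 4
# (overestimated uncertainty); every squared error equals 1.
mean = np.zeros(4)
var = np.array([1.0, 1.0, 4.0, 4.0])
y = np.array([1.0, -1.0, 1.0, -1.0])
print(uce(mean, var, y, bins=2))  # 1.5
```

In this toy case the first bin contributes 0 and the second contributes (2/4) * |1 - 4| = 1.5, so the whole error comes from the bin whose predicted variance is too large.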

If multiple dimensions are given, the UCE is measured for each dimension separately.

Parameters:
  • bins (int or iterable, default: 10) – Number of bins used by the UCE binning. If iterable, a different number of bins is used for each dimension (nx1, nx2, … = bins).

  • sample_threshold (int, optional, default: 1) – Bins containing fewer samples than this threshold are excluded from the miscalibration metric.
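To illustrate how the two parameters interact with multi-dimensional input, the following NumPy sketch computes one score per dimension. It is not netcal's code: `uce_per_dim` is a hypothetical helper, the bins are assumed to be equal-width over each dimension's variance range, and a bin below `sample_threshold` is assumed to simply contribute zero while \(N\) stays the total sample count. netcal may treat excluded bins differently.

```python
import numpy as np


def uce_per_dim(mean, var, y, bins=10, sample_threshold=1):
    """Hypothetical helper: one UCE value per dimension (not part of netcal).

    mean, var, y: arrays of shape (n_samples, n_dims).
    bins: one int for all dimensions, or one int per dimension.
    """
    mean, var, y = (np.asarray(a, dtype=float) for a in (mean, var, y))
    n, d = y.shape
    bins_per_dim = [bins] * d if np.isscalar(bins) else list(bins)

    scores = np.zeros(d)
    for k, n_bins in enumerate(bins_per_dim):
        v = var[:, k]
        sq_err = (y[:, k] - mean[:, k]) ** 2
        # Assumption: equal-width edges over this dimension's variance range.
        edges = np.linspace(v.min(), v.max(), n_bins + 1)
        idx = np.clip(np.searchsorted(edges, v, side="right") - 1, 0, n_bins - 1)
        for b in range(n_bins):
            mask = idx == b
            n_b = mask.sum()
            if n_b == 0 or n_b < sample_threshold:
                continue  # assumption: excluded bins add nothing
            scores[k] += n_b / n * abs(sq_err[mask].mean() - v[mask].mean())
    return scores


mean = np.zeros((5, 2))
var = np.column_stack([[1.0, 1.0, 1.0, 1.0, 4.0], [2.0] * 5])
y = np.column_stack([[1.0, -1.0, 1.0, -1.0, 1.0]] * 2)

# Two bins for the first dimension, one bin for the second.
print(uce_per_dim(mean, var, y, bins=(2, 1), sample_threshold=1))
# With a threshold of 2, the single-sample bin in dimension 0 is ignored.
print(uce_per_dim(mean, var, y, bins=(2, 1), sample_threshold=2))
```

Under these assumptions, raising `sample_threshold` removes the contribution of the sparsely populated bin in the first dimension, while the second dimension is unaffected because its only bin holds all five samples.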

References

Methods

__init__([bins, sample_threshold])

Constructor.

binning(bin_bounds, samples, *values[, nan])

Perform binning on value (and all additional values passed) based on samples.

frequency(X, y[, batched, uncertainty])

Measure the frequency of each point by binning.

measure(X, y, *[, kind, range_])

Measure the UCE for the given input data, passed either as a tuple of mean and stddev estimates or as a NumPy array holding a sample distribution.

prepare(X, y[, batched, uncertainty])

Check input data.

process(metric, acc_hist, conf_hist, ...)

Determine miscalibration based on passed histograms.

reduce(histogram, distribution, axis[, ...])

Calculate the weighted mean on a given histogram based on a dedicated data distribution.