netcal.metrics.PICP

class netcal.metrics.PICP(bins: int | Iterable[int] = 10, equal_intervals: bool = True, detection: bool = False, sample_threshold: int = 1)

Compute the Prediction Interval Coverage Probability (PICP) and the Mean Prediction Interval Width (MPIW). These metrics were proposed in [1], [2]. They are used with Bayesian models, where an uncertainty estimate is attached to each sample, to assess the quality of those uncertainty estimates. The PICP measures the probability that the true (observed) accuracy falls into the p% prediction interval. The uncertainty is well calibrated if the PICP equals p%. The MPIW measures the mean width of all prediction intervals and thereby evaluates the sharpness of the uncertainty estimates.
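To make the two quantities concrete, the following minimal sketch computes a coverage probability and a mean interval width with plain NumPy. It does not use netcal and is not the library's implementation. It assumes the common regression setting in which every prediction is a Gaussian with a mean and a standard deviation, and the helper name picp_mpiw and the synthetic data are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def picp_mpiw(mean, std, y_true, p=0.9):
    """Coverage and mean width of the central p-interval of Gaussian predictions.

    Illustrative helper (not part of netcal); assumes each prediction is N(mean, std**2).
    """
    mean, std, y_true = (np.asarray(a, dtype=float) for a in (mean, std, y_true))
    # Half-width multiplier of the central p-interval, about 1.645 for p = 0.9.
    z = NormalDist().inv_cdf(0.5 + p / 2.0)
    lower, upper = mean - z * std, mean + z * std
    covered = (y_true >= lower) & (y_true <= upper)
    picp = float(covered.mean())          # fraction of targets inside their interval
    mpiw = float((upper - lower).mean())  # mean interval width
    return picp, mpiw


rng = np.random.default_rng(0)
mean = rng.normal(size=5000)
y_true = mean + rng.normal(scale=1.0, size=5000)  # true noise level is 1.0

# Predicted std matches the true noise: coverage should be close to p.
picp_ok, mpiw_ok = picp_mpiw(mean, np.full(5000, 1.0), y_true, p=0.9)
# Predicted std is too small: narrower intervals, but coverage falls below p.
picp_over, mpiw_over = picp_mpiw(mean, np.full(5000, 0.5), y_true, p=0.9)
```

In this sketch the overconfident model has the smaller MPIW (it looks sharper), yet its PICP falls well short of 90%, which is why the two numbers are read together.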

Parameters:
  • bins (int or iterable, default: 10) – Number of bins used by the PICP. In detection mode: if int, use the same number of bins for each dimension (nx1 = nx2 = … = bins). If iterable, use a different number of bins for each dimension (nx1, nx2, … = bins).

  • equal_intervals (bool, optional, default: True) – If True, the bins have the same width. If False, the bins are split so that each bin holds the same number of samples.

  • detection (bool, default: False) – If False, the input array ‘X’ is treated as multi-class confidence input (softmax) with shape (n_samples, [n_classes]). If True, the input array ‘X’ is treated as box predictions with several box features (at least the box confidence must be present) with shape (n_samples, [n_box_features]).

  • sample_threshold (int, optional, default: 1) – Bins containing fewer samples than this threshold are excluded from the metric computation.
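The effect of equal_intervals and sample_threshold can be illustrated with a short NumPy sketch. This is only an analogue of the behaviour described above, not netcal's internal binning code, and the variable names and the synthetic confidence values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
confidence = rng.beta(5.0, 1.5, size=1000)  # synthetic scores skewed toward high confidence
bins = 10

# Analogue of equal_intervals=True: edges evenly spaced over [0, 1].
equal_width_edges = np.linspace(0.0, 1.0, bins + 1)
# Analogue of equal_intervals=False: edges at sample quantiles, so counts are balanced.
equal_freq_edges = np.quantile(confidence, np.linspace(0.0, 1.0, bins + 1))

width_counts, _ = np.histogram(confidence, bins=equal_width_edges)
freq_counts, _ = np.histogram(confidence, bins=equal_freq_edges)

# Analogue of sample_threshold: keep only bins with at least this many samples.
sample_threshold = 5
kept_bins = width_counts >= sample_threshold
```

With skewed inputs, equal-width bins leave the low-confidence bins nearly empty, so sample_threshold matters most in that mode. Quantile-based edges keep every bin populated but give the bins unequal widths.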

References

Methods

  • __init__([bins, equal_intervals, detection, ...]) – Constructor.

  • binning(bin_bounds, samples, *values[, nan]) – Perform binning on the given value (and all additional values passed) based on samples.

  • frequency(X, y[, batched, uncertainty]) – Measure the frequency of each point by binning.

  • measure(X, y, q, *[, kind, reduction]) – Measure calibration from the given predictions with confidence and the corresponding ground truth.

  • prepare(X, y[, batched, uncertainty]) – Check input data.

  • process(metric, acc_hist, conf_hist, ...) – Determine miscalibration based on the passed histograms.

  • reduce(histogram, distribution, axis[, ...]) – Calculate the weighted mean on a given histogram based on a dedicated data distribution.
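The weighted mean described for reduce can be sketched in NumPy as follows. This is not netcal's code. It only illustrates the idea of weighting per-bin values by the share of samples in each bin, and the arrays below are made-up numbers.

```python
import numpy as np

# Hypothetical per-bin values (e.g. a per-bin miscalibration) and per-bin sample counts.
histogram = np.array([0.10, 0.05, 0.20, 0.00])
counts = np.array([10, 50, 40, 0])

# Normalise the counts to a distribution over bins; empty bins get weight zero.
distribution = counts / counts.sum()

# Weighted mean along the bin axis; here approximately 0.115.
weighted_mean = float(np.sum(histogram * distribution, axis=0))
```

Weighting by the sample distribution means sparsely populated bins contribute little to the final score.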