netcal.scaling.LogisticCalibrationDependent
- class netcal.scaling.LogisticCalibrationDependent(*args, **kwargs)
This calibration method is for detection only and uses multivariate normal distributions to obtain a calibration mapping by means of the confidence as well as additional features. This calibration scheme tries to model several dependencies between the variables given by the input X [1].
It is necessary to provide all data in the input parameter X as a NumPy array of shape (n_samples, n_features), where the confidence must be the first feature given in the input array. The ground-truth samples y must be an array of shape (n_samples,) consisting of binary labels \(y \in \{0, 1\}\). These labels indicate whether the according sample has matched a ground-truth box (\(\text{m}=1\)) or is a false prediction (\(\text{m}=0\)).
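A minimal usage sketch under these conventions; the toy data (1000 detections with the confidence followed by four box features) and all values are purely illustrative:

```python
import numpy as np
from netcal.scaling import LogisticCalibrationDependent

# Illustrative toy data: confidence must be the first feature,
# followed by J=4 relative box features (e.g. cx, cy, width, height).
rng = np.random.RandomState(0)
X = rng.uniform(0.0, 1.0, size=(1000, 5))
y = rng.randint(0, 2, size=1000)  # 1 = matched ground-truth box, 0 = false prediction

calibrator = LogisticCalibrationDependent(method='mle')
calibrator.fit(X, y)

# Calibrated confidence estimates for the same (or new) detections.
calibrated = calibrator.transform(X)
```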
Mathematical background: For confidence calibration in classification tasks, a confidence mapping \(g\) is applied on top of a miscalibrated scoring classifier \(\hat{p} = h(x)\) to deliver a calibrated confidence score \(\hat{q} = g(h(x))\).
For detection calibration, we can also use the additional box regression output, which we denote as \(\hat{r} \in [0, 1]^J\) with \(J\) as the number of dimensions used for the box encoding (e.g. \(J=4\) for x position, y position, width and height). Therefore, the calibration map is not only a function of the confidence score, but also of \(\hat{r}\). To define a general calibration map for binary problems, we use the logistic function and the combined input \(s = (\hat{p}, \hat{r})\) of size \(K\) by
\[g(s) = \frac{1}{1 + \exp(-z(s))},\]
According to [2], we can interpret the logit \(z\) as the logarithm of the posterior odds
\[z(s) = \log \frac{f(\text{m}=1 | s)}{f(\text{m}=0 | s)} \approx \log \frac{f(s | \text{m}=1)}{f(s | \text{m}=0)} = \ell r(s)\]
Inserting multivariate normal density distributions into this framework with \(\mu^+, \mu^- \in \mathbb{R}^K\) and \(\Sigma^+, \Sigma^- \in \mathbb{R}^{K \times K}\) as the mean vectors and covariance matrices for \(\text{m}=1\) and \(\text{m}=0\), respectively, we get a likelihood ratio of
\[\ell r(s) = \frac{1}{2} \log \frac{|\Sigma^-|}{|\Sigma^+|} + \frac{1}{2} \Big[ (s^-)^T (\Sigma^-)^{-1} s^- - (s^+)^T (\Sigma^+)^{-1} s^+ \Big],\]
with \(s^+ = s - \mu^+\) and \(s^- = s - \mu^-\).
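The following NumPy sketch evaluates this likelihood ratio for a single input vector \(s\) and maps it through the logistic function. The mean vectors and covariance matrices are hand-chosen stand-ins for the parameters that fit estimates internally; all names and values are illustrative:

```python
import numpy as np

def gaussian_llr_logit(s, mu_pos, cov_pos, mu_neg, cov_neg):
    # z(s) = 1/2 log(|Sigma^-| / |Sigma^+|)
    #        + 1/2 [ (s^-)^T (Sigma^-)^-1 s^- - (s^+)^T (Sigma^+)^-1 s^+ ]
    s_pos, s_neg = s - mu_pos, s - mu_neg
    log_det = 0.5 * (np.linalg.slogdet(cov_neg)[1] - np.linalg.slogdet(cov_pos)[1])
    quad = 0.5 * (s_neg @ np.linalg.solve(cov_neg, s_neg)
                  - s_pos @ np.linalg.solve(cov_pos, s_pos))
    return log_det + quad

# Hand-chosen parameters for K=2 (confidence + one box feature), purely illustrative.
mu_pos, mu_neg = np.array([0.8, 0.5]), np.array([0.3, 0.5])
cov_pos = np.array([[0.02, 0.0], [0.0, 0.05]])
cov_neg = np.array([[0.04, 0.01], [0.01, 0.05]])

s = np.array([0.7, 0.45])
z = gaussian_llr_logit(s, mu_pos, cov_pos, mu_neg, cov_neg)
g = 1.0 / (1.0 + np.exp(-z))  # calibrated confidence g(s)
```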
To satisfy the constraints on the covariance matrices (symmetry and positive semidefiniteness), we optimize a decomposed matrix \(V\) as
\[\Sigma = V^T V\]
instead of estimating \(\Sigma\) directly. This parametrization guarantees both requirements.
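A short sketch illustrating why this parametrization satisfies both requirements (all names are illustrative):

```python
import numpy as np

rng = np.random.RandomState(0)
V = rng.randn(5, 5)   # unconstrained matrix that is actually optimized
sigma = V.T @ V       # reconstructed covariance Sigma = V^T V

# Symmetric by construction ...
assert np.allclose(sigma, sigma.T)
# ... and positive semidefinite: all eigenvalues >= 0 (up to round-off).
assert np.all(np.linalg.eigvalsh(sigma) >= -1e-10)
```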
This implementation is also capable of capturing the epistemic uncertainty of the calibration method [3].
- Parameters:
method (str, default: "mle") – Method that is used to obtain a calibration mapping (see the sketch after this list):
- 'mle': maximum likelihood estimate without uncertainty, using a convex optimizer.
- 'momentum': MLE estimate using the Momentum optimizer for non-convex optimization.
- 'variational': variational inference with uncertainty.
- 'mcmc': Markov-Chain Monte-Carlo sampling with uncertainty.
momentum_epochs (int, optional, default: 1000) – Number of epochs used by momentum optimizer.
mcmc_steps (int, optional, default: 20) – Number of weight samples obtained by MCMC sampling.
mcmc_chains (int, optional, default: 1) – Number of Markov-chains used in parallel for MCMC sampling (this will result in mcmc_steps * mcmc_chains samples).
mcmc_warmup_steps (int, optional, default: 100) – Warmup steps used for MCMC sampling.
vi_epochs (int, optional, default: 1000) – Number of epochs used for ELBO optimization.
independent_probabilities (bool, optional, default: False) – Boolean for multi-class probabilities. If set to True, the probability estimates for each class are treated as independent of each other (sigmoid).
use_cuda (str or bool, optional, default: False) – Specify if CUDA should be used. If str, you can also specify the device number like ‘cuda:0’, etc.
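For the uncertainty-aware modes, a sketch of the intended usage; the toy data follows the first example above, and the num_samples argument is taken from the transform signature listed below:

```python
import numpy as np
from netcal.scaling import LogisticCalibrationDependent

# Toy detection data as before: confidence first, then box features.
rng = np.random.RandomState(0)
X = rng.uniform(0.0, 1.0, size=(1000, 5))
y = rng.randint(0, 2, size=1000)

# Uncertainty-aware variant: variational inference instead of a single MLE.
calibrator = LogisticCalibrationDependent(method='variational', vi_epochs=500)
calibrator.fit(X, y)

# Drawing several weight samples yields a distribution over calibrated
# confidences rather than a single point estimate.
calibrated = calibrator.transform(X, num_samples=100)
```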
References
Methods
- __init__(*args, **kwargs) – Create an instance of LogisticCalibrationDependent.
- clear() – Clear model parameters.
- convex(data, y, tensorboard, log_dir) – Convex optimization to find the global optimum of the current parameter search.
- epsilon(dtype) – Get the smallest digit that is representable depending on the passed dtype (NumPy or PyTorch).
- fit(X, y[, random_state, tensorboard, log_dir]) – Build the logistic calibration model, either conventionally with a single MLE estimate or with Variational Inference (VI) or Markov-Chain Monte-Carlo (MCMC) sampling to also obtain uncertainty estimates.
- fit_transform(X[, y]) – Fit to data, then transform it.
- get_metadata_routing() – Get metadata routing of this object.
- get_params([deep]) – Get parameters for this estimator.
- guide([X, y]) – Variational substitution definition for each parameter.
- load_model(filename) – Load model from saved torch dump.
- mask() – Seek all relevant weights whose values are negative.
- mcmc(data, y, tensorboard, log_dir) – Perform Markov-Chain Monte-Carlo sampling on the (unknown) posterior.
- model([X, y]) – Definition of the log regression model.
- momentum(data, y, tensorboard, log_dir) – Momentum optimization to find the global optimum of the current parameter search.
- prepare(X) – Preprocessing of input data, called at the beginning of the fit function.
- prior(dtype) – Prior definition of the weights used for log regression.
- save_model(filename) – Save the model instance with torch's save function, as this is safer for torch tensors.
- set_fit_request(*[, log_dir, random_state, ...]) – Request metadata passed to the fit method.
- set_output(*[, transform]) – Set output container.
- set_params(**params) – Set the parameters of this estimator.
- set_transform_request(*[, mean_estimate, ...]) – Request metadata passed to the transform method.
- to(device) – Set distribution parameters to the desired device in order to compute either on CPU or GPU.
- transform(X[, num_samples, random_state, ...]) – After model calibration, this function is used to get calibrated outputs of uncalibrated confidence estimates.
- variational(data, y, tensorboard, log_dir) – Perform variational inference using the guide.
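Building on the method list above, a fitted calibrator can be persisted and restored via save_model and load_model. This sketch assumes that load_model returns the restored calibrator, as its docstring suggests; the filename is illustrative:

```python
# 'calibrator' is a fitted LogisticCalibrationDependent from the sketches above.
calibrator.save_model('logistic_dependent.pt')

# Restore into a fresh instance later; we assume load_model returns the
# restored calibrator (verify against the installed netcal version).
restored = LogisticCalibrationDependent()
restored.load_model('logistic_dependent.pt')
calibrated = restored.transform(X)
```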