netcal.scaling.LogisticCalibrationDependent
- class netcal.scaling.LogisticCalibrationDependent(*args, **kwargs)
This calibration method is for detection only and uses multivariate normal distributions to obtain a calibration mapping by means of the confidence as well as additional features. This calibration scheme tries to model several dependencies between the variables given by the input X [1].
It is necessary to provide all data in the input parameter X as a NumPy array of shape (n_samples, n_features), where the confidence must be the first feature given in the input array. The ground-truth samples y must be an array of shape (n_samples,) consisting of binary labels \(y \in \{0, 1\}\). These labels indicate whether the according sample has matched a ground-truth box (\(\text{m}=1\)) or is a false prediction (\(\text{m}=0\)).
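A minimal usage sketch under these conventions; the toy data (1000 detections with the confidence followed by four box features) and all values are purely illustrative:

```python
import numpy as np
from netcal.scaling import LogisticCalibrationDependent

# Illustrative toy data: confidence must be the first feature,
# followed by J=4 relative box features (e.g. cx, cy, width, height).
rng = np.random.RandomState(0)
X = rng.uniform(0.0, 1.0, size=(1000, 5))
y = rng.randint(0, 2, size=1000)  # 1 = matched ground-truth box, 0 = false prediction

calibrator = LogisticCalibrationDependent(method='mle')
calibrator.fit(X, y)

# Calibrated confidence estimates for the same (or new) detections.
calibrated = calibrator.transform(X)
```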
Mathematical background: For confidence calibration in classification tasks, a confidence mapping \(g\) is applied on top of a miscalibrated scoring classifier \(\hat{p} = h(x)\) to deliver a calibrated confidence score \(\hat{q} = g(h(x))\).
For detection calibration, we can also use the additional box regression output, which we denote as \(\hat{r} \in [0, 1]^J\) with \(J\) as the number of dimensions used for the box encoding (e.g. \(J=4\) for x position, y position, width and height). Therefore, the calibration map is not only a function of the confidence score, but also of \(\hat{r}\). To define a general calibration map for binary problems, we use the logistic function and the combined input \(s = (\hat{p}, \hat{r})\) of size \(K\) by
\[g(s) = \frac{1}{1 + \exp(-z(s))},\]
According to [2], we can interpret the logit \(z\) as the logarithm of the posterior odds
\[z(s) = \log \frac{f(\text{m}=1 | s)}{f(\text{m}=0 | s)} \approx \log \frac{f(s | \text{m}=1)}{f(s | \text{m}=0)} = \ell r(s)\]
Inserting multivariate normal density distributions into this framework with \(\mu^+, \mu^- \in \mathbb{R}^K\) and \(\Sigma^+, \Sigma^- \in \mathbb{R}^{K \times K}\) as the mean vectors and covariance matrices for \(\text{m}=1\) and \(\text{m}=0\), respectively, we get a likelihood ratio of
\[\ell r(s) = \frac{1}{2} \log \frac{|\Sigma^-|}{|\Sigma^+|} + \frac{1}{2} \Big[ (s^-)^T (\Sigma^-)^{-1} s^- - (s^+)^T (\Sigma^+)^{-1} s^+ \Big],\]
with \(s^+ = s - \mu^+\) and \(s^- = s - \mu^-\).
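The following NumPy sketch evaluates this likelihood ratio for a single input vector \(s\) and maps it through the logistic function. The mean vectors and covariance matrices are hand-chosen stand-ins for the parameters that fit estimates internally; all names and values are illustrative:

```python
import numpy as np

def gaussian_llr_logit(s, mu_pos, cov_pos, mu_neg, cov_neg):
    # z(s) = 1/2 log(|Sigma^-| / |Sigma^+|)
    #        + 1/2 [ (s^-)^T (Sigma^-)^-1 s^- - (s^+)^T (Sigma^+)^-1 s^+ ]
    s_pos, s_neg = s - mu_pos, s - mu_neg
    log_det = 0.5 * (np.linalg.slogdet(cov_neg)[1] - np.linalg.slogdet(cov_pos)[1])
    quad = 0.5 * (s_neg @ np.linalg.solve(cov_neg, s_neg)
                  - s_pos @ np.linalg.solve(cov_pos, s_pos))
    return log_det + quad

# Hand-chosen parameters for K=2 (confidence + one box feature), purely illustrative.
mu_pos, mu_neg = np.array([0.8, 0.5]), np.array([0.3, 0.5])
cov_pos = np.array([[0.02, 0.0], [0.0, 0.05]])
cov_neg = np.array([[0.04, 0.01], [0.01, 0.05]])

s = np.array([0.7, 0.45])
z = gaussian_llr_logit(s, mu_pos, cov_pos, mu_neg, cov_neg)
g = 1.0 / (1.0 + np.exp(-z))  # calibrated confidence g(s)
```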
To satisfy the constraints on the covariance matrices (symmetry and positive semidefiniteness), we optimize a decomposed matrix \(V\) as
\[\Sigma = V^T V\]
instead of estimating \(\Sigma\) directly. This parametrization guarantees both requirements.
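A short sketch illustrating why this parametrization satisfies both requirements (all names are illustrative):

```python
import numpy as np

rng = np.random.RandomState(0)
V = rng.randn(5, 5)   # unconstrained matrix that is actually optimized
sigma = V.T @ V       # reconstructed covariance Sigma = V^T V

# Symmetric by construction ...
assert np.allclose(sigma, sigma.T)
# ... and positive semidefinite: all eigenvalues >= 0 (up to round-off).
assert np.all(np.linalg.eigvalsh(sigma) >= -1e-10)
```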
This implementation is also capable of capturing the epistemic uncertainty of the calibration method [3].
- Parameters:
method (str, default: "mle") – Method that is used to obtain a calibration mapping (see the sketch after this list):
- 'mle': maximum likelihood estimate without uncertainty, using a convex optimizer.
- 'momentum': MLE estimate using the Momentum optimizer for non-convex optimization.
- 'variational': variational inference with uncertainty.
- 'mcmc': Markov-Chain Monte-Carlo sampling with uncertainty.
momentum_epochs (int, optional, default: 1000) – Number of epochs used by momentum optimizer.
mcmc_steps (int, optional, default: 20) – Number of weight samples obtained by MCMC sampling.
mcmc_chains (int, optional, default: 1) – Number of Markov-chains used in parallel for MCMC sampling (this will result in mcmc_steps * mcmc_chains samples).
mcmc_warmup_steps (int, optional, default: 100) – Warmup steps used for MCMC sampling.
vi_epochs (int, optional, default: 1000) – Number of epochs used for ELBO optimization.
independent_probabilities (bool, optional, default: False) – Boolean for multi-class probabilities. If set to True, the probability estimates for each class are treated as independent of each other (sigmoid).
use_cuda (str or bool, optional, default: False) – Specify if CUDA should be used. If str, you can also specify the device number like ‘cuda:0’, etc.
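For the uncertainty-aware modes, a sketch of the intended usage; the toy data follows the first example above, and the num_samples argument is taken from the transform signature listed below:

```python
import numpy as np
from netcal.scaling import LogisticCalibrationDependent

# Toy detection data as before: confidence first, then box features.
rng = np.random.RandomState(0)
X = rng.uniform(0.0, 1.0, size=(1000, 5))
y = rng.randint(0, 2, size=1000)

# Uncertainty-aware variant: variational inference instead of a single MLE.
calibrator = LogisticCalibrationDependent(method='variational', vi_epochs=500)
calibrator.fit(X, y)

# Drawing several weight samples yields a distribution over calibrated
# confidences rather than a single point estimate.
calibrated = calibrator.transform(X, num_samples=100)
```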
References
Methods
- __init__(*args, **kwargs) – Create an instance of LogisticCalibrationDependent.
- clear() – Clear model parameters.
- convex(data, y, tensorboard, log_dir) – Convex optimization to find the global optimum of the current parameter search.
- epsilon(dtype) – Get the smallest digit that is representable depending on the passed dtype (NumPy or PyTorch).
- fit(X, y[, random_state, tensorboard, log_dir]) – Build the logistic calibration model, either conventionally with a single MLE estimate or with Variational Inference (VI) or Markov-Chain Monte-Carlo (MCMC) sampling to also obtain uncertainty estimates.
- fit_transform(X[, y]) – Fit to data, then transform it.
- get_metadata_routing() – Get metadata routing of this object.
- get_params([deep]) – Get parameters for this estimator.
- guide([X, y]) – Variational substitution definition for each parameter.
- load_model(filename) – Load model from saved torch dump.
- mask() – Seek all relevant weights whose values are negative.
- mcmc(data, y, tensorboard, log_dir) – Perform Markov-Chain Monte-Carlo sampling on the (unknown) posterior.
- model([X, y]) – Definition of the log regression model.
- momentum(data, y, tensorboard, log_dir) – Momentum optimization to find the global optimum of the current parameter search.
- prepare(X) – Preprocessing of input data, called at the beginning of the fit function.
- prior(dtype) – Prior definition of the weights used for log regression.
- save_model(filename) – Save the model instance with torch's save function, as this is safer for torch tensors.
- set_fit_request(*[, log_dir, random_state, ...]) – Request metadata passed to the fit method.
- set_output(*[, transform]) – Set output container.
- set_params(**params) – Set the parameters of this estimator.
- set_transform_request(*[, mean_estimate, ...]) – Request metadata passed to the transform method.
- to(device) – Set distribution parameters to the desired device in order to compute either on CPU or GPU.
- transform(X[, num_samples, random_state, ...]) – After model calibration, this function is used to get calibrated outputs of uncalibrated confidence estimates.
- variational(data, y, tensorboard, log_dir) – Perform variational inference using the guide.
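Building on the method list above, a fitted calibrator can be persisted and restored via save_model and load_model. This sketch assumes that load_model returns the restored calibrator, as its docstring suggests; the filename is illustrative:

```python
# 'calibrator' is a fitted LogisticCalibrationDependent from the sketches above.
calibrator.save_model('logistic_dependent.pt')

# Restore into a fresh instance later; we assume load_model returns the
# restored calibrator (verify against the installed netcal version).
restored = LogisticCalibrationDependent()
restored.load_model('logistic_dependent.pt')
calibrated = restored.transform(X)
```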