netcal.scaling.AbstractLogisticRegression

class netcal.scaling.AbstractLogisticRegression(method: str = 'mle', momentum_epochs: int = 1000, mcmc_steps: int = 250, mcmc_chains: int = 1, mcmc_warmup_steps: int = 100, vi_epochs: int = 1000, detection: bool = False, independent_probabilities: bool = False, use_cuda: str | bool = False, **kwargs)

Abstract class for all calibration methods that are based on logistic regression. We extended common scaling calibration methods by Bayesian epistemic uncertainty modelling [1]. On the one hand, this class supports Maximum Likelihood (MLE) estimates without uncertainty. Such an estimate is commonly obtained by minimizing the negative log-likelihood, given by

\[\theta_\text{MLE} = \underset{\theta}{\text{arg min}} \, -\sum_{i=1}^N \log p(y_i | x_i, \theta)\]

with samples \(X\), labels \(y\), weights \(\theta\) and likelihood \(p(y|X, \theta)\). See the implementations of the derived classes for more details.
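For illustration, the MLE objective above can be reproduced for a simple Platt-style calibration (a single weight and bias applied to binary logits) with a generic optimizer. This is only a standalone sketch with synthetic data, not netcal's internal convex solver:

```python
import numpy as np
from scipy.optimize import minimize

def negative_log_likelihood(theta, logits, labels):
    # theta = (weight, bias) of a Platt-style logistic mapping
    weight, bias = theta
    z = weight * logits + bias
    # numerically stable Bernoulli log-likelihood with logits z
    log_p = -np.logaddexp(0.0, -z)     # log sigmoid(z)
    log_1mp = -np.logaddexp(0.0, z)    # log (1 - sigmoid(z))
    return -np.sum(labels * log_p + (1.0 - labels) * log_1mp)

# synthetic uncalibrated logits and binary labels, for illustration only
rng = np.random.default_rng(0)
logits = rng.normal(size=500)
labels = (rng.random(500) < 1.0 / (1.0 + np.exp(-2.0 * logits))).astype(float)

# theta_MLE: the (weight, bias) pair minimizing the negative log-likelihood
theta_mle = minimize(negative_log_likelihood, x0=np.array([1.0, 0.0]),
                     args=(logits, labels)).x
```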

On the other hand, the methods currently available to obtain uncertainty in calibration are Variational Inference (VI) and Markov-Chain Monte-Carlo (MCMC) sampling. Instead of estimating the weights \(\theta\) of the logistic regression directly, we place a probability distribution over the weights and infer their posterior via Bayes' theorem

\[p(\theta | X, y) = \frac{p(y | X, \theta) p(\theta)}{\int p(y | X, \theta) p(\theta) d\theta}\]

Since the marginal likelihood cannot be evaluated analytically for logistic regression, we need to approximate the posterior by either MCMC sampling or Variational Inference. From the approximate posterior, we then draw multiple weight samples in order to obtain multiple related calibration results, which yield a mean and a deviation for each input sample.

MCMC sampling allows sampling from a posterior without knowing the marginal likelihood. This method is unbiased but computationally expensive. In contrast, Variational Inference defines a simple variational distribution \(q_\Phi(\theta)\) (e.g. a normal distribution) for each weight, parametrized by \(\Phi\). The optimization objective is then the minimization of the Kullback-Leibler divergence between the variational distribution \(q_\Phi(\theta)\) and the true posterior \(p(\theta | X, y)\). This can be solved using the ELBO method [2]. Variational Inference is faster than MCMC but also biased.
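The library implements these variants internally on top of PyTorch/Pyro; the following is only a standalone Pyro sketch (synthetic data, illustrative variable names, not the library's actual model or guide) of VI and MCMC for a one-dimensional logistic calibration model:

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO, MCMC, NUTS
from pyro.optim import Adam

# synthetic uncalibrated logits and binary labels, for illustration only
torch.manual_seed(0)
logits = torch.randn(200)
labels = (torch.sigmoid(1.5 * logits) > torch.rand(200)).float()

def model(logits, labels):
    # priors p(theta) over weight and bias of a Platt-style mapping
    weight = pyro.sample("weight", dist.Normal(1.0, 10.0))
    bias = pyro.sample("bias", dist.Normal(0.0, 10.0))
    with pyro.plate("data", logits.shape[0]):
        # Bernoulli likelihood p(y | x, theta)
        pyro.sample("obs", dist.Bernoulli(logits=weight * logits + bias), obs=labels)

def guide(logits, labels):
    # variational distribution q_Phi(theta): an independent normal per weight
    w_loc = pyro.param("w_loc", torch.tensor(1.0))
    w_scale = pyro.param("w_scale", torch.tensor(0.1), constraint=dist.constraints.positive)
    b_loc = pyro.param("b_loc", torch.tensor(0.0))
    b_scale = pyro.param("b_scale", torch.tensor(0.1), constraint=dist.constraints.positive)
    pyro.sample("weight", dist.Normal(w_loc, w_scale))
    pyro.sample("bias", dist.Normal(b_loc, b_scale))

# Variational Inference: maximize the ELBO, i.e. minimize the KL divergence
svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())
for _ in range(1000):
    svi.step(logits, labels)

# MCMC alternative: unbiased but slower posterior sampling with NUTS
mcmc = MCMC(NUTS(model), num_samples=250, warmup_steps=100)
mcmc.run(logits, labels)
posterior_weight = mcmc.get_samples()["weight"]
```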

Parameters:
  • method (str, default: "mle") – Method that is used to obtain a calibration mapping:
      - 'mle': Maximum likelihood estimate without uncertainty using a convex optimizer.
      - 'momentum': MLE estimate using Momentum optimizer for non-convex optimization.
      - 'variational': Variational Inference with uncertainty.
      - 'mcmc': Markov-Chain Monte-Carlo sampling with uncertainty.

  • momentum_epochs (int, optional, default: 1000) – Number of epochs used by momentum optimizer.

  • mcmc_steps (int, optional, default: 250) – Number of weight samples obtained by MCMC sampling.

  • mcmc_chains (int, optional, default: 1) – Number of Markov-chains used in parallel for MCMC sampling (this will result in mcmc_steps * mcmc_chains samples).

  • mcmc_warmup_steps (int, optional, default: 100) – Warmup steps used for MCMC sampling.

  • vi_epochs (int, optional, default: 1000) – Number of epochs used for ELBO optimization.

  • detection (bool, default: False) – If False, the input array ‘X’ is treated as multi-class confidence input (softmax) with shape (n_samples, [n_classes]). If True, the input array ‘X’ is treated as box predictions with several box features (at least the box confidence must be present) with shape (n_samples, [n_box_features]).

  • independent_probabilities (bool, optional, default: False) – Boolean for multi-class probabilities. If set to True, the probability estimates for each class are treated as independent of each other (sigmoid).

  • use_cuda (str or bool, optional, default: False) – Specify if CUDA should be used. If str, you can also specify the device number like ‘cuda:0’, etc.
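Given these parameters, a basic calibration workflow looks as follows. This is a sketch assuming the concrete subclass netcal.scaling.LogisticCalibration (Platt scaling) and synthetic binary data; the abstract class itself is not instantiated directly:

```python
import numpy as np
from netcal.scaling import LogisticCalibration

# synthetic binary example: uncalibrated confidences in [0, 1] and 0/1 ground truth
rng = np.random.default_rng(0)
confidences = rng.uniform(0.5, 1.0, size=1000)
ground_truth = (rng.random(1000) < confidences ** 2).astype(int)

# 'mle' (default) fits a single maximum likelihood estimate via the convex optimizer
platt = LogisticCalibration(method='mle')
platt.fit(confidences, ground_truth)
calibrated = platt.transform(confidences)
```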

References

Methods

__init__([method, momentum_epochs, ...])

Create an instance of AbstractLogisticRegression.

clear()

Clear model parameters.

convex(data, y, tensorboard, log_dir)

Convex optimization to find the global optimum of the current parameter search.

epsilon(dtype)

Get the smallest positive value that is representable by the passed dtype (NumPy or PyTorch).

fit(X, y[, random_state, tensorboard, log_dir])

Build the logistic calibration model either conventionally with a single MLE estimate, or with Variational Inference (VI) or Markov-Chain Monte-Carlo (MCMC) sampling to also obtain uncertainty estimates.

fit_transform(X[, y])

Fit to data, then transform it.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

guide([X, y])

Variational substitution definition for each parameter.

load_model(filename)

Load model from saved torch dump.

mask()

Search for all relevant weights whose values are negative.

mcmc(data, y, tensorboard, log_dir)

Perform Markov-Chain Monte-Carlo sampling on the (unknown) posterior.

model([X, y])

Definition of the logistic regression model.

momentum(data, y, tensorboard, log_dir)

Momentum optimization to find the global optimum of the current parameter search.

prepare(X)

Preprocessing of the input data; called at the beginning of the fit function.

prior(dtype)

Prior definition of the weights and intercept used for logistic regression.

save_model(filename)

Save the model instance with torch's save function, as this is safer for torch tensors.

set_fit_request(*[, log_dir, random_state, ...])

Request metadata passed to the fit method.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

set_transform_request(*[, mean_estimate, ...])

Request metadata passed to the transform method.

to(device)

Move the distribution parameters to the desired device in order to compute on either CPU or GPU.

transform(X[, num_samples, random_state, ...])

After model calibration, this function is used to get calibrated outputs of uncalibrated confidence estimates (see the usage sketch after this method overview).

variational(data, y, tensorboard, log_dir)

Perform variational inference using the guide.
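To illustrate the uncertainty-aware workflow with fit, transform, save_model and load_model, the following sketch again assumes the concrete subclass netcal.scaling.LogisticCalibration and synthetic data; the exact shape of the transformed output depends on the subclass and on the keyword arguments passed to transform:

```python
import numpy as np
from netcal.scaling import LogisticCalibration

# synthetic binary data, for illustration only
rng = np.random.default_rng(0)
confidences = rng.uniform(0.5, 1.0, size=1000)
ground_truth = (rng.random(1000) < confidences ** 2).astype(int)

# approximate the weight posterior with Variational Inference
bayesian = LogisticCalibration(method='variational', vi_epochs=1000)
bayesian.fit(confidences, ground_truth)

# draw several calibrated estimates from the posterior; num_samples controls
# how many weight samples are used (cf. the transform signature above)
calibrated_samples = bayesian.transform(confidences, num_samples=100)

# persist and restore the fitted model using the torch-based dump described above
bayesian.save_model("calibration.pt")
restored = LogisticCalibration(method='variational')
restored.load_model("calibration.pt")
```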