netcal.regression.GPBeta

class netcal.regression.GPBeta(n_inducing_points: int = 12, n_random_samples: int = 128, *, name_prefix: str = 'gpbeta', **kwargs)

GP-Beta recalibration method for regression uncertainty calibration, using the well-known Beta calibration method from classification calibration in combination with Gaussian process (GP) parameter estimation. The basic idea of GP-Beta [1] is to apply recalibration to the uncalibrated cumulative distribution function (CDF), similar to the Isotonic Regression method [2]. Since the CDF is restricted to the \([0, 1]\) interval, the authors in [1] propose to use the Beta calibration scheme [3] known from confidence calibration. Furthermore, the authors use a GP to obtain the recalibration parameters of the Beta function for each sample individually, so that the method finally achieves distribution calibration [1].

Mathematical background: Let \(f_Y(y)\) denote the uncalibrated probability density function (PDF), targeting the probability distribution for \(Y\). Let \(\tau_y \in [0, 1]\) denote a certain quantile on the uncalibrated CDF, given by \(\tau_y = F_Y(y)\). Furthermore, let \(g_Y(y)\) and \(G_Y(y)\) denote the recalibrated PDF and CDF, respectively. The Beta calibration function \(\mathbf{c}_\beta(\tau_y)\) known from [3] is given by

\[\mathbf{c}_\beta(\tau_y) = \phi\big( a \log(\tau_y) - b \log(1-\tau_y) + c \big)\]

with recalibration parameters \(a,b \in \mathbb{R}_{>0}\) and \(c \in \mathbb{R}\), and \(\phi(\cdot)\) as the sigmoid function [3]. This method serves as a mapping from the uncalibrated CDF to the calibrated one, so that

\[G_Y(y) = \mathbf{c}_\beta\big( F_Y(y) \big)\]

holds. The PDF is the derivative of the CDF, so that the calibrated PDF is given by

\[g_Y(y) = \frac{\partial \mathbf{c}_\beta}{\partial y} = \frac{\partial \mathbf{c}_\beta}{\partial \tau_y} \frac{\partial \tau_y}{\partial y} = \mathbf{r}_\beta(\tau_y) f_Y(y) ,\]

with \(\mathbf{r}_\beta(\tau_y)\) as a beta link function [1] given by

\[\mathbf{r}_\beta(\tau_y) = \Bigg(\frac{a}{\tau_y} + \frac{b}{1-\tau_y} \Bigg) \mathbf{c}_\beta(\tau_y) \big(1 - \mathbf{c}_\beta(\tau_y)\big) .\]
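The following minimal sketch makes the formulas above concrete. It is purely illustrative: the parameters \(a, b, c\) are hand-picked rather than fitted, and the Gaussian \(f_Y\) merely stands in for an arbitrary uncalibrated predictive distribution.

    import numpy as np
    from scipy.stats import norm

    def c_beta(tau, a, b, c):
        """Beta calibration map from uncalibrated to calibrated quantiles."""
        return 1. / (1. + np.exp(-(a * np.log(tau) - b * np.log(1. - tau) + c)))

    def r_beta(tau, a, b, c):
        """Beta link function: the derivative of c_beta with respect to tau."""
        cb = c_beta(tau, a, b, c)
        return (a / tau + b / (1. - tau)) * cb * (1. - cb)

    # illustrative (not fitted) recalibration parameters with a, b > 0
    a, b, c = 1.2, 0.8, -0.1

    # illustrative uncalibrated Gaussian prediction f_Y with mean 0 and stddev 1
    y = np.linspace(-3., 3., 7)
    tau_y = norm.cdf(y)                           # uncalibrated quantiles F_Y(y)
    G_y = c_beta(tau_y, a, b, c)                  # recalibrated CDF G_Y(y)
    g_y = r_beta(tau_y, a, b, c) * norm.pdf(y)    # recalibrated PDF g_Y(y)
    print(G_y, g_y)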

Finally, the recalibration parameters \(a, b\) and \(c\) are obtained using a Gaussian process scheme. In this way, it is possible to apply non-parametric distribution calibration [1].
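The snippet below is a conceptual sketch only and does not reproduce the library's actual GP inference: it merely illustrates that every sample receives its own parameter triple \((a, b, c)\), where one common way to enforce the positivity of \(a\) and \(b\) is an exponential transform of unconstrained latent outputs (the latent values here are random placeholders for GP posterior samples).

    import numpy as np

    rng = np.random.default_rng(0)

    # placeholder for per-sample latent GP outputs: 4 samples, 3 latent dimensions
    z = rng.normal(size=(4, 3))

    a = np.exp(z[:, 0])   # a > 0 via exponential transform
    b = np.exp(z[:, 1])   # b > 0 via exponential transform
    c = z[:, 2]           # c remains unconstrained

    # each sample i would be recalibrated with its own triple (a[i], b[i], c[i])
    for i in range(len(z)):
        print(f"sample {i}: a={a[i]:.3f}, b={b[i]:.3f}, c={c[i]:.3f}")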

Parameters:
  • n_inducing_points (int) – Number of inducing points used to approximate the input space. These inducing points are also optimized.

  • n_random_samples (int) – Number of random samples used to sample from the parameter distribution during optimization and inference.

  • n_epochs (int, default: 200) – Number of optimization epochs.

  • batch_size (int, default: 256) – Size of batches during optimization.

  • num_workers (int, optional, default: 0) – Number of workers used for the dataloader.

  • lr (float, optional, default: 1e-2) – Learning rate used for the Adam optimizer.

  • use_cuda (str or bool, optional, default: False) – The optimization and inference may also run on a CUDA device. If True, use the first available CUDA device. You can also pass a string “cuda:0”, “cuda:1”, etc., to specify the CUDA device. If False, use the CPU for optimization and inference.

  • jitter (float, optional, default: 1e-5) – Small value added to the diagonal of a covariance matrix to stabilize the Cholesky decomposition during Gaussian process optimization.

  • name_prefix (str, optional, default: "gpbeta") – Name prefix internally used in Pyro to distinguish between parameter stores.
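
A minimal usage sketch under the standard netcal fit/transform workflow is given below. The variables mean, stddev, and ground_truth are placeholders for the predictions of a probabilistic regressor and the matching targets; the unpacking of transform's return value is an assumption and should be checked against the installed library version.

    import numpy as np
    from netcal.regression import GPBeta

    # placeholder data: predicted means/stddevs of a probabilistic regressor
    # and the matching ground-truth targets
    rng = np.random.default_rng(0)
    mean = rng.normal(size=100)
    stddev = np.abs(rng.normal(size=100)) + 0.1
    ground_truth = mean + stddev * rng.normal(size=100)

    gpbeta = GPBeta(n_inducing_points=12, n_random_samples=128, n_epochs=200)
    gpbeta.fit((mean, stddev), ground_truth)

    # assumed return: evaluation points t with the recalibrated PDF and CDF
    t, pdf, cdf = gpbeta.transform((mean, stddev))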

References

[1] Hao Song, Tom Diethe, Meelis Kull and Peter Flach: "Distribution calibration for regression." International Conference on Machine Learning (ICML), PMLR, 2019.

[2] Volodymyr Kuleshov, Nathan Fenner and Stefano Ermon: "Accurate uncertainties for deep learning using calibrated regression." International Conference on Machine Learning (ICML), PMLR, 2018.

[3] Meelis Kull, Telmo Silva Filho and Peter Flach: "Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers." International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, 2017.

Methods

__init__([n_inducing_points, ...])

Constructor.

add_module(name, module)

Adds a child module to the current module.

added_loss_terms()

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self.

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

buffers([recurse])

Returns an iterator over module buffers.

children()

Returns an iterator over immediate children modules.

clear()

Clear module parameters.

constraint_for_parameter_name(param_name)

constraints()

cpu()

Moves all model parameters and buffers to the CPU.

cuda([device])

Moves all model parameters and buffers to the GPU.

double()

Casts all floating point parameters and buffers to double datatype.

epsilon(dtype)

Get the smallest positive value that is representable by the passed dtype (NumPy or PyTorch).

eval()

Sets the module in evaluation mode.

extra_repr()

Returns additional information to print when str(method) is called.

fit(X, y[, tensorboard])

Fit a GP model to the provided data using Gaussian process optimization.

fit_transform(X[, y])

Fit to data, then transform it.

float()

Casts all floating point parameters and buffers to float datatype.

forward(x)

Forward method defines the prior for the GP.

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

get_extra_state()

Returns any extra state to include in the module's state_dict.

get_fantasy_model(inputs, targets, **kwargs)

Returns a new GP model that incorporates the specified inputs and targets as new training data using online variational conditioning (OVC).

get_metadata_routing()

Get metadata routing of this object.

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

get_params([deep])

Overwrites the base class's get_params function to also capture child parameters such as the variational strategy, LMC coefficients, etc.

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

guide(x, y)

Pyro guide that defines the variational distribution for the Gaussian process.

half()

Casts all floating point parameters and buffers to half datatype.

hyperparameters()

initialize(**kwargs)

Set a value for a parameter.

ipu([device])

Moves all model parameters and buffers to the IPU.

load_model(filename[, use_cuda])

Overwrites the base class's load_model function, as the parameters of the GP methods are stored differently from those of the remaining methods.

load_state_dict(state_dict[, strict])

Copies parameters and buffers from state_dict into this module and its descendants.

load_strict_shapes(value)

local_load_samples(samples_dict, memo, prefix)

Defines local behavior of this Module when loading parameters from a samples_dict generated by a Pyro sampling mechanism.

model(x, y)

Model function that defines the computation graph.

modules()

Returns an iterator over all modules in the network.

named_added_loss_terms()

Returns an iterator over the module's added loss terms, yielding both the name of each added loss term as well as the loss term itself.

named_buffers([prefix, recurse])

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

named_constraints([memo, prefix])

named_hyperparameters()

named_modules([memo, prefix, remove_duplicate])

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

named_parameters([prefix, recurse])

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

named_parameters_and_constraints()

named_priors([memo, prefix])

Returns an iterator over the module's priors, yielding the name of the prior, the prior, the associated parameter names, and the transformation callable.

named_variational_parameters()

parameters([recurse])

Returns an iterator over module parameters.

pyro_guide(input[, beta, name_prefix])

(For Pyro integration only).

pyro_load_from_samples(samples_dict)

Convert this Module into a batch Module by loading parameters from the given samples_dict.

pyro_model(input[, beta, name_prefix])

(For Pyro integration only).

pyro_sample_from_prior()

For each parameter in this Module and its submodules that has a defined prior, sample a value for that parameter from its corresponding prior with a pyro.sample primitive and load the resulting value into the parameter.

register_added_loss_term(name)

register_backward_hook(hook)

Registers a backward hook on the module.

register_buffer(name, tensor[, persistent])

Adds a buffer to the module.

register_constraint(param_name, constraint)

register_forward_hook(hook)

Registers a forward hook on the module.

register_forward_pre_hook(hook)

Registers a forward pre-hook on the module.

register_full_backward_hook(hook)

Registers a full backward hook on the module.

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module's load_state_dict is called.

register_module(name, module)

Alias for add_module().

register_parameter(name, parameter)

Adds a parameter to the module.

register_prior(name, prior, param_or_closure)

Adds a prior to the module.

requires_grad_([requires_grad])

Change if autograd should record operations on parameters in this module.

sample_from_prior(prior_name)

Sample parameter values from prior.

save_model(filename)

Save the model instance with torch's save function, as this is safer for torch tensors.

set_extra_state(state)

This function is called from load_state_dict() to handle any extra state found within the state_dict.

set_fit_request(*[, tensorboard])

Request metadata passed to the fit method.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

set_transform_request(*[, t])

Request metadata passed to the transform method.

share_memory()

See torch.Tensor.share_memory_()

state_dict(*args[, destination, prefix, ...])

Returns a dictionary containing references to the whole state of the module.

to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

to_empty(*, device)

Moves the parameters and buffers to the specified device without copying storage.

to_pyro_random_module()

train([mode])

Sets the module in training mode.

transform(X[, t])

Transform the given stddev to a distribution-calibrated one using the input mean and stddev as priors for the underlying Gaussian process.

type(dst_type)

Casts all parameters and buffers to dst_type.

update_added_loss_term(name, added_loss_term)

variational_parameters()

xpu([device])

Moves all model parameters and buffers to the XPU.

zero_grad([set_to_none])

Sets gradients of all model parameters to zero.