netcal.regression.GPBeta

class netcal.regression.GPBeta(n_inducing_points: int = 12, n_random_samples: int = 128, *, name_prefix: str = 'gpbeta', **kwargs)

GP-Beta recalibration method for regression uncertainty calibration, using the well-known Beta calibration method from classification calibration in combination with Gaussian process (GP) parameter estimation. The basic idea of GP-Beta [1] is to apply recalibration to the uncalibrated cumulative distribution function (CDF), similar to the Isotonic Regression method [2]. Since the CDF is restricted to the \([0, 1]\) interval, the authors in [1] propose to use the Beta calibration scheme [3] known from confidence calibration. Furthermore, the authors use a GP to obtain the recalibration parameters of the Beta function for each sample individually, so that the method finally achieves distribution calibration [1].

Mathematical background: Let \(f_Y(y)\) denote the uncalibrated probability density function (PDF), targeting the probability distribution for \(Y\). Let \(\tau_y \in [0, 1]\) denote a certain quantile on the uncalibrated CDF, given by \(\tau_y = F_Y(y)\). Furthermore, let \(g_Y(y)\) and \(G_Y(y)\) denote the recalibrated PDF and CDF, respectively. The Beta calibration function \(\mathbf{c}_\beta(\tau_y)\) known from [3] is given by

\[\mathbf{c}_\beta(\tau_y) = \phi\big( a \log(\tau_y) - b \log(1-\tau_y) + c \big)\]

with recalibration parameters \(a,b \in \mathbb{R}_{>0}\) and \(c \in \mathbb{R}\), and \(\phi(\cdot)\) as the sigmoid function [3]. This method serves as a mapping from the uncalibrated CDF to the calibrated one, so that

\[G_Y(y) = \mathbf{c}_\beta\big( F_Y(y) \big)\]

holds. The PDF is the derivative of the CDF, so that the calibrated PDF is given by

\[g_Y(y) = \frac{\partial \mathbf{c}_\beta}{\partial y} = \frac{\partial \mathbf{c}_\beta}{\partial \tau_y} \frac{\partial \tau_y}{\partial y} = \mathbf{r}_\beta(\tau_y) f_Y(y) ,\]

with \(\mathbf{r}_\beta(\tau_y)\) as a beta link function [1] given by

\[\mathbf{r}_\beta(\tau_y) = \Bigg(\frac{a}{\tau_y} + \frac{b}{1-\tau_y} \Bigg) \mathbf{c}_\beta(\tau_y) \big(1 - \mathbf{c}_\beta(\tau_y)\big) .\]
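The following minimal sketch makes the formulas above concrete. It is purely illustrative: the parameters \(a, b, c\) are hand-picked rather than fitted, and the Gaussian \(f_Y\) merely stands in for an arbitrary uncalibrated predictive distribution.

    import numpy as np
    from scipy.stats import norm

    def c_beta(tau, a, b, c):
        """Beta calibration map from uncalibrated to calibrated quantiles."""
        return 1. / (1. + np.exp(-(a * np.log(tau) - b * np.log(1. - tau) + c)))

    def r_beta(tau, a, b, c):
        """Beta link function: the derivative of c_beta with respect to tau."""
        cb = c_beta(tau, a, b, c)
        return (a / tau + b / (1. - tau)) * cb * (1. - cb)

    # illustrative (not fitted) recalibration parameters with a, b > 0
    a, b, c = 1.2, 0.8, -0.1

    # illustrative uncalibrated Gaussian prediction f_Y with mean 0 and stddev 1
    y = np.linspace(-3., 3., 7)
    tau_y = norm.cdf(y)                           # uncalibrated quantiles F_Y(y)
    G_y = c_beta(tau_y, a, b, c)                  # recalibrated CDF G_Y(y)
    g_y = r_beta(tau_y, a, b, c) * norm.pdf(y)    # recalibrated PDF g_Y(y)
    print(G_y, g_y)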

Finally, the recalibration parameters \(a, b\) and \(c\) are obtained using a Gaussian process scheme. In this way, it is possible to apply non-parametric distribution calibration [1].
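The snippet below is a conceptual sketch only and does not reproduce the library's actual GP inference: it merely illustrates that every sample receives its own parameter triple \((a, b, c)\), where one common way to enforce the positivity of \(a\) and \(b\) is an exponential transform of unconstrained latent outputs (the latent values here are random placeholders for GP posterior samples).

    import numpy as np

    rng = np.random.default_rng(0)

    # placeholder for per-sample latent GP outputs: 4 samples, 3 latent dimensions
    z = rng.normal(size=(4, 3))

    a = np.exp(z[:, 0])   # a > 0 via exponential transform
    b = np.exp(z[:, 1])   # b > 0 via exponential transform
    c = z[:, 2]           # c remains unconstrained

    # each sample i would be recalibrated with its own triple (a[i], b[i], c[i])
    for i in range(len(z)):
        print(f"sample {i}: a={a[i]:.3f}, b={b[i]:.3f}, c={c[i]:.3f}")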

Parameters:
  • n_inducing_points (int) – Number of inducing points used to approximate the input space. These inducing points are also optimized.

  • n_random_samples (int) – Number of random samples used to sample from the parameter distribution during optimization and inference.

  • n_epochs (int, default: 200) – Number of optimization epochs.

  • batch_size (int, default: 256) – Size of batches during optimization.

  • num_workers (int, optional, default: 0) – Number of workers used for the dataloader.

  • lr (float, optional, default: 1e-2) – Learning rate used for the Adam optimizer.

  • use_cuda (str or bool, optional, default: False) – The optimization and inference may also run on a CUDA device. If True, use the first available CUDA device. You can also pass a string “cuda:0”, “cuda:1”, etc., to specify the CUDA device. If False, use the CPU for optimization and inference.

  • jitter (float, optional, default: 1e-5) – Small value added to the diagonal of a covariance matrix to stabilize the Cholesky decomposition during Gaussian process optimization.

  • name_prefix (str, optional, default: "gpbeta") – Name prefix internally used in Pyro to distinguish between parameter stores.
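
A minimal usage sketch under the standard netcal fit/transform workflow is given below. The variables mean, stddev, and ground_truth are placeholders for the predictions of a probabilistic regressor and the matching targets; the unpacking of transform's return value is an assumption and should be checked against the installed library version.

    import numpy as np
    from netcal.regression import GPBeta

    # placeholder data: predicted means/stddevs of a probabilistic regressor
    # and the matching ground-truth targets
    rng = np.random.default_rng(0)
    mean = rng.normal(size=100)
    stddev = np.abs(rng.normal(size=100)) + 0.1
    ground_truth = mean + stddev * rng.normal(size=100)

    gpbeta = GPBeta(n_inducing_points=12, n_random_samples=128, n_epochs=200)
    gpbeta.fit((mean, stddev), ground_truth)

    # assumed return: evaluation points t with the recalibrated PDF and CDF
    t, pdf, cdf = gpbeta.transform((mean, stddev))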

References

[1] Hao Song, Tom Diethe, Meelis Kull and Peter Flach: "Distribution calibration for regression." International Conference on Machine Learning (ICML), PMLR, 2019.

[2] Volodymyr Kuleshov, Nathan Fenner and Stefano Ermon: "Accurate uncertainties for deep learning using calibrated regression." International Conference on Machine Learning (ICML), PMLR, 2018.

[3] Meelis Kull, Telmo Silva Filho and Peter Flach: "Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers." International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, 2017.

Methods

__init__([n_inducing_points, ...])

Constructor.

add_module(name, module)

Adds a child module to the current module.

added_loss_terms()

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self.

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

buffers([recurse])

Returns an iterator over module buffers.

children()

Returns an iterator over immediate children modules.

clear()

Clear module parameters.

constraint_for_parameter_name(param_name)

constraints()

cpu()

Moves all model parameters and buffers to the CPU.

cuda([device])

Moves all model parameters and buffers to the GPU.

double()

Casts all floating point parameters and buffers to double datatype.

epsilon(dtype)

Get the smallest positive value that is representable by the passed dtype (NumPy or PyTorch).

eval()

Sets the module in evaluation mode.

extra_repr()

Returns additional information to print when str(method) is called.

fit(X, y[, tensorboard])

Fit a GP model to the provided data using Gaussian process optimization.

fit_transform(X[, y])

Fit to data, then transform it.

float()

Casts all floating point parameters and buffers to float datatype.

forward(x)

Forward method defines the prior for the GP.

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

get_extra_state()

Returns any extra state to include in the module's state_dict.

get_fantasy_model(inputs, targets, **kwargs)

Returns a new GP model that incorporates the specified inputs and targets as new training data using online variational conditioning (OVC).

get_metadata_routing()

Get metadata routing of this object.

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

get_params([deep])

Overwrites the base class's get_params function to also capture child parameters such as the variational strategy, LMC coefficients, etc.

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

guide(x, y)

Pyro guide that defines the variational distribution for the Gaussian process.

half()

Casts all floating point parameters and buffers to half datatype.

hyperparameters()

initialize(**kwargs)

Set a value for a parameter.

ipu([device])

Moves all model parameters and buffers to the IPU.

load_model(filename[, use_cuda])

Overwrites the base class's load_model function, as the parameters of the GP methods are stored differently from those of the remaining methods.

load_state_dict(state_dict[, strict])

Copies parameters and buffers from state_dict into this module and its descendants.

load_strict_shapes(value)

local_load_samples(samples_dict, memo, prefix)

Defines local behavior of this Module when loading parameters from a samples_dict generated by a Pyro sampling mechanism.

model(x, y)

Model function that defines the computation graph.

modules()

Returns an iterator over all modules in the network.

named_added_loss_terms()

Returns an iterator over the module's added loss terms, yielding both the name of each added loss term as well as the loss term itself.

named_buffers([prefix, recurse])

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

named_constraints([memo, prefix])

named_hyperparameters()

named_modules([memo, prefix, remove_duplicate])

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

named_parameters([prefix, recurse])

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

named_parameters_and_constraints()

named_priors([memo, prefix])

Returns an iterator over the module's priors, yielding the name of the prior, the prior, the associated parameter names, and the transformation callable.

named_variational_parameters()

parameters([recurse])

Returns an iterator over module parameters.

pyro_guide(input[, beta, name_prefix])

(For Pyro integration only).

pyro_load_from_samples(samples_dict)

Convert this Module into a batch Module by loading parameters from the given samples_dict.

pyro_model(input[, beta, name_prefix])

(For Pyro integration only).

pyro_sample_from_prior()

For each parameter in this Module and its submodules that has a defined prior, sample a value for that parameter from its corresponding prior with a pyro.sample primitive and load the resulting value into the parameter.

register_added_loss_term(name)

register_backward_hook(hook)

Registers a backward hook on the module.

register_buffer(name, tensor[, persistent])

Adds a buffer to the module.

register_constraint(param_name, constraint)

register_forward_hook(hook)

Registers a forward hook on the module.

register_forward_pre_hook(hook)

Registers a forward pre-hook on the module.

register_full_backward_hook(hook)

Registers a full backward hook on the module.

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module's load_state_dict is called.

register_module(name, module)

Alias for add_module().

register_parameter(name, parameter)

Adds a parameter to the module.

register_prior(name, prior, param_or_closure)

Adds a prior to the module.

requires_grad_([requires_grad])

Change if autograd should record operations on parameters in this module.

sample_from_prior(prior_name)

Sample parameter values from prior.

save_model(filename)

Save the model instance with torch's save function, as this is safer for torch tensors.

set_extra_state(state)

This function is called from load_state_dict() to handle any extra state found within the state_dict.

set_fit_request(*[, tensorboard])

Request metadata passed to the fit method.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

set_transform_request(*[, t])

Request metadata passed to the transform method.

share_memory()

See torch.Tensor.share_memory_()

state_dict(*args[, destination, prefix, ...])

Returns a dictionary containing references to the whole state of the module.

to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

to_empty(*, device)

Moves the parameters and buffers to the specified device without copying storage.

to_pyro_random_module()

train([mode])

Sets the module in training mode.

transform(X[, t])

Transform the given stddev to a distribution-calibrated one using the input mean and stddev as priors for the underlying Gaussian process.

type(dst_type)

Casts all parameters and buffers to dst_type.

update_added_loss_term(name, added_loss_term)

variational_parameters()

xpu([device])

Moves all model parameters and buffers to the XPU.

zero_grad([set_to_none])

Sets gradients of all model parameters to zero.