netcal.regression.GPCauchy
- class netcal.regression.GPCauchy(n_inducing_points: int = 12, n_random_samples: int = 128, *, name_prefix: str = 'gpcauchy', **kwargs)
GP-Cauchy recalibration method for regression uncertainty calibration that consumes an uncalibrated Gaussian distribution and converts it to a calibrated Cauchy distribution. This method uses a Gaussian process (GP) for a flexible estimation of the recalibration parameter (cf. [1]). Similar to netcal.regression.gp.GPNormal, the GP-Cauchy [2] acts as a kind of temperature scaling for the variance of a Gaussian distribution [3], [4]. However, the rescaled variance is interpreted as the scale parameter of a Cauchy distribution. Furthermore, the rescaling parameter is not fixed globally but is obtained by a GP for each sample individually, so that a Gaussian can be converted to a Cauchy distribution [2]. Thus, the GP-Cauchy seeks distribution calibration, but for parametric Cauchy distributions. Note that this method does not change the mean; it only reinterprets the predicted variance as the Cauchy scale parameter, and the mode of the Cauchy is assumed to be equal to the input mean. A usage sketch follows the parameter list below.

Mathematical background: Let \(f_Y(y)\) denote the uncalibrated probability density function (PDF), targeting the probability distribution for \(Y\). In our case, the uncalibrated PDF is given as a Gaussian, so that \(f_Y(y) = \mathcal{N}\big(y; \mu_Y(X), \sigma^2_Y(X)\big)\) with mean \(\mu_Y(X)\) and variance \(\sigma^2_Y(X)\) obtained by a probabilistic regression model that depends on the input \(X\). The calibrated PDF \(g_Y(y)\) is a rescaled Cauchy distribution with fixed mode \(x_0 \in \mathbb{R}\) and rescaled scale parameter \(\lambda \in \mathbb{R}_{>0}\), so that
\[g_Y(y) = \text{Cauchy}\Big(y;\; x_0 = \mu_Y(X),\; \lambda = \theta_y \cdot \sigma_Y(X)\Big) ,\]

where \(\theta_y\) is the adaptive rescaling weight for a certain \(y\).
The GP-Cauchy utilizes a Gaussian process to obtain \(\theta_y\), so that
\[\theta_y \sim \text{gp}(0, k) ,\]

where \(k\) is the kernel function (for a more detailed description of the underlying Gaussian process, see the documentation of the parent class netcal.regression.gp.AbstractGP).
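To make this reinterpretation concrete, the following minimal sketch (an illustration only, not part of netcal) evaluates both densities for a single prediction with a fixed rescaling weight \(\theta_y\); in GP-Cauchy, this weight is instead obtained per sample from the Gaussian process:

    import numpy as np
    from scipy.stats import norm, cauchy

    # Uncalibrated Gaussian prediction of a regression model: mean and stddev.
    mu, sigma = 1.5, 0.8

    # Illustrative fixed rescaling weight theta_y; GP-Cauchy instead obtains
    # this weight per sample from the underlying Gaussian process.
    theta = 1.2

    y = np.linspace(-3.0, 6.0, 500)
    f_y = norm.pdf(y, loc=mu, scale=sigma)            # uncalibrated PDF f_Y(y)
    g_y = cauchy.pdf(y, loc=mu, scale=theta * sigma)  # calibrated PDF g_Y(y): mode x0 = mu, scale lambda = theta * sigma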
- Parameters:
n_inducing_points (int) – Number of inducing points used to approximate the input space. These inducing points are also optimized.
n_random_samples (int) – Number of random samples used to sample from the parameter distribution during optimization and inference.
n_epochs (int, default: 200) – Number of optimization epochs.
batch_size (int, default: 256) – Size of batches during optimization.
num_workers (int, optional, default: 0) – Number of workers used for the dataloader.
lr (float, optional, default: 1e-2) – Learning rate used for the Adam optimizer.
use_cuda (str or bool, optional, default: False) – The optimization and inference might also run on a CUDA device. If True, use the first available CUDA device. You can also pass a string “cuda:0”, “cuda:1”, etc. to specify the CUDA device. If False, use CPU for optimization and inference.
jitter (float, optional, default: 1e-5) – Small value that is added to the diagonal of the covariance matrix to stabilize the Cholesky decomposition during Gaussian process optimization.
name_prefix (str, optional, default: "gpcauchy") – Name prefix internally used in Pyro to distinguish between parameter stores.
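A minimal usage sketch, with synthetic data for illustration: it assumes the tuple-based (mean, stddev) input format of the netcal regression interface, and the exact return format of transform (treated here simply as the calibrated output) should be verified against the transform documentation in the method table below.

    import numpy as np
    from netcal.regression import GPCauchy

    # Synthetic uncalibrated Gaussian predictions (mean, stddev) and ground-truth targets.
    rng = np.random.default_rng(0)
    mean = rng.normal(size=1000)
    stddev = np.abs(rng.normal(size=1000)) + 0.1
    y_true = mean + 2.0 * stddev * rng.normal(size=1000)  # model is overconfident

    gpcauchy = GPCauchy(n_inducing_points=12, n_random_samples=128, n_epochs=200)
    gpcauchy.fit((mean, stddev), y_true)

    # Reinterpret the rescaled stddev as the scale of a calibrated Cauchy distribution;
    # the mode is assumed to equal the input mean.
    calibrated = gpcauchy.transform((mean, stddev))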
References
Methods
__init__([n_inducing_points, ...]) – Constructor.
add_module(name, module) – Adds a child module to the current module.
added_loss_terms()
apply(fn) – Applies fn recursively to every submodule (as returned by .children()) as well as self.
bfloat16() – Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]) – Returns an iterator over module buffers.
children() – Returns an iterator over immediate children modules.
clear() – Clear module parameters.
constraint_for_parameter_name(param_name)
constraints()
cpu() – Moves all model parameters and buffers to the CPU.
cuda([device]) – Moves all model parameters and buffers to the GPU.
double() – Casts all floating point parameters and buffers to double datatype.
epsilon(dtype) – Get the smallest value that is representable depending on the passed dtype (NumPy or PyTorch).
eval() – Sets the module in evaluation mode.
extra_repr() – Additional information used to print if str(method) is called.
fit(X, y[, tensorboard]) – Fit a GP model to the provided data using Gaussian process optimization.
fit_transform(X[, y]) – Fit to data, then transform it.
float() – Casts all floating point parameters and buffers to float datatype.
forward(x) – Forward method defines the prior for the GP.
get_buffer(target) – Returns the buffer given by target if it exists, otherwise throws an error.
get_extra_state() – Returns any extra state to include in the module's state_dict.
get_fantasy_model(inputs, targets, **kwargs) – Returns a new GP model that incorporates the specified inputs and targets as new training data using online variational conditioning (OVC).
get_metadata_routing() – Get metadata routing of this object.
get_parameter(target) – Returns the parameter given by target if it exists, otherwise throws an error.
get_params([deep]) – Overwrite base method's get_params function to also capture child parameters such as the variational strategy, LMC coefficients, etc.
get_submodule(target) – Returns the submodule given by target if it exists, otherwise throws an error.
guide(x, y) – Pyro guide that defines the variational distribution for the Gaussian process.
half() – Casts all floating point parameters and buffers to half datatype.
hyperparameters()
initialize(**kwargs) – Set a value for a parameter.
ipu([device]) – Moves all model parameters and buffers to the IPU.
load_model(filename[, use_cuda]) – Overwrite base method's load_model function, as the parameters for the GP methods are stored differently compared to the remaining methods.
load_state_dict(state_dict[, strict]) – Copies parameters and buffers from state_dict into this module and its descendants.
load_strict_shapes(value)
local_load_samples(samples_dict, memo, prefix) – Defines local behavior of this Module when loading parameters from a samples_dict generated by a Pyro sampling mechanism.
model(x, y) – Model function that defines the computation graph.
modules() – Returns an iterator over all modules in the network.
named_added_loss_terms() – Returns an iterator over module variational strategies, yielding both the name of the variational strategy as well as the strategy itself.
named_buffers([prefix, recurse]) – Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children() – Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_constraints([memo, prefix])
named_hyperparameters()
named_modules([memo, prefix, remove_duplicate]) – Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse]) – Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
named_parameters_and_constraints()
named_priors([memo, prefix]) – Returns an iterator over the module's priors, yielding the name of the prior, the prior, the associated parameter names, and the transformation callable.
named_variational_parameters()
parameters([recurse]) – Returns an iterator over module parameters.
pyro_guide(input[, beta, name_prefix]) – (For Pyro integration only).
pyro_load_from_samples(samples_dict) – Convert this Module into a batch Module by loading parameters from the given samples_dict.
pyro_model(input[, beta, name_prefix]) – (For Pyro integration only).
pyro_sample_from_prior() – For each parameter in this Module and its submodules that has a defined prior, sample a value for that parameter from its corresponding prior with a pyro.sample primitive and load the resulting value into the parameter.
register_added_loss_term(name)
register_backward_hook(hook) – Registers a backward hook on the module.
register_buffer(name, tensor[, persistent]) – Adds a buffer to the module.
register_constraint(param_name, constraint)
register_forward_hook(hook) – Registers a forward hook on the module.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the module.
register_full_backward_hook(hook) – Registers a backward hook on the module.
register_load_state_dict_post_hook(hook) – Registers a post hook to be run after module's load_state_dict is called.
register_module(name, module) – Alias for add_module().
register_parameter(name, parameter) – Adds a parameter to the module.
register_prior(name, prior, param_or_closure) – Adds a prior to the module.
requires_grad_([requires_grad]) – Change if autograd should record operations on parameters in this module.
sample_from_prior(prior_name) – Sample parameter values from prior.
save_model(filename) – Save model instance with torch's save function, as this is safer for torch tensors.
set_extra_state(state) – This function is called from load_state_dict() to handle any extra state found within the state_dict.
set_fit_request(*[, tensorboard]) – Request metadata passed to the fit method.
set_output(*[, transform]) – Set output container.
set_params(**params) – Set the parameters of this estimator.
share_memory() – See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]) – Returns a dictionary containing references to the whole state of the module.
to(*args, **kwargs) – Moves and/or casts the parameters and buffers.
to_empty(*, device) – Moves the parameters and buffers to the specified device without copying storage.
to_pyro_random_module()
train([mode]) – Sets the module in training mode.
transform(X) – Transform the given stddev to a distribution-calibrated one using the input mean and stddev as priors for the underlying Gaussian process.
type(dst_type) – Casts all parameters and buffers to dst_type.
update_added_loss_term(name, added_loss_term)
variational_parameters()
xpu([device]) – Moves all model parameters and buffers to the XPU.
zero_grad([set_to_none]) – Sets gradients of all model parameters to zero.
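Continuing the usage sketch above, a fitted model can be persisted with the save_model and load_model methods listed in the table. This is a sketch under the assumption that load_model restores the stored parameters into a compatibly constructed instance; verify against the load_model documentation for your netcal version.

    # Save the fitted calibration model; internally uses torch's save function.
    gpcauchy.save_model("gpcauchy.pt")

    # Restore later into a fresh instance created with matching constructor arguments.
    restored = GPCauchy(n_inducing_points=12, n_random_samples=128)
    restored.load_model("gpcauchy.pt", use_cuda=False)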