chemprop.uncertainty.estimator#

Attributes#

Classes#

UncertaintyEstimator

A helper class for making model predictions and associated uncertainty predictions.

NoUncertaintyEstimator

A helper class for making model predictions and associated uncertainty predictions.

MVEEstimator

Class that estimates prediction means and variances (MVE). [nix1994]

EnsembleEstimator

Class that predicts the uncertainty of predictions based on the variance in predictions among an ensemble’s submodels.

ClassEstimator

A helper class for making model predictions and associated uncertainty predictions.

EvidentialTotalEstimator

Class that predicts the total evidential uncertainty based on hyperparameters of the evidential distribution [amini2020].

EvidentialEpistemicEstimator

Class that predicts the epistemic evidential uncertainty based on hyperparameters of the evidential distribution.

EvidentialAleatoricEstimator

Class that predicts the aleatoric evidential uncertainty based on hyperparameters of the evidential distribution.

DropoutEstimator

A DropoutEstimator creates a virtual ensemble of models via Monte Carlo dropout with the provided model [gal2016].

ClassificationDirichletEstimator

A ClassificationDirichletEstimator predicts an amount of 'evidence' for both the negative class and the positive class as described in [sensoy2018].

MulticlassDirichletEstimator

A MulticlassDirichletEstimator predicts an amount of 'evidence' for each class as described in [sensoy2018].

QuantileRegressionEstimator

A helper class for making model predictions and associated uncertainty predictions.

Module Contents#

class chemprop.uncertainty.estimator.UncertaintyEstimator[source]#

Bases: abc.ABC

A helper class for making model predictions and associated uncertainty predictions.

abstractmethod __call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]
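The difference between m and m' can be seen from the shapes alone. The following is a minimal sketch that uses plain nested lists in place of tensors; the `shape` helper and all numeric values are hypothetical and are not part of chemprop.

```python
def shape(nested):
    """Return the dimensions of a rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Hypothetical predictions from m = 2 models on n = 3 inputs with t = 1 task.
preds = [
    [[0.9], [1.8], [3.2]],
    [[1.1], [2.2], [2.8]],
]

# One uncertainty per model, as two MVE-style models might provide: m' = m = 2.
per_model_uncs = [
    [[0.10], [0.20], [0.30]],
    [[0.15], [0.25], [0.35]],
]

# One uncertainty shared by the whole ensemble: m' = 1 even though m = 2.
ensemble_uncs = [[[0.01], [0.04], [0.04]]]

print(shape(preds), shape(per_model_uncs), shape(ensemble_uncs))
```

In both cases the trailing n x t axes of uncs line up with those of preds; only the leading axis changes.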

chemprop.uncertainty.estimator.UncertaintyEstimatorRegistry#
class chemprop.uncertainty.estimator.NoUncertaintyEstimator[source]#

Bases: UncertaintyEstimator

A helper class for making model predictions and associated uncertainty predictions.

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.MVEEstimator[source]#

Bases: UncertaintyEstimator

Class that estimates prediction means and variances (MVE). [nix1994]

References

[nix1994]

Nix, D. A.; Weigend, A. S. “Estimating the mean and variance of the target probability distribution.” Proceedings of 1994 IEEE International Conference on Neural Networks, 1994 https://doi.org/10.1109/icnn.1994.374138

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.EnsembleEstimator[source]#

Bases: UncertaintyEstimator

Class that predicts the uncertainty of predictions based on the variance in predictions among an ensemble’s submodels.
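As a rough illustration of "variance in predictions among an ensemble's submodels", the sketch below computes a per-entry variance across the model axis of an m x n x t nested list. It is not chemprop's implementation: `ensemble_uncertainty` is a hypothetical helper, and the choice of the population variance (dividing by m) is an assumption made for the example.

```python
from statistics import pvariance


def ensemble_uncertainty(preds):
    """Variance across models for every (input, task) entry.

    `preds` is a nested list of shape m x n x t; the result has shape 1 x n x t.
    """
    if len(preds) < 2:
        raise ValueError("at least two models are needed to measure disagreement")
    n, t = len(preds[0]), len(preds[0][0])
    uncs = [
        [pvariance([model[i][j] for model in preds]) for j in range(t)]
        for i in range(n)
    ]
    return [uncs]  # leading axis of length m' = 1


# Hypothetical predictions: m = 2 models, n = 2 inputs, t = 2 tasks.
preds = [
    [[1.0, 0.0], [2.0, 4.0]],
    [[3.0, 0.0], [2.0, 8.0]],
]
uncs = ensemble_uncertainty(preds)
print(uncs)  # [[[1.0, 0.0], [0.0, 4.0]]]
```

Entries on which the models agree get zero uncertainty, and the single leading axis reflects that the ensemble yields one estimate for all of its members.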

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.ClassEstimator[source]#

Bases: UncertaintyEstimator

A helper class for making model predictions and associated uncertainty predictions.

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.EvidentialTotalEstimator[source]#

Bases: UncertaintyEstimator

Class that predicts the total evidential uncertainty based on hyperparameters of the evidential distribution [amini2020].
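For orientation, the sketch below evaluates the aleatoric, epistemic, and total uncertainties defined in [amini2020] from the Normal-Inverse-Gamma parameters ν, α, and β. It assumes the standard expressions from that paper (aleatoric E[σ²] = β / (α − 1), epistemic Var[μ] = β / (ν(α − 1)), total as their sum); the function name is hypothetical, and this page does not state the exact expressions the estimators use.

```python
def evidential_uncertainties(nu, alpha, beta):
    """Return (aleatoric, epistemic, total) for one prediction.

    Assumes the Normal-Inverse-Gamma parameterisation of [amini2020],
    which requires nu > 0, alpha > 1, and beta > 0.
    """
    if nu <= 0 or alpha <= 1 or beta <= 0:
        raise ValueError("expected nu > 0, alpha > 1, and beta > 0")
    aleatoric = beta / (alpha - 1)  # expected data noise, E[sigma^2]
    epistemic = aleatoric / nu      # variance of the predicted mean, Var[mu]
    total = aleatoric + epistemic
    return aleatoric, epistemic, total


# Hypothetical parameter values for a single input and task.
aleatoric, epistemic, total = evidential_uncertainties(nu=2.0, alpha=3.0, beta=4.0)
print(aleatoric, epistemic, total)  # 2.0 1.0 3.0
```

Under these assumptions the three evidential estimators differ only in which of the three quantities they report as uncs.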

References

[amini2020]

Amini, A.; Schwarting, W.; Soleimany, A.; Rus, D. “Deep Evidential Regression”. NeurIPS, 2020. https://arxiv.org/abs/1910.02600

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.EvidentialEpistemicEstimator[source]#

Bases: UncertaintyEstimator

Class that predicts the epistemic evidential uncertainty based on hyperparameters of the evidential distribution.

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.EvidentialAleatoricEstimator[source]#

Bases: UncertaintyEstimator

Class that predicts the aleatoric evidential uncertainty based on hyperparameters of the evidential distribution.

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.DropoutEstimator(ensemble_size, dropout=None)[source]#

Bases: UncertaintyEstimator

A DropoutEstimator creates a virtual ensemble of models via Monte Carlo dropout with the provided model [gal2016].

Parameters:
  • ensemble_size (int) – The number of samples to draw for the ensemble.

  • dropout (float | None) – The probability of dropping out units in the dropout layers. If unspecified, the training probability is used, which is preferred but not possible if the model was not trained with dropout (i.e. p=0).
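The idea of a virtual ensemble can be sketched without any deep learning library. The toy example below keeps dropout active at inference time, runs ensemble_size stochastic forward passes of a single linear model, and summarises them by their mean and variance. It only illustrates Monte Carlo dropout in general; `dropout_forward`, `mc_dropout`, the linear model, and the `seed` argument are all hypothetical and do not mirror chemprop's code.

```python
import random
from statistics import fmean, pvariance


def dropout_forward(x, weights, p, rng):
    """One stochastic pass of a toy linear model with inverted dropout on its inputs."""
    keep = 1.0 - p
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep:          # keep this unit with probability 1 - p
            total += wi * xi / keep      # rescale so the expected output is unchanged
    return total


def mc_dropout(x, weights, ensemble_size, dropout, seed=0):
    """Return (mean prediction, variance) over `ensemble_size` dropout samples."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    if not 0.0 <= dropout < 1.0:
        raise ValueError("dropout must lie in [0, 1)")
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, dropout, rng) for _ in range(ensemble_size)]
    return fmean(samples), pvariance(samples)


# Hypothetical input features and weights.
x, weights = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0]
mean, var = mc_dropout(x, weights, ensemble_size=50, dropout=0.5)
print(mean, var)
```

With dropout=0.0 every pass is identical, so the variance collapses to zero; this is why a model trained without dropout needs an explicit probability to produce a useful spread.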

References

[gal2016]

Gal, Y.; Ghahramani, Z. “Dropout as a bayesian approximation: Representing model uncertainty in deep learning.” International conference on machine learning. PMLR, 2016. https://arxiv.org/abs/1506.02142

ensemble_size#
dropout = None#
__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.ClassificationDirichletEstimator[source]#

Bases: UncertaintyEstimator

A ClassificationDirichletEstimator predicts an amount of ‘evidence’ for both the negative class and the positive class as described in [sensoy2018]. The class probabilities and the uncertainty are calculated based on the evidence.

\[S = \sum_{i=1}^K \alpha_i, \qquad p_i = \alpha_i / S, \qquad u = K / S\]

where \(K\) is the number of classes, \(\alpha_i\) is the evidence for class \(i\), \(p_i\) is the probability of class \(i\), and \(u\) is the uncertainty.
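A short worked example of the three relations above (S as the sum of the \(\alpha_i\), \(p_i = \alpha_i / S\), and \(u = K / S\)) may help. The helper name and the input values are hypothetical; the code only evaluates the formulas as stated and is not taken from chemprop.

```python
def dirichlet_summary(alpha):
    """Return (class probabilities, uncertainty) for one Dirichlet evidence vector."""
    if not alpha or any(a <= 0 for a in alpha):
        raise ValueError("alpha must be a non-empty sequence of positive values")
    k = len(alpha)                     # K, the number of classes
    s = sum(alpha)                     # S = sum_i alpha_i
    probs = [a / s for a in alpha]     # p_i = alpha_i / S
    uncertainty = k / s                # u = K / S
    return probs, uncertainty


# Binary case with hypothetical values: negative class 1.0, positive class 3.0.
probs, u = dirichlet_summary([1.0, 3.0])
print(probs, u)  # [0.25, 0.75] 0.5
```

Scaling every \(\alpha_i\) by the same factor leaves the probabilities unchanged but shrinks \(u\), which is how the amount of evidence is separated from the predicted class.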

References

[sensoy2018]

Sensoy, M.; Kaplan, L.; Kandemir, M. “Evidential deep learning to quantify classification uncertainty.” NeurIPS, 2018, 31. https://doi.org/10.48550/arXiv.1806.01768

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.MulticlassDirichletEstimator[source]#

Bases: UncertaintyEstimator

A MulticlassDirichletEstimator predicts an amount of ‘evidence’ for each class as described in [sensoy2018]. The class probabilities and the uncertainty are calculated based on the evidence.

\[S = \sum_{i=1}^K \alpha_i, \qquad p_i = \alpha_i / S, \qquad u = K / S\]

where \(K\) is the number of classes, \(\alpha_i\) is the evidence for class \(i\), \(p_i\) is the probability of class \(i\), and \(u\) is the uncertainty.

References

[sensoy2018]

Sensoy, M.; Kaplan, L.; Kandemir, M. “Evidential deep learning to quantify classification uncertainty.” NeurIPS, 2018, 31. https://doi.org/10.48550/arXiv.1806.01768

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]

class chemprop.uncertainty.estimator.QuantileRegressionEstimator[source]#

Bases: UncertaintyEstimator

A helper class for making model predictions and associated uncertainty predictions.

__call__(dataloader, models, trainer)[source]#

Calculate the uncalibrated predictions and uncertainties for the dataloader.

Parameters:
  • dataloader (DataLoader) – the dataloader used for model predictions and uncertainty predictions

  • models (Iterable[MPNN] | Iterable[MolAtomBondMPNN]) – the models used for model predictions and uncertainty predictions. If MolAtomBondMPNN models are used, the uncertainty estimator returns separate preds and uncs for the molecule, atom, and bond predictions.

  • trainer (pl.Trainer) – an instance of the Trainer used to manage model inference

Returns:

  • preds (Tensor) – the model predictions, with shape varying by task type:

    • regression/binary classification: m x n x t

    • multiclass classification: m x n x t x c

    where m is the number of models, n is the number of inputs, t is the number of tasks, and c is the number of classes.

  • uncs (Tensor) – the predicted uncertainties, with shape m' x n x t.

Note: m and m' differ by definition: m is the number of models, while m' is the number of uncertainty estimates. For example, if two MVE or evidential models are provided, both m and m' are two. For an ensemble of two models, however, m' is one (even though m = 2).

Return type:

tuple[torch.Tensor, torch.Tensor] | tuple[tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None], tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]]