chemprop.uncertainty.evaluator

Attributes

UncertaintyEvaluatorRegistry

Classes

`RegressionEvaluator`	Evaluates the quality of uncertainty estimates in regression tasks.
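`NLLRegressionEvaluator`	Evaluate uncertainty values for regression datasets using the mean negative-log-likelihood of the targets given the probability distributions estimated by the model.
`CalibrationAreaEvaluator`	A class for evaluating regression uncertainty values based on how they deviate from perfect calibration on an observed-probability versus expected-probability plot.
`ExpectedNormalizedErrorEvaluator`	A class that evaluates uncertainty performance by binning together clusters of predictions and comparing the average predicted variance of the clusters against the RMSE of the cluster.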
`SpearmanEvaluator`	Evaluate the Spearman rank correlation coefficient between the uncertainties and errors in the model predictions.
`RegressionConformalEvaluator`	Evaluate the coverage of conformal prediction for regression datasets.
`BinaryClassificationEvaluator`	Evaluates the quality of uncertainty estimates in binary classification tasks.
`NLLClassEvaluator`	Evaluate uncertainty values for binary classification datasets using the mean negative-log-likelihood of the targets given the assigned probabilities from the model.
`MultilabelConformalEvaluator`	Evaluate the coverage of conformal prediction for binary classification datasets with multiple labels.
`MulticlassClassificationEvaluator`	Evaluates the quality of uncertainty estimates in multiclass classification tasks.
`NLLMulticlassEvaluator`	Evaluate uncertainty values for multiclass classification datasets using the mean negative-log-likelihood of the targets given the assigned probabilities from the model.
`MulticlassConformalEvaluator`	Evaluate the coverage of conformal prediction for multiclass classification datasets.

Module Contents

chemprop.uncertainty.evaluator.UncertaintyEvaluatorRegistry

class chemprop.uncertainty.evaluator.RegressionEvaluator

Bases: abc.ABC

Evaluates the quality of uncertainty estimates in regression tasks.

abstractmethod evaluate(preds, uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

preds (Tensor) – the predictions for regression tasks, a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
uncs (Tensor) – the predicted uncertainties, a tensor of shape n x t
targets (Tensor) – the target values, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.NLLRegressionEvaluator

Bases: RegressionEvaluator

Evaluate uncertainty values for regression datasets using the mean negative-log-likelihood of the targets given the probability distributions estimated by the model:

\[\mathrm{NLL}(y, \hat y) = \frac{1}{2} \log(2 \pi \sigma^2) + \frac{(y - \hat{y})^2}{2 \sigma^2}\]

where \(\hat{y}\) is the predicted value, \(y\) is the true value, and \(\sigma^2\) is the predicted uncertainty (variance).

The function returns a tensor containing the mean NLL for each task.
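The formula above can be sketched in plain PyTorch. This is only an illustration of the per-task masked mean, not chemprop's implementation; the function name `gaussian_nll_per_task` is hypothetical, and `mask` is assumed to be a boolean tensor.

```python
import math
import torch

def gaussian_nll_per_task(preds, uncs, targets, mask):
    """Mean Gaussian NLL per task; `uncs` is treated as the variance sigma^2."""
    nll = 0.5 * torch.log(2 * math.pi * uncs) + (targets - preds) ** 2 / (2 * uncs)
    # Zero out masked entries, then average over the valid entries of each task.
    nll = torch.where(mask, nll, torch.zeros_like(nll))
    return nll.sum(dim=0) / mask.sum(dim=0)

# Two molecules, two tasks; the second task has one masked-out target.
preds = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
targets = torch.tensor([[0.0, 1.0], [2.0, 0.0]])
uncs = torch.ones(2, 2)  # unit variance everywhere
mask = torch.tensor([[True, True], [True, False]])
nll = gaussian_nll_per_task(preds, uncs, targets, mask)
```

Every unmasked prediction here is exact and has unit variance, so each task's mean NLL reduces to the constant term \(\frac{1}{2}\log(2\pi)\).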

evaluate(preds, uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

preds (Tensor) – the predictions for regression tasks, a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
uncs (Tensor) – the predicted uncertainties (variance), a tensor of shape n x t
targets (Tensor) – the target values, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.CalibrationAreaEvaluator

Bases: RegressionEvaluator

A class for evaluating regression uncertainty values based on how they deviate from perfect calibration on an observed-probability versus expected-probability plot.
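One way to read "deviation from perfect calibration" is as the area between the observed-coverage curve and the diagonal. The sketch below assumes Gaussian predictive distributions, treats `uncs` as variances, and assumes a boolean `mask`; the function name `miscalibration_area` is hypothetical. It may differ in detail from what the class actually computes.

```python
import math
import torch

def miscalibration_area(preds, uncs, targets, mask, num_bins=100):
    """Area between the observed and expected coverage curves, per task."""
    # Expected probabilities p = 1/num_bins, ..., (num_bins - 1)/num_bins.
    p = torch.arange(1, num_bins) / num_bins
    # Half-width, in units of sigma, of the central Gaussian interval with mass p.
    z = math.sqrt(2.0) * torch.erfinv(p)
    errors = (preds - targets).abs().unsqueeze(0)             # 1 x n x t
    half_width = z.view(-1, 1, 1) * uncs.sqrt().unsqueeze(0)  # (num_bins-1) x n x t
    covered = (errors <= half_width) & mask.unsqueeze(0)
    observed = covered.sum(dim=1) / mask.sum(dim=0)           # (num_bins-1) x t
    # Pin the ends of the curve at (0, 0) and (1, 1).
    t = preds.shape[1]
    observed = torch.cat([torch.zeros(1, t), observed, torch.ones(1, t)])
    expected = (torch.arange(num_bins + 1) / num_bins).unsqueeze(1)
    return (observed - expected).abs().sum(dim=0) / num_bins

# Perfect predictions with unit variance: every interval covers its target,
# so the model is maximally under-confident.
preds = torch.tensor([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
targets = preds.clone()
uncs = torch.ones(3, 2)
mask = torch.ones(3, 2, dtype=torch.bool)
area = miscalibration_area(preds, uncs, targets, mask, num_bins=100)
```

In this example the observed coverage is 1 at every interior probability level, so the area is (num_bins - 1) / (2 · num_bins), just under 0.5. A perfectly calibrated model would give an area near 0.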

evaluate(preds, uncs, targets, mask, num_bins=100)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

preds (Tensor) – the predictions for regression tasks, a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
uncs (Tensor) – the predicted uncertainties (variance), a tensor of shape n x t
targets (Tensor) – the target values, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation
num_bins (int, default=100) – the number of bins used to discretize the [0, 1] interval

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.ExpectedNormalizedErrorEvaluator

Bases: RegressionEvaluator

A class that evaluates uncertainty performance by binning together clusters of predictions and comparing the average predicted variance of the clusters against the RMSE of the cluster. [1]

\[\mathrm{ENCE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|\mathrm{RMV}_i - \mathrm{RMSE}_i|}{\mathrm{RMV}_i}\]

where \(N\) is the number of bins, \(\mathrm{RMV}_i\) is the root mean variance (the square root of the mean predicted uncertainty) over the \(i\)-th bin, and \(\mathrm{RMSE}_i\) is the root mean square error over the \(i\)-th bin. The discrepancy is normalized by \(\mathrm{RMV}_i\) because the error is expected to grow as the uncertainty increases.
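The ENCE formula can be sketched for a single task as follows. The sketch assumes that samples are sorted by predicted variance before being split into nearly equal-sized bins, and that masked entries have already been removed. The function name `ence_single_task` is hypothetical, and this is not necessarily how the class bins its data.

```python
import torch

def ence_single_task(preds, uncs, targets, num_bins=100):
    """ENCE for one task; inputs are 1-D tensors of the valid samples only."""
    order = torch.argsort(uncs)  # sort samples by predicted variance
    sq_err = ((preds - targets) ** 2)[order]
    var = uncs[order]
    # Never ask for more bins than samples, so that no bin is empty.
    num_bins = min(num_bins, len(var))
    rmv = torch.stack([v.mean().sqrt() for v in torch.tensor_split(var, num_bins)])
    rmse = torch.stack([e.mean().sqrt() for e in torch.tensor_split(sq_err, num_bins)])
    return ((rmv - rmse).abs() / rmv).mean()

preds = torch.zeros(4)
uncs = torch.tensor([4.0, 0.25, 9.0, 1.0])

# Errors exactly one standard deviation: RMSE equals RMV in every bin.
ence_calibrated = ence_single_task(preds, uncs, uncs.sqrt(), num_bins=2)

# Errors twice the standard deviation: RMSE is 2 * RMV in every bin.
ence_overconfident = ence_single_task(preds, uncs, 2 * uncs.sqrt(), num_bins=2)
```

The first call yields an ENCE of 0 and the second an ENCE of 1, which shows how the normalization by \(\mathrm{RMV}_i\) makes the score a relative rather than an absolute discrepancy.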

References

evaluate(preds, uncs, targets, mask, num_bins=100)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

preds (Tensor) – the predictions for regression tasks, a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
uncs (Tensor) – the predicted uncertainties (variance), a tensor of shape n x t
targets (Tensor) – the target values, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation
num_bins (int, default=100) – the number of bins the data are divided into

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.SpearmanEvaluator

Bases: RegressionEvaluator

Evaluate the Spearman rank correlation coefficient between the uncertainties and errors in the model predictions.

The correlation coefficient lies in the range [-1, 1]. Values closer to 1 are better: they indicate that the uncertainty values are predictive of the rank ordering of the errors in the model predictions.
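As an illustration, the rank correlation between uncertainties and absolute errors can be computed for one task in plain PyTorch. The helper names `ranks` and `spearman_unc_vs_error` are hypothetical, and this simplified sketch breaks ties arbitrarily instead of averaging ranks, so it may differ from a library implementation when ties are present.

```python
import torch

def ranks(x):
    """Ordinal ranks 0..n-1 of a 1-D tensor (ties are not averaged)."""
    return torch.argsort(torch.argsort(x)).to(torch.float32)

def spearman_unc_vs_error(preds, uncs, targets):
    """Spearman correlation between uncertainties and absolute errors, one task."""
    r_unc = ranks(uncs)
    r_err = ranks((preds - targets).abs())
    # Pearson correlation of the centered ranks.
    r_unc = r_unc - r_unc.mean()
    r_err = r_err - r_err.mean()
    return (r_unc * r_err).sum() / (r_unc.norm() * r_err.norm())

preds = torch.zeros(4)
targets = torch.tensor([0.1, 0.5, 1.0, 2.0])
uncs = torch.tensor([0.01, 0.2, 0.9, 5.0])

# Uncertainty grows with the error: perfect rank agreement.
rho_good = spearman_unc_vs_error(preds, uncs, targets)

# Uncertainty shrinks as the error grows: perfect rank disagreement.
rho_bad = spearman_unc_vs_error(preds, uncs.flip(0), targets)
```

Only the ordering matters here, so any monotonic rescaling of `uncs` (variance versus standard deviation, for instance) leaves the score unchanged.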

evaluate(preds, uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

preds (Tensor) – the predictions for regression tasks, a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
uncs (Tensor) – the predicted uncertainties, a tensor of shape n x t
targets (Tensor) – the target values, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.RegressionConformalEvaluator

Bases: RegressionEvaluator

Evaluate the coverage of conformal prediction for regression datasets.

\[\Pr (Y_{\text{test}} \in C(X_{\text{test}}))\]

where \(C(X_{\text{test}})\) is the predicted interval.
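Empirically, coverage is the fraction of valid targets that fall inside their predicted interval. The sketch below assumes the interval is available as explicit `lower` and `upper` bound tensors and that `mask` is boolean. How the class derives the interval from `preds` and `uncs` is not shown here, and the function name `interval_coverage` is hypothetical.

```python
import torch

def interval_coverage(lower, upper, targets, mask):
    """Fraction of unmasked targets inside [lower, upper], per task."""
    covered = (lower <= targets) & (targets <= upper) & mask
    return covered.sum(dim=0) / mask.sum(dim=0)

# Four molecules, one task; intervals of width 1 centered on the predictions.
preds = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
lower, upper = preds - 0.5, preds + 0.5
targets = torch.tensor([[0.2], [1.7], [2.4], [10.0]])
mask = torch.tensor([[True], [True], [True], [False]])  # last target is ignored
coverage = interval_coverage(lower, upper, targets, mask)
```

Two of the three unmasked targets fall inside their intervals, so the coverage for this task is 2/3.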

evaluate(preds, uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

preds (Tensor) – the predictions for regression tasks, a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
uncs (Tensor) – the predicted uncertainties, a tensor of shape n x t
targets (Tensor) – the target values, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.BinaryClassificationEvaluator

Bases: abc.ABC

Evaluates the quality of uncertainty estimates in binary classification tasks.

abstractmethod evaluate(uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

uncs (Tensor) – the predicted uncertainties (i.e., the predicted probability of class 1), a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
targets (Tensor) – the true labels, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.NLLClassEvaluator

Bases: BinaryClassificationEvaluator

Evaluate uncertainty values for binary classification datasets using the mean negative-log-likelihood of the targets given the assigned probabilities from the model:

\[\mathrm{NLL} = -\log(\hat{y} \cdot y + (1 - \hat{y}) \cdot (1 - y))\]

where \(y\) is the true binary label (0 or 1), and \(\hat{y}\) is the predicted probability associated with the class label 1.

The function returns a tensor containing the mean NLL for each task.
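A minimal PyTorch sketch of this formula, averaged per task over the unmasked entries, is given below. The function name `binary_nll_per_task` is hypothetical, `mask` is assumed to be boolean, and the code illustrates the formula only, not the class's implementation.

```python
import torch

def binary_nll_per_task(uncs, targets, mask):
    """Mean binary NLL per task; `uncs` holds the predicted probability of class 1."""
    # Probability the model assigned to the label that was actually observed.
    likelihood = uncs * targets + (1 - uncs) * (1 - targets)
    nll = -torch.log(likelihood)
    nll = torch.where(mask, nll, torch.zeros_like(nll))
    return nll.sum(dim=0) / mask.sum(dim=0)

# Two molecules, two tasks.
uncs = torch.tensor([[0.5, 0.9], [0.5, 0.2]])
targets = torch.tensor([[1.0, 1.0], [0.0, 0.0]])
mask = torch.ones(2, 2, dtype=torch.bool)
nll = binary_nll_per_task(uncs, targets, mask)
```

The first task always predicts 0.5, so its mean NLL is \(\log 2\). The second task assigns probabilities 0.9 and 0.8 to the correct labels and therefore scores lower.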

evaluate(uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

uncs (Tensor) – the predicted uncertainties (i.e., the predicted probability of class 1), a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
targets (Tensor) – the true labels, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.MultilabelConformalEvaluator

Bases: BinaryClassificationEvaluator

Evaluate the coverage of conformal prediction for binary classification datasets with multiple labels.

\[\Pr \left( \hat{\mathcal C}_{\text{in}}(X) \subseteq \mathcal Y \subseteq \hat{\mathcal C}_{\text{out}}(X) \right)\]

where the in-set \(\hat{\mathcal C}_\text{in}\) is contained in the set of true labels \(\mathcal Y\), and \(\mathcal Y\) is in turn contained in the out-set \(\hat{\mathcal C}_\text{out}\).
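The two containment conditions can be checked label by label. The sketch below assumes the in-set and out-set are available as boolean n x t tensors (`in_set`, `out_set`) and that `mask` is boolean. These names and the function `multilabel_coverage` are hypothetical, and how the class encodes the sets in `uncs` is not covered here.

```python
import torch

def multilabel_coverage(in_set, out_set, targets, mask):
    """Per-task fraction of labels satisfying in-set <= true label <= out-set."""
    y = targets.bool()
    in_ok = ~in_set | y    # a label in the in-set must be a true label
    out_ok = ~y | out_set  # a true label must be in the out-set
    covered = in_ok & out_ok & mask
    return covered.sum(dim=0) / mask.sum(dim=0)

# Three molecules, two labels (tasks).
in_set = torch.tensor([[True, False], [False, False], [True, False]])
out_set = torch.tensor([[True, True], [True, False], [True, True]])
targets = torch.tensor([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
mask = torch.ones(3, 2, dtype=torch.bool)
coverage = multilabel_coverage(in_set, out_set, targets, mask)
```

For the first label, the third molecule violates the in-set condition (in the in-set but not a true label). For the second label, the second molecule violates the out-set condition (a true label missing from the out-set). Both tasks therefore have a coverage of 2/3.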

evaluate(uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

uncs (Tensor) – the predicted uncertainties (i.e., the predicted probability of class 1), a tensor of shape n x t, where n is the number of input molecules/reactions and t is the number of tasks
targets (Tensor) – the true labels, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.MulticlassClassificationEvaluator

Bases: abc.ABC

Evaluates the quality of uncertainty estimates in multiclass classification tasks.

abstractmethod evaluate(uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

uncs (Tensor) – the predicted uncertainties (i.e., the predicted probabilities for each class), a tensor of shape n x t x c, where n is the number of input molecules/reactions, t is the number of tasks, and c is the number of classes
targets (Tensor) – the true class labels, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.NLLMulticlassEvaluator

Bases: MulticlassClassificationEvaluator

Evaluate uncertainty values for multiclass classification datasets using the mean negative-log-likelihood of the targets given the assigned probabilities from the model:

\[\mathrm{NLL} = -\log(p_{y_i})\]

where \(p_{y_i}\) is the predicted probability for the true class \(y_i\), calculated as:

\[p_{y_i} = \sum_{k=1}^{K} \mathbb{1}(y_i = k) \cdot p_k\]

Here, \(K\) is the total number of classes, \(\mathbb{1}(y_i = k)\) is the indicator function that is 1 when the true class \(y_i\) equals class \(k\) and 0 otherwise, and \(p_k\) is the predicted probability for class \(k\).

The function returns a tensor containing the mean NLL for each task.
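The indicator sum simply selects the probability of the true class, which in PyTorch is a `gather` along the class dimension. The sketch below is an illustration under the assumptions that `targets` stores zero-based class indices (the formula above numbers classes from 1) and that `mask` is boolean. The function name `multiclass_nll_per_task` is hypothetical.

```python
import torch

def multiclass_nll_per_task(uncs, targets, mask):
    """Mean multiclass NLL per task; `uncs` has shape n x t x c."""
    # Select p_{y_i}: the probability assigned to the true class.
    index = targets.long().unsqueeze(-1)           # n x t x 1
    p_true = uncs.gather(-1, index).squeeze(-1)    # n x t
    nll = -torch.log(p_true)
    nll = torch.where(mask, nll, torch.zeros_like(nll))
    return nll.sum(dim=0) / mask.sum(dim=0)

# Two molecules, one task, three classes.
uncs = torch.tensor([[[0.2, 0.3, 0.5]], [[1 / 3, 1 / 3, 1 / 3]]])
targets = torch.tensor([[2], [0]])
mask = torch.ones(2, 1, dtype=torch.bool)
nll = multiclass_nll_per_task(uncs, targets, mask)
```

The true classes receive probabilities 1/2 and 1/3, so the mean NLL for the task is \((\log 2 + \log 3)/2\).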

evaluate(uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

uncs (Tensor) – the predicted uncertainties (i.e., the predicted probabilities for each class), a tensor of shape n x t x c, where n is the number of input molecules/reactions, t is the number of tasks, and c is the number of classes
targets (Tensor) – the true class labels, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor

class chemprop.uncertainty.evaluator.MulticlassConformalEvaluator

Bases: MulticlassClassificationEvaluator

Evaluate the coverage of conformal prediction for multiclass classification datasets.

\[\Pr (Y_{\text{test}} \in C(X_{\text{test}}))\]

where \(C(X_{\text{test}}) \subset \{1 \mathrel{.\,.} K\}\) is a prediction set of possible labels.
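Empirical coverage for prediction sets is the fraction of valid samples whose true class is a member of the set. The sketch below assumes the prediction set is given as a boolean n x t x c membership tensor (`pred_set`), that `targets` holds zero-based class indices, and that `mask` is boolean. These conventions and the function name `set_coverage` are assumptions for illustration, not the class's documented input format.

```python
import torch
import torch.nn.functional as F

def set_coverage(pred_set, targets, mask):
    """Per-task fraction of unmasked samples whose true class is in the set."""
    num_classes = pred_set.shape[-1]
    true_class = F.one_hot(targets.long(), num_classes=num_classes).bool()
    hit = (pred_set & true_class).any(dim=-1)  # n x t
    return (hit & mask).sum(dim=0) / mask.sum(dim=0)

# Three molecules, one task, three classes.
pred_set = torch.tensor([
    [[True, True, False]],
    [[False, False, True]],
    [[True, False, False]],
])
targets = torch.tensor([[1], [0], [0]])
mask = torch.ones(3, 1, dtype=torch.bool)
coverage = set_coverage(pred_set, targets, mask)
```

The first and third prediction sets contain the true class and the second does not, giving a coverage of 2/3.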

evaluate(uncs, targets, mask)

Evaluate the performance of uncertainty predictions against the model target values.

Parameters:

uncs (Tensor) – the predicted uncertainties (i.e., the predicted probabilities for each class), a tensor of shape n x t x c, where n is the number of input molecules/reactions, t is the number of tasks, and c is the number of classes
targets (Tensor) – the true class labels, a tensor of shape n x t
mask (Tensor) – a tensor of shape n x t indicating whether each value should be used in the evaluation

Returns:

a tensor of shape t containing the evaluated metric for each task

Return type:

Tensor