chemprop.uncertainty.evaluator
==============================

.. py:module:: chemprop.uncertainty.evaluator


Attributes
----------

.. autoapisummary::

   chemprop.uncertainty.evaluator.UncertaintyEvaluatorRegistry


Classes
-------

.. autoapisummary::

   chemprop.uncertainty.evaluator.RegressionEvaluator
   chemprop.uncertainty.evaluator.NLLRegressionEvaluator
   chemprop.uncertainty.evaluator.CalibrationAreaEvaluator
   chemprop.uncertainty.evaluator.ExpectedNormalizedErrorEvaluator
   chemprop.uncertainty.evaluator.SpearmanEvaluator
   chemprop.uncertainty.evaluator.RegressionConformalEvaluator
   chemprop.uncertainty.evaluator.BinaryClassificationEvaluator
   chemprop.uncertainty.evaluator.NLLClassEvaluator
   chemprop.uncertainty.evaluator.MultilabelConformalEvaluator
   chemprop.uncertainty.evaluator.MulticlassClassificationEvaluator
   chemprop.uncertainty.evaluator.NLLMulticlassEvaluator
   chemprop.uncertainty.evaluator.MulticlassConformalEvaluator


Module Contents
---------------

.. py:data:: UncertaintyEvaluatorRegistry

.. py:class:: RegressionEvaluator

   Bases: :py:obj:`abc.ABC`


   Evaluates the quality of uncertainty estimates in regression tasks.


   .. py:method:: evaluate(preds, uncs, targets, mask)
      :abstractmethod:


      Evaluate the performance of uncertainty predictions against the model target values.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: NLLRegressionEvaluator

   Bases: :py:obj:`RegressionEvaluator`


   Evaluate uncertainty values for regression datasets using the mean negative-log-likelihood
   of the targets given the probability distributions estimated by the model:

   .. math::

       \mathrm{NLL}(y, \hat y) = \frac{1}{2} \log(2 \pi \sigma^2) + \frac{(y - \hat{y})^2}{2 \sigma^2}

   where :math:`\hat{y}` is the predicted value, :math:`y` is the true value, and
   :math:`\sigma^2` is the predicted uncertainty (variance).

   The function returns a tensor containing the mean NLL for each task.
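
   The formula above can be sketched in plain Python as follows. This is only an illustration under
   stated assumptions, not the library's implementation: nested lists stand in for the ``n x t``
   tensors, the name ``mean_gaussian_nll`` is hypothetical, and every task is assumed to have at
   least one unmasked entry.

   .. code-block:: python

      import math


      def mean_gaussian_nll(preds, variances, targets, mask):
          """Return the mean Gaussian NLL per task over the unmasked entries."""
          n_tasks = len(preds[0])
          means = []
          for j in range(n_tasks):
              # Keep only the rows whose mask entry for task j is true.
              rows = [i for i in range(len(preds)) if mask[i][j]]
              nlls = [
                  0.5 * math.log(2 * math.pi * variances[i][j])
                  + (targets[i][j] - preds[i][j]) ** 2 / (2 * variances[i][j])
                  for i in rows
              ]
              # Assumes at least one unmasked entry per task (otherwise division by zero).
              means.append(sum(nlls) / len(nlls))
          return means


      preds = [[0.0], [1.0], [5.0]]
      variances = [[1.0], [1.0], [1.0]]
      targets = [[0.0], [1.0], [9.0]]
      mask = [[True], [True], [False]]  # the third row is excluded
      print(mean_gaussian_nll(preds, variances, targets, mask))

   With zero error and unit variance, each retained term reduces to :math:`\frac{1}{2} \log(2 \pi)`.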


   .. py:method:: evaluate(preds, uncs, targets, mask)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties (variances) of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: CalibrationAreaEvaluator

   Bases: :py:obj:`RegressionEvaluator`


   A class for evaluating regression uncertainty values based on how they deviate from perfect
   calibration on an observed-probability versus expected-probability plot.
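
   One way to picture this metric is the sketch below. It is a rough illustration for a single task,
   not the library's implementation, and it rests on assumptions not stated above: the predictive
   distribution is taken to be Gaussian, the expected probability is the mass of a central interval
   around the prediction, and the area is approximated by a simple Riemann sum. The name
   ``calibration_area`` is hypothetical.

   .. code-block:: python

      from statistics import NormalDist


      def calibration_area(preds, variances, targets, num_bins=100):
          """Approximate the area between the observed and ideal calibration curves."""
          abs_errors = [abs(t - p) for p, t in zip(preds, targets)]
          sigmas = [v ** 0.5 for v in variances]
          std_normal = NormalDist()
          area = 0.0
          # The endpoints 0 and 1 are skipped because inv_cdf is undefined at 1.
          for k in range(1, num_bins):
              expected = k / num_bins
              # Half-width, in units of sigma, of the central interval holding `expected` mass.
              z = std_normal.inv_cdf((1 + expected) / 2)
              observed = sum(e <= z * s for e, s in zip(abs_errors, sigmas)) / len(abs_errors)
              area += abs(observed - expected) / num_bins
          return area


      # A badly overconfident model: tiny variances, large errors.
      print(calibration_area([0.0, 0.0], [1e-12, 1e-12], [1.0, 1.0]))

   In this sketch a perfectly calibrated model gives an area near 0, and the worst case approaches 0.5.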


   .. py:method:: evaluate(preds, uncs, targets, mask, num_bins = 100)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties (variance) of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor
      :param num_bins: the number of bins to discretize the ``[0, 1]`` interval
      :type num_bins: int, default=100

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: ExpectedNormalizedErrorEvaluator

   Bases: :py:obj:`RegressionEvaluator`


   A class that evaluates uncertainty performance by grouping predictions into bins
   and comparing the average predicted variance of each bin against the RMSE of that bin. [1]_

   .. math::

       \mathrm{ENCE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|\mathrm{RMV}_i - \mathrm{RMSE}_i|}{\mathrm{RMV}_i}

   where :math:`N` is the number of bins, :math:`\mathrm{RMV}_i` is the root of the mean uncertainty over the
   :math:`i`-th bin and :math:`\mathrm{RMSE}_i` is the root mean square error over the :math:`i`-th bin. This
   discrepancy is further normalized by the uncertainty over the bin, :math:`\mathrm{RMV}_i`, because the error
   is expected to be naturally higher as the uncertainty increases.
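
   The ENCE formula can be sketched for a single task as follows. This is an illustration only, not
   the library's implementation. It assumes that bins are formed by sorting the samples by predicted
   variance and cutting them into nearly equal-sized contiguous groups; the name ``ence`` is hypothetical.

   .. code-block:: python

      import math


      def ence(preds, variances, targets, num_bins=100):
          """Expected normalized calibration error for a single task."""
          # Sort sample indices by predicted variance so each bin holds similar uncertainties.
          order = sorted(range(len(preds)), key=lambda i: variances[i])
          # Never use more bins than samples, so that no bin is empty.
          num_bins = min(num_bins, len(order))
          total = 0.0
          for b in range(num_bins):
              start = b * len(order) // num_bins
              stop = (b + 1) * len(order) // num_bins
              idx = order[start:stop]
              rmv = math.sqrt(sum(variances[i] for i in idx) / len(idx))
              rmse = math.sqrt(sum((targets[i] - preds[i]) ** 2 for i in idx) / len(idx))
              total += abs(rmv - rmse) / rmv
          return total / num_bins


      # Errors match the predicted standard deviations in both bins.
      print(ence([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 4.0, 4.0], [1.0, -1.0, 2.0, -2.0], num_bins=2))

   A value of 0 means that the RMSE of every bin equals the root mean variance of that bin.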

   .. rubric:: References

   .. [1] Levi, D.; Gispan, L.; Giladi, N.; Fetaya, E. "Evaluating and Calibrating Uncertainty Prediction in Regression Tasks."
       Sensors, 2022, 22(15), 5540. https://www.mdpi.com/1424-8220/22/15/5540


   .. py:method:: evaluate(preds, uncs, targets, mask, num_bins = 100)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties (variance) of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor
      :param num_bins: the number of bins the data are divided into
      :type num_bins: int, default=100

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: SpearmanEvaluator

   Bases: :py:obj:`RegressionEvaluator`


   Evaluate the Spearman rank correlation coefficient between the uncertainties and errors in the model predictions.

   The coefficient lies in the range [-1, 1]. Scores closer to 1 are better and are observed
   when the uncertainty values are predictive of the rank ordering of the errors in the model predictions.
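
   A rank correlation of this kind can be sketched for a single task as follows. This is an
   illustration in plain Python, not the library's implementation. It assumes that the errors are
   absolute errors and that ties receive the average of their ranks; both helper names are hypothetical.

   .. code-block:: python

      def average_ranks(values):
          """Rank values from 1 to n, giving tied values the mean of their positions."""
          order = sorted(range(len(values)), key=lambda i: values[i])
          ranks = [0.0] * len(values)
          start = 0
          while start < len(order):
              stop = start
              # Extend the run while the next sorted value ties with the current one.
              while stop + 1 < len(order) and values[order[stop + 1]] == values[order[start]]:
                  stop += 1
              for k in range(start, stop + 1):
                  ranks[order[k]] = (start + stop) / 2 + 1
              start = stop + 1
          return ranks


      def spearman(uncs, abs_errors):
          """Pearson correlation of the two rank vectors (undefined for constant inputs)."""
          rx, ry = average_ranks(uncs), average_ranks(abs_errors)
          mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
          cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
          var_x = sum((a - mx) ** 2 for a in rx)
          var_y = sum((b - my) ** 2 for b in ry)
          return cov / (var_x * var_y) ** 0.5


      uncs = [0.1, 0.4, 0.9]
      abs_errors = [abs(t - p) for p, t in zip([1.0, 2.0, 3.0], [1.2, 2.5, 6.0])]
      print(spearman(uncs, abs_errors))

   Here the largest uncertainty accompanies the largest error, so the ranks agree perfectly.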


   .. py:method:: evaluate(preds, uncs, targets, mask)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: RegressionConformalEvaluator

   Bases: :py:obj:`RegressionEvaluator`


   Evaluate the coverage of conformal prediction for regression datasets.

   .. math::

       \Pr (Y_{\text{test}} \in C(X_{\text{test}}))

   where :math:`C(X_{\text{test}})` is the predicted interval.
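
   Empirical coverage can be sketched as follows. This is an illustration only, not the library's
   implementation. It assumes that each interval is symmetric about the prediction and that the
   uncertainty value is its full width; that reading is an assumption of this sketch, and the name
   ``interval_coverage`` is hypothetical. Nested lists stand in for the ``n x t`` tensors.

   .. code-block:: python

      def interval_coverage(preds, widths, targets, mask):
          """Per-task fraction of unmasked targets that fall inside their interval."""
          n_tasks = len(preds[0])
          coverage = []
          for j in range(n_tasks):
              rows = [i for i in range(len(preds)) if mask[i][j]]
              # A target is covered when it lies within half a width of the prediction.
              hits = sum(
                  preds[i][j] - widths[i][j] / 2 <= targets[i][j] <= preds[i][j] + widths[i][j] / 2
                  for i in rows
              )
              # Assumes at least one unmasked entry per task.
              coverage.append(hits / len(rows))
          return coverage


      preds = [[0.0], [0.0], [0.0], [0.0]]
      widths = [[2.0], [2.0], [2.0], [2.0]]
      targets = [[0.5], [-1.0], [3.0], [10.0]]
      mask = [[True], [True], [True], [False]]  # the last row is excluded
      print(interval_coverage(preds, widths, targets, mask))

   Two of the three unmasked targets lie inside ``[-1, 1]``, so the coverage for this task is two thirds.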


   .. py:method:: evaluate(preds, uncs, targets, mask)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: BinaryClassificationEvaluator

   Bases: :py:obj:`abc.ABC`


   Evaluates the quality of uncertainty estimates in binary classification tasks.


   .. py:method:: evaluate(uncs, targets, mask)
      :abstractmethod:


      Evaluate the performance of uncertainty predictions against the model target values.

      :param uncs: the predicted uncertainties (i.e., the predicted probability of class 1) of the shape of ``n x t``, where ``n`` is the number of input
                   molecules/reactions, and ``t`` is the number of tasks.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: NLLClassEvaluator

   Bases: :py:obj:`BinaryClassificationEvaluator`


   Evaluate uncertainty values for binary classification datasets using the mean negative-log-likelihood
   of the targets given the assigned probabilities from the model:

   .. math::

       \mathrm{NLL} = -\log(\hat{y} \cdot y + (1 - \hat{y}) \cdot (1 - y))

   where :math:`y` is the true binary label (0 or 1), and
   :math:`\hat{y}` is the predicted probability associated with the class label 1.

   The function returns a tensor containing the mean NLL for each task.
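
   The binary NLL above can be sketched in plain Python as follows. This is an illustration only,
   not the library's implementation: nested lists stand in for the ``n x t`` tensors, the name
   ``mean_binary_nll`` is hypothetical, and every task is assumed to have at least one unmasked entry.

   .. code-block:: python

      import math


      def mean_binary_nll(probs, targets, mask):
          """Return the mean binary NLL per task over the unmasked entries."""
          n_tasks = len(probs[0])
          means = []
          for j in range(n_tasks):
              rows = [i for i in range(len(probs)) if mask[i][j]]
              # The likelihood is p when the label is 1 and (1 - p) when the label is 0.
              nlls = [
                  -math.log(probs[i][j] * targets[i][j] + (1 - probs[i][j]) * (1 - targets[i][j]))
                  for i in rows
              ]
              means.append(sum(nlls) / len(nlls))
          return means


      probs = [[0.5], [0.5], [0.9]]
      targets = [[1], [0], [0]]
      mask = [[True], [True], [False]]  # the third row is excluded
      print(mean_binary_nll(probs, targets, mask))

   A predicted probability of 0.5 gives :math:`\log 2` for either label, the NLL of an uninformative prediction.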


   .. py:method:: evaluate(uncs, targets, mask)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param uncs: the predicted uncertainties (i.e., the predicted probability of class 1) of the shape of ``n x t``, where ``n`` is the number of input
                   molecules/reactions, and ``t`` is the number of tasks.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: MultilabelConformalEvaluator

   Bases: :py:obj:`BinaryClassificationEvaluator`


   Evaluate the coverage of conformal prediction for binary classification datasets with multiple labels.

   .. math::

       \Pr \left(
           \hat{\mathcal C}_{\text{in}}(X) \subseteq \mathcal Y \subseteq \hat{\mathcal C}_{\text{out}}(X)
       \right)

   where the in-set :math:`\hat{\mathcal C}_\text{in}` is contained in the set of true labels :math:`\mathcal Y`, and
   :math:`\mathcal Y` is in turn contained in the out-set :math:`\hat{\mathcal C}_\text{out}`.
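
   The containment condition can be sketched with Python sets as follows. This is a set-level
   reading of the formula for illustration only, not the library's implementation; a per-task
   metric would apply the same test label by label. The name ``multilabel_coverage`` and the
   representation of the in-sets and out-sets as explicit sets of label indices are assumptions.

   .. code-block:: python

      def multilabel_coverage(in_sets, out_sets, true_sets):
          """Fraction of samples whose true label set lies between the in-set and the out-set."""
          hits = sum(
              # For sets, `a <= b <= c` tests that a is a subset of b and b is a subset of c.
              in_set <= truth <= out_set
              for in_set, out_set, truth in zip(in_sets, out_sets, true_sets)
          )
          return hits / len(true_sets)


      in_sets = [{0}, {1}, set()]
      out_sets = [{0, 1}, {1, 2}, {0}]
      true_sets = [{0, 1}, {2}, {0}]
      print(multilabel_coverage(in_sets, out_sets, true_sets))

   The second sample fails because label 1 is in the in-set but not among the true labels.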


   .. py:method:: evaluate(uncs, targets, mask)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param uncs: the predicted uncertainties (i.e., the predicted probability of class 1) of the shape of ``n x t``, where ``n`` is the number of input
                   molecules/reactions, and ``t`` is the number of tasks.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: MulticlassClassificationEvaluator

   Bases: :py:obj:`abc.ABC`


   Evaluates the quality of uncertainty estimates in multiclass classification tasks.


   .. py:method:: evaluate(uncs, targets, mask)
      :abstractmethod:


      Evaluate the performance of uncertainty predictions against the model target values.

      :param uncs: the predicted uncertainties (i.e., the predicted probabilities for each class) of the shape of ``n x t x c``, where ``n`` is the number of input
                   molecules/reactions, ``t`` is the number of tasks, and ``c`` is the number of classes.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: NLLMulticlassEvaluator

   Bases: :py:obj:`MulticlassClassificationEvaluator`


   Evaluate uncertainty values for multiclass classification datasets using the mean negative-log-likelihood
   of the targets given the assigned probabilities from the model:

   .. math::

       \mathrm{NLL} = -\log(p_{y_i})

   where :math:`p_{y_i}` is the predicted probability for the true class :math:`y_i`, calculated as:

   .. math::

       p_{y_i} = \sum_{k=1}^{K} \mathbb{1}(y_i = k) \cdot p_k

   Here, :math:`K` is the total number of classes,
   :math:`\mathbb{1}(y_i = k)` is the indicator function that is 1 when the true class :math:`y_i` equals class :math:`k` and 0 otherwise,
   and :math:`p_k` is the predicted probability for class :math:`k`.

   The function returns a tensor containing the mean NLL for each task.
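
   The multiclass NLL above can be sketched in plain Python as follows. This is an illustration
   only, not the library's implementation: nested lists stand in for the ``n x t x c`` tensor, the
   name ``mean_multiclass_nll`` is hypothetical, and class labels are assumed to be zero-based
   indices rather than the :math:`1, \dots, K` of the formula.

   .. code-block:: python

      import math


      def mean_multiclass_nll(probs, targets, mask):
          """Return the mean multiclass NLL per task over the unmasked entries.

          probs[i][j] is the list of class probabilities for sample i and task j;
          targets[i][j] is the zero-based index of the true class.
          """
          n_tasks = len(targets[0])
          means = []
          for j in range(n_tasks):
              rows = [i for i in range(len(targets)) if mask[i][j]]
              # Indexing by the true class equals the indicator-weighted sum over classes.
              nlls = [-math.log(probs[i][j][targets[i][j]]) for i in rows]
              # Assumes at least one unmasked entry per task.
              means.append(sum(nlls) / len(nlls))
          return means


      probs = [[[0.25, 0.75]], [[0.5, 0.5]]]
      targets = [[1], [0]]
      mask = [[True], [True]]
      print(mean_multiclass_nll(probs, targets, mask))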


   .. py:method:: evaluate(uncs, targets, mask)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param uncs: the predicted uncertainties (i.e., the predicted probabilities for each class) of the shape of ``n x t x c``, where ``n`` is the number of input
                   molecules/reactions, ``t`` is the number of tasks, and ``c`` is the number of classes.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



.. py:class:: MulticlassConformalEvaluator

   Bases: :py:obj:`MulticlassClassificationEvaluator`


   Evaluate the coverage of conformal prediction for multiclass classification datasets.

   .. math::

       \Pr (Y_{\text{test}} \in C(X_{\text{test}}))

   where :math:`C(X_{\text{test}}) \subset \{1 \mathrel{.\,.} K\}` is a prediction set of possible labels.
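
   Empirical coverage of prediction sets can be sketched for a single task as follows. This is an
   illustration only, not the library's implementation. Building each prediction set by keeping the
   classes whose probability reaches a fixed threshold is an assumption made for this example, as
   are the names ``prediction_set`` and ``prediction_set_coverage`` and the zero-based class indices.

   .. code-block:: python

      def prediction_set(class_probs, threshold):
          """Hypothetical set construction: keep classes with probability >= threshold."""
          return {k for k, p in enumerate(class_probs) if p >= threshold}


      def prediction_set_coverage(prediction_sets, true_labels):
          """Fraction of samples whose true label is a member of the prediction set."""
          hits = sum(label in pred_set for pred_set, label in zip(prediction_sets, true_labels))
          return hits / len(true_labels)


      probs = [[0.7, 0.2, 0.1], [0.4, 0.35, 0.25], [0.1, 0.1, 0.8]]
      sets = [prediction_set(p, threshold=0.3) for p in probs]
      true_labels = [0, 2, 2]
      print(sets)
      print(prediction_set_coverage(sets, true_labels))

   The second sample is not covered because its true class falls below the threshold.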


   .. py:method:: evaluate(uncs, targets, mask)

      Evaluate the performance of uncertainty predictions against the model target values.

      :param uncs: the predicted uncertainties (i.e., the predicted probabilities for each class) of the shape of ``n x t x c``, where ``n`` is the number of input
                   molecules/reactions, ``t`` is the number of tasks, and ``c`` is the number of classes.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the evaluation
      :type mask: Tensor

      :returns: a tensor of the shape ``t`` containing the evaluated metrics
      :rtype: Tensor



