chemprop.uncertainty.calibrator
===============================

.. py:module:: chemprop.uncertainty.calibrator


Attributes
----------

.. autoapisummary::

   chemprop.uncertainty.calibrator.logger
   chemprop.uncertainty.calibrator.UncertaintyCalibratorRegistry


Classes
-------

.. autoapisummary::

   chemprop.uncertainty.calibrator.CalibratorBase
   chemprop.uncertainty.calibrator.RegressionCalibrator
   chemprop.uncertainty.calibrator.ZScalingCalibrator
   chemprop.uncertainty.calibrator.ZelikmanCalibrator
   chemprop.uncertainty.calibrator.MVEWeightingCalibrator
   chemprop.uncertainty.calibrator.RegressionConformalCalibrator
   chemprop.uncertainty.calibrator.BinaryClassificationCalibrator
   chemprop.uncertainty.calibrator.PlattCalibrator
   chemprop.uncertainty.calibrator.IsotonicCalibrator
   chemprop.uncertainty.calibrator.MultilabelConformalCalibrator
   chemprop.uncertainty.calibrator.MulticlassClassificationCalibrator
   chemprop.uncertainty.calibrator.MulticlassConformalCalibrator
   chemprop.uncertainty.calibrator.AdaptiveMulticlassConformalCalibrator
   chemprop.uncertainty.calibrator.IsotonicMulticlassCalibrator


Module Contents
---------------

.. py:data:: logger

.. py:class:: CalibratorBase

   Bases: :py:obj:`abc.ABC`


   A base class for calibrating the predicted uncertainties.


   .. py:method:: fit(*args, **kwargs)
      :abstractmethod:


      Fit calibration method for the calibration data.



   .. py:method:: apply(uncs)
      :abstractmethod:


      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties
      :rtype: Tensor



.. py:data:: UncertaintyCalibratorRegistry

.. py:class:: RegressionCalibrator

   Bases: :py:obj:`CalibratorBase`


   A class for calibrating the predicted uncertainties in regression tasks.


   .. py:method:: fit(preds, uncs, targets, mask)
      :abstractmethod:


      Fit calibration method for the calibration data.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: RegressionCalibrator



.. py:class:: ZScalingCalibrator

   Bases: :py:obj:`RegressionCalibrator`


   Calibrate regression datasets by applying a scaling value to the uncalibrated standard deviation,
   fitted by minimizing the negative-log-likelihood of a normal distribution around each prediction. [levi2022]_
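
   As a minimal sketch of the idea (not chemprop's implementation), the scaling factor that minimizes the Gaussian
   negative log-likelihood has a closed form: its square is the mean squared z-score on the calibration set. The
   function names are hypothetical, plain Python lists stand in for single-task tensors, and the uncertainties are
   assumed to be variances.

   .. code-block:: python

      import math


      def fit_z_scaling(preds, variances, targets):
          """Return the std. dev. scaling factor that minimizes the Gaussian NLL."""
          # Squared z-score of each calibration error under the uncalibrated std. dev.
          z_squared = [
              (y - mu) ** 2 / var for mu, var, y in zip(preds, variances, targets)
          ]
          # Setting d(NLL)/ds = 0 for N(mu, (s * sigma)^2) gives s^2 = mean(z^2).
          return math.sqrt(sum(z_squared) / len(z_squared))


      def apply_z_scaling(variances, scaling):
          """Scale each std. dev. by `scaling`, i.e. each variance by scaling**2."""
          return [var * scaling**2 for var in variances]


      # Errors are twice as large as the predicted std. dev. suggests.
      scaling = fit_z_scaling([0.0] * 4, [1.0] * 4, [2.0, -2.0, 2.0, -2.0])
      calibrated = apply_z_scaling([1.0, 0.25], scaling)

   Here every calibration error is two predicted standard deviations, so the fitted factor doubles each standard
   deviation (and quadruples each variance).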

   .. rubric:: References

   .. [levi2022] Levi, D.; Gispan, L.; Giladi, N.; Fetaya, E. "Evaluating and Calibrating Uncertainty Prediction in
       Regression Tasks." Sensors, 2022, 22(15), 5540. https://www.mdpi.com/1424-8220/22/15/5540


   .. py:method:: fit(preds, uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: RegressionCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties
      :rtype: Tensor



.. py:class:: ZelikmanCalibrator(p)

   Bases: :py:obj:`RegressionCalibrator`


   Calibrate regression datasets using a method that does not assume a particular functional form for the probability distribution.

   It uses the "CRUDE" method as described in [zelikman2020]_. This implementation expects the uncertainty to be given as a variance.
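
   The core idea can be sketched as follows, assuming variance inputs and a single task: normalize each absolute
   calibration error by its predicted standard deviation, then use an empirical quantile of those values as the
   scaling factor. The function names are hypothetical, and chemprop's own implementation may estimate the
   quantile differently.

   .. code-block:: python

      import math


      def fit_crude_scaling(preds, variances, targets, p):
          """Return the empirical p-quantile of the normalized absolute errors."""
          z = sorted(
              abs(y - mu) / math.sqrt(var)
              for mu, var, y in zip(preds, variances, targets)
          )
          # Smallest value with at least a fraction p of the errors at or below it.
          k = max(1, math.ceil(p * len(z)))
          return z[k - 1]


      def apply_crude_scaling(variances, scaling):
          """Scale each std. dev. by `scaling`, i.e. each variance by scaling**2."""
          return [var * scaling**2 for var in variances]


      scaling = fit_crude_scaling([0.0] * 4, [1.0] * 4, [1.0, -2.0, 3.0, -4.0], p=0.5)
      calibrated = apply_crude_scaling([1.0, 0.25], scaling)

   No distributional assumption enters the fit; only the ranks of the normalized errors are used.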

   :param p: The target quantile, :math:`p \in [0, 1]`
   :type p: float

   .. rubric:: References

   .. [zelikman2020] Zelikman, E.; Healy, C.; Zhou, S.; Avati, A. "CRUDE: calibrating regression uncertainty distributions
       empirically." arXiv preprint arXiv:2005.12496. https://doi.org/10.48550/arXiv.2005.12496


   .. py:attribute:: p


   .. py:method:: fit(preds, uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: RegressionCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties
      :rtype: Tensor



.. py:class:: MVEWeightingCalibrator

   Bases: :py:obj:`RegressionCalibrator`


   Calibrate regression datasets that have ensembles of individual models that make variance predictions.

   This method minimizes the negative log likelihood for the predictions versus the targets by applying
   a weighted average across the variance predictions of the ensemble. [wang2021]_
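
   A toy sketch of this weighting for a two-model ensemble and a single task is given below. It grid-searches the
   weight instead of using a numerical optimizer, and all function names are hypothetical; it illustrates the
   objective rather than chemprop's fitting procedure.

   .. code-block:: python

      import math


      def gaussian_nll(preds, variances, targets):
          """Mean negative log-likelihood of the targets under N(pred, variance)."""
          return sum(
              0.5 * (math.log(2 * math.pi * v) + (y - mu) ** 2 / v)
              for mu, v, y in zip(preds, variances, targets)
          ) / len(preds)


      def weighted_variance(ensemble_variances, weights):
          """Collapse m x n variances to n values with a weighted average over models."""
          n = len(ensemble_variances[0])
          return [
              sum(w * model[i] for w, model in zip(weights, ensemble_variances))
              for i in range(n)
          ]


      def fit_two_model_weights(preds, ensemble_variances, targets, steps=100):
          """Grid-search the weight of model 0; model 1 receives the remainder."""
          candidates = [(i / steps, 1 - i / steps) for i in range(steps + 1)]
          return min(
              candidates,
              key=lambda w: gaussian_nll(
                  preds, weighted_variance(ensemble_variances, w), targets
              ),
          )


      preds = [0.0, 0.0, 0.0, 0.0]
      targets = [1.0, -1.0, 1.0, -1.0]
      # Model 0 predicts a variance matching the squared errors; model 1 overestimates it.
      ensemble_variances = [[1.0] * 4, [100.0] * 4]
      weights = fit_two_model_weights(preds, ensemble_variances, targets)
      calibrated = weighted_variance(ensemble_variances, weights)

   The search puts all of the weight on the model whose variance matches the observed squared errors.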

   .. rubric:: References

   .. [wang2021] Wang, D.; Yu, J.; Chen, L.; Li, X.; Jiang, H.; Chen, K.; Zheng, M.; Luo, X. "A hybrid framework
       for improving uncertainty quantification in deep learning-based QSAR regression modeling." J. Cheminform.,
       2021, 13, 1-17. https://doi.org/10.1186/s13321-021-00551-x


   .. py:method:: fit(preds, uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``m x n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: MVEWeightingCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties of the shape of ``m x n x t``
      :type uncs: Tensor

      :returns: the calibrated uncertainties of the shape of ``n x t``
      :rtype: Tensor



.. py:class:: RegressionConformalCalibrator(alpha)

   Bases: :py:obj:`RegressionCalibrator`


   Conformalize quantiles so that the interval :math:`[\hat{t}_{\alpha/2}(x),\hat{t}_{1-\alpha/2}(x)]` has
   approximately :math:`1-\alpha` coverage. [angelopoulos2021]_

   .. math::
       s(x, y) &= \max \left\{ \hat{t}_{\alpha/2}(x) - y, y - \hat{t}_{1-\alpha/2}(x) \right\}

       \hat{q} &= Q\left(s_1, \ldots, s_n; \frac{\left\lceil (n+1)(1-\alpha) \right\rceil}{n}\right)

       C(x) &= \left[ \hat{t}_{\alpha/2}(x) - \hat{q}, \hat{t}_{1-\alpha/2}(x) + \hat{q} \right]

   where :math:`s` is the nonconformity score as the difference between :math:`y` and its nearest quantile.
   :math:`\hat{t}_{\alpha/2}(x)` and :math:`\hat{t}_{1-\alpha/2}(x)` are the predicted quantiles from a quantile
   regression model.
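
   The three steps above can be sketched in plain Python for a single task. The function names are hypothetical,
   and the rank is capped at :math:`n` as a simplification for the case where :math:`(n+1)(1-\alpha)` exceeds
   the number of calibration points.

   .. code-block:: python

      import math


      def conformal_quantile(scores, alpha):
          """Return the score of rank ceil((n + 1)(1 - alpha)) among n scores."""
          n = len(scores)
          # Clamp the rank to [1, n]; strictly, a rank above n means an infinite quantile.
          k = min(n, max(1, math.ceil((n + 1) * (1 - alpha))))
          return sorted(scores)[k - 1]


      def fit_q_hat(lower, upper, targets, alpha):
          """Compute s(x, y) for each calibration point and take its quantile."""
          scores = [max(lo - y, y - hi) for lo, hi, y in zip(lower, upper, targets)]
          return conformal_quantile(scores, alpha)


      def conformal_interval(lo, hi, q_hat):
          """Widen (or shrink, if q_hat < 0) the predicted quantile interval."""
          return (lo - q_hat, hi + q_hat)


      q_hat = fit_q_hat([0.0] * 4, [1.0] * 4, [0.5, 1.25, -0.25, 1.5], alpha=0.5)
      interval = conformal_interval(0.0, 1.0, q_hat)

   With these four calibration points the scores are ``-0.5, 0.25, 0.25, 0.5``, and the score of rank 3 is used.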

   .. note::
       The algorithm is specifically designed for quantile regression models. Intuitively, the set :math:`C(x)` just
       grows or shrinks the distance between the quantiles by :math:`\hat{q}` to achieve coverage. However, this
       function can also be applied to regression models without quantiles being provided. In this case, both
       :math:`\hat{t}_{\alpha/2}(x)` and :math:`\hat{t}_{1-\alpha/2}(x)` are the same as :math:`\hat{y}`. Then, the
       interval around :math:`\hat{y}` would have the same half-width for every data point
       (i.e., :math:`\left[\hat{y}-\hat{q}, \hat{y}+\hat{q} \right]`).

   :param alpha: The error rate, :math:`\alpha \in [0, 1]`
   :type alpha: float

   .. rubric:: References

   .. [angelopoulos2021] Angelopoulos, A.N.; Bates, S.; "A Gentle Introduction to Conformal Prediction and Distribution-Free
       Uncertainty Quantification." arXiv Preprint 2021, https://arxiv.org/abs/2107.07511


   .. py:attribute:: alpha


   .. py:attribute:: bounds


   .. py:method:: fit(preds, uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param preds: the predictions for regression tasks. It is a tensor of the shape of ``n x t``, where ``n`` is
                    the number of input molecules/reactions, and ``t`` is the number of tasks.
      :type preds: Tensor
      :param uncs: the predicted uncertainties of the shape of ``n x t``
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: RegressionCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties (half intervals).

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated half intervals
      :rtype: Tensor



.. py:class:: BinaryClassificationCalibrator

   Bases: :py:obj:`CalibratorBase`


   A class for calibrating the predicted uncertainties in binary classification tasks.


   .. py:method:: fit(uncs, targets, mask)
      :abstractmethod:


      Fit calibration method for the calibration data.

      :param uncs: the predicted uncertainties (i.e., the predicted probability of class 1) of the shape of ``n x t``, where ``n`` is the number of input
                   molecules/reactions, and ``t`` is the number of tasks.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: BinaryClassificationCalibrator



.. py:class:: PlattCalibrator

   Bases: :py:obj:`BinaryClassificationCalibrator`


   Calibrate classification datasets using the Platt scaling algorithm [guo2017]_, [platt1999]_.

   In [platt1999]_, Platt suggests using the number of positive and negative training examples to
   adjust the value of target probabilities used to fit the parameters.
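
   The adjustment from [platt1999]_ replaces the hard targets 1 and 0 with smoothed values that depend on the class
   counts. The sketch below shows those targets together with a sigmoid map applied to the logit of the predicted
   probability. The function names and the parameters ``a`` and ``b`` are hypothetical, and the exact
   parametrization used by this class is an assumption.

   .. code-block:: python

      import math


      def platt_targets(n_positive, n_negative):
          """Smoothed targets (t_plus, t_minus) suggested by Platt (1999)."""
          t_plus = (n_positive + 1) / (n_positive + 2)
          t_minus = 1 / (n_negative + 2)
          return t_plus, t_minus


      def platt_transform(prob, a, b):
          """Apply a fitted map sigmoid(a * logit(prob) + b); requires 0 < prob < 1."""
          logit = math.log(prob / (1 - prob))
          return 1 / (1 + math.exp(-(a * logit + b)))


      t_plus, t_minus = platt_targets(n_positive=8, n_negative=18)
      # With a = 1 and b = 0 the map is the identity on probabilities.
      unchanged = platt_transform(0.8, a=1.0, b=0.0)

   The smoothed targets approach 1 and 0 as the class counts grow, so the adjustment matters most for small
   training sets.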

   .. rubric:: References

   .. [guo2017] Guo, C.; Pleiss, G.; Sun, Y.; Weinberger, K. Q. "On calibration of modern neural
       networks". ICML, 2017. https://arxiv.org/abs/1706.04599
   .. [platt1999] Platt, J. "Probabilistic Outputs for Support Vector Machines and Comparisons to
       Regularized Likelihood Methods." Adv. Large Margin Classif. 1999, 10 (3), 61–74.


   .. py:method:: fit(uncs, targets, mask, training_targets = None)

      Fit calibration method for the calibration data.

      :param uncs: the predicted uncertainties (i.e., the predicted probability of class 1) of the shape of ``n x t``, where ``n`` is the number of input
                   molecules/reactions, and ``t`` is the number of tasks.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: BinaryClassificationCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties
      :rtype: Tensor



.. py:class:: IsotonicCalibrator

   Bases: :py:obj:`BinaryClassificationCalibrator`


   Calibrate binary classification datasets using isotonic regression as discussed in [guo2017]_.
   In effect, the method transforms incoming uncalibrated confidences using a histogram-like
   function where the range of each transforming bin and its magnitude is learned.
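
   The learned "bins" can be illustrated with the pool-adjacent-violators algorithm, a standard way to solve
   isotonic regression. This is a stand-alone sketch for a single task with a hypothetical function name; it is
   not the code used by this class.

   .. code-block:: python

      def isotonic_fit(probs, targets):
          """Pool-adjacent-violators fit; returns (sorted probs, fitted values)."""
          pairs = sorted(zip(probs, targets))
          blocks = []  # each block is [sum of targets, number of points]
          for _, y in pairs:
              blocks.append([y, 1])
              # Merge backwards while a block mean is larger than the one after it.
              while (
                  len(blocks) > 1
                  and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
              ):
                  total, count = blocks.pop()
                  blocks[-1][0] += total
                  blocks[-1][1] += count
          # Every point in a block receives the block mean.
          fitted = [total / count for total, count in blocks for _ in range(count)]
          return [p for p, _ in pairs], fitted


      xs, fitted = isotonic_fit([0.9, 0.1, 0.6, 0.4], [1, 0, 0, 1])

   Each merged block plays the role of one bin: its range is set by the uncalibrated probabilities it spans and
   its magnitude is the fraction of positives inside it.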

   .. rubric:: References

   .. [guo2017] Guo, C.; Pleiss, G.; Sun, Y.; Weinberger, K. Q. "On calibration of modern neural
       networks". ICML, 2017. https://arxiv.org/abs/1706.04599


   .. py:method:: fit(uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param uncs: the predicted uncertainties (i.e., the predicted probability of class 1) of the shape of ``n x t``, where ``n`` is the number of input
                   molecules/reactions, and ``t`` is the number of tasks.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: BinaryClassificationCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties
      :rtype: Tensor



.. py:class:: MultilabelConformalCalibrator(alpha)

   Bases: :py:obj:`BinaryClassificationCalibrator`


   Create a conformal in-set and a conformal out-set such that, for a :math:`1-\alpha` proportion of datapoints,
   the set of true labels is bounded by the in- and out-sets [1]_:

   .. math::
       \Pr \left(
           \hat{\mathcal C}_{\text{in}}(X) \subseteq \mathcal Y \subseteq \hat{\mathcal C}_{\text{out}}(X)
       \right) \geq 1 - \alpha,

   where the in-set :math:`\hat{\mathcal C}_\text{in}` is contained by the set of true labels :math:`\mathcal Y` and
   :math:`\mathcal Y` is contained within the out-set :math:`\hat{\mathcal C}_\text{out}`.
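
   To make the containment concrete, the sketch below builds both sets for one datapoint by thresholding the
   predicted label probabilities. The thresholds ``t_in`` and ``t_out`` are hypothetical stand-ins for values a
   fitted calibrator would derive from the calibration scores; how they are fitted is not shown.

   .. code-block:: python

      def in_and_out_sets(probs, t_in, t_out):
          """Return (in_set, out_set) of label indices for one datapoint."""
          # A stricter threshold for the in-set keeps it inside the out-set.
          in_set = {k for k, p in enumerate(probs) if p >= t_in}
          out_set = {k for k, p in enumerate(probs) if p >= t_out}
          return in_set, out_set


      in_set, out_set = in_and_out_sets([0.95, 0.6, 0.05], t_in=0.9, t_out=0.2)
      true_labels = {0, 1}
      # The event whose probability is bounded below by 1 - alpha.
      covered = in_set <= true_labels <= out_set

   Labels in the in-set are confidently present, labels outside the out-set are confidently absent, and the
   remaining labels are left undecided.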

   :param alpha: The error rate, :math:`\alpha \in [0, 1]`
   :type alpha: float

   .. rubric:: References

   .. [1] Cauchois, M.; Gupta, S.; Duchi, J.; "Knowing What You Know: Valid and Validated Confidence Sets
       in Multiclass and Multilabel Prediction." arXiv Preprint 2020, https://arxiv.org/abs/2004.10181


   .. py:attribute:: alpha


   .. py:method:: nonconformity_scores(preds)
      :staticmethod:


      Compute nonconformity score as the negative of the predicted probability.

      .. math::
          s_i = -\hat{f}(X_i)_{Y_i}



   .. py:method:: fit(uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param uncs: the predicted uncertainties (i.e., the predicted probability of class 1) of the shape of ``n x t``, where ``n`` is the number of input
                   molecules/reactions, and ``t`` is the number of tasks.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: BinaryClassificationCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties of the shape of ``n x t x 2``, where ``n`` is the number of input
                molecules/reactions, ``t`` is the number of tasks, and the first element in the last dimension
                corresponds to the in-set :math:`\hat{\mathcal C}_\text{in}`, while the second corresponds to
                the out-set :math:`\hat{\mathcal C}_\text{out}`.
      :rtype: Tensor



.. py:class:: MulticlassClassificationCalibrator

   Bases: :py:obj:`CalibratorBase`


   A class for calibrating the predicted uncertainties in multiclass classification tasks.


   .. py:method:: fit(uncs, targets, mask)
      :abstractmethod:


      Fit calibration method for the calibration data.

      :param uncs: the predicted uncertainties (i.e., the predicted probabilities for each class) of the
                   shape of ``n x t x c``, where ``n`` is the number of input molecules/reactions, ``t`` is
                   the number of tasks, and ``c`` is the number of classes.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in
                   the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: MulticlassClassificationCalibrator



.. py:class:: MulticlassConformalCalibrator(alpha)

   Bases: :py:obj:`MulticlassClassificationCalibrator`


   Create a prediction set of possible labels :math:`C(X_{\text{test}}) \subset \{1 \mathrel{.\,.} K\}` that satisfies:

   .. math::
       1 - \alpha \leq \Pr (Y_{\text{test}} \in C(X_{\text{test}})) \leq 1 - \alpha + \frac{1}{n + 1}

   In other words, the probability that the prediction set contains the correct label is almost exactly :math:`1-\alpha`.
   More details can be found in [1]_.
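
   A minimal single-task sketch of split conformal prediction is shown below, using the negative probability of
   the true class as the nonconformity score. The function names are hypothetical and plain lists stand in for
   tensors.

   .. code-block:: python

      import math


      def fit_threshold(cal_probs, cal_labels, alpha):
          """Return the conformal quantile of the calibration scores."""
          scores = sorted(-probs[y] for probs, y in zip(cal_probs, cal_labels))
          n = len(scores)
          # Rank ceil((n + 1)(1 - alpha)), clamped to [1, n] as a simplification.
          k = min(n, max(1, math.ceil((n + 1) * (1 - alpha))))
          return scores[k - 1]


      def prediction_set(probs, q_hat):
          """Keep every class whose score does not exceed the threshold."""
          return [c for c, p in enumerate(probs) if -p <= q_hat]


      cal_probs = [
          [0.9, 0.05, 0.05],
          [0.1, 0.8, 0.1],
          [0.6, 0.3, 0.1],
          [0.5, 0.1, 0.4],
      ]
      cal_labels = [0, 1, 0, 2]
      q_hat = fit_threshold(cal_probs, cal_labels, alpha=0.5)
      confident = prediction_set([0.7, 0.2, 0.1], q_hat)
      uncertain = prediction_set([0.5, 0.45, 0.05], q_hat)

   A prediction set can be empty when no class reaches the threshold, as in the second query.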

   :param alpha: Error rate, :math:`\alpha \in [0, 1]`
   :type alpha: float

   .. rubric:: References

   .. [1] Angelopoulos, A.N.; Bates, S.; "A Gentle Introduction to Conformal Prediction and Distribution-Free
       Uncertainty Quantification." arXiv Preprint 2021, https://arxiv.org/abs/2107.07511


   .. py:attribute:: alpha


   .. py:method:: nonconformity_scores(preds)
      :staticmethod:


      Compute nonconformity score as the negative of the softmax output for the true class.

      .. math::
          s_i = -\hat{f}(X_i)_{Y_i}



   .. py:method:: fit(uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param uncs: the predicted uncertainties (i.e., the predicted probabilities for each class) of the
                   shape of ``n x t x c``, where ``n`` is the number of input molecules/reactions, ``t`` is
                   the number of tasks, and ``c`` is the number of classes.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in
                   the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: MulticlassClassificationCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties
      :rtype: Tensor



.. py:class:: AdaptiveMulticlassConformalCalibrator(alpha)

   Bases: :py:obj:`MulticlassConformalCalibrator`


   Create a prediction set of possible labels :math:`C(X_{\text{test}}) \subset \{1 \mathrel{.\,.} K\}` that satisfies:

   .. math::
       1 - \alpha \leq \Pr (Y_{\text{test}} \in C(X_{\text{test}})) \leq 1 - \alpha + \frac{1}{n + 1}

   In other words, the probability that the prediction set contains the correct label is almost exactly :math:`1-\alpha`.
   More details can be found in [1]_.

   :param alpha: Error rate, :math:`\alpha \in [0, 1]`
   :type alpha: float

   .. rubric:: References

   .. [1] Angelopoulos, A.N.; Bates, S.; "A Gentle Introduction to Conformal Prediction and Distribution-Free
       Uncertainty Quantification." arXiv Preprint 2021, https://arxiv.org/abs/2107.07511


   .. py:method:: nonconformity_scores(preds)
      :staticmethod:


      Compute nonconformity score by greedily including classes in the classification set until it reaches the true label.

      .. math::
          s(x, y) = \sum_{j=1}^{k} \hat{f}(x)_{\pi_j(x)}, \text{ where } y = \pi_k(x)

      where :math:`\pi(x)` is the permutation of :math:`\{1 \mathrel{.\,.} K\}` that sorts the entries of :math:`\hat{f}(x)` from most likely to least likely, and :math:`\pi_j(x)` is its :math:`j`-th element.
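
      For a single datapoint, this score can be sketched in plain Python as follows. The function name is
      hypothetical and class indices start at 0.

      .. code-block:: python

         def adaptive_score(probs, label):
             """Sum the probabilities of all classes ranked at or above the true label."""
             # Class indices ordered from most likely to least likely.
             order = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
             total = 0.0
             for c in order:
                 total += probs[c]
                 if c == label:
                     return total
             raise ValueError("label is not a valid class index")


         top_ranked = adaptive_score([0.125, 0.5, 0.375], label=1)
         second_ranked = adaptive_score([0.125, 0.5, 0.375], label=2)

      The score is small when the true label is ranked first and grows toward 1 as the true label falls further
      down the ranking.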



.. py:class:: IsotonicMulticlassCalibrator

   Bases: :py:obj:`MulticlassClassificationCalibrator`


   Calibrate multiclass classification datasets using isotonic regression as discussed in
   [guo2017]_. It uses a one-vs-all aggregation scheme to extend isotonic regression from binary to
   multiclass classifiers.
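
   One way to picture the one-vs-all scheme is sketched below: each class probability is passed through its own
   monotone map, and the results are renormalized to sum to 1. The per-class maps are hypothetical stand-ins for
   fitted isotonic regressors, and the renormalization step is an assumption about how the per-class outputs are
   aggregated.

   .. code-block:: python

      def calibrate_one_vs_all(probs, per_class_maps):
          """Apply one monotone map per class, then renormalize to sum to 1."""
          raw = [f(p) for f, p in zip(per_class_maps, probs)]
          total = sum(raw)  # assumed to be positive in this sketch
          return [r / total for r in raw]


      # Toy monotone maps: class 0 is left as is, classes 1 and 2 are shrunk.
      maps = [lambda p: p, lambda p: p / 2, lambda p: p / 2]
      calibrated = calibrate_one_vs_all([0.5, 0.25, 0.25], maps)

   Because each class is calibrated independently against "this class versus the rest", the raw outputs need not
   sum to 1, which is why a final normalization is applied here.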

   .. rubric:: References

   .. [guo2017] Guo, C.; Pleiss, G.; Sun, Y.; Weinberger, K. Q. "On calibration of modern neural
       networks". ICML, 2017. https://arxiv.org/abs/1706.04599


   .. py:method:: fit(uncs, targets, mask)

      Fit calibration method for the calibration data.

      :param uncs: the predicted uncertainties (i.e., the predicted probabilities for each class) of the
                   shape of ``n x t x c``, where ``n`` is the number of input molecules/reactions, ``t`` is
                   the number of tasks, and ``c`` is the number of classes.
      :type uncs: Tensor
      :param targets: a tensor of the shape ``n x t``
      :type targets: Tensor
      :param mask: a tensor of the shape ``n x t`` indicating whether the given values should be used in
                   the fitting
      :type mask: Tensor

      :returns: **self** -- the fitted calibrator
      :rtype: MulticlassClassificationCalibrator



   .. py:method:: apply(uncs)

      Apply this calibrator to the input uncertainties.

      :param uncs: a tensor containing uncalibrated uncertainties
      :type uncs: Tensor

      :returns: the calibrated uncertainties
      :rtype: Tensor



