chemprop.featurizers.atom
=========================

.. py:module:: chemprop.featurizers.atom


Classes
-------

.. autoapisummary::

   chemprop.featurizers.atom.MultiHotAtomFeaturizer
   chemprop.featurizers.atom.RIGRAtomFeaturizer
   chemprop.featurizers.atom.AtomFeatureMode


Functions
---------

.. autoapisummary::

   chemprop.featurizers.atom.get_multi_hot_atom_featurizer


Module Contents
---------------

.. py:class:: MultiHotAtomFeaturizer(atomic_nums, degrees, formal_charges, chiral_tags, num_Hs, hybridizations)

   Bases: :py:obj:`chemprop.featurizers.base.VectorFeaturizer`\ [\ :py:obj:`rdkit.Chem.rdchem.Atom`\ ]


   A :class:`MultiHotAtomFeaturizer` uses a multi-hot encoding to featurize atoms.

   .. seealso::
       The class provides three default parameterization schemes:

       * :meth:`MultiHotAtomFeaturizer.v1`
       * :meth:`MultiHotAtomFeaturizer.v2`
       * :meth:`MultiHotAtomFeaturizer.organic`

   The generated atom features are ordered as follows:

   * atomic number
   * degree
   * formal charge
   * chiral tag
   * number of hydrogens
   * hybridization
   * aromaticity
   * mass

   .. important::
       Each feature, except for aromaticity and mass, includes a pad for unknown values.

   :param atomic_nums: the choices for atom type denoted by atomic number. For example, ``[6, 7, 8]`` for C, N and O.
   :type atomic_nums: Sequence[int]
   :param degrees: the choices for number of bonds an atom is engaged in.
   :type degrees: Sequence[int]
   :param formal_charges: the choices for integer electronic charge assigned to an atom.
   :type formal_charges: Sequence[int]
   :param chiral_tags: the choices for an atom's chiral tag. See :class:`rdkit.Chem.rdchem.ChiralType` for possible integer values.
   :type chiral_tags: Sequence[int]
   :param num_Hs: the choices for number of bonded hydrogen atoms.
   :type num_Hs: Sequence[int]
   :param hybridizations: the choices for an atom's hybridization type. See :class:`rdkit.Chem.rdchem.HybridizationType` for possible integer values.
   :type hybridizations: Sequence[int]


   .. py:attribute:: atomic_nums


   .. py:attribute:: degrees


   .. py:attribute:: formal_charges


   .. py:attribute:: chiral_tags


   .. py:attribute:: num_Hs


   .. py:attribute:: hybridizations


   .. py:method:: __len__()


   .. py:method:: __call__(a)


   .. py:method:: num_only(a)

      Featurize the atom by setting only the atomic number bit.



   .. py:method:: v1(max_atomic_num = 100)
      :classmethod:


      The original implementation used in Chemprop V1 [1]_, [2]_.

      :param max_atomic_num: Include a bit for all atomic numbers in the interval :math:`[1, \mathtt{max\_atomic\_num}]`
      :type max_atomic_num: int, default=100

      .. rubric:: References

      .. [1] Yang, K.; Swanson, K.; Jin, W.; Coley, C.; Eiden, P.; Gao, H.; Guzman-Perez, A.; Hopper, T.;
          Kelley, B.; Mathea, M.; Palmer, A. "Analyzing Learned Molecular Representations for Property Prediction."
          J. Chem. Inf. Model. 2019, 59 (8), 3370–3388. https://doi.org/10.1021/acs.jcim.9b00237
      .. [2] Heid, E.; Greenman, K.P.; Chung, Y.; Li, S.C.; Graff, D.E.; Vermeire, F.H.; Wu, H.; Green, W.H.; McGill,
          C.J. "Chemprop: A machine learning package for chemical property prediction." J. Chem. Inf. Model. 2024,
          64 (1), 9–17. https://doi.org/10.1021/acs.jcim.3c01250



   .. py:method:: v2()
      :classmethod:


      An implementation that includes an atom type bit for all elements in the first four rows of the periodic table plus iodine.



   .. py:method:: organic()
      :classmethod:


      A specific parameterization intended for use with organic or drug-like molecules.

      This parameterization includes:

      1. an atomic number bit only for H, B, C, N, O, F, Si, P, S, Cl, Br, and I atoms
      2. a hybridization bit for :math:`s, sp, sp^2` and :math:`sp^3` hybridizations
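
The feature layout documented for :class:`MultiHotAtomFeaturizer` can be illustrated with a minimal pure-Python sketch. It is not Chemprop's implementation: atoms are plain dictionaries rather than RDKit atoms, and the helper names (``one_hot_with_pad``, ``feature_length``, ``featurize``, ``num_only``), the dictionary keys, and the small choice lists are all hypothetical. The mass is appended as given, so any scaling a real featurizer may apply is not modeled.

```python
from typing import Dict, List, Sequence

# Categorical subfeatures in the documented order; each one gets a pad bit.
CATEGORICAL_KEYS = (
    "atomic_num", "degree", "formal_charge", "chiral_tag", "num_Hs", "hybridization",
)


def one_hot_with_pad(value, choices: Sequence) -> List[float]:
    """Return len(choices) + 1 bits; the last bit marks an unknown value."""
    bits = [0.0] * (len(choices) + 1)
    index = list(choices).index(value) if value in choices else len(choices)
    bits[index] = 1.0
    return bits


def feature_length(choices_by_key: Dict[str, Sequence]) -> int:
    """Total length: padded categorical blocks plus aromaticity and mass."""
    return sum(len(choices_by_key[key]) + 1 for key in CATEGORICAL_KEYS) + 2


def featurize(atom: Dict, choices_by_key: Dict[str, Sequence]) -> List[float]:
    """Concatenate the blocks in the documented order."""
    x: List[float] = []
    for key in CATEGORICAL_KEYS:
        x.extend(one_hot_with_pad(atom[key], choices_by_key[key]))
    x.append(1.0 if atom["is_aromatic"] else 0.0)  # aromaticity: no pad bit
    x.append(float(atom["mass"]))                  # mass: no pad bit, unscaled here
    return x


def num_only(atom: Dict, choices_by_key: Dict[str, Sequence]) -> List[float]:
    """Zero vector of full length with only the atomic number bit set."""
    x = [0.0] * feature_length(choices_by_key)
    nums = list(choices_by_key["atomic_num"])
    number = atom["atomic_num"]
    x[nums.index(number) if number in nums else len(nums)] = 1.0
    return x


# Hypothetical, deliberately small choices (not the v1, v2, or organic defaults).
CHOICES = {
    "atomic_num": [6, 7, 8],
    "degree": [0, 1, 2, 3, 4],
    "formal_charge": [-1, 0, 1],
    "chiral_tag": [0, 1, 2, 3],
    "num_Hs": [0, 1, 2, 3, 4],
    "hybridization": ["SP", "SP2", "SP3"],
}

aromatic_carbon = {
    "atomic_num": 6, "degree": 3, "formal_charge": 0, "chiral_tag": 0,
    "num_Hs": 1, "hybridization": "SP2", "is_aromatic": True, "mass": 12.011,
}
sulfur = {
    "atomic_num": 16, "degree": 2, "formal_charge": 0, "chiral_tag": 0,
    "num_Hs": 0, "hybridization": "SP3", "is_aromatic": False, "mass": 32.06,
}

carbon_features = featurize(aromatic_carbon, CHOICES)
sulfur_features = featurize(sulfur, CHOICES)
carbon_num_only = num_only(aromatic_carbon, CHOICES)
```

In this sketch, sulfur is not among the atomic number choices, so its atomic number block sets only the trailing pad bit. Every categorical block contributes exactly one set bit, whether or not the value is known.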



.. py:class:: RIGRAtomFeaturizer(atomic_nums = None, degrees = None, num_Hs = None)

   Bases: :py:obj:`chemprop.featurizers.base.VectorFeaturizer`\ [\ :py:obj:`rdkit.Chem.rdchem.Atom`\ ]


   A :class:`RIGRAtomFeaturizer` uses a multi-hot encoding to featurize atoms using
   resonance-invariant features [1]_.

   The generated atom features are ordered as follows:

   * atomic number
   * degree
   * number of hydrogens
   * mass

   .. rubric:: References

   .. [1] Zalte, A. S.; Pang, H.-W.; Doner, A. C.; Green, W. H.
       "RIGR: Resonance-Invariant Graph Representation for Molecular Property Prediction."
       J. Chem. Inf. Model. 2025, 65 (20), 10832–10843. https://doi.org/10.1021/acs.jcim.5c00495


   .. py:attribute:: atomic_nums


   .. py:attribute:: degrees


   .. py:attribute:: num_Hs


   .. py:method:: __len__()


   .. py:method:: __call__(a)


   .. py:method:: num_only(a)

      Featurize the atom by setting only the atomic number bit.
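
A small pure-Python sketch can illustrate why the four features listed for :class:`RIGRAtomFeaturizer` are resonance-invariant. It is an illustration only, not Chemprop's code: the dictionaries, their keys, and the ``invariant_descriptor`` helper are hypothetical. The two atoms stand for the same carboxylate oxygen drawn in two resonance structures, which differ in formal charge but not in atomic number, degree, hydrogen count, or mass.

```python
from typing import Dict, Tuple

# The features listed for RIGRAtomFeaturizer, in the documented order.
RESONANCE_INVARIANT_KEYS = ("atomic_num", "degree", "num_Hs", "mass")


def invariant_descriptor(atom: Dict) -> Tuple:
    """Keep only the resonance-invariant properties of an atom."""
    return tuple(atom[key] for key in RESONANCE_INVARIANT_KEYS)


# The same carboxylate oxygen in two resonance structures (hypothetical data):
# once as the singly bonded, negatively charged oxygen ...
oxygen_anionic = {
    "atomic_num": 8, "degree": 1, "num_Hs": 0, "mass": 15.999, "formal_charge": -1,
}
# ... and once as the doubly bonded, neutral oxygen.
oxygen_carbonyl = {
    "atomic_num": 8, "degree": 1, "num_Hs": 0, "mass": 15.999, "formal_charge": 0,
}

same_descriptor = invariant_descriptor(oxygen_anionic) == invariant_descriptor(oxygen_carbonyl)
```

A featurizer that also encoded formal charge would separate these two drawings of one atom, whereas the descriptor above treats them identically.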



.. py:class:: AtomFeatureMode

   Bases: :py:obj:`chemprop.utils.utils.EnumMapping`


   The mode used to featurize an atom into a ``MolGraph``.


   .. py:attribute:: V1


   .. py:attribute:: V2


   .. py:attribute:: ORGANIC


   .. py:attribute:: RIGR


.. py:function:: get_multi_hot_atom_featurizer(mode)

   Build the multi-hot atom featurizer corresponding to the given ``mode``.
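
The mode-to-parameterization mapping can be sketched with the standard library alone. The ``Mode`` enum and the ``atomic_num_choices`` function below are hypothetical stand-ins, not Chemprop's API. Accepting a string as well as an enum member is an assumption of this sketch. The atomic number lists follow the descriptions of ``v1``, ``v2``, and ``organic`` given for :class:`MultiHotAtomFeaturizer`. ``RIGR`` is omitted because this page does not list its default choices.

```python
from enum import Enum
from typing import List, Union


class Mode(Enum):
    """Hypothetical stand-in for AtomFeatureMode (RIGR omitted)."""
    V1 = "v1"
    V2 = "v2"
    ORGANIC = "organic"


def atomic_num_choices(mode: Union[Mode, str], max_atomic_num: int = 100) -> List[int]:
    """Return the atomic number choices described for each parameterization."""
    if isinstance(mode, str):
        # Assumption of this sketch: look names up case-insensitively.
        mode = Mode[mode.upper()]
    if mode is Mode.V1:
        # Every atomic number in the interval [1, max_atomic_num].
        return list(range(1, max_atomic_num + 1))
    if mode is Mode.V2:
        # First four rows of the periodic table (H through Kr) plus iodine.
        return list(range(1, 37)) + [53]
    if mode is Mode.ORGANIC:
        # H, B, C, N, O, F, Si, P, S, Cl, Br, I
        return [1, 5, 6, 7, 8, 9, 14, 15, 16, 17, 35, 53]
    raise ValueError(f"unsupported mode: {mode!r}")


v1_nums = atomic_num_choices("v1")
v2_nums = atomic_num_choices(Mode.V2)
organic_nums = atomic_num_choices("ORGANIC")
```

Keeping the choice in an enum means an unsupported name fails at lookup time instead of silently selecting a default parameterization.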


