chemprop.featurizers.atom
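Module Contents
Classes
MultiHotAtomFeaturizer | A MultiHotAtomFeaturizer uses a multi-hot encoding to featurize atoms.
AtomFeatureMode | The mode of an atom is used for featurization into a MolGraph.
Functions
get_multi_hot_atom_featurizer | Build the corresponding multi-hot atom featurizer.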
- class chemprop.featurizers.atom.MultiHotAtomFeaturizer(atomic_nums, degrees, formal_charges, chiral_tags, num_Hs, hybridizations)
Bases: chemprop.featurizers.base.VectorFeaturizer[rdkit.Chem.rdchem.Atom]
A MultiHotAtomFeaturizer uses a multi-hot encoding to featurize atoms.
See also
The class provides three default parameterization schemes: v1(), v2(), and organic().
The generated atom features are ordered as follows:
* atomic number
* degree
* formal charge
* chiral tag
* number of hydrogens
* hybridization
* aromaticity
* mass
Important
Each feature, except for aromaticity and mass, includes a pad for unknown values.
- Parameters:
atomic_nums (Sequence[int]) – the choices for atom type, denoted by atomic number. Ex: [6, 7, 8] for C, N and O.
degrees (Sequence[int]) – the choices for the number of bonds an atom is engaged in.
formal_charges (Sequence[int]) – the choices for the integer electronic charge assigned to an atom.
chiral_tags (Sequence[int]) – the choices for an atom’s chiral tag. See rdkit.Chem.rdchem.ChiralType for possible integer values.
num_Hs (Sequence[int]) – the choices for the number of bonded hydrogen atoms.
hybridizations (Sequence[int]) – the choices for an atom’s hybridization type. See rdkit.Chem.rdchem.HybridizationType for possible integer values.
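The feature ordering and padding described above can be illustrated with a small standalone sketch. This is not chemprop's implementation: `ToyAtom`, `one_hot_with_pad`, and `featurize` are hypothetical names, the integer choices are arbitrary, and placing the pad bit in the last slot of each block is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ToyAtom:
    """Hypothetical stand-in for an RDKit atom; not part of chemprop."""
    atomic_num: int
    degree: int
    formal_charge: int
    chiral_tag: int
    num_Hs: int
    hybridization: int
    is_aromatic: bool
    mass: float


def one_hot_with_pad(value: int, choices: Sequence[int]) -> list:
    """One-hot encode `value`; the final slot is the pad for unknown values."""
    bits = [0] * (len(choices) + 1)
    index = choices.index(value) if value in choices else len(choices)
    bits[index] = 1
    return bits


def featurize(atom: ToyAtom, choices: dict) -> list:
    """Concatenate the sub-features in the documented order."""
    x = []
    for name in ("atomic_num", "degree", "formal_charge",
                 "chiral_tag", "num_Hs", "hybridization"):
        x.extend(one_hot_with_pad(getattr(atom, name), choices[name]))
    x.append(1 if atom.is_aromatic else 0)  # aromaticity: one bit, no pad
    x.append(atom.mass)                     # mass: one value, no pad (any scaling omitted)
    return x


# Arbitrary illustrative choices, not one of the library's default schemes.
choices = {
    "atomic_num": [6, 7, 8],
    "degree": [0, 1, 2, 3, 4],
    "formal_charge": [-1, 0, 1],
    "chiral_tag": [0, 1, 2, 3],
    "num_Hs": [0, 1, 2, 3, 4],
    "hybridization": [2, 3, 4],
}
carbon = ToyAtom(6, 4, 0, 0, 0, 4, False, 12.011)
sulfur = ToyAtom(16, 2, 0, 0, 0, 4, False, 32.06)  # atomic number not in choices

carbon_features = featurize(carbon, choices)
sulfur_features = featurize(sulfur, choices)
```

Under these assumptions the vector length is the sum of `len(choices) + 1` over the six padded features, plus two for aromaticity and mass. The sulfur atom sets the pad bit of the atomic number block because 16 is not among the choices.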
- num_only(a)
Featurize the atom by setting only the atomic number bit.
- Parameters:
a (rdkit.Chem.rdchem.Atom)
- Return type:
numpy.ndarray
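A minimal sketch of what "setting only the atomic number bit" could look like, using plain lists rather than numpy. The standalone function below is hypothetical; it assumes the result has the full feature length, that the atomic number block comes first, and that the last slot of that block is the pad for unknown values.

```python
from typing import Sequence


def num_only(atomic_num: int, atomic_nums: Sequence[int], total_length: int) -> list:
    """Return a zero vector of the full feature length with one atomic number bit set."""
    x = [0] * total_length
    # Assumption: known atomic numbers map to their position in `atomic_nums`;
    # anything else falls into the pad slot right after the known choices.
    index = atomic_nums.index(atomic_num) if atomic_num in atomic_nums else len(atomic_nums)
    x[index] = 1
    return x


nitrogen = num_only(7, atomic_nums=[6, 7, 8], total_length=31)
unknown = num_only(16, atomic_nums=[6, 7, 8], total_length=31)
```

All other positions stay zero, so the vector keeps the same shape as a fully featurized atom while carrying only element identity.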
- classmethod v1(max_atomic_num=100)
The original implementation used in Chemprop V1 [1], [2].
- Parameters:
max_atomic_num (int, default=100) – Include a bit for all atomic numbers in the interval \([1, \mathtt{max\_atomic\_num}]\).
References
[1] Kelley, B.; Mathea, M.; Palmer, A. “Analyzing Learned Molecular Representations for Property Prediction.” J. Chem. Inf. Model. 2019, 59 (8), 3370–3388. https://doi.org/10.1021/acs.jcim.9b00237
[2] Heid, E.; Greenman, K.P.; Chung, Y.; Li, S.C.; Graff, D.E.; Vermeire, F.H.; Wu, H.; Green, W.H.; McGill, C.J. “Chemprop: A machine learning package for chemical property prediction.” J. Chem. Inf. Model. 2024, 64 (1), 9–17. https://doi.org/10.1021/acs.jcim.3c01250
- classmethod v2()
An implementation that includes an atom type bit for all elements in the first four rows of the periodic table plus iodine.
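To make the difference between the v1 and v2 atom type choices concrete, the sketch below enumerates the atomic numbers each description implies. The helper names are hypothetical and only the atomic number feature is covered; the other feature choices are not specified on this page.

```python
def v1_atomic_nums(max_atomic_num: int = 100) -> list:
    """All atomic numbers in the closed interval [1, max_atomic_num]."""
    return list(range(1, max_atomic_num + 1))


def v2_atomic_nums() -> list:
    """Rows 1-4 of the periodic table (H through Kr, Z = 1..36) plus iodine (Z = 53)."""
    return list(range(1, 37)) + [53]


# Each padded feature carries one extra bit for unknown values.
v1_bits = len(v1_atomic_nums()) + 1
v2_bits = len(v2_atomic_nums()) + 1
```

With the default `max_atomic_num=100`, the atomic number block would span 101 bits under v1 and 38 bits under v2, counting the pad in both cases.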
- classmethod organic()
A specific parameterization intended for use with organic or drug-like molecules.
This parameterization features:
* an atomic number bit only for H, B, C, N, O, F, Si, P, S, Cl, Br, and I atoms
* a hybridization bit for \(s, sp, sp^2\) and \(sp^3\) hybridizations
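As a rough illustration, the element list above translates into the following atomic numbers. The variable names are hypothetical, and the hybridization labels are written as the names of the corresponding `rdkit.Chem.rdchem.HybridizationType` members rather than their integer values.

```python
# Standard atomic numbers for the elements named in the organic scheme.
ORGANIC_ELEMENTS = {
    "H": 1, "B": 5, "C": 6, "N": 7, "O": 8, "F": 9,
    "Si": 14, "P": 15, "S": 16, "Cl": 17, "Br": 35, "I": 53,
}
atomic_nums = sorted(ORGANIC_ELEMENTS.values())

# s, sp, sp^2 and sp^3, by HybridizationType member name.
hybridizations = ["S", "SP", "SP2", "SP3"]

# Both features include a pad for unknown values, so each block is one bit longer.
atomic_num_bits = len(atomic_nums) + 1
hybridization_bits = len(hybridizations) + 1
```

This gives 13 atomic number bits and 5 hybridization bits, which is considerably more compact than a scheme covering every element up to a maximum atomic number.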
- class chemprop.featurizers.atom.AtomFeatureMode
Bases: chemprop.utils.utils.EnumMapping
The mode of an atom is used for featurization into a MolGraph.
- V1
- V2
- ORGANIC
- chemprop.featurizers.atom.get_multi_hot_atom_featurizer(mode)
Build the corresponding multi-hot atom featurizer.
- Parameters:
mode (str | AtomFeatureMode)
- Return type:
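One plausible way to map a `str | AtomFeatureMode` argument onto the three parameterizations is an enum lookup followed by a dispatch table. The sketch below is a standalone illustration and not the library's code: `Mode`, `_FACTORIES`, and `get_featurizer` are hypothetical stand-ins, the factories return placeholder strings instead of featurizer objects, and the case-insensitive string lookup is an assumption.

```python
from enum import Enum
from typing import Union


class Mode(Enum):
    """Hypothetical stand-in for AtomFeatureMode."""
    V1 = "V1"
    V2 = "V2"
    ORGANIC = "ORGANIC"


# Placeholder factories; a real version would call the v1(), v2(), and
# organic() classmethods documented above.
_FACTORIES = {
    Mode.V1: lambda: "featurizer built by v1()",
    Mode.V2: lambda: "featurizer built by v2()",
    Mode.ORGANIC: lambda: "featurizer built by organic()",
}


def get_featurizer(mode: Union[str, Mode]) -> str:
    """Resolve `mode` to an enum member, then build the matching featurizer."""
    if isinstance(mode, str):
        try:
            mode = Mode[mode.upper()]  # assumption: names are matched case-insensitively
        except KeyError:
            raise ValueError(f"unknown atom feature mode: {mode!r}") from None
    return _FACTORIES[mode]()


from_string = get_featurizer("organic")
from_enum = get_featurizer(Mode.ORGANIC)
```

Accepting either a string or an enum member keeps configuration files and command-line options simple while still giving typed code a closed set of valid modes.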