The generated atom features are ordered as follows:
* atomic number
* degree
* formal charge
* chiral tag
* number of hydrogens
* hybridization
* aromaticity
* mass
Important
Each feature, except for aromaticity and mass, includes a pad for unknown values.
Parameters:
atomic_nums (Sequence[int]) – the choices for atom type denoted by atomic number. Ex: [6, 7, 8] for C, N and O.
degrees (Sequence[int]) – the choices for number of bonds an atom is engaged in.
formal_charges (Sequence[int]) – the choices for integer electronic charge assigned to an atom.
chiral_tags (Sequence[int]) – the choices for an atom’s chiral tag. See rdkit.Chem.rdchem.ChiralType for possible integer values.
num_Hs (Sequence[int]) – the choices for number of bonded hydrogen atoms.
hybridizations (Sequence[int]) – the choices for an atom’s hybridization type. See rdkit.Chem.rdchem.HybridizationType for possible integer values.
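The "pad for unknown values" can be pictured as one extra slot appended to each one-hot block. The following is a minimal sketch of that idea in plain Python, not the library's implementation: `one_hot_with_pad` and `atom_feature_length` are hypothetical helper names, and the sketch assumes that aromaticity and mass each occupy exactly one entry.

```python
from __future__ import annotations

from typing import Sequence


def one_hot_with_pad(value: int, choices: Sequence[int]) -> list[int]:
    """One-hot encode `value` over `choices`, with a final slot for unknown values."""
    bits = [0] * (len(choices) + 1)
    # A value outside `choices` falls into the last (pad) slot.
    index = list(choices).index(value) if value in choices else len(choices)
    bits[index] = 1
    return bits


def atom_feature_length(*choice_lists: Sequence[int]) -> int:
    """Length of the full vector: each padded block, plus aromaticity and mass.

    Assumption: aromaticity and mass contribute one entry each and have no pad.
    """
    return sum(len(choices) + 1 for choices in choice_lists) + 2


# Illustrative choices only; real choices are whatever is passed to the featurizer.
atomic_nums = [6, 7, 8]  # C, N, O

carbon = one_hot_with_pad(6, atomic_nums)   # known value: first slot is set
sulfur = one_hot_with_pad(16, atomic_nums)  # unknown value: pad slot is set
```

With three atomic-number choices, each encoded block has four entries; an atom whose atomic number is not among the choices sets only the last one.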
The feature vectors produced by this featurizer have the following (general) signature:
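| slice [start, stop) | subfeature      | unknown pad? |
| ------------------- | --------------- | ------------ |
| 0-1                 | null?           | N            |
| 1-5                 | bond type       | N            |
| 5-6                 | conjugated?     | N            |
| 6-7                 | in ring?        | N            |
| 7-14                | stereochemistry | Y            |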
NOTE: the above signature only applies for the default arguments, as the bond type and
stereochemistry slices can increase in size depending on the input arguments.
Parameters:
bond_types (Sequence[BondType] | None, default=[SINGLE, DOUBLE, TRIPLE, AROMATIC]) – the known bond types
stereos (Sequence[int] | None, default=[0, 1, 2, 3, 4, 5]) – the known bond stereochemistries. See [1]_ for more details
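To see how the slice boundaries shift when non-default `bond_types` or `stereos` are supplied, the layout can be recomputed from the block widths. This is a sketch under stated assumptions rather than the library's code: `bond_feature_slices` is a hypothetical helper, the slices are assumed to be contiguous and non-overlapping, each boolean subfeature is assumed to occupy one entry, and only stereochemistry carries an unknown pad.

```python
from __future__ import annotations


def bond_feature_slices(n_bond_types: int = 4, n_stereos: int = 6) -> dict[str, tuple[int, int]]:
    """Return the [start, stop) slice of each bond subfeature."""
    widths = [
        ("null?", 1),
        ("bond type", n_bond_types),
        ("conjugated?", 1),
        ("in ring?", 1),
        ("stereochemistry", n_stereos + 1),  # +1 for the unknown pad
    ]
    slices = {}
    start = 0
    for name, width in widths:
        slices[name] = (start, start + width)
        start += width
    return slices


default_slices = bond_feature_slices()
# Hypothetical non-default input: five known bond types instead of four.
wider_slices = bond_feature_slices(n_bond_types=5)
```

Adding one bond type widens the bond type slice by one entry and shifts every later slice by the same amount.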
atom_featurizer (AtomFeaturizer, default=MultiHotAtomFeaturizer()) – the featurizer with which to calculate feature representations of the atoms in a given
molecule
bond_featurizer (BondFeaturizer, default=MultiHotBondFeaturizer()) – the featurizer with which to calculate feature representations of the bonds in a given
molecule
extra_atom_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated
features of each atom
extra_bond_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated
features of each bond
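The role of `extra_atom_fdim` and `extra_bond_fdim` can be illustrated with a small sketch: the extra features are appended to the calculated ones, so the final width is the calculated width plus the declared extra dimension. The helper `concat_extra` below is hypothetical and uses plain lists in place of arrays; the length check is an assumption about sensible behavior, not a statement about what the library does.

```python
from __future__ import annotations


def concat_extra(calculated: list[float], extra: list[float], extra_fdim: int) -> list[float]:
    """Append `extra` to `calculated`, checking it has the declared width."""
    if len(extra) != extra_fdim:
        raise ValueError(f"expected {extra_fdim} extra features, got {len(extra)}")
    return calculated + extra


calculated_atom_features = [1.0, 0.0, 0.0, 0.12]  # illustrative values only
extra_atom_features = [0.5, -1.0]                 # e.g. two per-atom descriptors
combined = concat_extra(calculated_atom_features, extra_atom_features, extra_fdim=2)
```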
NOTE: This class does not accept an AtomFeaturizer instance. This is because
it requires the num_only() method, which is only implemented in the concrete
AtomFeaturizer class.
Parameters:
atom_featurizer (AtomFeaturizer, default=AtomFeaturizer()) – the featurizer with which to calculate feature representations of the atoms in a given
molecule
bond_featurizer (BondFeaturizerBase, default=BondFeaturizer()) – the featurizer with which to calculate feature representations of the bonds in a given
molecule
mode (Union[str, ReactionMode], default=ReactionMode.REAC_DIFF) – the mode by which to featurize the reaction as either the string code or enum value
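Accepting "either the string code or enum value" is commonly handled by normalizing the argument once at construction time. The sketch below shows one way this could look. `Mode` is a stand-in for `ReactionMode`: only `REAC_DIFF` is named in the text above, so the second member is a labeled placeholder, and both `resolve_mode` and the case-insensitive lookup are assumptions.

```python
from __future__ import annotations

from enum import Enum


class Mode(Enum):
    """Stand-in enum; not the library's ReactionMode."""

    REAC_DIFF = "REAC_DIFF"
    PLACEHOLDER = "PLACEHOLDER"  # hypothetical second member, for illustration


def resolve_mode(mode: str | Mode) -> Mode:
    """Return the enum member for either a member or its string code."""
    if isinstance(mode, Mode):
        return mode
    try:
        return Mode[mode.upper()]  # assumption: string codes are case-insensitive
    except KeyError:
        valid = ", ".join(m.name for m in Mode)
        raise ValueError(f"unknown mode {mode!r}; expected one of: {valid}") from None


from_string = resolve_mode("reac_diff")
from_enum = resolve_mode(Mode.REAC_DIFF)
```

Both calls yield the same member, so downstream code only ever has to compare against enum values.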
Featurize the input reaction into a molecular graph.
Parameters:
rxn (Rxn) – a 2-tuple of atom-mapped rdkit molecules, where the 0th element is the reactant and the
1st element is the product
atom_features_extra (np.ndarray | None, default=None) – UNSUPPORTED; kept only for parity with the method signature of the
MoleculeFeaturizer
bond_features_extra (np.ndarray | None, default=None) – UNSUPPORTED; kept only for parity with the method signature of the
MoleculeFeaturizer
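Because `rxn` must be atom-mapped, atoms in the reactant and product can be paired by their shared map numbers. The following sketch illustrates that pairing on plain integer lists rather than RDKit molecules. `pair_mapped_atoms` is a hypothetical helper, not part of the featurizer; with RDKit, the map numbers would typically be read via `Atom.GetAtomMapNum()`, which returns 0 for an unmapped atom.

```python
from __future__ import annotations

from typing import Sequence


def pair_mapped_atoms(reac_map_nums: Sequence[int], prod_map_nums: Sequence[int]) -> dict[int, int]:
    """Map reactant atom index -> product atom index for shared nonzero map numbers."""
    # Map number 0 means "unmapped", so those atoms are left out of the lookup.
    prod_index_by_map = {m: j for j, m in enumerate(prod_map_nums) if m > 0}
    return {
        i: prod_index_by_map[m]
        for i, m in enumerate(reac_map_nums)
        if m in prod_index_by_map
    }


# Illustrative map numbers, listed in atom-index order for each molecule.
reactant_map_nums = [1, 2, 3, 0]  # the last reactant atom is unmapped
product_map_nums = [3, 1, 2]
reac_to_prod = pair_mapped_atoms(reactant_map_nums, product_map_nums)
```

Reactant atoms that do not appear in the result have no counterpart in the product under this pairing.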