chemprop.featurizers.molgraph#
Submodules#
Attributes#
Classes#
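MolGraphCache | A MolGraphCache precomputes the corresponding MolGraphs and caches them in memory.
MolGraphCacheFacade | A MolGraphCacheFacade provides an interface for caching MolGraphs.
MolGraphCacheOnTheFly | A MolGraphCacheOnTheFly computes the corresponding MolGraphs as they are requested.
CuikmolmakerMolGraphFeaturizer | A CuikmolmakerMolGraphFeaturizer featurizes a list of molecules at once instead of one molecule at a time for efficiency.
SimpleMoleculeMolGraphFeaturizer | A SimpleMoleculeMolGraphFeaturizer is the default implementation of a MoleculeMolGraphFeaturizer.
CondensedGraphOfReactionFeaturizer | A CondensedGraphOfReactionFeaturizer featurizes reactions using the condensed reaction graph method utilized in [1].
RxnMode | The mode by which a reaction should be featurized into a MolGraph.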
Package Contents#
- class chemprop.featurizers.molgraph.MolGraphCache(inputs, V_fs, E_fs, featurizer, n_workers=0)[source]#
Bases: MolGraphCacheFacade
A MolGraphCache precomputes the corresponding MolGraphs and caches them in memory.
- Parameters:
inputs (Iterable[chemprop.featurizers.base.S])
V_fs (Iterable[numpy.ndarray | None])
E_fs (Iterable[numpy.ndarray | None])
featurizer (chemprop.featurizers.base.Featurizer[chemprop.featurizers.base.S, chemprop.data.molgraph.MolGraph])
n_workers (int)
- class chemprop.featurizers.molgraph.MolGraphCacheFacade(inputs, V_fs, E_fs, featurizer)[source]#
Bases: collections.abc.Sequence[chemprop.data.molgraph.MolGraph], Generic[chemprop.featurizers.base.S]
A MolGraphCacheFacade provides an interface for caching MolGraphs.
Note
This class only provides a facade for a cached dataset, but it does not guarantee whether the underlying data is truly cached.
- Parameters:
inputs (Iterable[S]) – The inputs to be featurized.
V_fs (Iterable[np.ndarray]) – The node features for each input.
E_fs (Iterable[np.ndarray]) – The edge features for each input.
featurizer (Featurizer[S, MolGraph]) – The featurizer with which to generate the MolGraphs.
- class chemprop.featurizers.molgraph.MolGraphCacheOnTheFly(inputs, V_fs, E_fs, featurizer)[source]#
Bases: MolGraphCacheFacade
A MolGraphCacheOnTheFly computes the corresponding MolGraphs as they are requested.
- Parameters:
inputs (Iterable[chemprop.featurizers.base.S])
V_fs (Iterable[numpy.ndarray | None])
E_fs (Iterable[numpy.ndarray | None])
featurizer (chemprop.featurizers.base.Featurizer[chemprop.featurizers.base.S, chemprop.data.molgraph.MolGraph])
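The difference between the two cache classes can be illustrated with a minimal, standard-library-only sketch. The names `EagerCache`, `LazyCache`, and `toy_featurizer` are hypothetical stand-ins, not part of chemprop, and the sketch omits the `V_fs` and `E_fs` arguments; it only shows when the featurizer is assumed to run in each strategy.

```python
from collections.abc import Sequence
from typing import Callable, Iterable


class EagerCache(Sequence):
    """Toy analogue of a precomputing cache: featurize every input up front."""

    def __init__(self, inputs: Iterable[str], featurizer: Callable[[str], dict]):
        self._graphs = [featurizer(x) for x in inputs]

    def __len__(self) -> int:
        return len(self._graphs)

    def __getitem__(self, idx: int) -> dict:
        return self._graphs[idx]


class LazyCache(Sequence):
    """Toy analogue of an on-the-fly cache: featurize on every access."""

    def __init__(self, inputs: Iterable[str], featurizer: Callable[[str], dict]):
        self._inputs = list(inputs)
        self._featurizer = featurizer

    def __len__(self) -> int:
        return len(self._inputs)

    def __getitem__(self, idx: int) -> dict:
        return self._featurizer(self._inputs[idx])


calls = []  # records every featurizer invocation


def toy_featurizer(smiles: str) -> dict:
    """Hypothetical featurizer: stands in for a real MolGraph featurizer."""
    calls.append(smiles)
    return {"n_chars": len(smiles)}


eager = EagerCache(["CCO", "c1ccccc1"], toy_featurizer)
calls_after_eager_init = len(calls)  # both inputs featurized at construction

lazy = LazyCache(["CCO", "c1ccccc1"], toy_featurizer)
calls_after_lazy_init = len(calls)  # construction alone featurizes nothing

lazy[0]
lazy[0]
calls_after_lazy_reads = len(calls)  # each access recomputes the result
```

The eager variant trades memory for repeated-access speed, while the lazy variant keeps memory low at the cost of recomputing on each request.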
- class chemprop.featurizers.molgraph.CuikmolmakerMolGraphFeaturizer[source]#
Bases: chemprop.featurizers.base.Featurizer[list[str], BatchCuikMolGraph]
A CuikmolmakerMolGraphFeaturizer featurizes a list of molecules at once instead of one molecule at a time for efficiency.
- Parameters:
atom_featurizer_mode (str, default="V2") – The mode of the atom featurizer (V1, V2, ORGANIC, RIGR) to use.
extra_atom_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated features of each atom
extra_bond_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated features of each bond
add_h (bool, default=False) – whether to add hydrogens to the Chem.Mol objects created from the input SMILES strings
- atom_featurizer_mode: Literal['V1', 'V2', 'ORGANIC', 'RIGR'] = 'V2'#
- extra_atom_fdim: int = 0#
- extra_bond_fdim: int = 0#
- add_h: bool = False#
- atom_fdim: int#
- bond_fdim: int#
- class chemprop.featurizers.molgraph.SimpleMoleculeMolGraphFeaturizer[source]#
Bases: chemprop.featurizers.molgraph.mixins._MolGraphFeaturizerMixin, chemprop.featurizers.base.GraphFeaturizer[rdkit.Chem.Mol]
A SimpleMoleculeMolGraphFeaturizer is the default implementation of a MoleculeMolGraphFeaturizer.
- Parameters:
atom_featurizer (AtomFeaturizer, default=MultiHotAtomFeaturizer()) – the featurizer with which to calculate feature representations of the atoms in a given molecule
bond_featurizer (BondFeaturizer, default=MultiHotBondFeaturizer()) – the featurizer with which to calculate feature representations of the bonds in a given molecule
extra_atom_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated features of each atom
extra_bond_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated features of each bond
- extra_atom_fdim: int = 0#
- extra_bond_fdim: int = 0#
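To make the role of `extra_atom_fdim` concrete, here is a small sketch of how extra per-atom features might be concatenated onto calculated ones. The helper `append_extra` is hypothetical and uses plain lists rather than arrays; the length check is an assumption about sensible behavior, not a statement of what the library does.

```python
from typing import Optional


def append_extra(
    V: list[list[float]], V_extra: Optional[list[list[float]]]
) -> list[list[float]]:
    """Concatenate extra per-atom features onto calculated atom features.

    Hypothetical helper: each row of V holds one atom's calculated features,
    and each row of V_extra holds that atom's additional features.
    """
    if V_extra is None:
        return [list(row) for row in V]
    if len(V) != len(V_extra):
        raise ValueError(
            f"expected {len(V)} rows of extra features, got {len(V_extra)}"
        )
    return [list(a) + list(b) for a, b in zip(V, V_extra)]


# Two atoms, three calculated features each, plus two extra features each.
V = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
V_extra = [[0.5, 0.1], [0.2, 0.9]]
combined = append_extra(V, V_extra)

# Final per-atom width = calculated width + extra_atom_fdim (here 3 + 2).
atom_fdim = len(combined[0])
```

The same pattern would apply to bonds with `extra_bond_fdim`.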
- type chemprop.featurizers.molgraph.CGRFeaturizer = CondensedGraphOfReactionFeaturizer#
- class chemprop.featurizers.molgraph.CondensedGraphOfReactionFeaturizer[source]#
Bases: chemprop.featurizers.molgraph.mixins._MolGraphFeaturizerMixin, chemprop.featurizers.base.GraphFeaturizer[chemprop.types.Rxn]
A CondensedGraphOfReactionFeaturizer featurizes reactions using the condensed reaction graph method utilized in [1].
NOTE: This class does not accept an AtomFeaturizer instance. This is because it requires the num_only() method, which is only implemented in the concrete AtomFeaturizer class.
- Parameters:
atom_featurizer (AtomFeaturizer, default=AtomFeaturizer()) – the featurizer with which to calculate feature representations of the atoms in a given molecule
bond_featurizer (BondFeaturizerBase, default=BondFeaturizer()) – the featurizer with which to calculate feature representations of the bonds in a given molecule
mode (Union[str, RxnMode], default=RxnMode.REAC_DIFF) – the mode by which to featurize the reaction, given as either the string code or the enum value
References
- __call__(rxn, atom_features_extra=None, bond_features_extra=None)[source]#
Featurize the input reaction into a molecular graph
- Parameters:
rxn (Rxn) – a 2-tuple of atom-mapped rdkit molecules, where the 0th element is the reactant and the 1st element is the product
atom_features_extra (np.ndarray | None, default=None) – UNSUPPORTED; retained only for parity with the method signature of the MoleculeFeaturizer
bond_features_extra (np.ndarray | None, default=None) – UNSUPPORTED; retained only for parity with the method signature of the MoleculeFeaturizer
- Returns:
the molecular graph of the reaction
- Return type:
- classmethod map_reac_to_prod(reacs, pdts)[source]#
Map atom indices between corresponding atoms in the reactant and product molecules
- Parameters:
reacs (Chem.Mol) – An RDKit molecule of the reactants
pdts (Chem.Mol) – An RDKit molecule of the products
- Returns:
ri2pi (dict[int, int]) – A dictionary of corresponding atom indices from reactant atoms to product atoms
pdt_idxs (list[int]) – atom indices of product atoms
rct_idxs (list[int]) – atom indices of reactant atoms
- Return type:
tuple[dict[int, int], list[int], list[int]]
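The mapping logic can be sketched without RDKit by representing each molecule as a list of atom-map numbers, one per atom index, with 0 meaning "unmapped". The function `map_by_atom_map_numbers` below is a hypothetical illustration. It assumes that `pdt_idxs` and `rct_idxs` collect the atoms that have no mapped counterpart on the other side, which the description above does not state explicitly.

```python
def map_by_atom_map_numbers(
    reac_maps: list[int], prod_maps: list[int]
) -> tuple[dict[int, int], list[int], list[int]]:
    """Pair reactant and product atom indices via shared atom-map numbers.

    Hypothetical sketch: reac_maps[i] is the atom-map number of reactant
    atom i, and prod_maps[j] is that of product atom j (0 = unmapped).
    """
    reac_map_nums = set(reac_maps)

    mapno2pj: dict[int, int] = {}  # atom-map number -> product atom index
    pdt_idxs: list[int] = []       # product atoms with no reactant partner
    for j, map_num in enumerate(prod_maps):
        if map_num > 0:
            mapno2pj[map_num] = j
            if map_num not in reac_map_nums:
                pdt_idxs.append(j)
        else:
            pdt_idxs.append(j)

    ri2pi: dict[int, int] = {}     # reactant atom index -> product atom index
    rct_idxs: list[int] = []       # reactant atoms with no product partner
    for i, map_num in enumerate(reac_maps):
        if map_num > 0 and map_num in mapno2pj:
            ri2pi[i] = mapno2pj[map_num]
        else:
            rct_idxs.append(i)

    return ri2pi, pdt_idxs, rct_idxs


# Reactant atoms carry map numbers 1, 2, (unmapped), 3; product atoms 2, 1, 4.
ri2pi, pdt_idxs, rct_idxs = map_by_atom_map_numbers([1, 2, 0, 3], [2, 1, 4])
```

With real molecules, the map numbers would come from each atom of the atom-mapped RDKit molecules rather than from hand-written lists.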
- class chemprop.featurizers.molgraph.RxnMode[source]#
Bases: chemprop.utils.utils.EnumMapping
The mode by which a reaction should be featurized into a MolGraph
- REAC_PROD#
concatenates the reactant features with the product features
- REAC_PROD_BALANCE#
concatenates the reactant features with the product features and balances imbalanced reactions
- REAC_DIFF#
concatenates the reactant features with the difference in features between reactants and products
- REAC_DIFF_BALANCE#
concatenates the reactant features with the difference in features between reactants and products and balances imbalanced reactions
- PROD_DIFF#
concatenates the product features with the difference in features between reactants and products
- PROD_DIFF_BALANCE#
concatenates the product features with the difference in features between reactants and products and balances imbalanced reactions
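The three base modes can be illustrated on a single atom's feature vector. The function `combine` below is a hypothetical, simplified sketch using plain lists. It assumes the difference is taken as product minus reactant, concatenates the full vectors, and does not model the balancing step of the `_BALANCE` variants.

```python
def combine(reac: list[float], prod: list[float], mode: str) -> list[float]:
    """Combine one atom's reactant and product features under a given mode.

    Hypothetical sketch: `diff` is assumed to be product minus reactant.
    """
    if len(reac) != len(prod):
        raise ValueError("reactant and product features must have equal length")
    diff = [p - r for r, p in zip(reac, prod)]
    if mode == "REAC_PROD":
        return reac + prod
    if mode == "REAC_DIFF":
        return reac + diff
    if mode == "PROD_DIFF":
        return prod + diff
    raise ValueError(f"unsupported mode in this sketch: {mode!r}")


reac_feats = [1, 0, 2]
prod_feats = [1, 1, 0]

reac_prod = combine(reac_feats, prod_feats, "REAC_PROD")
reac_diff = combine(reac_feats, prod_feats, "REAC_DIFF")
prod_diff = combine(reac_feats, prod_feats, "PROD_DIFF")
```

In every mode the combined vector is twice the width of a single feature vector in this sketch; the modes differ only in which two halves are joined.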