chemprop.featurizers.molgraph#

Package Contents#

Classes#

MolGraphCacheFacade

A MolGraphCacheFacade provides an interface for caching MolGraphs.

MolGraphCache

A MolGraphCache precomputes the corresponding MolGraphs and caches them in memory.

MolGraphCacheOnTheFly

A MolGraphCacheOnTheFly computes the corresponding MolGraphs as they are requested.

SimpleMoleculeMolGraphFeaturizer

A SimpleMoleculeMolGraphFeaturizer is the default implementation of a MoleculeMolGraphFeaturizer.

CondensedGraphOfReactionFeaturizer

A CondensedGraphOfReactionFeaturizer featurizes reactions using the condensed reaction graph method.

RxnMode

The mode by which a reaction should be featurized into a MolGraph

Attributes#

CGRFeaturizer

class chemprop.featurizers.molgraph.MolGraphCacheFacade(inputs, V_fs, E_fs, featurizer)[source]#

Bases: collections.abc.Sequence[chemprop.data.molgraph.MolGraph], Generic[chemprop.featurizers.base.S]

A MolGraphCacheFacade provides an interface for caching MolGraphs.

Note

This class only provides a facade for a cached dataset; it _does not guarantee_ that the underlying data is actually cached.

Parameters:
  • inputs (Iterable[S]) – The inputs to be featurized.

  • V_fs (Iterable[np.ndarray]) – The node features for each input.

  • E_fs (Iterable[np.ndarray]) – The edge features for each input.

  • featurizer (Featurizer[S, MolGraph]) – The featurizer with which to generate the MolGraphs.

class chemprop.featurizers.molgraph.MolGraphCache(inputs, V_fs, E_fs, featurizer)[source]#

Bases: MolGraphCacheFacade

A MolGraphCache precomputes the corresponding MolGraphs and caches them in memory.

__len__()[source]#
Return type:

int

__getitem__(index)[source]#
Parameters:

index (int)

Return type:

chemprop.data.molgraph.MolGraph

class chemprop.featurizers.molgraph.MolGraphCacheOnTheFly(inputs, V_fs, E_fs, featurizer)[source]#

Bases: MolGraphCacheFacade

A MolGraphCacheOnTheFly computes the corresponding MolGraphs as they are requested.

__len__()[source]#
Return type:

int

__getitem__(index)[source]#
Parameters:

index (int)

Return type:

chemprop.data.molgraph.MolGraph
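
The difference between the two cache classes above can be illustrated with a minimal, standard-library sketch. The names `EagerCache`, `LazyCache`, and `toy_featurizer` are hypothetical stand-ins, not chemprop classes, and the assumption that the featurizer is called as `featurizer(input, V_f, E_f)` is inferred from the constructor parameters rather than confirmed by this page.

```python
from collections.abc import Sequence


class EagerCache(Sequence):
    """Featurize every input at construction time and keep the results in memory."""

    def __init__(self, inputs, V_fs, E_fs, featurizer):
        # Assumption: the featurizer takes one input plus its extra node/edge features.
        self._graphs = [featurizer(x, V_f, E_f) for x, V_f, E_f in zip(inputs, V_fs, E_fs)]

    def __len__(self):
        return len(self._graphs)

    def __getitem__(self, index):
        return self._graphs[index]


class LazyCache(Sequence):
    """Store only the inputs and featurize again on every access."""

    def __init__(self, inputs, V_fs, E_fs, featurizer):
        self._inputs = list(inputs)
        self._V_fs = list(V_fs)
        self._E_fs = list(E_fs)
        self._featurizer = featurizer

    def __len__(self):
        return len(self._inputs)

    def __getitem__(self, index):
        return self._featurizer(self._inputs[index], self._V_fs[index], self._E_fs[index])


calls = []  # records every featurizer invocation


def toy_featurizer(smiles, V_f, E_f):
    """Hypothetical featurizer that returns a dict instead of a MolGraph."""
    calls.append(smiles)
    return {"input": smiles, "n_chars": len(smiles)}


smiles = ["CCO", "C"]
extras = [None, None]  # no extra node or edge features in this sketch

eager = EagerCache(smiles, extras, extras, toy_featurizer)
calls_after_eager_init = len(calls)   # all inputs featurized up front
eager[0], eager[0]
calls_after_eager_reads = len(calls)  # reads do not trigger more work

lazy = LazyCache(smiles, extras, extras, toy_featurizer)
calls_after_lazy_init = len(calls)    # nothing featurized yet
lazy[0], lazy[0]
calls_after_lazy_reads = len(calls)   # each read featurizes again
```

The eager variant trades memory for faster repeated access; the lazy variant keeps memory low but repeats the featurization on every read.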

class chemprop.featurizers.molgraph.SimpleMoleculeMolGraphFeaturizer[source]#

Bases: chemprop.featurizers.molgraph.mixins._MolGraphFeaturizerMixin, chemprop.featurizers.base.GraphFeaturizer[rdkit.Chem.Mol]

A SimpleMoleculeMolGraphFeaturizer is the default implementation of a MoleculeMolGraphFeaturizer.

Parameters:
  • atom_featurizer (AtomFeaturizer, default=MultiHotAtomFeaturizer()) – the featurizer with which to calculate feature representations of the atoms in a given molecule

  • bond_featurizer (BondFeaturizer, default=MultiHotBondFeaturizer()) – the featurizer with which to calculate feature representations of the bonds in a given molecule

  • extra_atom_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated features of each atom

  • extra_bond_fdim (int, default=0) – the dimension of the additional features that will be concatenated onto the calculated features of each bond

extra_atom_fdim: dataclasses.InitVar[int] = 0#
extra_bond_fdim: dataclasses.InitVar[int] = 0#
__post_init__(extra_atom_fdim=0, extra_bond_fdim=0)[source]#
Parameters:
  • extra_atom_fdim (int)

  • extra_bond_fdim (int)

__call__(mol, atom_features_extra=None, bond_features_extra=None)[source]#
Parameters:
  • mol (rdkit.Chem.Mol)

  • atom_features_extra (numpy.ndarray | None)

  • bond_features_extra (numpy.ndarray | None)

Return type:

chemprop.data.molgraph.MolGraph
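
The `dataclasses.InitVar` declarations and `__post_init__` signature above follow a standard dataclass pattern: init-only values are accepted by the constructor and handed to `__post_init__`, but are not stored as fields. The sketch below shows that pattern in isolation. `ToyFeaturizer`, `base_atom_fdim`, and `base_bond_fdim` are hypothetical, and adding the extra dimensions to a stored total is an assumption about how the values might be used, not a statement of the library's implementation.

```python
from dataclasses import InitVar, dataclass, field, fields


@dataclass
class ToyFeaturizer:
    # Hypothetical base sizes; in a real featurizer these would come from
    # the atom and bond featurizers.
    base_atom_fdim: int = 4
    base_bond_fdim: int = 2
    # Init-only pseudo-fields: accepted by __init__, passed to __post_init__,
    # and not recorded as dataclass fields.
    extra_atom_fdim: InitVar[int] = 0
    extra_bond_fdim: InitVar[int] = 0
    # Derived totals, computed in __post_init__ rather than passed in.
    atom_fdim: int = field(init=False)
    bond_fdim: int = field(init=False)

    def __post_init__(self, extra_atom_fdim: int = 0, extra_bond_fdim: int = 0):
        # Assumption: extra dimensions simply widen the per-atom/per-bond vectors.
        self.atom_fdim = self.base_atom_fdim + extra_atom_fdim
        self.bond_fdim = self.base_bond_fdim + extra_bond_fdim


toy = ToyFeaturizer(extra_atom_fdim=3)
field_names = [f.name for f in fields(toy)]
```

Here `toy.atom_fdim` reflects the three extra dimensions, while `extra_atom_fdim` itself does not appear among the dataclass fields.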

class chemprop.featurizers.molgraph.CondensedGraphOfReactionFeaturizer[source]#

Bases: chemprop.featurizers.molgraph.mixins._MolGraphFeaturizerMixin, chemprop.featurizers.base.GraphFeaturizer[chemprop.types.Rxn]

A CondensedGraphOfReactionFeaturizer featurizes reactions using the condensed reaction graph method used in [1].

NOTE: This class does not accept an AtomFeaturizer instance. This is because it requires the num_only() method, which is only implemented in the concrete AtomFeaturizer class.

Parameters:
  • atom_featurizer (AtomFeaturizer, default=AtomFeaturizer()) – the featurizer with which to calculate feature representations of the atoms in a given molecule

  • bond_featurizer (BondFeaturizerBase, default=BondFeaturizer()) – the featurizer with which to calculate feature representations of the bonds in a given molecule

  • mode (Union[str, RxnMode], default=RxnMode.REAC_DIFF) – the mode by which to featurize the reaction, given as either the string code or the enum value

References

property mode: RxnMode#
Return type:

RxnMode

mode_: dataclasses.InitVar[str | RxnMode]#
__post_init__(mode_)[source]#
Parameters:

mode_ (str | RxnMode)

__call__(rxn, atom_features_extra=None, bond_features_extra=None)[source]#

Featurize the input reaction into a molecular graph

Parameters:
  • rxn (Rxn) – a 2-tuple of atom-mapped rdkit molecules, where the 0th element is the reactant and the 1st element is the product

  • atom_features_extra (np.ndarray | None, default=None) – UNSUPPORTED; retained only for parity with the method signature of the MoleculeFeaturizer

  • bond_features_extra (np.ndarray | None, default=None) – UNSUPPORTED; retained only for parity with the method signature of the MoleculeFeaturizer

Returns:

the molecular graph of the reaction

Return type:

MolGraph

classmethod map_reac_to_prod(reacs, pdts)[source]#

Map atom indices between corresponding atoms in the reactant and product molecules

Parameters:
  • reacs (Chem.Mol) – An RDKit molecule of the reactants

  • pdts (Chem.Mol) – An RDKit molecule of the products

Returns:

  • ri2pi (dict[int, int]) – A dictionary of corresponding atom indices from reactant atoms to product atoms

  • pdt_idxs (list[int]) – atom indices of product atoms

  • rct_idxs (list[int]) – atom indices of reactant atoms

Return type:

tuple[dict[int, int], list[int], list[int]]
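
The shape of this return value can be sketched without RDKit by working on plain lists of atom-map numbers, where the list position is the atom index and `0` means "unmapped". This is an illustrative sketch only: `map_by_atom_map_number` is a hypothetical helper, and it assumes that `pdt_idxs` and `rct_idxs` collect the atoms that have no mapped counterpart on the other side of the reaction, which this page does not state explicitly.

```python
def map_by_atom_map_number(reac_map_nums, prod_map_nums):
    """Return (ri2pi, pdt_idxs, rct_idxs) from per-atom map numbers (0 = unmapped)."""
    reac_nums = {n for n in reac_map_nums if n > 0}

    mapno_to_pj = {}  # atom-map number -> product atom index
    pdt_idxs = []     # assumption: product atoms with no reactant counterpart
    for j, n in enumerate(prod_map_nums):
        if n > 0:
            mapno_to_pj[n] = j
        if n == 0 or n not in reac_nums:
            pdt_idxs.append(j)

    ri2pi = {}        # reactant atom index -> product atom index
    rct_idxs = []     # assumption: reactant atoms with no product counterpart
    for i, n in enumerate(reac_map_nums):
        if n > 0 and n in mapno_to_pj:
            ri2pi[i] = mapno_to_pj[n]
        else:
            rct_idxs.append(i)

    return ri2pi, pdt_idxs, rct_idxs


# Reactant atoms carry map numbers 1, 2, 3; product atoms carry 2, 1 and one unmapped atom.
ri2pi, pdt_idxs, rct_idxs = map_by_atom_map_number([1, 2, 3], [2, 1, 0])
```

In this toy input, reactant atoms 0 and 1 are matched to product atoms 1 and 0, while reactant atom 2 and product atom 2 are left unmatched.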

chemprop.featurizers.molgraph.CGRFeaturizer: TypeAlias#

class chemprop.featurizers.molgraph.RxnMode[source]#

Bases: chemprop.utils.utils.EnumMapping

The mode by which a reaction should be featurized into a MolGraph

REAC_PROD#

concatenates the reactant features with the product features

REAC_PROD_BALANCE#

concatenates the reactant features with the product features and balances imbalanced reactions

REAC_DIFF#

concatenates the reactant features with the difference in features between reactants and products

REAC_DIFF_BALANCE#

concatenates the reactant features with the difference in features between reactants and products and balances imbalanced reactions

PROD_DIFF#

concatenates the product features with the difference in features between reactants and products

PROD_DIFF_BALANCE#

concatenates the product features with the difference in features between reactants and products and balances imbalanced reactions
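
The three unbalanced modes can be illustrated on a single pair of per-atom feature vectors. The sketch below is a toy illustration under stated assumptions: `combine` is a hypothetical helper, the difference is taken as product minus reactant (the sign convention is not given on this page), and the `*_BALANCE` variants are omitted because the balancing procedure is not described here.

```python
def combine(reac, prod, mode):
    """Concatenate reactant/product feature vectors according to a mode name."""
    # Assumption: the difference is product minus reactant.
    diff = [p - r for r, p in zip(reac, prod)]
    if mode == "REAC_PROD":
        return reac + prod
    if mode == "REAC_DIFF":
        return reac + diff
    if mode == "PROD_DIFF":
        return prod + diff
    raise ValueError(f"unsupported mode in this sketch: {mode}")


reac_features = [1.0, 0.0, 2.0]  # toy features of one atom in the reactant
prod_features = [1.0, 1.0, 0.0]  # toy features of the same atom in the product

reac_prod = combine(reac_features, prod_features, "REAC_PROD")
reac_diff = combine(reac_features, prod_features, "REAC_DIFF")
prod_diff = combine(reac_features, prod_features, "PROD_DIFF")
```

In every mode the combined vector is twice as long as a single-side feature vector; only the second half changes between modes.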