Reaction MolGraph featurizers#
[1]:
from chemprop.featurizers.molgraph.reaction import CondensedGraphOfReactionFeaturizer
This is an example reaction to featurize. The sanitizing code is to preserve atom mapped hydrogens in the graph.
[2]:
from rdkit import Chem
rct = Chem.MolFromSmiles("[H:1][C:4]([H:2])([H:3])[F:5]", sanitize=False)
pdt = Chem.MolFromSmiles("[H:1][C:4]([H:2])([H:3]).[F:5]", sanitize=False)
Chem.SanitizeMol(
rct, sanitizeOps=Chem.SanitizeFlags.SANITIZE_ALL ^ Chem.SanitizeFlags.SANITIZE_ADJUSTHS
)
Chem.SanitizeMol(
pdt, sanitizeOps=Chem.SanitizeFlags.SANITIZE_ALL ^ Chem.SanitizeFlags.SANITIZE_ADJUSTHS
)
rxn = (rct, pdt)
Condensed Graph of Reaction featurizer#
Like a molecule MolGraph featurizer, reaction MolGraph featurizers produce a MolGraph. The difference between the molecule and reaction versions is that a reaction takes two rdkit.Chem.Mol objects and need to know what “mode” of featurization to use. Available modes are found in RxnMode.
[3]:
from chemprop.featurizers import RxnMode
for mode in RxnMode:
print(mode)
reac_prod
reac_prod_balance
reac_diff
reac_diff_balance
prod_diff
prod_diff_balance
Briefly, “reac” stands for reactant features, “prod” stands for product features, and “diff” stands for the difference between reactant and product features. The two sets of features are concatenated together. “balance” refers to balancing imablanced reactions. See the 2022 paper by Heid and Green for more details. “reac_diff” is the default.
[4]:
reac_diff = CondensedGraphOfReactionFeaturizer()
reac_prod = CondensedGraphOfReactionFeaturizer(mode_="reac_prod")
[5]:
reac_diff(rxn).E
[5]:
array([[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, -1,
0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, -1,
0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0]])
[6]:
reac_prod(rxn).E
[6]:
array([[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0]])
Custom#
Like molecule MolGraph featurizers, reaction featurizers can use custom atom and bond featurizers.
[7]:
from chemprop.featurizers import MultiHotAtomFeaturizer
atom_featurizer = MultiHotAtomFeaturizer.organic()
rxn_featurizer = CondensedGraphOfReactionFeaturizer(atom_featurizer=atom_featurizer)
rxn_featurizer(rxn)
[7]:
MolGraph(V=array([[ 1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0.01008, 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 1. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 1. , 0. ,
0. , 0.12011, 0. , 0. , 0. , 1. ,
-1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0.01008, 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0.01008, 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. , 1. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 1. , 0. ,
0. , 0.18998, 1. , -1. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ]]), E=array([[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, -1,
0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, -1,
0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0]]), edge_index=array([[0, 1, 1, 2, 1, 3, 1, 4],
[1, 0, 2, 1, 3, 1, 4, 1]]), rev_edge_index=array([1, 0, 3, 2, 5, 4, 7, 6]))
Extra atom and bond features#
Extra atom and bond features are not yet supported for reactions.
[8]:
rxn_featurizer(rxn, atom_features_extra=[1.0])
'atom_features_extra' is currently unsupported for reactions
[8]:
MolGraph(V=array([[ 1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0.01008, 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 1. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 1. , 0. ,
0. , 0.12011, 0. , 0. , 0. , 1. ,
-1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0.01008, 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 1. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0.01008, 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. , 1. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
1. , 0. , 1. , 0. , 0. , 0. ,
0. , 1. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 1. , 0. ,
0. , 0.18998, 1. , -1. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ]]), E=array([[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, -1,
0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, -1,
0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0]]), edge_index=array([[0, 1, 1, 2, 1, 3, 1, 4],
[1, 0, 2, 1, 3, 1, 4, 1]]), rev_edge_index=array([1, 0, 3, 2, 5, 4, 7, 6]))