Bond featurizers#
[1]:
from chemprop.featurizers.bond import MultiHotBondFeaturizer
This is an example bond to featurize.
[2]:
from rdkit import Chem
bond_to_featurize = Chem.MolFromSmiles("CC").GetBondBetweenAtoms(0, 1)
Bond features#
The following bond features are generated by RDKit and cast to one-hot vectors (except for the initial null bit, which is True or False depending on whether the bond is None). These feature vectors are concatenated into a single multi-hot feature vector. Only the stereochemistry vector is padded with an extra slot for unknown values.
null?
bond type
conjugated?
in ring?
stereochemistry
[3]:
featurizer = MultiHotBondFeaturizer()
featurizer(bond_to_featurize)
[3]:
array([0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0])
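To make the layout of this vector easier to read, the following sketch splits it back into its named segments using only NumPy. The segment widths are an assumption inferred from this page (1 null bit, 4 default bond types, 1 conjugation bit, 1 ring bit, and 6 default stereo values plus 1 padding slot); the helper `split_features` is hypothetical and not part of chemprop.

```python
import numpy as np

# Output copied from the cell above (C-C single bond in ethane).
features = np.array([0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0])

# Assumed segment widths, inferred from the defaults described on this page.
segment_sizes = {
    "null": 1,
    "bond_type": 4,   # Single, Double, Triple, Aromatic
    "conjugated": 1,
    "in_ring": 1,
    "stereo": 7,      # 6 stereo values + 1 padding slot for unknown values
}


def split_features(vector, sizes):
    """Split a multi-hot vector into named segments (hypothetical helper)."""
    # Cumulative widths give the indices at which each new segment starts.
    boundaries = np.cumsum(list(sizes.values()))[:-1]
    return dict(zip(sizes.keys(), np.split(vector, boundaries)))


segments = split_features(features, segment_sizes)
for name, bits in segments.items():
    print(name, bits.tolist())
```

Under these assumptions, the set bit in the bond type segment falls in the first slot (Single), and the set bit in the stereo segment falls in the slot for stereo value 0.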
Custom#
The bond types and stereochemistry can be customized. The defaults are:
bond_types
Single, Double, Triple, Aromatic
stereos
0, 1, 2, 3, 4, 5 - See rdkit.Chem.rdchem.BondStereo for more details
[4]:
from rdkit.Chem.rdchem import BondType
featurizer = MultiHotBondFeaturizer(bond_types=[BondType.SINGLE], stereos=[0, 1, 2])
featurizer(bond_to_featurize)
[4]:
array([0, 1, 0, 0, 1, 0, 0, 0])
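The change in output length from 14 to 8 follows from the number of choices passed in. As a rough check, the sketch below computes the expected length from the counts of bond types and stereo values. The function `multi_hot_bond_length` is a hypothetical helper, and the formula is an assumption based on the feature list earlier on this page, not chemprop's own code.

```python
def multi_hot_bond_length(n_bond_types, n_stereos):
    """Expected multi-hot vector length (hypothetical helper, assumed layout)."""
    null_bit = 1
    conjugated_bit = 1
    in_ring_bit = 1
    stereo_padding = 1  # only the stereochemistry segment is padded
    return (
        null_bit
        + n_bond_types
        + conjugated_bit
        + in_ring_bit
        + n_stereos
        + stereo_padding
    )


# Defaults: 4 bond types and 6 stereo values.
default_length = multi_hot_bond_length(4, 6)
# Customized cell above: 1 bond type and 3 stereo values.
custom_length = multi_hot_bond_length(1, 3)
print(default_length, custom_length)  # 14 8
```

Both values agree with the lengths of the arrays printed in the cells above.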
Generic#
Any class that defines a length (__len__) and returns a numpy array when called with an rdkit.Chem.rdchem.Bond can be used as a bond featurizer.
[5]:
from rdkit.Chem.rdchem import Bond
import numpy as np
class MyBondFeaturizer:
    def __len__(self):
        return 1

    def __call__(self, a: Bond):
        return np.array([a.GetIsConjugated()], dtype=float)
featurizer = MyBondFeaturizer()
featurizer(bond_to_featurize)
[5]:
array([0.])
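Before plugging in a custom featurizer, it can help to check that it satisfies the two requirements stated above. The sketch below is one possible check. `check_bond_featurizer` and `StubBond` are hypothetical names introduced here for illustration; `StubBond` only mimics the `GetIsConjugated` method so the example runs without RDKit. The additional check that the array length equals `len(featurizer)` is an assumption about what downstream code expects, not something this page states.

```python
import numpy as np


class StubBond:
    """Hypothetical stand-in for rdkit.Chem.rdchem.Bond (not part of RDKit)."""

    def __init__(self, conjugated):
        self._conjugated = conjugated

    def GetIsConjugated(self):
        return self._conjugated


class ConjugationFeaturizer:
    """Same idea as MyBondFeaturizer above: a single conjugation feature."""

    def __len__(self):
        return 1

    def __call__(self, bond):
        return np.array([bond.GetIsConjugated()], dtype=float)


def check_bond_featurizer(featurizer, bond):
    """Return the features if the featurizer meets the assumed contract."""
    features = featurizer(bond)
    if not isinstance(features, np.ndarray):
        raise TypeError("featurizer must return a numpy array")
    # Assumption: the reported length should match the array length.
    if features.shape != (len(featurizer),):
        raise ValueError("array length does not match len(featurizer)")
    return features


checked = check_bond_featurizer(ConjugationFeaturizer(), StubBond(conjugated=True))
print(checked)
```

With a real molecule, the stub would be replaced by a bond obtained from RDKit, such as `bond_to_featurize` from the cells above.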