Hyperparameter Optimization
Note
Chemprop relies on Ray Tune for hyperparameter optimization. Ray Tune is an optional dependency: to install it, run pip install -U ray[tune] if you installed Chemprop from PyPI, or pip install -e .[hpopt] if you installed from source.
Searching Hyperparameter Space
We include an automated hyperparameter optimization procedure through the Ray Tune package. Hyperparameter optimization can be run as follows:
chemprop hpopt --data-path <data_path> --task-type <task> --search-parameter-keywords <keywords> --hpopt-save-dir <save_dir>
For example:
chemprop hpopt --data-path tests/data/regression.csv \
--task-type regression \
--search-parameter-keywords depth ffn_num_layers message_hidden_dim \
--hpopt-save-dir results
The search parameters can be any combination of individual hyperparameters or a predefined set of them. The default set, basic, consists of:
depth: The number of message passing steps
ffn_num_layers: The number of layers in the FFN model
dropout: The probability (from 0.0 to 1.0) of dropout in the MPNN & FFN layers
message_hidden_dim: The hidden dimension in the message passing step
ffn_hidden_dim: The hidden dimension in the FFN model
Another predefined set is learning_rate, which includes:
max_lr: The maximum learning rate
init_lr: The initial learning rate. It is searched as a ratio relative to the maximum learning rate
final_lr: The final learning rate. It is searched as a ratio relative to the maximum learning rate
warmup_epochs: The number of warmup epochs, during which the learning rate increases linearly from the initial to the maximum learning rate
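Because init_lr and final_lr are searched as ratios rather than absolute values, it can help to see how a sampled trial maps back to concrete learning rates. The following is a minimal sketch, not Chemprop's implementation: the function names are hypothetical, the ratios are assumed to lie in (0, 1], and the warmup is evaluated once per epoch for simplicity.

```python
def resolve_learning_rates(max_lr, init_lr_ratio, final_lr_ratio):
    """Convert sampled ratios into absolute learning rates.

    ASSUMPTION: both ratios lie in (0, 1], so neither rate exceeds max_lr.
    """
    for ratio in (init_lr_ratio, final_lr_ratio):
        if not 0.0 < ratio <= 1.0:
            raise ValueError("ratios are assumed to lie in (0, 1]")
    return {
        "max_lr": max_lr,
        "init_lr": max_lr * init_lr_ratio,
        "final_lr": max_lr * final_lr_ratio,
    }


def warmup_learning_rate(epoch, warmup_epochs, init_lr, max_lr):
    """Linearly interpolate from init_lr to max_lr over the warmup epochs."""
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return max_lr
    return init_lr + (max_lr - init_lr) * epoch / warmup_epochs


# Illustrative values only; these are not defaults taken from Chemprop.
rates = resolve_learning_rates(max_lr=1e-3, init_lr_ratio=0.1, final_lr_ratio=0.05)
for epoch in range(3):
    print(epoch, warmup_learning_rate(epoch, 2, rates["init_lr"], rates["max_lr"]))
```

Searching ratios instead of absolute values keeps every sampled trial consistent: the initial and final rates can never exceed the maximum, whatever max_lr is drawn.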
Other individual search parameters include:
activation: The activation function used in the MPNN & FFN layers. Choices include relu, leakyrelu, prelu, tanh, and elu (selu has not been supported since v2.2.0)
aggregation: The aggregation mode used for the molecule-level predictor. Choices include mean, sum, and norm
aggregation_norm: For norm aggregation, the normalization factor by which atomic features are divided
batch_size: The batch size for the dataloader
Specifying --search-parameter-keywords all will search over all 13 of the above parameters.
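The relationship between the predefined sets and the individual parameters can be sketched as a small lookup. This is an illustration of the grouping described above, not Chemprop's own code; the names GROUPS and expand_keywords are hypothetical.

```python
# Parameter names are taken from the lists in this section.
BASIC = ["depth", "ffn_num_layers", "dropout", "message_hidden_dim", "ffn_hidden_dim"]
LEARNING_RATE = ["max_lr", "init_lr", "final_lr", "warmup_epochs"]
INDIVIDUAL = ["activation", "aggregation", "aggregation_norm", "batch_size"]

GROUPS = {
    "basic": BASIC,
    "learning_rate": LEARNING_RATE,
    "all": BASIC + LEARNING_RATE + INDIVIDUAL,
}


def expand_keywords(keywords):
    """Expand set names into parameter names, keeping order and dropping duplicates."""
    expanded = []
    for keyword in keywords:
        # A set name expands to its members; anything else is treated as a single parameter.
        for name in GROUPS.get(keyword, [keyword]):
            if name not in GROUPS["all"]:
                raise ValueError(f"unknown search parameter: {name}")
            if name not in expanded:
                expanded.append(name)
    return expanded


print(len(expand_keywords(["all"])))
print(expand_keywords(["learning_rate", "depth"]))
```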
The following other common keywords may be used:
--raytune-num-samples <num_samples>: The number of trials to perform
--raytune-num-cpus <num_cpus>: The number of CPUs to use
--raytune-num-gpus <num_gpus>: The number of GPUs to use
--raytune-max-concurrent-trials <num_trials>: The maximum number of concurrent trials
--raytune-search-algorithm <algorithm>: The search algorithm to use (either random, hyperopt, or optuna). If hyperopt is specified, then the arguments --hyperopt-n-initial-points <num_points> and --hyperopt-random-state-seed <seed> can also be specified.
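When launching many searches, it can be convenient to assemble the command line programmatically. The sketch below uses only flags named in this section; the helper build_hpopt_command is hypothetical, and its check that the hyperopt seed is only passed together with the hyperopt algorithm is a design choice of this sketch, not documented CLI behavior.

```python
import shlex

SEARCH_ALGORITHMS = {"random", "hyperopt", "optuna"}


def build_hpopt_command(data_path, task_type, save_dir, keywords=("basic",),
                        num_samples=None, search_algorithm=None, hyperopt_seed=None):
    """Return the argument list for a `chemprop hpopt` invocation."""
    cmd = [
        "chemprop", "hpopt",
        "--data-path", data_path,
        "--task-type", task_type,
        "--search-parameter-keywords", *keywords,
        "--hpopt-save-dir", save_dir,
    ]
    if num_samples is not None:
        cmd += ["--raytune-num-samples", str(num_samples)]
    if search_algorithm is not None:
        if search_algorithm not in SEARCH_ALGORITHMS:
            raise ValueError(f"unknown search algorithm: {search_algorithm}")
        cmd += ["--raytune-search-algorithm", search_algorithm]
    if hyperopt_seed is not None:
        # The seed argument is only meaningful for the hyperopt algorithm.
        if search_algorithm != "hyperopt":
            raise ValueError("hyperopt_seed requires search_algorithm='hyperopt'")
        cmd += ["--hyperopt-random-state-seed", str(hyperopt_seed)]
    return cmd


command = build_hpopt_command(
    "tests/data/regression.csv", "regression", "results",
    keywords=("depth", "dropout"), num_samples=20,
    search_algorithm="hyperopt", hyperopt_seed=0,
)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where Chemprop and Ray Tune are installed.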
Other keywords related to hyperparameter optimization are also available (see CLI Reference for a full list).
Splitting
By default, Chemprop will split the data into training, validation, and test sets. The splitting behavior can be modified using the same splitting arguments used in training (see the Train/Validation/Test Splits section).
Note
This default splitting behavior is different from Chemprop v1, wherein the hyperparameter optimization was performed on the entirety of the data provided to it.
If --num-replicates is greater than one, Chemprop will use only the first split to perform hyperparameter optimization. If you need to optimize hyperparameters separately for several different cross-validation splits, run chemprop hpopt once per split, for example from a bash script.
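A per-split driver could look like the following sketch, written in Python rather than bash. It rests on an assumption that is not stated in this section: that each split can be selected by passing a different value to the training CLI's --data-seed argument. Verify that flag in the CLI Reference, and substitute whatever argument actually defines your splits (for example, a splits file) if it does not apply. The helper names are hypothetical.

```python
import subprocess


def hpopt_commands_per_split(data_path, task_type, base_save_dir, num_splits, base_seed=0):
    """Build one `chemprop hpopt` command per split, each with its own save directory."""
    commands = []
    for i in range(num_splits):
        commands.append([
            "chemprop", "hpopt",
            "--data-path", data_path,
            "--task-type", task_type,
            # ASSUMPTION: --data-seed controls which split is generated.
            "--data-seed", str(base_seed + i),
            "--hpopt-save-dir", f"{base_save_dir}/split_{i}",
        ])
    return commands


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = hpopt_commands_per_split(
    "tests/data/regression.csv", "regression", "results", num_splits=3
)
for cmd in commands:
    print(" ".join(cmd))
# Call run_all(commands) on a machine where Chemprop is installed.
```

Writing each run to its own save directory keeps the best configuration of one split from overwriting that of another.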
Applying Optimal Hyperparameters
Once hyperparameter optimization is complete, the optimal hyperparameters can be applied during training by specifying the config path. If an argument is provided both on the command line and in the config file, the command line takes precedence. For example:
chemprop train --data-path tests/data/regression.csv \
--config-path results/best_config.toml
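The precedence rule can be illustrated with a small merge of two dictionaries. This is only a sketch of the behavior described above, not Chemprop's argument parser; the function resolve_arguments and the example keys and values are hypothetical, and the actual contents of best_config.toml may differ.

```python
def resolve_arguments(config_values, cli_values):
    """Merge settings so that explicitly given command-line values win."""
    resolved = dict(config_values)
    # A value of None stands for "not given on the command line".
    resolved.update({key: value for key, value in cli_values.items() if value is not None})
    return resolved


# Hypothetical values as they might look after parsing the config file.
config_values = {"depth": 4, "ffn_num_layers": 2, "dropout": 0.1}
# Only depth was passed explicitly on the command line.
cli_values = {"depth": 3, "dropout": None}

resolved = resolve_arguments(config_values, cli_values)
print(resolved)  # {'depth': 3, 'ffn_num_layers': 2, 'dropout': 0.1}
```

In this example depth comes from the command line, while ffn_num_layers and dropout fall back to the values stored in the config file.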