CLI Reference#
usage: chemprop [-h] {train,predict,convert,fingerprint,hpopt} ...
mode#
- mode
Possible choices: train, predict, convert, fingerprint, hpopt
Sub-commands#
train#
Train a Chemprop model.
chemprop train [-h] [--logfile [LOGFILE]] [-v] [-q]
[-s SMILES_COLUMNS [SMILES_COLUMNS ...]]
[-r REACTION_COLUMNS [REACTION_COLUMNS ...]] [--no-header-row]
[-n NUM_WORKERS] [-b BATCH_SIZE] [--accelerator ACCELERATOR]
[--devices DEVICES]
[--rxn-mode {REAC_PROD,REAC_PROD_BALANCE,REAC_DIFF,REAC_DIFF_BALANCE,PROD_DIFF,PROD_DIFF_BALANCE}]
[--multi-hot-atom-featurizer-mode {V1,V2,ORGANIC,RIGR}]
[--keep-h] [--add-h] [--ignore-stereo] [--reorder-atoms]
[--molecule-featurizers {morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} [{morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} ...]]
[--descriptors-path DESCRIPTORS_PATH]
[--descriptors-columns DESCRIPTORS_COLUMNS [DESCRIPTORS_COLUMNS ...]]
[--no-descriptor-scaling] [--no-atom-feature-scaling]
[--no-atom-descriptor-scaling] [--no-bond-feature-scaling]
[--no-bond-descriptor-scaling]
[--atom-features-path ATOM_FEATURES_PATH [ATOM_FEATURES_PATH ...]]
[--atom-descriptors-path ATOM_DESCRIPTORS_PATH [ATOM_DESCRIPTORS_PATH ...]]
[--bond-features-path BOND_FEATURES_PATH [BOND_FEATURES_PATH ...]]
[--bond-descriptors-path BOND_DESCRIPTORS_PATH [BOND_DESCRIPTORS_PATH ...]]
[--constraints-path CONSTRAINTS_PATH]
[--constraints-to-targets CONSTRAINTS_TO_TARGETS [CONSTRAINTS_TO_TARGETS ...]]
[--use-cuikmolmaker-featurization] [--config-path CONFIG_PATH]
[-i [DATA_PATH ...]] [-o OUTPUT_DIR] [--remove-checkpoints]
[--checkpoint CHECKPOINT [CHECKPOINT ...]] [--freeze-encoder]
[--model-frzn MODEL_FRZN] [--frzn-ffn-layers FRZN_FFN_LAYERS]
[--from-foundation FROM_FOUNDATION]
[--ensemble-size ENSEMBLE_SIZE]
[--message-hidden-dim MESSAGE_HIDDEN_DIM [MESSAGE_HIDDEN_DIM ...]]
[--message-bias] [--depth DEPTH [DEPTH ...]] [--undirected]
[--dropout DROPOUT] [--mpn-shared]
[--aggregation {mean,sum,norm}]
[--aggregation-norm AGGREGATION_NORM] [--atom-messages]
[--activation {RELU,CELU,ELU,GELU,GLU,HARDSHRINK,HARDSIGMOID,HARDSWISH,HARDTANH,LEAKYRELU,LOGSIGMOID,LOGSOFTMAX,MISH,MULTIHEADATTENTION,PRELU,RRELU,RELU6,SILU,SIGMOID,SOFTMAX,SOFTMAX2D,SOFTMIN,SOFTPLUS,SOFTSHRINK,SOFTSIGN,TANH,TANHSHRINK,THRESHOLD}]
[--activation-args [ACTIVATION_ARGS ...]]
[--ffn-hidden-dim FFN_HIDDEN_DIM [FFN_HIDDEN_DIM ...]]
[--ffn-num-layers FFN_NUM_LAYERS] [--batch-norm]
[--multiclass-num-classes MULTICLASS_NUM_CLASSES]
[--atom-task-weights ATOM_TASK_WEIGHTS [ATOM_TASK_WEIGHTS ...]]
[--atom-ffn-hidden-dim ATOM_FFN_HIDDEN_DIM [ATOM_FFN_HIDDEN_DIM ...]]
[--atom-ffn-num-layers ATOM_FFN_NUM_LAYERS]
[--atom-multiclass-num-classes ATOM_MULTICLASS_NUM_CLASSES]
[--bond-task-weights BOND_TASK_WEIGHTS [BOND_TASK_WEIGHTS ...]]
[--bond-ffn-hidden-dim BOND_FFN_HIDDEN_DIM [BOND_FFN_HIDDEN_DIM ...]]
[--bond-ffn-num-layers BOND_FFN_NUM_LAYERS]
[--bond-multiclass-num-classes BOND_MULTICLASS_NUM_CLASSES]
[--atom-constrainer-ffn-hidden-dim ATOM_CONSTRAINER_FFN_HIDDEN_DIM [ATOM_CONSTRAINER_FFN_HIDDEN_DIM ...]]
[--atom-constrainer-ffn-num-layers ATOM_CONSTRAINER_FFN_NUM_LAYERS]
[--bond-constrainer-ffn-hidden-dim BOND_CONSTRAINER_FFN_HIDDEN_DIM [BOND_CONSTRAINER_FFN_HIDDEN_DIM ...]]
[--bond-constrainer-ffn-num-layers BOND_CONSTRAINER_FFN_NUM_LAYERS]
[-w WEIGHT_COLUMN]
[--target-columns TARGET_COLUMNS [TARGET_COLUMNS ...]]
[--mol-target-columns MOL_TARGET_COLUMNS [MOL_TARGET_COLUMNS ...]]
[--atom-target-columns ATOM_TARGET_COLUMNS [ATOM_TARGET_COLUMNS ...]]
[--bond-target-columns BOND_TARGET_COLUMNS [BOND_TARGET_COLUMNS ...]]
[--ignore-columns IGNORE_COLUMNS [IGNORE_COLUMNS ...]]
[--no-cache] [--splits-column SPLITS_COLUMN]
[-t {regression,regression-mve,regression-evidential,regression-quantile,classification,classification-dirichlet,multiclass,multiclass-dirichlet,spectral}]
[-l {mse,mae,rmse,bounded-mse,bounded-mae,bounded-rmse,mve,evidential,quantile-point,pinball-point,bce,ce,binary-mcc,multiclass-mcc,dirichlet,sid,earthmovers,wasserstein,quantile,pinball,nlogprob_enrichment}]
[--v-kl V_KL] [--eps EPS] [--alpha ALPHA]
[--metrics {mse,mae,rmse,bounded-mse,bounded-mae,bounded-rmse,r2,binary-mcc,multiclass-mcc,roc,prc,accuracy,f1} [{mse,mae,rmse,bounded-mse,bounded-mae,bounded-rmse,r2,binary-mcc,multiclass-mcc,roc,prc,accuracy,f1} ...]]
[--tracking-metric TRACKING_METRIC] [--show-individual-scores]
[--task-weights TASK_WEIGHTS [TASK_WEIGHTS ...]]
[--warmup-epochs WARMUP_EPOCHS] [--init-lr INIT_LR]
[--max-lr MAX_LR] [--final-lr FINAL_LR] [--epochs EPOCHS]
[--patience PATIENCE] [--min-delta MIN_DELTA]
[--grad-clip GRAD_CLIP] [--class-balance]
[--split {SCAFFOLD_BALANCED,RANDOM_WITH_REPEATED_SMILES,RANDOM,KENNARD_STONE,KMEANS}]
[--split-sizes SPLIT_SIZES SPLIT_SIZES SPLIT_SIZES]
[--split-key-molecule SPLIT_KEY_MOLECULE]
[--num-replicates NUM_REPLICATES] [-k NUM_FOLDS]
[--save-smiles-splits] [--save-data-splits]
[--splits-file SPLITS_FILE] [--data-seed DATA_SEED]
[--pytorch-seed PYTORCH_SEED]
Named Arguments#
- --logfile, --log
Path to which the log file should be written (specifying just the flag alone will automatically log to a file chemprop_logs/MODE/TIMESTAMP.log, where ‘MODE’ is the CLI mode chosen, e.g., chemprop_logs/MODE/2026-06-12T18-58-16.log)
- -v
Increase verbosity level to DEBUG
Default:
False
- -q
Decrease verbosity level to WARNING, or to ERROR if specified twice
Default:
0
- --accelerator
Passed directly to the lightning Trainer()
Default:
'auto'
- --devices
Passed directly to the lightning Trainer() (must be a single string of comma separated devices, e.g. ‘1, 2’ if specifying multiple devices)
Default:
'auto'
- --constraints-path
Path to a CSV file containing the constraints for atomic/bond properties prediction. The file should have one column for each property being constrained with no SMILES column. The order of the rows should match the order of the SMILES in the input CSV. See also --constraints-to-targets for how to specify which constraint applies to which prediction.
- --constraints-to-targets
The column names of the atom or bond targets that correspond to each constraint column in the constraints CSV.
- --config-path
Path to a configuration file (command line arguments override values in the configuration file)
- -i, --data-path
Path to one, two, or three input CSV files containing SMILES and the associated target values. If one data file is supplied, it will undergo train-val-test split; if two are supplied, the first will undergo train-val split and the second will be taken as test data; if three are supplied, they will be taken as train, val, test data respectively
- -o, --output-dir, --save-dir
Directory where training outputs will be saved (defaults to CURRENT_DIRECTORY/chemprop_training/STEM_OF_INPUT/TIME_STAMP)
- --remove-checkpoints
Remove intermediate checkpoint files after training is complete.
Default:
False
- --ensemble-size
Number of models in ensemble for each splitting of data
Default:
1
- --atom-multiclass-num-classes
Number of classes for atom targets when running multiclass classification
Default:
3
- --bond-multiclass-num-classes
Number of classes for bond targets when running multiclass classification
Default:
3
- --pytorch-seed
Seed for PyTorch randomness (e.g., random initial weights)
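The one/two/three-file rule described for -i/--data-path can be summarized in a short sketch. The function name assign_roles and the file names are hypothetical and only illustrate the mapping stated above; this is not Chemprop's own code.

```python
def assign_roles(paths):
    """Map 1-3 input CSV paths to the roles described for -i/--data-path."""
    if len(paths) == 1:
        # One file: it is split into train, val, and test.
        return {paths[0]: "train/val/test split"}
    if len(paths) == 2:
        # Two files: the first is split into train and val, the second is test.
        return {paths[0]: "train/val split", paths[1]: "test"}
    if len(paths) == 3:
        # Three files: used as train, val, and test in that order.
        return dict(zip(paths, ["train", "val", "test"]))
    raise ValueError("expected one, two, or three data files")


# Hypothetical file names, for illustration only.
print(assign_roles(["train.csv", "val.csv", "test.csv"]))
```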
Dataloader args#
- -n, --num-workers
Number of workers for parallel data loading where 0 means sequential (Warning: setting num_workers to a value greater than 0 can cause hangs on Windows and MacOS)
Default:
0
- -b, --batch-size
Batch size
Default:
64
Featurization args#
- --rxn-mode, --reaction-mode
Possible choices: REAC_PROD, REAC_PROD_BALANCE, REAC_DIFF, REAC_DIFF_BALANCE, PROD_DIFF, PROD_DIFF_BALANCE
Choices for construction of atom and bond features for reactions (case insensitive):
REAC_PROD: concatenates the reactants feature with the products feature
REAC_DIFF: concatenates the reactants feature with the difference in features between reactants and products (Default)
PROD_DIFF: concatenates the products feature with the difference in features between reactants and products
REAC_PROD_BALANCE: concatenates the reactants feature with the products feature, balances imbalanced reactions
REAC_DIFF_BALANCE: concatenates the reactants feature with the difference in features between reactants and products, balances imbalanced reactions
PROD_DIFF_BALANCE: concatenates the products feature with the difference in features between reactants and products, balances imbalanced reactions
Default:
REAC_DIFF
- --multi-hot-atom-featurizer-mode
Possible choices: V1, V2, ORGANIC, RIGR
Selects the multi-hot atom featurization scheme. This affects both non-reaction and reaction featurization (case insensitive):
V1: Corresponds to the original configuration employed in Chemprop V1
V2: Tailored for a broad range of molecules, this configuration encompasses all elements in the first four rows of the periodic table, along with iodine. It is the default in Chemprop V2.
ORGANIC: This configuration is designed specifically for use with organic molecules for drug research and development and includes a subset of elements most common in organic chemistry, including H, B, C, N, O, F, Si, P, S, Cl, Br, and I.
RIGR: Modified V2 (default) featurizer using only the resonance-invariant atom and bond features.
Default:
V2
- --keep-h
Whether hydrogens explicitly specified in input should be kept in the mol graph
Default:
False
- --add-h
Whether hydrogens should be added to the mol graph
Default:
False
- --molecule-featurizers, --features-generators
Possible choices: morgan_binary, morgan_count, rdkit_2d, v1_rdkit_2d, v1_rdkit_2d_normalized, charge
Method(s) of generating molecule features to use as extra descriptors
- --descriptors-path
Path to extra descriptors to concatenate to learned representation
- --descriptors-columns
Column names in the input CSV containing extra datapoint descriptors, like temperature and pressure. See also --descriptors-path.
- --no-descriptor-scaling
Turn off extra descriptor scaling
Default:
False
- --no-atom-feature-scaling
Turn off extra atom feature scaling
Default:
False
- --no-atom-descriptor-scaling
Turn off extra atom descriptor scaling
Default:
False
- --no-bond-feature-scaling
Turn off extra bond feature scaling
Default:
False
- --no-bond-descriptor-scaling
Turn off extra bond descriptor scaling
Default:
False
- --atom-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom features to supply before message passing (e.g., --atom-features-path 0 /path/to/features_0.npz indicates that the features at the given path should be supplied to the 0-th component). To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --atom-features-path 0 path_zero --atom-features-path 1 path_one) or pass each two-tuple in a series (e.g., --atom-features-path 0 path_zero 1 path_one).
- --atom-descriptors-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom descriptors to supply after message passing (e.g., --atom-descriptors-path 0 /path/to/descriptors_0.npz indicates that the descriptors at the given path should be supplied to the 0-th component). To supply additional descriptors for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --atom-descriptors-path 0 path_zero --atom-descriptors-path 1 path_one) or pass each two-tuple in a series (e.g., --atom-descriptors-path 0 path_zero 1 path_one).
- --bond-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional bond features to supply before message passing (e.g., --bond-features-path 0 /path/to/features_0.npz indicates that the features at the given path should be supplied to the 0-th component). To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --bond-features-path 0 path_zero --bond-features-path 1 path_one) or pass each two-tuple in a series (e.g., --bond-features-path 0 path_zero 1 path_one).
- --bond-descriptors-path
Path to additional bond descriptors to use with the learned bond representations after message passing. The file follows the same format as --atom-descriptors-path, i.e. the file is created using np.savez('bond_descriptors.npz', *E_ds) where E_ds is a list of 2D numpy arrays with shape n_bonds x n_descriptors.
- --use-cuikmolmaker-featurization
Use the cuik-molmaker package for accelerated atom and bond featurization.
Default:
False
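The --rxn-mode choices listed above differ only in which two feature vectors are concatenated. A minimal sketch with plain Python lists is given below. The function name reaction_features is hypothetical, the sign of the difference (products minus reactants) is an assumption, and the balancing step of the *_BALANCE modes is not modelled; this is an illustration, not Chemprop's featurizer.

```python
def reaction_features(reac, prod, mode="REAC_DIFF"):
    """Combine reactant and product feature vectors for one atom or bond."""
    # Assumption: the "difference" is taken as products minus reactants.
    diff = [p - r for r, p in zip(reac, prod)]
    # Modes are case insensitive; balancing is outside the scope of this sketch.
    base = mode.upper().replace("_BALANCE", "")
    if base == "REAC_PROD":
        return reac + prod
    if base == "REAC_DIFF":
        return reac + diff
    if base == "PROD_DIFF":
        return prod + diff
    raise ValueError(f"unknown reaction mode: {mode}")


reac = [1.0, 0.0]  # toy features on the reactant side
prod = [0.0, 1.0]  # toy features on the product side
print(reaction_features(reac, prod, "reac_diff"))
```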
transfer learning args#
- --checkpoint
Path to checkpoint(s) or model file(s) for loading and overwriting weights. Accepts a single pre-trained model checkpoint (.ckpt), a single model file (.pt), a directory containing such files, or a list of paths and directories. If a directory is provided, it will recursively search for and use all (.pt) files found for prediction.
- --freeze-encoder
Freeze the message passing layer from the checkpoint model (specified by --checkpoint).
Default:
False
- --model-frzn
Path to model checkpoint file to be loaded for overwriting and freezing weights. By default, all MPNN weights are frozen with this option.
- --frzn-ffn-layers
Freeze the first n layers of the FFN from the checkpoint model (specified by --checkpoint). The message passing layer should also be frozen with --freeze-encoder.
Default:
0
- --from-foundation
Name of pretrained foundation model used to initialize message passing. One of: CHEMELEON, or a path to a local model file.
message passing#
- --message-hidden-dim
Hidden dimension of the messages (specify multiple values to customize multicomponent encoders separately)
Default:
[300]
- --message-bias
Add bias to the message passing layers
Default:
False
- --depth
Number of message passing steps (specify multiple values to customize multicomponent encoders separately)
Default:
[3]
- --undirected
Pass messages on undirected bonds/edges (always sum the two relevant bond vectors)
Default:
False
- --dropout
Dropout probability in message passing/FFN layers
Default:
0.0
- --mpn-shared
Whether to use the same message passing neural network for all input molecules (only relevant if number_of_molecules > 1)
Default:
False
- --aggregation, --agg
Possible choices: mean, sum, norm
Aggregation mode to use in the graph predictor
Default:
'norm'
- --aggregation-norm
Normalization factor by which to divide summed up atomic features for norm aggregation
Default:
100
- --atom-messages
Pass messages on atoms rather than bonds.
Default:
False
- --activation
Possible choices: RELU, CELU, ELU, GELU, GLU, HARDSHRINK, HARDSIGMOID, HARDSWISH, HARDTANH, LEAKYRELU, LOGSIGMOID, LOGSOFTMAX, MISH, MULTIHEADATTENTION, PRELU, RRELU, RELU6, SILU, SIGMOID, SOFTMAX, SOFTMAX2D, SOFTMIN, SOFTPLUS, SOFTSHRINK, SOFTSIGN, TANH, TANHSHRINK, THRESHOLD
Activation function in message passing/FFN layers.
Default:
RELU
- --activation-args
Arguments for the activation function (Example: arg1 arg2 key1=value1 key2=value2).
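The three --aggregation modes can be pictured with a small sketch over plain lists of per-atom vectors. The function name aggregate is hypothetical. The norm branch follows the --aggregation-norm description above (the summed features divided by a constant); treating mean as the sum divided by the number of atoms is the usual reading and is assumed here rather than taken from the text.

```python
def aggregate(atom_vectors, mode="norm", aggregation_norm=100.0):
    """Pool per-atom feature vectors into one molecule-level vector."""
    n_features = len(atom_vectors[0])
    # Element-wise sum over all atoms in the molecule.
    summed = [sum(vec[i] for vec in atom_vectors) for i in range(n_features)]
    if mode == "sum":
        return summed
    if mode == "mean":
        return [s / len(atom_vectors) for s in summed]
    if mode == "norm":
        # Divide the summed features by a fixed normalization factor.
        return [s / aggregation_norm for s in summed]
    raise ValueError(f"unknown aggregation mode: {mode}")


atoms = [[1.0, 2.0], [3.0, 4.0]]  # two atoms, two features each
for mode in ("sum", "mean", "norm"):
    print(mode, aggregate(atoms, mode))
```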
FFN args#
- --ffn-hidden-dim
Hidden dimension(s) in the FFN top model. A single value is applied to all layers; multiple values specify per-layer widths (must match --ffn-num-layers)
Default:
[300]
- --ffn-num-layers
Number of hidden layers in FFN top model (v2 semantics, differs from v1)
Default:
1
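The interaction between --ffn-hidden-dim and --ffn-num-layers described above can be sketched as follows. The helper name expand_hidden_dims is hypothetical and only restates the documented rule (one value is broadcast to every layer; several values must match the layer count); it is not Chemprop's own validation code.

```python
def expand_hidden_dims(hidden_dim, num_layers):
    """Return one width per hidden layer from the CLI-style list of values."""
    if len(hidden_dim) == 1:
        # A single value is applied to all layers.
        return hidden_dim * num_layers
    if len(hidden_dim) != num_layers:
        raise ValueError(
            f"got {len(hidden_dim)} hidden dims for {num_layers} layers"
        )
    return list(hidden_dim)


print(expand_hidden_dims([300], 3))
print(expand_hidden_dims([512, 256], 2))
```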
extra MPNN args#
- --batch-norm
Turn on batch normalization after aggregation
Default:
False
- --multiclass-num-classes
Number of classes when running multiclass classification
Default:
3
Atom FFN args#
- --atom-task-weights
Weights to apply for all atom tasks in the loss function
- --atom-ffn-hidden-dim
Hidden dimension(s) in the atom FFN top model
Default:
[300]
- --atom-ffn-num-layers
Number of layers in atom FFN top model
Default:
1
Bond FFN args#
- --bond-task-weights
Weights to apply for all bond tasks in the loss function
- --bond-ffn-hidden-dim
Hidden dimension(s) in the bond FFN top model
Default:
[300]
- --bond-ffn-num-layers
Number of layers in bond FFN top model
Default:
1
Atom constrainer FFN args#
- --atom-constrainer-ffn-hidden-dim
Hidden dimension(s) in the atom constrainer FFN top model
Default:
[300]
- --atom-constrainer-ffn-num-layers
Number of layers in atom constrainer FFN top model
Default:
1
Bond constrainer FFN args#
- --bond-constrainer-ffn-hidden-dim
Hidden dimension(s) in the bond constrainer FFN top model
Default:
[300]
- --bond-constrainer-ffn-num-layers
Number of layers in bond constrainer FFN top model
Default:
1
training input data args#
- -w, --weight-column
Name of the column in the input CSV containing individual data weights
- --target-columns
Names of the columns containing target values (by default, uses all columns except the SMILES column and the ignore_columns)
- --mol-target-columns
Names of the columns containing mol target values (when training on mol and atom/bond targets simultaneously).
- --atom-target-columns
Names of the columns containing atom target values.
- --bond-target-columns
Names of the columns containing bond target values.
- --ignore-columns
Names of the columns to ignore when target_columns is not provided
- --no-cache
Turn off caching the featurized MolGraphs at the beginning of training
Default:
False
- --splits-column
Name of the column in the input CSV file containing ‘train’, ‘val’, or ‘test’ for each row.
training args#
- -t, --task-type
Possible choices: regression, regression-mve, regression-evidential, regression-quantile, classification, classification-dirichlet, multiclass, multiclass-dirichlet, spectral
Type of dataset (determines the default loss function used during training, defaults to regression)
Default:
'regression'
- -l, --loss-function
Possible choices: mse, mae, rmse, bounded-mse, bounded-mae, bounded-rmse, mve, evidential, quantile-point, pinball-point, bce, ce, binary-mcc, multiclass-mcc, dirichlet, sid, earthmovers, wasserstein, quantile, pinball, nlogprob_enrichment
Loss function to use during training (will use the default loss function for the given task type if not specified)
- --v-kl, --evidential-regularization
Specify the value used in regularization for evidential loss function. The default value recommended by Soleimany et al. (2021) is 0.2. However, the optimal value is dataset-dependent, so it is recommended that users test different values to find the best value for their model.
Default:
0.0
- --eps
Evidential regularization epsilon
Default:
1e-08
- --alpha
Target error bounds for quantile interval loss
Default:
0.1
- --metrics, --metric
Possible choices: mse, mae, rmse, bounded-mse, bounded-mae, bounded-rmse, r2, binary-mcc, multiclass-mcc, roc, prc, accuracy, f1
Specify the evaluation metrics. If unspecified, Chemprop will use the following metrics for given dataset types: regression -> rmse, classification -> roc, multiclass -> ce (‘cross entropy’), spectral -> sid. If multiple metrics are provided, the 0-th one will be used for early stopping and checkpointing.
- --tracking-metric
The metric to track for early stopping, checkpointing, and hyperparameter optimization. Defaults to the criterion used during training. When training on two or three of molecule, atom, and bond targets, and not tracking the default (‘val_loss’), you must append ‘-mol’, ‘-atom’, or ‘-bond’ to the metric name to specify which individual metric to track. For example, ‘val_loss-bond’ will track the criterion value of the bond predictions and ‘rmse-atom’ will track the RMSE of the atom predictions.
Default:
'val_loss'
- --show-individual-scores
Show all scores for individual targets, not just average, at the end.
Default:
False
- --task-weights
Weights to apply for whole tasks in the loss function
- --warmup-epochs
Number of epochs during which learning rate increases linearly from init_lr to max_lr (afterwards, learning rate decreases exponentially from max_lr to final_lr)
Default:
2
- --init-lr
Initial learning rate.
Default:
0.0001
- --max-lr
Maximum learning rate.
Default:
0.001
- --final-lr
Final learning rate.
Default:
0.0001
- --epochs
Number of epochs to train over
Default:
50
- --patience
Number of epochs to wait for improvement before early stopping
- --min-delta
Minimum change in the monitored quantity to qualify as an improvement
Default:
0.0
- --grad-clip
Passed directly to the lightning trainer which controls grad clipping (see the Trainer() docstring for details)
- --class-balance
Ensures each training batch contains an equal number of positive and negative samples.
Default:
False
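The learning-rate schedule described under --warmup-epochs (linear increase from init_lr to max_lr, then exponential decrease from max_lr to final_lr) can be sketched with the defaults listed above. The function name learning_rate is hypothetical, and evaluating the schedule once per epoch is a simplifying assumption; the actual scheduler may update at a finer granularity.

```python
def learning_rate(epoch, warmup_epochs=2, epochs=50,
                  init_lr=1e-4, max_lr=1e-3, final_lr=1e-4):
    """Learning rate at a given (possibly fractional) epoch."""
    if epoch < warmup_epochs:
        # Linear warmup from init_lr to max_lr.
        return init_lr + (max_lr - init_lr) * epoch / warmup_epochs
    # Exponential decay chosen so that max_lr reaches final_lr at the last epoch.
    decay_epochs = epochs - warmup_epochs
    gamma = (final_lr / max_lr) ** (1 / decay_epochs)
    return max_lr * gamma ** (epoch - warmup_epochs)


for epoch in (0, 1, 2, 26, 50):
    print(epoch, f"{learning_rate(epoch):.6f}")
```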
split args#
- --split, --split-type
Possible choices: SCAFFOLD_BALANCED, RANDOM_WITH_REPEATED_SMILES, RANDOM, KENNARD_STONE, KMEANS
Method of splitting the data into train/val/test (case insensitive)
Default:
RANDOM
- --split-sizes
Split proportions for train/validation/test sets
Default:
[0.8, 0.1, 0.1]
- --split-key-molecule
Specify the index of the key molecule used for splitting when multiple molecules are present and a constrained split_type is used (e.g., scaffold_balanced or random_with_repeated_smiles). Note that this index begins with zero for the first molecule.
Default:
0
- --num-replicates
Number of replicates.
Default:
1
- -k, --num-folds
The -k/--num-folds argument was removed in v2.1.0; use --num-replicates instead.
- --save-smiles-splits
Whether to store the SMILES in each train/val/test split
Default:
False
- --save-data-splits
Whether to store the input data in each train/val/test split
Default:
False
- --splits-file
Path to a JSON file containing pre-defined splits for the input data, formatted as a list of dictionaries with keys train, val, and test and values as lists of indices or formatted strings (e.g. [0, 1, 2, 4] or ‘0-2,4’)
- --data-seed
Specify the random seed to use when splitting data into train/val/test sets. When --num-replicates > 1, the first replicate uses this seed and all subsequent replicates add 1 to the seed (also used for shuffling data in build_dataloader when shuffle is True).
Default:
0
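The --splits-file format described above (a list of dictionaries with keys train, val, and test, whose values are index lists or strings such as '0-2,4') can be illustrated with a short sketch. The helper parse_indices is hypothetical and shows one way to read the string form, with ranges treated as inclusive so that '0-2,4' matches [0, 1, 2, 4]; it is not Chemprop's own parser.

```python
import json


def parse_indices(spec):
    """Accept a list of ints or a formatted string such as '0-2,4'."""
    if isinstance(spec, list):
        return spec
    indices = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            # Ranges are inclusive: '0-2' means 0, 1, 2.
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices


# One dictionary per replicate; the index values here are made up.
splits = [{"train": "0-2,4", "val": [3], "test": [5, 6]}]
text = json.dumps(splits)  # what the JSON file would contain
loaded = json.loads(text)
print({name: parse_indices(spec) for name, spec in loaded[0].items()})
```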
predict#
Use a pretrained Chemprop model for prediction.
chemprop predict [-h] [--logfile [LOGFILE]] [-v] [-q]
[-s SMILES_COLUMNS [SMILES_COLUMNS ...]]
[-r REACTION_COLUMNS [REACTION_COLUMNS ...]]
[--no-header-row] [-n NUM_WORKERS] [-b BATCH_SIZE]
[--accelerator ACCELERATOR] [--devices DEVICES]
[--rxn-mode {REAC_PROD,REAC_PROD_BALANCE,REAC_DIFF,REAC_DIFF_BALANCE,PROD_DIFF,PROD_DIFF_BALANCE}]
[--multi-hot-atom-featurizer-mode {V1,V2,ORGANIC,RIGR}]
[--keep-h] [--add-h] [--ignore-stereo] [--reorder-atoms]
[--molecule-featurizers {morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} [{morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} ...]]
[--descriptors-path DESCRIPTORS_PATH]
[--descriptors-columns DESCRIPTORS_COLUMNS [DESCRIPTORS_COLUMNS ...]]
[--no-descriptor-scaling] [--no-atom-feature-scaling]
[--no-atom-descriptor-scaling] [--no-bond-feature-scaling]
[--no-bond-descriptor-scaling]
[--atom-features-path ATOM_FEATURES_PATH [ATOM_FEATURES_PATH ...]]
[--atom-descriptors-path ATOM_DESCRIPTORS_PATH [ATOM_DESCRIPTORS_PATH ...]]
[--bond-features-path BOND_FEATURES_PATH [BOND_FEATURES_PATH ...]]
[--bond-descriptors-path BOND_DESCRIPTORS_PATH [BOND_DESCRIPTORS_PATH ...]]
[--constraints-path CONSTRAINTS_PATH]
[--constraints-to-targets CONSTRAINTS_TO_TARGETS [CONSTRAINTS_TO_TARGETS ...]]
[--use-cuikmolmaker-featurization] -i TEST_PATH [-o OUTPUT]
[--drop-extra-columns] --model-paths MODEL_PATHS
[MODEL_PATHS ...] [--cal-path CAL_PATH]
[--uncertainty-method {none,mve,ensemble,classification,evidential-total,evidential-epistemic,evidential-aleatoric,dropout,classification-dirichlet,multiclass-dirichlet,quantile-regression}]
[--calibration-method {zscaling,zelikman-interval,mve-weighting,conformal-regression,platt,isotonic,conformal-multilabel,conformal-multiclass,conformal-adaptive,isotonic-multiclass}]
[--evaluation-methods {nll-regression,miscalibration_area,ence,spearman,conformal-coverage-regression,nll-classification,conformal-coverage-classification,nll-multiclass,conformal-coverage-multiclass} [{nll-regression,miscalibration_area,ence,spearman,conformal-coverage-regression,nll-classification,conformal-coverage-classification,nll-multiclass,conformal-coverage-multiclass} ...]]
[--uncertainty-dropout-p UNCERTAINTY_DROPOUT_P]
[--dropout-sampling-size DROPOUT_SAMPLING_SIZE]
[--calibration-interval-percentile CALIBRATION_INTERVAL_PERCENTILE]
[--conformal-alpha CONFORMAL_ALPHA]
[--cal-descriptors-path CAL_DESCRIPTORS_PATH [CAL_DESCRIPTORS_PATH ...]]
[--cal-atom-features-path CAL_ATOM_FEATURES_PATH [CAL_ATOM_FEATURES_PATH ...]]
[--cal-atom-descriptors-path CAL_ATOM_DESCRIPTORS_PATH [CAL_ATOM_DESCRIPTORS_PATH ...]]
[--cal-bond-features-path CAL_BOND_FEATURES_PATH [CAL_BOND_FEATURES_PATH ...]]
[--cal-bond-descriptors-path CAL_BOND_DESCRIPTORS_PATH [CAL_BOND_DESCRIPTORS_PATH ...]]
[--cal-constraints-path CAL_CONSTRAINTS_PATH]
Named Arguments#
- --logfile, --log
Path to which the log file should be written (specifying just the flag alone will automatically log to a file chemprop_logs/MODE/TIMESTAMP.log, where ‘MODE’ is the CLI mode chosen, e.g., chemprop_logs/MODE/2026-06-12T18-58-16.log)
- -v
Increase verbosity level to DEBUG
Default:
False
- -q
Decrease verbosity level to WARNING, or to ERROR if specified twice
Default:
0
- --accelerator
Passed directly to the lightning Trainer()
Default:
'auto'
- --devices
Passed directly to the lightning Trainer() (must be a single string of comma separated devices, e.g. ‘1, 2’ if specifying multiple devices)
Default:
'auto'
- --constraints-path
Path to a CSV file containing the constraints for atomic/bond properties prediction. The file should have one column for each property being constrained with no SMILES column. The order of the rows should match the order of the SMILES in the input CSV. See also --constraints-to-targets for how to specify which constraint applies to which prediction.
- --constraints-to-targets
The column names of the atom or bond targets that correspond to each constraint column in the constraints CSV.
- -i, --test-path
Path to an input CSV file containing SMILES
- -o, --output, --preds-path
Specify path to which predictions will be saved. If the file extension is .pkl, it will be saved as a pickle file. Otherwise, Chemprop will save predictions as a CSV. If multiple models are used to make predictions, the average predictions will be saved in the file, and another file ending in ‘_individual’ with the same file extension will save the predictions for each individual model, with the column names being the target names appended with the model index (e.g., ‘_model_<index>’).
- --drop-extra-columns
Whether to drop all columns from the test data file besides the SMILES columns and the new prediction columns
Default:
False
- --model-paths, --model-path
Location of checkpoint(s) or model file(s) to use for prediction. It can be a path to either a single pretrained model checkpoint (.ckpt) or single pretrained model file (.pt), a directory that contains these files, or a list of path(s) and directory(s). If a directory, will recursively search and predict on all found (.pt) models.
- --cal-constraints-path
Path to constraints applied to atomic/bond properties prediction for the calibration set.
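The naming rule given for -o/--output (a second file ending in ‘_individual’ with the same extension when several models are used) can be pictured with pathlib. The helper individual_path and the file names are hypothetical, and the exact placement of the suffix is an interpretation of the description above rather than a confirmed detail of Chemprop's output code.

```python
from pathlib import Path


def individual_path(output):
    """Derive the per-model predictions path from the averaged-predictions path."""
    p = Path(output)
    # Insert '_individual' before the extension, keeping the extension unchanged.
    return p.with_name(f"{p.stem}_individual{p.suffix}")


print(individual_path("preds.csv").name)
print(individual_path("results/preds.pkl").name)
```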
Dataloader args#
- -n, --num-workers
Number of workers for parallel data loading where 0 means sequential (Warning: setting num_workers to a value greater than 0 can cause hangs on Windows and MacOS)
Default:
0
- -b, --batch-size
Batch size
Default:
64
Featurization args#
- --rxn-mode, --reaction-mode
Possible choices: REAC_PROD, REAC_PROD_BALANCE, REAC_DIFF, REAC_DIFF_BALANCE, PROD_DIFF, PROD_DIFF_BALANCE
Choices for construction of atom and bond features for reactions (case insensitive):
REAC_PROD: concatenates the reactants feature with the products feature
REAC_DIFF: concatenates the reactants feature with the difference in features between reactants and products (Default)
PROD_DIFF: concatenates the products feature with the difference in features between reactants and products
REAC_PROD_BALANCE: concatenates the reactants feature with the products feature, balances imbalanced reactions
REAC_DIFF_BALANCE: concatenates the reactants feature with the difference in features between reactants and products, balances imbalanced reactions
PROD_DIFF_BALANCE: concatenates the products feature with the difference in features between reactants and products, balances imbalanced reactions
Default:
REAC_DIFF
- --multi-hot-atom-featurizer-mode
Possible choices: V1, V2, ORGANIC, RIGR
Selects the multi-hot atom featurization scheme. This affects both non-reaction and reaction featurization (case insensitive):
V1: Corresponds to the original configuration employed in Chemprop V1
V2: Tailored for a broad range of molecules, this configuration encompasses all elements in the first four rows of the periodic table, along with iodine. It is the default in Chemprop V2.
ORGANIC: This configuration is designed specifically for use with organic molecules for drug research and development and includes a subset of elements most common in organic chemistry, including H, B, C, N, O, F, Si, P, S, Cl, Br, and I.
RIGR: Modified V2 (default) featurizer using only the resonance-invariant atom and bond features.
Default:
V2
- --keep-h
Whether hydrogens explicitly specified in input should be kept in the mol graph
Default:
False
- --add-h
Whether hydrogens should be added to the mol graph
Default:
False
- --molecule-featurizers, --features-generators
Possible choices: morgan_binary, morgan_count, rdkit_2d, v1_rdkit_2d, v1_rdkit_2d_normalized, charge
Method(s) of generating molecule features to use as extra descriptors
- --descriptors-path
Path to extra descriptors to concatenate to learned representation
- --descriptors-columns
Column names in the input CSV containing extra datapoint descriptors, like temperature and pressure. See also –descriptors-path.
- --no-descriptor-scaling
Turn off extra descriptor scaling
Default:
False- --no-atom-feature-scaling
Turn off extra atom feature scaling
Default:
False- --no-atom-descriptor-scaling
Turn off extra atom descriptor scaling
Default:
False- --no-bond-feature-scaling
Turn off extra bond feature scaling
Default:
False- --no-bond-descriptor-scaling
Turn off extra bond descriptor scaling
Default:
False- --atom-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom features to supply before message passing (e.g.,
--atom-features-path 0 /path/to/features_0.npz) indicates that the features at the given path should be supplied to the 0-th component. To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g.,--atom-features-path 0 path_zero --atom-features-path 1 path_one) or pass each two-tuple in a series (e.g.,--atom-features-path 0 path_zero 1 path_one).- --atom-descriptors-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom descriptors to supply after message passing (e.g.,
--atom-descriptors-path 0 /path/to/descriptors_0.npzindicates that the descriptors at the given path should be supplied to the 0-th component. To supply additional descriptors for multiple components, repeat this argument on the command line for each component’s respective values (e.g.,--atom-descriptors-path 0 path_zero --atom-descriptors-path 1 path_one) or pass each two-tuple in a series (e.g.,--atom-descriptors-path 0 path_zero 1 path_one).- --bond-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional bond features to supply before message passing (e.g.,
--bond-features-path 0 /path/to/features_0.npzindicates that the features at the given path should be supplied to the 0-th component. To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g.,--bond-features-path 0 path_zero --bond-features-path 1 path_one) or pass each two-tuple in a series (e.g.,--bond-features-path 0 path_zero 1 path_one).- --bond-descriptors-path
Path to additional bond descriptors to use with the learned bond representations after message passing. The file follows the same format as --atom-descriptors-path, i.e., the file is created using np.savez('bond_descriptors.npz', *E_ds), where E_ds is a list of 2D numpy arrays with shape n_bonds x n_descriptors.
- --use-cuikmolmaker-featurization
Use the cuik-molmaker package for accelerated atom and bond featurization.
Default:
False
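The .npz files expected by the feature and descriptor path options can be produced with NumPy. The sketch below follows the np.savez(..., *E_ds) pattern quoted for --bond-descriptors-path; the file location, molecule sizes, and descriptor count are illustrative assumptions, as is the idea that the arrays are stored in the same order as the rows of the input CSV.

```python
import os
import tempfile

import numpy as np

# Hypothetical data: two molecules with 3 and 5 bonds, 4 descriptors per bond.
n_descriptors = 4
E_ds = [np.random.rand(3, n_descriptors), np.random.rand(5, n_descriptors)]

# One positional array per molecule (assumed: same order as the input CSV rows).
path = os.path.join(tempfile.mkdtemp(), "bond_descriptors.npz")
np.savez(path, *E_ds)

# np.savez names positional arrays arr_0, arr_1, ... in the order given.
with np.load(path) as data:
    loaded = [data[f"arr_{i}"] for i in range(len(data.files))]
```

The resulting file could then be passed on the command line, for example as --bond-descriptors-path followed by the path written above.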
Uncertainty and calibration args#
- --cal-path
Path to data file to be used for uncertainty calibration.
- --uncertainty-method
Possible choices: none, mve, ensemble, classification, evidential-total, evidential-epistemic, evidential-aleatoric, dropout, classification-dirichlet, multiclass-dirichlet, quantile-regression
The method of calculating uncertainty.
Default:
'none'
- --calibration-method
Possible choices: zscaling, zelikman-interval, mve-weighting, conformal-regression, platt, isotonic, conformal-multilabel, conformal-multiclass, conformal-adaptive, isotonic-multiclass
The method used to calibrate the uncertainty calculated with the chosen uncertainty method (see --uncertainty-method).
- --evaluation-methods, --evaluation-method
Possible choices: nll-regression, miscalibration_area, ence, spearman, conformal-coverage-regression, nll-classification, conformal-coverage-classification, nll-multiclass, conformal-coverage-multiclass
The methods used for evaluating the uncertainty performance if the test data provided includes targets. Available methods are [nll, miscalibration_area, ence, spearman] or any available classification or multiclass metric.
- --uncertainty-dropout-p
The probability to use for Monte Carlo dropout uncertainty estimation.
Default:
0.1
- --dropout-sampling-size
The number of samples to use for Monte Carlo dropout uncertainty estimation. Distinct from the dropout used during training.
Default:
10
- --calibration-interval-percentile
Sets the percentile used in the calibration methods. Must be in the range (1, 100).
Default:
95
- --conformal-alpha
Target error rate for conformal prediction. Must be in the range (0, 1).
Default:
0.1
- --cal-descriptors-path
Path to extra descriptors to concatenate to learned representation in calibration dataset.
- --cal-atom-features-path
Path to the extra atom features in calibration dataset.
- --cal-atom-descriptors-path
Path to the extra atom descriptors in calibration dataset.
- --cal-bond-features-path
Path to the extra bond features in calibration dataset.
- --cal-bond-descriptors-path
Path to the extra bond descriptors in calibration dataset.
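To make the role of --conformal-alpha more concrete, here is a minimal sketch of split conformal prediction for regression, written with NumPy. It is a generic textbook version and not Chemprop's implementation; the function name and the toy calibration data are assumptions made for illustration.

```python
import math

import numpy as np


def conformal_half_width(cal_targets, cal_preds, alpha=0.1):
    """Half-width q such that [pred - q, pred + q] targets 1 - alpha coverage."""
    residuals = np.sort(np.abs(np.asarray(cal_targets) - np.asarray(cal_preds)))
    n = len(residuals)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)), capped at n.
    rank = min(math.ceil((n + 1) * (1 - alpha)), n)
    return float(residuals[rank - 1])


# Toy calibration set (hypothetical values): absolute residuals are 0.1 .. 1.0.
cal_preds = np.zeros(10)
cal_targets = np.arange(1, 11) / 10.0
q = conformal_half_width(cal_targets, cal_preds, alpha=0.2)
lower, upper = 2.5 - q, 2.5 + q  # interval around a new prediction of 2.5
```

A smaller alpha asks for higher coverage and therefore yields a wider interval from the same calibration residuals.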
convert#
Convert model checkpoint (.pt) to more recent version (.pt).
chemprop convert [-h] [--logfile [LOGFILE]] [-v] [-q]
[-c {v1_to_v2,v2_0_to_v2_1}] -i INPUT_PATH [-o OUTPUT_PATH]
Named Arguments#
- --logfile, --log
Path to which the log file should be written (specifying just the flag alone will automatically log to a file chemprop_logs/MODE/TIMESTAMP.log, where ‘MODE’ is the CLI mode chosen, e.g., chemprop_logs/MODE/2026-06-12T18-58-16.log)
- -v
Increase verbosity level to DEBUG
Default:
False
- -q
Decrease verbosity level to WARNING or ERROR if specified twice
Default:
0
- -c, --conversion
Possible choices: v1_to_v2, v2_0_to_v2_1
Conversion to perform. Models converted from v1 to v2 must be run with the v1 featurizer via --multi-hot-atom-featurizer-mode v1.
Default:
'v1_to_v2'
- -i, --input-path
Path to a model .pt checkpoint file
- -o, --output-path
Path to which the converted model will be saved (CURRENT_DIRECTORY/STEM_OF_INPUT_newversion.pt by default)
fingerprint#
Use a pretrained Chemprop model to calculate learned representations.
chemprop fingerprint [-h] [--logfile [LOGFILE]] [-v] [-q]
[-s SMILES_COLUMNS [SMILES_COLUMNS ...]]
[-r REACTION_COLUMNS [REACTION_COLUMNS ...]]
[--no-header-row] [-n NUM_WORKERS] [-b BATCH_SIZE]
[--accelerator ACCELERATOR] [--devices DEVICES]
[--rxn-mode {REAC_PROD,REAC_PROD_BALANCE,REAC_DIFF,REAC_DIFF_BALANCE,PROD_DIFF,PROD_DIFF_BALANCE}]
[--multi-hot-atom-featurizer-mode {V1,V2,ORGANIC,RIGR}]
[--keep-h] [--add-h] [--ignore-stereo] [--reorder-atoms]
[--molecule-featurizers {morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} [{morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} ...]]
[--descriptors-path DESCRIPTORS_PATH]
[--descriptors-columns DESCRIPTORS_COLUMNS [DESCRIPTORS_COLUMNS ...]]
[--no-descriptor-scaling] [--no-atom-feature-scaling]
[--no-atom-descriptor-scaling]
[--no-bond-feature-scaling]
[--no-bond-descriptor-scaling]
[--atom-features-path ATOM_FEATURES_PATH [ATOM_FEATURES_PATH ...]]
[--atom-descriptors-path ATOM_DESCRIPTORS_PATH [ATOM_DESCRIPTORS_PATH ...]]
[--bond-features-path BOND_FEATURES_PATH [BOND_FEATURES_PATH ...]]
[--bond-descriptors-path BOND_DESCRIPTORS_PATH [BOND_DESCRIPTORS_PATH ...]]
[--constraints-path CONSTRAINTS_PATH]
[--constraints-to-targets CONSTRAINTS_TO_TARGETS [CONSTRAINTS_TO_TARGETS ...]]
[--use-cuikmolmaker-featurization] -i TEST_PATH
[-o OUTPUT] --model-paths MODEL_PATHS [MODEL_PATHS ...]
--ffn-block-index FFN_BLOCK_INDEX
Named Arguments#
- --logfile, --log
Path to which the log file should be written (specifying just the flag alone will automatically log to a file chemprop_logs/MODE/TIMESTAMP.log, where ‘MODE’ is the CLI mode chosen, e.g., chemprop_logs/MODE/2026-06-12T18-58-16.log)
- -v
Increase verbosity level to DEBUG
Default:
False
- -q
Decrease verbosity level to WARNING or ERROR if specified twice
Default:
0
- --accelerator
Passed directly to the lightning Trainer()
Default:
'auto'
- --devices
Passed directly to the lightning Trainer() (must be a single string of comma separated devices, e.g. ‘1, 2’ if specifying multiple devices)
Default:
'auto'
- --constraints-path
Path to a CSV file containing the constraints for atomic/bond properties prediction. The file should have one column for each property being constrained with no SMILES column. The order of the rows should match the order of the SMILES in the input CSV. See also --constraints-to-targets for how to specify which constraint applies to which prediction.
- --constraints-to-targets
The column names of the atom or bond targets that correspond to each constraint column in the constraints CSV.
- -i, --test-path
Path to an input CSV file containing SMILES
- -o, --output, --preds-path
Specify the path where predictions will be saved. If the file extension is .npz, they will be saved as a npz file. Otherwise, the predictions will be saved as a CSV. The index of the model will be appended to the filename’s stem. By default, predictions will be saved to the same location as --test-path with ‘_fps’ appended (e.g., ‘PATH/TO/TEST_PATH_fps_0.csv’).
- --model-paths, --model-path
Specify location of checkpoint(s) or model file(s) to use for prediction. It can be a path to either a single pretrained model checkpoint (.ckpt) or single pretrained model file (.pt), a directory that contains these files, or a list of path(s) and directory(s). If a directory, chemprop will recursively search and predict on all found (.pt) models.
- --ffn-block-index
The index indicates which linear layer returns the encoding in the FFN. An index of 0 denotes the post-aggregation representation through a 0-layer MLP, while an index of 1 represents the output from the first linear layer in the FFN, and so forth.
Default:
-1
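The naming rule described for -o/--output can be pictured with a small pathlib sketch. The helper below is hypothetical and only mirrors the wording of the option description ('_fps' appended by default, model index appended to the stem); the .csv default extension is inferred from the example path, and the helper is not taken from the Chemprop source.

```python
from pathlib import Path


def fingerprint_output_path(test_path, model_index, output=None):
    """Hypothetical helper mirroring the naming rule described for -o/--output."""
    if output is None:
        src = Path(test_path)
        # Default: same location as the input, with '_fps' appended to the stem.
        base = src.with_name(f"{src.stem}_fps.csv")
    else:
        base = Path(output)
    # The model index is appended to the stem; the extension (.csv or .npz) is kept.
    return base.with_name(f"{base.stem}_{model_index}{base.suffix}")


default_path = fingerprint_output_path("PATH/TO/TEST_PATH.csv", 0)
npz_path = fingerprint_output_path("data/test.csv", 1, output="out/fps.npz")
```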
Dataloader args#
- -n, --num-workers
Number of workers for parallel data loading where 0 means sequential (Warning: setting num_workers to a value greater than 0 can cause hangs on Windows and MacOS)
Default:
0
- -b, --batch-size
Batch size
Default:
64
Featurization args#
- --rxn-mode, --reaction-mode
Possible choices: REAC_PROD, REAC_PROD_BALANCE, REAC_DIFF, REAC_DIFF_BALANCE, PROD_DIFF, PROD_DIFF_BALANCE
Choices for construction of atom and bond features for reactions (case insensitive):
REAC_PROD: concatenates the reactants feature with the products feature
REAC_DIFF: concatenates the reactants feature with the difference in features between reactants and products (Default)
PROD_DIFF: concatenates the products feature with the difference in features between reactants and products
REAC_PROD_BALANCE: concatenates the reactants feature with the products feature, balances imbalanced reactions
REAC_DIFF_BALANCE: concatenates the reactants feature with the difference in features between reactants and products, balances imbalanced reactions
PROD_DIFF_BALANCE: concatenates the products feature with the difference in features between reactants and products, balances imbalanced reactions
Default:
REAC_DIFF
- --multi-hot-atom-featurizer-mode
Possible choices: V1, V2, ORGANIC, RIGR
Selects the multi-hot atom featurization scheme. This affects both non-reaction and reaction featurization (case insensitive):
V1: Corresponds to the original configuration employed in Chemprop V1
V2: Tailored for a broad range of molecules, this configuration encompasses all elements in the first four rows of the periodic table, along with iodine. It is the default in Chemprop V2.
ORGANIC: This configuration is designed specifically for use with organic molecules for drug research and development and includes a subset of elements most common in organic chemistry, including H, B, C, N, O, F, Si, P, S, Cl, Br, and I.
RIGR: Modified V2 (default) featurizer using only the resonance-invariant atom and bond features.
Default:
V2
- --keep-h
Whether hydrogens explicitly specified in input should be kept in the mol graph
Default:
False
- --add-h
Whether hydrogens should be added to the mol graph
Default:
False
- --molecule-featurizers, --features-generators
Possible choices: morgan_binary, morgan_count, rdkit_2d, v1_rdkit_2d, v1_rdkit_2d_normalized, charge
Method(s) of generating molecule features to use as extra descriptors
- --descriptors-path
Path to extra descriptors to concatenate to learned representation
- --descriptors-columns
Column names in the input CSV containing extra datapoint descriptors, like temperature and pressure. See also --descriptors-path.
- --no-descriptor-scaling
Turn off extra descriptor scaling
Default:
False
- --no-atom-feature-scaling
Turn off extra atom feature scaling
Default:
False
- --no-atom-descriptor-scaling
Turn off extra atom descriptor scaling
Default:
False
- --no-bond-feature-scaling
Turn off extra bond feature scaling
Default:
False
- --no-bond-descriptor-scaling
Turn off extra bond descriptor scaling
Default:
False
- --atom-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom features to supply before message passing (e.g., --atom-features-path 0 /path/to/features_0.npz indicates that the features at the given path should be supplied to the 0-th component). To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --atom-features-path 0 path_zero --atom-features-path 1 path_one) or pass each two-tuple in a series (e.g., --atom-features-path 0 path_zero 1 path_one).
- --atom-descriptors-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom descriptors to supply after message passing (e.g., --atom-descriptors-path 0 /path/to/descriptors_0.npz indicates that the descriptors at the given path should be supplied to the 0-th component). To supply additional descriptors for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --atom-descriptors-path 0 path_zero --atom-descriptors-path 1 path_one) or pass each two-tuple in a series (e.g., --atom-descriptors-path 0 path_zero 1 path_one).
- --bond-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional bond features to supply before message passing (e.g., --bond-features-path 0 /path/to/features_0.npz indicates that the features at the given path should be supplied to the 0-th component). To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --bond-features-path 0 path_zero --bond-features-path 1 path_one) or pass each two-tuple in a series (e.g., --bond-features-path 0 path_zero 1 path_one).
- --bond-descriptors-path
Path to additional bond descriptors to use with the learned bond representations after message passing. The file follows the same format as --atom-descriptors-path, i.e., the file is created using np.savez('bond_descriptors.npz', *E_ds), where E_ds is a list of 2D numpy arrays with shape n_bonds x n_descriptors.
- --use-cuikmolmaker-featurization
Use the cuik-molmaker package for accelerated atom and bond featurization.
Default:
False
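The reaction modes listed for --rxn-mode differ only in which pair of feature vectors is concatenated. The NumPy sketch below illustrates that idea for a single atom. The function name, the sign convention of the difference (products minus reactants), and the toy vectors are assumptions, and the balancing step of the *_BALANCE modes is not modeled.

```python
import numpy as np


def combine_reaction_features(f_reac, f_prod, mode="REAC_DIFF"):
    """Concatenate per-atom feature vectors according to a reaction mode."""
    mode = mode.upper()  # the CLI treats mode names as case insensitive
    diff = f_prod - f_reac  # assumed sign convention: products minus reactants
    if mode == "REAC_PROD":
        return np.concatenate([f_reac, f_prod])
    if mode == "REAC_DIFF":
        return np.concatenate([f_reac, diff])
    if mode == "PROD_DIFF":
        return np.concatenate([f_prod, diff])
    raise ValueError(f"unsupported mode: {mode}")


f_reac = np.array([1.0, 0.0, 2.0])  # toy features of an atom in the reactants
f_prod = np.array([1.0, 1.0, 0.0])  # toy features of the same atom in the products
combined = combine_reaction_features(f_reac, f_prod, "reac_diff")
```

In every mode the combined vector is twice as long as a single feature vector, which is why the choice of mode does not change the input width of the model.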
hpopt#
Perform hyperparameter optimization on the given task.
chemprop hpopt [-h] [--logfile [LOGFILE]] [-v] [-q]
[-s SMILES_COLUMNS [SMILES_COLUMNS ...]]
[-r REACTION_COLUMNS [REACTION_COLUMNS ...]] [--no-header-row]
[-n NUM_WORKERS] [-b BATCH_SIZE] [--accelerator ACCELERATOR]
[--devices DEVICES]
[--rxn-mode {REAC_PROD,REAC_PROD_BALANCE,REAC_DIFF,REAC_DIFF_BALANCE,PROD_DIFF,PROD_DIFF_BALANCE}]
[--multi-hot-atom-featurizer-mode {V1,V2,ORGANIC,RIGR}]
[--keep-h] [--add-h] [--ignore-stereo] [--reorder-atoms]
[--molecule-featurizers {morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} [{morgan_binary,morgan_count,rdkit_2d,v1_rdkit_2d,v1_rdkit_2d_normalized,charge} ...]]
[--descriptors-path DESCRIPTORS_PATH]
[--descriptors-columns DESCRIPTORS_COLUMNS [DESCRIPTORS_COLUMNS ...]]
[--no-descriptor-scaling] [--no-atom-feature-scaling]
[--no-atom-descriptor-scaling] [--no-bond-feature-scaling]
[--no-bond-descriptor-scaling]
[--atom-features-path ATOM_FEATURES_PATH [ATOM_FEATURES_PATH ...]]
[--atom-descriptors-path ATOM_DESCRIPTORS_PATH [ATOM_DESCRIPTORS_PATH ...]]
[--bond-features-path BOND_FEATURES_PATH [BOND_FEATURES_PATH ...]]
[--bond-descriptors-path BOND_DESCRIPTORS_PATH [BOND_DESCRIPTORS_PATH ...]]
[--constraints-path CONSTRAINTS_PATH]
[--constraints-to-targets CONSTRAINTS_TO_TARGETS [CONSTRAINTS_TO_TARGETS ...]]
[--use-cuikmolmaker-featurization] [--config-path CONFIG_PATH]
[-i [DATA_PATH ...]] [-o OUTPUT_DIR] [--remove-checkpoints]
[--checkpoint CHECKPOINT [CHECKPOINT ...]] [--freeze-encoder]
[--model-frzn MODEL_FRZN] [--frzn-ffn-layers FRZN_FFN_LAYERS]
[--from-foundation FROM_FOUNDATION]
[--ensemble-size ENSEMBLE_SIZE]
[--message-hidden-dim MESSAGE_HIDDEN_DIM [MESSAGE_HIDDEN_DIM ...]]
[--message-bias] [--depth DEPTH [DEPTH ...]] [--undirected]
[--dropout DROPOUT] [--mpn-shared]
[--aggregation {mean,sum,norm}]
[--aggregation-norm AGGREGATION_NORM] [--atom-messages]
[--activation {RELU,CELU,ELU,GELU,GLU,HARDSHRINK,HARDSIGMOID,HARDSWISH,HARDTANH,LEAKYRELU,LOGSIGMOID,LOGSOFTMAX,MISH,MULTIHEADATTENTION,PRELU,RRELU,RELU6,SILU,SIGMOID,SOFTMAX,SOFTMAX2D,SOFTMIN,SOFTPLUS,SOFTSHRINK,SOFTSIGN,TANH,TANHSHRINK,THRESHOLD}]
[--activation-args [ACTIVATION_ARGS ...]]
[--ffn-hidden-dim FFN_HIDDEN_DIM [FFN_HIDDEN_DIM ...]]
[--ffn-num-layers FFN_NUM_LAYERS] [--batch-norm]
[--multiclass-num-classes MULTICLASS_NUM_CLASSES]
[--atom-task-weights ATOM_TASK_WEIGHTS [ATOM_TASK_WEIGHTS ...]]
[--atom-ffn-hidden-dim ATOM_FFN_HIDDEN_DIM [ATOM_FFN_HIDDEN_DIM ...]]
[--atom-ffn-num-layers ATOM_FFN_NUM_LAYERS]
[--atom-multiclass-num-classes ATOM_MULTICLASS_NUM_CLASSES]
[--bond-task-weights BOND_TASK_WEIGHTS [BOND_TASK_WEIGHTS ...]]
[--bond-ffn-hidden-dim BOND_FFN_HIDDEN_DIM [BOND_FFN_HIDDEN_DIM ...]]
[--bond-ffn-num-layers BOND_FFN_NUM_LAYERS]
[--bond-multiclass-num-classes BOND_MULTICLASS_NUM_CLASSES]
[--atom-constrainer-ffn-hidden-dim ATOM_CONSTRAINER_FFN_HIDDEN_DIM [ATOM_CONSTRAINER_FFN_HIDDEN_DIM ...]]
[--atom-constrainer-ffn-num-layers ATOM_CONSTRAINER_FFN_NUM_LAYERS]
[--bond-constrainer-ffn-hidden-dim BOND_CONSTRAINER_FFN_HIDDEN_DIM [BOND_CONSTRAINER_FFN_HIDDEN_DIM ...]]
[--bond-constrainer-ffn-num-layers BOND_CONSTRAINER_FFN_NUM_LAYERS]
[-w WEIGHT_COLUMN]
[--target-columns TARGET_COLUMNS [TARGET_COLUMNS ...]]
[--mol-target-columns MOL_TARGET_COLUMNS [MOL_TARGET_COLUMNS ...]]
[--atom-target-columns ATOM_TARGET_COLUMNS [ATOM_TARGET_COLUMNS ...]]
[--bond-target-columns BOND_TARGET_COLUMNS [BOND_TARGET_COLUMNS ...]]
[--ignore-columns IGNORE_COLUMNS [IGNORE_COLUMNS ...]]
[--no-cache] [--splits-column SPLITS_COLUMN]
[-t {regression,regression-mve,regression-evidential,regression-quantile,classification,classification-dirichlet,multiclass,multiclass-dirichlet,spectral}]
[-l {mse,mae,rmse,bounded-mse,bounded-mae,bounded-rmse,mve,evidential,quantile-point,pinball-point,bce,ce,binary-mcc,multiclass-mcc,dirichlet,sid,earthmovers,wasserstein,quantile,pinball,nlogprob_enrichment}]
[--v-kl V_KL] [--eps EPS] [--alpha ALPHA]
[--metrics {mse,mae,rmse,bounded-mse,bounded-mae,bounded-rmse,r2,binary-mcc,multiclass-mcc,roc,prc,accuracy,f1} [{mse,mae,rmse,bounded-mse,bounded-mae,bounded-rmse,r2,binary-mcc,multiclass-mcc,roc,prc,accuracy,f1} ...]]
[--tracking-metric TRACKING_METRIC] [--show-individual-scores]
[--task-weights TASK_WEIGHTS [TASK_WEIGHTS ...]]
[--warmup-epochs WARMUP_EPOCHS] [--init-lr INIT_LR]
[--max-lr MAX_LR] [--final-lr FINAL_LR] [--epochs EPOCHS]
[--patience PATIENCE] [--min-delta MIN_DELTA]
[--grad-clip GRAD_CLIP] [--class-balance]
[--split {SCAFFOLD_BALANCED,RANDOM_WITH_REPEATED_SMILES,RANDOM,KENNARD_STONE,KMEANS}]
[--split-sizes SPLIT_SIZES SPLIT_SIZES SPLIT_SIZES]
[--split-key-molecule SPLIT_KEY_MOLECULE]
[--num-replicates NUM_REPLICATES] [-k NUM_FOLDS]
[--save-smiles-splits] [--save-data-splits]
[--splits-file SPLITS_FILE] [--data-seed DATA_SEED]
[--pytorch-seed PYTORCH_SEED]
[--search-parameter-keywords SEARCH_PARAMETER_KEYWORDS [SEARCH_PARAMETER_KEYWORDS ...]]
[--hpopt-save-dir HPOPT_SAVE_DIR]
[--raytune-num-samples RAYTUNE_NUM_SAMPLES]
[--raytune-search-algorithm {random,hyperopt,optuna}]
[--raytune-trial-scheduler {FIFO,AsyncHyperBand}]
[--raytune-num-workers RAYTUNE_NUM_WORKERS] [--raytune-use-gpu]
[--raytune-num-checkpoints-to-keep RAYTUNE_NUM_CHECKPOINTS_TO_KEEP]
[--raytune-grace-period RAYTUNE_GRACE_PERIOD]
[--raytune-reduction-factor RAYTUNE_REDUCTION_FACTOR]
[--raytune-temp-dir RAYTUNE_TEMP_DIR]
[--raytune-num-cpus RAYTUNE_NUM_CPUS]
[--raytune-num-gpus RAYTUNE_NUM_GPUS]
[--raytune-max-concurrent-trials RAYTUNE_MAX_CONCURRENT_TRIALS]
[--hyperopt-n-initial-points HYPEROPT_N_INITIAL_POINTS]
[--hyperopt-random-state-seed HYPEROPT_RANDOM_STATE_SEED]
Named Arguments#
- --logfile, --log
Path to which the log file should be written (specifying just the flag alone will automatically log to a file chemprop_logs/MODE/TIMESTAMP.log, where ‘MODE’ is the CLI mode chosen, e.g., chemprop_logs/MODE/2026-06-12T18-58-16.log)
- -v
Increase verbosity level to DEBUG
Default:
False
- -q
Decrease verbosity level to WARNING or ERROR if specified twice
Default:
0
- --accelerator
Passed directly to the lightning Trainer()
Default:
'auto'
- --devices
Passed directly to the lightning Trainer() (must be a single string of comma separated devices, e.g. ‘1, 2’ if specifying multiple devices)
Default:
'auto'
- --constraints-path
Path to a CSV file containing the constraints for atomic/bond properties prediction. The file should have one column for each property being constrained with no SMILES column. The order of the rows should match the order of the SMILES in the input CSV. See also --constraints-to-targets for how to specify which constraint applies to which prediction.
- --constraints-to-targets
The column names of the atom or bond targets that correspond to each constraint column in the constraints CSV.
- --config-path
Path to a configuration file (command line arguments override values in the configuration file)
- -i, --data-path
Path to one, two, or three input CSV files containing SMILES and the associated target values. If one data file is supplied, it will undergo train-val-test split; if two are supplied, the first will undergo train-val split and the second will be taken as test data; if three are supplied, they will be taken as train, val, test data respectively
- -o, --output-dir, --save-dir
Directory where training outputs will be saved (defaults to CURRENT_DIRECTORY/chemprop_training/STEM_OF_INPUT/TIME_STAMP)
- --remove-checkpoints
Remove intermediate checkpoint files after training is complete.
Default:
False
- --ensemble-size
Number of models in ensemble for each splitting of data
Default:
1
- --atom-multiclass-num-classes
Number of classes for atom targets when running multiclass classification
Default:
3
- --bond-multiclass-num-classes
Number of classes for bond targets when running multiclass classification
Default:
3
- --pytorch-seed
Seed for PyTorch randomness (e.g., random initial weights)
Dataloader args#
- -n, --num-workers
Number of workers for parallel data loading where 0 means sequential (Warning: setting num_workers to a value greater than 0 can cause hangs on Windows and MacOS)
Default:
0
- -b, --batch-size
Batch size
Default:
64
Featurization args#
- --rxn-mode, --reaction-mode
Possible choices: REAC_PROD, REAC_PROD_BALANCE, REAC_DIFF, REAC_DIFF_BALANCE, PROD_DIFF, PROD_DIFF_BALANCE
Choices for construction of atom and bond features for reactions (case insensitive):
REAC_PROD: concatenates the reactants feature with the products feature
REAC_DIFF: concatenates the reactants feature with the difference in features between reactants and products (Default)
PROD_DIFF: concatenates the products feature with the difference in features between reactants and products
REAC_PROD_BALANCE: concatenates the reactants feature with the products feature, balances imbalanced reactions
REAC_DIFF_BALANCE: concatenates the reactants feature with the difference in features between reactants and products, balances imbalanced reactions
PROD_DIFF_BALANCE: concatenates the products feature with the difference in features between reactants and products, balances imbalanced reactions
Default:
REAC_DIFF
- --multi-hot-atom-featurizer-mode
Possible choices: V1, V2, ORGANIC, RIGR
Selects the multi-hot atom featurization scheme. This affects both non-reaction and reaction featurization (case insensitive):
V1: Corresponds to the original configuration employed in Chemprop V1
V2: Tailored for a broad range of molecules, this configuration encompasses all elements in the first four rows of the periodic table, along with iodine. It is the default in Chemprop V2.
ORGANIC: This configuration is designed specifically for use with organic molecules for drug research and development and includes a subset of elements most common in organic chemistry, including H, B, C, N, O, F, Si, P, S, Cl, Br, and I.
RIGR: Modified V2 (default) featurizer using only the resonance-invariant atom and bond features.
Default:
V2
- --keep-h
Whether hydrogens explicitly specified in input should be kept in the mol graph
Default:
False
- --add-h
Whether hydrogens should be added to the mol graph
Default:
False
- --molecule-featurizers, --features-generators
Possible choices: morgan_binary, morgan_count, rdkit_2d, v1_rdkit_2d, v1_rdkit_2d_normalized, charge
Method(s) of generating molecule features to use as extra descriptors
- --descriptors-path
Path to extra descriptors to concatenate to learned representation
- --descriptors-columns
Column names in the input CSV containing extra datapoint descriptors, like temperature and pressure. See also --descriptors-path.
- --no-descriptor-scaling
Turn off extra descriptor scaling
Default:
False
- --no-atom-feature-scaling
Turn off extra atom feature scaling
Default:
False
- --no-atom-descriptor-scaling
Turn off extra atom descriptor scaling
Default:
False
- --no-bond-feature-scaling
Turn off extra bond feature scaling
Default:
False
- --no-bond-descriptor-scaling
Turn off extra bond descriptor scaling
Default:
False
- --atom-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom features to supply before message passing (e.g., --atom-features-path 0 /path/to/features_0.npz indicates that the features at the given path should be supplied to the 0-th component). To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --atom-features-path 0 path_zero --atom-features-path 1 path_one) or pass each two-tuple in a series (e.g., --atom-features-path 0 path_zero 1 path_one).
- --atom-descriptors-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional atom descriptors to supply after message passing (e.g., --atom-descriptors-path 0 /path/to/descriptors_0.npz indicates that the descriptors at the given path should be supplied to the 0-th component). To supply additional descriptors for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --atom-descriptors-path 0 path_zero --atom-descriptors-path 1 path_one) or pass each two-tuple in a series (e.g., --atom-descriptors-path 0 path_zero 1 path_one).
- --bond-features-path
If a single path is given, it is assumed to correspond to the 0-th molecule. Alternatively, it can be a two-tuple of molecule index and path to additional bond features to supply before message passing (e.g., --bond-features-path 0 /path/to/features_0.npz indicates that the features at the given path should be supplied to the 0-th component). To supply additional features for multiple components, repeat this argument on the command line for each component’s respective values (e.g., --bond-features-path 0 path_zero --bond-features-path 1 path_one) or pass each two-tuple in a series (e.g., --bond-features-path 0 path_zero 1 path_one).
- --bond-descriptors-path
Path to additional bond descriptors to use with the learned bond representations after message passing. The file follows the same format as --atom-descriptors-path, i.e., the file is created using np.savez('bond_descriptors.npz', *E_ds), where E_ds is a list of 2D numpy arrays with shape n_bonds x n_descriptors.
- --use-cuikmolmaker-featurization
Use the cuik-molmaker package for accelerated atom and bond featurization.
Default:
False
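The single-path and index/path forms accepted by the feature and descriptor path options can be pictured as a mapping from component index to file path. The helper below is a hypothetical sketch of that interpretation and is not Chemprop's own argument parser.

```python
def pair_index_paths(values):
    """Map component index -> path for a flat list of CLI values."""
    if len(values) == 1:
        # A single path is assumed to belong to the 0-th molecule.
        return {0: values[0]}
    if len(values) % 2 != 0:
        raise ValueError("expected a single path or index/path pairs")
    # Consecutive (index, path) two-tuples, e.g. "0 path_zero 1 path_one".
    return {int(values[i]): values[i + 1] for i in range(0, len(values), 2)}


single = pair_index_paths(["features.npz"])
series = pair_index_paths(["0", "path_zero", "1", "path_one"])
```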
transfer learning args#
- --checkpoint
Path to checkpoint(s) or model file(s) for loading and overwriting weights. Accepts a single pre-trained model checkpoint (.ckpt), a single model file (.pt), a directory containing such files, or a list of paths and directories. If a directory is provided, it will recursively search for and use all (.pt) files found for prediction.
- --freeze-encoder
Freeze the message passing layer from the checkpoint model (specified by --checkpoint).
Default:
False
- --model-frzn
Path to model checkpoint file to be loaded for overwriting and freezing weights. By default, all MPNN weights are frozen with this option.
- --frzn-ffn-layers
Freeze the first n layers of the FFN from the checkpoint model (specified by --checkpoint). The message passing layer should also be frozen with --freeze-encoder.
Default:
0
- --from-foundation
Name of pretrained foundation model used to initialize message passing. One of: CHEMELEON, or a path to a local model file.
message passing#
- --message-hidden-dim
Hidden dimension of the messages (specify multiple values to customize multicomponent encoders separately)
Default:
[300]
- --message-bias
Add bias to the message passing layers
Default:
False
- --depth
Number of message passing steps (specify multiple values to customize multicomponent encoders separately)
Default:
[3]
- --undirected
Pass messages on undirected bonds/edges (always sum the two relevant bond vectors)
Default:
False
- --dropout
Dropout probability in message passing/FFN layers
Default:
0.0
- --mpn-shared
Whether to use the same message passing neural network for all input molecules (only relevant if number_of_molecules > 1)
Default:
False
- --aggregation, --agg
Possible choices: mean, sum, norm
Aggregation mode to use during graph predictor
Default:
'norm'
- --aggregation-norm
Normalization factor by which to divide summed up atomic features for norm aggregation
Default:
100
- --atom-messages
Pass messages on atoms rather than bonds.
Default:
False
- --activation
Possible choices: RELU, CELU, ELU, GELU, GLU, HARDSHRINK, HARDSIGMOID, HARDSWISH, HARDTANH, LEAKYRELU, LOGSIGMOID, LOGSOFTMAX, MISH, MULTIHEADATTENTION, PRELU, RRELU, RELU6, SILU, SIGMOID, SOFTMAX, SOFTMAX2D, SOFTMIN, SOFTPLUS, SOFTSHRINK, SOFTSIGN, TANH, TANHSHRINK, THRESHOLD
Activation function in message passing/FFN layers.
Default:
RELU
- --activation-args
Arguments for the activation function (Example: arg1 arg2 key1=value1 key2=value2).
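As a rough illustration of the three --aggregation choices, the sketch below pools a matrix of atom representations with NumPy. It assumes that mean averages over atoms, sum adds them up, and norm divides the sum by the value of --aggregation-norm, as the option descriptions suggest. The function name and the toy data are hypothetical.

```python
import numpy as np


def aggregate(atom_hiddens, mode="norm", aggregation_norm=100.0):
    """Pool an (n_atoms, hidden_dim) array into one molecule-level vector."""
    if mode == "mean":
        return atom_hiddens.mean(axis=0)
    if mode == "sum":
        return atom_hiddens.sum(axis=0)
    if mode == "norm":
        # Sum over atoms, then divide by a constant normalization factor.
        return atom_hiddens.sum(axis=0) / aggregation_norm
    raise ValueError(f"unknown aggregation mode: {mode}")


H = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 atoms, hidden size 2
pooled = {m: aggregate(H, m) for m in ("mean", "sum", "norm")}
```

Unlike mean, the sum and norm modes keep the pooled vector dependent on the number of atoms, which can matter for properties that scale with molecule size.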
FFN args#
- --ffn-hidden-dim
Hidden dimension(s) in the FFN top model. A single value is applied to all layers; multiple values specify per-layer widths (must match --ffn-num-layers)
Default:
[300]
- --ffn-num-layers
Number of hidden layers in FFN top model (v2 semantics, differs from v1)
Default:
1
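The rule stated for --ffn-hidden-dim (one value shared by all layers, or one value per layer) can be sketched as a small expansion step. The helper name is hypothetical, and the error raised on a mismatch is an assumption about how such a check might look rather than Chemprop's actual behavior.

```python
def resolve_ffn_hidden_dims(hidden_dim, num_layers):
    """Expand the hidden-dimension option into one width per hidden layer."""
    if len(hidden_dim) == 1:
        return hidden_dim * num_layers  # one value shared by every layer
    if len(hidden_dim) != num_layers:
        raise ValueError("number of hidden dims must match the number of layers")
    return list(hidden_dim)


shared = resolve_ffn_hidden_dims([300], 3)
per_layer = resolve_ffn_hidden_dims([512, 256], 2)
```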
extra MPNN args#
- --batch-norm
Turn on batch normalization after aggregation
Default:
False
- --multiclass-num-classes
Number of classes when running multiclass classification
Default:
3
Atom FFN args#
- --atom-task-weights
Weights to apply for all atom tasks in the loss function
- --atom-ffn-hidden-dim
Hidden dimension(s) in the atom FFN top model
Default:
[300]
- --atom-ffn-num-layers
Number of layers in atom FFN top model
Default:
1
Bond FFN args#
- --bond-task-weights
Weights to apply for all bond tasks in the loss function
- --bond-ffn-hidden-dim
Hidden dimension(s) in the bond FFN top model
Default:
[300]
- --bond-ffn-num-layers
Number of layers in bond FFN top model
Default:
1
Atom constrainer FFN args#
- --atom-constrainer-ffn-hidden-dim
Hidden dimension(s) in the atom constrainer FFN top model
Default:
[300]
- --atom-constrainer-ffn-num-layers
Number of layers in atom constrainer FFN top model
Default:
1
Bond constrainer FFN args#
- --bond-constrainer-ffn-hidden-dim
Hidden dimension(s) in the bond constrainer FFN top model
Default:
[300]
- --bond-constrainer-ffn-num-layers
Number of layers in bond constrainer FFN top model
Default:
1
training input data args#
- -w, --weight-column
Name of the column in the input CSV containing individual data weights
- --target-columns
Names of the columns containing target values (by default, uses all columns except the SMILES column and the ignore_columns)
- --mol-target-columns
Names of the columns containing mol target values (when training on mol and atom/bond targets simultaneously).
- --atom-target-columns
Names of the columns containing atom target values.
- --bond-target-columns
Names of the columns containing bond target values.
- --ignore-columns
Names of the columns to ignore when target_columns is not provided
- --no-cache
Turn off caching the featurized MolGraphs at the beginning of training
Default:
False
- --splits-column
Name of the column in the input CSV file containing ‘train’, ‘val’, or ‘test’ for each row.
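For --splits-column, the input CSV carries the split assignment of each row. The sketch below shows what such a file might look like and how its rows group by split, using only the Python standard library. The column names smiles, y, and split and the data values are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical input CSV; the column names "smiles", "y", and "split" are assumptions.
text = """smiles,y,split
CCO,0.5,train
CCN,0.7,train
CCC,0.1,val
c1ccccc1,1.2,test
"""

# Group SMILES by the value found in the splits column.
groups = defaultdict(list)
for row in csv.DictReader(io.StringIO(text)):
    groups[row["split"]].append(row["smiles"])
```

With a file like this, the column would be selected on the command line with --splits-column split.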
training args#
- -t, --task-type
Possible choices: regression, regression-mve, regression-evidential, regression-quantile, classification, classification-dirichlet, multiclass, multiclass-dirichlet, spectral
Type of dataset (determines the default loss function used during training, defaults to
regression)Default:
'regression'- -l, --loss-function
Possible choices: mse, mae, rmse, bounded-mse, bounded-mae, bounded-rmse, mve, evidential, quantile-point, pinball-point, bce, ce, binary-mcc, multiclass-mcc, dirichlet, sid, earthmovers, wasserstein, quantile, pinball, nlogprob_enrichment
Loss function to use during training (will use the default loss function for the given task type if not specified)
- --v-kl, --evidential-regularization
Specify the value used in regularization for evidential loss function. The default value recommended by Soleimany et al. (2021) is 0.2. However, the optimal value is dataset-dependent, so it is recommended that users test different values to find the best value for their model.
Default:
0.0
- --eps
Evidential regularization epsilon
Default:
1e-08
- --alpha
Target error bounds for quantile interval loss
Default:
0.1
- --metrics, --metric
Possible choices: mse, mae, rmse, bounded-mse, bounded-mae, bounded-rmse, r2, binary-mcc, multiclass-mcc, roc, prc, accuracy, f1
Specify the evaluation metrics. If unspecified, Chemprop will use the following metrics for given dataset types: regression -> rmse, classification -> roc, multiclass -> ce (‘cross entropy’), spectral -> sid. If multiple metrics are provided, the 0-th one will be used for early stopping and checkpointing.
- --tracking-metric
The metric to track for early stopping, checkpointing, and hyperparameter optimization. Defaults to the criterion used during training. When training on two or three of molecule, atom, and bond targets, and not tracking the default (‘val_loss’), you must append ‘-mol’, ‘-atom’, or ‘-bond’ to the metric name to specify which individual metric to track. For example, ‘val_loss-bond’ will track the criterion value of the bond predictions and ‘rmse-atom’ will track the RMSE of the atom predictions.
Default:
'val_loss'
- --show-individual-scores
Show all scores for individual targets, not just average, at the end.
Default:
False
- --task-weights
Weights to apply for whole tasks in the loss function
- --warmup-epochs
Number of epochs during which the learning rate increases linearly from init_lr to max_lr (afterwards, the learning rate decreases exponentially from max_lr to final_lr)
Default:
2
- --init-lr
Initial learning rate.
Default:
0.0001
- --max-lr
Maximum learning rate.
Default:
0.001
- --final-lr
Final learning rate.
Default:
0.0001
- --epochs
Number of epochs to train over
Default:
50
- --patience
Number of epochs to wait for improvement before early stopping
- --min-delta
Minimum change in the monitored quantity to qualify as an improvement
Default:
0.0
- --grad-clip
Passed directly to the Lightning trainer, which controls gradient clipping (see the Trainer() docstring for details)
- --class-balance
Ensures each training batch contains an equal number of positive and negative samples.
Default:
False
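The learning-rate schedule described under --warmup-epochs, --init-lr, --max-lr, and --final-lr can be sketched as below. This is a minimal illustration, assuming the rate is a function of a (possibly fractional) epoch count; the actual scheduler may update per optimizer step. The helper name sketch_lr is hypothetical and not part of Chemprop; the default values are the ones listed above.

```python
def sketch_lr(epoch, *, warmup_epochs=2, epochs=50,
              init_lr=1e-4, max_lr=1e-3, final_lr=1e-4):
    """Illustrative schedule: linear warmup, then exponential decay."""
    if epoch < warmup_epochs:
        # Linear increase from init_lr to max_lr over the warmup period.
        return init_lr + (max_lr - init_lr) * epoch / warmup_epochs
    decay_epochs = epochs - warmup_epochs
    if decay_epochs <= 0:
        # No epochs left for decay; stay at the peak rate.
        return max_lr
    # Per-epoch factor chosen so the rate reaches final_lr at the last epoch.
    gamma = (final_lr / max_lr) ** (1 / decay_epochs)
    return max_lr * gamma ** (epoch - warmup_epochs)


for e in (0, 1, 2, 10, 50):
    print(e, sketch_lr(e))
```

With the defaults, the rate starts at init_lr, peaks at max_lr once the warmup ends, and returns to final_lr by the final epoch.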
split args#
- --split, --split-type
Possible choices: SCAFFOLD_BALANCED, RANDOM_WITH_REPEATED_SMILES, RANDOM, KENNARD_STONE, KMEANS
Method of splitting the data into train/val/test (case insensitive)
Default:
RANDOM
- --split-sizes
Split proportions for train/validation/test sets
Default:
[0.8, 0.1, 0.1]
- --split-key-molecule
Specify the index of the key molecule used for splitting when multiple molecules are present and constrained split_type is used (e.g., scaffold_balanced or random_with_repeated_smiles). Note that this index begins with zero for the first molecule.
Default:
0
- --num-replicates
Number of replicates.
Default:
1
- -k, --num-folds
The -k/--num-folds argument was removed in v2.1.0 - use --num-replicates instead.
- --save-smiles-splits
Whether to store the SMILES in each train/val/test split
Default:
False
- --save-data-splits
Whether to store the input data in each train/val/test split
Default:
False
- --splits-file
Path to a JSON file containing pre-defined splits for the input data, formatted as a list of dictionaries with keys train, val, and test and values as lists of indices or formatted strings (e.g. [0, 1, 2, 4] or ‘0-2,4’)
- --data-seed
Specify the random seed to use when splitting data into train/val/test sets. When --num-replicates > 1, the first replicate uses this seed and all subsequent replicates add 1 to the seed (also used for shuffling data in build_dataloader when shuffle is True).
Default:
0
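The --splits-file format described above can be illustrated with a short sketch. The index values are arbitrary placeholders, and expand_indices is a hypothetical helper written here only to show how a string such as ‘0-2,4’ corresponds to [0, 1, 2, 4]; it is not Chemprop's own parser.

```python
import json


def expand_indices(spec):
    """Expand '0-2,4' into [0, 1, 2, 4]; lists are returned as copies."""
    if isinstance(spec, list):
        return list(spec)
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Ranges are inclusive on both ends, matching the '0-2,4' example.
            start, stop = part.split("-")
            indices.extend(range(int(start), int(stop) + 1))
        else:
            indices.append(int(part))
    return indices


# A list of dictionaries with keys train, val, and test.
# Values may be index lists or formatted strings.
splits = [{"train": "0-5", "val": [6, 7], "test": "8,9"}]

splits_json = json.dumps(splits, indent=2)
print(splits_json)
print(expand_indices("0-2,4"))
```

Written to a file, the JSON text could then be supplied through --splits-file.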
Chemprop hyperparameter optimization arguments#
- --search-parameter-keywords
- The model parameters over which to search for an optimal hyperparameter configuration. Some options are bundles of parameters or otherwise special parameter operations. Special keywords include:
basic: Default set of hyperparameters for search (depth, ffn_num_layers, dropout, message_hidden_dim, and ffn_hidden_dim)
learning_rate: Search for max_lr, init_lr_ratio, final_lr_ratio, and warmup_epochs. The init_lr and final_lr search values are defined as fractions of the max_lr value. The warmup_epochs search value is defined as a fraction of the total epochs used.
all: Include search for all 13 individual keyword options (including: activation, aggregation, aggregation_norm, and batch_size which aren’t included in the other two keywords).
- Individual supported parameters:
[‘activation’, ‘dropout’, ‘message_hidden_dim’, ‘depth’, ‘aggregation’, ‘aggregation_norm’, ‘ffn_hidden_dim’, ‘ffn_num_layers’, ‘atom_ffn_hidden_dim’, ‘atom_ffn_num_layers’, ‘atom_constrainer_ffn_hidden_dim’, ‘atom_constrainer_ffn_num_layers’, ‘bond_ffn_hidden_dim’, ‘bond_ffn_num_layers’, ‘bond_constrainer_ffn_hidden_dim’, ‘bond_constrainer_ffn_num_layers’, ‘batch_size’, ‘init_lr_ratio’, ‘max_lr’, ‘final_lr_ratio’, ‘warmup_epochs’]
Default:
['basic']
- --hpopt-save-dir
Directory to save the hyperparameter optimization results
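The learning_rate keyword above searches init_lr and final_lr as fractions of max_lr, and warmup_epochs as a fraction of the total epochs. The sketch below shows one way such sampled ratios could map back to absolute training values. The function name, the warmup_fraction argument, and the rounding of the warmup length are assumptions made for illustration; Chemprop's internal conversion may differ.

```python
def resolve_lr_params(max_lr, init_lr_ratio, final_lr_ratio,
                      warmup_fraction, epochs):
    """Convert sampled ratios into absolute values (illustrative only)."""
    return {
        "init_lr": max_lr * init_lr_ratio,    # fraction of max_lr
        "max_lr": max_lr,
        "final_lr": max_lr * final_lr_ratio,  # fraction of max_lr
        # Assumption: round the fractional warmup length to whole epochs.
        "warmup_epochs": round(warmup_fraction * epochs),
    }


params = resolve_lr_params(
    max_lr=1e-3, init_lr_ratio=0.1, final_lr_ratio=0.1,
    warmup_fraction=0.04, epochs=50,
)
print(params)
```

Searching over ratios rather than absolute values keeps init_lr and final_lr from exceeding max_lr, as long as the sampled ratios stay at or below 1.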
Ray Tune arguments#
- --raytune-num-samples
Passed directly to Ray Tune TuneConfig to control number of trials to run
Default:
10
- --raytune-search-algorithm
Possible choices: random, hyperopt, optuna
Passed to Ray Tune TuneConfig to control search algorithm
Default:
'hyperopt'
- --raytune-trial-scheduler
Possible choices: FIFO, AsyncHyperBand
Passed to Ray Tune TuneConfig to control trial scheduler
Default:
'FIFO'
- --raytune-num-workers
Passed directly to Ray Tune ScalingConfig to control number of workers to use
Default:
1
- --raytune-use-gpu
Passed directly to Ray Tune ScalingConfig to control whether to use GPUs
Default:
False
- --raytune-num-checkpoints-to-keep
Passed directly to Ray Tune CheckpointConfig to control number of checkpoints to keep
Default:
1
- --raytune-grace-period
Passed directly to Ray Tune ASHAScheduler to control grace period
Default:
10
- --raytune-reduction-factor
Passed directly to Ray Tune ASHAScheduler to control reduction factor
Default:
2
- --raytune-temp-dir
Passed directly to Ray Tune init to control temporary directory
- --raytune-num-cpus
Passed directly to Ray Tune init to control number of CPUs to use
- --raytune-num-gpus
Passed directly to Ray Tune init to control number of GPUs to use
- --raytune-max-concurrent-trials
Passed directly to Ray Tune TuneConfig to control maximum concurrent trials
Hyperopt arguments#
- --hyperopt-n-initial-points
Passed directly to HyperOptSearch to control number of initial points to sample
- --hyperopt-random-state-seed
Passed directly to HyperOptSearch to control random state seed