DiffDock Configuration Parameters Reference
This document provides comprehensive details on all DiffDock configuration parameters and command-line options.
Model & Checkpoint Settings
Model Paths
--model_dir: Directory containing the score model checkpoint- Default:
./workdir/v1.1/score_model -
DiffDock-L model (current default)
-
--confidence_model_dir: Directory containing the confidence model checkpoint -
Default:
./workdir/v1.1/confidence_model -
--ckpt: Name of the score model checkpoint file -
Default:
best_ema_inference_epoch_model.pt -
--confidence_ckpt: Name of the confidence model checkpoint file - Default:
best_model_epoch75.pt
Model Version Flags
--old_score_model: Use original DiffDock model instead of DiffDock-L-
Default:
false(uses DiffDock-L) -
--old_filtering_model: Use legacy confidence filtering approach - Default:
true
Input/Output Options
Input Specification
--protein_path: Path to protein PDB file- Example:
--protein_path protein.pdb -
Alternative to
--protein_sequence -
--protein_sequence: Amino acid sequence for ESMFold folding - Automatically generates protein structure from sequence
-
Alternative to
--protein_path -
--ligand: Ligand specification (SMILES string or file path) - SMILES string:
--ligand "COc(cc1)ccc1C#N" -
File path:
--ligand ligand.sdfor.mol2 -
--protein_ligand_csv: CSV file for batch processing - Required columns:
complex_name,protein_path,ligand_description,protein_sequence - Example:
--protein_ligand_csv data/protein_ligand_example.csv
Output Control
--out_dir: Output directory for predictions-
Example:
--out_dir results/user_predictions/ -
--save_visualisation: Export predicted molecules as SDF files - Enables visualization of results
Inference Parameters
Diffusion Steps
--inference_steps: Number of planned inference iterations- Default:
20 -
Higher values may improve accuracy but increase runtime
-
--actual_steps: Actual diffusion steps executed -
Default:
19 -
--no_final_step_noise: Omit noise at the final diffusion step - Default:
true
Sampling Settings
--samples_per_complex: Number of samples to generate per complex- Default:
10 -
More samples provide better coverage but increase computation
-
--sigma_schedule: Noise schedule type -
Default:
expbeta(exponential-beta) -
--initial_noise_std_proportion: Initial noise standard deviation scaling - Default:
1.46
Temperature Parameters
Sampling Temperatures (Controls diversity of predictions)
--temp_sampling_tr: Translation sampling temperature-
Default:
1.17 -
--temp_sampling_rot: Rotation sampling temperature -
Default:
2.06 -
--temp_sampling_tor: Torsion sampling temperature - Default:
7.04
Psi Angle Temperatures
--temp_psi_tr: Translation psi temperature-
Default:
0.73 -
--temp_psi_rot: Rotation psi temperature -
Default:
0.90 -
--temp_psi_tor: Torsion psi temperature - Default:
0.59
Sigma Data Temperatures
--temp_sigma_data_tr: Translation data distribution scaling-
Default:
0.93 -
--temp_sigma_data_rot: Rotation data distribution scaling -
Default:
0.75 -
--temp_sigma_data_tor: Torsion data distribution scaling - Default:
0.69
Processing Options
Performance
--batch_size: Processing batch size- Default:
10 -
Larger values increase throughput but require more memory
-
--tqdm: Enable progress bar visualization - Useful for monitoring long-running jobs
Protein Structure
--chain_cutoff: Maximum number of protein chains to process- Example:
--chain_cutoff 10 -
Useful for large multi-chain complexes
-
--esm_embeddings_path: Path to pre-computed ESM2 protein embeddings - Speeds up inference by reusing embeddings
- Optional optimization
Dataset Options
--split: Dataset split to use (train/test/val)- Used for evaluation on standard benchmarks
Advanced Flags
Debugging & Testing
--no_model: Disable model inference (debugging)-
Default:
false -
--no_random: Disable randomization - Default:
false - Useful for reproducibility testing
Alternative Sampling
--ode: Use ODE solver instead of SDE- Default:
false -
Alternative sampling approach
-
--different_schedules: Use different noise schedules per component - Default:
false
Error Handling
--limit_failures: Maximum allowed failures before stopping- Default:
5
Configuration File
All parameters can be specified in a YAML configuration file (typically default_inference_args.yaml) or overridden via command line:
python -m inference --config default_inference_args.yaml --samples_per_complex 20
Command-line arguments take precedence over configuration file values.