Boltz Predict

Runs a Boltz structure prediction job from a prepared Boltz YAML configuration and its auxiliary files. The node exposes diffusion and sampling parameters and MSA handling controls, and computes ligand-binding affinity when the YAML properties request it.

Usage

Use after assembling your system with Boltz Sequence/Constraint/Template/Property nodes and combining them via Boltz List Combiner and Boltz YAML Combiner. Connect the resulting boltz_yaml and boltz_files here, set your seed and optional parameters, then run to obtain predicted structures (as PDB or mmCIF), model confidence metrics, and affinity results if requested.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| boltz_yaml | True | BOLTZ_YAML | Complete Boltz YAML configuration describing sequences, optional constraints, templates, and properties. Must include at least one sequence. | { "version": 1, "sequences": [ { "protein": { "id": "A", "sequence": "..." } } ], "properties": [ { "affinity": { "binder": "L" } } ] } |
| boltz_files | True | BOLTZ_FILES | Auxiliary file map referenced by the YAML (e.g., MSA .a3m and template .pdb files). Keys are filenames referenced in YAML; values are file contents. | { "msa_1.a3m": "...", "template_1.pdb": "...PDB text..." } |
| seed | True | INT | Base random seed for reproducibility. | 42 |
| recycling_steps | False | INT | Number of recycling iterations during structure refinement. | 3 |
| sampling_steps | False | INT | Number of sampling steps for the diffusion process. | 200 |
| diffusion_samples | False | INT | How many independent diffusion samples to generate. | 1 |
| max_parallel_samples | False | INT | Maximum number of samples processed in parallel (resource-dependent). | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like factor). | 1.638 |
| output_format | False | STRING | Structure output format: pdb or mmcif. | pdb |
| num_workers | False | INT | Number of worker processes (0 disables multiprocessing). | 0 |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 |
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences. | false |
| num_subsampled_msa | False | INT | Number of MSA sequences after subsampling (if enabled). | 1024 |
| use_potentials | False | BOOLEAN | Enable inference-time potentials for improved quality (may be slower). | false |
| write_full_pae | False | BOOLEAN | Write the full Predicted Aligned Error (PAE) matrix to outputs. | false |
| write_full_pde | False | BOOLEAN | Write the full Predicted Distance Error (PDE) matrix to outputs. | false |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction to affinity estimates (used only if affinity is requested in properties). | false |
| sampling_steps_affinity | False | INT | Sampling steps for affinity estimation (used only if affinity is requested). | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity estimation (used only if affinity is requested). | 5 |
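
As a rough illustration of the optional inputs, the sketch below collects the example values from the table into a plain Python dict and runs a few sanity checks on it. The names EXAMPLE_PARAMS and check_params are hypothetical and are not part of the node. The lower bounds checked here are assumptions based on what the fields mean, not documented limits.

```python
# Hypothetical helper, not part of the node: a plain dict holding the
# example values from the inputs table.
EXAMPLE_PARAMS = {
    "seed": 42,
    "recycling_steps": 3,
    "sampling_steps": 200,
    "diffusion_samples": 1,
    "max_parallel_samples": 5,
    "step_scale": 1.638,
    "output_format": "pdb",
    "num_workers": 0,
    "max_msa_seqs": 8192,
    "subsample_msa": False,
    "num_subsampled_msa": 1024,
}


def check_params(params):
    """Return a list of problems found in params; empty if none."""
    problems = []

    # The table lists exactly two output formats.
    fmt = params.get("output_format")
    if fmt is not None and fmt not in ("pdb", "mmcif"):
        problems.append("output_format must be 'pdb' or 'mmcif'")

    # Assumption: these counts only make sense when they are at least 1.
    for name in ("sampling_steps", "diffusion_samples", "max_parallel_samples"):
        value = params.get(name)
        if value is not None and value < 1:
            problems.append(f"{name} must be at least 1")

    # The table says 0 disables multiprocessing, so 0 is the lowest valid value.
    workers = params.get("num_workers")
    if workers is not None and workers < 0:
        problems.append("num_workers must be 0 or greater")

    return problems


print(check_params(EXAMPLE_PARAMS))
```

A check like this is only a convenience before launching a long job; the node itself remains the authority on which values it accepts.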

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| structures.pdb | PDB | Dictionary mapping structure filenames to PDB (or mmCIF) content for ranked predictions. | { "rank1.pdb": "...PDB text...", "rank2.pdb": "..." } |
| confidence.json | JSON | Dictionary of confidence-related outputs per sample (e.g., pLDDT/PAE data or filenames to JSON). | { "rank1_confidence.json": "{...}", "rank2_confidence.json": "{...}" } |
| affinity.json | JSON | Dictionary of affinity results when affinity is requested in properties; empty if not applicable. | { "rank1_affinity.json": "{...}" } |
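
Because the confidence and affinity outputs are described as dictionaries whose values may be JSON text or filenames, downstream code may want to decode them defensively. The following is a minimal sketch under that assumption. The function name decode_json_outputs and the payload key example_metric are made up for illustration and do not reflect the real output schema.

```python
import json


def decode_json_outputs(outputs):
    """Parse each value as JSON where possible; keep the raw value otherwise."""
    decoded = {}
    for name, value in outputs.items():
        try:
            decoded[name] = json.loads(value)
        except (TypeError, ValueError):
            # Not JSON text (for example a filename): leave it untouched.
            decoded[name] = value
    return decoded


# Made-up payloads shaped like the Example column above.
confidence = {
    "rank1_confidence.json": '{"example_metric": 0.9}',
    "rank2_confidence.json": "rank2_confidence_file",
}
decoded = decode_json_outputs(confidence)
print(decoded["rank1_confidence.json"]["example_metric"])
```

The same helper works on affinity.json, which is an empty dictionary when affinity was not requested.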

Important Notes

  • Affinity requires a ligand: If properties include affinity, at least one ligand sequence must be present; otherwise the node raises an error.
  • YAML and files must align: Any filenames referenced in the YAML (e.g., MSA or template entries) must exist in boltz_files with matching keys.
  • Output format: Structures are emitted in the selected format (pdb or mmcif), but the field name remains structures.pdb, so downstream consumers should parse the contents according to output_format rather than the field name.
  • Parallelism and resources: Increasing max_parallel_samples and num_workers increases resource usage and may exhaust memory/compute on limited systems.
  • MSA controls: You can cap or subsample MSAs via max_msa_seqs, subsample_msa, and num_subsampled_msa to manage speed and memory.
  • Reproducibility: The seed is applied, but multiple diffusion samples and parallelism can still introduce variability across runs.
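
The first two notes can be checked before running the node. The sketch below is one way to do that, assuming boltz_yaml is available as a parsed Python dict shaped like the example in the inputs table, with ligand entries keyed by "ligand". The function names, the "msa" key in the sample config, and the suffix list used to spot file references are all assumptions made for illustration, not the node's own validation logic.

```python
# Assumed set of auxiliary file types; adjust to match your inputs.
FILE_SUFFIXES = (".a3m", ".pdb", ".cif")


def referenced_files(node):
    """Recursively collect string values that look like auxiliary filenames."""
    found = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= referenced_files(value)
    elif isinstance(node, list):
        for item in node:
            found |= referenced_files(item)
    elif isinstance(node, str) and node.lower().endswith(FILE_SUFFIXES):
        found.add(node)
    return found


def preflight(boltz_yaml, boltz_files):
    """Return a list of problems that would likely make the node fail."""
    problems = []
    sequences = boltz_yaml.get("sequences") or []
    if not sequences:
        problems.append("no sequences")

    properties = boltz_yaml.get("properties") or []
    wants_affinity = any("affinity" in prop for prop in properties)
    has_ligand = any("ligand" in entry for entry in sequences)
    if wants_affinity and not has_ligand:
        problems.append("affinity requested but no ligand sequence")

    # Every filename mentioned in the YAML must be a key in boltz_files.
    for name in sorted(referenced_files(boltz_yaml) - set(boltz_files)):
        problems.append(f"missing file: {name}")
    return problems


config = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MK", "msa": "msa_1.a3m"}}],
    "properties": [{"affinity": {"binder": "L"}}],
}
print(preflight(config, {}))
```

Matching filenames by suffix is a heuristic: it avoids hard-coding where the YAML keeps its file references, at the cost of possibly flagging an unrelated string that happens to end in one of the suffixes.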

Troubleshooting

  • Error: 'Boltz YAML must contain at least one sequence': Ensure you connected a valid boltz_yaml from Boltz YAML Combiner that includes the sequences list.
  • Error: 'Affinity prediction requires at least one ligand sequence': Remove the affinity property, or add a ligand sequence and point the affinity binder at that ligand's chain.
  • Empty or missing outputs: Verify that sampling_steps and diffusion_samples are positive integers, and check that boltz_files includes every MSA/template file referenced by the YAML.
  • High memory or long runtimes: Reduce max_parallel_samples, num_workers, sampling_steps, or MSA sizes; disable use_potentials if not needed.
  • Invalid file references: If you see key errors for MSA/template files, confirm the YAML references filenames that exist as keys in boltz_files.
  • Unexpected structure format: If you chose mmcif output but expected PDB, set output_format to "pdb" or adapt downstream consumers to handle mmCIF.
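
When it is unclear which format a structure string is in, a content check can help, since the output field is named structures.pdb either way. The sketch below is a heuristic, not something the node provides. It relies on mmCIF files opening with a data_ block and PDB files opening with a standard record name, and the function name guess_structure_format is hypothetical.

```python
# Common record names that can open a PDB file (not an exhaustive list).
PDB_RECORDS = ("HEADER", "TITLE", "REMARK", "CRYST1", "MODEL", "ATOM", "HETATM")


def guess_structure_format(text):
    """Return 'mmcif', 'pdb', or 'unknown' based on the first meaningful line."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and mmCIF-style comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith("data_"):
            return "mmcif"
        if stripped.startswith(PDB_RECORDS):
            return "pdb"
        break  # First meaningful line matched neither pattern.
    return "unknown"


print(guess_structure_format("data_model\n_entry.id model\n"))
print(guess_structure_format("ATOM      1  N   MET A   1\n"))
```

Only the first non-empty, non-comment line is inspected, so mmCIF atom_site rows that begin with ATOM are never mistaken for PDB records.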