Boltz Predict

Runs a Boltz structure prediction job from a prepared Boltz YAML and its auxiliary files. The node validates its inputs, submits a prediction request to the Boltz service, and returns ranked structure files with their confidence metrics, plus affinity predictions when requested. Each output is a dictionary, so a single run can return several ranked structures.

Usage

Use this node after assembling sequences, constraints, templates, and optional properties with the Boltz YAML Combiner. Connect the resulting boltz_yaml and boltz_files here, set a seed and optional inference parameters, then execute to obtain predicted structures and confidence JSON. Include an affinity property and at least one ligand sequence in the YAML to get affinity outputs.
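As a rough illustration, the two inputs might look like the following when affinity is requested. The overall dictionary shape follows the boltz_yaml example in the Inputs section. The ligand's `smiles` field and the `properties` entry with a `binder` key are assumptions based on the public Boltz input schema and are not confirmed by this page; the sequence, SMILES string, and variable names are toy values.

```python
# Hypothetical inputs for a protein-ligand run with affinity requested.
boltz_yaml = {
    "version": 1,
    "sequences": [
        # Toy sequence and SMILES, for illustration only.
        {"protein": {"id": "A", "sequence": "MKTAYIAKQR"}},
        {"ligand": {"id": "B", "smiles": "CCO"}},
    ],
    # Assumed layout: the affinity property names the ligand chain as binder.
    "properties": [{"affinity": {"binder": "B"}}],
}

# No MSA or template files are referenced above, so there is nothing to attach.
boltz_files = {}

# Node settings: seed is required, everything else is optional.
node_params = {"seed": 42, "output_format": "pdb"}

# Quick look at which entity kinds the YAML declares.
entity_kinds = [kind for entry in boltz_yaml["sequences"] for kind in entry]
```

In practice the Boltz YAML Combiner produces both dictionaries, so hand-writing them like this is mainly useful for checking what the node should receive.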

Inputs

| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| boltz_yaml | True | BOLTZ_YAML | Boltz configuration produced by BoltzYAMLCombinerNode. Must include at least one sequence and valid chain IDs. | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'MSEQN...'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., MSA and template files). Keys are filenames; values are their file contents. | {'msa_1.a3m': '', 'template_1.pdb': ''} |
| seed | True | INT | Random seed for reproducibility. | 42 |
| recycling_steps | False | INT | Number of recycling iterations during prediction (1–20). | 3 |
| sampling_steps | False | INT | Number of diffusion sampling steps (50–500). Higher values can improve quality with more compute. | 200 |
| diffusion_samples | False | INT | Number of independent diffusion samples to generate per prediction (1–25). | 1 |
| max_parallel_samples | False | INT | Maximum number of samples to compute in parallel (1–10). | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter, 1.0–2.0). | 1.638 |
| output_format | False | [pdb, mmcif] | Desired output structure file format. | pdb |
| num_workers | False | INT | Number of worker processes (0 disables multiprocessing). | 0 |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use (256–16384). | 8192 |
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences before use. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to keep when subsampling (256–4096). | 1024 |
| use_potentials | False | BOOLEAN | Enable inference-time potentials for improved quality (slower). | False |
| write_full_pae | False | BOOLEAN | If true, write full Predicted Aligned Error (PAE) matrix. | False |
| write_full_pde | False | BOOLEAN | If true, write full Predicted Distance Error (PDE) matrix. | False |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction when computing affinity (only used if affinity is requested). | False |
| sampling_steps_affinity | False | INT | Sampling steps used for affinity calculation (50–500). | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples used for affinity prediction (1–10). | 5 |
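The numeric ranges listed for the inputs can be checked before a job is submitted. The sketch below copies those ranges into a lookup table and reports any value that falls outside them. It assumes the bounds are inclusive, and the names `RANGES` and `check_params` are hypothetical; whether the node enforces the same limits itself is not stated here.

```python
# Documented (min, max) ranges for the optional numeric inputs.
# Assumption: both bounds are inclusive.
RANGES = {
    "recycling_steps": (1, 20),
    "sampling_steps": (50, 500),
    "diffusion_samples": (1, 25),
    "max_parallel_samples": (1, 10),
    "step_scale": (1.0, 2.0),
    "max_msa_seqs": (256, 16384),
    "num_subsampled_msa": (256, 4096),
    "sampling_steps_affinity": (50, 500),
    "diffusion_samples_affinity": (1, 10),
}
OUTPUT_FORMATS = ("pdb", "mmcif")


def check_params(params):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, (low, high) in RANGES.items():
        # Parameters that were not supplied are skipped, since they are optional.
        if name in params and not (low <= params[name] <= high):
            problems.append(f"{name}={params[name]} is outside {low}-{high}")
    fmt = params.get("output_format", "pdb")
    if fmt not in OUTPUT_FORMATS:
        problems.append(f"output_format={fmt!r} must be one of {OUTPUT_FORMATS}")
    return problems


# Example: one value in range, one out of range.
issues = check_params({"seed": 42, "step_scale": 1.638, "sampling_steps": 1000})
```

Returning a list rather than raising on the first problem lets a caller report every out-of-range setting in one pass.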

Outputs

| Field | Type | Description | Example |
|---|---|---|---|
| structures.pdb | PDB | Dictionary mapping result names to predicted structure contents in the chosen format (PDB or mmCIF). Keys typically include ranked file names. | {'sequence_0_ranked_0.pdb': '', 'sequence_0_ranked_1.pdb': ''} |
| confidence.json | JSON | Dictionary of confidence-related outputs (e.g., pLDDT, PAE summaries) keyed by file name. | {'sequence_0_confidence.json': '{"plddt": [...], "pae": ...}'} |
| affinity.json | JSON | Dictionary of affinity prediction outputs when requested; empty if no affinity property was included. | {'sequence_0_affinity.json': '{"kd": 1.23, "units": "uM"}'} |
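Because each output is a dictionary of name to content, downstream code usually needs to pick one structure and decode the JSON strings. The sketch below is one way to do that, assuming ranked keys contain `ranked_0` for the top result (as in the example keys above) and that JSON values arrive as strings. The helper names and the sample pLDDT values are hypothetical.

```python
import json


def best_structure(structures):
    """Return (name, content) for the key containing 'ranked_0',
    falling back to the first key in sorted order."""
    if not structures:
        raise ValueError("no structures returned")
    names = sorted(structures)
    for name in names:
        if "ranked_0" in name:
            return name, structures[name]
    return names[0], structures[names[0]]


def parse_json_outputs(outputs):
    """Decode JSON string values; values that are already parsed pass through."""
    return {
        name: json.loads(value) if isinstance(value, str) else value
        for name, value in outputs.items()
    }


# Example with made-up contents shaped like the documented outputs.
structures = {"sequence_0_ranked_1.pdb": "", "sequence_0_ranked_0.pdb": ""}
top_name, top_content = best_structure(structures)
confidence = parse_json_outputs({"sequence_0_confidence.json": '{"plddt": [0.9, 0.8]}'})
```

An empty affinity dictionary decodes to an empty dictionary, so the same helper can be applied whether or not affinity was requested.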

Important Notes

  • Affinity requires a ligand: If the YAML includes an affinity property, at least one ligand sequence must be present; otherwise the node raises an error.
  • Input validation: boltz_yaml must be a dictionary containing at least one sequence; boltz_files must be a dictionary of auxiliary file contents.
  • Multiple outputs: The node may return multiple ranked structures and associated metrics as a dictionary of name->content.
  • Output format: Although the output field is named structures.pdb, the actual structure content respects the output_format setting (pdb or mmcif).
  • Performance settings: Increasing sampling_steps, diffusion_samples, and enabling use_potentials can improve quality at higher compute cost.
  • MSA handling: MSA file preparation and de-duplication are handled upstream by the Boltz YAML Combiner; ensure those inputs are correctly wired.
  • Reproducibility: The seed controls stochastic parts of the process; keep it fixed to reproduce results, adjusting only when intentionally exploring variability.
  • Execution: The node submits work to the Boltz prediction service and waits for the result; the wait is bounded by a timeout configured on the server side.
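The validation rules in these notes can be mirrored locally to catch problems before wiring the node. The following is a minimal sketch, not the node's actual implementation: the function name is hypothetical, `ValueError` is an assumed exception type, and the check for an affinity request assumes a top-level `properties` list whose entries are keyed by property type. The messages reuse the error text listed under Troubleshooting.

```python
def validate_inputs(boltz_yaml, boltz_files):
    """Raise ValueError if the inputs break one of the documented rules."""
    # Rule: boltz_yaml is a dictionary with at least one sequence.
    if not isinstance(boltz_yaml, dict) or not boltz_yaml.get("sequences"):
        raise ValueError("Boltz YAML must contain at least one sequence")

    # Rule: boltz_files is a dictionary of file name to file contents.
    if not isinstance(boltz_files, dict):
        raise ValueError("Boltz files must be a dictionary")

    # Rule: an affinity property needs at least one ligand sequence.
    properties = boltz_yaml.get("properties") or []
    wants_affinity = any(isinstance(p, dict) and "affinity" in p for p in properties)
    has_ligand = any(
        isinstance(s, dict) and "ligand" in s for s in boltz_yaml["sequences"]
    )
    if wants_affinity and not has_ligand:
        raise ValueError("Affinity prediction requires at least one ligand sequence")


# Example: a protein-only YAML without affinity passes silently.
validate_inputs(
    {"version": 1, "sequences": [{"protein": {"id": "A", "sequence": "MKTAYIAKQR"}}]},
    {},
)
```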

Troubleshooting

  • Error: Affinity prediction requires at least one ligand sequence: The YAML includes an affinity property but no ligand. Add a ligand sequence to the YAML, or remove the affinity property if affinity output is not needed.
  • Error: Boltz YAML must contain at least one sequence: Verify the upstream combiner produced a non-empty sequences list.
  • Error: Boltz files must be a dictionary: Connect the boltz_files output from Boltz YAML Combiner directly; do not pass a list or string.
  • Empty affinity.json output: This is expected if no affinity property was provided in the YAML; add it via the property builder if needed.
  • Unexpected structure format: The output field is always named structures.pdb, so a .pdb-like name does not by itself indicate PDB content. If you requested mmcif, confirm that output_format is set to mmcif; the structure content follows that setting regardless of the display name.
  • Slow runtime: Reduce diffusion_samples, sampling_steps, or disable use_potentials; also consider lowering max_parallel_samples or num_workers depending on resource limits.
  • Invalid chain IDs or missing files: Ensure every chain ID referenced in boltz_yaml is defined by one of its sequences, and that every auxiliary file name it references appears as a key in boltz_files.
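For the missing-files case, a quick local check can list file names that the YAML mentions but boltz_files does not provide. The sketch below is a heuristic under stated assumptions: it treats any string value ending in one of a few assumed suffixes as a file reference, it does not rely on specific YAML key names, and the keys used in the example (`msa`, `templates`, `pdb`) are hypothetical. It compares names exactly, so references that include directory paths would need normalizing first.

```python
# Assumed auxiliary file types; adjust to match the files actually in use.
FILE_SUFFIXES = (".a3m", ".csv", ".pdb", ".cif")


def referenced_files(node):
    """Recursively collect string values that look like auxiliary file names."""
    found = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= referenced_files(value)
    elif isinstance(node, list):
        for item in node:
            found |= referenced_files(item)
    elif isinstance(node, str) and node.lower().endswith(FILE_SUFFIXES):
        found.add(node)
    return found


def missing_files(boltz_yaml, boltz_files):
    """Return referenced file names that are not keys of boltz_files, sorted."""
    return sorted(referenced_files(boltz_yaml) - set(boltz_files))


# Example: the template is referenced but was never attached.
example_yaml = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MKTAYIAKQR", "msa": "msa_1.a3m"}}],
    "templates": [{"pdb": "template_1.pdb"}],
}
missing = missing_files(example_yaml, {"msa_1.a3m": ""})
```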