Boltz Predict

Runs Boltz structure prediction from a prepared Boltz YAML and its auxiliary files. It produces ranked 3D structures plus associated confidence and scoring data. If affinity prediction is requested and valid ligand inputs are present, it also returns affinity estimates. PAE and PDE matrices and per-residue pLDDT scores are decoded from NPZ and returned as JSON-serializable arrays.

Usage

Use this node after assembling inputs with Boltz Sequence/Constraint/Template/Property builders and combining them with Boltz List Combiner and Boltz YAML Combiner. Connect the resulting boltz_yaml and boltz_files here, set prediction parameters (steps, samples, format), and run to obtain structures and confidence/affinity outputs. Choose output_format to get PDB or mmCIF files.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| boltz_yaml | True | BOLTZ_YAML | The Boltz configuration produced by Boltz YAML Combiner. Must contain at least one valid sequence entry. | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'MKT...', '_sequence_name': 'seqA'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., MSA files, templates) produced by Boltz YAML Combiner. | {'msa_1.a3m': '', 'template_1.pdb': 'ATOM ...'} |
| seed | True | INT | Random seed for reproducibility. | 42 |
| recycling_steps | False | INT | Number of recycling steps during prediction. | 3 |
| sampling_steps | False | INT | Number of sampling steps for the main diffusion process. | 200 |
| diffusion_samples | False | INT | How many independent samples to generate. | 1 |
| max_parallel_samples | False | INT | Maximum number of samples processed in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter). | 1.638 |
| output_format | False | CHOICE['pdb','mmcif'] | Format of the output structures. When 'mmcif' is chosen, structure file names will use .cif extension. | pdb |
| num_workers | False | INT | Number of worker processes (0 disables multiprocessing). | 0 |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to consider. | 8192 |
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to keep when subsampling is enabled. | 1024 |
| use_potentials | False | BOOLEAN | Enable inference-time potentials to improve quality (may be slower). | False |
| write_full_pae | False | BOOLEAN | If true, write full PAE matrices to outputs. | False |
| write_full_pde | False | BOOLEAN | If true, write full PDE matrices to outputs. | False |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction when computing affinity (only relevant if affinity is requested). | False |
| sampling_steps_affinity | False | INT | Sampling steps to use for affinity prediction runs. | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity prediction. | 5 |
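
To keep the many optional parameters manageable when scripting around this node, the sketch below collects the values from the Example column and merges in per-run overrides. `EXAMPLE_VALUES` and `build_params` are hypothetical names used only for illustration; the example values may or may not match the node's actual defaults, and this is not the node's own code.

```python
# Values copied from the Example column of the Inputs table.
# Assumption: these are illustrative and may differ from the node's real defaults.
EXAMPLE_VALUES = {
    "recycling_steps": 3,
    "sampling_steps": 200,
    "diffusion_samples": 1,
    "max_parallel_samples": 5,
    "step_scale": 1.638,
    "output_format": "pdb",
    "num_workers": 0,
    "max_msa_seqs": 8192,
    "subsample_msa": False,
    "num_subsampled_msa": 1024,
    "use_potentials": False,
    "write_full_pae": False,
    "write_full_pde": False,
    "affinity_mw_correction": False,
    "sampling_steps_affinity": 200,
    "diffusion_samples_affinity": 5,
}


def build_params(seed, **overrides):
    """Hypothetical helper: merge overrides into the example values."""
    unknown = set(overrides) - set(EXAMPLE_VALUES)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    # seed is the only required scalar input, so it is passed explicitly.
    params = {"seed": seed, **EXAMPLE_VALUES, **overrides}
    if params["output_format"] not in ("pdb", "mmcif"):
        raise ValueError("output_format must be 'pdb' or 'mmcif'")
    return params


# Five samples, written as mmCIF; everything else keeps its example value.
params = build_params(42, diffusion_samples=5, output_format="mmcif")
```

Rejecting unknown keys early catches misspelled parameter names before a long prediction run starts.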

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| structures.pdb | PDB | Ranked structure files keyed by filename. If output_format is 'mmcif', keys use .cif extension. | {'rank_001_model_1.pdb': 'ATOM ...'} |
| confidence.json | JSON | Per-structure confidence data (e.g., scores/metrics) keyed by filename. | {'confidence_rank_001.json': {'iptm': 0.78, 'ptm': 0.65}} |
| affinity.json | JSON | Affinity prediction results (only populated if affinity was requested and ligands are present). | {'affinity_rank_001.json': {'kd': 1.2e-07, 'units': 'M'}} |
| pae.json | JSON | Predicted Aligned Error matrices per model, decoded from NPZ to JSON-serializable arrays. | {'pae_rank_001': {'predicted_aligned_error': [[0.0, 1.2], [1.1, 0.0]]}} |
| pde.json | JSON | Predicted Distance Error matrices per model, decoded from NPZ to JSON-serializable arrays. | {'pde_rank_001': {'predicted_distance_error': [[0.0, 0.9], [0.8, 0.0]]}} |
| plddt.json | JSON | Per-residue pLDDT scores per model, decoded from NPZ to JSON-serializable arrays. | {'plddt_rank_001': {'plddt': [92.1, 88.4, 85.0]}} |
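
Because the array outputs arrive as plain nested lists, they can be summarized without NumPy. The following sketch assumes the outputs have exactly the shape shown in the Example column (outer keys per model, inner keys `plddt` and `predicted_aligned_error`); real outputs may use different key names. `mean_plddt` and `max_pae` are hypothetical helpers.

```python
from statistics import mean


def mean_plddt(plddt_json):
    """Return {model_key: mean per-residue pLDDT} for each model."""
    return {key: mean(entry["plddt"]) for key, entry in plddt_json.items()}


def max_pae(pae_json):
    """Return {model_key: largest value in the PAE matrix} for each model."""
    return {
        key: max(max(row) for row in entry["predicted_aligned_error"])
        for key, entry in pae_json.items()
    }


# Inputs copied from the Example column of the Outputs table.
plddt_json = {"plddt_rank_001": {"plddt": [92.1, 88.4, 85.0]}}
pae_json = {"pae_rank_001": {"predicted_aligned_error": [[0.0, 1.2], [1.1, 0.0]]}}

plddt_summary = mean_plddt(plddt_json)
pae_summary = max_pae(pae_json)
```

A per-model mean pLDDT and worst-case PAE give a quick way to compare ranked samples before opening the structures themselves.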

Important Notes

  • Affinity outputs are only produced if the YAML includes at least one ligand sequence and an affinity property; otherwise affinity.json will be empty.
  • When output_format is set to 'mmcif', the structure result keys are renamed to use .cif extensions.
  • Boltz YAML and files must be consistent: any referenced MSA/template in the YAML must exist in boltz_files.
  • Large sampling_steps or diffusion_samples values, or enabling use_potentials, can significantly increase runtime and resource usage.
  • MSA subsampling parameters only affect runs where MSA content is provided via the YAML.
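
The consistency requirement between the YAML and boltz_files can be checked before running the node. The sketch below is a rough illustration under stated assumptions: that a sequence entry references its MSA through an `msa` key, that templates sit in a top-level `templates` list with a `pdb` or `cif` path, and that the value `empty` means no MSA file. None of these key names are confirmed by this page, and `missing_files` is a hypothetical helper.

```python
def missing_files(boltz_yaml, boltz_files):
    """Return file names referenced in the YAML but absent from boltz_files.

    Assumptions (not confirmed by this page): sequence entries carry an
    'msa' key, templates are listed under 'templates' with a 'pdb' or
    'cif' path, and the MSA value 'empty' means no file is needed.
    """
    referenced = []
    for entry in boltz_yaml.get("sequences", []):
        # Each entry looks like {'protein': {...}}; inspect the inner dict.
        for body in entry.values():
            msa = body.get("msa") if isinstance(body, dict) else None
            if msa and msa != "empty":
                referenced.append(msa)
    for template in boltz_yaml.get("templates", []):
        for key in ("pdb", "cif"):
            if key in template:
                referenced.append(template[key])
    return [name for name in referenced if name not in boltz_files]


yaml_cfg = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MKT", "msa": "msa_1.a3m"}}],
    "templates": [{"pdb": "template_1.pdb"}],
}
files = {"msa_1.a3m": ""}
missing = missing_files(yaml_cfg, files)
```

With the sample data above, the template is referenced but not supplied, so it is the only name reported as missing.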

Troubleshooting

  • Error: 'Boltz YAML must contain at least one sequence' — Ensure the YAML from Boltz YAML Combiner includes a valid sequences list.
  • Error: 'Boltz files must be a dictionary' — Connect the boltz_files output from Boltz YAML Combiner directly.
  • Error: 'Affinity prediction requires at least one ligand sequence' — Add a ligand sequence and include an affinity property when requesting affinity.
  • Empty or missing PAE/PDE/pLDDT arrays — These matrices are decoded from NPZ; if the backend did not produce them or decoding failed, outputs may be empty.
  • Unexpected structure file extensions — Verify output_format. Select 'pdb' for .pdb or 'mmcif' for .cif naming.
  • Slow execution or timeouts — Reduce diffusion_samples or sampling_steps, disable use_potentials, lower max_parallel_samples, or decrease num_workers if resource-constrained.
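
The first three errors above can be caught before the node runs with a small pre-flight check. This is a sketch, not the node's actual validation code: `preflight` and its `want_affinity` flag are hypothetical, and the `ligand` entry key is assumed by analogy with the `protein` key shown in the boltz_yaml example.

```python
def preflight(boltz_yaml, boltz_files, want_affinity=False):
    """Raise ValueError using the error messages listed under Troubleshooting.

    Hypothetical helper; the real node may validate differently.
    """
    sequences = boltz_yaml.get("sequences") if isinstance(boltz_yaml, dict) else None
    if not sequences:
        raise ValueError("Boltz YAML must contain at least one sequence")
    if not isinstance(boltz_files, dict):
        raise ValueError("Boltz files must be a dictionary")
    # Assumption: ligand entries look like {'ligand': {...}}.
    has_ligand = any("ligand" in entry for entry in sequences)
    if want_affinity and not has_ligand:
        raise ValueError("Affinity prediction requires at least one ligand sequence")


protein_only = {"version": 1, "sequences": [{"protein": {"id": "A", "sequence": "MKT"}}]}

# Passes silently: one sequence, files is a dict, affinity not requested.
preflight(protein_only, {})

try:
    preflight(protein_only, {}, want_affinity=True)
except ValueError as err:
    message = str(err)
```

Running a check like this upstream keeps failures cheap, since the same mistakes would otherwise surface only after the prediction job has been queued.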