Boltz Predict
Runs Boltz structure prediction from a prepared Boltz YAML configuration and its auxiliary files. Produces ranked structure files and confidence metrics, plus ligand-binding affinity predictions when an affinity property is requested in properties. When one YAML contains multiple sequences, each is run with its own seed offset (seed + index) so that sampling varies between them.

Usage
Use this node after building sequences, constraints, templates, and properties and combining them into a valid Boltz YAML with its auxiliary files. Connect the YAML and files outputs from the Boltz YAML Combiner, set the prediction parameters (sampling_steps, diffusion_samples, recycling_steps, step_scale), and, if the YAML requests affinity, adjust the affinity-specific options. The outputs are ranked structure files, confidence JSON, and, when affinity is requested, affinity JSON.
Inputs
| Field | Required | Type | Description | Example | 
|---|---|---|---|---|
| boltz_yaml | True | BOLTZ_YAML | The Boltz YAML configuration generated by the Boltz YAML Combiner. Must include at least one sequence and adhere to Boltz schema. | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'AAAA', 'msa': 'empty'}}]} | 
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., A3M MSAs, template PDBs). Keys are filenames and values are file contents. | {'msa_1.a3m': '>A\nAAAA', 'template_1.pdb': 'ATOM ...'} | 
| seed | True | INT | Base RNG seed for reproducibility. When the YAML contains multiple sequences, each one is run with seed + index to diversify samples. | 42 | 
| recycling_steps | False | INT | Number of recycling iterations during prediction. | 3 | 
| sampling_steps | False | INT | Number of diffusion sampling steps for structure generation. | 200 | 
| diffusion_samples | False | INT | How many independent diffusion samples to generate per run. | 1 | 
| max_parallel_samples | False | INT | Maximum number of diffusion samples predicted in parallel. The practical limit depends on available hardware, mainly GPU memory. | 5 | 
| step_scale | False | FLOAT | Diffusion step scale (temperature-like control). Lower values generally increase diversity among samples. | 1.638 | 
| output_format | False | STRING | Desired structure file format for outputs. Options: pdb or mmcif. | pdb | 
| num_workers | False | INT | Number of worker processes. Set to 0 to disable multiprocessing. | 0 | 
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 | 
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences. | False | 
| num_subsampled_msa | False | INT | Number of MSA sequences to use if subsampling is enabled. | 1024 | 
| use_potentials | False | BOOLEAN | Enable inference-time potentials to improve quality at the cost of speed. | False | 
| write_full_pae | False | BOOLEAN | If true, outputs include full Predicted Aligned Error (PAE) matrices. | False | 
| write_full_pde | False | BOOLEAN | If true, outputs include full Predicted Distance Error (PDE) matrices. | False | 
| affinity_mw_correction | False | BOOLEAN | Applies molecular weight correction when computing affinity (only used if affinity property is requested). | False | 
| sampling_steps_affinity | False | INT | Sampling steps specifically for affinity prediction (if requested). | 200 | 
| diffusion_samples_affinity | False | INT | Number of diffusion samples specifically for affinity prediction (if requested). | 5 | 
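
The relationship between boltz_yaml and boltz_files can be checked before running the node. The following is a minimal sketch, not part of the node itself: it assumes each entry in sequences is a single-key dict whose value may carry an msa field, and that the value 'empty' means "no MSA file", as in the example row above. The helper names referenced_msa_files and missing_files are hypothetical.

```python
def referenced_msa_files(boltz_yaml: dict) -> set:
    """Collect MSA filenames named by sequence entries (assumed layout)."""
    names = set()
    for entry in boltz_yaml.get("sequences", []):
        if not isinstance(entry, dict):
            continue
        for spec in entry.values():  # e.g. {"protein": {...}}
            msa = spec.get("msa") if isinstance(spec, dict) else None
            # Assumption: the literal string "empty" means no MSA file.
            if isinstance(msa, str) and msa != "empty":
                names.add(msa)
    return names


def missing_files(boltz_yaml: dict, boltz_files: dict) -> list:
    """Return referenced MSA filenames that are absent from boltz_files."""
    return sorted(referenced_msa_files(boltz_yaml) - set(boltz_files))
```

An empty list from missing_files means every MSA named in the YAML has a matching key in boltz_files. Template references are not covered by this sketch.
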
Outputs
| Field | Type | Description | Example | 
|---|---|---|---|
| structures.pdb | PDB | Dictionary mapping ranked structure filenames to content. Filenames are prefixed by sequence names for multi-sequence YAMLs. | {'sequence_0_rank_001.pdb': 'ATOM ...', 'sequence_0_rank_002.pdb': 'ATOM ...'} | 
| confidence.json | JSON | Dictionary of confidence-related outputs (e.g., pLDDT/PAE metrics) keyed by filenames. | {'sequence_0_confidence.json': '{"pLDDT": [...], "PAE": [...]}'} | 
| affinity.json | JSON | Dictionary of affinity prediction outputs keyed by filenames. Only populated if an affinity property is present and at least one ligand exists in sequences. | {'sequence_0_affinity.json': '{"Kd_pred": 1.2e-8, "details": {...}}'} | 
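
Downstream code often needs only the top-ranked structure for each sequence. The sketch below shows one way to select it from the structures dictionary. It assumes filenames follow the `<prefix>_rank_<NNN>.<ext>` pattern seen in the example above and that a lower rank number means a better-ranked model. The function name best_per_sequence is hypothetical.

```python
import re

# Assumed filename pattern, taken from the example: sequence_0_rank_001.pdb
_RANK_PATTERN = re.compile(r"^(?P<prefix>.+)_rank_(?P<rank>\d+)\.(?P<ext>\w+)$")


def best_per_sequence(structures: dict) -> dict:
    """Map each filename prefix to its lowest-numbered ranked filename."""
    best = {}
    for name in structures:
        match = _RANK_PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        prefix, rank = match["prefix"], int(match["rank"])
        if prefix not in best or rank < best[prefix][0]:
            best[prefix] = (rank, name)
    return {prefix: name for prefix, (_, name) in best.items()}
```

The returned filenames can be used as keys into the original structures dictionary to retrieve the file contents.
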
Important Notes
- YAML validity: The YAML must contain at least one sequence and follow the Boltz schema used by the Boltz YAML Combiner.
- Affinity requirements: Affinity prediction runs only when the YAML includes an affinity property, and it requires at least one ligand sequence. Requesting affinity without a ligand raises an error.
- Multiple sequences: If the YAML contains multiple sequences, the node runs each one with an offset seed (seed + index) and prefixes output filenames with the detected sequence names.
- Output format: The output structure format is controlled by output_format (pdb or mmcif); filenames reflect the chosen format.
- Performance: max_parallel_samples, num_workers, and use_potentials significantly affect runtime and resource usage.
- MSA handling: Ensure any MSA or template filenames referenced by the YAML exist as keys in boltz_files. Large MSAs are capped at max_msa_seqs, or subsampled to num_subsampled_msa sequences if subsample_msa is enabled.
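
The seed offset rule from the notes above (seed + index) can be written out as a short sketch. It only illustrates the arithmetic. The function name per_sequence_seeds is hypothetical, and the assumption that the index is zero-based and follows the order of the sequence names is not confirmed by this page.

```python
def per_sequence_seeds(base_seed: int, sequence_names: list) -> dict:
    """Assign seed + index to each sequence name (zero-based index assumed)."""
    return {name: base_seed + index for index, name in enumerate(sequence_names)}
```

With a base seed of 42 and three sequences, this sketch assigns the seeds 42, 43, and 44, so rerunning with the same base seed reproduces the same per-sequence seeds.
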
Troubleshooting
- Error "Boltz YAML must contain at least one sequence": Check that boltz_yaml.sequences is present and non-empty in the output of the YAML Combiner.
- Error "Boltz files must be a dictionary": Provide the auxiliary files as a dict mapping filename -> content.
- Error "Affinity prediction requires at least one ligand sequence": Add a ligand to sequences and ensure an affinity property is included, or remove the affinity property from the YAML.
- Empty or missing outputs: Verify sampling_steps and diffusion_samples are reasonable, and confirm that template/MSA filenames in YAML match keys in boltz_files.
- Unexpected runtime or slow performance: Reduce diffusion_samples or sampling_steps, lower max_parallel_samples, or disable use_potentials to speed up.
- Invalid parameter ranges: Ensure integers and floats are within the documented bounds (e.g., recycling_steps >= 1, step_scale within allowed range).
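
The three documented errors can be caught early with a pre-flight check before the node runs. The sketch below mirrors the error messages listed above but is not the node's own validation code. It assumes properties is a list of dicts in which an affinity request appears as an 'affinity' key, and that a ligand appears as a 'ligand' key in a sequences entry. The function name preflight is hypothetical.

```python
def preflight(boltz_yaml: dict, boltz_files) -> None:
    """Raise ValueError for the input problems listed in Troubleshooting."""
    if not isinstance(boltz_files, dict):
        raise ValueError("Boltz files must be a dictionary")

    sequences = boltz_yaml.get("sequences") or []
    if not sequences:
        raise ValueError("Boltz YAML must contain at least one sequence")

    # Assumed layout: properties = [{"affinity": {...}}, ...]
    properties = boltz_yaml.get("properties") or []
    wants_affinity = any(
        isinstance(prop, dict) and "affinity" in prop for prop in properties
    )
    # Assumed layout: sequences = [{"ligand": {...}}, {"protein": {...}}, ...]
    has_ligand = any(
        isinstance(entry, dict) and "ligand" in entry for entry in sequences
    )
    if wants_affinity and not has_ligand:
        raise ValueError("Affinity prediction requires at least one ligand sequence")
```

Calling preflight(boltz_yaml, boltz_files) returns None when none of the three conditions applies. Parameter ranges are not checked here because this page does not list the exact bounds.
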