Boltz Predict
Runs a full Boltz structure prediction job from a prepared Boltz YAML configuration and its auxiliary files. The node submits the job to the Salt biotech backend and returns predicted structures with confidence metrics, plus affinity metrics when the YAML requests them.

Usage
Use this node after assembling your sequences, constraints, templates, and optional properties into a Boltz YAML bundle (typically via Boltz YAML Combiner). Provide that YAML and its auxiliary files, set the random seed and any optional inference parameters, then run the node to obtain predicted structures (PDB or mmCIF, per output_format) and a confidence JSON. If your YAML includes affinity properties and at least one ligand entity, an affinity JSON is also returned.
Inputs
| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| boltz_yaml | True | BOLTZ_YAML | Structured configuration describing sequences (proteins/DNA/RNA/ligands), constraints, templates, and optional properties for the Boltz run. Typically produced by Boltz YAML Combiner. | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'ACDE...'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., MSA files, template PDB/CIF files) packaged as a filename-to-content mapping. | {'msa_1.a3m': ' |
| seed | True | INT | Base random seed used for the prediction process. | 42 |
| recycling_steps | False | INT | Number of recycling iterations during inference. | 3 |
| sampling_steps | False | INT | Number of diffusion sampling steps. | 200 |
| diffusion_samples | False | INT | How many independent diffusion samples to generate. | 1 |
| max_parallel_samples | False | INT | Maximum number of samples processed in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter). | 1.638 |
| output_format | False | CHOICE | Structure output format (pdb or mmcif). | pdb |
| num_workers | False | INT | Worker processes (0 disables multiprocessing). | 0 |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 |
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to keep when subsampling is enabled. | 1024 |
| use_potentials | False | BOOLEAN | Enable inference-time potentials for improved quality (may be slower). | False |
| write_full_pae | False | BOOLEAN | Write the full Predicted Aligned Error (PAE) matrix. | False |
| write_full_pde | False | BOOLEAN | Write the full Predicted Distance Error (PDE) matrix. | False |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction for affinity output (only used if affinity is predicted). | False |
| sampling_steps_affinity | False | INT | Number of sampling steps for affinity prediction. | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity prediction. | 5 |
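
As a rough illustration of the two required inputs, the sketch below builds a `boltz_yaml` dictionary and a matching `boltz_files` mapping by hand. The protein sequence, SMILES string, and MSA text are toy placeholders. The `msa` key on the protein entry and the `properties` / `affinity` / `binder` layout are assumptions based on the upstream Boltz YAML schema, not something this page confirms; in practice Boltz YAML Combiner would normally produce these objects. `predict_params` is a hypothetical name used only to group the example values from the table.

```python
# Toy inputs for illustration only; sequence, SMILES, and MSA text are placeholders.
protein_sequence = "ACDEFGHIKLMNPQRSTVWY"

boltz_yaml = {
    "version": 1,
    "sequences": [
        # ASSUMPTION: an "msa" key naming an auxiliary file follows the upstream Boltz schema.
        {"protein": {"id": "A", "sequence": protein_sequence, "msa": "msa_1.a3m"}},
        {"ligand": {"id": "B", "smiles": "CCO"}},
    ],
    # ASSUMPTION: upstream Boltz syntax for requesting affinity for ligand B.
    "properties": [{"affinity": {"binder": "B"}}],
}

# Auxiliary files as a filename-to-content mapping, keyed by the name used in the YAML.
boltz_files = {
    "msa_1.a3m": f">A\n{protein_sequence}\n",
}

# Hypothetical grouping of node parameters, using example values from the Inputs table.
predict_params = {
    "seed": 42,
    "recycling_steps": 3,
    "sampling_steps": 200,
    "diffusion_samples": 1,
    "output_format": "pdb",
}
```

The point of the sketch is the pairing: every filename the YAML mentions should appear as a key in `boltz_files`.
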
Outputs
| Field | Type | Description | Example |
|---|---|---|---|
| structures.pdb | PDB | Dictionary mapping generated structure names to PDB (or chosen format) contents for ranked predictions. | {'ranked_0.pdb': ' |
| confidence.json | JSON | Confidence metrics (e.g., per-model scores, optional matrices when enabled). | {'ranking_confidence': {'ranked_0': 0.78}, 'pae': ' |
| affinity.json | JSON | Affinity prediction outputs when affinity is requested in the YAML and ligands are present; empty otherwise. | {'predicted_affinity': {'complex_0': {'kd': 1.2}}} |
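
A minimal sketch of how the outputs might be consumed downstream: it picks the structure whose name has the highest score under `ranking_confidence`. The key names and the `ranked_N` naming are taken from the example column above and are assumptions about the real payload; `best_ranked` is a hypothetical helper, not part of the node.

```python
def best_ranked(structures, confidence):
    """Return (filename, content, score) for the top-scoring structure, or None."""
    # ASSUMPTION: confidence["ranking_confidence"] maps names like "ranked_0" to scores.
    scores = confidence.get("ranking_confidence") or {}
    if not scores:
        return None
    best_name = max(scores, key=scores.get)
    for filename, content in structures.items():
        # Match "ranked_0.pdb" (or another extension) against the key "ranked_0".
        if filename.rsplit(".", 1)[0] == best_name:
            return filename, content, scores[best_name]
    return None


# Toy outputs shaped like the examples in the Outputs table.
structures = {"ranked_0.pdb": "ATOM ...", "ranked_1.pdb": "ATOM ..."}
confidence = {"ranking_confidence": {"ranked_0": 0.78, "ranked_1": 0.65}}
top = best_ranked(structures, confidence)
```

Here `top[0]` is `"ranked_0.pdb"`. Splitting off the extension keeps the lookup working when output_format is set to mmcif.
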
Important Notes
- Affinity requirements: Affinity outputs are only produced when the YAML includes affinity properties and at least one ligand sequence; otherwise affinity.json will be empty.
- Validation: The node checks that boltz_yaml is a dictionary containing at least one sequence and that boltz_files is a dictionary; invalid inputs raise errors.
- Performance settings: Increasing sampling_steps, diffusion_samples, or enabling use_potentials can improve quality at the cost of runtime.
- Output format: output_format controls structure serialization (pdb or mmcif).
- MSA controls: Use max_msa_seqs, subsample_msa, and num_subsampled_msa to manage memory/runtime for large MSAs.
- No trajectory output here: This node does not return diffusion trajectory data; use Boltz Partial Diffusion for trajectories.
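
The validation and affinity rules in the notes above can be approximated before submitting a job. The sketch below is not the node's actual implementation: the function names, the use of `ValueError`, the order of checks, and the `properties` / `affinity` layout are all assumptions. Only the error messages are taken from this page's Troubleshooting list.

```python
def validate_boltz_inputs(boltz_yaml, boltz_files):
    """Raise ValueError if the inputs break the rules described in the notes."""
    if not isinstance(boltz_yaml, dict):
        raise ValueError("Boltz YAML must be a dictionary")
    sequences = boltz_yaml.get("sequences") or []
    if not sequences:
        raise ValueError("Boltz YAML must contain at least one sequence")
    if not isinstance(boltz_files, dict):
        raise ValueError("Boltz files must be a dictionary")

    # ASSUMPTION: affinity is requested via a "properties" list with "affinity" entries.
    properties = boltz_yaml.get("properties") or []
    wants_affinity = any("affinity" in p for p in properties if isinstance(p, dict))
    has_ligand = any("ligand" in s for s in sequences if isinstance(s, dict))
    if wants_affinity and not has_ligand:
        raise ValueError("Affinity prediction requires at least one ligand sequence")


def first_error(boltz_yaml, boltz_files):
    """Return the validation message as a string, or None when the inputs pass."""
    try:
        validate_boltz_inputs(boltz_yaml, boltz_files)
    except ValueError as exc:
        return str(exc)
    return None


protein_only = {"version": 1, "sequences": [{"protein": {"id": "A", "sequence": "ACDE"}}]}
with_affinity = dict(protein_only, properties=[{"affinity": {"binder": "B"}}])
```

With these toy inputs, `first_error(protein_only, {})` returns `None`, while `first_error(with_affinity, {})` returns the ligand message because no ligand entry is present.
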
Troubleshooting
- Error: 'Boltz YAML must be a dictionary': Ensure you pass the YAML object produced by Boltz YAML Combiner, not a string or malformed data.
- Error: 'Boltz YAML must contain at least one sequence': Add at least one protein/DNA/RNA/ligand sequence in the YAML.
- Error: 'Boltz files must be a dictionary': Provide auxiliary files as a mapping of filenames to file contents.
- Error: 'Affinity prediction requires at least one ligand sequence': Add a ligand entity to sequences and include appropriate affinity properties in the YAML.
- Timeouts or long runs: Reduce sampling_steps or diffusion_samples, disable use_potentials, lower max_msa_seqs, or enable subsample_msa.
- Empty outputs: Check that the YAML references auxiliary files by the expected filenames and that boltz_files contains those filenames with valid contents.
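
For the empty-outputs case, a quick cross-check of filenames can help. The sketch below lists files that the YAML references but that are absent from boltz_files or have empty contents. The field names it inspects (`msa` on sequence entries, `cif` and `pdb` on template entries) and the special value `empty` for MSA-free runs are assumptions drawn from the upstream Boltz schema; `missing_aux_files` is a hypothetical helper.

```python
def missing_aux_files(boltz_yaml, boltz_files):
    """Return sorted filenames referenced by the YAML but missing or blank in boltz_files."""
    referenced = set()
    for entry in boltz_yaml.get("sequences") or []:
        for body in entry.values():
            # ASSUMPTION: sequence entries may carry an "msa" filename.
            msa = body.get("msa") if isinstance(body, dict) else None
            # ASSUMPTION: "empty" marks single-sequence mode rather than a file.
            if isinstance(msa, str) and msa not in ("", "empty"):
                referenced.add(msa)
    for template in boltz_yaml.get("templates") or []:
        # ASSUMPTION: templates name their structure file under "cif" or "pdb".
        for key in ("cif", "pdb"):
            value = template.get(key) if isinstance(template, dict) else None
            if isinstance(value, str) and value:
                referenced.add(value)
    available = {name for name, content in boltz_files.items() if content}
    return sorted(referenced - available)


example_yaml = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "ACDE", "msa": "msa_1.a3m"}}],
    "templates": [{"cif": "tmpl.cif"}],
}
missing = missing_aux_files(example_yaml, {"msa_1.a3m": ">A\nACDE\n"})
```

In this toy case `missing` is `["tmpl.cif"]`, since the template file is referenced but was never added to the mapping.
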