Boltz Predict
Runs a Boltz structure prediction job from a prepared Boltz YAML and its auxiliary files. The node validates the inputs, submits a prediction request to the Boltz service, and returns ranked structure files with their confidence metrics, plus affinity predictions when the YAML requests them. Each output is a dictionary, so a single run can return several ranked structures.

Usage
Use this node after assembling sequences, constraints, templates, and optional properties with the Boltz YAML Combiner. Connect the resulting boltz_yaml and boltz_files here, set a seed and optional inference parameters, then execute to obtain predicted structures and confidence JSON. Include an affinity property and at least one ligand sequence in the YAML to get affinity outputs.
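
To illustrate the affinity requirement, a YAML dictionary that requests affinity might look like the sketch below. The `version`, `sequences`, and `protein` keys follow the example in the Inputs table; the `ligand`, `smiles`, `properties`, `affinity`, and `binder` keys are assumptions modeled on the public Boltz YAML schema and are not confirmed for this node. In practice the Boltz YAML Combiner builds this structure for you.

```python
# Hypothetical boltz_yaml with one protein, one ligand, and an affinity request.
boltz_yaml = {
    "version": 1,
    "sequences": [
        {"protein": {"id": "A", "sequence": "MSEQNAKLT"}},  # placeholder sequence
        {"ligand": {"id": "B", "smiles": "CCO"}},  # assumed ligand entry layout
    ],
    "properties": [
        {"affinity": {"binder": "B"}},  # assumed: binder names the ligand chain
    ],
}

# Collect ligand chain IDs and the chains named as affinity binders.
ligand_ids = [
    entry["ligand"]["id"] for entry in boltz_yaml["sequences"] if "ligand" in entry
]
binders = [
    prop["affinity"]["binder"]
    for prop in boltz_yaml.get("properties", [])
    if "affinity" in prop
]
print(ligand_ids, binders)
```

The two lists make it easy to confirm that every requested binder corresponds to a ligand that is actually present.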
Inputs
| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| boltz_yaml | True | BOLTZ_YAML | Boltz configuration produced by BoltzYAMLCombinerNode. Must include at least one sequence and valid chain IDs. | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'MSEQN...'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., MSA and template files). Keys are filenames; values are their file contents. | {'msa_1.a3m': ' |
| seed | True | INT | Random seed for reproducibility. | 42 |
| recycling_steps | False | INT | Number of recycling iterations during prediction (1–20). | 3 |
| sampling_steps | False | INT | Number of diffusion sampling steps (50–500). Higher values can improve quality with more compute. | 200 |
| diffusion_samples | False | INT | Number of independent diffusion samples to generate per prediction (1–25). | 1 |
| max_parallel_samples | False | INT | Maximum number of samples to compute in parallel (1–10). | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter, 1.0–2.0). | 1.638 |
| output_format | False | [pdb, mmcif] | Desired output structure file format. | pdb |
| num_workers | False | INT | Number of worker processes (0 disables multiprocessing). | 0 |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use (256–16384). | 8192 |
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences before use. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to keep when subsampling (256–4096). | 1024 |
| use_potentials | False | BOOLEAN | Enable inference-time potentials for improved quality (slower). | False |
| write_full_pae | False | BOOLEAN | If true, write full Predicted Aligned Error (PAE) matrix. | False |
| write_full_pde | False | BOOLEAN | If true, write full Predicted Distance Error (PDE) matrix. | False |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction when computing affinity (only used if affinity is requested). | False |
| sampling_steps_affinity | False | INT | Sampling steps used for affinity calculation (50–500). | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples used for affinity prediction (1–10). | 5 |
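
The numeric ranges in the table can be checked before a job is submitted. The following is a minimal sketch, not the node's own validation: the `PARAM_RANGES` table and the `out_of_range` helper are hypothetical names, and treating both bounds as inclusive is an assumption.

```python
# Ranges copied from the Inputs table; (low, high) treated as inclusive bounds.
PARAM_RANGES = {
    "recycling_steps": (1, 20),
    "sampling_steps": (50, 500),
    "diffusion_samples": (1, 25),
    "max_parallel_samples": (1, 10),
    "step_scale": (1.0, 2.0),
    "max_msa_seqs": (256, 16384),
    "num_subsampled_msa": (256, 4096),
    "sampling_steps_affinity": (50, 500),
    "diffusion_samples_affinity": (1, 10),
}


def out_of_range(params):
    """Return {name: (value, low, high)} for each parameter outside its range."""
    problems = {}
    for name, value in params.items():
        if name in PARAM_RANGES:
            low, high = PARAM_RANGES[name]
            if not low <= value <= high:
                problems[name] = (value, low, high)
    return problems


# Parameters without a documented range (e.g. seed) are ignored by the check.
print(out_of_range({"seed": 42, "sampling_steps": 200, "recycling_steps": 0}))
```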
Outputs
| Field | Type | Description | Example |
|---|---|---|---|
| structures.pdb | PDB | Dictionary mapping result names to predicted structure contents in the chosen format (PDB or mmCIF). Keys typically include ranked file names. | {'sequence_0_ranked_0.pdb': ' |
| confidence.json | JSON | Dictionary of confidence-related outputs (e.g., pLDDT, PAE summaries) keyed by file name. | {'sequence_0_confidence.json': '{"plddt": [...], "pae": ...}'} |
| affinity.json | JSON | Dictionary of affinity prediction outputs when requested; empty if no affinity property was included. | {'sequence_0_affinity.json': '{"kd": 1.23, "units": "uM"}'} |
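
Because each output is a dictionary of name to content, downstream code usually has to pick one structure and decode the JSON payloads. The sketch below assumes that keys contain `ranked_N` as in the example above, that rank 0 is the best-ranked structure, and that JSON outputs arrive as strings, as the Example column suggests. The helper names are hypothetical and not part of the node.

```python
import json
import re


def rank_of(name):
    """Extract N from a key containing 'ranked_N'; keys without a rank sort last."""
    match = re.search(r"ranked_(\d+)", name)
    return int(match.group(1)) if match else float("inf")


def top_ranked(structures):
    """Return (name, content) of the structure with the lowest rank number."""
    name = min(structures, key=rank_of)
    return name, structures[name]


def parse_json_outputs(outputs):
    """Decode every JSON string in a name->content dictionary."""
    return {name: json.loads(content) for name, content in outputs.items()}


# Placeholder data standing in for real node outputs.
structures = {
    "sequence_0_ranked_1.pdb": "second structure",
    "sequence_0_ranked_0.pdb": "first structure",
}
best_name, best_content = top_ranked(structures)
print(best_name)
```

An empty `affinity.json` dictionary passes through `parse_json_outputs` unchanged, which matches the documented behavior when no affinity property was requested.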
Important Notes
- Affinity requires a ligand: If the YAML includes an affinity property, at least one ligand sequence must be present; otherwise the node raises an error.
- Input validation: boltz_yaml must be a dictionary containing at least one sequence; boltz_files must be a dictionary of auxiliary file contents.
- Multiple outputs: The node may return multiple ranked structures and associated metrics as a dictionary of name->content.
- Output format: Although the output field is named structures.pdb, the actual structure content respects the output_format setting (pdb or mmcif).
- Performance settings: Increasing sampling_steps, diffusion_samples, and enabling use_potentials can improve quality at higher compute cost.
- MSA handling: MSA file preparation and de-duplication are handled upstream by the Boltz YAML Combiner; ensure those inputs are correctly wired.
- Reproducibility: The seed controls stochastic parts of the process; keep it fixed to reproduce results, adjusting only when intentionally exploring variability.
- Execution: The node submits work to the Boltz prediction service and waits for the result, up to a fixed server-side timeout.
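
The affinity and input-validation notes describe preconditions that can be reproduced in a pre-flight check. This is a sketch under stated assumptions, not the node's implementation: the `properties`, `affinity`, and `ligand` key layout and the use of `ValueError` are assumptions, and `validate_inputs` is a hypothetical helper.

```python
def validate_inputs(boltz_yaml, boltz_files):
    """Raise ValueError if the inputs violate the documented preconditions."""
    if not isinstance(boltz_yaml, dict) or not boltz_yaml.get("sequences"):
        raise ValueError("Boltz YAML must contain at least one sequence")
    if not isinstance(boltz_files, dict):
        raise ValueError("Boltz files must be a dictionary")

    # Assumed layout: properties is a list of single-key dicts such as {"affinity": {...}}.
    wants_affinity = any(
        "affinity" in prop for prop in boltz_yaml.get("properties", [])
    )
    has_ligand = any("ligand" in entry for entry in boltz_yaml["sequences"])
    if wants_affinity and not has_ligand:
        raise ValueError("Affinity prediction requires at least one ligand sequence")


# A protein-only YAML without an affinity property passes the check.
validate_inputs({"sequences": [{"protein": {"id": "A", "sequence": "MSEQ"}}]}, {})
```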
Troubleshooting
- Error: Affinity prediction requires at least one ligand sequence: Add a ligand sequence to the YAML and ensure a property of type affinity is included.
- Error: Boltz YAML must contain at least one sequence: Verify the upstream combiner produced a non-empty sequences list.
- Error: Boltz files must be a dictionary: Connect the boltz_files output from Boltz YAML Combiner directly; do not pass a list or string.
- Empty affinity.json output: This is expected if no affinity property was provided in the YAML; add it via the property builder if needed.
- Unexpected structure format: The output field is always named structures.pdb, so a .pdb-like name does not mean the content is PDB. If you requested mmcif, confirm that output_format is set to mmcif; the content follows that setting even when the display name is generic.
- Slow runtime: Reduce diffusion_samples, sampling_steps, or disable use_potentials; also consider lowering max_parallel_samples or num_workers depending on resource limits.
- Invalid chain IDs or missing files: Ensure every chain ID referenced in boltz_yaml is defined in its sequences, and that every auxiliary file name it references matches a key in boltz_files.
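
For the missing-files case, a quick cross-check between the YAML and `boltz_files` can narrow down the mismatch. The sketch below is a heuristic rather than the node's logic: it treats any string in the YAML that ends with one of a few assumed extensions as a file reference. The extension list and both helper names are hypothetical.

```python
FILE_SUFFIXES = (".a3m", ".csv", ".cif", ".pdb")  # assumed auxiliary file extensions


def string_values(node):
    """Yield every string value found anywhere in a nested dict/list structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from string_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from string_values(item)


def check_file_references(boltz_yaml, boltz_files):
    """Return (missing, unused) file names as sorted lists."""
    referenced = {s for s in string_values(boltz_yaml) if s.endswith(FILE_SUFFIXES)}
    missing = sorted(referenced - set(boltz_files))  # named in YAML, not supplied
    unused = sorted(set(boltz_files) - referenced)   # supplied, never named in YAML
    return missing, unused


# The "msa" key here is illustrative; the check itself does not depend on key names.
example_yaml = {
    "sequences": [{"protein": {"id": "A", "sequence": "MSEQ", "msa": "msa_1.a3m"}}]
}
print(check_file_references(example_yaml, {"msa_1.a3m": "", "extra.a3m": ""}))
```

A non-empty `missing` list points at the names to fix upstream in the Boltz YAML Combiner; `unused` entries are harmless but may indicate a wiring mistake.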