Boltz Predict
Runs Boltz structure prediction from a prepared Boltz YAML and its auxiliary files. It produces ranked 3D structures plus associated confidence and scoring data. If affinity prediction is requested and valid ligand inputs are present, it also returns affinity estimates. PAE and PDE matrices and per-residue pLDDT scores are decoded from NPZ and returned as JSON-serializable arrays.

Usage
Use this node after assembling inputs with the Boltz Sequence, Constraint, Template, and Property builders and combining them with Boltz List Combiner and Boltz YAML Combiner. Connect the resulting boltz_yaml and boltz_files here, set the prediction parameters (for example sampling_steps, diffusion_samples, and output_format), and run the node to obtain structures and confidence/affinity outputs. Set output_format to 'pdb' or 'mmcif' to choose the structure file format.
Inputs
| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| boltz_yaml | True | BOLTZ_YAML | The Boltz configuration produced by Boltz YAML Combiner. Must contain at least one valid sequence entry. | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'MKT...', '_sequence_name': 'seqA'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., MSA files, templates) produced by Boltz YAML Combiner. | {'msa_1.a3m': ' |
| seed | True | INT | Random seed for reproducibility. | 42 |
| recycling_steps | False | INT | Number of recycling steps during prediction. | 3 |
| sampling_steps | False | INT | Number of sampling steps for the main diffusion process. | 200 |
| diffusion_samples | False | INT | How many independent samples to generate. | 1 |
| max_parallel_samples | False | INT | Maximum number of samples processed in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter). | 1.638 |
| output_format | False | CHOICE['pdb','mmcif'] | Format of the output structures. When 'mmcif' is chosen, structure file names will use .cif extension. | pdb |
| num_workers | False | INT | Number of worker processes (0 disables multiprocessing). | 0 |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to consider. | 8192 |
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to keep when subsampling is enabled. | 1024 |
| use_potentials | False | BOOLEAN | Enable inference-time potentials intended to improve structure quality; prediction may be slower when enabled. | False |
| write_full_pae | False | BOOLEAN | If true, write full PAE matrices to outputs. | False |
| write_full_pde | False | BOOLEAN | If true, write full PDE matrices to outputs. | False |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction when computing affinity (only relevant if affinity is requested). | False |
| sampling_steps_affinity | False | INT | Sampling steps to use for affinity prediction runs. | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity prediction. | 5 |
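
When runs need to be recorded or compared, the parameters in the table can be mirrored in a small settings object. The sketch below is illustrative only: `BoltzPredictSettings` is a hypothetical helper, not part of the node; the values are the example values from the table, which are not necessarily the node's defaults; and the "at least 1" checks are an assumed sanity rule rather than documented behavior.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BoltzPredictSettings:
    """Hypothetical container mirroring the Inputs table (not part of the node)."""

    seed: int = 42
    recycling_steps: int = 3
    sampling_steps: int = 200
    diffusion_samples: int = 1
    max_parallel_samples: int = 5
    step_scale: float = 1.638
    output_format: str = "pdb"
    num_workers: int = 0
    max_msa_seqs: int = 8192
    subsample_msa: bool = False
    num_subsampled_msa: int = 1024
    use_potentials: bool = False
    write_full_pae: bool = False
    write_full_pde: bool = False
    affinity_mw_correction: bool = False
    sampling_steps_affinity: int = 200
    diffusion_samples_affinity: int = 5

    def __post_init__(self):
        # output_format is a two-way choice according to the table.
        if self.output_format not in ("pdb", "mmcif"):
            raise ValueError("output_format must be 'pdb' or 'mmcif'")
        # Assumed sanity rule: these counts must be positive.
        for name in ("sampling_steps", "diffusion_samples", "max_parallel_samples"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")


settings = BoltzPredictSettings(diffusion_samples=5, output_format="mmcif")
print(asdict(settings))
```

Freezing the dataclass keeps a recorded configuration from being changed after a run has started.
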
Outputs
| Field | Type | Description | Example |
|---|---|---|---|
| structures.pdb | PDB | Ranked structure files keyed by filename. If output_format is 'mmcif', keys use .cif extension. | {'rank_001_model_1.pdb': 'ATOM ...'} |
| confidence.json | JSON | Per-structure confidence data (e.g., scores/metrics) keyed by filename. | {'confidence_rank_001.json': {'iptm': 0.78, 'ptm': 0.65}} |
| affinity.json | JSON | Affinity prediction results (only populated if affinity was requested and ligands are present). | {'affinity_rank_001.json': {'kd': 1.2e-07, 'units': 'M'}} |
| pae.json | JSON | Predicted Aligned Error matrices per model, decoded from NPZ to JSON-serializable arrays. | {'pae_rank_001': {'predicted_aligned_error': [[0.0, 1.2], [1.1, 0.0]]}} |
| pde.json | JSON | Predicted Distance Error matrices per model, decoded from NPZ to JSON-serializable arrays. | {'pde_rank_001': {'predicted_distance_error': [[0.0, 0.9], [0.8, 0.0]]}} |
| plddt.json | JSON | Per-residue pLDDT scores per model, decoded from NPZ to JSON-serializable arrays. | {'plddt_rank_001': {'plddt': [92.1, 88.4, 85.0]}} |
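
Because the confidence arrays arrive as plain nested lists, they can be summarized without extra dependencies. The following is a minimal sketch that uses the example payloads from the table; the inner key names (`plddt`, `predicted_aligned_error`) are taken from those examples and may differ in real outputs, and `mean_plddt` and `max_pae` are hypothetical helper names.

```python
from statistics import mean

# Example payloads copied from the Outputs table.
plddt_json = {"plddt_rank_001": {"plddt": [92.1, 88.4, 85.0]}}
pae_json = {"pae_rank_001": {"predicted_aligned_error": [[0.0, 1.2], [1.1, 0.0]]}}


def mean_plddt(plddt_output):
    """Average the per-residue pLDDT scores for each model key."""
    return {key: mean(entry["plddt"]) for key, entry in plddt_output.items()}


def max_pae(pae_output):
    """Find the largest predicted aligned error in each model's matrix."""
    return {
        key: max(max(row) for row in entry["predicted_aligned_error"])
        for key, entry in pae_output.items()
    }


print(mean_plddt(plddt_json))
print(max_pae(pae_json))
```
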
Important Notes
- Affinity outputs are only produced if the YAML includes at least one ligand sequence and an affinity property; otherwise affinity.json will be empty.
- When output_format is set to 'mmcif', the structure result keys are renamed to use .cif extensions.
- Boltz YAML and files must be consistent: any referenced MSA/template in the YAML must exist in boltz_files.
- Large values of sampling_steps or diffusion_samples, or enabling use_potentials, can significantly increase runtime and resource usage.
- The MSA subsampling parameters (subsample_msa and num_subsampled_msa) only affect runs where MSA content is provided via the YAML.
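
The consistency requirement between the YAML and boltz_files can be checked before running the node. The sketch below rests on assumptions that this page does not confirm: that each sequence entry references its MSA through an `msa` key, that templates sit in a top-level `templates` list under `cif` or `pdb` keys, that the value `empty` means "no MSA", and that boltz_files is keyed by the same names the YAML uses. `missing_files` is a hypothetical helper, and the sequence `MKT` is a toy placeholder.

```python
def missing_files(boltz_yaml, boltz_files):
    """Return referenced file names that are absent from boltz_files.

    Assumes MSAs are referenced by an 'msa' key on each sequence entry and
    templates by 'cif'/'pdb' keys under a top-level 'templates' list.
    """
    referenced = []
    for entry in boltz_yaml.get("sequences", []):
        for body in entry.values():  # e.g. {'protein': {...}}
            msa = body.get("msa") if isinstance(body, dict) else None
            # Assumption: the literal value 'empty' means no MSA file is used.
            if isinstance(msa, str) and msa != "empty":
                referenced.append(msa)
    for template in boltz_yaml.get("templates", []):
        for key in ("cif", "pdb"):
            if key in template:
                referenced.append(template[key])
    return [name for name in referenced if name not in boltz_files]


yaml_cfg = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MKT", "msa": "msa_1.a3m"}}],
    "templates": [{"cif": "template_1.cif"}],
}
files = {"msa_1.a3m": ">A\nMKT\n"}
print(missing_files(yaml_cfg, files))  # -> ['template_1.cif']
```
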
Troubleshooting
- Error: 'Boltz YAML must contain at least one sequence' — Ensure the YAML from Boltz YAML Combiner includes a valid sequences list.
- Error: 'Boltz files must be a dictionary' — Connect the boltz_files output from Boltz YAML Combiner directly.
- Error: 'Affinity prediction requires at least one ligand sequence' — Add a ligand sequence and include an affinity property when requesting affinity.
- Empty or missing PAE/PDE/pLDDT arrays — These matrices are decoded from NPZ; if the backend did not produce them or decoding failed, outputs may be empty.
- Unexpected structure file extensions — Verify output_format. Select 'pdb' for .pdb or 'mmcif' for .cif naming.
- Slow execution or timeouts — Reduce diffusion_samples or sampling_steps, disable use_potentials, lower max_parallel_samples, or decrease num_workers if resource-constrained.
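
The three input errors listed above can be caught early with a pre-flight check on the two inputs. This is a sketch under stated assumptions, not the node's actual validation code: `validate_inputs` is a hypothetical function, the use of `ValueError` is a guess, and the layout (a `properties` list containing an `affinity` entry, and ligand entries keyed by `ligand`) is assumed rather than confirmed by this page. Only the message strings come from the list above.

```python
def validate_inputs(boltz_yaml, boltz_files):
    """Raise ValueError with the documented messages for common input mistakes."""
    if not isinstance(boltz_files, dict):
        raise ValueError("Boltz files must be a dictionary")

    sequences = boltz_yaml.get("sequences") or []
    if not sequences:
        raise ValueError("Boltz YAML must contain at least one sequence")

    # Assumed layout: affinity is requested through a 'properties' list,
    # and ligands appear as {'ligand': {...}} sequence entries.
    properties = boltz_yaml.get("properties") or []
    wants_affinity = any("affinity" in prop for prop in properties)
    has_ligand = any("ligand" in entry for entry in sequences)
    if wants_affinity and not has_ligand:
        raise ValueError("Affinity prediction requires at least one ligand sequence")


cfg = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MKT"}}],
    "properties": [{"affinity": {"binder": "B"}}],
}
try:
    validate_inputs(cfg, {})
except ValueError as err:
    print(err)  # Affinity prediction requires at least one ligand sequence
```
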