Boltz Predict

Runs Boltz structure prediction from a prepared Boltz YAML and its auxiliary files. It produces ranked 3D structures plus associated confidence and scoring data. If affinity prediction is requested and valid ligand inputs are present, it also returns affinity estimates. PAE and PDE matrices and per-residue pLDDT scores are decoded from NPZ and returned as JSON-serializable arrays.

Usage

Use this node after assembling inputs with Boltz Sequence/Constraint/Template/Property builders and combining them with Boltz List Combiner and Boltz YAML Combiner. Connect the resulting boltz_yaml and boltz_files here, set prediction parameters (steps, samples, format), and run to obtain structures and confidence/affinity outputs. Choose output_format to get PDB or mmCIF files.

Inputs

Field	Required	Type	Description	Example
boltz_yaml	True	BOLTZ_YAML	The Boltz configuration produced by Boltz YAML Combiner. Must contain at least one valid sequence entry.	{'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'MKT...', '_sequence_name': 'seqA'}}]}
boltz_files	True	BOLTZ_FILES	Auxiliary files referenced by the YAML (e.g., MSA files, templates) produced by Boltz YAML Combiner.	{'msa_1.a3m': '', 'template_1.pdb': 'ATOM ...'}
seed	True	INT	Random seed for reproducibility.	42
recycling_steps	False	INT	Number of recycling steps during prediction.	3
sampling_steps	False	INT	Number of sampling steps for the main diffusion process.	200
diffusion_samples	False	INT	How many independent samples to generate.	1
max_parallel_samples	False	INT	Maximum number of samples processed in parallel.	5
step_scale	False	FLOAT	Diffusion step scale (temperature-like parameter).	1.638
output_format	False	CHOICE['pdb','mmcif']	Format of the output structures. When 'mmcif' is chosen, structure file names will use .cif extension.	pdb
num_workers	False	INT	Number of worker processes (0 disables multiprocessing).	0
max_msa_seqs	False	INT	Maximum number of MSA sequences to consider.	8192
subsample_msa	False	BOOLEAN	Whether to subsample MSA sequences.	False
num_subsampled_msa	False	INT	Number of MSA sequences to keep when subsampling is enabled.	1024
use_potentials	False	BOOLEAN	Enable inference-time potentials to improve quality (may be slower).	False
write_full_pae	False	BOOLEAN	If true, write full PAE matrices to outputs.	False
write_full_pde	False	BOOLEAN	If true, write full PDE matrices to outputs.	False
affinity_mw_correction	False	BOOLEAN	Apply molecular weight correction when computing affinity (only relevant if affinity is requested).	False
sampling_steps_affinity	False	INT	Sampling steps to use for affinity prediction runs.	200
diffusion_samples_affinity	False	INT	Number of diffusion samples for affinity prediction.	5
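
As a rough illustration of how these parameters fit together, the sketch below collects a subset of them in a small typed container. The class name PredictParams and its methods are hypothetical and not part of the node. The values are the examples from the table above, not confirmed defaults, and the lower bounds checked in validate are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PredictParams:
    """Hypothetical container for a subset of the node's inputs."""

    # Example values from the Inputs table, not confirmed defaults.
    seed: int = 42
    recycling_steps: int = 3
    sampling_steps: int = 200
    diffusion_samples: int = 1
    step_scale: float = 1.638
    output_format: str = "pdb"

    def validate(self) -> None:
        # output_format is documented as a choice between 'pdb' and 'mmcif'.
        if self.output_format not in ("pdb", "mmcif"):
            raise ValueError(
                f"output_format must be 'pdb' or 'mmcif', got {self.output_format!r}"
            )
        # Assumption: at least one sampling step and one sample are needed.
        for name in ("sampling_steps", "diffusion_samples"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")

    def structure_extension(self) -> str:
        # Per the table, 'mmcif' output uses the .cif extension.
        return ".cif" if self.output_format == "mmcif" else ".pdb"


params = PredictParams(output_format="mmcif")
params.validate()
extension = params.structure_extension()
```

Validating before a run catches a mistyped output_format early, which is cheaper than discovering it after a long prediction.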

Outputs

Field	Type	Description	Example
structures.pdb	PDB	Ranked structure files keyed by filename. If output_format is 'mmcif', keys use .cif extension.	{'rank_001_model_1.pdb': 'ATOM ...'}
confidence.json	JSON	Per-structure confidence data (e.g., scores/metrics) keyed by filename.	{'confidence_rank_001.json': {'iptm': 0.78, 'ptm': 0.65}}
affinity.json	JSON	Affinity prediction results (only populated if affinity was requested and ligands are present).	{'affinity_rank_001.json': {'kd': 1.2e-07, 'units': 'M'}}
pae.json	JSON	Predicted Aligned Error matrices per model, decoded from NPZ to JSON-serializable arrays.	{'pae_rank_001': {'predicted_aligned_error': [[0.0, 1.2], [1.1, 0.0]]}}
pde.json	JSON	Predicted Distance Error matrices per model, decoded from NPZ to JSON-serializable arrays.	{'pde_rank_001': {'predicted_distance_error': [[0.0, 0.9], [0.8, 0.0]]}}
plddt.json	JSON	Per-residue pLDDT scores per model, decoded from NPZ to JSON-serializable arrays.	{'plddt_rank_001': {'plddt': [92.1, 88.4, 85.0]}}
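
A common next step is to reduce the per-residue pLDDT arrays to one number per model. The sketch below assumes plddt.json has the shape shown in the Example column (a dictionary keyed by model, each holding a 'plddt' list); the actual key names may differ. The function name and the second, empty entry in the sample data are hypothetical.

```python
from statistics import mean


def mean_plddt_per_model(plddt_json):
    """Return {model_key: mean pLDDT}, skipping models with no scores."""
    summary = {}
    for model_key, payload in plddt_json.items():
        scores = payload.get("plddt") or []
        # An empty array can occur if the backend produced no pLDDT data.
        if scores:
            summary[model_key] = mean(scores)
    return summary


# First entry copies the table example; the second is a made-up empty case.
example = {
    "plddt_rank_001": {"plddt": [92.1, 88.4, 85.0]},
    "plddt_rank_002": {"plddt": []},
}
summary = mean_plddt_per_model(example)
best_model = max(summary, key=summary.get)
```

The same pattern applies to pae.json and pde.json, with the inner key changed to match those outputs.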

Important Notes

Affinity outputs are only produced if the YAML includes at least one ligand sequence and an affinity property; otherwise affinity.json will be empty.
When output_format is set to 'mmcif', the structure result keys are renamed to use .cif extensions.
Boltz YAML and files must be consistent: any referenced MSA/template in the YAML must exist in boltz_files.
Large sampling_steps, diffusion_samples, or enabling potentials can significantly increase runtime and resource usage.
MSA subsampling parameters only affect runs where MSA content is provided via the YAML.
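
The consistency requirement between the YAML and boltz_files can be checked before running. This is a minimal sketch, not the node's own validation. It assumes MSA paths sit under an 'msa' key of each protein entry and template paths under a 'cif' or 'pdb' key of a top-level 'templates' list, following the upstream Boltz YAML schema. The node's internal representation may differ, and the function name is hypothetical.

```python
def find_missing_files(boltz_yaml, boltz_files):
    """Return the sorted file names referenced in the YAML but absent from boltz_files."""
    referenced = set()

    for entry in boltz_yaml.get("sequences", []):
        msa = entry.get("protein", {}).get("msa")
        # Assumption: 'empty' marks single-sequence mode and is not a file.
        if msa and msa != "empty":
            referenced.add(msa)

    for template in boltz_yaml.get("templates", []):
        for key in ("cif", "pdb"):
            if template.get(key):
                referenced.add(template[key])

    return sorted(referenced - set(boltz_files))


yaml_config = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MKT", "msa": "msa_1.a3m"}}],
    "templates": [{"pdb": "template_1.pdb"}],
}
files = {"msa_1.a3m": ""}
missing = find_missing_files(yaml_config, files)
```

Here the template is referenced but not supplied, so it is reported as missing.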

Troubleshooting

Error: 'Boltz YAML must contain at least one sequence' — Ensure the YAML from Boltz YAML Combiner includes a valid sequences list.
Error: 'Boltz files must be a dictionary' — Connect the boltz_files output from Boltz YAML Combiner directly.
Error: 'Affinity prediction requires at least one ligand sequence' — Add a ligand sequence and include an affinity property when requesting affinity.
Empty or missing PAE/PDE/pLDDT arrays — These matrices are decoded from NPZ; if the backend did not produce them or decoding failed, outputs may be empty.
Unexpected structure file extensions — Verify output_format. Select 'pdb' for .pdb or 'mmcif' for .cif naming.
Slow execution or timeouts — Reduce diffusion_samples or sampling_steps, disable use_potentials, lower max_parallel_samples, or decrease num_workers if resource-constrained.
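
The first three errors above can be anticipated with a pre-flight check. The sketch below mirrors the conditions that the error messages describe; it is not the node's actual code. It assumes affinity requests appear as an 'affinity' key inside a top-level 'properties' list and ligands as 'ligand' entries in 'sequences', as in the upstream Boltz YAML schema. The function name is hypothetical.

```python
def preflight_check(boltz_yaml, boltz_files):
    """Return a list of problems matching the documented error messages."""
    problems = []
    config = boltz_yaml if isinstance(boltz_yaml, dict) else {}
    sequences = config.get("sequences") or []

    if not sequences:
        problems.append("Boltz YAML must contain at least one sequence")

    if not isinstance(boltz_files, dict):
        problems.append("Boltz files must be a dictionary")

    # Assumption: an affinity request is a properties entry with an 'affinity' key.
    wants_affinity = any("affinity" in prop for prop in config.get("properties") or [])
    has_ligand = any("ligand" in entry for entry in sequences)
    if wants_affinity and not has_ligand:
        problems.append("Affinity prediction requires at least one ligand sequence")

    return problems


config = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MKT"}}],
    "properties": [{"affinity": {"binder": "B"}}],
}
problems = preflight_check(config, {})
```

Returning a list instead of raising on the first failure lets all problems be reported in one pass.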