Boltz Partial Diffusion

Runs Boltz partial diffusion using provided reference structures to refine or partially resample protein (and optional ligand) configurations. Supports single chains and multimeric complexes, optional fixed chains, and can save intermediate diffusion trajectories. When affinity properties are requested in the YAML and ligands are present, it also produces affinity-related outputs.

Usage

Use this node to refine structures guided by an initial/reference PDB. Typical workflow: build sequences, constraints, templates, and properties with the Boltz builders; combine them into a Boltz YAML; provide one or more reference PDB structures; then run partial diffusion. Enable save_trajectory to keep the diffusion trajectory for analysis, and optionally list chain IDs in fixed_chains to keep those chains fixed during diffusion.

Inputs

Field	Required	Type	Description	Example
boltz_yaml	True	BOLTZ_YAML	Combined Boltz YAML configuration describing sequences, templates, constraints, and optional properties (e.g., affinity).	{'sequences': [{'protein': {'id': 'A', 'sequence': 'ACDEFGHIKLMNPQRSTVWY'}}], 'properties': [{'affinity': {'target': 'ligand_1'}}]}
boltz_files	True	BOLTZ_FILES	Auxiliary files referenced by the Boltz YAML (e.g., template structures, constraint files).	{'template_A.pdb': '', 'ligand_params.sdf': ''}
reference_structures	True	PDB	Reference structure(s) used to anchor partial diffusion. Provide as a dictionary of {name: pdb_content}.	{'ref_complex_A_B.pdb': ''}
seed	True	INT	Base random seed for reproducibility. For multiple sequence entries, per-sequence seeds may be offset internally.	42
partial_diffusion_fraction	False	FLOAT	Fraction of the diffusion process to apply (0.0 = no diffusion; 1.0 = full diffusion). Lower values keep structures closer to the reference.	0.25
save_trajectory	False	BOOLEAN	If true, saves the diffusion trajectory as PDBs for analysis.	True
fixed_chains	False	STRING	Comma-separated chain IDs to keep fixed (must match chain IDs defined in the YAML sequences). Leave empty to allow all chains to move.	A,B
recycling_steps	False	INT	Number of recycling steps during inference.	3
sampling_steps	False	INT	Number of sampling steps in diffusion.	200
diffusion_samples	False	INT	Number of diffusion samples to generate per run.	1
max_parallel_samples	False	INT	Maximum number of diffusion samples to run in parallel.	5
step_scale	False	FLOAT	Diffusion step scale (temperature-like parameter).	1.638
output_format	False	STRING	Output structure format.	pdb
num_workers	False	INT	Number of worker processes to use (0 disables multiprocessing).	0
use_potentials	False	BOOLEAN	Use inference-time potentials. Recommended for partial diffusion.	True
write_full_pae	False	BOOLEAN	If true, outputs the full PAE matrix in confidence results.	False
write_full_pde	False	BOOLEAN	If true, outputs the full PDE matrix in confidence results.	False
max_msa_seqs	False	INT	Maximum number of MSA sequences to use.	8192
subsample_msa	False	BOOLEAN	Enable MSA subsampling.	False
num_subsampled_msa	False	INT	Number of MSA sequences to use when subsampling is enabled.	1024
affinity_mw_correction	False	BOOLEAN	Apply molecular weight correction for affinity prediction (only relevant when affinity is requested in the YAML).	False
sampling_steps_affinity	False	INT	Sampling steps for affinity prediction (only applies if affinity is requested).	200
diffusion_samples_affinity	False	INT	Number of diffusion samples for affinity prediction (only applies if affinity is requested).	5
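
The required fields and a few optional ones from the table above can be collected as a plain dictionary before being wired into the node. This is a minimal sketch: `make_inputs` is a hypothetical helper, not part of the node, and the default values are simply the ones shown in the Example column. The range check on partial_diffusion_fraction follows the 0.0 to 1.0 range described in the table.

```python
from typing import Any, Dict, Optional


def make_inputs(
    boltz_yaml: Dict[str, Any],
    boltz_files: Optional[Dict[str, str]],
    reference_structures: Dict[str, str],
    seed: int = 42,
    partial_diffusion_fraction: float = 0.25,
    fixed_chains: str = "",
    save_trajectory: bool = False,
) -> Dict[str, Any]:
    """Collect node inputs into one dict (hypothetical helper for illustration)."""
    # The table describes the fraction as ranging from 0.0 to 1.0.
    if not 0.0 <= partial_diffusion_fraction <= 1.0:
        raise ValueError("partial_diffusion_fraction must be within [0.0, 1.0]")
    return {
        "boltz_yaml": boltz_yaml,
        "boltz_files": boltz_files or {},
        "reference_structures": reference_structures,
        "seed": seed,
        "partial_diffusion_fraction": partial_diffusion_fraction,
        "fixed_chains": fixed_chains,
        "save_trajectory": save_trajectory,
    }


inputs = make_inputs(
    boltz_yaml={"sequences": [{"protein": {"id": "A", "sequence": "ACDEFGHIKLMNPQRSTVWY"}}]},
    boltz_files=None,
    reference_structures={"ref_A.pdb": "ATOM ..."},  # placeholder for real PDB text
    fixed_chains="A",
)
print(sorted(inputs))
```

A lower partial_diffusion_fraction keeps results closer to the reference, so starting small and increasing it only if the refinement is too conservative is a reasonable default strategy.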

Outputs

Field	Type	Description	Example
structures.pdb	PDB	Generated or refined structures keyed by name. Each entry is a PDB string.	{'seqA_rank_1.pdb': ''}
confidence.json	JSON	Confidence metrics for generated structures (e.g., pLDDT and optionally full PAE/PDE if enabled).	{'seqA_confidence.json': {'plddt': [0.85, 0.9], 'pae': None, 'pde': None}}
trajectory.pdb	PDB	Diffusion trajectory PDBs if save_trajectory is enabled; empty otherwise.	{'seqA_traj_0.pdb': ''}
affinity.json	JSON	Affinity prediction outputs when affinity is requested in the YAML and ligands are present; empty otherwise.	{'seqA_affinity.json': {'predicted_affinity': 1.23}}
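
As an illustration of consuming these outputs, the sketch below computes a mean pLDDT per entry of confidence.json. It assumes the dictionary layout shown in the Example column (a `plddt` list per entry); `summarize_confidence` is a hypothetical helper, and real outputs may carry additional keys.

```python
from statistics import mean
from typing import Any, Dict


def summarize_confidence(confidence: Dict[str, Dict[str, Any]]) -> Dict[str, float]:
    """Return the mean pLDDT per entry, skipping entries without pLDDT values."""
    summary: Dict[str, float] = {}
    for name, metrics in confidence.items():
        plddt = metrics.get("plddt")
        if plddt:  # skip missing or empty lists
            summary[name] = mean(plddt)
    return summary


# Layout copied from the Example column of the outputs table.
confidence = {"seqA_confidence.json": {"plddt": [0.85, 0.9], "pae": None, "pde": None}}
summary = summarize_confidence(confidence)
print(summary)
```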

Important Notes

Reference structures are required and must be provided in PDB format as a dictionary of {name: pdb_content}.
Chain IDs referenced in fixed_chains must match the IDs defined in the YAML sequences.
Affinity outputs are only produced when the YAML includes an affinity property and at least one ligand is present.
Trajectory outputs are only populated if save_trajectory is true.
For multimers, ensure all chain IDs in the YAML and reference PDBs are consistent.
MSA-related parameters (max_msa_seqs, subsample_msa, num_subsampled_msa) control multiple sequence alignment usage and can affect runtime and quality.
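
The chain-consistency note can be checked before running the node. The sketch below reads chain IDs from ATOM/HETATM records (column 22 in the fixed-width PDB format) and reports which expected IDs are absent from each reference structure. `pdb_chain_ids` and `missing_chains` are hypothetical helpers, and the two PDB records are toy data used only to exercise the parser.

```python
from typing import Dict, Iterable, Set


def pdb_chain_ids(pdb_text: str) -> Set[str]:
    """Return chain IDs found in ATOM/HETATM records (column 22 of the PDB format)."""
    ids: Set[str] = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21].strip()
            if chain:
                ids.add(chain)
    return ids


def missing_chains(
    expected_ids: Iterable[str], reference_structures: Dict[str, str]
) -> Dict[str, Set[str]]:
    """Map each reference name to the expected chain IDs it lacks."""
    expected = set(expected_ids)
    report: Dict[str, Set[str]] = {}
    for name, pdb_text in reference_structures.items():
        missing = expected - pdb_chain_ids(pdb_text)
        if missing:
            report[name] = missing
    return report


# Toy records: only the fixed-width layout matters here, not the coordinates.
reference_structures = {
    "ref_complex_A_B.pdb": (
        "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N\n"
        "ATOM      2  N   GLY B   1      12.560   7.020  -5.100  1.00  0.00           N\n"
    )
}
report = missing_chains({"A", "B", "C"}, reference_structures)
print(report)
```

An empty report means every expected chain ID appears in every reference structure.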

Troubleshooting

Error: 'Boltz YAML must contain at least one sequence' — Ensure your boltz_yaml has a non-empty 'sequences' list.
Error: 'Reference structures are required for partial diffusion and must be in PDB format' — Provide at least one PDB in reference_structures as a dict {name: pdb_content}.
Error: 'Fixed chains {...} not found in YAML sequences' — Check that fixed_chains match chain IDs defined under sequences in the YAML.
No affinity.json output — Confirm your YAML includes an affinity property and that a ligand sequence is defined.
Empty trajectory output — Set save_trajectory to true to record diffusion trajectories.
Invalid chain IDs or mismatched chain labeling — Align chain IDs between YAML and reference PDB files.
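
Several of the issues above can be caught before the node runs. The following is a rough pre-flight sketch that mirrors the conditions described in this section; it is not the node's own validation code. `yaml_chain_ids` and `preflight` are hypothetical helpers, and the assumption that a sequence `id` may be either a string or a list of strings is mine.

```python
from typing import Any, Dict, List, Set


def yaml_chain_ids(boltz_yaml: Dict[str, Any]) -> Set[str]:
    """Collect chain IDs from 'sequences'; an 'id' is assumed to be a str or a list."""
    ids: Set[str] = set()
    for entry in boltz_yaml.get("sequences") or []:
        for spec in entry.values():  # e.g. {"protein": {"id": "A", ...}}
            chain_id = spec.get("id")
            if isinstance(chain_id, str):
                ids.add(chain_id)
            elif chain_id:
                ids.update(chain_id)
    return ids


def preflight(
    boltz_yaml: Dict[str, Any],
    reference_structures: Any,
    fixed_chains: str = "",
) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems: List[str] = []
    sequences = boltz_yaml.get("sequences") or []
    if not sequences:
        problems.append("boltz_yaml has no sequences")
    if not isinstance(reference_structures, dict) or not reference_structures:
        problems.append("reference_structures must be a non-empty {name: pdb_content} dict")
    # fixed_chains is a comma-separated string such as "A,B".
    requested = {c.strip() for c in fixed_chains.split(",") if c.strip()}
    unknown = requested - yaml_chain_ids(boltz_yaml)
    if unknown:
        problems.append(f"fixed_chains not in YAML sequences: {sorted(unknown)}")
    wants_affinity = any("affinity" in p for p in boltz_yaml.get("properties") or [])
    has_ligand = any("ligand" in e for e in sequences)
    if wants_affinity and not has_ligand:
        problems.append("affinity requested but no ligand entry is defined")
    return problems


config = {"sequences": [{"protein": {"id": "A", "sequence": "ACDEFGHIKLMNPQRSTVWY"}}]}
problems = preflight(config, {"ref_A.pdb": "ATOM ..."}, fixed_chains="A,B")
print(problems)
```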