Boltz Partial Diffusion
Runs Boltz partial diffusion, using the provided reference structures to refine or partially resample protein (and optional ligand) configurations. Supports single-chain inputs and multimer complexes, optional fixed chains, and saving of intermediate diffusion trajectories. When the YAML requests affinity properties and ligands are present, it also produces affinity predictions.

Usage
Use this node to refine structures guided by an initial/reference PDB. Typical workflow: build sequences, constraints, templates, and properties with the Boltz builders, combine them into a Boltz YAML, provide one or more reference PDB structures, and then run partial diffusion. Enable save_trajectory to keep the trajectory for analysis, and optionally list chains in fixed_chains to hold them fixed during diffusion.
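
As a rough illustration of the workflow above, the node's inputs can be pictured as plain Python dictionaries. This is a minimal sketch, not the node's API: the sequence layout is copied from the boltz_yaml example in the Inputs table, while the file name `ref_A.pdb`, the one-atom placeholder PDB text, the empty `boltz_files` value, and the `node_inputs` dictionary itself are hypothetical.

```python
# Hypothetical sketch: picture the node's inputs as plain dictionaries.
# In practice the Boltz builders produce these objects; nothing here calls Boltz.

# Layout copied from the boltz_yaml example in the Inputs table.
boltz_yaml = {
    "sequences": [
        {"protein": {"id": "A", "sequence": "ACDEFGHIKLMNPQRSTVWY"}},
    ],
}

# Reference structures are a dictionary of {name: pdb_content}.
# The PDB text below is a one-atom placeholder, not a real structure.
reference_structures = {
    "ref_A.pdb": (
        "ATOM      1  CA  ALA A   1       0.000   0.000   0.000"
        "  1.00  0.00           C\nEND\n"
    ),
}

node_inputs = {
    "boltz_yaml": boltz_yaml,
    "boltz_files": {},                   # assumption: the YAML references no extra files
    "reference_structures": reference_structures,
    "seed": 42,
    "partial_diffusion_fraction": 0.25,  # lower values stay closer to the reference
    "save_trajectory": True,             # keep the trajectory for later analysis
    "fixed_chains": "",                  # empty string: all chains may move
}
```

The values for `seed`, `partial_diffusion_fraction`, and `save_trajectory` are the example values from the Inputs table, not recommended settings.
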
Inputs
| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| boltz_yaml | True | BOLTZ_YAML | Combined Boltz YAML configuration describing sequences, templates, constraints, and optional properties (e.g., affinity). | {'sequences': [{'protein': {'id': 'A', 'sequence': 'ACDEFGHIKLMNPQRSTVWY'}}], 'properties': [{'affinity': {'target': 'ligand_1'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the Boltz YAML (e.g., template structures, constraint files). | {'template_A.pdb': ' |
| reference_structures | True | PDB | Reference structure(s) used to anchor partial diffusion. Provide as a dictionary of {name: pdb_content}. | {'ref_complex_A_B.pdb': ' |
| seed | True | INT | Base random seed for reproducibility. For multiple sequence entries, per-sequence seeds may be offset internally. | 42 |
| partial_diffusion_fraction | False | FLOAT | Fraction of the diffusion process to apply (0.0 = no diffusion; 1.0 = full diffusion). Lower values keep structures closer to the reference. | 0.25 |
| save_trajectory | False | BOOLEAN | If true, saves the diffusion trajectory as PDBs for analysis. | True |
| fixed_chains | False | STRING | Comma-separated chain IDs to keep fixed (must match chain IDs defined in the YAML sequences). Leave empty to allow all chains to move. | A,B |
| recycling_steps | False | INT | Number of recycling steps during inference. | 3 |
| sampling_steps | False | INT | Number of sampling steps in diffusion. | 200 |
| diffusion_samples | False | INT | Number of diffusion samples to generate per run. | 1 |
| max_parallel_samples | False | INT | Maximum number of diffusion samples to run in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter). | 1.638 |
| output_format | False | STRING | Output structure format. | pdb |
| num_workers | False | INT | Number of worker processes to use (0 disables multiprocessing). | 0 |
| use_potentials | False | BOOLEAN | Use inference-time potentials. Recommended for partial diffusion. | True |
| write_full_pae | False | BOOLEAN | If true, outputs the full PAE matrix in confidence results. | False |
| write_full_pde | False | BOOLEAN | If true, outputs the full PDE matrix in confidence results. | False |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 |
| subsample_msa | False | BOOLEAN | Enable MSA subsampling. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to use when subsampling is enabled. | 1024 |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction for affinity prediction (only relevant when affinity is requested in the YAML). | False |
| sampling_steps_affinity | False | INT | Sampling steps for affinity prediction (only applies if affinity is requested). | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity prediction (only applies if affinity is requested). | 5 |
Outputs
| Field | Type | Description | Example |
|---|---|---|---|
| structures.pdb | PDB | Generated or refined structures keyed by name. Each entry is a PDB string. | {'seqA_rank_1.pdb': ' |
| confidence.json | JSON | Confidence metrics for generated structures (e.g., pLDDT, plus the full PAE/PDE matrices when write_full_pae or write_full_pde is enabled). | {'seqA_confidence.json': {'plddt': [0.85, 0.9], 'pae': None, 'pde': None}} |
| trajectory.pdb | PDB | Diffusion trajectory PDBs if trajectory saving is enabled; otherwise may be empty. | {'seqA_traj_0.pdb': ' |
| affinity.json | JSON | Affinity prediction outputs when affinity is requested in the YAML and ligands are present; empty otherwise. | {'seqA_affinity.json': {'predicted_affinity': 1.23}} |
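
To show how the confidence output might be consumed downstream, the sketch below averages pLDDT per entry and sorts entries from most to least confident. It assumes the shape shown in the confidence.json example above (a dictionary of entries, each with a `plddt` list); the helper name `mean_plddt` is hypothetical, and real outputs may carry additional fields.

```python
from statistics import mean


def mean_plddt(confidence_outputs):
    """Return {name: mean pLDDT} for every entry that has pLDDT scores."""
    summary = {}
    for name, metrics in confidence_outputs.items():
        scores = metrics.get("plddt") or []  # tolerate a missing or None field
        if scores:
            summary[name] = mean(scores)
    return summary


# Example entry copied from the Outputs table.
confidence = {
    "seqA_confidence.json": {"plddt": [0.85, 0.9], "pae": None, "pde": None},
}

# Highest mean pLDDT first.
ranking = sorted(mean_plddt(confidence).items(), key=lambda kv: kv[1], reverse=True)
```
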
Important Notes
- Reference structures are required and must be provided in PDB format as a dictionary of {name: pdb_content}.
- Chain IDs referenced in fixed_chains must match the IDs defined in the YAML sequences.
- Affinity outputs are only produced when the YAML includes an affinity property and at least one ligand is present.
- Trajectory outputs are only populated if save_trajectory is true.
- For multimers, ensure all chain IDs in the YAML and reference PDBs are consistent.
- MSA-related parameters (max_msa_seqs, subsample_msa, num_subsampled_msa) control multiple sequence alignment usage and can affect runtime and quality.
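
The chain ID notes above can be checked before a run. The sketch below compares the chain IDs declared in the YAML with those found in each reference PDB, reading the chain ID from column 22 of ATOM/HETATM records as defined by the PDB format. Treating "consistent" as "the same set of chain IDs" is an assumption, as is allowing `id` to be either a single string or a list; all function names are hypothetical.

```python
def yaml_chain_ids(boltz_yaml):
    """Collect chain IDs from every entry under 'sequences'."""
    ids = set()
    for entry in boltz_yaml.get("sequences", []):
        for spec in entry.values():  # e.g. {"protein": {"id": "A", ...}}
            chain_id = spec.get("id")
            if isinstance(chain_id, (list, tuple)):  # assumed: id may be a list
                ids.update(chain_id)
            elif chain_id is not None:
                ids.add(chain_id)
    return ids


def pdb_chain_ids(pdb_text):
    """Collect chain IDs from ATOM/HETATM records (column 22, index 21)."""
    ids = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21].strip()
            if chain:
                ids.add(chain)
    return ids


def chain_mismatches(boltz_yaml, reference_structures):
    """Return {pdb_name: (missing_in_pdb, extra_in_pdb)} for inconsistent files."""
    expected = yaml_chain_ids(boltz_yaml)
    report = {}
    for name, text in reference_structures.items():
        found = pdb_chain_ids(text)
        if found != expected:
            report[name] = (expected - found, found - expected)
    return report
```

An empty report means every reference PDB carries exactly the chains declared in the YAML; a non-empty one names the files to relabel.
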
Troubleshooting
- Error: 'Boltz YAML must contain at least one sequence' — Ensure your boltz_yaml has a non-empty 'sequences' list.
- Error: 'Reference structures are required for partial diffusion and must be in PDB format' — Provide at least one PDB in reference_structures as a dict {name: pdb_content}.
- Error: 'Fixed chains {...} not found in YAML sequences' — Check that fixed_chains match chain IDs defined under sequences in the YAML.
- No affinity.json output — Confirm your YAML includes an affinity property and that a ligand sequence is defined.
- Empty trajectory output — Set save_trajectory to true to record diffusion trajectories.
- Invalid chain IDs or mismatched chain labeling — Align chain IDs between YAML and reference PDB files.
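
The first three errors above can be caught before the node runs. The following is a minimal pre-flight sketch under stated assumptions: it mirrors the documented conditions but is not the node's own validation code, the returned messages are illustrative rather than the node's exact error text, and `parse_fixed_chains` and `preflight` are hypothetical helper names.

```python
def parse_fixed_chains(fixed_chains):
    """Split a comma-separated string such as 'A, B' into ['A', 'B']."""
    return [part.strip() for part in fixed_chains.split(",") if part.strip()]


def preflight(boltz_yaml, reference_structures, fixed_chains=""):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []

    sequences = boltz_yaml.get("sequences") or []
    if not sequences:
        problems.append("boltz_yaml has no entries under 'sequences'")

    if not reference_structures:
        problems.append("reference_structures is empty")

    # Chain IDs declared in the YAML (assumed: 'id' is a string or a list).
    defined = set()
    for entry in sequences:
        for spec in entry.values():
            chain_id = spec.get("id")
            if isinstance(chain_id, (list, tuple)):
                defined.update(chain_id)
            elif chain_id is not None:
                defined.add(chain_id)

    unknown = sorted(set(parse_fixed_chains(fixed_chains)) - defined)
    if unknown:
        problems.append(f"fixed_chains not defined in YAML sequences: {unknown}")

    return problems
```

Collecting problems in a list, rather than raising on the first one, lets a single pass report every misconfiguration at once.
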