
Boltz Partial Diffusion

Runs Boltz partial diffusion using provided reference structures to refine or partially resample protein (and optional ligand) configurations. Supports single-chain and multimer complexes, optional fixed chains, and can save intermediate diffusion trajectories. When affinity properties are requested in the YAML and ligands are present, it also produces affinity-related outputs.

Usage

Use this node to refine structures guided by an initial/reference PDB. Typical workflow: build sequences, constraints, templates, and properties with the Boltz builders, combine them into a Boltz YAML, provide one or more reference PDB structures, and then run partial diffusion. Enable trajectory saving for analysis, and optionally list chain IDs in fixed_chains to keep those chains fixed during diffusion.
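
The workflow above can be sketched as a plain Python dictionary of node inputs. This is only an illustration: the helper name build_partial_diffusion_inputs is hypothetical and not part of the node, the empty boltz_files dictionary assumes that no auxiliary files are referenced, and the values are taken from the Example column of the Inputs section.

```python
def build_partial_diffusion_inputs(sequence, reference_pdb_text,
                                   fraction=0.25, fixed_chains=""):
    """Assemble node inputs as a dict (hypothetical helper, for illustration only)."""
    # Minimal Boltz YAML content: a single protein chain with ID "A".
    boltz_yaml = {"sequences": [{"protein": {"id": "A", "sequence": sequence}}]}
    return {
        "boltz_yaml": boltz_yaml,
        "boltz_files": {},  # assumption: no template or constraint files needed
        # Reference structures are passed as {name: pdb_content}.
        "reference_structures": {"reference.pdb": reference_pdb_text},
        "seed": 42,
        "partial_diffusion_fraction": fraction,  # lower = closer to the reference
        "save_trajectory": True,
        "fixed_chains": fixed_chains,  # comma-separated chain IDs, or empty
    }


inputs = build_partial_diffusion_inputs("ACDEFGHIKLMNPQRSTVWY", "ATOM ...")
```

How these inputs are wired into the node depends on the host application; the dictionary only mirrors the field names documented below.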

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| boltz_yaml | True | BOLTZ_YAML | Combined Boltz YAML configuration describing sequences, templates, constraints, and optional properties (e.g., affinity). | {'sequences': [{'protein': {'id': 'A', 'sequence': 'ACDEFGHIKLMNPQRSTVWY'}}], 'properties': [{'affinity': {'target': 'ligand_1'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the Boltz YAML (e.g., template structures, constraint files). | {'template_A.pdb': '', 'ligand_params.sdf': ''} |
| reference_structures | True | PDB | Reference structure(s) used to anchor partial diffusion. Provide as a dictionary of {name: pdb_content}. | {'ref_complex_A_B.pdb': ''} |
| seed | True | INT | Base random seed for reproducibility. For multiple sequence entries, per-sequence seeds may be offset internally. | 42 |
| partial_diffusion_fraction | False | FLOAT | Fraction of the diffusion process to apply (0.0 = no diffusion; 1.0 = full diffusion). Lower values keep structures closer to the reference. | 0.25 |
| save_trajectory | False | BOOLEAN | If true, saves the diffusion trajectory as PDBs for analysis. | True |
| fixed_chains | False | STRING | Comma-separated chain IDs to keep fixed (must match chain IDs defined in the YAML sequences). Leave empty to allow all chains to move. | A,B |
| recycling_steps | False | INT | Number of recycling steps during inference. | 3 |
| sampling_steps | False | INT | Number of sampling steps in diffusion. | 200 |
| diffusion_samples | False | INT | Number of diffusion samples to generate per run. | 1 |
| max_parallel_samples | False | INT | Maximum number of diffusion samples to run in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter). | 1.638 |
| output_format | False | STRING | Output structure format. | pdb |
| num_workers | False | INT | Number of worker processes to use (0 disables multiprocessing). | 0 |
| use_potentials | False | BOOLEAN | Use inference-time potentials. Recommended for partial diffusion. | True |
| write_full_pae | False | BOOLEAN | If true, outputs the full PAE matrix in confidence results. | False |
| write_full_pde | False | BOOLEAN | If true, outputs the full PDE matrix in confidence results. | False |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 |
| subsample_msa | False | BOOLEAN | Enable MSA subsampling. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to use when subsampling is enabled. | 1024 |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction for affinity prediction (only relevant when affinity is requested in the YAML). | False |
| sampling_steps_affinity | False | INT | Sampling steps for affinity prediction (only applies if affinity is requested). | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity prediction (only applies if affinity is requested). | 5 |

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| structures.pdb | PDB | Generated or refined structures keyed by name. Each entry is a PDB string. | {'seqA_rank_1.pdb': ''} |
| confidence.json | JSON | Confidence metrics for generated structures (e.g., pLDDT and optionally full PAE/PDE if enabled). | {'seqA_confidence.json': {'plddt': [0.85, 0.9], 'pae': None, 'pde': None}} |
| trajectory.pdb | PDB | Diffusion trajectory PDBs if trajectory saving is enabled; otherwise may be empty. | {'seqA_traj_0.pdb': ''} |
| affinity.json | JSON | Affinity prediction outputs when affinity is requested in the YAML and ligands are present; empty otherwise. | {'seqA_affinity.json': {'predicted_affinity': 1.23}} |
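
As a rough illustration of consuming these outputs, the sketch below averages the per-residue pLDDT values in a confidence dictionary shaped like the example above. The function name summarize_confidence is hypothetical, and the layout of the confidence entries is assumed from the Example column rather than from the node's implementation.

```python
from statistics import mean


def summarize_confidence(confidence):
    """Return the mean pLDDT per entry, or None when no pLDDT values are present."""
    summary = {}
    for name, metrics in confidence.items():
        plddt = metrics.get("plddt") or []
        summary[name] = mean(plddt) if plddt else None
    return summary


# Example shaped like the confidence.json output documented above.
confidence = {"seqA_confidence.json": {"plddt": [0.85, 0.9], "pae": None, "pde": None}}
summary = summarize_confidence(confidence)

# trajectory.pdb and affinity.json may be empty dicts, so check before using them.
affinity = {}
has_affinity = bool(affinity)
```

The same emptiness check applies to the trajectory output when save_trajectory is left disabled.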

Important Notes

  • Reference structures are required and must be provided in PDB format as a dictionary of {name: pdb_content}.
  • Chain IDs referenced in fixed_chains must match the IDs defined in the YAML sequences.
  • Affinity outputs are only produced when the YAML includes an affinity property and at least one ligand is present.
  • Trajectory outputs are only populated if save_trajectory is true.
  • For multimers, ensure all chain IDs in the YAML and reference PDBs are consistent.
  • MSA-related parameters (max_msa_seqs, subsample_msa, num_subsampled_msa) control multiple sequence alignment usage and can affect runtime and quality.
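
The chain consistency noted above can be checked before running the node. The following is a minimal sketch, assuming the YAML has already been loaded into a Python dictionary shaped like the boltz_yaml example and that the reference uses standard fixed-width PDB records, where the chain ID sits in column 22. The helper names pdb_chain_ids and yaml_chain_ids are hypothetical.

```python
def pdb_chain_ids(pdb_text):
    """Collect chain IDs from ATOM/HETATM records (column 22, i.e. index 21)."""
    chains = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21].strip()
            if chain:
                chains.add(chain)
    return chains


def yaml_chain_ids(boltz_yaml):
    """Collect chain IDs declared under 'sequences' in the loaded YAML dict."""
    ids = set()
    for entry in boltz_yaml.get("sequences", []):
        for spec in entry.values():
            if not isinstance(spec, dict):
                continue
            chain_id = spec.get("id")
            if isinstance(chain_id, (list, tuple)):  # assumption: id may be a list
                ids.update(chain_id)
            elif chain_id is not None:
                ids.add(chain_id)
    return ids


reference_pdb = "\n".join([
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
    "ATOM      2  N   GLY B   1      14.104  10.207   5.100  1.00 20.00           N",
])
boltz_yaml = {"sequences": [
    {"protein": {"id": "A", "sequence": "MK"}},
    {"protein": {"id": "C", "sequence": "GA"}},
]}

# Chains declared in the YAML that do not appear in the reference structure.
missing_in_reference = yaml_chain_ids(boltz_yaml) - pdb_chain_ids(reference_pdb)
```

A non-empty missing_in_reference set indicates that chain labels need to be aligned between the YAML and the reference PDB.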

Troubleshooting

  • Error: 'Boltz YAML must contain at least one sequence' — Ensure your boltz_yaml has a non-empty 'sequences' list.
  • Error: 'Reference structures are required for partial diffusion and must be in PDB format' — Provide at least one PDB in reference_structures as a dict {name: pdb_content}.
  • Error: 'Fixed chains {...} not found in YAML sequences' — Check that fixed_chains match chain IDs defined under sequences in the YAML.
  • No affinity.json output — Confirm your YAML includes an affinity property and that a ligand sequence is defined.
  • Empty trajectory output — Set save_trajectory to true to record diffusion trajectories.
  • Invalid chain IDs or mismatched chain labeling — Align chain IDs between YAML and reference PDB files.
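
Several of the errors above can be caught before the node runs. The sketch below mirrors those checks in plain Python under stated assumptions: validate_inputs is a hypothetical helper, the YAML is assumed to be loaded into a dictionary, and the exact wording of the fixed-chains message is a guess, since the node's own formatting is not shown here.

```python
def validate_inputs(boltz_yaml, reference_structures, fixed_chains=""):
    """Pre-flight checks mirroring the documented errors (hypothetical helper).

    Returns the sorted list of requested fixed chain IDs when all checks pass.
    """
    sequences = boltz_yaml.get("sequences") or []
    if not sequences:
        raise ValueError("Boltz YAML must contain at least one sequence")

    if not isinstance(reference_structures, dict) or not reference_structures:
        raise ValueError(
            "Reference structures are required for partial diffusion "
            "and must be in PDB format"
        )

    # Chain IDs declared under 'sequences' (a single ID or a list of IDs).
    defined = set()
    for entry in sequences:
        for spec in entry.values():
            chain_id = spec.get("id") if isinstance(spec, dict) else None
            if isinstance(chain_id, (list, tuple)):
                defined.update(chain_id)
            elif chain_id is not None:
                defined.add(chain_id)

    # fixed_chains is a comma-separated string such as "A,B"; empty means none.
    requested = {c.strip() for c in fixed_chains.split(",") if c.strip()}
    missing = requested - defined
    if missing:
        raise ValueError(f"Fixed chains {sorted(missing)} not found in YAML sequences")
    return sorted(requested)


yaml_cfg = {"sequences": [{"protein": {"id": "A", "sequence": "ACDEFGHIKLMNPQRSTVWY"}}]}
fixed = validate_inputs(yaml_cfg, {"ref.pdb": "ATOM ..."}, fixed_chains="A")
```

Running such a check first gives a clearer failure point than waiting for the node to reject the inputs.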