Boltz Partial Diffusion

Runs Boltz partial diffusion using a reference structure to refine or sample structures while optionally fixing specific chains. Supports both single-chain and multimer systems, can save diffusion trajectories, and can include affinity prediction when requested in the YAML properties.

Usage

Use this node after building Boltz YAML and auxiliary files. Provide a reference PDB structure to guide partial diffusion. Optionally fix selected chains, set the diffusion fraction, and enable trajectory saving for analysis. Integrate it in workflows where you want to refine models toward a reference, sample alternative conformations, or predict affinity (when a ligand and affinity property are present).

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| boltz_yaml | True | BOLTZ_YAML | Boltz YAML configuration that defines sequences (and optionally constraints, templates, and properties). | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'ACDEFGHIKLM'}}], 'properties': [{'affinity': {'binder': 'L'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., MSA .a3m files, template PDB files). | {'msa_1.a3m': '', 'template_1.pdb': ''} |
| reference_structures | True | PDB | Reference structures in PDB format that guide the partial diffusion process. Provide as a mapping of filename to PDB text. | {'reference.pdb': ''} |
| seed | True | INT | Base random seed; each sequence may be offset internally to produce multiple samples deterministically. | 42 |
| partial_diffusion_fraction | False | FLOAT | Fraction of the diffusion process to perform (0.0 = no diffusion, 1.0 = full diffusion). Lower values keep models closer to the reference. | 0.25 |
| save_trajectory | False | BOOLEAN | If true, saves intermediate diffusion trajectory frames for analysis. | True |
| fixed_chains | False | STRING | Comma-separated chain IDs to keep fixed (e.g., A,B). Chains must exist in the YAML sequences. | A,B |
| recycling_steps | False | INT | Number of recycling steps during prediction. | 3 |
| sampling_steps | False | INT | Number of sampling steps for diffusion. | 200 |
| diffusion_samples | False | INT | Number of diffusion samples to generate per sequence. | 1 |
| max_parallel_samples | False | INT | Maximum number of diffusion samples to run in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter). | 1.638 |
| output_format | False | STRING | Output format for structures. Allowed values: pdb, mmcif. | pdb |
| num_workers | False | INT | Number of worker processes to use (0 disables multiprocessing). | 0 |
| use_potentials | False | BOOLEAN | Use inference-time potentials. Recommended for partial diffusion. | True |
| write_full_pae | False | BOOLEAN | If true, writes full PAE matrices to the confidence outputs. | False |
| write_full_pde | False | BOOLEAN | If true, writes full PDE matrices to the confidence outputs. | False |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 |
| subsample_msa | False | BOOLEAN | If true, subsamples the MSA to reduce size. | False |
| num_subsampled_msa | False | INT | Number of sequences to keep when subsampling the MSA. | 1024 |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction for affinity prediction (only relevant if affinity property is present). | False |
| sampling_steps_affinity | False | INT | Sampling steps used specifically for affinity prediction. | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples used for affinity prediction. | 5 |
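
Both boltz_files and reference_structures are described as mappings of filename to file text. A minimal sketch of how such a mapping might be built from files on disk is shown here. The helper name load_text_mapping is hypothetical and not part of the node, and how the mapping is then passed to the node depends on your workflow tooling.

```python
from pathlib import Path


def load_text_mapping(paths):
    """Build a {filename: file text} mapping from a list of file paths.

    Hypothetical helper: it only mirrors the "filename to PDB text" shape
    described for reference_structures (and boltz_files).
    """
    mapping = {}
    for p in paths:
        path = Path(p)
        if path.name in mapping:
            # Two files with the same basename would silently overwrite each other.
            raise ValueError(f"duplicate filename: {path.name}")
        mapping[path.name] = path.read_text()
    return mapping


# Example (assumes reference.pdb exists in the working directory):
# reference_structures = load_text_mapping(["reference.pdb"])
```

Keys are file basenames, so two inputs that share a name from different directories are rejected rather than merged.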

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| structures.pdb | PDB | Ranked predicted structures produced by partial diffusion. Keys may be prefixed with sequence names when multiple sequences are present. | {'sequence_0_rank_1_model_1.pdb': ''} |
| confidence.json | JSON | Confidence metrics (e.g., pLDDT/PAE) for the generated structures, optionally including full matrices if enabled. | {'sequence_0_confidence.json': '{"pLDDT": [...], "PAE": [...]}'} |
| trajectory.pdb | PDB | Trajectory frames from the diffusion process if trajectory saving is enabled. May be empty if not requested. | {'sequence_0_traj_000.pdb': ''} |
| affinity.json | JSON | Affinity prediction outputs when the YAML includes an affinity property and a binder. Empty if affinity is not requested. | {'sequence_0_affinity.json': '{"Kd": 1.2, "units": "uM"}'} |
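
Because output keys may carry a sequence-name prefix, downstream code often needs to group results per sequence. The sketch below assumes keys follow the sequence_N_ pattern seen in the examples above; the actual naming may differ, and group_by_sequence is a hypothetical helper rather than part of the node.

```python
import re
from collections import defaultdict

# Assumed key pattern, inferred only from the example keys
# (e.g. 'sequence_0_rank_1_model_1.pdb').
PREFIX = re.compile(r"^(sequence_\d+)_")


def group_by_sequence(outputs):
    """Group an output mapping {key: text} by its sequence prefix.

    Keys without a recognised prefix are collected under the empty string.
    """
    grouped = defaultdict(dict)
    for key, value in outputs.items():
        match = PREFIX.match(key)
        prefix = match.group(1) if match else ""
        grouped[prefix][key] = value
    return dict(grouped)
```

The same grouping can be applied to each output (structures, confidence, trajectory, affinity) so that files belonging to one sequence can be looked up together.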

Important Notes

  • Reference structures are required: Provide PDB-format reference structures as a mapping name->content; the node will not run without them.
  • Fixed chains must exist: The fixed_chains list must be a subset of chain IDs declared in the YAML; otherwise validation fails.
  • Affinity output requires affinity property: The affinity.json output is only populated if the YAML properties include an affinity entry (and the model supports it).
  • Sequence name prefixes: For multi-sequence inputs, output filenames/keys may be prefixed with sequence names to distinguish results.
  • Trajectory output: trajectory.pdb will only contain data when save_trajectory is true; otherwise it may be empty.
  • Multimer support: The node automatically handles single-chain and multimer systems; parameters like fixed_chains are especially useful in multimer scenarios.
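
The subset rule for fixed_chains can be checked before running the node. This is a minimal sketch, not the node's own validation: it assumes each sequences entry has the shape {entity_type: {'id': ...}} as in the boltz_yaml example, and that id may be either a single string or a list of chain IDs. The function names are hypothetical.

```python
def declared_chain_ids(boltz_yaml):
    """Collect chain IDs declared under each entity's 'id' in the YAML dict."""
    ids = set()
    for entry in boltz_yaml.get("sequences", []):
        for entity in entry.values():
            chain_id = entity.get("id")
            if isinstance(chain_id, str):
                ids.add(chain_id)
            elif chain_id is not None:
                # Assumption: a list of IDs denotes several copies of one entity.
                ids.update(chain_id)
    return ids


def parse_fixed_chains(fixed_chains):
    """Split a comma-separated string such as 'A,B' into ['A', 'B']."""
    return [c.strip() for c in fixed_chains.split(",") if c.strip()]


def missing_fixed_chains(boltz_yaml, fixed_chains):
    """Return the fixed chain IDs that are not declared in the YAML, sorted."""
    requested = set(parse_fixed_chains(fixed_chains))
    return sorted(requested - declared_chain_ids(boltz_yaml))
```

An empty result means every requested chain is declared; any returned IDs are the ones that would be expected to trigger the validation failure.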

Troubleshooting

  • Error: 'Boltz YAML must contain at least one sequence': Ensure boltz_yaml includes a non-empty sequences list.
  • Error: 'Reference structures are required': Provide a PDB mapping under reference_structures, e.g., {"reference.pdb": ""}.
  • Error: 'Fixed chains {…} not found in YAML sequences': Check that fixed_chains only includes chain IDs defined under each entity's id in the YAML.
  • Empty trajectory output: Enable save_trajectory to capture diffusion trajectories.
  • No affinity outputs: Include an affinity property in boltz_yaml (and ensure a binder chain/ligand is defined) to produce affinity.json.
  • Unexpected empty PDB outputs: Verify that sampling_steps and partial_diffusion_fraction are set to sensible values and that each reference_structures entry contains valid PDB text.
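
Several of the issues above can be caught with a quick pre-flight check on the inputs. The following is a rough sketch under stated assumptions: preflight is a hypothetical helper, its messages are not the node's error strings, and "valid PDB text" is approximated as containing at least one ATOM or HETATM record.

```python
def preflight(boltz_yaml, reference_structures, partial_diffusion_fraction):
    """Return a list of human-readable problems found in the inputs."""
    problems = []

    if not boltz_yaml.get("sequences"):
        problems.append("boltz_yaml has no sequences")

    if not reference_structures:
        problems.append("reference_structures is empty")

    for name, text in (reference_structures or {}).items():
        # Crude validity check: a PDB file with coordinates has ATOM/HETATM lines.
        has_atoms = any(
            line.startswith(("ATOM", "HETATM")) for line in text.splitlines()
        )
        if not has_atoms:
            problems.append(f"{name}: no ATOM/HETATM records")

    # The documented range is 0.0 (no diffusion) to 1.0 (full diffusion).
    if not 0.0 <= partial_diffusion_fraction <= 1.0:
        problems.append("partial_diffusion_fraction outside [0.0, 1.0]")

    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that the node will accept the inputs.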