
Boltz Partial Diffusion

Runs partial diffusion-based structure refinement/generation using a Boltz YAML specification and reference structures. Supports single-chain and multimer systems, selectively fixing specified chains and optionally saving diffusion trajectories. Can also compute affinity if requested in properties.

Usage

Use this node after assembling a valid Boltz YAML and auxiliary files (MSAs/templates) with the Boltz YAML Combiner. Provide one or more reference structures in PDB format to anchor partial diffusion. Optionally set which chains remain fixed, the fraction of the diffusion process to run, sampling settings, and whether to save trajectories. If affinity is included in properties, enable the related options to obtain affinity predictions.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| boltz_yaml | True | BOLTZ_YAML | Boltz YAML configuration describing sequences (and optionally constraints/templates/properties). | {'version': 1, 'sequences': [{'protein': {'id': 'A', 'sequence': 'MSEQN...', '_sequence_name': 'protA', 'msa': 'msa_1.a3m'}}], 'properties': [{'affinity': {'binder': 'L'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the YAML (e.g., MSA A3M files, template PDB/CIF files) as a mapping from filename to file content. | {'msa_1.a3m': '', 'template_1.pdb': ''} |
| reference_structures | True | PDB | Reference structure(s) in PDB format used to anchor the partial diffusion. Provide as a mapping name -> PDB text. | {'ref_model.pdb': ''} |
| seed | True | INT | Base random seed. When multiple sequence entries are present, the seed is incremented per entry. | 42 |
| partial_diffusion_fraction | False | FLOAT | Fraction of the diffusion process to apply (0.0 = no diffusion, 1.0 = full diffusion). | 0.25 |
| save_trajectory | False | BOOLEAN | If true, saves the diffusion trajectory (intermediate structures) for analysis. | True |
| fixed_chains | False | STRING | Comma-separated chain IDs to keep fixed during partial diffusion. Must match chain IDs defined in YAML. | A,B |
| recycling_steps | False | INT | Number of recycling iterations during prediction/refinement. | 3 |
| sampling_steps | False | INT | Number of diffusion sampling steps. | 200 |
| diffusion_samples | False | INT | Number of independent diffusion samples to generate per input. | 1 |
| max_parallel_samples | False | INT | Maximum number of samples to run in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter) controlling exploration. | 1.638 |
| output_format | False | ['pdb', 'mmcif'] | Output structure format. | pdb |
| num_workers | False | INT | Number of parallel workers (0 disables multiprocessing). | 0 |
| use_potentials | False | BOOLEAN | Use inference-time potentials to guide sampling (recommended for partial diffusion). | True |
| write_full_pae | False | BOOLEAN | If true, writes the full Predicted Aligned Error matrix to confidence output. | False |
| write_full_pde | False | BOOLEAN | If true, writes the full Predicted Distance Error matrix to confidence output. | False |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 |
| subsample_msa | False | BOOLEAN | Whether to subsample MSA sequences. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to keep when subsampling is enabled. | 1024 |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction to affinity predictions (effective when affinity property is requested). | False |
| sampling_steps_affinity | False | INT | Sampling steps dedicated to affinity prediction (when requested). | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity prediction (when requested). | 5 |
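
Two of the inputs above have simple semantics that can be illustrated in a few lines: fixed_chains is a comma-separated string, and seed is incremented for each sequence entry. The following is a minimal sketch, not the node's actual code. The helper names are hypothetical, and the increment of exactly 1 per entry is an assumption; the documentation only says the seed is incremented.

```python
def parse_fixed_chains(fixed_chains: str) -> list[str]:
    """Split a comma-separated chain ID string such as "A,B" into a list."""
    return [c.strip() for c in fixed_chains.split(",") if c.strip()]


def per_entry_seeds(base_seed: int, n_entries: int) -> list[int]:
    """Assumption: the seed grows by 1 for each successive sequence entry."""
    return [base_seed + i for i in range(n_entries)]


print(parse_fixed_chains("A, B"))  # ['A', 'B']
print(per_entry_seeds(42, 3))      # [42, 43, 44]
```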

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| structures.pdb | PDB | Predicted/refined structure files as a mapping from name to PDB text. May include multiple ranked structures per input. | {'sequence_0_ranked_0.pdb': '', 'sequence_0_ranked_1.pdb': ''} |
| confidence.json | JSON | Confidence metrics per output (e.g., pLDDT/PAE/PDE), keyed by name. | {'sequence_0_confidence.json': '{"plddt": [...], "pae": [...]}'} |
| trajectory.pdb | PDB | Saved diffusion trajectory PDB(s) when trajectory saving is enabled; otherwise may be empty. | {'sequence_0_trajectory.pdb': ''} |
| affinity.json | JSON | Affinity prediction outputs when affinity is requested in properties; empty otherwise. | {'sequence_0_affinity.json': '{"KD": 1.2, "units": "uM"}'} |
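
Because structures.pdb may hold several ranked structures per input, downstream code often needs them grouped by input and ordered by rank. The sketch below assumes the key pattern `<entry>_ranked_<n>.pdb`, which is inferred only from the example names above and may not hold for every output; `group_ranked_structures` is a hypothetical helper, not part of the node.

```python
import re
from collections import defaultdict

# Assumed naming pattern, inferred from the example keys only.
RANKED_NAME = re.compile(r"^(?P<entry>.+)_ranked_(?P<rank>\d+)\.pdb$")


def group_ranked_structures(structures: dict[str, str]) -> dict[str, list[str]]:
    """Group structure names by input entry, sorted by ascending rank."""
    grouped: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for name in structures:
        match = RANKED_NAME.match(name)
        if match is None:
            continue  # skip names that do not follow the assumed pattern
        grouped[match.group("entry")].append((int(match.group("rank")), name))
    return {entry: [n for _, n in sorted(items)] for entry, items in grouped.items()}


outputs = {"sequence_0_ranked_1.pdb": "", "sequence_0_ranked_0.pdb": ""}
print(group_ranked_structures(outputs))
# {'sequence_0': ['sequence_0_ranked_0.pdb', 'sequence_0_ranked_1.pdb']}
```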

Important Notes

  • Reference structures required: You must provide at least one PDB reference structure; this anchors partial diffusion.
  • Chain IDs must match: Any fixed_chains must correspond to chain IDs defined in the YAML sequences.
  • Trajectory output: trajectory.pdb is only populated if save_trajectory is true.
  • Affinity outputs: affinity.json is produced only when the YAML includes an affinity property and the system is set up for affinity prediction; including a ligand sequence is typically required.
  • Multimer handling: The node automatically detects multimer vs. single-chain mode based on the number of unique chain IDs in the YAML.
  • Per-sequence batching: If multiple sequence entries exist in the YAML, the node processes them sequentially and aggregates outputs, incrementing the seed for each.
  • MSA/templates linkage: Ensure boltz_files contains every filename referenced in boltz_yaml (e.g., msa_1.a3m, template_1.pdb).
  • Partial diffusion fraction: Lower values keep the structure closer to the reference; higher values allow more deviation.
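
Several of the notes above (chain ID matching, multimer detection, and MSA linkage) can be checked before the node runs. The following is a rough pre-flight sketch under stated assumptions: all function names are hypothetical, the YAML layout follows the boltz_yaml example in the Inputs table, an `id` may be either a single string or a list of strings, and only the `msa` key is inspected for file references (template references are not checked because their layout is not shown here).

```python
def chain_ids(boltz_yaml: dict) -> set[str]:
    """Collect chain IDs from every entry under 'sequences'."""
    ids: set[str] = set()
    for entry in boltz_yaml.get("sequences", []):
        for spec in entry.values():  # e.g. {"protein": {...}}
            if not isinstance(spec, dict):
                continue
            raw = spec.get("id", [])
            ids.update([raw] if isinstance(raw, str) else raw)
    return ids


def missing_fixed_chains(boltz_yaml: dict, fixed_chains: str) -> set[str]:
    """Return requested fixed chains that are not defined in the YAML."""
    requested = {c.strip() for c in fixed_chains.split(",") if c.strip()}
    return requested - chain_ids(boltz_yaml)


def missing_msa_files(boltz_yaml: dict, boltz_files: dict) -> set[str]:
    """Return MSA filenames referenced in the YAML but absent from boltz_files."""
    referenced: set[str] = set()
    for entry in boltz_yaml.get("sequences", []):
        for spec in entry.values():
            if isinstance(spec, dict) and isinstance(spec.get("msa"), str):
                referenced.add(spec["msa"])
    return referenced - set(boltz_files)


def is_multimer(boltz_yaml: dict) -> bool:
    """Mirror the documented rule: more than one unique chain ID."""
    return len(chain_ids(boltz_yaml)) > 1


yaml_cfg = {
    "version": 1,
    "sequences": [{"protein": {"id": "A", "sequence": "MSEQN", "msa": "msa_1.a3m"}}],
}
print(missing_fixed_chains(yaml_cfg, "A,B"))  # {'B'}
print(missing_msa_files(yaml_cfg, {}))        # {'msa_1.a3m'}
print(is_multimer(yaml_cfg))                  # False
```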

Troubleshooting

  • Error: 'Fixed chains {...} not found in YAML sequences': Check fixed_chains values and ensure all chain IDs are defined in the YAML sequences.
  • Empty trajectory output: Set save_trajectory to true to record intermediate diffusion states.
  • Missing affinity output: Verify that boltz_yaml includes an affinity property and that the system includes a ligand sequence; adjust the affinity-related options (affinity_mw_correction, sampling_steps_affinity, diffusion_samples_affinity) if needed.
  • Invalid YAML or files: Ensure boltz_yaml is a dictionary with at least one sequence and boltz_files is a dictionary mapping filenames to contents; all referenced files must exist in boltz_files.
  • PDB parsing issues: Confirm reference_structures is a dict with at least one valid PDB text value.
  • Unexpectedly few outputs for multimers: Verify that all chains and MSAs are correctly specified; check that fixed_chains does not unintentionally constrain all chains.
  • Performance or timeout concerns: Reduce diffusion_samples, sampling_steps, or max_parallel_samples; consider disabling save_trajectory or enabling multiprocessing by setting num_workers above 0.
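
The input-shape problems listed above (invalid YAML or files, PDB parsing issues) can be caught early with a small check. This is a minimal sketch with a hypothetical helper name; it only tests the shapes described in this section and looks for ATOM/HETATM records as a weak signal of PDB content, which is far less strict than a real PDB parser.

```python
def input_shape_problems(boltz_yaml, boltz_files, reference_structures) -> list[str]:
    """Return human-readable problems; an empty list means the shapes look fine."""
    problems = []
    if not isinstance(boltz_yaml, dict) or not boltz_yaml.get("sequences"):
        problems.append("boltz_yaml must be a dict with at least one sequence")
    if not isinstance(boltz_files, dict):
        problems.append("boltz_files must be a dict mapping filenames to contents")
    if not isinstance(reference_structures, dict) or not reference_structures:
        problems.append("reference_structures must be a non-empty dict")
    else:
        for name, text in reference_structures.items():
            # Weak heuristic: coordinate records in PDB files start with ATOM or HETATM.
            has_atoms = isinstance(text, str) and any(
                line.startswith(("ATOM", "HETATM")) for line in text.splitlines()
            )
            if not has_atoms:
                problems.append(f"{name}: no ATOM/HETATM records found")
    return problems


print(input_shape_problems({}, {}, {}))
```

An empty return value does not guarantee the node will accept the inputs; it only rules out the shape errors named in this section.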